Research

1,752 articles total

Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally

research · Simon Willison's Weblog · 2d ago

Here's a fascinating piece of research by Dan Woods, who managed to get a custom version of Qwen3.5-397B-A17B running at 5.5+ tokens/second on a 48GB MacBook Pro M3 Max, despite that model taking up 209GB (120GB quantized) on disk. Qwen3.5-397B-A17B is a Mixture …

GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52

research · Simon Willison's Weblog · 3d ago

OpenAI today: Introducing GPT‑5.4 mini and nano. These models join GPT-5.4, which was released two weeks ago. OpenAI's self-reported benchmarks show the new 5.4-nano out-performing their previous GPT-5 mini model when run at maximum reasoning effort. The new mini is also 2x faster than the previous mini. Here's how the pricing looks …

Introducing GPT-5.4 mini and nano

research · OpenAI Blog · 4d ago

GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.

Introducing Mistral Small 4

research · Simon Willison's Weblog · 4d ago

Big new release from Mistral today (despite the name): a new Apache 2 licensed 119B parameter (Mixture-of-Experts, 6B active) model, which they describe like this: "Mistral Small 4 is the first Mistral model to unify the capabilities of our flagship models, Magistral for reasoning, Pixtral for multimodal, and Devstral for agentic coding, into a single, versatile model."

1M context is now generally available for Opus 4.6 and Sonnet 4.6

research · Simon Willison's Weblog · 3/13/2026

Here's what surprised me: "Standard pricing now applies across the full 1M window for both models, with no long-context premium." OpenAI and Gemini both charge more for prompts where the token count …

Gemini in Google Sheets just achieved state-of-the-art performance.

research · Google AI Blog · 3/10/2026

Today we announced new beta features for Gemini in Sheets to help you create, organize and edit entire sheets, from basic tasks to complex data analysis — just describe …

GPT-5.4 Thinking System Card

research · OpenAI Blog · 3/5/2026

Reasoning models struggle to control their chains of thought, and that’s good

research · OpenAI Blog · 3/5/2026

OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.

Introducing GPT-5.4

research · OpenAI Blog · 3/5/2026

Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.

Gemini 3.1 Flash-Lite: Built for intelligence at scale

research · Google AI Blog · 3/3/2026

Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.

GPT-5.3 Instant System Card

research · OpenAI Blog · 3/3/2026

Build with Nano Banana 2, our best image generation and editing model

research · Google AI Blog · 2/26/2026

Nano Banana 2 (Gemini 3.1 Flash Image) delivers Pro-level intelligence and fidelity for all image applications.

A new way to express yourself: Gemini can now create music

research · Google AI Blog · 2/18/2026

Lyria 3 is now available in the Gemini app. Create custom, high-quality 30-second tracks from text and images.

GPT-5.2 derives a new result in theoretical physics

research · OpenAI Blog · 2/13/2026

A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.

Custom Kernels for All from Codex and Claude

research · Hugging Face Blog · 2/13/2026

Introducing GPT-5.3-Codex-Spark

research · OpenAI Blog · 2/12/2026

Introducing GPT-5.3-Codex-Spark—our first real-time coding model. 15x faster generation, 128k context, now in research preview for ChatGPT Pro users.

GPT-5 lowers the cost of cell-free protein synthesis

research · OpenAI Blog · 2/5/2026

An autonomous lab combining OpenAI’s GPT-5 with Ginkgo Bioworks’ cloud automation cut cell-free protein synthesis costs by 40% through closed-loop experimentation.

GPT-5.3-Codex System Card

research · OpenAI Blog · 2/5/2026

GPT‑5.3-Codex is the most capable agentic coding model to date, combining the frontier coding performance of GPT‑5.2-Codex with the reasoning and professional knowledge capabilities of GPT‑5.2.

Introducing GPT-5.3-Codex

research · OpenAI Blog · 2/5/2026

GPT-5.3-Codex is a Codex-native agent that pairs frontier coding performance with general reasoning to support long-horizon, real-world technical work.

VfL Wolfsburg turns ChatGPT into a club-wide capability

research · OpenAI Blog · 2/4/2026

By focusing on people, not pilots, the Bundesliga club is scaling efficiency, creativity, and knowledge—without losing its football identity.

The Sora feed philosophy

research · OpenAI Blog · 2/3/2026

Discover the Sora feed philosophy—built to spark creativity, foster connections, and keep experiences safe with personalized recommendations, parental controls, and strong guardrails.

Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT

research · OpenAI Blog · 1/29/2026

On February 13, 2026, alongside the previously announced retirement of GPT‑5 (Instant, Thinking, and Pro), we will retire GPT‑4o, GPT‑4.1, GPT‑4.1 mini, and OpenAI o4-mini from ChatGPT. In the API, there are no changes at this time.

Inside Praktika's conversational approach to language learning

research · OpenAI Blog · 1/22/2026

How Praktika uses GPT-4.1 and GPT-5.2 to build adaptive AI tutors that personalize lessons, track progress, and help learners achieve real-world language fluency.

Inside GPT-5 for Work: How Businesses Use GPT-5

research · OpenAI Blog · 1/22/2026

A data-driven report on how workers across industries use ChatGPT—covering adoption trends, top tasks, departmental patterns, and the future of AI at work.

How Higgsfield turns simple ideas into cinematic social videos

research · OpenAI Blog · 1/21/2026

Discover how Higgsfield gives creators cinematic, social-first video output from simple inputs using OpenAI GPT-4.1, GPT-5, and Sora 2.

How countries can end the capability overhang

research · OpenAI Blog · 1/21/2026

Our latest report reveals stark differences in advanced AI adoption across countries and outlines new initiatives to help nations capture productivity gains from AI.

How Tolan builds voice-first AI with GPT-5.1

research · OpenAI Blog · 1/7/2026

Tolan built a voice-first AI companion with GPT-5.1, combining low-latency responses, real-time context reconstruction, and memory-driven personalities for natural conversations.

Introducing GPT-5.2-Codex

research · OpenAI Blog · 12/18/2025

GPT-5.2-Codex is OpenAI’s most advanced coding model, offering long-horizon reasoning, large-scale code transformations, and enhanced cybersecurity capabilities.

Measuring AI’s capability to accelerate biological research

research · OpenAI Blog · 12/16/2025

OpenAI introduces a real-world evaluation framework to measure how AI can accelerate biological research in the wet lab. Using GPT-5 to optimize a molecular cloning protocol, the work explores both the promise and risks of AI-assisted experimentation.

How We Used Codex to Ship Sora for Android in 28 Days

research · OpenAI Blog · 12/12/2025

OpenAI shipped Sora for Android in 28 days using Codex. AI-assisted planning, translation, and parallel coding workflows helped a nimble team deliver rapid, reliable development.

New in llama.cpp: Model Management

research · Hugging Face Blog · 12/11/2025

Advancing science and math with GPT-5.2

research · OpenAI Blog · 12/11/2025

GPT-5.2 is OpenAI’s strongest model yet for math and science, setting new state-of-the-art results on benchmarks like GPQA Diamond and FrontierMath. This post shows how those gains translate into real research progress, including solving an open theoretical problem and generating reliable mathematical proofs.

Introducing GPT-5.2

research · OpenAI Blog · 12/11/2025

GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to power faster, more reliable agentic workflows.

Update to GPT-5 System Card: GPT-5.2

research · OpenAI Blog · 12/11/2025

GPT-5.2 is the latest model family in the GPT-5 series. The comprehensive safety mitigation approach for these models is largely the same as that described in the GPT-5 System Card and GPT-5.1 System Card. Like OpenAI’s other models, the GPT-5.2 models were trained on diverse datasets, including information that is publicly available on the internet, information that we partner with third parties to access, and information that our users or human trainers and researchers provide or generate.

GPT-5 and the future of mathematical discovery

research · OpenAI Blog · 11/24/2025

UCLA Professor Ernest Ryu and GPT-5 solved a key question in optimization theory, showcasing AI’s role in accelerating mathematical discovery.

Early experiments in accelerating science with GPT-5

research · OpenAI Blog · 11/20/2025

OpenAI introduces the first research cases showing how GPT-5 accelerates scientific progress across math, physics, biology, and computer science. Explore how AI and researchers collaborate to generate proofs, uncover new insights, and reshape the pace of discovery.

Building more with GPT-5.1-Codex-Max

research · OpenAI Blog · 11/19/2025

Introducing GPT-5.1-Codex-Max, a faster, more intelligent agentic coding model for Codex. The model is designed for long-running, project-scale work with enhanced reasoning and token efficiency.

Understanding neural networks through sparse circuits

research · OpenAI Blog · 11/13/2025

OpenAI is exploring mechanistic interpretability to understand how neural networks reason. Our new sparse model approach could make AI systems more transparent and support safer, more reliable behavior.

Introducing GPT-5.1 for developers

research · OpenAI Blog · 11/13/2025

GPT-5.1 is now available in the API, bringing faster adaptive reasoning, extended prompt caching, improved coding performance, and new apply_patch and shell tools.

GPT-5.1: A smarter, more conversational ChatGPT

research · OpenAI Blog · 11/12/2025

We’re upgrading the GPT-5 series with warmer, more capable models and new ways to customize ChatGPT’s tone and style. GPT-5.1 starts rolling out today to paid users.

Introducing IndQA

research · OpenAI Blog · 11/3/2025

OpenAI introduces IndQA, a new benchmark for evaluating AI systems in Indian languages. Built with domain experts, IndQA tests cultural understanding and reasoning across 12 languages and 10 knowledge areas.

Addendum to GPT-5 System Card: Sensitive conversations

research · OpenAI Blog · 10/27/2025

This system card details GPT-5’s improvements in handling sensitive conversations, including new benchmarks for emotional reliance, mental health, and jailbreak resistance.

With GPT-5, Wrtn builds lifestyle AI for millions in Korea

research · OpenAI Blog · 10/2/2025

Wrtn scaled AI apps to 6.5M users in Korea with GPT-5, creating ‘Lifestyle AI’ that blends productivity, creativity, and learning—now expanding across East Asia.

SOTA OCR with Core ML and dots.ocr

research · Hugging Face Blog · 10/2/2025

Sora 2 is here

research · OpenAI Blog · 9/30/2025

Our latest video generation model is more physically accurate, realistic, and controllable than prior systems. It also features synchronized dialogue and sound effects. Create with it in the new Sora app.

Launching Sora responsibly

research · OpenAI Blog · 9/30/2025

To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in concrete protections.

Sora 2 System Card

research · OpenAI Blog · 9/30/2025

Sora 2 is our new state-of-the-art video and audio generation model. Building on the foundation of Sora, this new model introduces capabilities that have been difficult for prior video models to achieve, such as more accurate physics, sharper realism, synchronized audio, enhanced steerability, and an expanded stylistic range.

Creating a safe, observable AI infrastructure for 1 million classrooms

research · OpenAI Blog · 9/22/2025

Discover how SchoolAI, built on OpenAI’s GPT-4.1, image generation, and TTS, powers safe, teacher-guided AI tools for 1 million classrooms worldwide—boosting engagement, oversight, and personalized learning.

GPT-5 bio bug bounty call

research · OpenAI Blog · 9/5/2025

OpenAI invites researchers to its Bio Bug Bounty. Test GPT-5’s safety with a universal jailbreak prompt and win up to $25,000.

Introducing GPT-5 for developers

research · OpenAI Blog · 8/7/2025

Introducing GPT-5 in our API platform—offering high reasoning performance, new controls for devs, and best-in-class results on real coding tasks.

Coding and design with GPT-5

research · OpenAI Blog · 8/7/2025

Learn how GPT-5 unlocks new possibilities in coding and design.

Creative writing with GPT-5

research · OpenAI Blog · 8/7/2025

Learn how GPT-5 assists with creative writing.

Medical research with GPT-5

research · OpenAI Blog · 8/7/2025

Learn how GPT-5 is used for medical research.

First look at GPT-5

research · OpenAI Blog · 8/7/2025

See how a group of leading developers use GPT-5 for the first time.

Introducing GPT-5

research · OpenAI Blog · 8/7/2025

We are introducing GPT‑5, our best AI system yet. GPT‑5 is a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health, visual perception, and more.

GPT-5 System Card

research · OpenAI Blog · 8/7/2025

This GPT-5 system card explains how a unified model routing system powers fast and smart responses using gpt-5-main, gpt-5-thinking, and lightweight versions like gpt-5-thinking-nano, optimized for different tasks and developer use.

How Amgen uses GPT-5

research · OpenAI Blog · 8/7/2025

Learn how Amgen uses GPT-5.

Ettin Suite: SoTA Paired Encoders and Decoders

research · Hugging Face Blog · 7/16/2025

Efficient MultiModal Data Pipeline

research · Hugging Face Blog · 7/8/2025

Shipping code faster with o3, o4-mini, and GPT-4.1

research · OpenAI Blog · 5/22/2025

CodeRabbit uses OpenAI models to revolutionize code reviews—boosting accuracy, accelerating PR merges, and helping developers ship faster with fewer bugs and higher ROI.

Vision Language Models (Better, faster, stronger)

research · Hugging Face Blog · 5/12/2025

The 4 Things Qwen-3’s Chat Template Teaches Us

research · Hugging Face Blog · 4/30/2025

Sycophancy in GPT-4o: what happened and what we’re doing about it

research · OpenAI Blog · 4/29/2025

We have rolled back last week’s GPT‑4o update in ChatGPT so people are now using an earlier version with more balanced behavior. The update we removed was overly flattering or agreeable—often described as sycophantic.

Thinking with images

research · OpenAI Blog · 4/16/2025

OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.

OpenAI o3 and o4-mini System Card

research · OpenAI Blog · 4/16/2025

OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities—web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory.

Introducing 4o Image Generation

research · OpenAI Blog · 3/25/2025

At OpenAI, we have long believed image generation should be a primary capability of our language models. That’s why we’ve built our most advanced image generator yet into GPT‑4o. The result—image generation that is not only beautiful, but useful.

Addendum to GPT-4o System Card: 4o image generation

research · OpenAI Blog · 3/25/2025

4o image generation is a new, significantly more capable image generation approach than our earlier DALL·E 3 series of models. It can create photorealistic output. It can take images as inputs and transform them.

Detecting misbehavior in frontier reasoning models

research · OpenAI Blog · 3/10/2025

Frontier reasoning models exploit loopholes when given the chance. We show we can detect exploits using an LLM to monitor their chains-of-thought. Penalizing their “bad thoughts” doesn’t stop the majority of misbehavior—it makes them hide their intent.

OpenAI GPT-4.5 System Card

research · OpenAI Blog · 2/27/2025

We’re releasing a research preview of OpenAI GPT‑4.5, our largest and most knowledgeable model yet.

Introducing GPT-4.5

research · OpenAI Blog · 2/27/2025

We’re releasing a research preview of GPT‑4.5—our largest and best model for chat yet. GPT‑4.5 is a step forward in scaling up pre-training and post-training.

Introducing the SWE-Lancer benchmark

research · OpenAI Blog · 2/18/2025

Can frontier LLMs earn $1 million from real-world freelance software engineering?

Build awesome datasets for video generation

research · Hugging Face Blog · 2/12/2025

Strengthening America’s AI leadership with the U.S. National Laboratories

research · OpenAI Blog · 1/30/2025

OpenAI’s latest line of reasoning models will be used by the nation’s leading scientists to drive scientific breakthroughs.

State of open video generation models in Diffusers

research · Hugging Face Blog · 1/27/2025

Boosting the customer retail experience with GPT-4o mini

research · OpenAI Blog · 12/11/2024

Zalando boosts the customer experience with its Assistant, powered by GPT-4o mini

Sora is here

research · OpenAI Blog · 12/9/2024

Our video generation model, Sora, is now available to use at sora.com. Users can generate videos up to 1080p resolution, up to 20 sec long, and in widescreen, vertical or square aspect ratios. You can bring your own assets to extend, remix, and blend, or generate entirely new content from text.

Sora System Card

research · OpenAI Blog · 12/9/2024

Sora is OpenAI’s video generation model, designed to take text, image, and video inputs and generate a new video as an output. Sora builds on learnings from DALL-E and GPT models, and is designed to give people expanded tools for storytelling and creative expression.

Vallée Duhamel & Sora

research · OpenAI Blog · 12/9/2024

Filmmaking duo Vallée Duhamel explains how Sora helps build new worlds.

Minne Atairu & Sora

research · OpenAI Blog · 12/9/2024

Interdisciplinary artist Minne Atairu discusses how Sora helps realize her vision.

Animator Lyndon Barrois creates new worlds with Sora

research · OpenAI Blog · 12/9/2024

Filmmaker Lyndon Barrois describes how to use Sora as a storytelling tool.