Research
2,213 articles total
Google rolls out Gemini in Chrome in 7 new countries
Google is rolling out Gemini in Chrome in Australia, Indonesia, Japan, the Philippines, Singapore, South Korea, and Vietnam. The company is rolling this feature out to both desktop and iOS in all of these countries except Japan.
Claude Token Counter, now with model comparisons
<p><strong><a href="https://tools.simonwillison.net/claude-token-counter">Claude Token Counter, now with model comparisons</a></strong></p> I <a href="https://github.com/simonw/tools/pull/269">upgraded</a> my Claude Token Counter tool to add the ability to run the same count against different models in order to compare them.</p> <p>As far as I can tell Claude Opus 4.7 is the first model to change the tokenizer, so it's only worth running comparisons between 4.7 and 4.6. The Claude <a href="https
Changes in the system prompt between Claude Opus 4.6 and 4.7
<p>Anthropic are the only major AI lab to <a href="https://platform.claude.com/docs/en/release-notes/system-prompts">publish the system prompts</a> for their user-facing chat systems. Their system prompt archive now dates all the way back to Claude 3 in July 2024 and it's always interesting to see how the system prompt evolves as they publish new models.</p> <p>Opus 4.7 shipped the other day (April 16, 2026) with a <a href="https://claude.ai/">Claude.ai</a> system prompt update since Opus 4.6 (F
Claude system prompts as a git timeline
<p><strong>Research:</strong> <a href="https://github.com/simonw/research/tree/main/extract-system-prompts#readme">Claude system prompts as a git timeline</a></p> <p>Anthropic <a href="https://platform.claude.com/docs/en/release-notes/system-prompts">publish the system prompts</a> for Claude chat and make that page <a href="https://platform.claude.com/docs/en/release-notes/system-prompts.md">available as Markdown</a>. I had Claude Code turn that page into separate files for each model and model
OpenAI’s former Sora boss is leaving
Last month, OpenAI gave up on its Sora video generation tool, and on Friday, the Sora team's leader, Bill Peebles, announced that he is leaving the company. OpenAI has been shifting its priorities as part of an effort to avoid "side quests," and Peebles' departure is just one of many recent changes as the company […]
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7
<p>For anyone who has been (inadvisably) taking my <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/">pelican riding a bicycle benchmark</a> seriously as a robust way to test models, here are pelicans from this morning's two big model releases - <a href="https://qwen.ai/blog?id=qwen3.6-35b-a3b">Qwen3.6-35B-A3B from Alibaba</a> and <a href="https://www.anthropic.com/news/claude-opus-4-7">Claude Opus 4.7 from Anthropic</a>.</p> <p>Here's the Qwen 3.6 pelican, generated using <a hre
New ways to create personalized images in the Gemini app
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/Blog_post_header.max-600x600.format-webp.webp">Nano Banana 2 now uses your personal context and Google Photos to create images that reflect your unique life.
Treating enterprise AI as an operating layer
There’s a fault line running through enterprise AI, and it’s not the one getting the most attention. The public conversation still tracks foundation models and benchmarks—GPT versus Gemini, reasoning scores, and marginal capability gains. But in practice, the more durable advantage is structural: who owns the operating layer where intelligence is applied, governed, and improved.…
Gemini 3.1 Flash TTS
<p><strong><a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-tts/">Gemini 3.1 Flash TTS</a></strong></p> Google released Gemini 3.1 Flash TTS today, a new text-to-speech model that can be directed using prompts.</p> <p>It's presented via the standard Gemini API using <code>gemini-3.1-flash-tts-preview</code> as the model ID, but can only output audio files.</p> <p>The <a href="https://ai.google.dev/gemini-api/docs/speech-generation#transcript-tags"
Gemini 3.1 Flash TTS
<p><strong>Tool:</strong> <a href="https://tools.simonwillison.net/gemini-flash-tts">Gemini 3.1 Flash TTS</a></p> <p>See <a href="https://simonwillison.net/2026/Apr/15/gemini-31-flash-tts/">my notes</a> on Google's new Gemini 3.1 Flash TTS text-to-speech model.</p> <p>Tags: <a href="https://simonwillison.net/tags/gemini">gemini</a>, <a href="https://simonwillison.net/tags/google">google</a></p>
Gemini 3.1 Flash TTS: the next generation of expressive AI speech
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/gemini-3.1-flash-tts_blog_keywo.max-600x600.format-webp.webp">Gemini 3.1 Flash TTS is now available across Google products.
At the HumanX conference, everyone was talking about Claude
Anthropic was the star of the show at San Francisco's AI-centric conference.
Anthropic temporarily banned OpenClaw’s creator from accessing Claude
This ban took place after Claude's pricing changed for OpenClaw users last week.
Meta's new model is Muse Spark, and meta.ai chat has some interesting tools
<p>Meta <a href="https://ai.meta.com/blog/introducing-muse-spark-msl/">announced Muse Spark</a> today, their first model release since Llama 4 <a href="https://simonwillison.net/2025/Apr/5/llama-4-notes/">almost exactly a year ago</a>. It's hosted, not open weights, and the API is currently "a private API preview to select users", but you can try it out today on <a href="https://meta.ai/">meta.ai</a> (Facebook or Instagram login required).</p> <p>Meta's self-reported benchmarks show it competiti
Cleanup Claude Code Paste
<p><strong>Tool:</strong> <a href="https://tools.simonwillison.net/cleanup-claude-code-paste">Cleanup Claude Code Paste</a></p> <p>Super-niche tool this. I sometimes copy prompts out of the Claude Code terminal app and they come out with a bunch of weird additional whitespace. This tool cleans that up.</p> <p><img alt="Screenshot of a web tool titled "Cleanup Claude Code Paste" with the subtitle "Paste terminal output to remove the ❯ prompt, fix wrapped-line whitespace, and join lines into clean
Anthropic says Claude Code subscribers will need to pay extra for OpenClaw usage
It’s about to become more expensive for Claude Code subscribers to use Anthropic’s coding assistant with OpenClaw and other third-party tools.
Claude Code Found a Linux Vulnerability Hidden for 23 Years
Article URL: https://mtlynch.io/claude-code-found-linux-vulnerability/ Comments URL: https://news.ycombinator.com/item?id=47633855 Points: 399 # Comments: 252
Tell HN: Anthropic no longer allowing Claude Code subscriptions to use OpenClaw
Received the following email from Anthropic: Hi, Starting April 4 at 12pm PT / 8pm BST, you’ll no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw. You can still use them with your Claude account, but they will require extra usage, a pay-as-you-go option billed separately from your subscription. Your subscription still covers all Claude products, including Claude Code and Claude Cowork. To keep using third-party harnesses with your Claude login,
llm-gemini 0.30
<p><strong>Release:</strong> <a href="https://github.com/simonw/llm-gemini/releases/tag/0.30">llm-gemini 0.30</a></p> <p>New models <code>gemini-3.1-flash-lite-preview</code>, <code>gemma-4-26b-a4b-it</code> and <code>gemma-4-31b-it</code>. See <a href="https://simonwillison.net/2026/Apr/2/gemma-4/">my notes on Gemma 4</a>.</p> <p>Tags: <a href="https://simonwillison.net/tags/gemini">gemini</a>, <a href="https://simonwillison.net/tags/llm">llm</a>, <a href="https://simonwillison.net/tags/gemma">
Welcome Gemma 4: Frontier multimodal intelligence on device
The Download: gig workers training humanoids, and better AI benchmarks
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. The gig workers who are training humanoid robots at home When Zeus, a medical student in Nigeria, returns to his apartment from a long day at the hospital, he straps his…
Build with Veo 3.1 Lite, our most cost-effective video generation model
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/veo31lite.max-600x600.format-webp.webp">Veo 3.1 Lite is now available in paid preview through the Gemini API and for testing in Google AI Studio.
Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents
Shifting to AI model customization is an architectural imperative
In the early days of large language models (LLMs), we grew accustomed to massive 10x jumps in reasoning and coding capability with every new model iteration. Today, those jumps have flattened into incremental gains. The exception is domain-specialized intelligence, where true step-function improvements are still the norm. When a model is fused with an organization’s…
AI benchmarks are broken. Here’s what we need instead.
For decades, artificial intelligence has been evaluated through the question of whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks. This framing is seductive: An AI vs. human comparison on isolated problems with clear…
Further human + AI + proof assistant work on Knuth's "Claude Cycles" problem
Knuth Claude's Cycles note update: problem now fully solved, by LLMs - https://news.ycombinator.com/item?id=47306926 - March 2026 (2 comments) https://chatgpt.com/share/69aaab4b-888c-8003-9a02-d1df80f9c7... Claude's Cycles [pdf] - https://news.ycombinator.com/item?id=47230710 - March 2026 (362 comments) Comments URL: https://news.ycombinator.com/item?id=47557166 Points: 221 # Comments: 148
Anthropic’s Claude popularity with paying consumers is skyrocketing
Estimates for total Claude consumer users are all over the map (we've seen figures ranging from 18 million to 30 million). Anthropic hasn't disclosed this data, but a spokesperson did tell TechCrunch that Claude paid subscriptions have more than doubled this year.
Why OpenAI killed Sora
On Tuesday morning, everything was business as usual at OpenAI. By the end of the day, the company had announced that it would scrap its video-generation app, Sora, and reverse plans for video generation inside ChatGPT; it would wind down a $1 billion Disney deal; it would shuffle the role of a high-level executive; and […]
Anatomy of the .claude/ folder
Article URL: https://blog.dailydoseofds.com/p/anatomy-of-the-claude-folder Comments URL: https://news.ycombinator.com/item?id=47543139 Points: 602 # Comments: 255
Google is making it easier to import another AI’s memory into Gemini
After Anthropic updated its tool for copying another AI's memory into Claude earlier this month, Google Gemini is rolling out new "Import Memory" and "Import Chat History" features on desktop that can help users quickly copy over everything their current AI already knows about them. To use the "Import Memory" tool, users copy and paste […]
ByteDance’s new AI video generation model, Dreamina Seedance 2.0, comes to CapCut
The new model in CapCut will have built-in protections for making video from real faces or unauthorized intellectual property.
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/gemini-3.1-flash-live_blog_head.max-600x600.format-webp.webp">Gemini 3.1 Flash Live is now available across Google products.
OpenAI’s Sora was the creepiest app on your phone — now it’s shutting down
Though the underlying Sora 2 video- and audio-generation model is scarily impressive, there was not sustained interest in an AI-only social feed.
Auto mode for Claude Code
<p><strong><a href="https://claude.com/blog/auto-mode">Auto mode for Claude Code</a></strong></p> Really interesting new development in Claude Code today as an alternative to <code>--dangerously-skip-permissions</code>:</p> <blockquote> <p>Today, we're introducing auto mode, a new permissions mode in Claude Code where Claude makes permission decisions on your behalf, with safeguards monitoring actions before they run.</p> </blockquote> <p>Those safeguards appear to be implemented using Claude So
Anthropic hands Claude Code more control, but keeps it on a leash
Anthropic’s new auto mode for Claude Code lets AI execute tasks with fewer approvals, reflecting a broader shift toward more autonomous tools that balance speed with safety through built-in safeguards.
Google TV’s new Gemini features keep fans updated on sports teams and more
Three Gemini-powered features are coming to your Google TV. This includes visual responses, deep dives, and sports briefs.
ChatGPT and Gemini are fighting to be the AI bot that sells you stuff
The AI-powered shopping rivalry is heating up as Google and OpenAI launch new features to help you buy things while interacting with their chatbots. Now, Google is teaming up with Gap Inc to allow its Gemini AI assistant to purchase clothes on your behalf from any of its stores, which include Gap, Old Navy, Banana […]
Claude Code Cheat Sheet
⌨️ Keyboard Shortcuts
General Controls: Ctrl+C Cancel input/generation; Ctrl+D Exit session; Ctrl+L Clear screen; Ctrl+O Toggle verbose output; Ctrl+R Reverse search history; Ctrl+G Open prompt in editor; Ctrl+B Background running task; Ctrl+T Toggle task list; Ctrl+V Paste image; Ctrl+F Kill background agents (×2); Esc Esc Rewind / undo
Mode Switching: Shift+Tab Cycle permission modes; Alt+P Switch model; Alt+T Toggle thinking
Input: \ Enter Newline (quick); Ctrl+J Newline (control seq)
Prefixes: / Slash command; ! Direct bash; @ File mention + autocomplete
Session Picker: ↑↓ Navigate; ←→ Expand/collapse; P Preview; R Rename; / Search; A All projects; B Current branch
🔌 MCP Servers
Add Servers: --transport http Remote HTTP (recommended); --transport stdio Local process; --transport sse Remote SSE
Scopes: Local ~/.claude.json (per project); Project .mcp.json (shared/VCS); User ~/.claude.json (global)
Manage: /mcp Interactive UI; claude mcp list List all servers; claude mcp serve CC as MCP server
Elicitation: Servers request input mid-task (NEW)
⚡ Slash Commands
Session: /clear Clear conversation; /compact [focus] Compact context; /resume Resume/switch session; /rename [name] Name current session; /branch [name] Branch conversation (/fork alias); /cost Token usage stats; /context Visualize context (grid); /diff Interactive diff viewer; /copy Copy last response; /export Export conversation
Config: /config Open settings; /model [model] Switch model (←→ effort); /fast [on|off] Toggle fast mode; /vim Toggle vim mode; /theme Change color theme; /permissions View/update permissions; /effort [level] Set effort (low/med/high) (NEW); /color [color] Set prompt-bar color
Tools: /init Create CLAUDE.md; /memory Edit CLAUDE.md files; /mcp Manage MCP servers; /hooks Manage hooks; /skills List available skills; /agents Manage agents; /chrome Chrome integration; /reload-plugins Hot-reload plugins
Special: /btw <question> Side question (no context); /plan [desc] Plan mode (+ auto-start); /loop [interval] Schedule recurring task; /voice Push-to-talk voice (20 langs); /doctor Diagnose installation; /rc Enable remote control; /pr-comments [PR] Fetch GitHub PR comments; /stats Usage streaks & prefs; /insights Analyze sessions report; /desktop Continue in Desktop app; /remote-control Bridge terminal to claude.ai/code (NEW); /stickers Order stickers! 🎉
📁 Memory & Files
CLAUDE.md Locations: ./CLAUDE.md Project (team-shared); ~/.claude/CLAUDE.md Personal (all projects); /etc/claude-code/ Managed (org-wide)
Rules & Import: .claude/rules/*.md Project rules; ~/.claude/rules/*.md User rules; paths: frontmatter Path-specific rules; @path/to/file Import in CLAUDE.md
Auto Memory: ~/.claude/projects/<proj>/memory/ MEMORY.md + topic files, auto-loaded
🧠 Workflows & Tips
Plan Mode: Shift+Tab Normal → Auto → Plan; --permission-mode plan Start in plan mode
Thinking & Effort: Alt+T Toggle thinking on/off; "ultrathink" Max effort for turn; Ctrl+O See thinking (verbose); /effort ○ low · ◐ med · ● high (NEW)
Git Worktrees: --worktree name Isolated branch per feature; isolation: worktree Agent in own worktree; sparsePaths Checkout only needed dirs (NEW); /batch Auto-creates worktrees
Voice Mode: /voice Enable push-to-talk; Space (hold) Record, release to send; 20 languages EN, ES, FR, DE, CZ, PL…
Context Management: /context Usage + optimization tips; /compact [focus] Compress with focus; Auto-compact ~95% capacity; 1M context Opus 4.6 (Max/Team/Ent); CLAUDE.md Survives compaction!
Session Power Moves: claude -c Continue last conv; claude -r "name" Resume by name; /btw question Side Q, no context cost
SDK / Headless: claude -p "query" Non-interactive; --output-format json Structured output; --max-budget-usd 5 Cost cap; cat file | claude -p Pipe input
Scheduling & Remote: /loop 5m msg Recurring task; /rc Remote control; --remote Web session on claude.ai
⚙️ Config & Env
Config Files: ~/.claude/settings.json User settings; .claude/settings.json Project (shared); .claude/settings.local.json Local only; ~/.claude.json OAuth, MCP, state; .mcp.json Project MCP servers
Key Settings: modelOverrides Map model picker → custom IDs; autoMemoryDirectory Custom memory dir; worktree.sparsePaths Sparse checkout dirs (NEW)
Key Env Vars: ANTHROPIC_API_KEY; ANTHROPIC_MODEL; CLAUDE_CODE_EFFORT_LEVEL low/med/high; MAX_THINKING_TOKENS 0=off; ANTHROPIC_CUSTOM_MODEL_OPTION Custom /model entry; CLAUDE_CODE_PLUGIN_SEED_DIR Multiple plugin seed dirs
🔧 Skills & Agents
Built-in Skills: /simplify Code review (3 parallel agents); /batch Large parallel changes (5-30 worktrees); /debug [desc] Troubleshoot from debug log; /loop [interval] Recurring scheduled task; /claude-api Load API + SDK reference
Custom Skill Locations: .claude/skills/<name>/ Project skills; ~/.claude/skills/<name>/ Personal skills
Skill Frontmatter: description Auto-invocation trigger; allowed-tools Skip permission prompts; model Override model for skill; effort Override effort level (NEW); context: fork Run in subagent; $ARGUMENTS User input placeholder; ${CLAUDE_SKILL_DIR} Skill's own directory; !`cmd` Dynamic context injection
Built-in Agents: Explore Fast read-only (Haiku); Plan Re
Creating with Sora Safely
To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in concrete protections.
SQLite Tags Benchmark: Comparing 5 Tagging Strategies
<p><strong>Research:</strong> <a href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme">SQLite Tags Benchmark: Comparing 5 Tagging Strategies</a></p> <p>I had Claude Code run a micro-benchmark comparing different approaches to implementing tagging in SQLite. Traditional many-to-many tables won, but FTS5 came a close second. Full table scans with LIKE queries performed better than I expected, but full table scans with JSON arrays and <code>json_each()</code> were much sl
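For readers unfamiliar with the strategies named in this summary, here is a minimal sketch of two of them: a traditional many-to-many join table and a JSON array queried with json_each(). The schema, table names, and sample rows are illustrative assumptions, not the benchmark's actual setup, and the sketch measures nothing. It only shows that both layouts answer the same "posts with this tag" question. It assumes the SQLite build behind Python's sqlite3 module includes the JSON functions.

```python
import json
import sqlite3

# Hypothetical schema for illustration only; not taken from the benchmark repo.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, tags_json TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE post_tags (
        post_id INTEGER REFERENCES posts(id),
        tag_id INTEGER REFERENCES tags(id),
        PRIMARY KEY (post_id, tag_id)
    );
    -- Lets the join start from a tag and find its posts without a scan.
    CREATE INDEX post_tags_by_tag ON post_tags (tag_id, post_id);
    """
)

sample = {
    "Pelican post": ["llm", "svg"],
    "SQLite post": ["sqlite", "benchmark"],
    "Tooling post": ["llm", "sqlite"],
}
for title, tag_names in sample.items():
    # Store the tags twice: as a JSON array and as rows in the join table.
    cur = conn.execute(
        "INSERT INTO posts (title, tags_json) VALUES (?, ?)",
        (title, json.dumps(tag_names)),
    )
    post_id = cur.lastrowid
    for name in tag_names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute("INSERT INTO post_tags VALUES (?, ?)", (post_id, tag_id))


def titles_many_to_many(tag):
    """Strategy 1: join through the post_tags table."""
    rows = conn.execute(
        """
        SELECT posts.title FROM posts
        JOIN post_tags ON post_tags.post_id = posts.id
        JOIN tags ON tags.id = post_tags.tag_id
        WHERE tags.name = ?
        ORDER BY posts.title
        """,
        (tag,),
    )
    return [row[0] for row in rows]


def titles_json_each(tag):
    """Strategy 2: expand each row's JSON array, which scans every post."""
    rows = conn.execute(
        """
        SELECT posts.title FROM posts, json_each(posts.tags_json)
        WHERE json_each.value = ?
        ORDER BY posts.title
        """,
        (tag,),
    )
    return [row[0] for row in rows]


print(titles_many_to_many("sqlite"))
print(titles_json_each("sqlite"))
```

The join-table query can use an index on the tag, while the json_each() query has to unpack the array on every row. That difference is consistent with the ordering the summary reports, though this sketch does not time anything.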
Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally
<p><strong><a href="https://twitter.com/danveloper/status/2034353876753592372">Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally</a></strong></p> Here's a fascinating piece of research by Dan Woods, who managed to get a custom version of <a href="https://huggingface.co/Qwen/Qwen3.5-397B-A17B/tree/main">Qwen3.5-397B-A17B</a> running at 5.5+ tokens/second on a 48GB MacBook Pro M3 Max despite that model taking up 209GB (120GB quantized) on disk.</p> <p>Qwen3.5-397B-A17B is a Mixture
GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52
<p>OpenAI today: <a href="https://openai.com/index/introducing-gpt-5-4-mini-and-nano/">Introducing GPT‑5.4 mini and nano</a>. These models join GPT-5.4 which was released <a href="https://openai.com/index/introducing-gpt-5-4/">two weeks ago</a>.</p> <p>OpenAI's self-reported benchmarks show the new 5.4-nano out-performing their previous GPT-5 mini model when run at maximum reasoning effort. The new mini is also 2x faster than the previous mini.</p> <p>Here's how the pricing looks - all prices ar
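The headline figure can be turned into a per-photo cost with simple division. This sketch uses only the two numbers in the title (76,000 photos, $52). It assumes nothing about which model, token counts, or per-token prices produced them, since the pricing table is cut off above.

```python
# Both inputs come from the headline; everything else is derived from them.
total_cost_usd = 52.0
photo_count = 76_000

cost_per_photo = total_cost_usd / photo_count
cost_per_thousand = cost_per_photo * 1000

print(f"${cost_per_photo:.6f} per photo")
print(f"${cost_per_thousand:.2f} per 1,000 photos")
```

That works out to well under a tenth of a cent per description.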
Introducing GPT-5.4 mini and nano
GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.
Introducing Mistral Small 4
<p><strong><a href="https://mistral.ai/news/mistral-small-4">Introducing Mistral Small 4</a></strong></p> Big new release from Mistral today (despite the name) - a new Apache 2 licensed 119B parameter (Mixture-of-Experts, 6B active) model which they describe like this:</p> <blockquote> <p>Mistral Small 4 is the first Mistral model to unify the capabilities of our flagship models, Magistral for reasoning, Pixtral for multimodal, and Devstral for agentic coding, into a single, versatile model.</p>
1M context is now generally available for Opus 4.6 and Sonnet 4.6
<p><strong><a href="https://claude.com/blog/1m-context-ga">1M context is now generally available for Opus 4.6 and Sonnet 4.6</a></strong></p> Here's what surprised me:</p> <blockquote> <p>Standard pricing now applies across the full 1M window for both models, with no long-context premium.</p> </blockquote> <p>OpenAI and Gemini both <a href="https://www.llm-prices.com/#sel=gemini-3-1-pro-preview-200k%2Cgpt-5.4-272k%2Cgemini-3-1-pro-preview%2Cgpt-5.4">charge more</a> for prompts where the token co
Gemini in Google Sheets just achieved state-of-the-art performance.
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/Workspace_Jan_Moment_Sheets_Blo.max-600x600.format-webp.webp">Today we announced new beta features for Gemini in Sheets to help you create, organize and edit entire sheets, from basic tasks to complex data analysis — just describe …
GPT-5.4 Thinking System Card
Reasoning models struggle to control their chains of thought, and that’s good
OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.
Introducing GPT-5.4
Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.
Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines
Gemini 3.1 Flash-Lite: Built for intelligence at scale
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/gemini-3.1_flash_Lite_blog_keyw.max-600x600.format-webp.webp">Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.
GPT-5.3 Instant: Smoother, more useful everyday conversations
GPT-5.3 Instant System Card
Build with Nano Banana 2, our best image generation and editing model
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/BuildWith_Hero.max-600x600.format-webp.webp">Nano Banana 2 (Gemini 3.1 Flash Image) delivers Pro-level intelligence and fidelity for all image applications.
GGML and llama.cpp join HF to ensure the long-term progress of Local AI
A new way to express yourself: Gemini can now create music
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/0217_KeywordHeaderFinalc.max-600x600.format-webp.webp">Lyria 3 is now available in the Gemini app. Create custom, high-quality 30-second tracks from text and images.
GPT-5.2 derives a new result in theoretical physics
A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.
Custom Kernels for All from Codex and Claude
Introducing GPT-5.3-Codex-Spark
Introducing GPT-5.3-Codex-Spark—our first real-time coding model. 15x faster generation, 128k context, now in research preview for ChatGPT Pro users.
GPT-5 lowers the cost of cell-free protein synthesis
An autonomous lab combining OpenAI’s GPT-5 with Ginkgo Bioworks’ cloud automation cut cell-free protein synthesis costs by 40% through closed-loop experimentation.
GPT-5.3-Codex System Card
GPT‑5.3-Codex is the most capable agentic coding model to date, combining the frontier coding performance of GPT‑5.2-Codex with the reasoning and professional knowledge capabilities of GPT‑5.2.
Introducing GPT-5.3-Codex
GPT-5.3-Codex is a Codex-native agent that pairs frontier coding performance with general reasoning to support long-horizon, real-world technical work.
VfL Wolfsburg turns ChatGPT into a club-wide capability
By focusing on people, not pilots, the Bundesliga club is scaling efficiency, creativity, and knowledge—without losing its football identity.
The Sora feed philosophy
Discover the Sora feed philosophy—built to spark creativity, foster connections, and keep experiences safe with personalized recommendations, parental controls, and strong guardrails.
Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT
On February 13, 2026, alongside the previously announced retirement of GPT‑5 (Instant, Thinking, and Pro), we will retire GPT‑4o, GPT‑4.1, GPT‑4.1 mini, and OpenAI o4-mini from ChatGPT. In the API, there are no changes at this time.
Inside Praktika's conversational approach to language learning
How Praktika uses GPT-4.1 and GPT-5.2 to build adaptive AI tutors that personalize lessons, track progress, and help learners achieve real-world language fluency.
Inside GPT-5 for Work: How Businesses Use GPT-5
A data-driven report on how workers across industries use ChatGPT—covering adoption trends, top tasks, departmental patterns, and the future of AI at work.
How Higgsfield turns simple ideas into cinematic social videos
Discover how Higgsfield gives creators cinematic, social-first video output from simple inputs using OpenAI GPT-4.1, GPT-5, and Sora 2.
How countries can end the capability overhang
Our latest report reveals stark differences in advanced AI adoption across countries and outlines new initiatives to help nations capture productivity gains from AI.
Introducing Waypoint-1: Real-time interactive video diffusion from Overworld
How Tolan builds voice-first AI with GPT-5.1
Tolan built a voice-first AI companion with GPT-5.1, combining low-latency responses, real-time context reconstruction, and memory-driven personalities for natural conversations.
NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI
Introducing GPT-5.2-Codex
GPT-5.2-Codex is OpenAI’s most advanced coding model, offering long-horizon reasoning, large-scale code transformations, and enhanced cybersecurity capabilities.
Measuring AI’s capability to accelerate biological research
OpenAI introduces a real-world evaluation framework to measure how AI can accelerate biological research in the wet lab. Using GPT-5 to optimize a molecular cloning protocol, the work explores both the promise and risks of AI-assisted experimentation.
How We Used Codex to Ship Sora for Android in 28 Days
OpenAI shipped Sora for Android in 28 days using Codex. AI-assisted planning, translation, and parallel coding workflows helped a nimble team deliver rapid, reliable development.
New in llama.cpp: Model Management
Advancing science and math with GPT-5.2
GPT-5.2 is OpenAI’s strongest model yet for math and science, setting new state-of-the-art results on benchmarks like GPQA Diamond and FrontierMath. This post shows how those gains translate into real research progress, including solving an open theoretical problem and generating reliable mathematical proofs.
Introducing GPT-5.2
GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to power faster, more reliable agentic workflows.
Update to GPT-5 System Card: GPT-5.2
GPT-5.2 is the latest model family in the GPT-5 series. The comprehensive safety mitigation approach for these models is largely the same as that described in the GPT-5 System Card and GPT-5.1 System Card. Like OpenAI’s other models, the GPT-5.2 models were trained on diverse datasets, including information that is publicly available on the internet, information that we partner with third parties to access, and information that our users or human trainers and researchers provide or generate.
Building Deep Research: How we Achieved State of the Art
GPT-5 and the future of mathematical discovery
UCLA Professor Ernest Ryu and GPT-5 solved a key question in optimization theory, showcasing AI’s role in accelerating mathematical discovery.
Early experiments in accelerating science with GPT-5
OpenAI introduces the first research cases showing how GPT-5 accelerates scientific progress across math, physics, biology, and computer science. Explore how AI and researchers collaborate to generate proofs, uncover new insights, and reshape the pace of discovery.
Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models
Building more with GPT-5.1-Codex-Max
Introducing GPT-5.1-Codex-Max, a faster, more intelligent agentic coding model for Codex. The model is designed for long-running, project-scale work with enhanced reasoning and token efficiency.
Understanding neural networks through sparse circuits
OpenAI is exploring mechanistic interpretability to understand how neural networks reason. Our new sparse model approach could make AI systems more transparent and support safer, more reliable behavior.
Introducing GPT-5.1 for developers
GPT-5.1 is now available in the API, bringing faster adaptive reasoning, extended prompt caching, improved coding performance, and new apply_patch and shell tools.
GPT-5.1: A smarter, more conversational ChatGPT
We’re upgrading the GPT-5 series with warmer, more capable models and new ways to customize ChatGPT’s tone and style. GPT-5.1 starts rolling out today to paid users.
Introducing IndQA
OpenAI introduces IndQA, a new benchmark for evaluating AI systems in Indian languages. Built with domain experts, IndQA tests cultural understanding and reasoning across 12 languages and 10 knowledge areas.
Addendum to GPT-5 System Card: Sensitive conversations
This system card details GPT-5’s improvements in handling sensitive conversations, including new benchmarks for emotional reliance, mental health, and jailbreak resistance.
With GPT-5, Wrtn builds lifestyle AI for millions in Korea
Wrtn scaled AI apps to 6.5M users in Korea with GPT-5, creating ‘Lifestyle AI’ that blends productivity, creativity, and learning—now expanding across East Asia.
SOTA OCR with Core ML and dots.ocr
Sora 2 is here
Our latest video generation model is more physically accurate, realistic, and controllable than prior systems. It also features synchronized dialogue and sound effects. Create with it in the new Sora app.
Launching Sora responsibly
To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in concrete protections.
Sora 2 System Card
Sora 2 is our new state-of-the-art video and audio generation model. Building on the foundation of Sora, this new model introduces capabilities that have been difficult for prior video models to achieve, such as more accurate physics, sharper realism, synchronized audio, enhanced steerability, and an expanded stylistic range.
Creating a safe, observable AI infrastructure for 1 million classrooms
Discover how SchoolAI, built on OpenAI’s GPT-4.1, image generation, and TTS, powers safe, teacher-guided AI tools for 1 million classrooms worldwide—boosting engagement, oversight, and personalized learning.
GPT-5 bio bug bounty call
OpenAI invites researchers to its Bio Bug Bounty. Test GPT-5’s safety with a universal jailbreak prompt and win up to $25,000.
Introducing GPT-5 for developers
Introducing GPT-5 in our API platform—offering high reasoning performance, new controls for devs, and best-in-class results on real coding tasks.
Coding and design with GPT-5
Learn how GPT-5 unlocks new possibilities in coding and design.
Creative writing with GPT-5
Learn how GPT-5 assists with creative writing.