Research
1,752 articles total
Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally
<p><strong><a href="https://twitter.com/danveloper/status/2034353876753592372">Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally</a></strong></p> <p>Here's a fascinating piece of research by Dan Woods, who managed to get a custom version of <a href="https://huggingface.co/Qwen/Qwen3.5-397B-A17B/tree/main">Qwen3.5-397B-A17B</a> running at 5.5+ tokens/second on a 48GB MacBook Pro M3 Max, despite that model taking up 209GB (120GB quantized) on disk.</p> <p>Qwen3.5-397B-A17B is a Mixture-of-Experts…</p>
GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52
<p>OpenAI today: <a href="https://openai.com/index/introducing-gpt-5-4-mini-and-nano/">Introducing GPT‑5.4 mini and nano</a>. These models join GPT-5.4, which was released <a href="https://openai.com/index/introducing-gpt-5-4/">two weeks ago</a>.</p> <p>OpenAI's self-reported benchmarks show the new 5.4-nano out-performing their previous GPT-5 mini model when run at maximum reasoning effort. The new mini is also 2x faster than the previous mini.</p> <p>Here's how the pricing looks - all prices are…</p>
Introducing GPT-5.4 mini and nano
GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.
Introducing Mistral Small 4
<p><strong><a href="https://mistral.ai/news/mistral-small-4">Introducing Mistral Small 4</a></strong></p> <p>Big new release from Mistral today (despite the name) - a new Apache 2 licensed 119B parameter (Mixture-of-Experts, 6B active) model which they describe like this:</p> <blockquote> <p>Mistral Small 4 is the first Mistral model to unify the capabilities of our flagship models, Magistral for reasoning, Pixtral for multimodal, and Devstral for agentic coding, into a single, versatile model.</p> </blockquote>
1M context is now generally available for Opus 4.6 and Sonnet 4.6
<p><strong><a href="https://claude.com/blog/1m-context-ga">1M context is now generally available for Opus 4.6 and Sonnet 4.6</a></strong></p> <p>Here's what surprised me:</p> <blockquote> <p>Standard pricing now applies across the full 1M window for both models, with no long-context premium.</p> </blockquote> <p>OpenAI and Gemini both <a href="https://www.llm-prices.com/#sel=gemini-3-1-pro-preview-200k%2Cgpt-5.4-272k%2Cgemini-3-1-pro-preview%2Cgpt-5.4">charge more</a> for prompts where the token count…</p>
Gemini in Google Sheets just achieved state-of-the-art performance.
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/Workspace_Jan_Moment_Sheets_Blo.max-600x600.format-webp.webp">Today we announced new beta features for Gemini in Sheets to help you create, organize and edit entire sheets, from basic tasks to complex data analysis — just describe …
GPT-5.4 Thinking System Card
Reasoning models struggle to control their chains of thought, and that’s good
OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.
Introducing GPT-5.4
Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.
Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines
Gemini 3.1 Flash-Lite: Built for intelligence at scale
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/gemini-3.1_flash_Lite_blog_keyw.max-600x600.format-webp.webp">Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.
GPT-5.3 Instant: Smoother, more useful everyday conversations
GPT-5.3 Instant System Card
Build with Nano Banana 2, our best image generation and editing model
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/BuildWith_Hero.max-600x600.format-webp.webp">Nano Banana 2 (Gemini 3.1 Flash Image) delivers Pro-level intelligence and fidelity for all image applications.
GGML and llama.cpp join HF to ensure the long-term progress of Local AI
A new way to express yourself: Gemini can now create music
<img src="https://storage.googleapis.com/gweb-uniblog-publish-prod/images/0217_KeywordHeaderFinalc.max-600x600.format-webp.webp">Lyria 3 is now available in the Gemini app. Create custom, high-quality 30-second tracks from text and images.
GPT-5.2 derives a new result in theoretical physics
A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.
Custom Kernels for All from Codex and Claude
Introducing GPT-5.3-Codex-Spark
Introducing GPT-5.3-Codex-Spark—our first real-time coding model. 15x faster generation, 128k context, now in research preview for ChatGPT Pro users.
GPT-5 lowers the cost of cell-free protein synthesis
An autonomous lab combining OpenAI’s GPT-5 with Ginkgo Bioworks’ cloud automation cut cell-free protein synthesis costs by 40% through closed-loop experimentation.
GPT-5.3-Codex System Card
GPT‑5.3-Codex is the most capable agentic coding model to date, combining the frontier coding performance of GPT‑5.2-Codex with the reasoning and professional knowledge capabilities of GPT‑5.2.
Introducing GPT-5.3-Codex
GPT-5.3-Codex is a Codex-native agent that pairs frontier coding performance with general reasoning to support long-horizon, real-world technical work.
VfL Wolfsburg turns ChatGPT into a club-wide capability
By focusing on people, not pilots, the Bundesliga club is scaling efficiency, creativity, and knowledge—without losing its football identity.
The Sora feed philosophy
Discover the Sora feed philosophy—built to spark creativity, foster connections, and keep experiences safe with personalized recommendations, parental controls, and strong guardrails.
Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT
On February 13, 2026, alongside the previously announced retirement of GPT‑5 (Instant, Thinking, and Pro), we will retire GPT‑4o, GPT‑4.1, GPT‑4.1 mini, and OpenAI o4-mini from ChatGPT. In the API, there are no changes at this time.
Inside Praktika's conversational approach to language learning
How Praktika uses GPT-4.1 and GPT-5.2 to build adaptive AI tutors that personalize lessons, track progress, and help learners achieve real-world language fluency.
Inside GPT-5 for Work: How Businesses Use GPT-5
A data-driven report on how workers across industries use ChatGPT—covering adoption trends, top tasks, departmental patterns, and the future of AI at work.
How Higgsfield turns simple ideas into cinematic social videos
Discover how Higgsfield gives creators cinematic, social-first video output from simple inputs using OpenAI GPT-4.1, GPT-5, and Sora 2.
How countries can end the capability overhang
Our latest report reveals stark differences in advanced AI adoption across countries and outlines new initiatives to help nations capture productivity gains from AI.
Introducing Waypoint-1: Real-time interactive video diffusion from Overworld
How Tolan builds voice-first AI with GPT-5.1
Tolan built a voice-first AI companion with GPT-5.1, combining low-latency responses, real-time context reconstruction, and memory-driven personalities for natural conversations.
NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI
Introducing GPT-5.2-Codex
GPT-5.2-Codex is OpenAI’s most advanced coding model, offering long-horizon reasoning, large-scale code transformations, and enhanced cybersecurity capabilities.
Measuring AI’s capability to accelerate biological research
OpenAI introduces a real-world evaluation framework to measure how AI can accelerate biological research in the wet lab. Using GPT-5 to optimize a molecular cloning protocol, the work explores both the promise and risks of AI-assisted experimentation.
How We Used Codex to Ship Sora for Android in 28 Days
OpenAI shipped Sora for Android in 28 days using Codex. AI-assisted planning, translation, and parallel coding workflows helped a nimble team deliver rapid, reliable development.
New in llama.cpp: Model Management
Advancing science and math with GPT-5.2
GPT-5.2 is OpenAI’s strongest model yet for math and science, setting new state-of-the-art results on benchmarks like GPQA Diamond and FrontierMath. This post shows how those gains translate into real research progress, including solving an open theoretical problem and generating reliable mathematical proofs.
Introducing GPT-5.2
GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to power faster, more reliable agentic workflows.
Update to GPT-5 System Card: GPT-5.2
GPT-5.2 is the latest model family in the GPT-5 series. The comprehensive safety mitigation approach for these models is largely the same as that described in the GPT-5 System Card and GPT-5.1 System Card. Like OpenAI’s other models, the GPT-5.2 models were trained on diverse datasets, including information that is publicly available on the internet, information that we partner with third parties to access, and information that our users or human trainers and researchers provide or generate.
Building Deep Research: How we Achieved State of the Art
GPT-5 and the future of mathematical discovery
UCLA Professor Ernest Ryu and GPT-5 solved a key question in optimization theory, showcasing AI’s role in accelerating mathematical discovery.
Early experiments in accelerating science with GPT-5
OpenAI introduces the first research cases showing how GPT-5 accelerates scientific progress across math, physics, biology, and computer science. Explore how AI and researchers collaborate to generate proofs, uncover new insights, and reshape the pace of discovery.
Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models
Building more with GPT-5.1-Codex-Max
Introducing GPT-5.1-Codex-Max, a faster, more intelligent agentic coding model for Codex. The model is designed for long-running, project-scale work with enhanced reasoning and token efficiency.
Understanding neural networks through sparse circuits
OpenAI is exploring mechanistic interpretability to understand how neural networks reason. Our new sparse model approach could make AI systems more transparent and support safer, more reliable behavior.
Introducing GPT-5.1 for developers
GPT-5.1 is now available in the API, bringing faster adaptive reasoning, extended prompt caching, improved coding performance, and new apply_patch and shell tools.
GPT-5.1: A smarter, more conversational ChatGPT
We’re upgrading the GPT-5 series with warmer, more capable models and new ways to customize ChatGPT’s tone and style. GPT-5.1 starts rolling out today to paid users.
Introducing IndQA
OpenAI introduces IndQA, a new benchmark for evaluating AI systems in Indian languages. Built with domain experts, IndQA tests cultural understanding and reasoning across 12 languages and 10 knowledge areas.
Addendum to GPT-5 System Card: Sensitive conversations
This system card details GPT-5’s improvements in handling sensitive conversations, including new benchmarks for emotional reliance, mental health, and jailbreak resistance.
With GPT-5, Wrtn builds lifestyle AI for millions in Korea
Wrtn scaled AI apps to 6.5M users in Korea with GPT-5, creating ‘Lifestyle AI’ that blends productivity, creativity, and learning—now expanding across East Asia.
SOTA OCR with Core ML and dots.ocr
Sora 2 is here
Our latest video generation model is more physically accurate, realistic, and controllable than prior systems. It also features synchronized dialogue and sound effects. Create with it in the new Sora app.
Launching Sora responsibly
To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in concrete protections.
Sora 2 System Card
Sora 2 is our new state-of-the-art video and audio generation model. Building on the foundation of Sora, this new model introduces capabilities that have been difficult for prior video models to achieve, such as more accurate physics, sharper realism, synchronized audio, enhanced steerability, and an expanded stylistic range.
Creating a safe, observable AI infrastructure for 1 million classrooms
Discover how SchoolAI, built on OpenAI’s GPT-4.1, image generation, and TTS, powers safe, teacher-guided AI tools for 1 million classrooms worldwide—boosting engagement, oversight, and personalized learning.
GPT-5 bio bug bounty call
OpenAI invites researchers to its Bio Bug Bounty. Test GPT-5’s safety with a universal jailbreak prompt and win up to $25,000.
Introducing GPT-5 for developers
Introducing GPT-5 in our API platform—offering high reasoning performance, new controls for devs, and best-in-class results on real coding tasks.
Coding and design with GPT-5
Learn how GPT-5 unlocks new possibilities in coding and design.
Creative writing with GPT-5
Learn how GPT-5 assists with creative writing.
First look at GPT-5
See how a group of leading developers use GPT-5 for the first time.
Introducing GPT-5
We are introducing GPT‑5, our best AI system yet. GPT‑5 is a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health, visual perception, and more.
GPT-5 System Card
This GPT-5 system card explains how a unified model routing system powers fast and smart responses using gpt-5-main, gpt-5-thinking, and lightweight versions like gpt-5-thinking-nano, optimized for different tasks and developer use.
Measuring Open-Source Llama Nemotron Models on DeepResearch Bench
📚 3LM: A Benchmark for Arabic LLMs in STEM and Code
TimeScope: How Long Can Your Video Large Multimodal Model Go?
Ettin Suite: SoTA Paired Encoders and Decoders
Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models
Efficient MultiModal Data Pipeline
Shipping code faster with o3, o4-mini, and GPT-4.1
CodeRabbit uses OpenAI models to revolutionize code reviews—boosting accuracy, accelerating PR merges, and helping developers ship faster with fewer bugs and higher ROI.
Falcon-Arabic: A Breakthrough in Arabic Language Models
Vision Language Models (Better, faster, stronger)
The 4 Things Qwen-3’s Chat Template Teaches Us
Sycophancy in GPT-4o: what happened and what we’re doing about it
We have rolled back last week’s GPT‑4o update in ChatGPT so people are now using an earlier version with more balanced behavior. The update we removed was overly flattering or agreeable—often described as sycophantic.
Thinking with images
OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.
OpenAI o3 and o4-mini System Card
OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities—web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory.
Visual Salamandra: Pushing the Boundaries of Multimodal Understanding
Introducing 4o Image Generation
At OpenAI, we have long believed image generation should be a primary capability of our language models. That’s why we’ve built our most advanced image generator yet into GPT‑4o. The result—image generation that is not only beautiful, but useful.
Addendum to GPT-4o System Card: 4o image generation
4o image generation is a new, significantly more capable image generation approach than our earlier DALL·E 3 series of models. It can create photorealistic output. It can take images as inputs and transform them.
Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM
Detecting misbehavior in frontier reasoning models
Frontier reasoning models exploit loopholes when given the chance. We show we can detect exploits using an LLM to monitor their chains-of-thought. Penalizing their “bad thoughts” doesn’t stop the majority of misbehavior—it makes them hide their intent.
A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality
OpenAI GPT-4.5 System Card
We’re releasing a research preview of OpenAI GPT‑4.5, our largest and most knowledgeable model yet.
Introducing GPT-4.5
We’re releasing a research preview of GPT‑4.5—our largest and best model for chat yet. GPT‑4.5 is a step forward in scaling up pre-training and post-training.
SigLIP 2: A better multilingual vision language encoder
PaliGemma 2 Mix - New Instruction Vision Language Models by Google
Introducing the SWE-Lancer benchmark
Can frontier LLMs earn $1 million from real-world freelance software engineering?
Build awesome datasets for video generation
DABStep: Data Agent Benchmark for Multi-step Reasoning
Strengthening America’s AI leadership with the U.S. National Laboratories
OpenAI’s latest line of reasoning models will be used by the nation’s leading scientists to drive scientific breakthroughs.
State of open video generation models in Diffusers
Benchmarking Language Model Performance on 5th Gen Xeon at GCP
Boosting the customer retail experience with GPT-4o mini
Zalando boosts the customer experience with its Assistant, powered by GPT-4o mini
Sora is here
Our video generation model, Sora, is now available to use at sora.com. Users can generate videos up to 1080p resolution, up to 20 sec long, and in widescreen, vertical or square aspect ratios. You can bring your own assets to extend, remix, and blend, or generate entirely new content from text.
Sora System Card
Sora is OpenAI’s video generation model, designed to take text, image, and video inputs and generate a new video as an output. Sora builds on learnings from DALL-E and GPT models, and is designed to give people expanded tools for storytelling and creative expression.
Vallée Duhamel & Sora
Filmmaking duo Vallée Duhamel explains how Sora helps build new worlds.
Minne Atairu & Sora
Interdisciplinary artist Minne Atairu discusses how Sora helps realize her vision.
Animator Lyndon Barrois creates new worlds with Sora
Filmmaker Lyndon Barrois describes how to use Sora as a storytelling tool.