AI Development
1,752 articles total
OpenCode – Open source AI coding agent
Article URL: https://opencode.ai/ Comments URL: https://news.ycombinator.com/item?id=47460525 Points: 887 # Comments: 412
Microsoft rolls back some of its Copilot AI bloat on Windows
The company is reducing Copilot entry points on Windows, starting with Photos, Widgets, Notepad, and other apps.
Build a Domain-Specific Embedding Model in Under a Day
Published March 20, 2026 · Steve H, Rucha Apte, Sean Sodha, and Oliver Holworthy (NVIDIA)

If you are building a RAG (Retrieval-Augmented Generation) system, you have likely hit this wall: everything works… until it doesn't. General-purpose embedding models are trained to understand the internet, not your contracts, manufacturing logs, proprietary chemical formulations, or internal taxonomy. They capture broad semantic similarity, but they miss the fine-grained distinctions that matter in your domain. Fine-tuning an embedding model can improve the performance of your retrieval pipeline when off-the-shelf models fail to capture those domain-specific nuances.

Despite how critical embeddings are to RAG performance, the fine-tuning process remains surprisingly fragmented, the skills required are specialized, and the time investment is daunting. With a single GPU and less than a day of training time, you can transform a general-purpose embedding model into one that truly understands your domain, with no manual labeling required. To help you hit the ground running, we are also releasing a ready-to-use synthetic training dataset generated from NVIDIA's public documentation using this exact pipeline. Using this data and the recipe, we saw over 10% improvement in both Recall@10 and NDCG@10. Atlassian applied the recipe to fine-tune on their JIRA dataset, raising Recall@60 from 0.751 to 0.951, a 26% improvement, on a single GPU.
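The gains above are reported in standard retrieval metrics. For reference, here is a minimal sketch of how Recall@K and binary-relevance NDCG@K are computed for a single query; the document IDs are made up for illustration.

```python
import math

def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top-k retrieved list."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance NDCG@k: log-discounted gain of hits, normalized by the ideal ranking."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(i + 2) for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

# Toy example: ten retrieved doc IDs for one query, three known-relevant docs
retrieved = ["d7", "d2", "d9", "d4", "d1", "d8", "d3", "d6", "d5", "d0"]
relevant = ["d2", "d4", "d5"]
print(recall_at_k(retrieved, relevant, 10))  # 1.0: all three relevant docs are in the top 10
print(ndcg_at_k(retrieved, relevant, 10))    # < 1.0, since the hits are not ranked first
```

A "10% improvement in Recall@10" means the fine-tuned model surfaces that much more of the relevant material within the first ten results; NDCG@10 additionally rewards ranking those hits earlier.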
🔗 Quick links: the embedding model on GitHub, and the synthetic dataset built from NVIDIA's public documents.

🧑‍💻 Open-source projects the recipe integrates:
- NeMo Data Designer for synthetic data generation
- NeMo Automodel for embedding model training
- BEIR for information retrieval evaluation
- NeMo Export-Deploy for ONNX/TensorRT conversion
- NVIDIA NIM for production inference serving

📋 Prerequisites:
- A directory of domain documents (text files: .txt, .md, or similar)
- A valid NVIDIA API key (free at build.nvidia.com)
- An NVIDIA Ampere GPU or newer (Compute Capability >= 8.0) with at least 80GB of memory; this tutorial has been tested on 1xA100 (80GB) and 1xH100 (80GB)

By the end of this post, you'll know how to:
📄 Generate training data from domain documents without labeled data
🎯 Use hard negative mining for effective contrastive training
🔗 Improve embedding quality with multi-hop queries
⚙️ Fine-tune a bi-encoder embedding model
📊 Evaluate whether fine-tuning improves retrieval
🚀 Deploy the fine-tuned model in your pipeline

⚙️ Setup

In this tutorial, we will fine-tune the base model Llama-Nemotron-Embed-1B-v2, a 1-billion-parameter embedding model that balances quality and inference cost. To get started, follow this setup guide.

📚 Step 1: Generate Training Data from Documents

Fine-tuning an embedding model requires thousands of (query, relevant document) pairs. Most use cases don't have this data readily available, and creating it manually is expensive, slow, and often biased by the annotator's personal interpretation of what's "relevant." Instead of labeling data by hand, you can use an LLM (nvidia/nemotron-3-nano-30b-a3b) to read your documents and automatically generate high-quality synthetic question–answer pairs:

    nemotron embed sdg -c default corpus_dir=./data/my_domain_docs

How does it work? Behind the scenes, this runs a four-stage synthetic data generation (SDG) pipeline powered by NeMo Data Designer. What does the output look like?
Source document chunk:

    The thermal design power (TDP) of the H100 GPU is 700W in SXM form factor. The cooling solution must maintain junction temperature below 83°C under sustained workloads. Liquid cooling is recommended for dense deployments exceeding 4 GPUs per node, as air cooling cannot dissipate sufficient heat in standard 2U chassis configurations.

Generated QA pairs:

    {
      "question": "What cooling approach is recommended when deploying more than 4 H100 GPUs per server node?",
      "answer": "Liquid cooling is recommended for dense deployments exceeding 4 GPUs per node, as air cooling cannot dissipate sufficient heat in standard 2U chassis configurations.",
      "query_type": "contextual",
      "reasoning_type": "factual",
      "question_complexity": 3,
      "segment_ids": [1],
      "quality_score": 8.5
    }
    {
      "question": "How does the 700W TDP of the H100 SXM constrain the choice between air and liquid cooling in multi-GPU configurations?",
      "answer": "The 700W TDP generates substantial heat that must be dissipated to keep junction temperatures below 83°C. In dense configurations exceeding 4 GPUs per node, air cooling in standard 2U chassis cannot handle this thermal load, making liquid cooling necessary.",
      "query_type": "multi_hop",
      "reasoning_type": "causal",
      "question_complexity": 4,
      "segment_ids": [1, 2],
      "hop_count": 2,
      "quality_score": 9.0
    }

Notice the difference: the first question is a contextual, factual query answerable from a single segment, while the second is a multi-hop, causal question that connects the 700W TDP to the cooling constraint across two segments.
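These (query, relevant document) pairs, together with mined hard negatives, feed a contrastive training objective: each query embedding is pulled toward its source passage and pushed away from the other passages in the batch. As a rough illustration only (not the NeMo Automodel implementation), here is a minimal InfoNCE-style loss in plain PyTorch with in-batch negatives plus one hard negative per query; the random tensors are stand-ins for the bi-encoder's real outputs.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q, d_pos, d_hard_neg, temperature=0.05):
    """InfoNCE with in-batch negatives plus one mined hard negative per query.

    q:          (B, dim) query embeddings
    d_pos:      (B, dim) embeddings of each query's relevant document
    d_hard_neg: (B, dim) embeddings of mined hard negatives
    """
    q = F.normalize(q, dim=-1)
    docs = F.normalize(torch.cat([d_pos, d_hard_neg], dim=0), dim=-1)  # (2B, dim)
    logits = q @ docs.T / temperature   # (B, 2B) scaled cosine similarities
    targets = torch.arange(q.size(0))   # row i's positive sits at column i
    return F.cross_entropy(logits, targets)

# Toy batch: 4 queries with 16-dim embeddings; positives are noisy copies of the queries
torch.manual_seed(0)
q = torch.randn(4, 16)
loss = info_nce_loss(q, q + 0.1 * torch.randn(4, 16), torch.randn(4, 16))
print(float(loss))
```

Hard negatives matter because random in-batch passages are usually easy to separate; mining near-miss passages (similar wording, wrong answer) forces the model to learn exactly the fine-grained domain distinctions the intro describes.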
Trump’s AI framework targets state laws, shifts child safety burden to parents
Trump’s AI framework pushes federal preemption of state laws, emphasizes innovation, and shifts responsibility for child safety toward parents while laying out lighter-touch rules for tech companies.
State of Open Source on Hugging Face: Spring 2026
Our latest investment in open source security for the AI era
Google is making new investments, building new tools and developing code security to improve open source security.
Introducing Storage Buckets on the Hugging Face Hub
Mixture of Experts (MoEs) in Transformers
Train AI models with Unsloth and Hugging Face Jobs for FREE
Transformers.js v4 Preview: Now Available on NPM!
Introducing Trusted Access for Cyber
OpenAI introduces Trusted Access for Cyber, a trust-based framework that expands access to frontier cyber capabilities while strengthening safeguards against misuse.
Community Evals: Because we're done trusting black-box leaderboards over the community
We Got Claude to Build CUDA Kernels and Teach Open Models!
One in a million: celebrating the customers shaping AI’s future
More than one million customers around the world now use OpenAI to empower their teams and unlock new opportunities. This post highlights how companies like PayPal, Virgin Atlantic, BBVA, Cisco, Moderna, and Canva are transforming the way work gets done with AI.
Tokenization in Transformers v5: Simpler, Clearer, and More Modular
CUGA on Hugging Face: Democratizing Configurable AI Agents
Introducing swift-huggingface: The Complete Swift Client for Hugging Face
We Got Claude to Fine-Tune an Open Source LLM
Transformers v5: Simple model definitions powering the AI ecosystem
Inside JetBrains—the company reshaping how the world writes code
JetBrains is integrating GPT-5 across its coding tools, helping millions of developers design, reason, and build software faster.
OVHcloud on Hugging Face Inference Providers 🔥
20x Faster TRL Fine-tuning with RapidFire AI
Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms
Easily Build and Share ROCm Kernels with Hugging Face
Consensus accelerates research with GPT-5 and Responses API
Consensus uses GPT-5 and OpenAI’s Responses API to power a multi-agent research assistant that reads, analyzes, and synthesizes evidence in minutes—helping over 8 million researchers accelerate scientific discovery.
Hugging Face and VirusTotal collaborate to strengthen AI security
Sentence Transformers is joining Hugging Face!
Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face
Arm will be @ PyTorch Conference, Join Us!
BigCodeArena: Judging code generations end to end with code executions
Introducing apps in ChatGPT and the new Apps SDK
We’re introducing a new generation of apps you can chat with, right inside ChatGPT. Developers can start building them today with the new Apps SDK, available in preview.
Swift Transformers Reaches 1.0 – and Looks to the Future
SyGra: The One-Stop Framework for Building Data for LLMs and SLMs
Scaleway on Hugging Face Inference Providers 🔥
Public AI on Hugging Face Inference Providers 🔥
Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers
Fine-tune Any LLM from the Hugging Face Hub with Together AI
Welcome EmbeddingGemma, Google's new efficient embedding model
Make your ZeroGPU Spaces go brrr with ahead-of-time compilation
Introducing gpt-realtime and Realtime API updates
We’re releasing a more advanced speech-to-speech model and new API capabilities including MCP server support, image input, and SIP phone calling support.
Generate Images with Claude and Hugging Face
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels
Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training
Implementing MCP Servers in Python: An AI Shopping Assistant with Gradio
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face
Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨
Fast LoRA inference for Flux with Diffusers and PEFT
Pioneering an AI clinical copilot with Penda Health
OpenAI and Penda Health debut an AI clinical copilot that cuts diagnostic errors by 16% in real-world use—offering a new path for safe, effective AI in healthcare.
Accelerate a World of LLMs on Hugging Face with NVIDIA NIM
Building the Hugging Face MCP Server
Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure
No-code personal agents, powered by GPT-4.1 and Realtime API
Learn how Genspark built a $36M ARR AI product in 45 days—with no-code agents powered by GPT-4.1 and OpenAI Realtime API.
Training and Finetuning Sparse Embedding Models with Sentence Transformers v5
Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub
Transformers backend integration in SGLang
(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware
Groq on Hugging Face Inference Providers 🔥
Learn the Hugging Face Kernel Hub in 5 Minutes
Featherless AI on Hugging Face Inference Providers 🔥
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL
Tiny Agents in Python: a MCP-powered agent in ~70 lines of code
New tools and features in the Responses API
New features in the Responses API: Remote MCP, image gen, Code Interpreter, and more. Powering faster, smarter agents with GPT-4o & o-series models, plus new features for reliability and efficiency.
Exploring Quantization Backends in Diffusers
nanoVLM: The simplest repository to train your VLM in pure PyTorch
Microsoft and Hugging Face expand collaboration
The Transformers Library: standardizing model definitions
Improving Hugging Face Model Access for Kaggle Users
Welcoming Llama Guard 4 on Hugging Face Hub
Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs
Introducing our latest image generation model in the API
Our latest image generation model is now available in the API via ‘gpt-image-1’—enabling developers and businesses to build professional-grade, customizable visuals directly into their own tools and platforms.
17 Reasons Why Gradio Isn't Just Another UI Library
Cohere on Hugging Face Inference Providers 🔥
Our updated Preparedness Framework
Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.
Introducing GPT-4.1 in the API
Introducing GPT-4.1 in the API—a new family of models with across-the-board improvements, including major gains in coding, instruction following, and long-context understanding. We’re also releasing our first nano model. Available to developers worldwide starting today.
Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖
4M Models Scanned: Protect AI + Hugging Face 6 Months In
Hugging Face and Cloudflare Partner to Make Real-Time Speech and Video Seamless with FastRTC
Welcome Llama 4 Maverick & Scout on Hugging Face
How Hugging Face Scaled Secrets Management for AI Infrastructure
Training and Finetuning Reranker Models with Sentence Transformers v4
Introducing next-generation audio models in the API
For the first time, developers can also instruct the text-to-speech model to speak in a specific way—for example, “talk like a sympathetic customer service agent”—unlocking a new level of customization for voice agents.
Hugging Face and JFrog partner to make AI Security more transparent
FastRTC: The Real-Time Communication Library for Python
Wayfair is shaping the future of retail with AI
A conversation with Fiona Tan, Chief Technology Officer of Wayfair.
Hugging Face and FriendliAI partner to supercharge model deployment on the Hub
Timm ❤️ Transformers: Use any timm model with transformers
Train 400x faster Static Embedding Models with Sentence Transformers
Visualize and understand GPU memory in PyTorch
OpenAI o1 and new tools for developers
Introducing OpenAI o1, Realtime API improvements, a new fine-tuning method and more for developers.
LeMaterial: an open source initiative to accelerate materials discovery and research
Hugging Face models in Amazon Bedrock
Shaping the future of financial services
Morgan Stanley uses AI evals to shape the future of financial services
Open Source Developers Guide to the EU AI Act
Rearchitecting Hugging Face Uploads and Downloads
Building smarter maps with GPT-4o vision fine-tuning