AI Development

1,752 articles total

OpenCode – Open source AI coding agent

development · Hacker News (Best) · 18h ago

Article URL: https://opencode.ai/
Comments URL: https://news.ycombinator.com/item?id=47460525
Points: 887 | Comments: 412

Microsoft rolls back some of its Copilot AI bloat on Windows

development · TechCrunch AI · 18h ago

The company is reducing Copilot entry points on Windows, starting with Photos, Widgets, Notepad, and other apps.

Build a Domain-Specific Embedding Model in Under a Day

development · Hugging Face Blog · 19h ago

Published March 20, 2026 · Steve H, Rucha Apte, Sean Sodha, and Oliver Holworthy (NVIDIA)

If you are building a RAG (Retrieval-Augmented Generation) system, you have likely hit this wall: everything works… until it doesn't. General-purpose embedding models are trained to understand the internet, not your contracts, manufacturing logs, proprietary chemical formulations, or internal taxonomy. They capture broad semantic similarity, but they do not understand the fine-grained distinctions that matter in your domain. Fine-tuning an embedding model can improve the performance of your retrieval pipeline when off-the-shelf models fail to capture those domain-specific nuances.

Yet despite how critical embeddings are to RAG performance, the fine-tuning process remains surprisingly fragmented, the skills required are specialized, and the time investment is daunting. With a single GPU and less than a day of training time, you can transform a general-purpose embedding model into one that truly understands your domain, with no manual labeling required. To help you hit the ground running, we are also releasing a ready-to-use synthetic training dataset generated from NVIDIA's public documentation using this exact pipeline. Using this data and the recipe, we saw over 10% improvement in both Recall@10 and NDCG@10. Atlassian applied the recipe to fine-tune on their JIRA dataset, increasing Recall@60 from 0.751 to 0.951, a 26% improvement, on a single GPU.
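The Recall@10 and NDCG@10 figures quoted above follow the standard information-retrieval definitions (in the recipe itself, BEIR computes them). As a minimal, self-contained sketch of what those metrics measure, not code from the recipe:

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant documents that appear in the top-k results."""
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance NDCG@k: discounted gain of the actual ranking
    divided by the gain of an ideal ranking (all relevant docs first)."""
    rel = set(relevant_ids)
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked_ids[:k]) if doc in rel)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(rel), k)))
    return dcg / ideal
```

A relevant document ranked lower contributes less to NDCG, which is why it is more sensitive than recall to the fine-grained ranking improvements fine-tuning buys.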
🔗 Quick links to the dataset and code:
- Embedding model
- GitHub
- Synthetic dataset on NVIDIA's public documents

🧑‍💻 Open source projects the recipe integrates:
- NeMo Data Designer for synthetic data generation
- NeMo Automodel for embedding model training
- BEIR for information retrieval evaluation
- NeMo Export-Deploy for ONNX/TensorRT conversion
- NVIDIA NIM for production inference serving

📋 Prerequisites:
- A directory of domain documents (text files: .txt, .md, or similar)
- A valid NVIDIA API key (free at build.nvidia.com)
- An NVIDIA Ampere GPU or newer (Compute Capability >= 8.0) with at least 80GB of memory; this tutorial has been tested on 1x A100 (80GB) and 1x H100 (80GB)

By the end of this post, you'll know how to:
📄 Generate training data from domain documents without labeled data
🎯 Use hard negative mining for effective contrastive training
🔗 Improve embedding quality with multi-hop queries
⚙️ Fine-tune a bi-encoder embedding model
📊 Evaluate whether fine-tuning improves retrieval
🚀 Deploy the fine-tuned model in your pipeline

⚙️ Setup

In this tutorial, we will fine-tune the base model Llama-Nemotron-Embed-1B-v2, a 1-billion-parameter embedding model that balances quality and inference cost. To get started, follow this setup guide.

📚 Step 1: Generate Training Data from Documents

Fine-tuning an embedding model requires thousands of (query, relevant document) pairs. Most use cases don't have this data readily available, and creating it manually is expensive, slow, and often biased by the annotator's personal interpretation of what's "relevant." Instead of labeling data by hand, you can use an LLM (nvidia/nemotron-3-nano-30b-a3b) to read your documents and automatically generate high-quality synthetic question–answer pairs:

nemotron embed sdg -c default corpus_dir=./data/my_domain_docs

How does it work? Behind the scenes, this runs a four-stage synthetic data generation (SDG) pipeline powered by NeMo Data Designer. What does the output look like?
Source document chunk:

    The thermal design power (TDP) of the H100 GPU is 700W in SXM form factor. The cooling solution must maintain junction temperature below 83°C under sustained workloads. Liquid cooling is recommended for dense deployments exceeding 4 GPUs per node, as air cooling cannot dissipate sufficient heat in standard 2U chassis configurations.

Generated QA pairs:

    {
      "question": "What cooling approach is recommended when deploying more than 4 H100 GPUs per server node?",
      "answer": "Liquid cooling is recommended for dense deployments exceeding 4 GPUs per node, as air cooling cannot dissipate sufficient heat in standard 2U chassis configurations.",
      "query_type": "contextual",
      "reasoning_type": "factual",
      "question_complexity": 3,
      "segment_ids": [1],
      "quality_score": 8.5
    }

    {
      "question": "How does the 700W TDP of the H100 SXM constrain the choice between air and liquid cooling in multi-GPU configurations?",
      "answer": "The 700W TDP generates substantial heat that must be dissipated to keep junction temperatures below 83°C. In dense configurations exceeding 4 GPUs per node, air cooling in standard 2U chassis cannot handle this thermal load, making liquid cooling necessary.",
      "query_type": "multi_hop",
      "reasoning_type": "causal",
      "question_complexity": 4,
      "segment_ids": [1, 2],
      "hop_count": 2,
      "quality_score": 9.0
    }

Notice the difference: the first question is a contextual, factual query answerable from a single segment, while the second is a multi-hop, causal query that connects information across two segments.
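Once QA pairs like these exist, the hard-negative mining step mentioned above can be understood generically: embed the queries and documents with the current model, then for each query keep the highest-scoring documents that are not its labeled positive; those near-misses make the contrastive training signal much stronger than random negatives. A minimal NumPy sketch of that idea (function name and shapes are illustrative, not the recipe's actual implementation):

```python
import numpy as np

def mine_hard_negatives(query_embs, doc_embs, positive_idx, num_negatives=4):
    """For each query, return indices of the top-scoring documents that are
    NOT its labeled positive -- the 'hard negatives' the model currently
    confuses with the true answer."""
    # Cosine similarity via normalized dot products.
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = q @ d.T                       # shape: (num_queries, num_docs)
    negatives = []
    for qi, pos in enumerate(positive_idx):
        ranked = np.argsort(-scores[qi])   # best-first document ranking
        hard = [di for di in ranked if di != pos][:num_negatives]
        negatives.append(hard)
    return negatives
```

Each (query, positive, hard negatives) triple then feeds a contrastive loss during fine-tuning, teaching the model to pull the true answer above the near-misses.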

Trump’s AI framework targets state laws, shifts child safety burden to parents

development · TechCrunch AI · 23h ago

Trump’s AI framework pushes federal preemption of state laws, emphasizes innovation, and shifts responsibility for child safety toward parents while laying out lighter-touch rules for tech companies.

State of Open Source on Hugging Face: Spring 2026

development · Hugging Face Blog · 3d ago

Our latest investment in open source security for the AI era

development · Google AI Blog · 3d ago

Google is making new investments, building new tools, and developing code security practices to improve open source security.

Introducing Storage Buckets on the Hugging Face Hub

development · Hugging Face Blog · 3/10/2026

Mixture of Experts (MoEs) in Transformers

development · Hugging Face Blog · 2/26/2026

Transformers.js v4 Preview: Now Available on NPM!

development · Hugging Face Blog · 2/9/2026

Introducing Trusted Access for Cyber

development · OpenAI Blog · 2/5/2026

OpenAI introduces Trusted Access for Cyber, a trust-based framework that expands access to frontier cyber capabilities while strengthening safeguards against misuse.

One in a million: celebrating the customers shaping AI’s future

development · OpenAI Blog · 12/22/2025

More than one million customers around the world now use OpenAI to empower their teams and unlock new opportunities. This post highlights how companies like PayPal, Virgin Atlantic, BBVA, Cisco, Moderna, and Canva are transforming the way work gets done with AI.

We Got Claude to Fine-Tune an Open Source LLM

development · Hugging Face Blog · 12/4/2025

Inside JetBrains—the company reshaping how the world writes code

development · OpenAI Blog · 11/25/2025

JetBrains is integrating GPT-5 across its coding tools, helping millions of developers design, reason, and build software faster.

OVHcloud on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 11/24/2025

20x Faster TRL Fine-tuning with RapidFire AI

development · Hugging Face Blog · 11/21/2025

Easily Build and Share ROCm Kernels with Hugging Face

development · Hugging Face Blog · 11/17/2025

Consensus accelerates research with GPT-5 and Responses API

development · OpenAI Blog · 10/23/2025

Consensus uses GPT-5 and OpenAI’s Responses API to power a multi-agent research assistant that reads, analyzes, and synthesizes evidence in minutes—helping over 8 million researchers accelerate scientific discovery.

Sentence Transformers is joining Hugging Face!

development · Hugging Face Blog · 10/22/2025

Arm will be @ PyTorch Conference, Join Us!

development · Hugging Face Blog · 10/10/2025

Introducing apps in ChatGPT and the new Apps SDK

development · OpenAI Blog · 10/6/2025

We’re introducing a new generation of apps you can chat with, right inside ChatGPT. Developers can start building them today with the new Apps SDK, available in preview.

Scaleway on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 9/19/2025

Public AI on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 9/17/2025

Introducing gpt-realtime and Realtime API updates

development · OpenAI Blog · 8/28/2025

We’re releasing a more advanced speech-to-speech model and new API capabilities including MCP server support, image input, and SIP phone calling support.

Generate Images with Claude and Hugging Face

development · Hugging Face Blog · 8/19/2025

How Cursor uses GPT-5

development · OpenAI Blog · 8/7/2025

Learn how Cursor uses GPT-5.

Fast LoRA inference for Flux with Diffusers and PEFT

development · Hugging Face Blog · 7/23/2025

Pioneering an AI clinical copilot with Penda Health

development · OpenAI Blog · 7/22/2025

OpenAI and Penda Health debut an AI clinical copilot that cuts diagnostic errors by 16% in real-world use—offering a new path for safe, effective AI in healthcare.

Building the Hugging Face MCP Server

development · Hugging Face Blog · 7/10/2025

No-code personal agents, powered by GPT-4.1 and Realtime API

development · OpenAI Blog · 7/1/2025

Learn how Genspark built a $36M ARR AI product in 45 days—with no-code agents powered by GPT-4.1 and OpenAI Realtime API.

Transformers backend integration in SGLang

development · Hugging Face Blog · 6/23/2025

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

development · Hugging Face Blog · 6/19/2025

Groq on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 6/16/2025

Learn the Hugging Face Kernel Hub in 5 Minutes

development · Hugging Face Blog · 6/12/2025

Featherless AI on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 6/12/2025

New tools and features in the Responses API

development · OpenAI Blog · 5/21/2025

New features in the Responses API: Remote MCP, image gen, Code Interpreter, and more. Powering faster, smarter agents with GPT-4o & o-series models, plus new features for reliability and efficiency.

Exploring Quantization Backends in Diffusers

development · Hugging Face Blog · 5/21/2025

Microsoft and Hugging Face expand collaboration

development · Hugging Face Blog · 5/19/2025

Improving Hugging Face Model Access for Kaggle Users

development · Hugging Face Blog · 5/14/2025

Welcoming Llama Guard 4 on Hugging Face Hub

development · Hugging Face Blog · 4/29/2025

Introducing our latest image generation model in the API

development · OpenAI Blog · 4/23/2025

Our latest image generation model is now available in the API via ‘gpt-image-1’—enabling developers and businesses to build professional-grade, customizable visuals directly into their own tools and platforms.

Cohere on Hugging Face Inference Providers 🔥

development · Hugging Face Blog · 4/16/2025

Our updated Preparedness Framework

development · OpenAI Blog · 4/15/2025

Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.

Introducing GPT-4.1 in the API

development · OpenAI Blog · 4/14/2025

Introducing GPT-4.1 in the API—a new family of models with across-the-board improvements, including major gains in coding, instruction following, and long-context understanding. We’re also releasing our first nano model. Available to developers worldwide starting today.

Welcome Llama 4 Maverick & Scout on Hugging Face

development · Hugging Face Blog · 4/5/2025

Introducing next-generation audio models in the API

development · OpenAI Blog · 3/20/2025

For the first time, developers can also instruct the text-to-speech model to speak in a specific way—for example, “talk like a sympathetic customer service agent”—unlocking a new level of customization for voice agents.

FastRTC: The Real-Time Communication Library for Python

development · Hugging Face Blog · 2/25/2025

Wayfair is shaping the future of retail with AI

development · OpenAI Blog · 2/13/2025

A conversation with Fiona Tan, Chief Technology Officer of Wayfair.

Visualize and understand GPU memory in PyTorch

development · Hugging Face Blog · 12/24/2024

OpenAI o1 and new tools for developers

development · OpenAI Blog · 12/17/2024

Introducing OpenAI o1, Realtime API improvements, a new fine-tuning method and more for developers.

Hugging Face models in Amazon Bedrock

development · Hugging Face Blog · 12/9/2024

Shaping the future of financial services

development · OpenAI Blog · 12/4/2024

Morgan Stanley uses AI evals to shape the future of financial services

Open Source Developers Guide to the EU AI Act

development · Hugging Face Blog · 12/2/2024

Rearchitecting Hugging Face Uploads and Downloads

development · Hugging Face Blog · 11/26/2024

Building smarter maps with GPT-4o vision fine-tuning

development · OpenAI Blog · 11/20/2024

Share your open ML datasets on Hugging Face Hub!

development · Hugging Face Blog · 11/12/2024

Hugging Face + PyCharm

development · Hugging Face Blog · 11/5/2024