In a world where foundation models are increasingly accessible, the real magic happens when developers make them their own. Fine-tuning is no longer a niche capability; it’s becoming a core skill for developers who want to build AI that’s faster, smarter, and more aligned with their users, scaling expert knowledge across their organizations. Over the past few months, we’ve seen something remarkable: a growing community of builders, tinkerers, and innovators coming together to push the boundaries of what fine-tuning can do, making a powerful difference for everyday organizations.
A Community Making a Big Impact: Customer Stories
At Build 2025, we saw firsthand how much the landscape has shifted. Just a year ago, many teams were still relying solely on prompt engineering or retrieval-augmented generation (RAG). Today, nearly half of developers say they’re actively exploring fine-tuning. Why? Because it gives them something no off-the-shelf model can: the ability to embed their domain knowledge, tone, and logic directly into the model itself.
- In our breakout session, product leaders Alicia and Omkar dove into fine-tuning and distillation with Azure AI Foundry, and we heard from the Oracle Health team about how they are restoring joy in providing patient care by relieving administrative burden. Oracle Health used fine-tuned GPT-4o-mini models to power a clinical AI agent that responds in under 800 milliseconds, fast enough to keep up with real-time healthcare workflows.
- Earlier this year, we saw DraftWise use reinforcement fine-tuning on o-series reasoning models within Azure AI Foundry to tailor model behavior for legal-specific tasks, sharpening responses based on proprietary legal data. The fine-tuned models contributed to a 30% improvement in search result quality, enabling faster, more accurate contract drafting and review at scale.
- And we watched CoStar Group deliver a low-latency, voice-driven home search experience that scales to over 100 million monthly users. By combining GPT-4o Realtime API audio models with Ministral 3B on Azure AI Foundry, they reduced token usage, improved cost-efficiency, and accelerated time to deployment.
These aren’t just technical wins – they’re community wins. They show what’s possible when developers have the right tools and support to build AI that truly fits their needs.
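Stories like these usually begin with the same mechanical step: a supervised fine-tuning job. Here is a minimal sketch of that step using the openai Python SDK against Azure OpenAI; the endpoint, API version, file name, and base model are placeholder assumptions, not values from these case studies.

```python
# Minimal supervised fine-tuning sketch against Azure OpenAI.
# Endpoint, API version, file name, and base model are placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # pick the version your resource supports
)

# training.jsonl holds chat-format examples, one JSON object per line:
# {"messages": [{"role": "system", "content": "..."},
#               {"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("training.jsonl", "rb"), purpose="fine-tune"
)

# Kick off the job; poll client.fine_tuning.jobs.retrieve(job.id) for status.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",  # base model to customize (placeholder)
    training_file=training_file.id,
)
print(job.id, job.status)
```

Once the job completes, you deploy the resulting fine-tuned model like any other deployment and call it by its deployment name.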
What’s New in Azure AI Foundry: Tools That Empower, Not Overwhelm
The last 3 months have brought a wave of updates designed to make fine-tuning more accessible, more affordable, and more powerful—especially for developers who are just getting started.
- One of the most exciting additions is Reinforcement Fine-Tuning (RFT), now available in public preview with the o4-mini model. Unlike traditional supervised fine-tuning, RFT lets you teach models how to reason through complex tasks using reward signals. It’s ideal for domains where logic and nuance matter—like law, finance, or healthcare—and it’s already helping teams like DraftWise build smarter, more adaptive systems. Watch this o4-mini demo for more.
- We also introduced Global Training, which lets you fine-tune models from any of 24 Azure OpenAI regions. This means no more guessing which region supports which model, significantly lowering the barrier to entry for model customization.
- One of the most common questions we hear from developers is: “How do I know if my fine-tuned model is actually better?” The new Evaluation API is our answer. This API lets you programmatically evaluate model outputs using model-based graders, custom rubrics, and structured scoring, all from code (see the sketch after this list).
- And for developers who want to experiment without breaking the bank, we launched the Developer Tier. It’s a new way to deploy fine-tuned models for free (for 24 hours), paying only for tokens at the same rate as base models. It’s ideal for A/B testing, distillation experiments, or just kicking the tires on a new idea.
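The Evaluation API itself is in preview, so rather than guess at its exact schema, here is a hand-rolled model-based grader built on a plain chat completions call. It illustrates the LLM-as-judge idea behind the API; the rubric, scoring scale, and deployment names are placeholder assumptions.

```python
# Hand-rolled LLM-as-judge: a grader model scores candidate answers against
# a rubric. This illustrates model-based grading; it is NOT the Evaluation
# API's actual schema. Deployment names and the rubric are placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

RUBRIC = (
    "Score the ANSWER to the QUESTION from 1 (unusable) to 5 (expert-level). "
    "Reply with the integer score only."
)

def grade(question: str, answer: str, grader: str = "gpt-4o") -> int:
    """Return the grader model's 1-5 score for one candidate answer."""
    response = client.chat.completions.create(
        model=grader,  # your grader deployment name (placeholder)
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"QUESTION: {question}\nANSWER: {answer}"},
        ],
        temperature=0,
    )
    return int(response.choices[0].message.content.strip())

# Score base vs. fine-tuned outputs on the same prompts and compare averages.
```

Running the same prompt set through both your base and fine-tuned deployments, then comparing average scores, gives you a quick A/B signal before you commit to a production deployment.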
Learning Together: From Distillation to Deployment
One of the most powerful trends we’ve seen is the rise of distillation. Developers are using larger “teacher” models like GPT-4o to generate high-quality outputs, then fine-tuning smaller “student” models like GPT-4.1-mini or nano to replicate that performance at a fraction of the cost. This is now supported end-to-end in Azure AI Foundry. You can generate completions, store them automatically, fine-tune your student model, and evaluate it using model-based graders—all in one place.
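As a rough sketch of that teacher–student loop with the openai Python SDK (the prompts, file name, and teacher/student model names below are placeholder assumptions, not the exact workflow shown at Build):

```python
# Distillation sketch: a larger "teacher" labels prompts, and its outputs
# become supervised training data for a smaller "student". All names are
# placeholders.
import json
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

prompts = ["Summarize clause 4.2 of this lease ...", "Draft an indemnity clause ..."]

# 1) Teacher (e.g. a GPT-4o deployment) generates the target completions.
with open("distill.jsonl", "w") as f:
    for p in prompts:
        teacher = client.chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": p}]
        )
        f.write(json.dumps({"messages": [
            {"role": "user", "content": p},
            {"role": "assistant", "content": teacher.choices[0].message.content},
        ]}) + "\n")

# 2) Fine-tune the student on the teacher's outputs.
data = client.files.create(file=open("distill.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    model="gpt-4.1-mini",  # student base model name (placeholder)
    training_file=data.id,
)
print("student fine-tune job:", job.id)
```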
And the results speak for themselves. In one demo at Build, we saw a distilled 4o model go from 35% to 90% accuracy—just by learning from the outputs of a larger o3 model.
Let’s Keep Building: Join Us for Model Mondays | LIVE today – July 14, 10:30 AM PT
We’re excited about what’s next. More models. More techniques. More impact from developers like you. If you’re already fine-tuning, we’d love to hear what you’re working on. And if you’re just getting started, we’re here to help with Model Mondays, your weekly dose of AI model magic: an hour of live demos, developer-friendly deep dives, and just enough chaos to keep Mondays interesting.
Join us today LIVE with Dave Voutila, Microsoft PM, on Fine-tuning & Distillation – July 14, 10:30 AM PT
-> RSVP here and Join Here
Check out these Resources
🧠 Get Started with fine-tuning with Azure AI Foundry on Microsoft Learn Docs
▶️ Watch On-Demand: Fine-tuning and distillation with Azure AI Foundry
👩💻 Fine-tune GPT-4o-mini model with this tutorial
👋 Continue the conversation on Discord