July 8, 2025

With all our big announcements at Build, you might think we’d kick back and take a break for a few weeks… but fine-tuning never stops! We’re wrapping up June with the release of Direct Preference Optimization (DPO) for the 4.1 family of models, fine-tuning available in more regions than ever before, and Responses API support for fine-tuned models.
GPT-4.1, GPT-4.1-mini support Direct Preference Optimization (DPO) 😍
GPT-4.1 and GPT-4.1-mini now support Direct Preference Optimization (DPO). DPO is a fine-tuning technique that adjusts model weights based on human preferences. You provide a prompt along with two responses, one preferred and one non-preferred; using this data, you can align a fine-tuned model with your own style, preferences, or safety requirements.
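To make the data shape concrete, here’s a minimal sketch of one preference pair, assuming the OpenAI-style JSONL preference format with input, preferred_output, and non_preferred_output fields (check the fine-tuning docs for the exact schema your endpoint expects); the prompt and responses are invented examples:

```python
import json

# One DPO training example: a prompt plus a preferred and a non-preferred response.
# Field names follow the OpenAI-style preference format; verify against the current docs.
example = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize our refund policy in one sentence."}
        ]
    },
    "preferred_output": [
        {"role": "assistant",
         "content": "Refunds are available within 30 days of purchase with proof of receipt."}
    ],
    "non_preferred_output": [
        {"role": "assistant",
         "content": "You can probably get your money back if you ask nicely."}
    ],
}

# Training files are JSON Lines: one preference pair per line.
with open("dpo_train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")
```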
Unlike Reinforcement Learning from Human Feedback (RLHF), DPO does not require fitting a reward model and uses binary preferences for training. This makes DPO computationally lighter and faster than RLHF while being equally effective at alignment.
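Once your preference file is ready, kicking off a DPO job is a single call. The sketch below assumes the openai Python SDK’s fine_tuning.jobs.create with a method block selecting DPO; the snapshot name, file name, and beta value are illustrative rather than prescriptive:

```python
from openai import OpenAI

client = OpenAI()  # for Azure OpenAI, construct AzureOpenAI with your endpoint and API version

# Upload the preference-pair file from the previous sketch.
train_file = client.files.create(file=open("dpo_train.jsonl", "rb"), purpose="fine-tune")

# Start a DPO fine-tuning job; the `method` block selects DPO instead of supervised tuning,
# and beta (assumed hyperparameter) controls how strongly preferred outputs are favored.
job = client.fine_tuning.jobs.create(
    model="gpt-4.1-mini-2025-04-14",  # illustrative snapshot; use one available in your region
    training_file=train_file.id,
    method={"type": "dpo", "dpo": {"hyperparameters": {"beta": 0.1}}},
)
print(job.id, job.status)
```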
Global Training, now available globally 🌎
Since Build, we’ve significantly expanded the availability of Global Training (public preview). If you’ve been waiting for support in your region, we’ve added another 12 regions! Look for additional features (pause/resume and continuous fine-tuning) and models (gpt-4.1-nano) in the coming weeks.
Newly available regions:
- East US
- East US 2
- North Central US
- South Central US
- Spain Central
- Sweden Central
- Switzerland North
- Switzerland West
- UK South
- West Europe
- West US
- West US 3
Responses API now supports fine-tuned models ☎️
Training is great, but inference is what matters most when you actually want to use your models! The Responses API is the newest inference API, purpose-built for agentic workflows: it supports stateful, multi-turn conversations and seamless tool calling, automatically stitching everything together in the background.
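As a rough sketch of what this looks like with a fine-tuned model, the snippet below assumes the openai Python SDK’s Responses API against an Azure OpenAI resource; the endpoint, key, API version, and the deployment name my-ft-gpt-4-1 are all placeholders for your own setup:

```python
from openai import AzureOpenAI

# Placeholders: point these at your own resource and a Responses-capable API version.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR_API_KEY",
    api_version="preview",
)

# Call the Responses API against a fine-tuned deployment (hypothetical name).
response = client.responses.create(
    model="my-ft-gpt-4-1",
    input="Draft a status update for the payments migration.",
)
print(response.output_text)
```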
With this update, you can make better use of fine-tuned models in multi-agent workflows: after teaching your model which tools to use, and when, the Responses API keeps track of the conversation so the model can remember context, surfaces how the model is reasoning through its answers, and lets users check progress while a response is being generated. It also supports background processing (so you don’t have to wait) and works with tools like web search and file lookup, making it great for building smarter, more interactive AI experiences.
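For multi-turn use, here’s a sketch of chaining a follow-up request under the same assumptions (hypothetical deployment name, placeholder credentials): previous_response_id lets the service carry the conversation state forward, and background mode, where supported, returns immediately so you can poll for the result:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR_API_KEY",
    api_version="preview",  # placeholder: use a Responses-capable API version
)

# First turn against the fine-tuned deployment (hypothetical name).
first = client.responses.create(
    model="my-ft-gpt-4-1",
    input="Draft a status update for the payments migration.",
)

# Follow-up turn: chain to the first response instead of resending the whole history.
follow_up = client.responses.create(
    model="my-ft-gpt-4-1",
    previous_response_id=first.id,
    input="Shorten that to two sentences.",
)
print(follow_up.output_text)

# Background processing (where supported): the call returns right away and you poll for status.
job = client.responses.create(
    model="my-ft-gpt-4-1",
    input="Write a detailed migration runbook.",
    background=True,
)
status = client.responses.retrieve(job.id)
print(status.status)  # e.g. "queued", "in_progress", "completed"
```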