August 9, 2025
The Future of AI blog series is an evolving collection of posts from the AI Futures team in collaboration with subject matter experts across Microsoft. In this series, we explore tools and technologies that will drive the next generation of AI. Explore more at: Collections | Microsoft Learn
The Future of AI: An Intern’s Adventure Improving Usability with Agents
As enterprises adopt Azure AI Foundry Models at scale, managing many model deployments across regions, SKU types, and configurations has become a complex operational challenge. With Azure Direct Models, continuous model evolution and inevitable model version retirements mean that enterprise customers need to regularly evaluate their model performance against newer versions and plan continuous upgrades across dozens or potentially hundreds of deployments. As an intern on the Azure AI Foundry Product Team, I have the privilege of working on a project called the “model operation agent”, an internal proof-of-concept agent that transforms complex model management into a guided, conversational experience. In this blog, I will walk through the capabilities of the “model operation agent” and demonstrate its functionality.
Why Model Management Feels Hard
Imagine you’re responsible for a fleet of vehicles. Some are stationed in New York, others in London; each runs on a different fuel type and has its own retirement schedule. Now, swap cars for AI models: each deployment exists in its own region, under its own SKU, with its own version and retirement date. Tracking them all means juggling:
- Multiple Azure regions for compliance and performance
- Diverse SKUs, from Standard to Global, DataZone, and Provisioned
- A mix of model families and versions deployed across teams
- Varying quotas and capacities that must be respected before upgrades
Trying to manage this manually is like updating “every car by hand” – it’s exhausting.
An Overview of the Model Operation Agent
The “model operation agent” steps in as a friendly guide, wrapping four core steps into a chat interface:
- Discovery: It scans your entire subscription, compiling a single, searchable inventory of all deployments—no more blind spots.
- Analysis: It cross‑references Microsoft’s official retirement data to recommend when and which models to upgrade.
- Validation: It checks quotas and capacity in each region, warning you if there’s not enough headroom for the next upgrade.
- Execution: It batches updates with clear progress reporting, safeguards against failures, and even suggests rollbacks if something goes wrong.
1. Discovery – Comprehensive Deployment Inventory
The agent discovers model deployments across subscriptions, providing a unified view of deployed models:
Example output:
Account Name   | Resource Group | Location | Deployment | Model        | Version    | SKU       | Capacity
foundry-eastus | rg-prod        | eastus   | gpt4-prod  | gpt-4o       | 2024-05-13 | Standard  | 10
foundry-westus | rg-prod        | westus   | gpt4-west  | gpt-4        | 0613       | GlobalStd | 20
foundry-europe | rg-eu          | westeu   | chat-eu    | gpt-35-turbo | 1106       | DataZone  | 30
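To make the idea of a single, searchable inventory concrete, here is a minimal Python sketch. It assumes the per-account deployment lists have already been fetched by some other means; the `Deployment` record and the `build_inventory` and `find` helpers are hypothetical names for illustration, not the agent's actual implementation. The sample rows mirror the example output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One row of the inventory (field names are illustrative assumptions)."""
    account: str
    resource_group: str
    location: str
    name: str
    model: str
    version: str
    sku: str
    capacity: int


def build_inventory(per_account):
    """Flatten {account: [Deployment, ...]} into one list sorted by account, then name."""
    rows = [d for deployments in per_account.values() for d in deployments]
    return sorted(rows, key=lambda d: (d.account, d.name))


def find(inventory, **criteria):
    """Return deployments whose fields equal every given criterion, e.g. model="gpt-4"."""
    return [d for d in inventory
            if all(getattr(d, field) == value for field, value in criteria.items())]


# Sample data copied from the example output; in practice this would come
# from whatever discovery call the agent makes.
per_account = {
    "foundry-westus": [Deployment("foundry-westus", "rg-prod", "westus",
                                  "gpt4-west", "gpt-4", "0613", "GlobalStd", 20)],
    "foundry-eastus": [Deployment("foundry-eastus", "rg-prod", "eastus",
                                  "gpt4-prod", "gpt-4o", "2024-05-13", "Standard", 10)],
    "foundry-europe": [Deployment("foundry-europe", "rg-eu", "westeu",
                                  "chat-eu", "gpt-35-turbo", "1106", "DataZone", 30)],
}

inventory = build_inventory(per_account)
legacy = find(inventory, model="gpt-4")
```

Keeping the inventory as a flat list of immutable records makes later steps (retirement lookup, quota checks) simple filters over the same data.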
2. Analysis – Intelligent Retirement and Replacement Recommendations
The agent searches Microsoft’s official model retirement schedules, providing authoritative information about:
- Exact retirement dates or “no earlier than” timeframes
- Official replacement model recommendations
- Version-specific upgrade paths
Example output:
Model: gpt-4 (version 0613)
Retirement Date: June 6, 2025
Recommended Replacement: gpt-4o version 2024-11-20
Upgrade Priority: High – retirement in 4 months
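One way the "Upgrade Priority" line could be derived is by counting the whole months left before the retirement date. The sketch below is an assumption about how such a rule might look: the function names and the 6-month and 12-month thresholds are hypothetical, and the February 1, 2025 "today" is chosen only so that the result matches the "retirement in 4 months" example.

```python
from datetime import date


def months_until(retirement: date, today: date) -> int:
    """Whole calendar months from today until retirement (negative once it has passed)."""
    months = (retirement.year - today.year) * 12 + (retirement.month - today.month)
    if retirement.day < today.day:
        months -= 1  # the final month is not yet complete
    return months


def upgrade_priority(retirement: date, today: date,
                     high_within: int = 6, medium_within: int = 12) -> str:
    """Map time-to-retirement onto a priority label (thresholds are assumptions)."""
    remaining = months_until(retirement, today)
    if remaining < 0:
        return "Retired"
    if remaining <= high_within:
        return "High"
    if remaining <= medium_within:
        return "Medium"
    return "Low"


# gpt-4 (0613) from the example output, evaluated on an assumed date.
retirement_date = date(2025, 6, 6)
assumed_today = date(2025, 2, 1)
remaining_months = months_until(retirement_date, assumed_today)
priority = upgrade_priority(retirement_date, assumed_today)
```

For models that only publish a "no earlier than" timeframe, the same function can be fed the earliest possible date, which gives a conservative priority.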
3. Validation – Automated Quota and Capacity Verification
Before attempting any upgrades, the agent automatically queries quota and usage APIs to:
- Check available capacity for target models in specific regions
- Compare required capacity against available quota
- Warn about insufficient capacity that would prevent successful upgrades
- Suggest alternative regions or SKU types with better availability
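The checks above boil down to comparing required capacity with the headroom left in each region. The following sketch illustrates that comparison under stated assumptions: quota limits and current usage are already available as plain numbers, and the `Quota` record, the `validate_capacity` function, and all figures are hypothetical rather than taken from any Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Quota limit and current usage for one model/SKU in one region (assumed shape)."""
    limit: int
    used: int

    @property
    def available(self) -> int:
        return max(self.limit - self.used, 0)


def validate_capacity(required: int, region: str, quotas: dict) -> dict:
    """Check headroom in the target region and suggest alternatives on a shortfall."""
    target = quotas.get(region)
    available = target.available if target else 0  # unknown region: treat as no quota
    shortfall = max(required - available, 0)
    alternatives = []
    if shortfall:
        # Other regions that could host the full required capacity, best headroom first.
        candidates = [name for name, q in quotas.items()
                      if name != region and q.available >= required]
        alternatives = sorted(candidates,
                              key=lambda name: quotas[name].available, reverse=True)
    return {"ok": shortfall == 0, "available": available,
            "shortfall": shortfall, "alternatives": alternatives}


# Made-up quota numbers for illustration only.
quotas = {
    "eastus": Quota(limit=100, used=95),
    "westus": Quota(limit=100, used=40),
    "swedencentral": Quota(limit=50, used=45),
}
check = validate_capacity(required=20, region="eastus", quotas=quotas)
```

Returning a plain dictionary keeps the result easy to render as a warning in a chat response.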
4. Execution – Batch Updates with Comprehensive Error Handling
The agent supports updating multiple deployments simultaneously with:
- Pre-flight validation of all update requests
- Detailed success/failure reporting with specific error messages
- Rollback guidance for failed operations
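A batch runner with these three properties might look like the sketch below. It is an illustration only: `UpdateRequest`, `UpdateResult`, `preflight`, and `run_batch` are hypothetical names, the actual update call is injected as `apply_update` because the agent's real update mechanism is not described here, and requests run one after another for simplicity rather than simultaneously.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class UpdateRequest:
    deployment: str
    current_version: str
    target_version: str


@dataclass(frozen=True)
class UpdateResult:
    deployment: str
    succeeded: bool
    error: Optional[str] = None
    rollback_hint: Optional[str] = None


def preflight(requests: List[UpdateRequest]) -> List[str]:
    """Return a list of problems; an empty list means the batch may proceed."""
    problems, seen = [], set()
    for r in requests:
        if r.deployment in seen:
            problems.append(f"{r.deployment}: duplicate request")
        seen.add(r.deployment)
        if r.current_version == r.target_version:
            problems.append(f"{r.deployment}: already at {r.target_version}")
    return problems


def run_batch(requests: List[UpdateRequest],
              apply_update: Callable[[UpdateRequest], None]) -> List[UpdateResult]:
    """Validate everything first, then apply each update and record the outcome."""
    problems = preflight(requests)
    if problems:
        raise ValueError("; ".join(problems))
    results = []
    for r in requests:
        try:
            apply_update(r)
            results.append(UpdateResult(r.deployment, True))
        except Exception as exc:  # one failure must not abort the rest of the batch
            hint = f"Redeploy {r.deployment} at version {r.current_version}"
            results.append(UpdateResult(r.deployment, False, str(exc), hint))
    return results


def fake_apply(request: UpdateRequest) -> None:
    """Stand-in for the real update call; fails for one deployment on purpose."""
    if request.deployment == "gpt4-prod":
        raise RuntimeError("insufficient quota in eastus")


batch = [
    UpdateRequest("gpt4-west", "0613", "2024-11-20"),
    UpdateRequest("gpt4-prod", "2024-05-13", "2024-11-20"),
]
results = run_batch(batch, fake_apply)
```

Running the pre-flight check across the whole batch before touching any deployment means an obviously invalid request stops the batch before any change is made.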
Looking Ahead: From IaC to IaA (Infrastructure as Agents)
Infrastructure as Code (IaC) tools like Terraform and Bicep let you define and version infrastructure, but you still need to learn HCL, Bicep, or ARM template syntax. What if you could simply hand an agent a system design document and watch it spin up your entire environment?
I envision a future where a platform agent serves as the ultimate abstraction layer for services:
- Doc‑Driven Deployment: Drop in a spec (think: architecture diagram, YAML/Markdown description), and agents negotiate with services to provision networks, VMs, data stores, and AI endpoints automatically.
- Natural‑Language Operations: No HCL. No ARM. Just plain English (or your preferred language) to create, adjust, and tear down environments.
This “Infrastructure as Agents” paradigm could democratize cloud operations, enabling product teams to focus on innovation instead of syntax. The “model operation agent” is our first step toward that vision—an internal proof-of-concept that enables intelligent agents to handle complex orchestration tasks through simple conversation.
Create with Azure AI Foundry
- Get started with Azure AI Foundry, and jump directly into Visual Studio Code
- Download the Azure AI Foundry SDK
- Take the Azure AI Foundry learn courses
- Review the Azure AI Foundry documentation
- Keep the conversation going in GitHub and Discord