September 1, 2025

Multi‑agent systems turn complex, cross‑department processes into reliable, scalable AI workflows. Here's a practical guide to designing, wiring, and shipping them with Azure AI Foundry, plus three real‑world scenarios you can take to production.
Why multi‑agent—and why now?
Modern business workflows rarely fit a single prompt. They span data retrieval, policy checks, approvals, and human sign‑off. Azure AI Foundry brings an Agent Service, connected agents, and multi‑agent workflows that let specialized agents collaborate, with enterprise‑grade security, observability, and open standards (A2A, MCP) baked in. In short: you get faster time‑to‑value without home‑grown orchestration and glue code.
What you can build today with Foundry
- Agent Service: Define goal‑directed agents, attach tools (Azure AI Search, OpenAPI plugins, Logic Apps, Functions), and deploy with enterprise policies.
- Connected agents (A2A): Register agents as each other’s “tools” so the orchestrator can delegate tasks to specialists via natural language—no custom routing required.
- Multi‑agent workflows: Add a stateful layer for context, retries, compensation, and long‑running steps—ideal for approvals, fulfilment, and back‑office flows.
- Interoperability: MCP for shared context; A2A for agent‑to‑agent message exchange—across Azure and other clouds.
- Observability & trust: Tracing, metrics, evaluations, and content safety guardrails to measure quality and keep humans in the loop when it matters.
Reference architecture (mental model)
User / system trigger → Orchestrator Agent
  → delegates to:
    - Retrieval Agent (Azure AI Search, Fabric/OneLake)
    - Analysis Agent (reasoning, code execution for calculations)
    - Policy Agent (entitlements, RAI checks, approvals)
    - Action Agent (OpenAPI/Logic Apps/Functions to update systems)
  → results consolidated by the Orchestrator → human‑in‑the‑loop (if required) → audit & telemetry (Foundry observability)
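The mental model above can be sketched in plain Python, with callables standing in for Foundry agents. Everything here is illustrative (the class, field names, and specialist outputs are invented for the sketch, not SDK APIs): the orchestrator delegates to each specialist, logs an audit entry per delegation, and consolidates the results.

```python
# Minimal sketch of the orchestration mental model; plain-Python stand-ins,
# not Azure AI Foundry SDK classes.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Orchestrator:
    specialists: Dict[str, Callable[[str], str]]  # name -> agent function
    audit_log: List[str] = field(default_factory=list)

    def handle(self, request: str) -> str:
        results = {}
        for name, agent in self.specialists.items():
            results[name] = agent(request)                     # delegate
            self.audit_log.append(f"{name}: {results[name]}")  # telemetry
        # Consolidate specialist outputs into one answer.
        return "; ".join(f"{k}={v}" for k, v in results.items())

orchestrator = Orchestrator(specialists={
    "retrieval": lambda req: "3 KB articles",
    "analysis": lambda req: "root cause: expired cert",
})
answer = orchestrator.handle("ticket #42")
```

In a real deployment the delegation step is a model call routed through A2A; the point of the sketch is the shape: one coordinator, N specialists, every hand-off audited.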
Three real‑world scenarios you can ship
1) Customer support autopilot (triage → resolution → summary)
- Flow: Orchestrator receives a ticket → classifies (Priority/Intent) → Retrieval Agent pulls KB + case history → Analysis Agent drafts fix → Policy Agent checks entitlement/SLAs → Action Agent updates CRM & sends response.
- Why multi‑agent: Clear separation of concerns improves reliability and speeds root‑cause when something breaks.
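The triage flow above can be sketched as a chain of single-purpose stages, which is what makes root-cause analysis fast: each stage only touches its own fields. The stage functions, ticket fields, and KB article below are invented for the sketch.

```python
# Toy support-autopilot pipeline: one function per agent responsibility.
def classify(ticket: dict) -> dict:
    priority = "high" if "outage" in ticket["text"].lower() else "normal"
    return {**ticket, "priority": priority}

def retrieve(ticket: dict) -> dict:
    # Stand-in for the Retrieval Agent hitting a KB index and case history.
    return {**ticket, "kb_hits": ["KB-101: restart auth service"]}

def draft_fix(ticket: dict) -> dict:
    # Stand-in for the Analysis Agent drafting a response.
    return {**ticket, "draft": f"Suggested fix per {ticket['kb_hits'][0]}"}

def check_policy(ticket: dict) -> dict:
    # Entitlement/SLA gate: high-priority tickets need an SLA check.
    ok = ticket["priority"] != "high" or ticket.get("sla_ok", False)
    return {**ticket, "approved": ok}

pipeline = [classify, retrieve, draft_fix, check_policy]
ticket = {"text": "Login outage for tenant X", "sla_ok": True}
for stage in pipeline:
    ticket = stage(ticket)
```

If a bad response ships, the audit trail tells you exactly which stage produced the bad field.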
2) Financial approvals (policy‑aware, multi‑step)
- Flow: Orchestrator parses invoice → Extraction Agent pulls vendor/PO data → Risk Agent screens anomalies/fraud → Policy Agent checks limits & compliance → Human approval gate → Action Agent posts to ERP.
- Why multi‑agent: Stateful workflows with retries, compensations, and gated actions map cleanly to approvals.
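The retry-and-compensation behavior this flow depends on looks roughly like the sketch below. The flaky-ERP simulation and function names are invented for illustration; a real workflow engine would persist the state between attempts.

```python
# Retry with exponential backoff, plus a compensation path if all
# attempts fail. Illustrative stand-ins, not workflow-engine APIs.
import time

def with_retry(fn, max_attempts: int = 3, base_delay: float = 0.0):
    last_err = None
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError as err:
            last_err = err
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise last_err

failures = {"remaining": 2}  # simulate an ERP endpoint that fails twice

def post_to_erp() -> str:
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise RuntimeError("ERP timeout")
    return "posted"

def post_with_compensation() -> str:
    try:
        return with_retry(post_to_erp, max_attempts=3)
    except RuntimeError:
        # Compensation: e.g. release the reserved budget, notify finance.
        return "compensated"

result = post_with_compensation()  # succeeds on the third attempt
```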
3) Supply‑chain exceptions (sense → reason → act)
- Flow: Orchestrator ingests an exception → Forecast Agent estimates impact → Vendor Agent queries lead times via OpenAPI → Plan Agent proposes mitigation → Policy Agent validates → Action Agent places change orders.
- Why multi‑agent: Parallel specialization lowers latency and keeps the orchestrator simple.
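The "parallel specialization" point can be made concrete with a thread pool: the forecast and vendor lookups run concurrently, and the orchestrator only joins the results. The agent functions and their return strings are invented for the sketch.

```python
# Two specialist lookups fanned out in parallel; the orchestrator joins.
from concurrent.futures import ThreadPoolExecutor

def forecast_impact(exception_id: str) -> str:
    return f"{exception_id}: 4-day delay, 120 units at risk"

def query_lead_times(exception_id: str) -> str:
    return f"{exception_id}: vendor B can ship in 2 days"

def handle_exception(exception_id: str) -> dict:
    with ThreadPoolExecutor(max_workers=2) as pool:
        impact = pool.submit(forecast_impact, exception_id)
        lead = pool.submit(query_lead_times, exception_id)
        return {"impact": impact.result(), "lead_times": lead.result()}

plan = handle_exception("EXC-7")
```

End-to-end latency is bounded by the slowest specialist rather than the sum of all of them.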
Build your first connected‑agent workflow (conceptual walk‑through)
You can do this in the Azure AI Foundry portal or via the Foundry SDK. The high‑level steps mirror the official “Connected agents” how‑to.
1) Create a Foundry project & deployments
- Provision your model deployments (for example, a reasoning model for the Orchestrator, a cost‑optimized model for specialists).
- Connect data sources (Azure AI Search index; Fabric/OneLake; SharePoint) and register tools (OpenAPI, Logic Apps/Functions).
2) Define agent roles
- Orchestrator: understands goals, delegates, composes answers.
- Retrieval: authoritative grounding from enterprise content.
- Analysis: calculations, code execution for tabular or numeric tasks.
- Policy: entitlements, data‑loss checks, approval thresholds.
- Action: calls systems of record with auditable side‑effects.
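One way to keep these boundaries enforceable is to capture them as data at wiring time. This is a convention sketch, not a Foundry feature: only the Action role is allowed side effects on systems of record.

```python
# Role boundaries as data, so wiring code can assert on them.
from dataclasses import dataclass

@dataclass(frozen=True)
class RoleSpec:
    name: str
    may_mutate: bool  # allowed to cause side effects in systems of record?
    instructions: str

ROLES = [
    RoleSpec("orchestrator", False, "Delegate, verify, compile final answers."),
    RoleSpec("retrieval", False, "Ground answers in enterprise content."),
    RoleSpec("analysis", False, "Run calculations and code for numeric tasks."),
    RoleSpec("policy", False, "Check entitlements and approval thresholds."),
    RoleSpec("action", True, "Call systems of record; every call is audited."),
]
mutating = [r.name for r in ROLES if r.may_mutate]
```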
3) Register connected agents (A2A)
In the portal, add each specialist as a tool to the Orchestrator; in code, you’d associate child agent IDs as callable tools.
# Conceptual snippet (simplified). Refer to the docs for exact SDK classes.
orchestrator = agents.create(
    name="orchestrator",
    instructions="You coordinate specialists. Delegate, verify, and compile final answers.",
)
retrieval = agents.create(name="retrieval", tools=["azure_ai_search:kb_index"])
analysis = agents.create(name="analysis", tools=["code_interpreter"])
policy = agents.create(name="policy", tools=["policy_rules:mcp"])
action = agents.create(name="action", tools=["openapi:erp", "logicapp:notify"])

# Connect specialists as 'tools' on the orchestrator (A2A)
agents.connect(parent=orchestrator.id, children=[retrieval.id, analysis.id, policy.id, action.id])
4) Orchestrate as a workflow (state, retries, HIL)
Add a workflow definition that persists state, sets retry/backoff policies, and inserts human‑in‑the‑loop gates for sensitive steps (for example, spend over a threshold).
# Pseudocode: workflow policy
steps:
  - delegate: retrieval
    retry: {max: 2, backoff: expo}
  - parallel:
      - analysis
      - policy
  - gate:
      type: human_approval
      condition: "${policy.limit_exceeded == true}"
  - delegate: action
audit:
  trace: enabled
  pii_redaction: strict
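The gate step is the piece worth internalizing: actions are blocked unless the policy check passed or a human explicitly approved. A runnable sketch of that behavior (the state shape and `approver` callback are invented for illustration):

```python
# Human-in-the-loop gate: auto-approve within limits, otherwise
# route the decision to a person (the approver callback).
def human_approval_gate(state: dict, approver) -> dict:
    if state.get("limit_exceeded"):
        state["approved"] = approver(state)  # route to a human
    else:
        state["approved"] = True             # auto-approve within limits
    return state

auto = human_approval_gate({"limit_exceeded": False}, approver=lambda s: False)
gated = human_approval_gate({"limit_exceeded": True}, approver=lambda s: True)
rejected = human_approval_gate({"limit_exceeded": True}, approver=lambda s: False)
```

Note the auto path never invokes the approver at all; humans only see the cases that need them.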
5) Observe, evaluate, and iterate
Use Foundry traces, scores, and evaluations to compare prompts, tools, and model mix; add cost/latency budgets per step and enable content safety filters.
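A per-step budget can be as simple as a wrapper that records elapsed time and token usage and flags breaches. The budget numbers, field names, and the `(result, tokens)` return shape below are illustrative.

```python
# Wrap a step with a latency/token budget and report breaches.
import time

def run_with_budget(step, max_seconds: float, max_tokens: int) -> dict:
    start = time.perf_counter()
    result, tokens_used = step()  # step returns (result, tokens consumed)
    elapsed = time.perf_counter() - start
    return {
        "result": result,
        "elapsed_s": elapsed,
        "tokens": tokens_used,
        "within_budget": elapsed <= max_seconds and tokens_used <= max_tokens,
    }

report = run_with_budget(lambda: ("draft answer", 850),
                         max_seconds=5.0, max_tokens=1000)
```

Feeding these reports into your traces makes "which step blew the budget" a query, not an investigation.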
Design tips from the field
- Start small: two agents (Orchestrator + specialist) are enough to prove value. Add more only when clarity or latency improves.
- Isolate responsibilities: retrieval should never mutate systems; action agents shouldn’t reason about policy.
- Make steps idempotent so retries are safe; use correlation IDs in action agents.
- Guardrails > guesswork: define gated actions for irreversible operations; log every action payload for audits.
- Cost & latency budgets: use cheaper models for retrieval/formatting; reserve premium reasoning where it moves the KPI.
- Humans in the loop: approvals, exceptions, and SLA breaches should route to people—with crisp, linked evidence packs.
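The idempotency tip above is worth one concrete sketch: an action agent keys its side effects on a correlation ID, so a retried delegation is a safe no-op. The in-memory store stands in for a real systems-of-record API; the class and method names are invented.

```python
# Idempotent action agent: a retried call with the same correlation ID
# returns the cached result instead of duplicating the side effect.
class ActionAgent:
    def __init__(self):
        self._applied = {}  # correlation_id -> result (audit record)

    def place_order(self, correlation_id: str, payload: dict) -> str:
        if correlation_id in self._applied:     # retry: safe no-op
            return self._applied[correlation_id]
        result = f"order for {payload['sku']} placed"
        self._applied[correlation_id] = result  # log payload here for audits
        return result

agent = ActionAgent()
first = agent.place_order("corr-123", {"sku": "A1"})
second = agent.place_order("corr-123", {"sku": "A1"})  # retried, not duplicated
```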
Resources to go deeper
- Azure AI Foundry: AI app & agent factory (multi‑agent overview, A2A, MCP, workflows)
- How‑to: Build “connected agents” (step‑by‑step)
- Architecture Center: AI agent orchestration patterns
- Multi‑agent systems & MCP tools (TechCommunity explainer + Learn modules)