May 10, 2025
In today’s data-driven world, organizations are increasingly turning to AI for document understanding. Whether you are extracting data from invoices, contracts, ID cards, or complex forms, Azure Document Intelligence (formerly known as Form Recognizer) provides a robust, AI-powered solution for automated document processing.
But what happens when you want to scale, secure, and load balance your document intelligence backend for high availability and enterprise-grade integration?
Enter Azure API Management (APIM) — your gateway to efficient, scalable API orchestration.
In this blog, we’ll explore how to integrate Azure Document Intelligence with APIM using a load-balanced architecture that works seamlessly with the Document Intelligence SDK — without rewriting your application logic.
The Azure Document Intelligence SDKs simplify working with long-running document analysis operations, particularly asynchronous calls, by handling the polling and response parsing under the hood.
Why Use API Management with Document Intelligence?
While the SDK is great for client-side development, APIM adds essential capabilities for enterprise-scale deployments:
- 🔐 Security & authentication at the gateway level
- ⚖️ Load balancing across multiple backend instances
- 🔁 Circuit breakers, caching, and retries
- 📊 Monitoring and analytics
- 🔄 Response rewriting and dynamic routing
By routing all SDK and API calls through APIM, you get full control over traffic flow, visibility into usage patterns, and the ability to scale horizontally with multiple Document Intelligence backends.
SDK Behavior with Document Intelligence
When you call the Document Intelligence SDK (e.g., begin_analyze_document), it follows this two-step pattern:
- POST request to initiate document analysis
- Polling (GET) request to the operation-location URL until results are ready
This is an asynchronous pattern in which the SDK expects a polling URL in the operation-location header of the POST response. By default the backend sets that header to its own address, so the polling requests go straight to the backend and bypass APIM, which defeats the purpose of using APIM in the first place.
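To make the bypass concrete, here is a minimal simulation of the POST-then-poll pattern. It is a sketch, not the SDK's actual code: analyze_and_poll, make_fake_backend, and the Transport type are hypothetical names, and the fake backend simply returns a 202 with an operation-location header followed by "running" and "succeeded" statuses.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical transport type: (method, url) -> (status_code, headers, body)
Transport = Callable[[str, str], Tuple[int, Dict[str, str], dict]]


def analyze_and_poll(send: Transport, analyze_url: str,
                     interval_s: float = 0.0, max_polls: int = 10) -> dict:
    """Start an analysis with POST, then poll the returned URL with GET."""
    status, headers, _ = send("POST", analyze_url)
    if status != 202:
        raise RuntimeError(f"unexpected status {status} from POST")
    # The client polls whatever URL the POST response hands back.
    poll_url = headers["operation-location"]
    for _ in range(max_polls):
        _, _, body = send("GET", poll_url)
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(interval_s)
    raise TimeoutError("operation did not finish in time")


def make_fake_backend(polls_until_done: int = 2):
    """In-memory stand-in for the service; records every request it receives."""
    calls = []
    state = {"polls": 0}

    def send(method: str, url: str):
        calls.append((method, url))
        if method == "POST":
            # Unmodified response: the header points at the backend itself.
            location = "https://doc-intel-backend-1/operations/123"
            return 202, {"operation-location": location}, {}
        state["polls"] += 1
        done = state["polls"] >= polls_until_done
        return 200, {}, {"status": "succeeded" if done else "running"}

    return send, calls


if __name__ == "__main__":
    send, calls = make_fake_backend()
    print(analyze_and_poll(send, "https://apim-host/analyze")["status"])
    for method, url in calls:
        print(method, url)
```

In this simulation only the POST is addressed to apim-host. Every GET goes to doc-intel-backend-1, because that is the URL the unmodified header contains.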
So what do we do?
The Smart Rewrite Strategy
We use APIM to intercept and rewrite the response from the POST call.
POST Flow
- SDK sends a POST to:
  https://apim-host/analyze
- APIM routes the request to one of the backend services:
  https://doc-intel-backend-1/analyze
- Backend responds with:
  operation-location: https://doc-intel-backend-1/operations/123
- APIM rewrites this header before returning to the client:
  operation-location: https://apim-host/operations/123?backend=doc-intel-backend-1
Now, the SDK will automatically poll APIM, not the backend directly.
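The rewrite itself is plain URL manipulation. The sketch below shows one way to express it outside APIM, assuming the gateway keeps the operation path and records the backend host in a backend query parameter, which is the URL shape used in the polling flow. rewrite_operation_location and APIM_HOST are illustrative names, not part of any SDK or of APIM.

```python
from urllib.parse import urlsplit, urlunsplit, urlencode

APIM_HOST = "apim-host"  # placeholder gateway host taken from the example URLs


def rewrite_operation_location(backend_location: str, apim_host: str = APIM_HOST) -> str:
    """Point the polling URL at the gateway and remember the backend in a query parameter."""
    parts = urlsplit(backend_location)
    # Keep any query string the backend already set, then append the backend host.
    routing = urlencode({"backend": parts.netloc})
    query = f"{parts.query}&{routing}" if parts.query else routing
    return urlunsplit((parts.scheme, apim_host, parts.path, query, ""))


if __name__ == "__main__":
    print(rewrite_operation_location("https://doc-intel-backend-1/operations/123"))
```

Keeping the backend's own query string matters in practice, since a polling URL may carry extra parameters that the backend expects to see again on the GET.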
GET (Polling) Flow
- In APIM, define the GET operation on a path that matches the polling URL (/operations/123 in this example)
- SDK polls:
  https://apim-host/operations/123?backend=doc-intel-backend-1
- APIM extracts the query parameter backend=doc-intel-backend-1
- APIM dynamically sets the backend URL for this request to:
  https://doc-intel-backend-1
- It forwards the request to:
  https://doc-intel-backend-1/operations/123
- Backend sends the status/result back to APIM, which returns it to the SDK.
All of this happens transparently to the SDK.
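The routing decision in the polling flow can be sketched the same way. This is an illustration under assumptions rather than APIM's implementation: resolve_poll_target and the KNOWN_BACKENDS allow-list are hypothetical. The allow-list is an addition of this sketch, included because a gateway that forwards to any host named in a query parameter could be pointed at arbitrary servers.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical allow-list; only these hosts may be targeted through the gateway.
KNOWN_BACKENDS = {"doc-intel-backend-1", "doc-intel-backend-2"}


def resolve_poll_target(apim_url: str) -> str:
    """Turn the gateway polling URL back into the URL of the backend that owns the operation."""
    parts = urlsplit(apim_url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    backend = next((value for key, value in params if key == "backend"), None)
    if backend is None:
        raise ValueError("missing backend query parameter")
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    # Drop the routing parameter and forward everything else unchanged.
    remaining = urlencode([(key, value) for key, value in params if key != "backend"])
    return urlunsplit((parts.scheme, backend, parts.path, remaining, ""))


if __name__ == "__main__":
    print(resolve_poll_target("https://apim-host/operations/123?backend=doc-intel-backend-1"))
```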
Sample policies
Outbound policy for POST /documentintelligence/documentModels/prebuilt-read:analyze

<!--
    - Policies are applied in the order they appear.
    - Position <base/> inside a section to inherit policies from the outer scope.
    - Comments within policies are not preserved.
-->
<!-- Add policies as children to the <inbound>, <outbound>, <backend>, and <on-error> elements -->
<outbound>
    <base />
    <!-- Assumed wrapper: the expression's return value becomes the new operation-location header -->
    <set-header name="operation-location" exists-action="override">
        <value>@{
            // Original operation-location from backend
            var originalOpLoc = context.Response.Headers.GetValueOrDefault("operation-location", "");
            // Encode original URL to pass as query parameter
            var encoded = System.Net.WebUtility.UrlEncode(originalOpLoc);
            // Construct APIM URL pointing to poller endpoint with backendUrl
            var apimUrl = $"https://tstmdapim.azure-api.net/document-intelligent/poller?backendUrl={encoded}";
            return apimUrl;
        }</value>
    </set-header>
</outbound>
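Inbound policy for GET (note: the path of the GET operation must be changed to /document-intelligent/poller)

<!--
    - Policies are applied in the order they appear.
    - Position <base/> inside a section to inherit policies from the outer scope.
    - Comments within policies are not preserved.
-->
<!-- Add policies as children to the <inbound>, <outbound>, <backend>, and <on-error> elements -->
<inbound>
    <base />
    <set-variable name="decodedUrl" value="@{
        var backendUrlEncoded = context.Request.Url.Query.GetValueOrDefault("backendUrl", "");
        // Decode the URL; it may have been encoded more than once along the way
        var decoded = System.Net.WebUtility.UrlDecode(backendUrlEncoded);
        // Decode again only while decoding still changes the value,
        // so a stray percent sign cannot cause an endless loop
        var previous = backendUrlEncoded;
        while (decoded != previous && decoded.Contains("%"))
        {
            previous = decoded;
            decoded = System.Net.WebUtility.UrlDecode(decoded);
        }
        return decoded;
    }" />
    <!-- Element names from here on are assumed; only the expressions, the GET method,
         and the error body come from the original sample. Backend authentication is not shown. -->
    <choose>
        <when condition="@(!string.IsNullOrEmpty((string)context.Variables["decodedUrl"]))">
            <trace source="poller"><message>@((string)context.Variables["decodedUrl"])</message></trace>
            <send-request mode="new" response-variable-name="backendResponse" timeout="30" ignore-error="false">
                <set-url>@((string)context.Variables["decodedUrl"])</set-url>
                <set-method>GET</set-method>
            </send-request>
            <return-response response-variable-name="backendResponse" />
        </when>
        <otherwise>
            <return-response>
                <set-status code="400" reason="Bad Request" />
                <set-body>{"error": "Missing backendUrl query parameter."}</set-body>
            </return-response>
        </otherwise>
    </choose>
</inbound>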
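The sample policies pass the complete backend URL as a single backendUrl query parameter, which is why it has to be URL-encoded on the way out and decoded on the way in. The round trip below mirrors that idea in Python. The poller address is the one from the sample policy, while to_poller_url and from_poller_url are hypothetical helper names used only for this sketch.

```python
from typing import Optional
from urllib.parse import quote, urlsplit, parse_qs

# Poller endpoint as it appears in the sample outbound policy
POLLER_URL = "https://tstmdapim.azure-api.net/document-intelligent/poller"


def to_poller_url(operation_location: str) -> str:
    """Wrap the backend polling URL in a gateway URL, as the outbound policy does."""
    # safe="" also escapes ":", "/", "?" and "&", so the whole URL stays one parameter value
    return f"{POLLER_URL}?backendUrl={quote(operation_location, safe='')}"


def from_poller_url(poller_url: str) -> Optional[str]:
    """Recover the backend polling URL, or None when the parameter is missing."""
    values = parse_qs(urlsplit(poller_url).query).get("backendUrl")
    # parse_qs percent-decodes once, which is enough for a value encoded once
    return values[0] if values else None


if __name__ == "__main__":
    original = "https://doc-intel-backend-1/operations/123?api-version=2024-11-30"
    wrapped = to_poller_url(original)
    print(wrapped)
    print(from_poller_url(wrapped) == original)
```

Without the encoding step, the backend URL's own query string would merge with the gateway's, and the part after the first & would no longer belong to backendUrl.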
Load Balancing in APIM
You can configure multiple backend services in APIM and group them into a load-balanced backend pool to:
- Distribute POST requests across multiple Document Intelligence instances
- Use custom headers or variables to control backend selection
- Handle failure scenarios with circuit-breakers and retries
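As a rough mental model of what the gateway does with the POST requests, the sketch below rotates through a list of backends and skips any that have been marked as failing. It is not how APIM implements load balancing or circuit breaking, both of which are configured on the APIM backend resources. RoundRobinPool and its methods are hypothetical names for illustration only.

```python
from itertools import cycle
from typing import Iterable, Set


class RoundRobinPool:
    """Rotate through backends, skipping those whose circuit is currently open."""

    def __init__(self, backends: Iterable[str]):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._cycle = cycle(self._backends)
        self._tripped: Set[str] = set()

    def trip(self, backend: str) -> None:
        """Mark a backend as failing so it receives no new requests."""
        self._tripped.add(backend)

    def reset(self, backend: str) -> None:
        """Put a recovered backend back into rotation."""
        self._tripped.discard(backend)

    def next_backend(self) -> str:
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._tripped:
                return candidate
        raise RuntimeError("no healthy backend available")


if __name__ == "__main__":
    pool = RoundRobinPool(["doc-intel-backend-1", "doc-intel-backend-2"])
    pool.trip("doc-intel-backend-2")
    print([pool.next_backend() for _ in range(3)])
```

Only the initial POST needs this selection step. Polling requests must return to the backend that created the operation, which is what the rewritten operation-location header ensures.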
Reference: Azure API Management backends – Microsoft Learn
Sample: Using APIM Circuit Breaker & Load Balancing – Microsoft Community Hub