First introduced in public preview last year, Azure AI Content Understanding enables you to convert unstructured content (documents, audio, video, text, and images) into structured data. The service is designed to support consistent, high-quality output, directed improvements, built-in enrichment, and robust pre-processing to accelerate workflows and reduce cost.
A New Chapter in Content Understanding
Since our launch, we’ve seen customers push beyond simple data extraction toward agentic solutions that fully automate decisions. This requires more than just extracting fields. For example, a healthcare insurance provider's decision to pay a claim requires cross-checking against insurance policies, applicable contracts, and the patient’s medical history and prescription data. To do this, a system needs to interpret information in context and perform more complex enrichment and analysis across various data sources. Beyond field extraction, this requires a custom-designed workflow that leverages reasoning.
In response to this demand, Content Understanding now introduces Pro mode, which enables enhanced reasoning, validation, and information aggregation capabilities.
These updates allow the service to aggregate and compare results across sources, enrich extracted data with context, and deliver decisions as output. While Standard mode continues to offer reliable and scalable field extraction, Pro mode extends the service to support more complex content interpretation scenarios—enabling workflows that reflect the way people naturally reason over data.
With this update, Content Understanding now covers a much larger part of your data processing workflows, offering new ways to automate, streamline, and enhance decision-making based on unstructured information.
Key Benefits of Pro Mode
Packed with cutting-edge reasoning capabilities, Pro mode revolutionizes document analysis.
- Multi-Content Input
Process and aggregate information across multiple content files in a single request. Pro mode can build a unified schema from distributed data sources, enabling richer insight across documents.
- Multi-Step Reasoning
Go beyond basic extraction with a process that supports reasoning, linking, validation, and enrichment.
- Knowledge Base Integration
Integrate with organizational knowledge bases and domain-specific datasets to enhance field inference. This lets the service reason over how to generate each output using the context of your business.
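As a rough illustration of what building one unified record from several files involves, the Python sketch below merges per-file field dictionaries and tracks which file each value came from. It does not call Content Understanding: the file names, field names, and the `merge_fields` helper are all hypothetical, and the actual Pro mode output format is defined by the service documentation.

```python
from collections import defaultdict


def merge_fields(extractions):
    """Merge per-file field dicts into one record, tracking provenance.

    `extractions` maps a file name to the fields extracted from it.
    Returns (record, conflicts): `record` maps each field to its first-seen
    value plus every source that supplied it; `conflicts` lists the fields
    whose values disagree across sources.
    """
    seen = defaultdict(list)  # field name -> [(value, source), ...]
    for source, fields in extractions.items():
        for name, value in fields.items():
            seen[name].append((value, source))

    record, conflicts = {}, []
    for name, hits in seen.items():
        record[name] = {
            "value": hits[0][0],
            "sources": [source for _, source in hits],
        }
        if len({value for value, _ in hits}) > 1:
            conflicts.append(name)
    return record, conflicts


# Hypothetical extraction results for two files sent in one request.
extractions = {
    "claim_form.pdf": {"patient_id": "P-100", "claim_amount": 250.0},
    "policy.pdf": {"patient_id": "P-100", "coverage_limit": 1000.0},
}
record, conflicts = merge_fields(extractions)
```

Keeping the source list next to each value makes it easy to surface disagreements between files for review instead of silently picking one.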
When to Use Pro Mode
Pro mode, currently limited to documents, is designed for scenarios where content understanding needs to go beyond surface-level extraction—ideal for use cases that traditionally require postprocessing, human review and decision-making based on multiple data points and contextual references.
Pro mode enables intelligent processing that not only extracts data, but also validates, links, and enriches it. This is especially impactful when extracted information must be cross-referenced with external datasets or internal knowledge sources to ensure accuracy, consistency, and contextual depth.
Examples include:
- Invoice processing that reconciles against purchase orders and contract terms
- Healthcare claims validation using patient records and prescription history
- Legal document review where clauses reference related agreements or precedents
- Manufacturing spec checks against internal design standards and safety guidelines
With much of the reasoning automated, you can focus on higher-value tasks. Pro mode helps reduce manual effort, minimize errors, and accelerate time to insight, unlocking new potential for downstream applications, including those that emulate higher-order decision-making.
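To make the invoice example concrete, here is a minimal sketch of the kind of post-processing check that teams often hand-write after extraction, and that this style of reasoning is meant to absorb. It is plain Python and does not use the Content Understanding API. The `LineItem` type, the `reconcile` function, the tolerance value, and the sample data are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    sku: str
    quantity: int
    unit_price: float


def reconcile(invoice_items, po_items, price_tolerance=0.01):
    """Compare invoice line items with purchase-order line items by SKU.

    Returns a list of discrepancy messages; an empty list means the
    invoice is consistent with the purchase order.
    """
    po_by_sku = {item.sku: item for item in po_items}
    issues = []
    for inv in invoice_items:
        po = po_by_sku.get(inv.sku)
        if po is None:
            issues.append(f"{inv.sku}: not on purchase order")
            continue
        if inv.quantity > po.quantity:
            issues.append(
                f"{inv.sku}: billed {inv.quantity}, ordered {po.quantity}"
            )
        if abs(inv.unit_price - po.unit_price) > price_tolerance:
            issues.append(
                f"{inv.sku}: price {inv.unit_price} differs from agreed {po.unit_price}"
            )
    return issues


# Hypothetical fields, as they might look after extraction from two documents.
purchase_order = [LineItem("A-1", 10, 5.00), LineItem("B-2", 2, 40.00)]
invoice = [LineItem("A-1", 12, 5.00), LineItem("C-3", 1, 9.99)]
issues = reconcile(invoice, purchase_order)
```

In this sample the check flags the over-billed quantity on `A-1` and the unordered item `C-3`. A real workflow would also need contract terms, currency handling, and rules for partial shipments.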
Simplified Pricing Model
This release introduces a simplified pricing structure that significantly reduces costs across all content modalities compared to previous versions, making enterprise-scale deployment more affordable and predictable.
Expanded Feature Coverage
We are also extending capabilities across various content types:
- Structured Document Outputs: Improved handling of tables spanning multiple pages, recognition of selection marks, and support for additional file types like .docx, .xlsx, .pptx, .msg, .eml, .rtf, .html, .md, and .xml.
- Classifier API: Automatically categorize/split and route documents to appropriate processing pipelines.
- Video Analysis: Extract data across an entire video or break a video into chapters automatically. Enrich metadata with face identification and descriptions that include facial images.
- Face API Preview: Detect, recognize, and enroll faces, enabling richer user-aware applications.
Learn in detail about each of these capabilities and more!
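The routing step that follows classification can be sketched as a simple dispatch table. The example below assumes the classifier has already produced a category label for each document. The labels, handler functions, and file names are hypothetical, and nothing here reflects the actual Classifier API request or response shape.

```python
def process_invoice(doc):
    return f"invoice pipeline <- {doc}"


def process_claim(doc):
    return f"claims pipeline <- {doc}"


def send_to_review(doc):
    # Fallback for categories that have no dedicated pipeline.
    return f"manual review <- {doc}"


# Hypothetical category labels mapped to their processing pipelines.
ROUTES = {"invoice": process_invoice, "claim": process_claim}


def route(classified_docs):
    """Dispatch (category, document) pairs to the handler for that category."""
    return [
        ROUTES.get(category, send_to_review)(doc)
        for category, doc in classified_docs
    ]


classified = [
    ("invoice", "scan_001.pdf"),
    ("claim", "scan_002.pdf"),
    ("memo", "scan_003.pdf"),
]
results = route(classified)
```

An explicit fallback handler keeps unrecognized categories from being dropped silently.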
Let’s hear it from our customers
Customers all over the globe are using Content Understanding as a one-stop solution, leveraging its advanced modes of reasoning, grounding, and confidence scores across diverse content types.
ASC: AI-based analytics in ASC’s Recording Insights platform allows customers to move to 100% compliance review coverage of conversations across multiple channels. ASC’s integration of Content Understanding replaces a previously complex setup, where multiple separate AI services had to be manually connected, with a single multimodal solution that delivers transcription, summarization, sentiment analysis, and data extraction in one streamlined interface. This shift not only simplifies implementation and accelerates time-to-value but has also received positive customer feedback for its powerful features and the quick, hands-on support from Microsoft product teams.
“With the integration of Content Understanding into the ASC Recording Insights platform, ASC was able to reduce R&D effort by 30% and achieve 5 times faster results than before. This helps ASC drive customer satisfaction and stay ahead of competition.”
—Tobias Fengler, Chief Engineering Officer, ASC.
To learn more about ASC’s integration, check out From Complexity to Simplicity: The ASC and Azure AI Partnership.
Ramp: Ramp, the all-in-one financial operations platform, is exploring how Azure AI Content Understanding can help transform receipts, bills, and multi-line invoices into structured data automatically. Ramp is leveraging the pre-built invoice template and experimenting with custom extraction capabilities across various document types. These experiments are helping Ramp evaluate how to further reduce manual entry and enhance the real-time logic that powers approvals, policy checks, and reconciliation.
“Content Understanding gives us a single API to parse every receipt and statement we see—then lets our own AI reason over that data in real time. It’s an efficient path from image to fully reconciled expense.”
— Rahul S, Head of AI, Ramp
MediaKind: MK.IO’s cloud-native video platform, available on Azure Marketplace, now integrates Azure AI Content Understanding to make it easy for developers to personalize streaming experiences. With just a few lines of code, you can turn full game footage into real-time, fan-specific highlight reels using AI-driven metadata like player actions, commentary, and key moments.
“Azure AI Content Understanding gives us a new level of control and flexibility—letting us generate insights instantly, personalize streams automatically, and unlock new ways to engage and monetize. It’s video, reimagined.”
—Erik Ramberg, VP, MediaKind
Catch the full story in our breakout session at Build 2025 on May 18: My Game, My Way, where we walk you through the creation of personalized highlight reels in real time. You’ll never look at your TV in the same way again.
Getting Started
- Build your own Content Understanding solution in Azure AI Foundry. Pro mode will be available in the Foundry starting June 1, 2025.
- Refer to our documentation and sample code on Content Understanding.
- Explore the video series on getting started with Content Understanding.