
June 20, 2025

Healthcare AI insights, or enrichments, are crucial for personalizing care plans, accelerating diagnoses, and supporting data-driven healthcare decisions. However, generating, storing, integrating, and retrieving AI enrichments for these applications can be complex: the process often involves dealing with the diverse input and output structures of AI models, managing large volumes of data, and ensuring data privacy and security. This is where the orchestrate multimodal AI insights capability in healthcare data solutions in Microsoft Fabric steps in. It enables you to address all of these challenges in a scalable and efficient manner.
The orchestrate multimodal AI insights capability stores both AI enrichment metadata and data using:
- A Fabric Lakehouse that contains several metadata tables for traceability and reuse.
- Five data tables in the Silver Lakehouse that store the actual AI enrichment data.
The following diagram shows the architecture of healthcare data solutions in Microsoft Fabric:
Next, we’ll explore the core features of the orchestrate multimodal AI insights capability, specifically its storage and retrieval functionalities.
1. AI enrichments are stored in standardized schemas
The orchestrate multimodal AI insights capability addresses the varied shapes of healthcare AI enrichments by providing a comprehensive schema. It classifies AI enrichment data into five types and provides the corresponding tables in the Silver lakehouse:
| Enrichment type | Description | Example model |
| --- | --- | --- |
| TextEnrichments | Natural language processing (NLP) entities | |
| EmbeddingEnrichments | Vector embeddings | |
| Segmentation2DEnrichments | 2D segmentations | |
| Segmentation3DEnrichments | 3D segmentations | |
| ObjectEnrichments | Entity relations, patient graphs | |
The schemas for these tables follow a simple recipe: each schema is composed of a set of common columns that appear in all tables, plus table-specific columns. The common columns are: unique_id, model_name, model_version, patient_id, enrichment_generation_id, enrichment_context_id, metadata, confidence_score, msftModifiedDateTime, and msftCreatedDateTime.
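For illustration, a single enrichment row therefore carries at least the common columns listed above. The sketch below shows such a row as a plain Python dictionary; the column names come from the schema, while every value is fabricated for this example:

```python
# Illustrative only: a fabricated row showing the common columns shared by
# all five enrichment data tables. Values are placeholders, not real data.
common_row = {
    "unique_id": "0001",
    "model_name": "example-model",            # hypothetical model name
    "model_version": "1.0",
    "patient_id": "patient-123",
    "enrichment_generation_id": "gen-001",
    "enrichment_context_id": "ctx-abc",
    "metadata": "{}",
    "confidence_score": 0.97,
    "msftModifiedDateTime": "2025-06-20T00:00:00Z",
    "msftCreatedDateTime": "2025-06-20T00:00:00Z",
}
print(sorted(common_row))
```

Each of the five data tables adds its own table-specific columns on top of this shared set.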
The orchestrate multimodal AI insights capability also maintains metadata tables:
- Tables for reusable metadata, namely the Enrichment, EnrichmentContext, and EnrichmentView tables. These are used by the enrichment generation pipelines.
- Tables that track runtime state, such as pipeline invocations, namely the EnrichmentMaterializedView and EnrichmentGeneratedInfo tables.
2. Traceability and uniqueness of AI enrichments are enforced
AI enrichments are generated from input data and metadata, and traceability is ensured through unique enrichment context identifiers. The totality of the model configuration and its inputs defines the enrichment traceability context, or, in short, the enrichment context. AI enrichments are meaningful only if the context is known. Consequently, the system keeps the context along with the enrichments:
- The EnrichmentContext metadata table stores the context, its hashed representation (the enrichment_context_id), and other metadata.
- The enrichment_context_id column, present in all five enrichment data tables, stores the hashed enrichment context value.
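The enrichment_context_id values in this section are 64 hexadecimal characters, which is consistent with a SHA-256 digest. As a purely hypothetical sketch (the exact serialization and hash the product uses are not documented here), hashing a context serialized as sorted-key JSON would look like this:

```python
import hashlib
import json

# Hypothetical: derive a stable id from a context (model configuration plus
# inputs). The real pipeline's canonicalization may differ; this only
# illustrates why identical contexts map to the same enrichment_context_id.
context = {
    "model_name": "example-model",        # fabricated example context
    "model_version": "1.0",
    "inputs": {"document_id": "doc-1"},
}
canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
enrichment_context_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
print(enrichment_context_id)  # 64 hex characters
```

The key property is determinism: re-running a model with the same configuration and inputs reproduces the same id, so enrichments can be deduplicated and traced.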
Given a particular enrichment, say, a text enrichment from the TextEnrichments table with an enrichment_context_id of "3ef2b635450a7a18628bb67ecee5c6d192f9172a3e7dbc4c20544f9deba5bbdf", you can determine the full context that was used to generate this enrichment as follows:

Example PySpark code snippet

temp_df = spark.sql("""
    SELECT * FROM EnrichmentContext
    WHERE enrichment_context_id == '3ef2b635450a7a18628bb67ecee5c6d192f9172a3e7dbc4c20544f9deba5bbdf'
""")
temp_df.select("enrichment_context").show(truncate=False)
Note that, depending on the AI model, a given enrichment context can refer to multiple enrichment rows (outputs). A good example is Azure text analytics for health, which produces multiple outputs (entities and relations) for a given clinical note.
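This one-to-many relationship can be illustrated with a toy, in-memory stand-in for the TextEnrichments table (the rows are fabricated; only the column names come from the schema above):

```python
from collections import Counter

# Toy stand-in for the TextEnrichments table. One clinical note processed by
# an NLP model yields several rows (entities and relations) that all share
# the same enrichment_context_id.
text_enrichments = [
    {"unique_id": "e1", "enrichment_context_id": "ctx-a"},  # entity
    {"unique_id": "e2", "enrichment_context_id": "ctx-a"},  # entity
    {"unique_id": "e3", "enrichment_context_id": "ctx-a"},  # relation
    {"unique_id": "e4", "enrichment_context_id": "ctx-b"},  # entity
]

rows_per_context = Counter(r["enrichment_context_id"] for r in text_enrichments)
print(rows_per_context)
```

In the real tables, the same grouping (a GROUP BY on enrichment_context_id) shows how many outputs each model invocation produced.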
3. Rich metadata for optimal AI and analytics workflows
The orchestrate multimodal AI insights capability is rich in metadata, encouraging a define-once, run-many-times (and iterative) execution pattern. In addition, because Fabric is an open architecture, so is the orchestrate multimodal AI insights capability. This means you can easily build analytics and AI applications on top of it. For example, after deploying the capability, you can generate 2D medical image segmentations using MedImageParse, store them, and visualize them in a Fabric notebook using the following steps:
- Ingest your medical imaging dataset: https://learn.microsoft.com/en-us/training/modules/healthcare-dicom-ingestion/?source=recommendations
- Run the MedImageParse notebook: https://learn.microsoft.com/en-us/industry/healthcare/healthcare-data-solutions/orchestrate-multimodal-ai-insights?toc=%2Findustry%2Fhealthcare%2Ftoc.json&bc=%2Findustry%2Fbreadcrumb%2Ftoc.json
- Run the healthcare#_msft_ai_enrichments_ingestion data pipeline to ingest the generated enrichments into the enrichment data tables in the Silver lakehouse.
- At this point, the Segmentation2DEnrichments table holds the segmentations from MedImageParse, and the corresponding contexts are in the EnrichmentContext table.
- To display a sample raw image and its corresponding segmentations:
  - Select a row from the Segmentation2DEnrichments table. Resolve the RLE value into a mask array using the Python code snippet below. Also, resolve the enrichment_context_id from the EnrichmentContext table to obtain the corresponding document id (in this case, a DICOM instance id).
  - Use the instance id to obtain the DICOM file reference from the filePath field in the ImagingMetastore table.
Sample Python code to resolve RLE encoding to an image mask

import numpy as np

def rle_decode(mask_rle, shape):
    '''
    mask_rle: run-length as string formatted (start length)
    shape: (height, width) of array to return
    Returns numpy array, 1 - mask, 0 - background
    '''
    s = mask_rle[1:-1].split(",")
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape)
The following is an example of an original raw image and a MedImageParse segmentation.
4. Optimized storage
The orchestrate multimodal AI insights capability implements storage optimization techniques such as run-length encoding (RLE). For example, a 2D segmentation mask from MedImageParse can be several megabytes in size; with RLE, the segmentation generation process can compress the data by at least two orders of magnitude, enabling faster processing and lower storage costs. The 2D and 3D segmentation data tables in the Silver layer, namely Segmentation2DEnrichments and Segmentation3DEnrichments, store this compressed representation in the value column when the segmentation_type column is set to "rle".
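To make the compression ratio concrete, the sketch below pairs the rle_decode helper shown earlier with a hypothetical encoder. The encoder format is inferred from what that decoder expects ("[start,length,...]" pairs with 1-based starts over the flattened mask); it is not the product's actual writer. On a synthetic 512x512 mask with a single 100x100 foreground region, the RLE string is a few hundred times smaller than the raw one-byte-per-pixel mask:

```python
import numpy as np

def rle_encode(mask):
    """Hypothetical encoder matching the rle_decode format shown earlier.

    Emits "[start,length,start,length,...]" with 1-based start positions
    over the flattened mask. The product's actual writer may differ.
    """
    flat = np.asarray(mask, dtype=np.int64).flatten()
    # Pad with zeros so runs touching the array edges are detected.
    padded = np.concatenate([[0], flat, [0]])
    diffs = np.diff(padded)
    starts = np.where(diffs == 1)[0] + 1            # 1-based run starts
    lengths = np.where(diffs == -1)[0] + 1 - starts  # run lengths
    pairs = np.empty(2 * len(starts), dtype=int)
    pairs[0::2] = starts
    pairs[1::2] = lengths
    return "[" + ",".join(map(str, pairs)) + "]"

# Synthetic 512x512 mask with one 100x100 foreground square.
mask = np.zeros((512, 512), dtype=np.uint8)
mask[100:200, 100:200] = 1

rle = rle_encode(mask)
raw_bytes = mask.size   # one byte per pixel, uncompressed
rle_bytes = len(rle)    # length of the RLE string
print(f"raw: {raw_bytes} bytes, rle: {rle_bytes} bytes, "
      f"ratio: {raw_bytes / rle_bytes:.0f}x")
```

Typical segmentation masks are sparse and spatially contiguous, which is exactly the case where run-length encoding pays off; masks with highly fragmented foreground compress less well.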
In conclusion, the orchestrate multimodal AI insights capability provides a comprehensive solution for generating, storing, integrating, and retrieving AI enrichments in healthcare. Its open, traceable, scalable, and efficient architecture makes it an ideal choice if you are dealing with large amounts of healthcare AI data. Furthermore, the enrichment generation framework, supporting pipelines, and Fabric notebooks enable smooth enrichment generation, integration, and management, making it easy to plug into the AI enrichments facility at different stages.
MEDICAL DEVICE DISCLAIMER. Microsoft product(s)/service(s) are not designed, intended or made available as a medical device(s), and are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment.
The following are helpful resources:
- Healthcare Data Solutions in Microsoft Fabric
- Healthcare AI models: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/healthcare-ai/healthcare-ai-models