Sensitivity Auto-labelling via Document Property

Published by azurefeeds on August 6, 2025

Why is this needed?

Sensitivity labels are generally relevant within an organisation only. If a file is labelled within one environment and then moved to another environment, sensitivity label content markings may be visible, but by default, the applied sensitivity label will not be understood. This can lead to scenarios where information that has been generated externally is not adequately protected.

My favourite analogy for these scenarios is to consider the parallels between receiving sensitive information and unpacking groceries. When unpacking groceries, you might sit your grocery bag on a counter or on the floor next to the pantry. You’ll likely then unpack each item, take a look at it and then decide where to place it. Without looking at an item to determine its correct location, you might place it in the wrong location. Porridge might be safe from the kids on the bottom shelf. If you place items that need to be protected, such as chocolate, on the bottom shelf, it’s not likely to last very long.

So, I affectionately refer to information that hasn’t been evaluated as ‘porridge’, as until it has been checked, it will end up on the bottom shelf of the pantry where it is quite accessible. Label-based security controls, such as Data Loss Prevention (DLP) policies using conditions of ‘content contains sensitivity label’, will not apply to these items. To ensure the security of any contained sensitive information, we should look for potential clues to its sensitivity and then utilize these clues to ensure that the contained information is adequately protected. In other words, we take a closer look at the ‘porridge’, determine whether it’s an item that needs protection and, if so, move it to a higher shelf in the pantry so that it’s out of reach of the kids.

Figure 1: Diagram showing auto-labelling increasing the sensitivity of a received file.

Effective use of Purview revolves around the use of ‘know your data’ strategies. We should be using as many methods as possible to try to determine the sensitivity of items. This can include the use of Sensitive Information Types (SITs) containing keyword or pattern-based classifiers, trainable classifiers, Exact Data Match, Document fingerprinting, etc.

Matching items via SITs present in the item’s content can be problematic due to false positives. Keywords like ‘Sensitive’ or ‘Protected’ may be mentioned out of context, such as when referring to a classification or an environment.

When classifications have been stamped via a property, it allows us to match via context rather than content. We don’t need to guess at an item’s sensitivity if another system has already established what the item’s classification is. These methods are much less prone to false positives.

Why isn’t everyone doing this?

Document properties are often not considered in Purview deployments. SharePoint metadata management seems to be a dying artform and most compliance or security resources completing Purview configurations don’t have this skill set. There’s also a lack of understanding of the relevance of checking for item properties. Microsoft haven’t helped as the documentation in this space is somewhat lacking and needs to be unpicked via some aligning DLP guidance (Create a DLP policy to protect documents with FCI or other properties). Many of these configurations will also be tied to regional requirements. Document properties being used by systems where I’m from, in Australia, will likely be very different to those used in other parts of the world.

In the following sections, we’ll take a look at applicable use cases and walk through how to enable these configurations.

Scenarios for use

Labelling via document property isn’t for everyone. If your organisation is new to classification or you don’t have external partners that you collaborate with at higher sensitivity levels, then this likely isn’t for you. For those that collaborate heavily and have a shared classification framework, as is often seen across government, this is a must! This approach will also be highly relevant to multi-tenant organisations or conglomerates where information is regularly shared between environments.

The following scenarios are examples of where this configuration will be relevant:

1. Migrating from 3rd party classification tools

If an item has been previously stamped by a 3rd party classification tool, then evaluating its applied document properties will provide a clear picture of its security classification. These properties can then be used in service-based auto-labelling policies to effectively transition items from 3rd party tools to Microsoft Purview sensitivity labels. As labels are applied to items, they will be brought into scope of label-based controls.
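To see which properties a 3rd party tool has actually stamped on a file, you can read them straight out of the document. The following is a minimal Python sketch rather than anything Purview itself does. It assumes the file is an Office Open XML document (.docx, .xlsx, .pptx) and that custom properties sit in the docProps/custom.xml part, which is where Word normally writes them. The function name is my own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_PART = "docProps/custom.xml"
NS = {"cp": "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"}


def read_custom_properties(package_bytes: bytes) -> dict:
    """Return {property name: value} from an Office file's custom properties part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        if CUSTOM_PART not in package.namelist():
            return {}  # the file carries no custom document properties
        root = ET.fromstring(package.read(CUSTOM_PART))

    properties = {}
    for prop in root.findall("cp:property", NS):
        # Each <property> wraps one typed value element, e.g. <vt:lpwstr>.
        value = next(iter(prop), None)
        properties[prop.get("name")] = (value.text or "") if value is not None else ""
    return properties
```

Calling it with the bytes of a received file, for example read_custom_properties(open("incoming.docx", "rb").read()), returns a dictionary of property names and values that you can compare against the properties you intend to map.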

2. Detecting data spill

Data spill is a term that is used to define situations where information that is of a higher than permitted security classification lands in an environment. Consider a Microsoft 365 tenant that is approved for the storage of Official information but Top Secret files are uploaded to it. Document properties that align with higher than permitted classifications provide us with an almost guaranteed method of identifying spilled items. Pairing this document property with an auto-labelling policy allows for the application of encryption to lock unauthorized users out of the items. Tools like Content Explorer and eDiscovery can then be used to easily perform cleanup activities.

If using document properties and auto-labelling for this purpose, keep in mind that you’ll need to create sensitivity labels for higher than permitted classifications in order to catch spilled items. These labels won’t impact usability as you won’t publish them to users. You will, however, need to publish them to a single user or break glass account so that they’re not ignored by auto-labelling.
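The spill logic itself is simple to express. The Python sketch below is illustrative only: the classification ordering is a hypothetical example that you would replace with your own framework, and the function name is mine. It shows why you need a label for every classification above your permitted ceiling, as each of those values is one you want to catch.

```python
# Hypothetical ordering, lowest to highest; replace with your own framework.
CLASSIFICATION_ORDER = ["OFFICIAL", "PROTECTED", "SECRET", "TOP SECRET"]


def is_spill(property_value: str, highest_permitted: str) -> bool:
    """True when a stamped classification outranks what the tenant may store."""
    rank = {name: position for position, name in enumerate(CLASSIFICATION_ORDER)}
    value = property_value.strip().upper()
    if value not in rank:
        return False  # unknown marking: leave it for manual review
    return rank[value] > rank[highest_permitted.strip().upper()]
```

For a tenant approved up to Official, is_spill("Top Secret", "OFFICIAL") flags the item, while is_spill("Official", "OFFICIAL") does not.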

3. Blocking access by AI tools

If your organization is concerned about items with certain properties applied being accessed by generative AI tools, such as Copilot, you could use auto-labelling to apply a sensitivity label that restricts EXTRACT permissions. You can find some information on this at Microsoft 365 Copilot data protection architecture | Microsoft Learn. This should be relevant for spilled data, but might also be useful in situations where there are certain records that have been marked via properties and which should not be Copilot accessible.

4. External Microsoft Purview Configurations

Sensitivity labels are relevant internally only. A label, in its raw form, is essentially a piece of metadata with an ID (or GUID) that we stamp on pieces of information. These GUIDs are understood by your tenant only. If an item marked with a GUID shows up in another Microsoft 365 tenant, the GUID won’t correspond with any of that tenant’s labels or label-based controls. The art in Microsoft Purview lies in interpreting the sensitivity of items based on content markings and other identifiers, so that data security can be maintained. Document properties applied by Purview, such as ClassificationContentMarkingHeaderText, are not tied to a specific tenant, which makes them portable. We can use these properties to help maintain classifications as items move between environments.
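To make the difference between a tenant-bound GUID and a portable property concrete, here is a small Python sketch. The GUIDs and the function are hypothetical, and it assumes the content marking text matches one of your own label names, which won’t always be the case. It is a way of thinking about the problem, not a description of how Purview resolves labels internally.

```python
from typing import Optional

# Hypothetical label IDs for *this* tenant; real GUIDs come from your own labels.
TENANT_LABELS = {
    "11111111-1111-1111-1111-111111111111": "OFFICIAL",
    "22222222-2222-2222-2222-222222222222": "PROTECTED",
}

PORTABLE_PROPERTY = "ClassificationContentMarkingHeaderText"


def interpret_classification(label_guid: Optional[str], properties: dict) -> Optional[str]:
    """Prefer a label GUID this tenant recognises, else fall back to the portable property."""
    if label_guid in TENANT_LABELS:
        return TENANT_LABELS[label_guid]
    marking = properties.get(PORTABLE_PROPERTY, "").strip().upper()
    # Only trust markings that correspond to one of our own label names.
    return marking if marking in TENANT_LABELS.values() else None
```

An item arriving with another tenant’s GUID resolves to nothing via the GUID alone, but if its header text property reads PROTECTED, the function still returns a classification that can be acted on.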

5. Utilizing metadata applied by Records Management solutions

Some EDRMS, Records or Content Management solutions will apply properties to items. If an item has been previously managed and then stamped with properties, potentially including a security classification, via one of these systems, we could use this information to inform sensitivity label application.

6. 3rd party classification tools used externally

Even if your organisation hasn’t been using 3rd party classification tools, you should consider that partner organisations, such as other Government departments, might be. Evaluating the properties applied by external organisations to items that you receive will allow you to extend protections to these items. If classification tools like Janus or Titus are used in your geography/industry, then you may want to consider checking for their properties.

Regarding the use of auto-classification tools

Some organisations, particularly those in Government, will have organisational policies that prevent the use of automatic classification capabilities. These policies are intended to ensure that each item is assessed by an actual person for risk of disclosure rather than via an automated service that could be prone to error. However, when auto-labelling is used to interpret and honour existing classifications, we are lowering rather than raising the risk profile.

If the item’s existing classification (applied via property) is ignored, the item will be treated as porridge and is likely to be at risk.

If auto-labelling is able to identify a high-risk item and apply the relevant label, it will then be within scope of Purview’s data security controls, including label-based DLP, groups and sites data out of place alerting, and potentially even item encryption.

The outcome is that, through the use of auto-labelling, we are able to significantly reduce risk of inappropriate or unintended disclosure.

Configuration Process

Setting up document property-based auto-labelling is fairly straightforward. We need to set up a managed property and then utilize it in an auto-labelling policy. Below, I’ve split this process into 6 steps:

Step 1 – Prepare your files

In order to make use of document properties, an item with the properties applied will first need to be indexed by SharePoint. SharePoint will record the properties as ‘crawled properties’, which we’ll then need to convert into ‘managed properties’ to make them useful.

If you already have items with the relevant properties stored in SharePoint, then they are likely already indexed. If not, you’ll need to upload or create an item or items with the properties applied.

For testing, you’ll want to create a file with each property/value combination so that you can confirm that your auto-labelling policies are all working correctly. This could require quite a few files depending on the number of properties you’re looking for. To kick off your crawled property generation though, you could create or upload a single file with the correct properties applied. For example:

Figure 2: Document properties applied to a Word document.

In the above, I’ve created properties for ClassificationContentMarkingHeaderText and ClassificationContentMarkingFooterText, which you’ll often see applied by Purview when an item has a sensitivity label content marking applied to it. I’ve also included properties to help identify items classified via JanusSeal, Titus and Objective.
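If you are building the full set of test files (one per property/value combination), it can help to enumerate the combinations up front so that none are missed. The Python sketch below does only that. The property names are taken from the example condition string used later in this article, while the values and the file naming scheme are hypothetical placeholders.

```python
from itertools import product

PROPERTIES = [
    "ClassificationContentMarkingHeaderText",
    "ClassificationContentMarkingFooterText",
    "Objective-Classification",
    "PMDisplay",
    "TitusSEC",
]
VALUES = ["OFFICIAL", "PROTECTED"]  # hypothetical: use your own classifications


def plan_test_files(properties: list, values: list) -> list:
    """Return (file name, property, value) for every property/value combination."""
    return [
        (f"{prop}_{value}.docx", prop, value)
        for prop, value in product(properties, values)
    ]


for file_name, prop, value in plan_test_files(PROPERTIES, VALUES):
    print(f"{file_name}: set {prop} = {value}")
```

With five properties and two values, this lists ten files to create and upload.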

Step 2 – Index the files

After creating or uploading your file, we then need SharePoint to index it. How quickly this happens depends on the size of your environment. I’d expect to wait somewhere between 10 minutes and 24 hours. If you’re not in a hurry, then I’d recommend just checking back the next day.

You’ll know that this has completed when you head into SharePoint Admin > More features > Search > Managed Search Schema > Crawled Properties and can find your newly indexed properties:

Figure 3: Finding your newly indexed properties in Crawled Properties

Step 3 – Configure managed properties

Next, the properties need to be configured as managed properties. To do this, go to SharePoint Admin > More features > Search > Managed Search Schema > Managed Properties.

Create a new managed property and give it a name. Note that there are some character restrictions in naming, but you should be able to get it close to your document property name. Set the property’s type to Text, then select Queryable and Retrievable.

Under ‘mappings to crawled properties’, choose add mapping, then search for and select the crawled property that was indexed from your document property. Note that the crawled property will have the same name as your document property, so there’s no need to browse through all of them:

Figure 4: Screenshot of crawled property selection when selecting a managed property.

Repeat this so that you have a managed property for each document property that you want to look for.
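Because the managed properties are queryable, one way to sanity-check them before building policies is to search on them. The Python sketch below only builds a SharePoint search REST URL using a property:value query. The site URL is a placeholder, the function name is mine, and you would still need to call the URL with an authenticated session. Results may also lag until the content has been re-indexed after the mapping was created.

```python
from urllib.parse import quote


def build_search_url(site_url: str, managed_property: str, value: str) -> str:
    """Build a SharePoint search REST URL that queries a managed property."""
    kql = f'{managed_property}:"{value}"'  # property:value query syntax
    query_text = quote(f"'{kql}'")  # the REST API expects the text in single quotes
    select = quote("'Title,Path'")
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext={query_text}&selectproperties={select}"
    )


# Placeholder tenant and site; replace with your own.
print(build_search_url("https://contoso.sharepoint.com/sites/test", "TitusSEC", "PROTECTED"))
```

If the query returns your test file, the managed property is mapped and populated. If it returns nothing, check the mapping and allow more time for indexing.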

Step 4 – Configure Auto-labelling policies

Next up, create some auto-labelling policies. You’ll need one policy for each label that you want to apply, rather than one per property, as a single auto-labelling policy can check multiple properties.

– From within Purview, head to Information Protection > Policies > Auto-labelling policies.

– Create a new policy using the custom policy template.

– Give your policy an appropriate name (e.g. Label PROTECTED via property).

– Select the label that you want to apply (e.g. PROTECTED).

– Select SharePoint based services (SharePoint and OneDrive).

– Name your auto-labelling rules appropriately (e.g. SPO – Contains PROTECTED property)

– Enter your conditions as a long string with property and value separated via a colon and multiple entries separated with a comma. For example:

ClassificationContentMarkingHeaderText:PROTECTED,ClassificationContentMarkingFooterText:PROTECTED,Objective-Classification:PROTECTED,PMDisplay:PROTECTED,TitusSEC:PROTECTED

Note that the properties that you are referencing are the Managed Property rather than the document property. This will be relevant if your managed property ended up having a different name due to character restrictions.
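If you have several labels and several managed properties, typing these strings by hand gets error-prone. Here is a small Python sketch that assembles the string in the format described above (property and value separated by a colon, entries separated by commas). The function name is my own, and the managed property names are the ones from my example, so substitute yours.

```python
def build_condition(managed_properties: list, value: str) -> str:
    """Join managed property names and a value into a property:value,... string."""
    return ",".join(f"{name}:{value}" for name in managed_properties)


MANAGED_PROPERTIES = [
    "ClassificationContentMarkingHeaderText",
    "ClassificationContentMarkingFooterText",
    "Objective-Classification",
    "PMDisplay",
    "TitusSEC",
]

print(build_condition(MANAGED_PROPERTIES, "PROTECTED"))
```

Running it once per label value gives you one string per auto-labelling policy.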

After pasting your string into the UI, the resultant rule should look something like this:

Figure 5: Screenshot of the Resultant rule from the above steps.

When done, you can either leave your policy in simulation mode or save it and then turn it on from the auto-labelling policies screen. Just be aware of any potential impacts, such as accidentally locking users out by automatically deploying a label with encryption configuration. You can reduce any potential impact by targeting your auto-labelling policy at a site or set of sites initially and then expanding its scope after testing.

Step 5 – Test

Testing your configuration will be as easy as uploading or creating a set of files with the relevant document properties in place. Once uploaded, you’ll need to give SharePoint some time to index the items and then the auto-labelling policy some time to apply sensitivity labels to them.

To confirm label application, you can head to the document library where your test files are located and enable the sensitivity column. Files that have been auto-labelled will have their label listed:

Figure 6: Classification Properties

You could also check for auto-labelling activity in Purview via Activity explorer:

Figure 7: Auto-labelling activity in Purview via Activity explorer.

Step 6 – Expand into DLP

If you’ve spent the time setting up managed properties, then you really should consider capitalizing on them in your DLP configurations. DLP policy conditions can be configured in the same manner that we configured auto-labelling in Step 4 above. The document property also gives us an anchor for DLP conditions that is independent of an item’s sensitivity label.

You may wish to consider the following:

DLP policies blocking external sharing of items with certain properties applied. This might be handy for situations where auto-labelling hasn’t yet labelled an item.

DLP policies blocking the external sharing of items where the applied sensitivity label doesn’t match the applied document property. This could provide an indication of risky label downgrade.

You could extend such policies into Insider Risk Management (IRM) by creating IRM policies that are aligned with the above DLP policies. This will allow for document properties to be considered in user risk calculation, which can inform controls like Adaptive Protection.
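The two DLP ideas above boil down to comparing an item’s label against its stamped properties. The Python sketch below illustrates that comparison only. It is not how DLP evaluates rules, the function name is hypothetical, and it assumes that your label names match the property values, which you should confirm for your own environment.

```python
from typing import Optional


def assess_item(applied_label: Optional[str], properties: dict, watched: list) -> str:
    """Classify an item as 'ok', 'unlabelled' or 'mismatch' from its label and properties."""
    stamped = {
        properties[name].strip().upper()
        for name in watched
        if properties.get(name)
    }
    if not stamped:
        return "ok"  # no classification properties to compare against
    if applied_label is None:
        return "unlabelled"  # property present, but no label has been applied yet
    return "ok" if stamped == {applied_label.strip().upper()} else "mismatch"
```

An ‘unlabelled’ result corresponds to the first policy idea (a property without a label), and a ‘mismatch’ result corresponds to the second (a possible label downgrade).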

Here’s an example of a policy from the DLP rule summary screen that shows conditions of item contains a label or one of our configured document properties:

Figure 8: Example of a policy from the DLP rule summary screen

Thanks for reading and I hope this article has been of use. If you have any questions or feedback, please feel free to reach out.
