Managing Azure OpenAI costs with the FinOps toolkit and FOCUS: Turning tokens into unit economics

Bringing AI to the edge: Hackathon Windows ML

May 21, 2025

[Launched] Generally Available: Azure Monitor Log Analytics UI query limit increased to 100K records

May 21, 2025

Published by azurefeeds on May 21, 2025

Introduction

As organizations rapidly adopt generative AI, Azure OpenAI usage is growing—and so are the complexities of managing its costs. Unlike traditional cloud services billed per compute hour or storage GB, Azure OpenAI charges based on token usage.

For FinOps practitioners, this introduces a new frontier: understanding AI unit economics and managing costs where the consumed unit is a token.

This article explains how to leverage the Microsoft FinOps toolkit and the FinOps Open Cost and Usage Specification (FOCUS) to gain visibility, allocate costs, and calculate unit economics for Azure OpenAI workloads.

Why Azure OpenAI cost management is different

AI services break many traditional cost management assumptions:

Billed by token usage (input + output tokens).

Model choices matter (e.g., GPT-3.5 vs. GPT-4 Turbo vs. GPT-4o).

Prompt engineering impacts cost (longer context = more tokens).

Bursty usage patterns complicate forecasting.

Without proper visibility and unit cost tracking, it’s difficult to optimize spend or align costs to business value.

Step 1: Get visibility with the FinOps toolkit

The Microsoft FinOps toolkit provides pre-built modules and patterns for analyzing Azure cost data.

Key tools include:

Microsoft Cost Management exports
Export daily usage and cost data in a FOCUS-aligned format.

FinOps hubs
Infrastructure-as-Code solution to ingest, transform, and serve cost data.

Power BI templates
Pre-built reports conformed to FOCUS for easy analysis.

Pro tip:
Start by connecting your Microsoft Cost Management exports to a FinOps hub. Then, use the toolkit’s Power BI FOCUS templates to begin reporting.

Learn more about the FinOps toolkit

Step 2: Normalize data with FOCUS

The FinOps Open Cost and Usage Specification (FOCUS) standardizes billing data across providers—including Azure OpenAI.

FOCUS Column	Purpose	Azure Cost Management Field
ServiceName	Cloud service (e.g., Azure OpenAI Service)	ServiceName
ConsumedQuantity	Number of tokens consumed	Quantity
PricingUnit	Unit type, should align to “tokens”	DistinctUnits
BilledCost	Actual cost billed	CostInBillingCurrency
ChargeCategory	Identifies consumption vs. reservation	ChargeType
ResourceId	Links to specific deployments or apps	ResourceId
Tags	Maps usage to teams, projects, or environments	Tags
UsageType / Usage Details	Further SKU-level detail	Sku Meter Subcategory, Sku Meter Name

Why it matters:
Azure’s native billing schema can vary across services and time. FOCUS ensures consistency and enables cross-cloud comparisons.

Tip: If you use custom deployment IDs or user metadata, apply them as tags to improve allocation and unit economics.

Review the FOCUS specification

Step 3: Calculate unit economics

Unit cost per token = BilledCost ÷ ConsumedQuantity

Real-world example: Calculating unit cost in Power BI

A recent Power BI report breaks down Azure OpenAI usage by:

SKU Meter Category → e.g., Azure OpenAI

SKU Meter Subcategory → e.g., gpt 4o 0513 Input global Tokens

SKU Meter Name → detailed SKU info (input/output, model version, etc.)

GPT Model	Usage Type	Effective Cost
gpt 4o 0513 Input global Tokens	Input	$292.77
gpt 4o 0513 Output global Tokens	Output	$23.40

Unit Cost Formula:
Unit Cost = EffectiveCost ÷ ConsumedQuantity

Power BI Measure Example:
Unit Cost = SUM(EffectiveCost) / SUM(ConsumedQuantity)

Pro tip:
Break out input and output token costs by model version to:

Track which workloads are driving spend.

Benchmark cost per token across GPT models.

Attribute costs back to teams or product features using Tags or ResourceId.

Power BI tip: Building a GPT cost breakdown matrix

To easily calculate token unit costs by GPT model and usage type, build a Matrix visual in Power BI using this hierarchy:

Rows:

SKU Meter Category

SKU Meter Subcategory

SKU Meter Name

Values:

EffectiveCost (sum)

ConsumedQuantity (sum)

Unit Cost (calculated measure)

Unit Cost = SUM(‘Costs’[EffectiveCost]) / SUM(‘Costs’[ConsumedQuantity])

Hierarchy Example:

Azure OpenAI

├── GPT 4o Input global Tokens

├── GPT 4o Output global Tokens

├── GPT 4.5 Input global Tokens

└── etc.

Power BI Matrix visual showing Azure OpenAI token usage and costs by SKU Meter Category, Subcategory, and Name. This breakdown enables calculation of unit cost per token across GPT models and usage types, supporting FinOps allocation and unit economics analysis.

What you can see at the token level

Metric	Description	Data Source
Token Volume	Total tokens consumed	Consumed Quantity
Effective Cost	Actual billed cost	BilledCost / Cost
Unit Cost per Token	Cost divided by token quantity	Effective Unit Price
SKU Category & Subcategory	Model, version, and token type (input/output)	Sku Meter Category, Subcategory, Meter Name
Resource Group / Business Unit	Logical or organizational grouping	Resource Group, Business Unit
Application	Application or workload responsible for usage	Application (tag)

This visibility allows teams to:

Benchmark cost efficiency across GPT models.

Track token costs over time.

Allocate AI costs to business units or features.

Detect usage anomalies and optimize workload design.

Tip: Apply consistent tagging (Cost Center, Application, Environment) to Azure OpenAI resources to enhance allocation and unit economics reporting.

How the FinOps Foundation’s AI working group informs this approach

The FinOps for AI overview, developed by the FinOps Foundation’s AI working group, highlights unique challenges in managing AI-related cloud costs, including:

Complex cost drivers (tokens, models, compute hours, data transfer).

Cross-functional collaboration between Finance, Engineering, and ML Ops teams.

The importance of tracking AI unit economics to connect spend with value.

By combining the FinOps toolkit, FOCUS-conformed data, and Power BI reporting, practitioners can implement many of the AI Working Group’s recommendations:

Establish token-level unit cost metrics.

Allocate costs to teams, models, and AI features.

Detect cost anomalies specific to AI usage patterns.

Improve forecasting accuracy despite AI workload variability.

Tip: Applying consistent tagging to AI workloads (model version, environment, business unit, and experiment ID) significantly improves cost allocation and reporting maturity.

Step 4: Allocate and Report Costs

With FOCUS + FinOps toolkit:

Allocate costs to teams, projects, or business units using Tags, ResourceId, or custom dimensions.

Showback/Chargeback AI usage costs to stakeholders.

Detect anomalies using the Toolkit’s patterns or integrate with Azure Monitor.

Tagging tip:
Add metadata to Azure OpenAI deployments for easier allocation and unit cost reporting. Example:

tags:

CostCenter: AI-Research

Environment: Production

Feature: Chatbot

Step 5: Iterate Using FinOps Best Practices

FinOps capability	Relevance
Reporting & analytics	Visualize token costs and trends
Allocation	Assign costs to teams or workloads
Unit economics	Track cost per token or business output
Forecasting	Predict future AI costs
Anomaly management	Identify unexpected usage spikes

Start small (Crawl), expand as you mature (Walk → Run).

Learn about the FinOps Framework

Next steps

Ready to take control of your Azure OpenAI costs?

Deploy the Microsoft FinOps toolkit
Start ingesting and analyzing your Azure billing data.
Get started

Adopt FOCUS
Normalize your cost data for clarity and cross-cloud consistency.
Explore FOCUS

Calculate AI unit economics
Track token consumption and unit costs using Power BI.

Customize Power BI reports
Extend toolkit templates to include token-based unit economics.

Join the conversation
Share insights or questions with the FinOps community on TechCommunity or in the FinOps Foundation Slack.

Advance Your Skills
Consider the FinOps Certified FOCUS Analyst certification.

Appendix: FOCUS column glossary

ConsumedQuantity: The number of tokens or units consumed for a given SKU. This is the key measure of usage.

ConsumedUnit: The type of unit being consumed, such as ‘tokens’, ‘GB’, or ‘vCPU hours’. Often appears as ‘Units’ in Azure exports for OpenAI workloads.

PricingUnit: The unit of measure used for pricing. Should match ‘ConsumedUnit’, e.g., ‘tokens’.

EffectiveCost: Final cost after amortization of reservations, discounts, and prepaid credits. Often derived from billing data.

BilledCost: The invoiced charge before applying commitment discounts or amortization.

PricingQuantity: The volume of usage after applying pricing rules such as tiered or block pricing. Used to calculate cost when multiplied by unit price.

Bringing AI to the edge: Hackathon Windows ML

[Launched] Generally Available: Azure Monitor Log Analytics UI query limit increased to 100K records

Bringing AI to the edge: Hackathon Windows ML

[Launched] Generally Available: Azure Monitor Log Analytics UI query limit increased to 100K records

Introduction

Why Azure OpenAI cost management is different

Step 1: Get visibility with the FinOps toolkit

Step 2: Normalize data with FOCUS

Step 3: Calculate unit economics

Real-world example: Calculating unit cost in Power BI

Power BI tip: Building a GPT cost breakdown matrix

What you can see at the token level

How the FinOps Foundation’s AI working group informs this approach

Step 4: Allocate and Report Costs

Next steps

Further Reading

Appendix: FOCUS column glossary

Related posts

Smart AI Integration with the Model Context Protocol (MCP) … part 4

Smart AI integration with the Model Context Protocol (MCP) … part 3

Smart AI integration with the Model Context Protocol (MCP) … part 2