Azure Update – 16th May 2025
May 17, 2025

Cost tracking is a critical aspect of cloud operations—it helps you understand not just how much you’re spending, but also where that spend is going and which teams are responsible. When running a Machine Learning capability with multiple consumers across your organisation, it becomes especially challenging to attribute compute costs to the teams building and deploying models. With the extensive compute use in Machine Learning, these costs can add up quickly. In this article, we’ll explore how tools like Kubecost can help bring visibility and accountability to ML workloads.
Tracking costs in Azure can mostly be done through Azure Cost Management; however, when we are running these ML models as endpoints and deployments in a Kubernetes cluster, things get a bit trickier. Azure Cost Management will tell you the cost of the AKS cluster and nodes that are running, and if all you need is the total cost, then that is fine. However, as we look at implementing practices like Platform Engineering, there may be a common platform and set of Kubernetes clusters shared across multiple teams and business units. This brings a need to allocate costs to those specific teams, and for Azure ML that means allocating cost to the deployments and endpoints running within the Kubernetes cluster.
What we need is a way to split the resources consumed in the Kubernetes cluster by endpoint and allocate a cost to the portion of those resources that are in use. For many workloads this cost could be allocated per-namespace, however Azure ML has additional complexity as it deploys its workloads into a single namespace per attached cluster. This means all Endpoints and Deployments end up in the same namespace. So we need a way to be more granular about these costs.
To address the challenge of attributing Kubernetes compute costs to specific Azure ML workloads, we need a tool that can provide visibility into how resources are being used within the cluster. One effective way to do this is by using Kubecost, a monitoring application that runs inside your AKS clusters and provides real-time cost visibility. With Kubecost, we can generate detailed cost reports that help us understand the resource consumption of specific Azure ML endpoints and deployments.
The Cost Management addon for AKS provides similar data, based on Opencost, and is integrated into the Azure Portal. If you are looking for costs per namespace, then this is the recommended solution, as it is simpler to install and displays the data directly in the portal.
For our use case, we need to be more granular than the namespace level, which is why we are deploying our own instance of Kubecost/Opencost.
Kubecost and Opencost
Kubecost and Opencost are two similar solutions that we can use to collect and monitor cost data for Kubernetes clusters.
- Kubecost is an open-core solution that’s quick to deploy and comes with a user-friendly interface. It offers a free tier with core functionality and an enterprise version with additional features.
- Opencost is a fully open-source CNCF project based on Kubecost’s core. It provides similar capabilities but typically requires more work to set up and configure.
For the purposes of this article, we will utilize Kubecost, as it is quicker to get up and running. If you would prefer to use Opencost, you can find instructions on deploying this into AKS here. You should be able to achieve the same reporting in Opencost.
Deploying Kubecost
There are two steps we need to take to get Kubecost up and running.
Install Kubecost in AKS
First, we need to deploy the software into the cluster using Helm. If you already have Helm installed, then this is a relatively straightforward process:
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
helm upgrade --install kubecost kubecost/cost-analyzer --namespace kubecost --create-namespace
Once this completes, Kubecost should be running in your cluster, and you should be able to connect to it to test it out. Currently the application isn’t exposed to the outside world, so we will need to use port forwarding:
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
You should now be able to go to http://localhost:9090 in your browser and see the Kubecost homepage.
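If the homepage doesn’t load, it is worth confirming the Kubecost pods started correctly before digging further. A minimal check, assuming the kubecost namespace used in the Helm command above:

# List the Kubecost pods; they should all reach the Running state
kubectl get pods -n kubecost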
Integrate Kubecost with Azure Pricing
In its current state, Kubecost will collect data from the resources in the cluster and will allocate a cost to them. However, this cost is not based on the actual cost of the Azure resources, as Kubecost has no data on Azure pricing to use. We can fix this in one of two ways:
- Connect Kubecost to the Azure Rate Card so that it can pull prices from Azure
- Export our actual cost data from Azure Cost Management to a storage account and have Kubecost pull in that data.
The first option requires providing Kubecost with a service principal it can use to query the Azure API for the pricing data (a sketch of creating one is shown after the guide links below). This will purely provide the rate card costs for the AKS resources. The second option will pull in the actual costs incurred by our Azure subscription; it takes a bit more work to set up, but it does mean that Kubecost has data on non-Kubernetes Azure resources as well, and we can use Kubecost to allocate those costs too, if you wish.
- To use the Azure Rate Card, follow the guide here.
- To use the cost export option, follow the guide here.
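For the rate card option, the service principal can be created with the Azure CLI. This is just a sketch: the name is hypothetical, the subscription ID is a placeholder, and the exact role and permissions Kubecost expects are covered in the guide linked above.

# Create a service principal with read access to cost data on the subscription
# (hypothetical name; substitute your own subscription ID)
az ad sp create-for-rbac \
  --name "kubecost-ratecard" \
  --role "Cost Management Reader" \
  --scopes "/subscriptions/<subscription-id>"

The appId, password and tenant values this returns are what you then supply to Kubecost as part of its Azure configuration.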
Once you complete this step, you should see that Kubecost is using data provided from Azure and can report accurate costs.
Reporting on Azure ML Resources
Now that we have Kubecost set up, you should be able to see that cost data is available and that there are multiple ways to slice and report on it. Let’s have a look at how we can get a view based on Azure ML resources.
When it comes to cost for Azure ML resources inside Kubernetes, we are going to focus on the inferencing endpoints that can be running long term inside your cluster. These consist of two components:
- Endpoints, which define the entry point for access to your model
- Deployments, which are specific versions of a model, along with the environment and scripts, hosted under an endpoint
An endpoint can host multiple deployments, with traffic distributed on a percentage basis between the deployments. When it comes to cost management, most of the time all deployments within an endpoint will be allocated to the same team, so aggregating the costs at the Endpoint level is enough. If you do want to aggregate costs at the deployment level, that is possible.
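As an illustration of that traffic split, the Azure ML CLI v2 lets you set the percentage of traffic each deployment receives. This is a minimal sketch, assuming the az ml extension is installed and using hypothetical endpoint, workspace and deployment names:

# Send 90% of traffic to the "blue" deployment and 10% to "green"
az ml online-endpoint update \
  --name my-endpoint \
  --resource-group my-rg \
  --workspace-name my-workspace \
  --traffic "blue=90 green=10"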
Pod Labels
Kubecost allows you to create reports that aggregate data by various metrics. For our solution, we will be looking at labels. We need to identify which Endpoint, and possibly which Deployment, a pod belongs to. Fortunately, when Azure ML deploys the pod, it adds multiple labels that give us this information (a quick way to inspect these labels on a running pod is shown after the list below).
For our scenario we are interested in three labels:
- ml.azure.com/endpoint-name gives us the name of the endpoint the pod is associated with
- ml.azure.com/deployment-name gives us the name of the deployment, if we want to be more granular
- isazuremlapp gives us a simple Boolean to filter out non-ML pods
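To inspect these labels yourself, you can ask kubectl to print them as columns for each pod. This sketch assumes the Azure ML extension’s default azureml namespace; substitute whichever namespace your cluster was attached with:

# Show each pod with its endpoint, deployment and isazuremlapp labels as columns
kubectl get pods -n azureml \
  -L ml.azure.com/endpoint-name \
  -L ml.azure.com/deployment-name \
  -L isazuremlapp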
Create Cost Reports for Azure ML Workloads
Open up Kubecost in the browser and go to the reports tab on the left. We’re going to create a report that will allow us to break down costs by endpoint. Click the Create Report button and then select allocations to open a new report with default settings.
The first thing we need to do is aggregate by the label we are interested in. Click the aggregate button, which should currently be set to namespace.
At the bottom of the window that opens is a text box labelled Find Label. In here, enter the label you want to aggregate by; this will be either ml.azure.com/endpoint-name or ml.azure.com/deployment-name. As you enter the value, the label should appear in the list; click on it to select it.
You may find that Kubecost adjusts the label names that are displayed, so that ml.azure.com/endpoint-name becomes ml_azure_com_endpoint_name. Select the appropriate option for your setup.
The report should now show the workloads aggregated by the value of this label. You will, however, find a couple of extra entries for “Unallocated Workloads” and “__idle__”, so our next step is to remove these.
The “__idle__” workload is a bucket for any cluster resources that are not in use at all. These resources are spare capacity, and offer opportunities for cost optimization, but aren’t useful for our report. You can remove them by clicking the Edit button at the top of the report and changing the option for Idle Costs. From here you can also make some other changes to how the metrics are displayed.
The other entry is “Unallocated workloads”; these are workloads that don’t have the label we are looking for, and so are non-ML workloads. We are not interested in these and will remove them. Click on the “Filter” button at the top and, in the drop-down, select Custom Label. In the first text box enter “isazuremlapp” and in the second enter “true”. This will filter out any workloads that do not have the “isazuremlapp” label set to true, and so are not Azure ML workloads.
What we should now be left with is a report that shows just our ML workloads by Endpoint. The table provides costs broken down by multiple different attributes.
Click Save at the top bar to save the report.
If you want to break this down by Deployment rather than Endpoint, you would just change the label used in the aggregation to ml.azure.com/deployment-name (or ml_azure_com_deployment_name).
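If you want the same breakdown outside of the UI, for example to feed it into your own chargeback reporting, Kubecost also exposes its allocation data over an HTTP API. The sketch below assumes the port-forward from earlier is still running and uses the Allocation API’s window and aggregate parameters; check the API documentation for your Kubecost version for the exact filter syntax if you also want to restrict results to pods with isazuremlapp set to true.

# Query the last 7 days of cost, aggregated by the Azure ML endpoint label
curl -G "http://localhost:9090/model/allocation" \
  --data-urlencode "window=7d" \
  --data-urlencode "aggregate=label:ml.azure.com/endpoint-name"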
Next Steps
Now that we have cost data for our Kubernetes ML workloads, there are a few additional steps you could take:
- Make your Kubecost dashboard accessible outside of your cluster, without port forwarding and with authentication. See here for details on how this can be achieved.
- Import cloud provider costs and allocate cost for resources outside of your cluster to your workloads.
Conclusion
If you need to break down your usage and cost of Azure Machine Learning, and need to include Kubernetes resources in that reporting, then tools like Kubecost and Opencost can help gather this information from Kubernetes and join it together with your Azure cost information to provide real-time cost analysis. We can use the labels provided by Azure ML to aggregate this data by Endpoints and Deployments, giving each team a clear view of how much cost they are generating.