Deploying ML Models to Azure Web App with Azure Blob Storage
💡 Why This Approach?
Traditional deployments often include models inside the app, leading to:
- Large container sizes
- Long build times
- Slow cold starts
- Painful updates when models change
With Azure Blob Storage, you can offload the model and only fetch it at runtime — reducing size, improving flexibility, and enabling easier updates.
What You Will Need
- An ML model (model.pkl, model.pt, etc.)
- An Azure Blob Storage account
- A Python web app (FastAPI, Flask, or Streamlit)
- Azure Web App (App Service for Python)
- Azure Python SDK: azure-storage-blob
Step 1: Save and Upload Your Model to Blob Storage
First, save your trained model locally:
# PyTorch example
import torch

# Save the full model object so it can be loaded directly at runtime
# (alternatively, save model.state_dict() and rebuild the architecture before loading)
torch.save(model, "model.pt")
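If your model is a scikit-learn estimator rather than a PyTorch module, the same workflow applies with a pickle file instead (a sketch, assuming a fitted estimator named model):

# scikit-learn / generic Python object example
import pickle

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)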
Then, upload it to Azure Blob Storage:
from azure.storage.blob import BlobServiceClient

conn_str = "your_connection_string"
blob_service = BlobServiceClient.from_connection_string(conn_str)

# Assumes a container named "models" already exists in the storage account
container = blob_service.get_container_client("models")

with open("model.pt", "rb") as f:
    container.upload_blob(name="model.pt", data=f, overwrite=True)
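As an optional sanity check, you can list the blobs in the container after uploading (this reuses the container client from the snippet above):

# Confirm the upload by listing blobs in the "models" container
for blob in container.list_blobs():
    print(blob.name, blob.size)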
Step 2: Build a Lightweight Inference App
Create a simple FastAPI app that loads the model from Blob Storage on startup:
from fastapi import FastAPI
from azure.storage.blob import BlobClient
import io, torch

app = FastAPI()

@app.on_event("startup")
def load_model():
    print("Loading model from Azure Blob Storage…")
    blob = BlobClient.from_connection_string(
        "your_connection_string",
        container_name="models",
        blob_name="model.pt",
    )
    stream = io.BytesIO(blob.download_blob().readall())
    global model
    # weights_only=False lets torch unpickle the full model object saved in Step 1
    model = torch.load(stream, map_location="cpu", weights_only=False)
    model.eval()

@app.get("/")
def read_root():
    return {"message": "Model loaded and ready!"}

@app.post("/predict")
def predict(data: dict):
    # Example input, dummy output
    return {"result": "prediction goes here"}
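Before deploying, you can sanity-check the endpoints locally (start the app with uvicorn first). A minimal sketch using the requests library, which is only needed for this local test and is not part of requirements.txt:

import requests

# Health check against the running app
print(requests.get("http://localhost:8000/").json())

# Call the placeholder /predict endpoint with a dummy payload
print(requests.post("http://localhost:8000/predict", json={"feature": 1.0}).json())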
Step 3: Push Your App to GitHub and Deploy to Azure (Model Loads at Runtime!)
Now that your ML model is safely uploaded to Azure Blob Storage (Step 1), it's time to push your inference app (without the model) to GitHub and deploy it via Azure Web App.
The trick? Your app will dynamically fetch the model from Blob Storage at runtime — keeping your repo light and deployment fast!
3.1 Push Your App Code (Without the Model) to GitHub
Your project structure should look like this:
azure-ml-deploy/
│
├── main.py # Your FastAPI/Flask app
├── requirements.txt # Python dependencies
├── README.md # Optional documentation
🚫 Do NOT include model.pt or any large model files in your GitHub repo!
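One simple way to enforce this is a .gitignore entry for the model artifacts, for example:

model.pt
*.pkl
__pycache__/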
3.2 main.py: Load the Model from Azure Blob Storage at Runtime
Here's main.py, which pulls the model from Blob Storage during startup and reads the connection string from an environment variable (configured in Step 3.4):
from fastapi import FastAPI
from azure.storage.blob import BlobClient
import torch
import io
import os

app = FastAPI()

@app.on_event("startup")
def load_model():
    print("Loading model from Azure Blob Storage…")
    blob = BlobClient.from_connection_string(
        conn_str=os.environ["AZURE_STORAGE_CONN_STRING"],  # set in Azure Portal > Configuration (Step 3.4)
        container_name="models",
        blob_name="model.pt"
    )
    stream = io.BytesIO(blob.download_blob().readall())
    global model
    # weights_only=False lets torch unpickle the full model object saved in Step 1
    model = torch.load(stream, map_location="cpu", weights_only=False)
    model.eval()

@app.get("/")
def home():
    return {"status": "Model loaded from Azure Blob!"}

@app.post("/predict")
def predict(data: dict):
    # Replace this with your own prediction logic
    return {"prediction": "sample output"}
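The /predict handler above is just a placeholder. As a rough illustration of what real inference could look like, here is a sketch that assumes the model accepts a flat list of floats sent under a "features" key (both the key name and the input shape are assumptions, not part of the original post):

import torch

@app.post("/predict")
def predict(data: dict):
    # Convert the incoming JSON payload into a batch of one sample (assumed shape)
    features = torch.tensor(data["features"], dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():  # inference only, no gradients needed
        output = model(features)
    # Return plain Python numbers so FastAPI can serialize them as JSON
    return {"prediction": output.squeeze(0).tolist()}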
3.3 requirements.txt
fastapi
uvicorn
torch
azure-storage-blob
3.4 Deploy to Azure Web App Using GitHub Repo
- Go to Azure Portal
- Create a new Web App
- Runtime Stack: Python 3.10
- OS: Linux
- Under Deployment > GitHub, connect your GitHub repo
- In Configuration > Application Settings, add:
  - AZURE_STORAGE_CONN_STRING = <your Blob Storage connection string>
This way, your app doesn’t store any secrets in code.
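For a quick local test of the same code path, you can set the variable in a small launcher script before starting the server (local experimentation only; never hard-code a real connection string in anything you commit):

import os
import uvicorn

# Local testing only: on Azure, this value comes from Application Settings
os.environ["AZURE_STORAGE_CONN_STRING"] = "your_connection_string"

# Start the FastAPI app defined in main.py on http://localhost:8000
uvicorn.run("main:app", host="0.0.0.0", port=8000)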
Benefits of This Setup
- Clean separation of model and code
- Smaller, faster deployable packages
- Easy model updates (just replace the blob!)
- No need for GPUs or complex infrastructure
- Ideal for web APIs, dashboards, and even chatbots.
Conclusion
In this blog, you learned how to separate your ML model storage from deployment, making your applications faster, cleaner, and more scalable using Microsoft Azure technologies.
By pushing a lightweight API to GitHub and having your application download the model from Azure Blob Storage at runtime, you:
- Avoid bloated GitHub repos
- Accelerate deployments via Azure Web App
- Keep credentials and models secure with Azure App Settings
- Enable dynamic updates to your model without redeploying your app
This architecture is a good fit for real-world ML systems, whether you're building quick prototypes or production-grade APIs.
💡 Final Thought
Decouple. Deploy. Deliver.
With the power of Azure Blob Storage + Azure App Service, you can scale smarter — not heavier.
Happy Building! ✨
If you found this blog helpful or you're working on something similar, I'd love to connect and exchange ideas. Join the Azure AI Foundry community or reach out to me on LinkedIn: Mohamed Faraazman Bin Farooq S | LinkedIn