Deploying ML Models to Azure Web App with Azure Blob Storage
💡 Why This Approach?
Traditional deployments often include models inside the app, leading to:
- Large container sizes
- Long build times
- Slow cold starts
- Painful updates when models change
With Azure Blob Storage, you can offload the model and only fetch it at runtime — reducing size, improving flexibility, and enabling easier updates.
What You Will Need
- An ML model (model.pkl, model.pt, etc.)
- An Azure Blob Storage account
- A Python web app (FastAPI, Flask, or Streamlit)
- Azure Web App (App Service for Python)
- Azure Python SDK: azure-storage-blob
Step 1: Save and Upload Your Model to Blob Storage
First, save your trained model locally:
# PyTorch example
import torch

# Save the full model object so it can be loaded directly at runtime
# (alternatively, save model.state_dict() and rebuild the architecture before loading)
torch.save(model, "model.pt")
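If your model is a scikit-learn estimator rather than a PyTorch module, the same workflow applies with a pickle file instead (a sketch, assuming a fitted estimator named model):

# scikit-learn / generic Python object example
import pickle

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)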
Then, upload it to Azure Blob Storage:
from azure.storage.blob import BlobServiceClient

conn_str = "your_connection_string"
blob_service = BlobServiceClient.from_connection_string(conn_str)

# Assumes a container named "models" already exists in the storage account
container = blob_service.get_container_client("models")

with open("model.pt", "rb") as f:
    container.upload_blob(name="model.pt", data=f, overwrite=True)
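As an optional sanity check, you can list the blobs in the container after uploading (this reuses the container client from the snippet above):

# Confirm the upload by listing blobs in the "models" container
for blob in container.list_blobs():
    print(blob.name, blob.size)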
Step 2: Build a Lightweight Inference App
Create a simple FastAPI app that loads the model from Blob Storage on startup:
from fastapi import FastAPI
from azure.storage.blob import BlobClient
import io, torch

app = FastAPI()

@app.on_event("startup")
def load_model():
    print("Loading model from Azure Blob Storage…")
    blob = BlobClient.from_connection_string(
        "your_connection_string",
        container_name="models",
        blob_name="model.pt",
    )
    stream = io.BytesIO(blob.download_blob().readall())
    global model
    # weights_only=False lets torch unpickle the full model object saved in Step 1
    model = torch.load(stream, map_location="cpu", weights_only=False)
    model.eval()

@app.get("/")
def read_root():
    return {"message": "Model loaded and ready!"}

@app.post("/predict")
def predict(data: dict):
    # Example input, dummy output
    return {"result": "prediction goes here"}
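Before deploying, you can sanity-check the endpoints locally (start the app with uvicorn first). A minimal sketch using the requests library, which is only needed for this local test and is not part of requirements.txt:

import requests

# Health check against the running app
print(requests.get("http://localhost:8000/").json())

# Call the placeholder /predict endpoint with a dummy payload
print(requests.post("http://localhost:8000/predict", json={"feature": 1.0}).json())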
Step 3: Push Your App to GitHub and Deploy to Azure (Model Loads at Runtime!)
Now that your ML model is safely uploaded to Azure Blob Storage (Step 1), it's time to push your inference app (without the model) to GitHub and deploy it via Azure Web App.
The trick? Your app will dynamically fetch the model from Blob Storage at runtime — keeping your repo light and deployment fast!
3.1 Push Your App Code (Without the Model) to GitHub
Your project structure should look like this:
azure-ml-deploy/
│
├── main.py # Your FastAPI/Flask app
├── requirements.txt # Python dependencies
├── README.md # Optional documentation
🚫 Do NOT include model.pt or any large model files in your GitHub repo!
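One simple way to enforce this is a .gitignore entry for the model artifacts, for example:

model.pt
*.pkl
__pycache__/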
3.2 main.py: Load the Model from Azure Blob Storage at Runtime
Here's main.py, which pulls the model from Blob Storage during startup and reads the connection string from an environment variable (configured in Step 3.4):
from fastapi import FastAPI
from azure.storage.blob import BlobClient
import torch
import io
import os

app = FastAPI()

@app.on_event("startup")
def load_model():
    print("Loading model from Azure Blob Storage…")
    blob = BlobClient.from_connection_string(
        conn_str=os.environ["AZURE_STORAGE_CONN_STRING"],  # set in Azure Portal > Configuration (Step 3.4)
        container_name="models",
        blob_name="model.pt"
    )
    stream = io.BytesIO(blob.download_blob().readall())
    global model
    # weights_only=False lets torch unpickle the full model object saved in Step 1
    model = torch.load(stream, map_location="cpu", weights_only=False)
    model.eval()

@app.get("/")
def home():
    return {"status": "Model loaded from Azure Blob!"}

@app.post("/predict")
def predict(data: dict):
    # Replace this with your own prediction logic
    return {"prediction": "sample output"}
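The /predict handler above is just a placeholder. As a rough illustration of what real inference could look like, here is a sketch that assumes the model accepts a flat list of floats sent under a "features" key (both the key name and the input shape are assumptions, not part of the original post):

import torch

@app.post("/predict")
def predict(data: dict):
    # Convert the incoming JSON payload into a batch of one sample (assumed shape)
    features = torch.tensor(data["features"], dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():  # inference only, no gradients needed
        output = model(features)
    # Return plain Python numbers so FastAPI can serialize them as JSON
    return {"prediction": output.squeeze(0).tolist()}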
3.3 requirements.txt
fastapi
uvicorn
torch
azure-storage-blob
3.4 Deploy to Azure Web App Using GitHub Repo
- Go to Azure Portal
- Create a new Web App
- Runtime Stack: Python 3.10
- OS: Linux
- Under Deployment > GitHub, connect your GitHub repo
- In Configuration > Application Settings, add:
  - AZURE_STORAGE_CONN_STRING = <your Blob Storage connection string>
This way, your app doesn’t store any secrets in code.
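For a quick local test of the same code path, you can set the variable in a small launcher script before starting the server (local experimentation only; never hard-code a real connection string in anything you commit):

import os
import uvicorn

# Local testing only: on Azure, this value comes from Application Settings
os.environ["AZURE_STORAGE_CONN_STRING"] = "your_connection_string"

# Start the FastAPI app defined in main.py on http://localhost:8000
uvicorn.run("main:app", host="0.0.0.0", port=8000)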
Benefits of This Setup
- Clean separation of model and code
- Smaller, faster deployable packages
- Easy model updates (just replace the blob!)
- No need for GPUs or complex infrastructure
- Ideal for web APIs, dashboards, and even chatbots.
Conclusion
In this blog, you learned how to separate your ML model storage from deployment, making your applications faster, cleaner, and more scalable using Microsoft Azure technologies.
By pushing a lightweight API to GitHub and having your application download the model from Azure Blob Storage at runtime, you:
- Avoid bloated GitHub repos
- Accelerate deployments via Azure Web App
- Keep credentials and models secure with Azure App Settings
- Enable dynamic updates to your model without redeploying your app
This architecture is a good fit for real-world ML systems, whether you're building quick prototypes or production-grade APIs.
💡 Final Thought
Decouple. Deploy. Deliver.
With the power of Azure Blob Storage + Azure App Service, you can scale smarter — not heavier.
Happy Building! ✨
If you found this blog helpful or you're working on something similar, I'd love to connect and exchange ideas. Join the Azure AI Foundry community or reach out to me on LinkedIn: Mohamed Faraazman Bin Farooq S | LinkedIn