May 20, 2025

Why Autoscaling Matters
Autoscaling ensures that your application can handle varying levels of traffic without manual intervention. It helps with:
- Cost Optimization: Scale down during periods of low demand to save on infrastructure costs.
- Performance: Scale up during peak traffic to maintain responsiveness.
- Reliability: Automatically recover from failures by provisioning new instances as needed.
Understanding the Building Blocks
AKS (Azure Kubernetes Service)
AKS is a managed Kubernetes service that simplifies the deployment, management, and scaling of containerized applications. It supports features like node pools, the cluster autoscaler, and integration with Azure Monitor.
Knative Serving
Knative Serving provides autoscaling capabilities for HTTP-based workloads. It supports:
- Scale-to-zero when there's no traffic
- Rapid scale-up in response to incoming HTTP requests
Knative supports two types of autoscalers:
- KPA (Knative Pod Autoscaler): the default, based on concurrency or requests per second (RPS)
- HPA (Horizontal Pod Autoscaler): optional, based on CPU/memory metrics
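For example, the autoscaler class and scaling metric can be selected per revision through annotations. A minimal sketch (the annotation keys are Knative's; the service name is illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-kpa                  # illustrative name
spec:
  template:
    metadata:
      annotations:
        # KPA is the default class; switch to hpa.autoscaling.knative.dev
        # to scale on CPU/memory instead
        autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
        # KPA metric: "concurrency" (default) or "rps"
        autoscaling.knative.dev/metric: "rps"
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```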
KEDA (Optional)
KEDA (Kubernetes Event-driven Autoscaling) is useful for event-driven workloads (e.g., Azure Service Bus, Kafka). It can complement Knative for hybrid autoscaling scenarios.
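To illustrate, a KEDA ScaledObject can scale a plain Deployment based on an Azure Service Bus queue. A hedged sketch (the resource names and queue are placeholders; the trigger fields follow KEDA's azure-servicebus scaler):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-scaler              # illustrative
spec:
  scaleTargetRef:
    name: orders-processor         # the Deployment to scale
  minReplicaCount: 0               # scale to zero when the queue is empty
  maxReplicaCount: 10
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: orders          # placeholder queue
        messageCount: "20"         # target messages per replica
      authenticationRef:
        name: servicebus-auth      # TriggerAuthentication holding the connection string
```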
Pod vs. Node Autoscaling
Pod Autoscaling (Knative)
Knative handles pod-level autoscaling based on HTTP traffic. It dynamically adjusts the number of pods depending on:
- Request concurrency (e.g., 100 requests per pod)
- Requests per second (RPS)
- CPU usage (if HPA mode is enabled)
Knative can even scale to zero when there's no traffic, making it ideal for bursty or event-driven workloads.
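As a rough sketch of the sizing rule (a simplification that ignores the KPA's panic mode, averaging windows, and min/max bounds): the desired pod count is roughly the observed concurrency divided by the per-pod target, rounded up.

```shell
# Simplified KPA sizing rule: desired pods = ceil(observed concurrency / per-pod target)
observed=500   # concurrent requests currently in flight
target=50      # autoscaling.knative.dev/target per pod
desired=$(( (observed + target - 1) / target ))
echo "desired pods: $desired"
```

With 500 in-flight requests and a target of 50 per pod, the autoscaler would aim for about 10 pods.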
Node Autoscaling (AKS)
AKS uses the Cluster Autoscaler for node-level autoscaling. It ensures that the underlying infrastructure can support the number of pods requested by Knative.
- If there's insufficient capacity for new pods, AKS adds nodes.
- If nodes are underutilized, AKS removes them to reduce cost.
Together, these layers ensure both application responsiveness and infrastructure efficiency.
Real Example: Deployment and Testing
Step 1: Create an AKS Cluster
az group create -l centralus -n myResourceGroup
az aks create --resource-group myResourceGroup --name myAKSCluster --enable-managed-identity --enable-aad --location centralus --node-count 2 --enable-addons monitoring --generate-ssh-keys
Add a new node pool and enable the cluster autoscaler:
az aks nodepool add --resource-group myResourceGroup --cluster-name myAKSCluster --name newpool --node-vm-size Standard_D4s_v3 --enable-cluster-autoscaler --min-count 1 --max-count 5
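If the bounds need adjusting later, the cluster autoscaler limits on an existing node pool can be changed with az aks nodepool update. A sketch, reusing the same resource names:

```shell
# Raise the autoscaler ceiling on the existing pool (names match the cluster above)
az aks nodepool update --resource-group myResourceGroup --cluster-name myAKSCluster --name newpool --update-cluster-autoscaler --min-count 1 --max-count 10
```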
Step 2: Connect to AKS
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing
Verify:
kubectl get nodes
Step 3: Install Knative Serving Core
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.18.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.18.0/serving-core.yaml
Verify:
kubectl get pods -n knative-serving
Ensure all pods are in Running or Completed state.
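Instead of polling manually, a single command can block until the control-plane pods are ready (a sketch; the timeout value is arbitrary):

```shell
# Wait until every knative-serving pod reports Ready, or fail after 3 minutes
kubectl wait --for=condition=Ready pod --all -n knative-serving --timeout=180s
```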
Step 4: Install Istio (Ingress for Knative)
Install Istio components
kubectl apply -f https://github.com/knative/net-istio/releases/download/knative-v1.18.0/istio.yaml
Install Knative-Istio integration
kubectl apply -f https://github.com/knative/net-istio/releases/download/knative-v1.18.0/net-istio.yaml
Verify Istio installation:
kubectl get pods -n istio-system
Expected pods:
- istiod
- istio-ingressgateway
Check the Istio IngressGateway Service (with public IP):
kubectl get svc istio-ingressgateway -n istio-system
Ensure the EXTERNAL-IP field has a public IP (not <pending>), as shown in the screenshot below.
Verify Knative uses Istio:
# Linux / Git Bash
kubectl get configmap config-network -n knative-serving -o yaml | grep ingress-class
# Windows
kubectl get configmap config-network -n knative-serving -o yaml | findstr ingress-class
Expected output: the ingress class should be istio.ingress.networking.knative.dev.
Step 6: Deploy a Sample Microservice with Autoscaling
Create autoscale-service.yaml:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-demo
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "50"
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "10"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
          env:
            - name: TARGET
              value: "Knative Autoscaler"
Apply it:
kubectl apply -f autoscale-service.yaml
Verify:
kubectl get ksvc autoscale-demo
Ensure READY is True and the URL is populated.
Sample output:
Optional: Configure a Custom Domain (For Simpler Access)
If you want to access Knative services like http://autoscale-demo.myapps.com directly:
- Point a domain/subdomain to the Istio EXTERNAL-IP.
- Update the config-domain ConfigMap:
kubectl edit configmap config-domain -n knative-serving
Add:
myapps.com: ""
Deploy the service again. It will now be available under:
http://autoscale-demo.default.myapps.com
Step 7: Load Test the Service
Load testing is done by sending 500 concurrent requests for 60 seconds to the autoscale-demo web page. You should observe Knative scaling up pods in response to the load.
Using Internal Cluster URL
kubectl run -i --tty --rm loadgen --image=williamyeh/hey --restart=Never -- -z 60s -c 500 http://autoscale-demo.default.svc.cluster.local
This command starts a temporary pod in the cluster and runs the load test from inside it.
Test with hey using the external endpoint
Install hey (https://github.com/rakyll/hey) to generate HTTP traffic.
Find Istio's external IP:
kubectl get svc istio-ingressgateway -n istio-system
Run the hey load test using the host header and the external IP:
hey -z 60s -c 500 -host autoscale-demo.default.example.com http://72.152.40.239
Replace the IP with the EXTERNAL-IP from above, and replace autoscale-demo.default.example.com with the host from the Knative URL if you have configured a custom domain.
Note: autoscale-demo.default.example.com is the default domain assigned by Knative when no custom domain is configured (the built-in example.com default).
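Before running the full load test, a single request can confirm that routing works (the IP and host reflect the example values above):

```shell
# One request through the Istio gateway, routed by the Host header
curl -H "Host: autoscale-demo.default.example.com" http://72.152.40.239
```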
Optional (if you want to avoid Host headers): Use Magic DNS (like nip.io)
Since the EXTERNAL-IP in this example is 72.152.40.239, you can use nip.io like this:
Edit the Knative config map:
kubectl edit configmap config-domain -n knative-serving
And replace the domain entry with (the key must include the IP so that nip.io resolves it back to the gateway):
data:
  72.152.40.239.nip.io: ""
Then your Knative URL will become:
http://autoscale-demo.default.72.152.40.239.nip.io
So you can directly run:
curl http://autoscale-demo.default.72.152.40.239.nip.io
# Or
hey -z 60s -c 500 http://autoscale-demo.default.72.152.40.239.nip.io
No Host header is needed.
Step 8: Monitor Autoscaling
kubectl get pods -l serving.knative.dev/service=autoscale-demo -w
Observe how the number of pods increases during load and scales back down when idle.
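The autoscaler's own view (desired vs. actual scale) can also be inspected through Knative's internal PodAutoscaler resource; assuming the CRD's short name kpa is available, a sketch:

```shell
# Show the KPA's desired/actual scale for services in the default namespace
kubectl get kpa -n default
```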
Pods before starting Load test:
Running load test:
Pods while running load test:
Pods few minutes after running load test:
Step 9: (Optional) Clean Up
kubectl delete ksvc autoscale-demo
Conclusion
Combining Knative with AKS provides a robust, cloud-native platform for running microservices that scale dynamically based on real-time HTTP traffic. Whether you're building a FinOps-aligned landing zone or simply optimizing for elasticity, this setup offers:
- Elasticity: Scale from zero to high traffic effortlessly.
- Efficiency: Pay only for what you consume.
- Simplicity: Use YAML and native Kubernetes constructs to manage autoscaling.