Deploy Secure Syslog Collection on Azure Kubernetes Service with Terraform
July 15, 2025
Organizations managing distributed infrastructure face a common challenge: collecting syslog data securely and reliably from various sources. Whether you’re aggregating logs from network devices, Linux servers, or applications, you need a solution that scales with your environment while maintaining security standards.
This post walks through deploying Logstash on Azure Kubernetes Service (AKS) to collect RFC 5425 syslog messages over TLS. The solution uses Terraform for infrastructure automation and forwards collected logs to Azure Event Hubs for downstream processing. You’ll learn how to build a production-ready deployment that integrates with Azure Sentinel, Azure Data Explorer, or other analytics platforms.
Solution Architecture
The deployment consists of several Azure components working together:
- Azure Kubernetes Service (AKS): Hosts the Logstash deployment with automatic scaling capabilities
- Internal Load Balancer: Provides a static IP endpoint for syslog sources within your network
- Azure Key Vault: Stores TLS certificates for secure syslog transmission
- Azure Event Hubs: Receives processed syslog data using the Kafka protocol
- Log Analytics Workspace: Monitors the AKS cluster health and performance
Syslog sources send RFC 5425-compliant messages over TLS to the Load Balancer on port 6514. Logstash processes these messages and forwards them to Event Hubs, where they can be consumed by various Azure services or third-party tools.
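RFC 5425 transports syslog over TLS using octet-counting framing: each message is prefixed with its length in ASCII digits and a space. A minimal Python sketch of that framing (the sample message content is illustrative):

```python
def frame_rfc5425(message: bytes) -> bytes:
    """Apply RFC 5425 octet-counting framing: MSG-LEN SP SYSLOG-MSG."""
    return str(len(message)).encode("ascii") + b" " + message

def unframe_rfc5425(stream: bytes) -> tuple[bytes, bytes]:
    """Split one framed message off the front of a TLS byte stream;
    returns (message, remaining_bytes)."""
    length, _, rest = stream.partition(b" ")
    n = int(length)
    return rest[:n], rest[n:]

# An example RFC 5424 message, framed for transmission over TLS.
msg = b"<134>1 2025-07-15T12:00:00Z host app - - - hello"
framed = frame_rfc5425(msg)
recovered, remainder = unframe_rfc5425(framed)
assert recovered == msg and remainder == b""
```

This framing is why plain TCP syslog clients cannot talk to an RFC 5425 listener directly; the sources must speak the same framing and TLS.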
Prerequisites
Before starting the deployment, ensure you have these tools installed and configured:
- Terraform: Version 1.5 or later
- Azure CLI: Authenticated to your Azure subscription
- kubectl: For managing Kubernetes resources after deployment
Several Azure resources must be created manually before running Terraform, as the configuration references them. This approach provides flexibility in organizing resources across different teams or environments.
Step 1: Create Resource Groups
Create three resource groups to organize the solution components:
az group create --name rg-syslog-prod --location eastus
az group create --name rg-network-prod --location eastus
az group create --name rg-data-prod --location eastus
Each resource group serves a specific purpose:
- rg-syslog-prod: Contains the AKS cluster, Key Vault, and Log Analytics Workspace
- rg-network-prod: Holds networking resources (Virtual Network and Subnets)
- rg-data-prod: Houses the Event Hub Namespace for data ingestion
Step 2: Configure Networking
Create a Virtual Network with dedicated subnets for AKS and the Load Balancer:
az network vnet create \
  --resource-group rg-network-prod \
  --name vnet-syslog-prod \
  --address-prefixes 10.0.0.0/16 \
  --location eastus

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-aks-prod \
  --address-prefixes 10.0.1.0/24

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-lb-prod \
  --address-prefixes 10.0.2.0/24
The network design uses non-overlapping CIDR ranges to prevent routing conflicts. The Load Balancer subnet will later be assigned the static IP address 10.0.2.100.
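You can sanity-check the address plan before deploying. This sketch uses Python's standard-library ipaddress module with the values from the commands above:

```python
import ipaddress

# Address plan from the az commands above.
vnet = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "snet-aks-prod": ipaddress.ip_network("10.0.1.0/24"),
    "snet-lb-prod": ipaddress.ip_network("10.0.2.0/24"),
}

# Every subnet must sit inside the VNet address space.
for name, net in subnets.items():
    assert net.subnet_of(vnet), f"{name} is outside the VNet"

# The two subnets must not overlap with each other.
assert not subnets["snet-aks-prod"].overlaps(subnets["snet-lb-prod"])

# The Load Balancer's static IP must fall inside its subnet.
assert ipaddress.ip_address("10.0.2.100") in subnets["snet-lb-prod"]
print("address plan is consistent")
```

Running a check like this in CI catches CIDR mistakes before Terraform ever touches Azure.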
Step 3: Set Up Event Hub Namespace
Create an Event Hub Namespace with a dedicated Event Hub for syslog data:
az eventhubs namespace create \
  --resource-group rg-data-prod \
  --name eh-syslog-prod \
  --location eastus \
  --sku Standard

az eventhubs eventhub create \
  --resource-group rg-data-prod \
  --namespace-name eh-syslog-prod \
  --name syslog
The Standard SKU provides Kafka protocol support, which Logstash uses for reliable message delivery. The namespace automatically includes a RootManageSharedAccessKey for authentication.
Step 4: Configure Key Vault and TLS Certificate
Create a Key Vault to store the TLS certificate:
az keyvault create \
  --resource-group rg-syslog-prod \
  --name kv-syslog-prod \
  --location eastus
For production environments, import a certificate from your Certificate Authority:
az keyvault certificate import \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --file certificate.pfx \
  --password
For testing purposes, you can generate a self-signed certificate:
az keyvault certificate create \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --policy "$(az keyvault certificate get-default-policy)"
Important: The certificate’s Common Name (CN) or Subject Alternative Name (SAN) must match the DNS name your syslog sources will use to connect to the Load Balancer.
Step 5: Create Log Analytics Workspace
Set up a Log Analytics Workspace for monitoring the AKS cluster:
az monitor log-analytics workspace create \
  --resource-group rg-syslog-prod \
  --workspace-name log-syslog-prod \
  --location eastus
Understanding the Terraform Configuration
With the prerequisites in place, let’s examine the Terraform configuration that automates the remaining deployment. The configuration follows a modular approach, making it easy to customize for different environments.
Referencing Existing Resources
The Terraform configuration begins by importing references to the manually created resources:
data "azurerm_client_config" "current" {}

data "azurerm_resource_group" "rg-main" {
  name = "rg-syslog-prod"
}

data "azurerm_resource_group" "rg-network" {
  name = "rg-network-prod"
}

data "azurerm_resource_group" "rg-data" {
  name = "rg-data-prod"
}

data "azurerm_virtual_network" "primary" {
  name                = "vnet-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-network.name
}

data "azurerm_subnet" "kube-cluster" {
  name                 = "snet-aks-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}

data "azurerm_subnet" "kube-lb" {
  name                 = "snet-lb-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}

# Referenced later by the AKS oms_agent block and the TLS secret.
data "azurerm_log_analytics_workspace" "logstash" {
  name                = "log-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-main.name
}

data "azurerm_key_vault" "main" {
  name                = "kv-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-main.name
}

data "azurerm_key_vault_certificate_data" "logstash" {
  name         = "cert-syslog-prod"
  key_vault_id = data.azurerm_key_vault.main.id
}
These data sources establish connections to existing infrastructure, ensuring the AKS cluster and Load Balancer deploy into the correct network context.
Deploying the AKS Cluster
The AKS cluster configuration balances security, performance, and manageability:
resource "azurerm_kubernetes_cluster" "primary" {
  name                = "aks-syslog-prod"
  location            = data.azurerm_resource_group.rg-main.location
  resource_group_name = data.azurerm_resource_group.rg-main.name
  dns_prefix          = "aks-syslog-prod"

  default_node_pool {
    name           = "default"
    node_count     = 2
    vm_size        = "Standard_DS2_v2"
    vnet_subnet_id = data.azurerm_subnet.kube-cluster.id
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin      = "azure"
    load_balancer_sku   = "standard"
    network_plugin_mode = "overlay"
  }

  oms_agent {
    log_analytics_workspace_id = data.azurerm_log_analytics_workspace.logstash.id
  }
}
Key configuration choices:
- System-assigned managed identity: Eliminates the need for service principal credentials
- Azure CNI in overlay mode: Provides efficient pod networking without consuming subnet IPs
- Standard Load Balancer SKU: Enables zone redundancy and higher performance
- OMS agent integration: Sends cluster metrics to Log Analytics for monitoring
The cluster requires network permissions to create the internal Load Balancer:
resource "azurerm_role_assignment" "aks-netcontrib" {
  scope                = data.azurerm_virtual_network.primary.id
  principal_id         = azurerm_kubernetes_cluster.primary.identity[0].principal_id
  role_definition_name = "Network Contributor"
}
Configuring Logstash Deployment
The Logstash deployment uses Kubernetes resources for reliability and scalability. First, create a dedicated namespace:
resource "kubernetes_namespace" "logstash" {
  metadata {
    name = "logstash"
  }
}
The internal Load Balancer service exposes Logstash on a static IP:
resource "kubernetes_service" "loadbalancer-logstash" {
  metadata {
    name      = "logstash-lb"
    namespace = kubernetes_namespace.logstash.metadata[0].name
    annotations = {
      "service.beta.kubernetes.io/azure-load-balancer-internal"        = "true"
      "service.beta.kubernetes.io/azure-load-balancer-ipv4"            = "10.0.2.100"
      "service.beta.kubernetes.io/azure-load-balancer-internal-subnet" = data.azurerm_subnet.kube-lb.name
      "service.beta.kubernetes.io/azure-load-balancer-resource-group"  = data.azurerm_resource_group.rg-network.name
    }
  }

  spec {
    type = "LoadBalancer"
    selector = {
      app = kubernetes_deployment.logstash.metadata[0].name
    }
    port {
      name        = "logstash-tls"
      protocol    = "TCP"
      port        = 6514
      target_port = 6514
    }
  }
}
The annotations configure Azure-specific Load Balancer behavior, including the static IP assignment and subnet placement.
Securing Logstash with TLS
Kubernetes Secrets store the TLS certificate and Logstash configuration:
resource "kubernetes_secret" "logstash-ssl" {
  metadata {
    name      = "logstash-ssl"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  data = {
    "server.crt" = data.azurerm_key_vault_certificate_data.logstash.pem
    "server.key" = data.azurerm_key_vault_certificate_data.logstash.key
  }

  type = "Opaque"
}
The certificate data comes directly from Key Vault, maintaining a secure chain of custody.
Logstash Container Configuration
The deployment specification defines how Logstash runs in the cluster:
resource "kubernetes_deployment" "logstash" {
  metadata {
    name      = "logstash"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    selector {
      match_labels = {
        app = "logstash"
      }
    }

    template {
      metadata {
        labels = {
          app = "logstash"
        }
      }

      spec {
        container {
          name  = "logstash"
          image = "docker.elastic.co/logstash/logstash:8.17.4"

          security_context {
            run_as_user                = 1000
            run_as_non_root            = true
            allow_privilege_escalation = false
          }

          resources {
            requests = {
              cpu    = "500m"
              memory = "1Gi"
            }
            limits = {
              cpu    = "1000m"
              memory = "2Gi"
            }
          }

          volume_mount {
            name       = "logstash-config-volume"
            mount_path = "/usr/share/logstash/pipeline/logstash.conf"
            sub_path   = "logstash.conf"
            read_only  = true
          }

          volume_mount {
            name       = "logstash-ssl-volume"
            mount_path = "/etc/logstash/certs"
            read_only  = true
          }
        }

        # Backing volumes for the mounts above. The pipeline configuration
        # is assumed to live in a Secret named "logstash-config" that holds
        # logstash.conf (see the pipeline configuration section below).
        volume {
          name = "logstash-config-volume"
          secret {
            secret_name = "logstash-config"
          }
        }

        volume {
          name = "logstash-ssl-volume"
          secret {
            secret_name = kubernetes_secret.logstash-ssl.metadata[0].name
          }
        }
      }
    }
  }
}
Security best practices include:
- Running as a non-root user (UID 1000)
- Disabling privilege escalation
- Mounting configuration and certificates as read-only
- Setting resource limits to prevent runaway containers
Automatic Scaling Configuration
The Horizontal Pod Autoscaler ensures Logstash scales with demand:
resource "kubernetes_horizontal_pod_autoscaler" "logstash_hpa" {
  metadata {
    name      = "logstash-hpa"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    scale_target_ref {
      kind        = "Deployment"
      name        = kubernetes_deployment.logstash.metadata[0].name
      api_version = "apps/v1"
    }

    min_replicas                      = 1
    max_replicas                      = 30
    target_cpu_utilization_percentage = 80
  }
}
This configuration maintains between 1 and 30 replicas, scaling up when CPU usage exceeds 80%.
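The scaling decision follows the standard HPA rule, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A simplified Python sketch (the real controller also applies tolerances and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float = 80,
                     min_replicas: int = 1, max_replicas: int = 30) -> int:
    """desired = ceil(current * currentMetric / targetMetric),
    clamped to the HPA's min/max bounds."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(2, 120))   # sustained load above target scales out
print(desired_replicas(4, 20))    # light load scales back toward the minimum
print(desired_replicas(20, 200))  # heavy load is capped at max_replicas
```

This is why the 80% target matters: a lower target scales out earlier at the cost of more idle capacity.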
Logstash Pipeline Configuration
The Logstash configuration file defines how to process syslog messages:
input {
  tcp {
    port => 6514
    type => "syslog"
    ssl_enable => true
    ssl_cert => "/etc/logstash/certs/server.crt"
    ssl_key => "/etc/logstash/certs/server.key"
    ssl_verify => false
  }
}

output {
  stdout {
    codec => rubydebug
  }

  kafka {
    bootstrap_servers => "${name}.servicebus.windows.net:9093"
    topic_id => "syslog"
    security_protocol => "SASL_SSL"
    sasl_mechanism => "PLAIN"
    sasl_jaas_config => 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://${name}.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=${primary_key};EntityPath=syslog";'
    codec => "json"
  }
}
The configuration:
- Listens on port 6514 for TLS-encrypted syslog messages
- Outputs to stdout for debugging (visible in container logs)
- Forwards processed messages to Event Hubs using the Kafka protocol
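The `${name}` and `${primary_key}` placeholders are typically filled in by Terraform templating. As a hedged illustration of how the Kafka settings fit together, this hypothetical Python helper assembles them from the namespace name and shared access key (all values shown are examples):

```python
def event_hub_kafka_settings(namespace: str, shared_access_key: str) -> dict:
    """Assemble the Kafka client settings used in the pipeline above.
    Illustrative helper; the function name and example key are assumptions."""
    connection_string = (
        f"Endpoint=sb://{namespace}.servicebus.windows.net/;"
        "SharedAccessKeyName=RootManageSharedAccessKey;"
        f"SharedAccessKey={shared_access_key};"
        "EntityPath=syslog"
    )
    return {
        "bootstrap_servers": f"{namespace}.servicebus.windows.net:9093",
        # Event Hubs expects the literal username "$ConnectionString";
        # the full connection string acts as the SASL password.
        "sasl_username": "$ConnectionString",
        "sasl_password": connection_string,
    }

settings = event_hub_kafka_settings("eh-syslog-prod", "example-key")
print(settings["bootstrap_servers"])
```

Keeping this assembly in one place makes it easy to rotate the shared access key without touching the pipeline definition.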
Deploying the Solution
With all components configured, deploy the solution using Terraform:
- Initialize Terraform in your project directory:
terraform init
- Review the planned changes:
terraform plan
- Apply the configuration:
terraform apply
- Connect to the AKS cluster:
az aks get-credentials \
  --resource-group rg-syslog-prod \
  --name aks-syslog-prod
- Verify the deployment:
kubectl -n logstash get pods
kubectl -n logstash get svc
kubectl -n logstash get hpa
Configuring Syslog Sources
After deployment, configure your syslog sources to send messages to the Load Balancer:
- Create a DNS record pointing to the Load Balancer IP (10.0.2.100), for example syslog.yourdomain.com
- Configure syslog clients to send RFC 5425 messages over TLS to port 6514
- Install the certificate chain on syslog clients if using a private CA or self-signed certificate
Example rsyslog forwarding rule for a Linux client (note that the @@ prefix alone forwards over plain TCP; TLS additionally requires rsyslog's gtls netstream driver to be configured):
*.* @@syslog.yourdomain.com:6514;RSYSLOG_SyslogProtocol23Format
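A fuller RainerScript sketch for TLS forwarding, assuming the file paths and DNS name shown here; the CA file must contain the chain that signed the Logstash certificate:

```
# /etc/rsyslog.d/50-forward-tls.conf (illustrative; paths are assumptions)
global(
  DefaultNetstreamDriver="gtls"
  DefaultNetstreamDriverCAFile="/etc/rsyslog.d/ca.pem"
)

*.* action(
  type="omfwd"
  target="syslog.yourdomain.com"
  port="6514"
  protocol="tcp"
  StreamDriver="gtls"
  StreamDriverMode="1"
  StreamDriverAuthMode="x509/name"
  StreamDriverPermittedPeers="syslog.yourdomain.com"
)
```

StreamDriverMode="1" enforces TLS-only transport, and the permitted-peers check ties the connection to the certificate's CN/SAN, which is why the certificate name requirement in Step 4 matters.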
Monitoring and Troubleshooting
Monitor the deployment using several methods:
View Logstash logs to verify message processing:
kubectl -n logstash logs -l app=logstash --tail=50
Check autoscaling status:
kubectl -n logstash describe hpa logstash-hpa
Monitor in Azure Portal:
- Navigate to the Log Analytics Workspace to view AKS metrics
- Check Event Hub metrics to confirm message delivery
- Review Load Balancer health probes and connection statistics
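With Container Insights enabled, you can also query recent Logstash output directly in the Log Analytics Workspace; this KQL sketch assumes the ContainerLogV2 schema is in use:

```
ContainerLogV2
| where PodNamespace == "logstash"
| where TimeGenerated > ago(1h)
| project TimeGenerated, PodName, LogMessage
| take 50
```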
Security Best Practices
This deployment incorporates several security measures:
- TLS encryption: All syslog traffic is encrypted using certificates from Key Vault
- Network isolation: The internal Load Balancer restricts access to the virtual network
- Managed identities: No credentials are stored in the configuration
- Container security: Logstash runs as a non-root user with minimal privileges
For production deployments, consider these additional measures:
- Enable client certificate validation in Logstash for mutual TLS
- Add Network Security Groups to restrict source IPs
- Implement Azure Policy for compliance validation
- Enable Azure Defender for Kubernetes
Integration with Azure Services
Once syslog data flows into Event Hubs, you can integrate with various Azure services:
Azure Sentinel: Configure Data Collection Rules to ingest syslog data for security analytics. See the Azure Sentinel documentation for detailed steps.
Azure Data Explorer: Create a data connection to analyze syslog data with KQL queries.
Azure Stream Analytics: Process syslog streams in real-time for alerting or transformation.
Logic Apps: Trigger workflows based on specific syslog patterns or events.
Cost Optimization
To optimize costs while maintaining performance:
- Right-size the AKS node pool based on actual syslog volume
- Use Azure Spot instances for non-critical environments
- Configure Event Hub retention based on compliance requirements
- Enable auto-shutdown for development environments
Conclusion
This Terraform-based solution provides a robust foundation for collecting syslog data in Azure. The combination of AKS, Logstash, and Event Hubs creates a scalable pipeline that integrates seamlessly with Azure’s security and analytics services.
The modular design allows easy customization for different environments and requirements. Whether you’re collecting logs from a handful of devices or thousands, this architecture scales to meet your needs while maintaining security and reliability.
For next steps, consider implementing additional Logstash filters for data enrichment, setting up automated certificate rotation, or expanding the solution to collect other log formats. The flexibility of this approach ensures it can grow with your organization’s logging requirements.