When you deploy AKS, you deploy the control plane, which is managed by Microsoft, and one or more node pools, which contain the worker nodes used to run your Kubernetes workloads. These node pools are usually deployed as Virtual Machine Scale Sets. The scale sets are visible in your subscription, but generally you would not make changes to them directly, as they are managed by AKS and all of their configuration and management is done through AKS. However, there are some scenarios where you do need to change the underlying node configuration to handle the workloads you need to run. Whilst you can make some changes to these nodes, you need to make sure you do so in a supported manner that is applied consistently to all your nodes.
An example of this requirement is a recent issue I saw when deploying Elasticsearch onto AKS. Let’s take a look at this issue and see how it can be resolved, both for this specific case and for any other scenario where you need to make changes on the nodes.
The Issue
For the rest of this article, we will use a specific scenario to illustrate the need to make node changes, but the same approaches apply to any node-level change you need to make.
Elasticsearch requires an increased limit on the mmap count, due to the way it uses “mmapfs” for storing indices. The docs state you can resolve this by running:
sysctl -w vm.max_map_count=262144
This command needs to be run on the machine that is running the container, not inside the container. In our case, this is the AKS nodes.
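If you want to confirm the current value before changing anything, one quick way is to run a throwaway pod and read the setting from /proc. This is just a sketch; the pod lands on whichever node the scheduler picks, so add a node selector if you care about a specific node.

# Run a throwaway pod and read the node-level setting; vm.max_map_count is not
# a namespaced sysctl, so the value seen inside the container is the node's value.
kubectl run check-mmap --rm -it --restart=Never --image=busybox -- cat /proc/sys/vm/max_map_count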
Whilst this is fairly easy to do on my laptop, it isn’t really feasible to run manually on all of our AKS nodes, especially because nodes could be destroyed and recreated during updates or downtime. We need to make the change consistently on all nodes and automate the process so it is applied to new nodes as well.
Changes to Avoid
Whilst we want to make changes to our nodes, we want to do so in a way that doesn’t result in our nodes being in an unsupported state. One key example of this is making changes directly to the scale set. Using the IaaS/ARM APIs to make changes directly to the scale set, outside of Kubernetes, will result in your nodes being unsupported and should be avoided. This includes making changes to the CustomScriptExtension configured on the scale set.
Similarly, we want to avoid SSH’ing into the node’s operating system and making changes manually. Whilst this will apply the change you want, as soon as that node is destroyed and recreated, your change will be gone. In addition, if you use the node autoscaler, any new nodes won’t have your changes.
Solutions
There are a few different options that we could use to solve this issue and customise our node configuration. Let’s take a look at them in order of ease of use.
1. Customised Node Configuration
The simplest method to customise node configuration is through the use of node configuration files that can be applied at the creation of a cluster or a node pool. Using these configuration files you are able to customise a specific set of configuration settings for both the Node Operating System and the Kubelet configuration.
Below is an example of a Linux OS configuration:
{
  "transparentHugePageEnabled": "madvise",
  "transparentHugePageDefrag": "defer+madvise",
  "swapFileSizeMB": 1500,
  "sysctls": {
    "netCoreSomaxconn": 163849,
    "netIpv4TcpTwReuse": true,
    "netIpv4IpLocalPortRange": "32000 60000"
  }
}
We would then apply this at the time of creating a cluster or node pool by providing the file to the CLI command. For example, creating a cluster:
az aks create --name myAKSCluster --resource-group myResourceGroup --linux-os-config ./linuxosconfig.json
Creating a node pool:
az aks nodepool add --name mynodepool1 --cluster-name myAKSCluster --resource-group myResourceGroup --kubelet-config ./linuxkubeletconfig.json
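The kubelet configuration file referenced above isn’t shown in this article. As an illustrative sketch only (the exact set of supported keys is listed in the AKS documentation), it might look something like this:

{
  "cpuManagerPolicy": "static",
  "imageGcHighThreshold": 90,
  "imageGcLowThreshold": 70,
  "failSwapOn": false
}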
There are lots of different configuration settings that can be changed for both the OS and the kubelet, for both Linux and Windows nodes. The full list can be found here. For our scenario, we want to change the vm.max_map_count setting, which is available as one of the configuration options in the virtual memory section. Our OS configuration would look like this:
{
  "vmMaxMapCount": 262144
}
Note that the value used in the JSON is a camel case version of the property name, so vm.max_map_count becomes vmMaxMapCount.
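To apply this for our scenario, we would save it to a file and pass it in when creating the cluster or node pool, in the same way as the earlier examples (the node pool name below is just for illustration):

az aks nodepool add --name espool --cluster-name myAKSCluster --resource-group myResourceGroup --linux-os-config ./linuxosconfig.json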
2. Daemonsets
Another way we can make these changes, using a Kubernetes-native method, is through the use of Daemonsets. As you may know, Daemonsets provide a mechanism to run a pod on every node in your cluster. We can use a Daemonset to execute a script that sets the appropriate settings on each node, and the Daemonset will ensure that this is done on every node, including any new nodes created by the autoscaler or during updates.
To be able to make changes to the node, we will need to run the Daemonset with some elevated privileges, and so you may want to consider whether the node customisation file option, listed above, works for your scenario, before using this option.
For this to work, we need two things, a container to run, and a Daemonset configuration.
Container
All our Daemonset does is run a container; it’s the container that defines what is done. There are two options we can use for our scenario:
- Create our own container that has the script to run defined in the Docker file.
- Use a pre-built container, like BusyBox, which accepts parameters defining what commands to run.
The first option is more secure, as the container is fixed to running only the command you want, and any malicious changes would require someone to re-build and publish a new image and update the Daemonset configuration to run it.
The image we create is very basic; it just needs to have the tools your script requires installed, and then run the script. The only caveat is that Daemonset pods must have their restart policy set to Always, so we can’t just run our script and exit, as the container would simply be restarted. To avoid this, we can have the container sleep once it is done. If the node is ever restarted or replaced, the container will still run again. Here is the simplest Dockerfile we can use to solve our Elasticsearch issue:
FROM alpine
CMD sysctl -w vm.max_map_count=262144; sleep 365d
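The image then needs to be published somewhere the cluster can pull it from. As a sketch, assuming an Azure Container Registry named scdemo (to match the image reference used in the Daemonset manifest below), you could build and push it with ACR Tasks:

# Build the Dockerfile in the current directory and push it to the scdemo registry
az acr build --registry scdemo --image node-config:1 .

You will also need to make sure the cluster has pull access to the registry, for example by attaching it with az aks update --attach-acr scdemo.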
Daemonset Configuration
To run our Daemonset, we need to configure our YAML to do the following:
- Run our custom container, or use a pre-built container with the right parameters
- Grant the Daemonset the required privileges to be able to make changes to the node
- Set the Restart Policy to Always
If we want, we can also restrict our Daemonset to only run on nodes that we know are going to run this workload. For example, we can restrict this to only run on a specific node pool in AKS using a node selector.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-config
spec:
  selector:
    matchLabels:
      name: node-config
  template:
    metadata:
      labels:
        name: node-config
    spec:
      containers:
        - name: node-config
          image: scdemo.azurecr.io/node-config:1
          securityContext:
            privileged: true
      restartPolicy: Always
      nodeSelector:
        agentpool: elasticsearch
Once we deploy this to our cluster, the Daemonset will run and make the changes we require.
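Deploying and verifying it is just a case of applying the manifest and checking that a pod is running on each of the targeted nodes. The file name below is illustrative; the Daemonset and label names match the example above.

# Apply the Daemonset manifest
kubectl apply -f node-config-daemonset.yaml

# One pod should be scheduled on each node in the targeted pool
kubectl get daemonset node-config
kubectl get pods -l name=node-config -o wide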
When using a Daemonset or Init container approach, pay special attention to security. This container will run in privileged mode, which gives it a high level of permissions, and not just the ability to change the specific configuration setting you are interested in. Ensure that access to these containers and their configuration is restricted. Consider using init containers if possible as their runtime is more limited.
3. Init Containers
This is a similar approach to Daemonsets, but instead of running on every node, we use an init container in our application so that it only runs on the nodes where our application is present. An init container allows us to specify that a specific container must run and complete successfully before our main application runs. We can take the container that runs our custom script, as with the Daemonset option, and run it as an init container instead.
The benefit of this approach is that the init container only runs once when the application starts, and then stops. This avoids the need for the sleep command that keeps the process running at all times. The downside is that using an init container requires editing the YAML for the application you are deploying, which may be difficult or impossible if you are using a third party application. Some third party applications have Helm charts or similar that allow passing in custom init containers, but many do not. If you are creating your own applications then this is easier.
Below is an example using this approach. In this example we use a pre-built container (BusyBox) to run our script, rather than a custom container; either approach can be used.
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
  labels:
    app.kubernetes.io/name: MyApp
spec:
  containers:
    - name: main-app
      image: scdemo.azurecr.io/main-app:1
  initContainers:
    - name: init-sysctl
      image: busybox
      command:
        - sysctl
        - -w
        - vm.max_map_count=262144
      imagePullPolicy: IfNotPresent
      securityContext:
        privileged: true
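If you are deploying a third-party application via a Helm chart, check whether the chart exposes a value for injecting init containers. The value name varies from chart to chart, so the values snippet below is only a sketch of what that might look like:

# values.yaml sketch -- "extraInitContainers" is a hypothetical key name here;
# use whatever value your chart actually defines for custom init containers
extraInitContainers:
  - name: init-sysctl
    image: busybox
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true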
Conclusions
Making changes to underlying AKS nodes is something that most people won’t need to do, most of the time. However, there are some scenarios you may hit where this is important. AKS comes with functionality to do this in a controlled and supported manner via the use of configuration files. This approach is recommended if the setting you need to change is supported, as it is simpler to implement, doesn’t require creating custom containers and is the most secure approach. If the change you need is not supported, then you still have a way to deal with this via Daemonsets or init containers, but special attention should be paid to security when using these solutions.