June 18, 2025
When working with enterprise HPC teams, a question we often hear is: how do we use InfiniBand or GPUs on Azure with Red Hat Enterprise Linux (RHEL)? Since Red Hat does not provide HPC-enabled images at this moment, the best solution is to build your own image.
To get your image as close as possible to the AlmaLinux HPC images, we now provide a set of scripts in the azhpc-images repository. The repository also contains example scripts to build these images and distribute them through an image gallery, so that creation and distribution can be done in a scalable and repeatable fashion.
$> git clone https://github.com/Azure/azhpc-images.git
$> cd azhpc-images/partners/rhel/imagebuilder
Set the main variables in the make_image.sh script. They define the resource group, the locations and some of the resource names.
The script creates the image based on RHEL-hpc-sig-template.json, which includes the customization script. The customization script is a good place to add any further customizations you may want to include.
Now let’s start building:
$> ./make_image.sh
The first time you run the make_image.sh script, it sets up the resource group and the image gallery. It also creates a managed identity that allows the Azure Image Builder to interact with your resource group. The script tries to set the right permissions, but this step may fail because of policies on the resource group.
If this happens, assign the Contributor role on the RHEL-imagebuilder resource group to the RHEL_UserId managed identity.
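If the automatic assignment fails, the role can also be granted by hand with the Azure CLI. The following is a minimal sketch, assuming that the RHEL_UserId managed identity lives in the RHEL-imagebuilder resource group and that your account is allowed to create role assignments there. The helper names rg_scope and grant_contributor are ours and are not part of the repository.

```shell
# Names taken from the text above; adjust them if you changed make_image.sh.
IDENTITY_NAME="RHEL_UserId"
RESOURCE_GROUP="RHEL-imagebuilder"

# Build the ARM scope string for a resource group.
# $1 = subscription id, $2 = resource group name
rg_scope() {
    printf '/subscriptions/%s/resourceGroups/%s\n' "$1" "$2"
}

# Assign the Contributor role on the resource group to the managed identity.
# Assumption: the identity was created inside $RESOURCE_GROUP.
grant_contributor() {
    sub_id=$(az account show --query id -o tsv) || return 1
    principal_id=$(az identity show \
        --name "$IDENTITY_NAME" \
        --resource-group "$RESOURCE_GROUP" \
        --query principalId -o tsv) || return 1
    az role assignment create \
        --assignee-object-id "$principal_id" \
        --assignee-principal-type ServicePrincipal \
        --role Contributor \
        --scope "$(rg_scope "$sub_id" "$RESOURCE_GROUP")"
}
```

After az login, source the snippet and call grant_contributor. If your own account is blocked by policy as well, the same command can be handed to a subscription administrator.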
Now the image is built and distributed. This may take a while (you may be able to speed things up by choosing a larger VM size for the build).
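While the build runs, its state can be followed from a second terminal. This is a rough sketch using az image builder show. The image template name is not given in this post, so it is passed in as an argument, and the helper names is_terminal_state and wait_for_build are hypothetical.

```shell
# Succeeds when the given run state means the build has stopped.
is_terminal_state() {
    case "$1" in
        Succeeded|PartiallySucceeded|Failed|Canceled) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll the image template until the last run has finished.
# $1 = resource group, $2 = image template name,
# $3 = poll interval in seconds (optional, defaults to 60)
wait_for_build() {
    while :; do
        state=$(az image builder show \
            --resource-group "$1" --name "$2" \
            --query lastRunStatus.runState -o tsv) || return 1
        echo "$(date '+%H:%M:%S') runState=$state"
        is_terminal_state "$state" && break
        sleep "${3:-60}"
    done
    # The exit status reports whether the build succeeded.
    [ "$state" = "Succeeded" ]
}
```

A call would look like wait_for_build RHEL-imagebuilder <your-template-name>, where the template name is whatever make_image.sh created in your setup.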
Note that to create the image, including the InfiniBand and GPU drivers, you do not need to build on a VM that supports either technology.
During the build, the Azure Image Builder creates a separate resource group that hosts a storage account with the logs. The Packer log is the best way to get insight into the results of the build process and to troubleshoot. After a successful build, feel free to remove this resource group.
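To locate that log from the command line, something like the following can help. It is a sketch that relies on two conventions we have seen with the Azure Image Builder and that may differ in your setup: the staging resource group name starts with IT_ followed by the name of your own resource group, and the log is stored as customization.log in a container named packerlogs. The helper names are ours.

```shell
# Prefix the Image Builder uses for its staging resource groups (assumed convention).
# $1 = name of your own resource group
staging_rg_prefix() {
    printf 'IT_%s_' "$1"
}

# List the staging resource groups that belong to your resource group.
list_staging_groups() {
    az group list \
        --query "[?starts_with(name, '$(staging_rg_prefix "$1")')].name" -o tsv
}

# Download customization.log from the first storage account in a staging group.
# $1 = staging resource group, $2 = local target directory (optional, defaults to .)
fetch_packer_log() {
    account=$(az storage account list --resource-group "$1" \
        --query "[0].name" -o tsv) || return 1
    az storage blob download-batch \
        --account-name "$account" \
        --source packerlogs \
        --destination "${2:-.}" \
        --pattern '*customization.log'
}
```

Run list_staging_groups RHEL-imagebuilder first, then pass the group you are interested in to fetch_packer_log. Your account needs access to the storage account for the download to work.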
To validate the InfiniBand drivers, check a fresh boot of the image on an HBv3 VM.
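The same check can be scripted. The sketch below assumes the ibstat tool is present in the image and counts the ports that are both active and of link layer InfiniBand, so that an Ethernet port of the same adapter family (for example from accelerated networking) is not counted. The function name count_active_ib_ports is ours.

```shell
# Read ibstat output on stdin and print the number of active InfiniBand ports.
count_active_ib_ports() {
    awk '
        # Remember the logical state of the current port.
        /^[[:space:]]*State:/      { state = $2 }
        # The link layer line closes a port block in ibstat output.
        /^[[:space:]]*Link layer:/ {
            if (state == "Active" && $3 == "InfiniBand") n++
            state = ""
        }
        END { print n + 0 }
    '
}
```

On the VM, run ibstat | count_active_ib_ports. A result of 0 on an HBv3 would suggest that the drivers did not load or that the port has not come up yet.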
Likewise, a fresh boot on an NC_H100_v5 VM validates the NVIDIA GPU drivers.
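For a scripted GPU check, nvidia-smi can print one line per GPU. This is a small sketch. The number of GPUs you expect depends on the VM size you picked, so it is passed in as an argument, and the helper names are hypothetical.

```shell
# Count the non-empty lines on stdin (one line per GPU in the CSV output below).
count_gpus() {
    grep -c . || true
}

# Succeeds when nvidia-smi reports exactly the expected number of GPUs.
# $1 = expected number of GPUs for the chosen VM size
check_gpus() {
    command -v nvidia-smi >/dev/null 2>&1 || {
        echo "nvidia-smi not found, is the driver installed?" >&2
        return 1
    }
    found=$(nvidia-smi --query-gpu=name,driver_version \
        --format=csv,noheader | count_gpus)
    echo "GPUs found: $found"
    [ "$found" -eq "$1" ]
}
```

On a VM with a single GPU, check_gpus 1 should exit with status 0 once the driver is loaded.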
Keep in mind that Red Hat images use the Red Hat Update Infrastructure (RHUI) for updates. Access to RHUI is protected by certificates that are included in the image. If these certificates expire, VMs booted from this image may run into failures, for example when deployed through CycleCloud.
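To see how long an image will remain usable, the expiry of the certificate can be checked with openssl. The sketch below assumes that the RHUI client certificate is stored at /etc/pki/rhui/product/content.crt. Verify the path on your own image, as it may differ between RHEL versions. The helper names are ours.

```shell
# Assumed location of the RHUI client certificate inside the image.
RHUI_CERT="/etc/pki/rhui/product/content.crt"

# Convert a number of days to seconds.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Succeeds when the certificate is still valid the given number of days from now.
# $1 = certificate file, $2 = number of days
cert_valid_for_days() {
    openssl x509 -in "$1" -noout \
        -checkend "$(days_to_seconds "$2")" >/dev/null
}

# Print the expiry date of the certificate.
# $1 = certificate file
cert_end_date() {
    openssl x509 -in "$1" -noout -enddate
}
```

For example, cert_valid_for_days "$RHUI_CERT" 30 || echo "rebuild the image soon" gives an early warning before the certificate lapses. Rebuilding the image from a current RHEL base is the simplest way to pick up a fresh certificate.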