May 14, 2025

Table of Contents
Scenario 1: Burst to Azure from on-premises Data Center
Scenario 2: “24×7 Single Set Workload”
Scenario 3: “Data Center Supplement”
Abstract
Azure NetApp Files (ANF) is transforming Electronic Design Automation (EDA) workflows in the cloud by delivering unparalleled performance, scalability, and efficiency. This blog explores how ANF addresses critical challenges in three cloud compute scenarios: Cloud Bursting, 24×7 All-in-Cloud, and Cloud-based Data Center Supplement.
These solutions are tailored to optimize EDA processes, which rely on high-performance NFS file systems to design advanced semiconductor products. With the ability to support clusters exceeding 50,000 cores, ANF enhances productivity, shortens design cycles, and eliminates infrastructure concerns, making it the default choice for EDA workloads in Azure.
Additionally, innovations such as increased L3 cache and the transition to DDR5 memory enable performance boosts of up to 60%, further accelerating the pace of chip design and innovation.
Co-authors:
- Andy Chan, Principal Product Manager Azure NetApp Files
- Arnt de Gier, Technical Marketing Engineer Azure NetApp Files
Introduction
Azure NetApp Files (ANF) solutions support three major cloud compute scenarios running Electronic Design Automation (EDA) in Azure:
- Cloud Bursting
- 24×7 All-in-Cloud
- Cloud based Data Center Supplement
ANF solutions address the key challenges associated with each scenario. By providing an optimized solution stack for EDA engineers, ANF increases productivity and shortens design cycles, making it the de facto standard file system for running EDA workloads in Azure.
Electronic Design Automation (EDA) processes consist of a suite of software tools and workflows used to design semiconductor products such as advanced computer processors (chips), all of which require high-performance NFS file system solutions. The increasing demand for chips with superior performance, smaller area, and lower power consumption (PPA) is driven by today’s rapid pace of innovation to power workloads such as AI.
To meet this growing demand, EDA tools require numerous nodes and multiple CPUs (cores) in a cluster. This is where Azure NetApp Files (ANF) comes into play with its high-performance, scalable file system. ANF ensures that data is efficiently delivered to these compute nodes. This means a single cluster—sometimes encompassing more than 50,000 cores—can function as a unified entity, providing both scale-out performance and consistency, which is essential for designing advanced semiconductor products. ANF is the most performance-optimized NFS storage in Azure, making it the de facto solution for EDA workloads.
According to Philip Steinke, AMD’s Fellow of CAD Infrastructure and Physical Design, the main priority is to maximize the productivity of chip designers by eliminating infrastructure concerns related to compute and file system expansion typically experienced with on-premises deployments that require long planning cycles and significant capital expenditure.
In register-transfer level (RTL) simulations, Microsoft Azure showcased that moving to a CPU with greater amounts of L3 cache can give EDA users a performance boost of up to 60% for their workloads. This improvement is attributed to the increased L3 cache, higher clock speeds, improved instructions per cycle (IPC), and the transition from DDR4 to DDR5 memory.
Azure’s commitment to providing high-performing, on-demand HPC (High-Performance Computing) infrastructure is a well-known advantage and has become the primary reason EDA companies are increasingly adopting Azure for their chip design needs. In this paper, three different scenarios of Azure for EDA are explored, namely “Cloud Bursting,” “24×7 Single Set Workload,” and “Data Center Supplement,” as a reference framework to help guide engineers’ Azure for EDA journey.
EDA Cloud-Compute scenarios
The following sections delve into three key scenarios that address the computational needs of EDA workflows: “Cloud Bursting,” “24×7 Single Set Workload,” and “Data Center Supplement.” Each scenario highlights how Azure’s robust infrastructure, combined with high-performance solutions like Azure NetApp Files, enables engineering teams to overcome traditional limitations, streamline chip design processes, and significantly enhance productivity.
Scenario 1: Burst to Azure from on-premises Data Center
An EDA workload is made up of a series of workflows in which certain steps are bursty, which can lead to periods in semiconductor project cycles where compute demand exceeds the on-premises HPC server cluster capacity. Many EDA customers have been bursting to Azure to speed up their engineering projects. In one example, a total of 120,000 cores were deployed across many clusters, all well supported by the high-performance capabilities of ANF.
As design projects approach completion, the design is continuously and incrementally modified to fix bugs, resolve synthesis and timing issues, optimize area, timing, and power, and address issues flagged by manufacturing design rule checks.
When design changes are made, many if not all the design steps must be re-run to ensure the change did not break the design. As a result, “design spins” or “large regression” jobs will put a large compute demand on the HPC server cluster. This leads to long job scheduler queues (IBM LSF and Univa Grid Engine are two common schedulers for EDA) where jobs wait to be dispatched to run on an available compute node.
Competing project schedules are another reason HPC server cluster demands can exceed on-premises fixed capacity. Most engineering divisions within a company share infrastructure resources across teams and projects, which inevitably leads to oversubscription of compute capacity and long job queues, resulting in production delays. Bursting EDA jobs into Azure, with its available compute capacity, is a way to alleviate these delays. For example, Azure’s latest CPU offering can deliver up to 47% shorter turnaround times for RTL simulation than on-premises.
Engineering management also tries to increase productivity through effective use of EDA tool licensing. Utilizing Azure’s on-demand compute resources and high-performance storage solutions like Azure NetApp Files enables engineering teams to accelerate design cycles and reduce non-recurring engineering (NRE) costs, significantly enhancing productivity.
For “burst to Azure” scenarios that allow engineers quick access to compute resources to finish a job without worrying about the underlying NFS infrastructure and traditional complex management overhead, ANF delivers:
- High Performance: up to 826,000 IOPS per large volume, serving the data for the most demanding simulations with ease to reduce turn-around-time.
- Scalability: As EDA projects advance, the data generated can grow exponentially. ANF provides large-capacity single namespaces with volumes up to 2PiB, enabling your storage solution to scale seamlessly, while supporting compute clusters with more than 50,000 cores.
- Ease of Use: ANF is designed for simplicity, with a SaaS-like user experience, allowing deployment and management with a few clicks or API automation. Since storage can be deployed rapidly, engineering teams can access their EDA HPC hardware quickly for their jobs.
- Cost-Effectiveness: ANF offers cool access, which transparently moves ‘cold’ data blocks to lower-cost Azure Storage. Additionally, Reserved Capacity (RC) can provide significant cost savings compared to pay-as-you-go pricing, further reducing the high upfront CapEx costs and long procurement cycle associated with on-premises storage solutions. Use the ANF effective pricing estimator to estimate your savings.
- Reliability and Security: ANF provides enterprise-grade data management and security features, ensuring that your critical EDA data is protected and available when you need it with key management and encryption built-in.
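The sizing relationship behind the performance and scalability bullets above can be sketched numerically. The following Python helper is illustrative only: `min_quota_tib` is not an Azure API, and the per-TiB throughput rates reflect the ANF service-level documentation at the time of writing, so current values should be verified before use.

```python
# Illustrative sketch: smallest ANF volume quota (automatic QoS) whose
# throughput limit meets a target. Rates are MiB/s granted per TiB of
# quota, by service level, per ANF docs at time of writing.
import math

THROUGHPUT_PER_TIB = {
    "Standard": 16,
    "Premium": 64,
    "Ultra": 128,
}

def min_quota_tib(target_mib_s: float, service_level: str) -> int:
    """Smallest whole-TiB quota that meets the target throughput."""
    rate = THROUGHPUT_PER_TIB[service_level]
    return max(1, math.ceil(target_mib_s / rate))

if __name__ == "__main__":
    # A regression farm needing ~3,200 MiB/s of aggregate throughput:
    for level in THROUGHPUT_PER_TIB:
        # Standard 200 TiB, Premium 50 TiB, Ultra 25 TiB
        print(level, min_quota_tib(3200, level), "TiB")
```

The same arithmetic works in reverse for capacity-driven deployments: a 2 PiB large volume at a high service level carries a correspondingly large throughput ceiling, which is why a single volume can feed clusters of 50,000+ cores.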
Scenario 2: “24×7 Single Set Workload”
As Azure for EDA has matured and the value of providing engineers with readily available, faster HPC infrastructure has become more widely recognized, more users are now moving entire sets of workloads into Azure that run 24×7.
In addition to SPICE or RTL simulations, one such set of workloads is “digital signoff,” with the same goal of increasing productivity. Scenario 1 concerns cloud bursting, which involves batch processes requiring high performance and rapid deployment, whereas Scenario 2 involves operating a standing set of workloads with additional ANF capabilities for data security and user control.
- QoS support: ANF’s QoS function fine-tunes storage utilization by establishing a direct correlation between volume size (quota) and performance, which sets the storage and performance limits an EDA tool or workload may consume.
- Snapshot data protection: As more users are using Azure resources, data protection is crucial. ANF snapshots protect primary data often and efficiently for fast recovery from corruption or loss, by restoring a volume to a snapshot in seconds or by restoring individual files from a snapshot. Enabling snapshots is recommended for user home directories and group shares for this reason as well.
- Large volume support: A set of workloads generates greater output than a single workload, and as such ANF’s large volume support is a feature that’s being widely adopted by EDA users in this scenario. ANF now supports single volumes up to 2PiB in size, allowing more fine-tuned management of users’ storage footprint.
- Cool access: Cool access is an ANF feature that enables better cost control because only data that is being worked on at any given time remains in the hot tier. This functionality enables inactive data blocks from the volume and volume snapshots to be transferred from the hot tier to an Azure storage account (the cool tier), saving cost. Because EDA workloads are known to be metadata heavy, ANF does not relocate metadata to the cool tier, ensuring that metadata operations operate as expected.
- Dynamic capacity pool resizing: Cloud compute resources can be dynamically allocated. To support this deployment model, Azure NetApp Files (ANF) also offers dynamic pool resizing, which further enhances Azure-for-EDA’s value proposition. If the size of the pool remains constant but performance requirements fluctuate, enabling dynamic provisioning and deprovisioning of capacity pools of different types provides just-in-time performance. This approach lowers costs during periods when high performance is not needed.
- Reserved Capacity: Azure allows compute resources to be reserved, guaranteeing access to that capacity while providing significant cost savings compared to the standard “pay-as-you-go” pricing model. This Azure offering is also available for ANF: reservations in 100-TiB and 1-PiB units per month, for a one- or three-year term, for a particular service level within a region.
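As a rough illustration of why dynamic pool resizing matters for cost, the sketch below compares a statically provisioned capacity pool with one resized around a regression burst. The per-TiB monthly rates are invented placeholders, not Azure prices; only the structure of the calculation is the point.

```python
# Hypothetical cost sketch for dynamic capacity pool resizing.
# The rates below are made-up placeholders, NOT Azure list prices.
HYPOTHETICAL_RATE_PER_TIB_MONTH = {"Standard": 150.0, "Premium": 300.0, "Ultra": 400.0}

def monthly_cost(size_tib, service_level, fraction_of_month=1.0):
    # ANF pools are billed on provisioned size; a resize changes the
    # billed size from that point on, modeled here as a month fraction.
    return HYPOTHETICAL_RATE_PER_TIB_MONTH[service_level] * size_tib * fraction_of_month

# Static 100 TiB Premium pool held all month:
static = monthly_cost(100, "Premium")
# Dynamic: 20 TiB baseline, resized to 100 TiB for a one-week burst:
dynamic = monthly_cost(20, "Premium", 3 / 4) + monthly_cost(100, "Premium", 1 / 4)
print(static, dynamic)  # 30000.0 12000.0
```

Reserved Capacity then stacks on top of this: the predictable baseline portion of the footprint is the natural candidate for a one- or three-year reservation, with on-demand capacity covering the bursts.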
Scenario 3: “Data Center Supplement”
Scenario 3 builds on Scenarios 1 and 2: EDA users expand their workflow into Azure as a supplement to their data center. In this scenario, a mixed EDA flow is hosted, with tools from several EDA ISVs spanning frontend, backend, and analog mixed-signal design. EDA companies such as d-Matrix have designed an entire AI chip in Azure, an example of Scenario 3.
In this data center supplement scenario, data mobility and additional data life cycle management solutions are essential. Once again, Azure NetApp Files (ANF) rises to the challenge by offering additional features within its solution stack:
- Backup support: ANF has a policy-based backup feature that uses AES-256-bit encryption during the encoding of the received backup data. Backup frequency is defined by a policy.
- Cross-region replication: ANF data can be replicated asynchronously between Azure NetApp Files volumes (source and destination) with cross-region replication. The source and destination volumes must be deployed in different Azure regions. The service level for the destination capacity pool might be the same or different, allowing customers to fine-tune their data protection demands as efficiently as possible.
- Cross-zone replication: Similar to the Azure NetApp Files cross-region replication feature, the cross-zone replication (CZR) capability provides data protection between volumes in different availability zones. You can asynchronously replicate data from an Azure NetApp Files volume (source) in one availability zone to another Azure NetApp Files volume (destination) in another availability zone. This capability enables you to fail over your critical application if a zone-wide outage or disaster happens.
- BC/DR: Users can construct their own solution based on their own goals by using a variety of BC/DR templates that include snapshots, various replication types, failover capabilities, backup, and support for REST API, Azure CLI, and Terraform.
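To make the replication options above concrete, the sketch below picks the least frequent replication schedule that still meets a recovery point objective (RPO). The three schedule values mirror the ones ANF cross-region and cross-zone replication expose; the selection helper itself is illustrative and not part of any Azure SDK.

```python
# Illustrative RPO-driven schedule picker for ANF replication.
# Interval values (minutes) correspond to the schedules ANF offers:
# every 10 minutes, hourly, and daily.
SCHEDULES_MINUTES = {"_10minutely": 10, "hourly": 60, "daily": 1440}

def pick_schedule(rpo_minutes: int) -> str:
    """Least frequent schedule whose interval does not exceed the RPO."""
    candidates = [(iv, name) for name, iv in SCHEDULES_MINUTES.items()
                  if iv <= rpo_minutes]
    if not candidates:
        raise ValueError("No ANF replication schedule can meet this RPO")
    return max(candidates)[1]

print(pick_schedule(120))  # hourly
```

A team with a two-hour RPO for signoff data would therefore land on hourly replication, while daily replication suffices for archival project shares, keeping replication traffic (and cost) proportional to the protection actually required.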
Summary
The integration of ANF into the EDA workflow addresses the limitations of traditional on-premises infrastructure. By leveraging the latest CPU generations and Azure’s on-demand HPC infrastructure, EDA users can achieve significant performance gains and improve productivity, all while being connected by the most optimized, performant file system that’s simple to deploy and support.
The three Azure for EDA scenarios—Cloud Bursting, 24×7 Single Set Workload, and Data Center Supplement—showcase Azure’s adaptability and effectiveness in fulfilling the changing needs of the semiconductor industry. As a result, ANF has become the default NFS solution for EDA in Azure, allowing businesses to innovate even faster.