Azure High Performance Computing (HPC) Blog
Running DeepSeek-R1 on a single NDv5 MI300X VM

jesselopez (Microsoft)
Feb 01, 2025

Contributors: Davide Vanzo, Yuval Mazor, Jesse Lopez

 

DeepSeek-R1 is an open-weights reasoning model built on DeepSeek-V3, designed for conversational AI, coding, and complex problem-solving. It has gained significant attention beyond the AI/ML community due to its strong reasoning capabilities, often competing with OpenAI’s models. One of its key advantages is that it can be run locally, giving users full control over their data. 

The NDv5 MI300X VM features 8x AMD Instinct MI300X GPUs, each equipped with 192GB of HBM3 and interconnected via Infinity Fabric 3.0. With up to 5.2 TB/s of memory bandwidth per GPU, the MI300X provides the necessary capacity and speed to process large models efficiently - enabling users to run DeepSeek-R1 at full precision on a single VM. 
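
As a rough sanity check on the capacity claim, the arithmetic can be sketched in a few lines of shell. It assumes the in-memory footprint is close to the ~642 GB download size quoted later in this post; the real footprint also includes KV cache and activations, so treat the headroom as an upper bound.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope HBM budget for one ND MI300X v5 VM.
GPUS=8
HBM_PER_GPU_GB=192
MODEL_GB=642   # assumption: weights in memory ~ download size quoted in this post

TOTAL_HBM_GB=$(( GPUS * HBM_PER_GPU_GB ))
HEADROOM_GB=$(( TOTAL_HBM_GB - MODEL_GB ))

echo "Total HBM: ${TOTAL_HBM_GB} GB"
echo "Headroom:  ${HEADROOM_GB} GB"
```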

In this blog post, we’ll walk you through the steps to provision an NDv5 MI300X instance on Azure and run DeepSeek-R1 for inference using the SGLang inference framework. 

Launching an NDv5 MI300X VM 

Prerequisites 

  • Check that your subscription has sufficient vCPU quota for the VM family “StandardNDISv5MI300X” (see Quota documentation). 
  • If needed, contact your Microsoft account representative to request a quota increase. 
  • A Bash terminal with Azure CLI installed and logged into the appropriate tenant. Alternatively, Azure Cloud Shell can be used. 
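
The quota check can also be scripted. The sketch below is one possible approach, not an official procedure: `quota_ok` and `check_mi300x_quota` are hypothetical helper names, and the filter assumes the usage entry for this family contains "MI300X" in its `name.value` field. The 96 vCPU figure comes from the Standard_ND96isr_MI300X_v5 size used below.

```shell
#!/usr/bin/env bash
REQUIRED_VCPUS=96   # one Standard_ND96isr_MI300X_v5 VM

# quota_ok LIMIT CURRENT: succeeds when the unused quota covers one VM.
quota_ok() {
  local limit=$1 current=$2
  [ $(( limit - current )) -ge "$REQUIRED_VCPUS" ]
}

# check_mi300x_quota REGION (hypothetical helper)
# Assumption: the family's usage entry has "MI300X" in name.value.
check_mi300x_quota() {
  local region=$1 usage limit current
  usage=$(az vm list-usage --location "$region" \
    --query "[?contains(name.value, 'MI300X')].[limit, currentValue]" \
    --output tsv | head -n 1)
  if [ -z "$usage" ]; then
    echo "No MI300X quota entry found in $region"
    return 1
  fi
  # tsv output is tab-separated: <limit><TAB><currentValue>
  limit=$(printf '%s\n' "$usage" | cut -f1)
  current=$(printf '%s\n' "$usage" | cut -f2)
  if quota_ok "$limit" "$current"; then
    echo "OK: $(( limit - current )) vCPUs available in $region"
  else
    echo "Insufficient: $(( limit - current )) of $REQUIRED_VCPUS vCPUs available in $region"
    return 1
  fi
}

# Example: check_mi300x_quota <REGION>
```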

Provision the VM

1. Using Azure CLI, create an Ubuntu-22.04 VM on ND_MI300x_v5:

az group create --location <REGION> -n <RESOURCE_GROUP_NAME> 
az vm create --name mi300x --resource-group <RESOURCE_GROUP_NAME> --location <REGION> --image microsoft-dsvm:ubuntu-hpc:2204-rocm:22.04.2025030701 --size Standard_ND96isr_MI300X_v5 --security-type Standard --os-disk-size-gb 256 --os-disk-delete-option Delete --admin-username azureadmin --ssh-key-values <PUBLIC_SSH_PATH>

Optionally, pass the cloud-init.yaml file shown later in this post via --custom-data <CLOUD_INIT_FILE_PATH> to automate the additional preparation described below:

az vm create --name mi300x --resource-group <RESOURCE_GROUP_NAME> --location <REGION> --image microsoft-dsvm:ubuntu-hpc:2204-rocm:22.04.2025030701 --size Standard_ND96isr_MI300X_v5 --security-type Standard --os-disk-size-gb 256 --os-disk-delete-option Delete --admin-username azureadmin --ssh-key-values <PUBLIC_SSH_PATH> --custom-data <CLOUD_INIT_FILE_PATH>

 

Note: The GPU drivers may take a couple of minutes to load completely after the VM is first created.
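
If you script the steps that follow, a small polling loop avoids racing the driver load. This is a minimal sketch: `retry` is a hypothetical helper, and it assumes the presence of /dev/kfd (the device node passed to Docker later in this post) is a reasonable signal that the amdgpu driver has loaded.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "gave up after $attempts attempt(s)"
  return 1
}

# Example: wait up to ~5 minutes for the GPU device node, then list the GPUs.
#   retry 30 10 test -e /dev/kfd && rocm-smi
```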

Additional preparation

Beyond provisioning the VM, a few additional steps prepare the environment to run DeepSeek-R1, or other AI workloads, optimally. These include setting up the 8 NVMe disks on the node in a RAID-0 configuration to act as the cache location for Docker and Hugging Face. 

The following steps assume you have connected to the VM and are working in a Bash shell.

1. Prepare the NVMe disks in a RAID-0 configuration  

sudo mkdir -p /mnt/resource_nvme/
sudo mdadm --create /dev/md128 -f --run --level 0 --raid-devices 8 $(ls /dev/nvme*n1)  
sudo mkfs.xfs -f /dev/md128 
sudo mount /dev/md128 /mnt/resource_nvme 
sudo chmod 1777 /mnt/resource_nvme  

2. Configure Hugging Face to use the RAID-0 array. This environment variable should also be propagated to any containers pulling images or data from Hugging Face.

mkdir -p /mnt/resource_nvme/hf_cache 
export HF_HOME=/mnt/resource_nvme/hf_cache 

3. Configure Docker to use the RAID-0

mkdir -p /mnt/resource_nvme/docker 
sudo tee /etc/docker/daemon.json > /dev/null <<EOF 
{ 
    "data-root": "/mnt/resource_nvme/docker" 
} 
EOF 
sudo chmod 0644 /etc/docker/daemon.json 
sudo systemctl restart docker 
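
Before moving on, it can be worth confirming that the array and the Docker data root look as expected. The helpers below are a sketch with hypothetical names (`raid_summary`, `check_docker_root`). `raid_summary` assumes the usual /proc/mdstat line layout (`md128 : active raid0 <members...>`) and will not match an array reported as `active (auto-read-only)`.

```shell
#!/usr/bin/env bash
# raid_summary DEVICE
# Reads /proc/mdstat-style text on stdin and prints "<level> <member-count>".
raid_summary() {
  awk -v dev="$1" '$1 == dev && $3 == "active" { print $4, NF - 4 }'
}

# check_docker_root EXPECTED_PATH
# Compares Docker's reported root directory with the configured data-root.
check_docker_root() {
  local expected=$1 actual
  actual=$(docker info --format '{{ .DockerRootDir }}') || return 1
  if [ "$actual" = "$expected" ]; then
    echo "Docker data-root OK: $actual"
  else
    echo "Docker data-root is $actual, expected $expected"
    return 1
  fi
}

# Example usage on the VM:
#   raid_summary md128 < /proc/mdstat      # 8 striped disks should report "raid0 8"
#   df -h /mnt/resource_nvme
#   check_docker_root /mnt/resource_nvme/docker
```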


All of these additional preparation steps can be automated at VM creation using cloud-init. The example cloud-init.yaml file below can be used when provisioning the VM as described above.

#cloud-config
package_update: true
write_files:
  - path: /opt/setup_nvme.sh
    permissions: '0755'
    owner: root:root
    content: |
      #!/bin/bash
      NVME_DISKS_NAME=`ls /dev/nvme*n1`
      NVME_DISKS=`ls -latr /dev/nvme*n1 | wc -l`

      echo "Number of NVMe Disks: $NVME_DISKS"

      if [ "$NVME_DISKS" == "0" ]
      then
          exit 0
      else
          mkdir -p /mnt/resource_nvme
          # Needed in case something did not unmount as expected. This will delete any data that may be left behind
          mdadm  --stop /dev/md*
          mdadm --create /dev/md128 -f --run --level 0 --raid-devices $NVME_DISKS $NVME_DISKS_NAME
          mkfs.xfs -f /dev/md128
          mount /dev/md128 /mnt/resource_nvme
      fi

      chmod 1777 /mnt/resource_nvme
  - path: /etc/profile.d/hf_home.sh
    permissions: '0755'
    content: |
      export HF_HOME=/mnt/resource_nvme/hf_cache
  - path: /etc/docker/daemon.json
    permissions: '0644'
    content: |
      {
        "data-root": "/mnt/resource_nvme/docker"
      }
runcmd:
  - ["/bin/bash", "/opt/setup_nvme.sh"]
  - mkdir -p /mnt/resource_nvme/docker
  - mkdir -p /mnt/resource_nvme/hf_cache
  # PAM group not working for docker group, so this will add all users to docker group
  - bash -c 'for USER in $(ls /home); do usermod -aG docker $USER; done'
  - systemctl restart docker

Using MI300X 

If you are familiar with the NVIDIA CUDA tools and environment, AMD provides equivalents as part of the ROCm stack.

MI300X + ROCm    NVIDIA + CUDA    Description
-------------    -------------    -----------
rocm-smi         nvidia-smi       CLI for monitoring the system and making changes
rccl             nccl             Library for communication between GPUs
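
For readers used to a quick `nvidia-smi` check, a rough ROCm counterpart might look like the sketch below. `gpu_snapshot` is a hypothetical wrapper name; the two `rocm-smi` flags report utilization and VRAM usage per GPU.

```shell
#!/usr/bin/env bash
# gpu_snapshot: print per-GPU utilization and VRAM usage via rocm-smi.
gpu_snapshot() {
  rocm-smi --showuse            # per-GPU utilization
  rocm-smi --showmeminfo vram   # per-GPU VRAM total and used
}

# Example: gpu_snapshot
```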

 

Running DeepSeek-R1 

1. Pull the container image. It is on the order of 10 GB in size, so it may take a few minutes to download.

docker pull rocm/sglang-staging:20250303

2. Start the SGLang server. The model (~642 GB) is downloaded the first time the server is launched, which will take at least a few minutes. Once the application outputs “The server is fired up and ready to roll!”, you can begin making queries to the model. 

docker run \
  --device=/dev/kfd \
  --device=/dev/dri \
  --security-opt seccomp=unconfined \
  --cap-add=SYS_PTRACE \
  --group-add video \
  --privileged \
  --shm-size 32g \
  --ipc=host \
  -p 30000:30000 \
  -v /mnt/resource_nvme:/mnt/resource_nvme \
  -e HF_HOME=/mnt/resource_nvme/hf_cache \
  -e HSA_NO_SCRATCH_RECLAIM=1 \
  -e GPU_FORCE_BLIT_COPY_SIZE=64 \
  -e DEBUG_HIP_BLOCK_SYN=1024 \
  rocm/sglang-staging:20250303 \
  python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1 --tp 8 --trust-remote-code --host 0.0.0.0 

3. You can now make queries to DeepSeek-R1. For example, the following requests, made from another shell on the same host, return model metadata and generate a sample response.

curl http://localhost:30000/get_model_info 
{"model_path":"deepseek-ai/DeepSeek-R1","tokenizer_path":"deepseek-ai/DeepSeek-R1","is_generation":true} 
curl http://localhost:30000/generate -H "Content-Type: application/json" -d '{ "text": "Once upon a time,", "sampling_params": { "max_new_tokens": 16, "temperature": 0.6 } }'
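
Because the first launch has to download the model, scripted clients may want to wait for the server before sending requests. The sketch below polls the /get_model_info endpoint shown above and then wraps the /generate call. `wait_for_server` and `generate` are hypothetical helper names, and `generate` pastes the prompt into the JSON body as-is, so it assumes the prompt contains no double quotes or backslashes.

```shell
#!/usr/bin/env bash
BASE_URL=${BASE_URL:-http://localhost:30000}

# wait_for_server [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls /get_model_info until it answers or the timeout elapses.
wait_for_server() {
  local timeout=${1:-3600} interval=${2:-15} waited=0
  until curl --silent --fail "$BASE_URL/get_model_info" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "server not ready after ${waited}s"
      return 1
    fi
    sleep "$interval"
    waited=$(( waited + interval ))
  done
  echo "server ready"
}

# generate PROMPT [MAX_NEW_TOKENS]
# Assumption: PROMPT has no double quotes or backslashes (no JSON escaping here).
generate() {
  local prompt=$1 max_new_tokens=${2:-16}
  curl --silent --fail "$BASE_URL/generate" \
    -H "Content-Type: application/json" \
    -d "{\"text\": \"$prompt\", \"sampling_params\": {\"max_new_tokens\": $max_new_tokens, \"temperature\": 0.6}}"
}

# Example:
#   wait_for_server 7200 30 && generate "Once upon a time," 32
```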

Conclusion 

In this post, we detail how to run the full-size 671B DeepSeek-R1 model on a single Azure NDv5 MI300X instance. This includes setting up the machine, installing the necessary drivers, and executing the model. Happy inferencing!


Updated Mar 10, 2025
Version 9.0
  • caseygu
    Copper Contributor

    Super helpful, good stuff!

    Commenting with a couple of issues I encountered that may help others:

    1. The driver installation script tries to install azcopy from a domain that no longer exists.
      In `azhpc-images/common/install_azcopy.sh`, change  `AZCOPY_DOWNLOAD_URL="https://azcopyvnext.azureedge.net/releases/release-${azcopy_release}/${TARBALL}"` to `AZCOPY_DOWNLOAD_URL="https://azcopyvnext-awgzd8g7aagqhzhe.b02.azurefd.net/releases/release-${azcopy_release}/${TARBALL}"` instead. 
    2. I followed the steps to create and save a custom VMI to my gallery, but any VM I created using the image had a lot of issues (different ones every time). Was never able to get SGLang running using a VM bootstrapped with the custom VMI.
      I went back to provisioning the VM manually which resolved all related issues. 
    3. Encountered OOM issues when using docker image `rocm/sglang-staging:20250212` which couldn't be resolved by limiting mem block sizes; resolved by using newer `rocm/sglang-staging:20250303` instead. 
    • jesselopez (Microsoft)

      Hi!  Thanks for the feedback.  A public image has been published and the blog has been updated to reflect that along with an example of using cloud-init to automate some of the additional steps recommended to ease onboarding.  The article now also recommends an updated container image.

  • Great stuff, that is what I want to try today!

    Just a small typo: it should be double dash before "location" argument in the "az group create" command.

    • jesselopez (Microsoft)

      Thanks!  Somewhere the `--` was converted to an em-dash that I missed.

    • garymansell
      Brass Contributor

      What regions have these VMs as I don't see any in the regions I normally frequent?

      • jesselopez (Microsoft)

        I'd recommend speaking with your Microsoft account representative about this.