GPU Passthrough on ARM64 with Libvirt/Virt-manager

In this article, I’ll walk you through setting up GPU passthrough on an ARM64 system using Libvirt and Virt-manager. The steps may look straightforward when you ask ChatGPT for an answer, but missing a single critical detail can cause the whole process to fail.

System Specifications

Nvidia Driver: NVIDIA-Linux-aarch64-570.86.16.run
Host: Ampere Altra + ALTRAD8UD
Host OS: Ubuntu 22.04 with HWE kernel (6.8)
Guest OS: Ubuntu 22.04 (ubuntu-22.04-live-server-arm64.iso)
GPU: Nvidia RTX 4080 16GB

Assumptions

  1. You are familiar with Ubuntu and its basic commands.
  2. You have experience using Virt-manager.
  3. All commands are executed as the root user.

If anything is unclear, you can refer to external resources for additional guidance.

Host Configuration

Enable IOMMU

To enable IOMMU, you need to enable the SR-IOV option in the BIOS and verify whether the Linux kernel has IOMMU enabled by default.

You can check if IOMMU is enabled by running:

$ dmesg | grep -i iommu

Example output:

[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.8.0-52-generic root=UUID=6b78fa89-a575-432d-a445-1497c3467214 ro iommu=on
[ 0.000000] Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-6.8.0-52-generic iommu=on", will be passed to user space.
[ 11.561684] iommu: Default domain type: Translated
[ 11.566470] iommu: DMA domain TLB invalidation policy: strict mode
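
The kernel log only shows that the IOMMU driver initialized. A complementary check is to list the IOMMU groups the kernel exposes under /sys/kernel/iommu_groups; a GPU that appears in a group can be handed to VFIO. The following is a minimal sketch. The function name list_iommu_groups and its optional directory argument are hypothetical, chosen for illustration.

```shell
#!/usr/bin/env bash
# List every IOMMU group and the PCI addresses it contains.
# The optional first argument overrides the sysfs root; it is an
# assumption added here so the function can be tried on a test directory.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    for group in "$root"/*; do
        # Skip anything that is not a group directory (or an unmatched glob).
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
        done
    done
}
```

On the host, piping the output of list_iommu_groups through grep 0005:01:00 should show the group that holds the GPU and its audio function. If nothing is printed at all, the IOMMU is most likely not active.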

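If IOMMU is not enabled, add iommu=on to the Linux kernel boot parameters. Note that in the example output above, this kernel reports iommu=on as an unknown parameter and passes it on to user space, so the "iommu:" lines that follow it are the actual evidence that the IOMMU is active: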

$ vim /etc/default/grub 

Modify the line:

GRUB_CMDLINE_LINUX_DEFAULT="iommu=on"

Then update GRUB and reboot:

$ update-grub2  
$ reboot
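
If /etc/default/grub already carries other options, replacing the whole line would drop them. The sketch below appends a parameter instead. It assumes the value is double-quoted, as in Ubuntu's stock file, and that the parameter contains no "|" or "&" characters. The function name add_kernel_param is hypothetical.

```shell
#!/usr/bin/env bash
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a grub
# defaults file, keeping the options that are already there.
# Assumption: the existing value is wrapped in double quotes.
add_kernel_param() {
    local file="$1" param="$2" current new
    # Extract the current value between the double quotes (may be empty).
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    # Nothing to do if the parameter is already present as a whole word.
    case " $current " in
        *" $param "*) return 0 ;;
    esac
    new="${current:+$current }$param"
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file"; then
        sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" "$file"
    else
        printf 'GRUB_CMDLINE_LINUX_DEFAULT="%s"\n' "$param" >> "$file"
    fi
}
```

A call such as add_kernel_param /etc/default/grub iommu=on would then be followed by update-grub2 and a reboot, as above.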

Additionally, enable SR-IOV in the BIOS. The exact location of this setting varies depending on the BIOS, but it is typically found under the PCIe subsystem or related options.

Upgrade Host to HWE Kernel

I recommend using the Hardware Enablement (HWE) kernel on the host. While I’m unsure if the regular kernel works, the HWE kernel has been reliable in my experience. Install it with:

$ apt install linux-generic-hwe-22.04

Configure VFIO on Host

The VM relies on the VFIO driver for GPU passthrough. To configure VFIO, you need to pass the PCIe device information to the VFIO driver.

First, identify the GPU’s PCIe device IDs:

$ lspci -nn

Example output:



0005:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2704] (rev a1)
0005:01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22bb] (rev a1)

Here, 10de:2704 is the GPU’s vendor:device ID pair (10de is Nvidia’s PCI vendor ID), and 10de:22bb is the ID of the GPU’s audio function. At a minimum, you need to pass through the GPU device.

Next, edit the VFIO configuration file to include these IDs:

$ vim /etc/modprobe.d/vfio.conf

Add the following line:

options vfio-pci ids=10de:2704,10de:22bb
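
Copying the IDs by hand is easy to get wrong, so the line can also be generated from the lspci output. This is a rough sketch: the function name vfio_options_line is hypothetical, and it collects every Nvidia function on the host, which is only appropriate when all of them are meant for passthrough.

```shell
#!/usr/bin/env bash
# Read `lspci -nn` output on stdin and print a vfio-pci options line
# covering every Nvidia (vendor 10de) function found, in input order.
vfio_options_line() {
    local ids
    ids=$(grep -oE '\[10de:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | awk '!seen[$0]++' \
        | paste -sd, -)
    # Fail without printing anything if no Nvidia device was seen.
    [ -n "$ids" ] || return 1
    printf 'options vfio-pci ids=%s\n' "$ids"
}
```

For example, the output of lspci -nn can be piped into vfio_options_line and the result compared with, or redirected to, /etc/modprobe.d/vfio.conf.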

Disable Nvidia Driver on Host

To prevent the host from loading the Nvidia driver, add the Nvidia modules to the kernel’s blocklist:

$ vim /etc/modprobe.d/blacklist.conf

Add the following lines:

blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset

Update the initramfs and reboot:

$ update-initramfs -u
$ reboot
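
After the reboot, it is worth confirming that the GPU is bound to vfio-pci and not to an Nvidia driver before moving on to the VM. The sketch below parses the output of lspci -nnk for that purpose; the function name bound_drivers is hypothetical.

```shell
#!/usr/bin/env bash
# Read `lspci -nnk` output on stdin and print "<pci-address> <driver>"
# for every function that has a kernel driver bound to it.
bound_drivers() {
    awk '
        /^[0-9a-f]/             { addr = $1 }        # device header line
        /Kernel driver in use:/ { print addr, $NF }  # indented detail line
    '
}
```

Running lspci -nnk -d 10de: and piping it into bound_drivers should report vfio-pci for both 0005:01:00.0 and 0005:01:00.1 if the configuration took effect. A function with no driver bound produces no line at all.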

Configure VM

Install Virt-manager

This article uses Virt-manager as the VM manager. The first step is to install it; Ubuntu should pull in all the related packages as dependencies.

$ apt install virt-manager 

If you’re using SSH with X11 forwarding (e.g., ssh -X host) or MobaXTerm on Windows, Virt-manager will display the remote X window. If neither method works, consider installing a KDE desktop on the host and accessing it via the BMC remote console.

(Optional) Install KDE Plasma Desktop:

$ apt install kde-plasma-desktop

Create VM image

Virt-manager creates fixed-size VM images by default. If you prefer dynamic allocation, create the image manually:

$ qemu-img create -f qcow2 ubuntu2204.qcow2 200G

Add Nvidia device to VM

If the host is configured correctly, Virt-manager will list all PCIe devices, including the Nvidia GPU. Add the GPU and its audio device (e.g., 0005:01:00.0 and 0005:01:00.1) to the VM’s hardware list.
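
Virt-manager stores each added PCI device as a hostdev element in the VM's libvirt XML. For reference, or for attaching the device with virsh instead of the GUI, the sketch below prints that element for a given PCI address. The function name hostdev_xml is hypothetical, and it does not validate its input.

```shell
#!/usr/bin/env bash
# Print a libvirt <hostdev> element for a PCI address given as
# domain:bus:slot.function, for example 0005:01:00.0.
hostdev_xml() {
    local addr="$1" domain bus slot func rest
    domain=${addr%%:*}; rest=${addr#*:}
    bus=${rest%%:*};    rest=${rest#*:}
    slot=${rest%%.*};   func=${rest#*.}
    printf "<hostdev mode='subsystem' type='pci' managed='yes'>\n"
    printf "  <source>\n"
    printf "    <address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n" \
        "$domain" "$bus" "$slot" "$func"
    printf "  </source>\n"
    printf "</hostdev>\n"
}
```

The output can be saved to a file, say gpu.xml, and attached with virsh attach-device ubuntu2204 gpu.xml --config, where ubuntu2204 is a placeholder for the name of your VM.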

After adding the devices, click Begin Installation to install Ubuntu 22.04.

Disable Secure Boot in UEFI

By default, Virt-manager enables Secure Boot. However, Nvidia drivers may not work with Secure Boot enabled. Even though the Nvidia installer includes a driver signing feature, the driver may still fail to load. To avoid issues, disable Secure Boot in the VM’s UEFI settings.

During the VM’s boot process, press the DEL key to enter UEFI settings and uncheck the Secure Boot option.

Before installing the Nvidia driver, ensure the necessary development packages are installed:

$ apt install build-essential

Then, install the Nvidia driver and reboot the VM.
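
Once the VM is back up, a quick way to tell whether the driver loaded is to look for the nvidia module in the lsmod output inside the guest. A minimal sketch follows; the function name nvidia_module_loaded is hypothetical.

```shell
#!/usr/bin/env bash
# Read `lsmod` output on stdin; succeed only if a module named exactly
# "nvidia" is listed in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}
```

For example, lsmod | nvidia_module_loaded && echo "driver loaded" prints the message only when the module is present. If it is missing, the guest's dmesg output is the next place to look.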

GPU Passthrough Test

If everything is set up correctly, running nvidia-smi should display the GPU’s status.

For testing, you can use Ollama with the DeepSeek-R1 model. Install Ollama with:

$ curl -fsSL https://ollama.com/install.sh | sh

Pull and run the DeepSeek-R1 model. Since the GPU has 16GB of memory, the 14B model is a good choice (it requires ~10GB):

$ ollama run deepseek-r1:14b

Ask a question like, “Why is the sky blue?” This will trigger the model’s Chain-of-Thought (CoT) reasoning.

Monitor the GPU’s status using nvidia-smi to ensure it’s functioning correctly.
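
To watch memory usage while the model answers, nvidia-smi can print just the numbers in CSV form. The sketch below turns one such line into a percentage. The function name gpu_memory_percent is hypothetical, and the input is assumed to be "used, total" in MiB, as printed by the query shown in the note after the code.

```shell
#!/usr/bin/env bash
# Read lines of "used, total" (MiB) on stdin and print the share of
# GPU memory in use as a percentage with one decimal place.
gpu_memory_percent() {
    awk -F', *' '$2 > 0 { printf "%.1f\n", $1 / $2 * 100 }'
}
```

Inside the VM, the input can come from nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits piped into gpu_memory_percent. With the 14B model loaded, the value should rise well above its idle level.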
