InstructLab toolbox container for AMD ROCm GPUs

The ROCm container file is designed for AMD GPUs with the RDNA3 architecture (gfx1100). The container can also be built for RDNA2 (gfx1030) and older GPUs. Please refer to AMD’s system requirements for the list of officially supported cards; ROCm is known to work on more consumer GPUs than that list includes.

The container file creates a toolbox container for the toolbox(1) command line tool. A toolbox container has seamless access to the entire system, including the user’s home directory, networking, hardware, SSH agent, and more.

The container has all Python dependencies installed in a virtual env. The virtual env is already activated when you enter the container.
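
A quick sanity check after entering the toolbox (optional; it only confirms that the pre-activated virtual env is the one in use):

python3 -c 'import sys; print(sys.prefix)'   # prints the venv path rather than /usr
type ilab                                    # the CLI should resolve to the venv's bin directory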

Quick start

  1. git clone the instructlab and taxonomy projects into a common folder in your home directory (e.g. ~/path/to/instructlab)

  2. add your account to the render and video groups: sudo usermod -a -G render,video $LOGNAME

  3. install the build dependencies for this container: sudo dnf install toolbox podman make rocminfo

  4. build the container: make rocm or make rocm-gfx1100

  5. create a toolbox: make rocm-toolbox

  6. enter the toolbox: toolbox enter instructlab. The container has your home directory mounted. (The whole sequence is consolidated below.)
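
The same steps as one copy-and-paste sequence. The repository URLs and the location of the container Makefile are assumptions here; adjust them to your checkout:

mkdir -p ~/path/to/instructlab && cd ~/path/to/instructlab
git clone https://github.com/instructlab/instructlab.git
git clone https://github.com/instructlab/taxonomy.git
sudo usermod -a -G render,video $LOGNAME        # log out and back in for the new groups to apply
sudo dnf install toolbox podman make rocminfo
cd instructlab/containers/rocm                  # assumed path to the ROCm Containerfile and Makefile
make rocm-gfx1100                               # or: make rocm
make rocm-toolbox
toolbox enter instructlab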

To update the InstructLab CLI to the latest version: pip install -e ~/path/to/instructlab/instructlab

ilab data generate and ilab model chat use the GPU automatically. ilab model train needs a more powerful and recent GPU and therefore does not use the GPU by default. To train on a GPU, run ilab model train --device cuda.
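
For reference, run inside the toolbox (all three commands are described above):

ilab data generate               # uses the ROCm GPU automatically
ilab model chat                  # uses the ROCm GPU automatically
ilab model train --device cuda   # training needs the explicit device flag;
                                 # ROCm builds of PyTorch expose the GPU as the "cuda" device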

Building for other GPU architectures

Use the amdgpu-arch or rocminfo tool to get the GPU’s short name:

sudo dnf install clang-tools-extra rocminfo
amdgpu-arch            # prints the short name, e.g. gfx1100
rocminfo | grep gfx    # alternative: look for the gfx entries in the rocminfo output

Map the name to an LLVM GPU target and an override GFX version. PyTorch 2.2.1+rocm5.7 provides a limited set of rocBLAS kernels. Fedora 40’s ROCm packages have more kernels. For now we are limited to what the PyTorch binaries provide until Fedora ships python-torch with ROCm support.

| Name    | XNACK/USM | Version | PyTorch | Fedora |
| ------- | --------- | ------- | ------- | ------ |
| gfx900  |           | 9.0.0   | ✅      | ✅     |
| gfx906  | xnack-    | 9.0.6   | ✅      | ✅     |
| gfx908  | xnack-    | 9.0.8   | ✅      | ✅     |
| gfx90a  | xnack-    | 9.0.10  | ✅      | ✅     |
| gfx90a  | xnack+    | 9.0.10  | ✅      | ✅     |
| gfx940  |           |         | ❌      | ✅     |
| gfx941  |           |         | ❌      | ✅     |
| gfx942  |           |         | ❌      | ✅     |
| gfx1010 |           |         | ❌      | ✅     |
| gfx1012 |           |         | ❌      | ✅     |
| gfx1030 |           | 10.3.0  | ✅      | ✅     |
| gfx1100 |           | 11.0.0  | ✅      | ✅     |
| gfx1101 |           |         | ❌      | ✅     |
| gfx1102 |           |         | ❌      | ✅     |

If your card is not listed or is unsupported, try the closest smaller value; e.g. for gfx1031, use target gfx1030 and override 10.3.0. See ROCm/ROCR-Runtime isa.cpp and the LLVM User Guide for AMDGPU for more information.

| Marketing Name         | Name    | Arch  | Target  | GFX version | Memory | Chat | Train |
| ---------------------- | ------- | ----- | ------- | ----------- | ------ | ---- | ----- |
| AMD Radeon RX 7900 XT  | gfx1100 | RDNA3 | gfx1100 | 11.0.0      | 20 GiB | ✅   | ✅    |
| AMD Radeon RX 7900 XTX | gfx1100 | RDNA3 | gfx1100 | 11.0.0      | 24 GiB | ✅   | ✅    |
| AMD Radeon RX 6700     | gfx1031 | RDNA2 | gfx1030 | 10.3.0      | 10 GiB | ✅   | ❌    |
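
At runtime, the override GFX version is what ROCm’s HSA_OVERRIDE_GFX_VERSION environment variable expects. A minimal sketch for a card that needs the fallback, such as the RX 6700 (gfx1031) above; whether the container already sets this variable for you depends on the Containerfile:

rocminfo | grep gfx                       # confirm what the GPU reports, e.g. gfx1031
export HSA_OVERRIDE_GFX_VERSION=10.3.0    # use the kernels of the closest supported target (gfx1030)
ilab model chat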

Build the container with additional build arguments:

make rocm-gfx1100 BUILD_ARGS=
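
As a sketch, assuming the Makefile forwards BUILD_ARGS to podman build and the Containerfile accepts build arguments for the LLVM target and the override version (the names AMDGPU_TARGETS and HSA_OVERRIDE_GFX_VERSION below are hypothetical; check the Containerfile for the real ones):

# Hypothetical build-arg names -- verify them against the Containerfile before use.
make rocm BUILD_ARGS="--build-arg AMDGPU_TARGETS=gfx1030 --build-arg HSA_OVERRIDE_GFX_VERSION=10.3.0"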

Known issues

The AMD Instinct MI210 with ISA amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- is not supported by the Fedora build rocblas-6.0.2-3. As of late April 2024, Fedora ships kernels for gfx90a:xnack+ and gfx90a:xnack- but lacks gfx90a:sramecc+:xnack-.
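
To check whether the installed rocBLAS build ships kernels for your GPU’s exact ISA, compare the ISA string reported by rocminfo with the kernel files on disk (/usr/lib64/rocblas/library is the usual Fedora location and is an assumption here):

rocminfo | grep -i amdgcn                       # e.g. amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack-
ls /usr/lib64/rocblas/library/ | grep gfx90a    # lists the gfx90a variants rocBLAS was built for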