Guide to GPU Setup and Configuration#

This short guide documents what QuantEcon has learnt when setting up servers to support GPU-based workflows, including use of the new jax library.

Historically, jax has been very sensitive to the hardware environment, in particular the mix of versions of:

  1. Nvidia Drivers

  2. CUDA

  3. cuDNN

OS#

The OS of choice has been Ubuntu.

We opt for the latest Desktop LTS release, as it provides a graphical environment for using the machine locally as a terminal, in addition to remote access to the server via ssh.

Installing ssh#

The desktop version does not come with an ssh server installed.

We follow this guide

sudo apt-get install openssh-server

and then enable the service

sudo systemctl enable ssh --now
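To confirm the server is actually running after these two commands, a small check along the following lines can help. This is only a sketch: `check_ssh_service` is a name made up for this guide, and it assumes the service unit is called `ssh`, as in the enable command above.

```shell
# Report whether the ssh service is running.
# check_ssh_service is a hypothetical helper, not part of any package.
check_ssh_service() {
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found"
        return 2
    fi
    if systemctl is-active --quiet ssh; then
        echo "ssh is active"
    else
        echo "ssh is not active"
        return 1
    fi
}
```

Running `check_ssh_service` on the server should print `ssh is active` once the service has been enabled and started.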

Installing tailscale#

Warning

You will need credentials to add the server to the tailscale network.

We need to provide access to the server using the private QuantEcon tailscale network

There are great installation instructions here

curl -fsSL https://tailscale.com/install.sh | sh
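The install script only installs the client; the machine still has to join the network, which is the step that needs the credentials mentioned in the warning. A rough sketch of that step is below. The function name `tailscale_connect` is made up for this guide, and it assumes the login is done interactively through the URL that `tailscale up` prints.

```shell
# Join the tailscale network and show the result.
# tailscale_connect is a hypothetical helper name.
tailscale_connect() {
    if ! command -v tailscale >/dev/null 2>&1; then
        echo "tailscale is not installed"
        return 1
    fi
    sudo tailscale up   # prints a login URL for authentication
    tailscale status    # lists the machines visible on the network
    tailscale ip -4     # prints this machine's tailscale IPv4 address
}
```

Calling `tailscale_connect` once after the install should be enough; the address printed at the end is the one to use for ssh access over the private network.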

Installing nvidia cuda and cudnn#

In my experience, having control over the nvidia cuda and cudnn installs works better than trying to get these libraries through apt.

Tip

Getting access to cudnn will require an nvidia developer account

You can get the CUDA installer from here

Select the Operating System, Architecture, Distribution, and then Version.

You will then be asked for the Installer Type; I always use deb (local).

This will then present a set of install instructions such as:

../_images/nvidia-cuda-download.png

Tip

CUDA comes packaged with a set of nvidia drivers.

For example: cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb comes with the 530.30.02 set of nvidia drivers.

It is good to use this installer to set up the driver, as the two are released together, but if the bundled driver does not work you may need to install the drivers separately. See Installing nvidia drivers.
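To see at a glance which driver a given installer bundles, the file name can be split into its parts. The sketch below assumes the naming pattern seen in the example above (`<name>_<cuda version>-<driver version>-<revision>_<arch>.deb`); other releases may be named differently, and `cuda_deb_versions` is a name made up for this guide.

```shell
# Split a CUDA local-repo .deb file name into its CUDA and driver versions.
# Assumes the pattern <name>_<cuda>-<driver>-<revision>_<arch>.deb
cuda_deb_versions() {
    local file="${1##*/}"       # strip any leading directory
    local middle="${file#*_}"   # drop everything up to the first underscore
    middle="${middle%_*}"       # drop the trailing _<arch>.deb
    local cuda="${middle%%-*}"  # text before the first hyphen
    local rest="${middle#*-}"   # driver version plus package revision
    local driver="${rest%-*}"   # drop the trailing -<revision>
    echo "cuda=${cuda} driver=${driver}"
}

cuda_deb_versions cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
```

For the file name above this prints `cuda=12.1.0 driver=530.30.02`, which can be compared against the driver version reported by `nvidia-smi` after the install.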


To install cudnn you need to log in with a developer registration, click on cuDNN Download, and then agree to the Terms and Conditions.

../_images/nvidia-cudnn-download.png

Make sure you select the download that matches your version of Ubuntu.

Tip

The install instructions link at the top of that page is useful to help get the deb file installed.

Installing jax#

Once you have installed the cuda and cudnn libraries successfully, you should then install jax using the CUDA installed locally instructions.

pip install --upgrade "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

This currently requires cuda>=12 and cudnn>=8.8.
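The version requirement and the final result can both be checked from the shell. The sketch below is one way to do it, not an official procedure: `version_ge` and `jax_devices` are names made up for this guide, the version numbers in the example are typed in by hand, and the comparison relies on GNU `sort -V`, which ships with Ubuntu.

```shell
# Return success when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the backend and the devices that jax can see.
jax_devices() {
    python3 -c 'import jax; print(jax.default_backend(), jax.devices())'
}

# Example values typed in by hand; substitute the versions you installed.
version_ge "12.1.0" "12"  && echo "cuda ok"
version_ge "8.9.0"  "8.8" && echo "cudnn ok"
```

After the pip install, running `jax_devices` should report a gpu backend; if it reports cpu, jax has not picked up the local cuda install.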

Installing nvidia drivers#

Installing the drivers is relatively straightforward on linux.

You should check the recommended version using the nvidia driver tool

Then use the ubuntu graphical tool for installing Additional drivers

This can be accessed as a tab through Software Update
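For a server that is mostly reached over ssh, the same information is available from the command line. The sketch below is an alternative to the graphical route, not what was used above: it assumes the `ubuntu-drivers` tool (from Ubuntu's ubuntu-drivers-common package) is present, and `show_nvidia_driver_info` is a name made up for this guide.

```shell
# Show the recommended driver and the driver currently loaded.
# show_nvidia_driver_info is a hypothetical helper name.
show_nvidia_driver_info() {
    if command -v ubuntu-drivers >/dev/null 2>&1; then
        # Lists detected hardware and marks one driver as "recommended".
        ubuntu-drivers devices
    else
        echo "ubuntu-drivers not found"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Prints the GPU name and the version of the loaded driver.
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    else
        echo "nvidia-smi not found"
    fi
}
```

If `nvidia-smi` is missing or reports no devices after a reboot, the driver install has probably not taken effect.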

Using eGPU with Ubuntu#

This blog post is an excellent resource to help diagnose issues when using external GPUs (via Thunderbolt).

Specifically, disable Wayland in favour of X11 by editing /etc/gdm3/custom.conf and uncommenting

WaylandEnable=false

Then edit /usr/share/X11/xorg.conf.d/10-nvidia.conf to enable external GPU support by adding

Option "AllowExternalGpus" "True"
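Both edits can be scripted and checked. The sketch below tries them on scratch files rather than the live configs. The helper names `disable_wayland` and `has_external_gpu_option` are made up for this guide, the sample file contents are illustrative only, and the `sed -i` form assumes GNU sed as shipped with Ubuntu.

```shell
# Uncomment WaylandEnable=false in a GDM config file (path given as $1).
disable_wayland() {
    sed -i 's/^[[:space:]]*#[[:space:]]*WaylandEnable=false/WaylandEnable=false/' "$1"
}

# Succeed if the file contains an uncommented AllowExternalGpus option.
has_external_gpu_option() {
    grep -Eq '^[[:space:]]*Option[[:space:]]+"AllowExternalGpus"' "$1"
}

# Scratch files standing in for the real configs (contents are illustrative).
gdm_conf="$(mktemp)"
xorg_conf="$(mktemp)"
printf '[daemon]\n#WaylandEnable=false\n' > "$gdm_conf"
printf 'Section "OutputClass"\n    Identifier "nvidia"\n    Option "AllowExternalGpus" "True"\nEndSection\n' > "$xorg_conf"

disable_wayland "$gdm_conf"
cat "$gdm_conf"
has_external_gpu_option "$xorg_conf" && echo "external GPU option present"
```

On the real files the same commands need `sudo`, and the Option line has to sit inside the existing Section block of 10-nvidia.conf. After a reboot, `echo $XDG_SESSION_TYPE` should print `x11` if Wayland has been disabled.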

These updates greatly improved the stability of using the eGPU for the quantecon server.