Intro to GPU computing

Overview

Teaching: 30 min
Exercises: 0 min

Questions

What is a GPU?

How do I get access to a GPU?

GPU or CPU?

How do I develop code for GPU computing?

Objectives

Understand why GPUs are cool

Setting up a development environment

This episode introduces a few ways we can set up a working GPU environemnt. You will find to get a well-functioning CUDA development environment there is a precious balance between your operating system, your model of GPU, your NVIDIA GPU driver, and your CUDA version that straddle across all the other pieces (not to mention the software, e.g. tensorflow, python, keras, matlab, that depends on the underlying set-up).

What is a GPU?

A Graphics Processing Unit as opposed to a Central Processing Unit (CPU). Originally intended for sending an image to the pixels on the screen, someone then figured out that by treating data as a texture map you could perfom lots of calculations simultanously, and the age of the GPGPU (general-purpose graphics processing unit) began.

How do I get access to a GPU?

Your PC!

Sydney University HPC Artemis

Sydney University Virtual Research Desktop Argus

NCI

Pawsey

Other clouds like GCP, AWS, Azure, NGC, etc

Why do I want to use a GPU?

We can visualise the CPU and GPU as something like this:

CPU strengths	GPU strengths
Very large main memory	High bandwidth main memory
Very fast clock speeds	Latency tolerant via parallelism
Latency optimized via large caches	Significantly more compute resources
Small number of threads can run very quickly	High throughput
	High performance/watt

CPU weaknesses	GPU weaknesses
Relatively low memory bandwidth	Relatively low memory capacity
Low performance/watt	Low per-thread performance

An informative way to compare CPU and GPU computing comparing speed and throughput.

Some keywords

A thread is like a single computational task or instruction or kernel that is run on the GPU (or CPU).

A thread block is a group of threads in the same location on the GPU so they can communicate easily with each other.

A grid is the collection of thread blocks.

When executing your kernel (i.e. function), the typical CUDA command looks like this:

//The numbers possible here are dependent on your GPU hardware.
YourKernel<<<NumberOfBlocks,NumberOfThreadsPerBlock>>>(...) 

Use device query (or gpuQuerey in Matlab) to see some details about your GPU.

Many different ways to use and interface GPUs.

Having proposed GPU (or CPU) compute time on grant applications can show you are conducting accelerated research! Get in touch with us for help estimating your resource usage, or for any assitance, https://informatics.sydney.edu.au/

How do you develop code for GPU computing?

The most common langauges for developing code for GPUs are CUDA, OpenCL, and OpenACC. These are all low-level and require fairly strong programming capabilities. However, high-level languages like Python and Matlab and subsequent packages within them (keras, tensorflow, theano, etc), can make use of your GPU by essentially writing CUDA (or OpenCL) for you!

We will see a few examples and you can decide what language best suits your use cases. We will be focusing on CUDA-capable cards (i.e NVIDIA). OpenCL works on the other most popular GPU card, AMD Radeons (Mac GPU of choice). Plus there has been a push from mobile markets to get more into the GPU space, so now companies like ARM and Intel (which also support OpenCL) are starting to have more of an impact on GPU computing.

Key Points

GPU vs CPU

Suitable problems for GPU

CUDA, and high level GPU environments

Where to get one

lesson home

Introduction to GPU computing on HPC

next episode