Parallel Python: multithreading, multiprocessing and parallel python programming for HPC: Glossary

Key Points

1. Accelerating Python
  • SIH is available to help researchers with their research!

  • There are many ways to make Python go faster.

2. Simple methods
  • Understand that there are different ways to accelerate Python code

  • The best method depends on your algorithms, code and data

3. Connecting to Artemis HPC
  • Several methods and tools to connect to a remote machine

  • Get access to more resources than your local computer

4. Traditional Python approaches to multiple CPUs and nodes
  • Load the multiprocessing library to execute a function in parallel
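
As a minimal sketch of the multiprocessing approach, the example below uses the standard library's multiprocessing.Pool to apply one function to many inputs across worker processes. The function square and the wrapper parallel_squares are hypothetical names chosen for illustration, and the process count of 2 is an arbitrary assumption, not a recommendation.

```python
from multiprocessing import Pool


def square(x):
    """Toy stand-in for a real CPU-bound function (hypothetical)."""
    return x * x


def parallel_squares(values, processes=2):
    """Apply square() to every item in values using a pool of worker processes."""
    # Pool.map distributes the inputs across the workers and
    # returns the results in the same order as the inputs.
    with Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard stops worker processes from re-running this block
    # when they import the main module.
    print(parallel_squares(range(5)))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes.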

5. Intro to Dask and Dask Dataframes
  • Dask builds on the NumPy and pandas APIs but operates in parallel

  • Computations are lazy by default and must be triggered explicitly; this avoids unnecessary computation
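
To illustrate lazy evaluation, here is a small sketch using dask.array, which mirrors the NumPy API. The array size and chunk size are arbitrary assumptions chosen to keep the example small; the same build-then-compute pattern should apply to Dask DataFrames.

```python
import dask.array as da

# Build a chunked array and an expression on it.
# Nothing is computed yet: Dask only records a task graph.
x = da.arange(1000, chunks=100)
total = (x * 2).sum()

# compute() triggers execution of the graph, chunk by chunk.
result = total.compute()
print(result)
```

Until compute() is called, total is a lazy Dask object rather than a number, so intermediate results that are never requested are never calculated.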

6. Introducing Dask Bag and more Dask examples
  • Dask Bag applies map, filter and group-by operations to Python objects or semi-structured/unstructured data

  • Dask Bag uses dask.multiprocessing under the hood

  • Use Xarray for holding scientific data
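
The Dask Bag points above can be sketched as follows: a bag is built from a list of Python dictionaries, then filtered and mapped lazily. The records and the helper scaled_even_values are hypothetical, made up for illustration. Because Bag runs on a process-based scheduler by default, the call to compute() is kept behind a main guard in this sketch.

```python
import dask.bag as db


def scaled_even_values(records, npartitions=2):
    """Keep records with an even 'value' and return those values times 10."""
    bag = db.from_sequence(records, npartitions=npartitions)
    # filter and map are lazy; they only extend the task graph.
    evens = bag.filter(lambda r: r["value"] % 2 == 0)
    scaled = evens.map(lambda r: r["value"] * 10)
    # compute() runs the graph and returns a plain Python list.
    return scaled.compute()


if __name__ == "__main__":
    # Hypothetical semi-structured records.
    records = [{"name": f"item{i}", "value": i} for i in range(1, 7)]
    print(scaled_even_values(records))
```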

Glossary

FIXME