Parallel Python

This course is aimed at researchers, students, and industry professionals who want to learn intermediate Python skills applied to scientific computing and data science.

Trainers

  • Kristian Maras (Kris) (MSc Mathematics / Ba Commerce)
  • Thomas Mauch (Tom) (PhD in astronomy)
  • Nathaniel (Nate) Butterworth (PhD Computational Geophysics)

Course pre-requisites and setup requirements

Introductory Python experience recommended.

Code of Conduct

We expect all attendees of our training to follow our code of conduct, including its bullying, harassment and discrimination prevention policies.

In order to foster a positive and professional learning environment we encourage the following kinds of behaviours at all our events and on our platforms:

  • Use welcoming and inclusive language
  • Be respectful of different viewpoints and experiences
  • Gracefully accept constructive criticism
  • Focus on what is best for the community
  • Show courtesy and respect towards other community members

Our full CoC, with incident reporting guidelines, is available here.

General session timings

  • A. Introduction and revision of Python data manipulation and Pandas data structures
  • B. Why Polars is a better option for dataframes
  • C. How Dask provides an ecosystem of tools that can run on clusters of machines

Setup Instructions

For local installation:

git clone https://github.com/Sydney-Informatics-Hub/ParallelPython.git

cd ParallelPython

conda env create -f environment.yml

conda activate parallel

Google Colab:

Alternatively, you can use Google Colab, which requires you to sign in to your Google account. Go to Google Colab and click “New notebook”. Colab is very similar to Jupyter Notebook, except that the code runs on Google Cloud infrastructure.

Most packages are installed by default. If you need an additional package, run pip install with the “!” prefix, e.g. !pip install ucimlrepo. The “!” prefix passes the command to the underlying terminal.
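To see which packages still need installing, a quick check like the one below may help. It is only a sketch: the helper name missing_packages and the package list are assumptions made for illustration, not part of the course material. Note also that a package's import name can differ from its pip name.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list for illustration; adjust it to what your notebook needs.
wanted = ["pandas", "polars", "dask", "ucimlrepo"]
missing = missing_packages(wanted)

if missing:
    # In a Colab or Jupyter cell, run the printed line to install what is missing.
    print("!pip install " + " ".join(missing))
else:
    print("All requested packages are already installed.")
```

The same check works in a local conda environment. There, run the printed command in a terminal without the “!” prefix.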