- Thu 27 July 2023
- unix
- Georgie Samaha
- #rclone, #onedrive, #sharepoint
Since the death of Cloudstor (RIP), we've had to rely on OneDrive to share data with external collaborators who cannot gain access to USyd-supported compute platforms. When working with large datasets, uploading and downloading data to OneDrive requires a CLI solution that can be automated and sped up.
While OneDrive has its own interface for accessing the OneDrive API from the command line, its functionality is limited. Rclone can also interact with the OneDrive API, supports other cloud services beyond OneDrive, and offers advanced features like synchronising directories and verifying data using checksums.
Configuration when you don't have sudo permissions (i.e. on Artemis/RDS)
NOTE: on Artemis/RDS you will not be able to open the authentication link that rclone provides. To work around this, you'll have to authorise rclone from your local machine, so make sure rclone is installed on a machine with external network access and a web browser before you start this process. See the section below for installation instructions.
If on Artemis/RDS, load the module:
module load rclone/1.62.2
Then start an interactive session to configure your installation and follow the steps below to configure for OneDrive:
rclone config
- Select new remote (n)
- Name your remote (e.g. Georgie-OneDrive)
- Select 'onedrive' as storage type
- Leave client ID and secret empty (hit enter a few times)
- Do not edit advanced config (n)
- When asked whether or not to use a web browser to authenticate rclone, select no (n)
- On your local machine (requires rclone to be installed), run:
rclone authorize "onedrive"
- Copy the provided secret token and paste it into the Artemis/RDS terminal
- Select 'onedrive' as config type
- Specify which drive to use
- Accept configuration (y)
- Quit config (q)
Configuration when you do have sudo permissions
If you are on a VM or local machine, start by installing rclone (the command below is for Debian/Ubuntu systems):
sudo apt install rclone
Then start an interactive session to configure your installation and follow the steps below to configure for OneDrive:
rclone config
- Select new remote (n)
- Name your remote (e.g. Georgie-OneDrive)
- Select 'onedrive' as storage type
- Leave client ID and secret empty (hit enter a few times)
- Do not edit advanced config (n), use auto config (y)
- Open link in your browser, accept permission request
- Back on CLI, select onedrive
- Specify which drive to use
- Accept configuration (y)
- Quit config (q)
Transfer data
Once the remote is configured, you can view your rclone configuration with:
rclone config show
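Beyond printing the configuration, it can be worth checking that the remote actually responds before moving large files. A minimal sketch, assuming the remote was named Georgie-OneDrive as in the configuration steps above (substitute your own name):

```shell
# Name of the remote chosen during `rclone config` (assumed; use your own)
remote="Georgie-OneDrive"

if command -v rclone >/dev/null 2>&1; then
    # List all configured remotes; the new one should appear with a trailing colon
    rclone listremotes
    # List the top-level directories on the remote (fails if authentication is broken)
    rclone lsd "${remote}:"
    # Show used and free space on the remote
    rclone about "${remote}:"
else
    echo "rclone not found; install it or load the module first" >&2
fi
```

If `rclone lsd` returns an authentication error, re-running `rclone config` and repeating the authorisation step is usually the simplest fix.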
To transfer data to your configured OneDrive account, simply run:
rclone copy <source>:<path> <dest>:<path>
For example:
rclone copy ./genome-1.bam Georgie-OneDrive:Documents/Genome-1
Here I am transferring the file genome-1.bam to my pre-configured OneDrive remote named Georgie-OneDrive, at the path Documents/Genome-1.
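For whole directories, it can help to preview the transfer first and verify it afterwards. The following is a sketch only: `./genome-data` is a hypothetical local directory, the remote name and path are taken from the example above, and the choice of 8 parallel transfers is arbitrary (rclone's default is 4).

```shell
remote="Georgie-OneDrive"         # remote name from the example above
remote_path="Documents/Genome-1"  # OneDrive path from the example above
src="./genome-data"               # hypothetical local directory
dest="${remote}:${remote_path}"

if command -v rclone >/dev/null 2>&1; then
    # Preview what would be transferred without copying anything
    rclone copy --dry-run "${src}" "${dest}"
    # Copy with live progress and up to 8 parallel file transfers
    rclone copy --progress --transfers 8 "${src}" "${dest}"
    # Compare sizes and hashes of the source and destination files
    rclone check "${src}" "${dest}" && echo "Transfer verified"
else
    echo "rclone not found; install it or load the module first" >&2
fi
```

`rclone sync` takes the same arguments as `rclone copy`, but it also deletes files at the destination that are not present in the source, so running it with `--dry-run` first is a sensible precaution.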
To transfer data from RDS or Artemis to OneDrive via a PBS job using the dtq queue:
#!/bin/bash
#PBS -P SIHsandbox
#PBS -N transfer
#PBS -q dtq
#PBS -l select=1:ncpus=1:mem=20gb
#PBS -l walltime=05:00:00
#PBS -W umask=022
#PBS -j oe
#PBS -m e
#PBS -M georgina.samaha@sydney.edu.au
module load rclone
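source= # file/directory to transfer
destination= # name of the configured rclone remote (e.g. Georgie-OneDrive)
path= # OneDrive path to copy into
# Quote the variables so paths containing spaces are passed intact
rclone copy "${source}" "${destination}:${path}"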
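Submitting the job might then look like the sketch below. It assumes the script above has been saved as `transfer.pbs`, which is a hypothetical filename, and that it is run from a node where the PBS commands `qsub` and `qstat` are available.

```shell
script="transfer.pbs"  # hypothetical filename for the PBS script above

if command -v qsub >/dev/null 2>&1; then
    # Submit the job; qsub prints the job ID on success
    job_id=$(qsub "${script}")
    echo "Submitted ${job_id}"
    # Show the status of your own queued and running jobs
    qstat -u "${USER}"
else
    echo "qsub not found; run this on a node with PBS available" >&2
fi
```

Because the script sets `#PBS -j oe` and `#PBS -N transfer`, the combined standard output and error should typically end up in a single file named after the job (for example `transfer.o` followed by the job number), which is the first place to look if the transfer fails.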