Custom workflows

Summary

Welcome to Session 2.

The harvest() function used in the last session was a wrapper for several download_*() functions already available in dataharvester (see source). In this session, we will show you how to call these functions directly for more control over the data download process.

  • download_dem() for Geoscience Australia DEM grid data
  • download_slga() for Soil and Landscape Grid Australia (SLGA) - Soil Attributes
  • download_landscape() for SLGA - Landscape Attributes
  • download_radiometric() for Geoscience Australia’s Radiometric Map of Australia
  • download_silo() for Scientific Information for Land Owners (SILO) climate data
  • collect_ee(), preprocess_ee() and download_ee() for access to datasets available in the Google Earth Engine Data Catalog

Bounding boxes

Tip

The bounding box is defined as c(xmin, ymin, xmax, ymax) in EPSG:4326

Let’s define some areas of interest as bounding boxes for the rest of this Demo:

# Australian
llara_nsw <- c(149.769, -30.335, 149.969, -30.135)  
corrigin_wa <- c(118.015, -32.356, 118.215, -32.156)  
nedscorner_vic <- c(141.215, -34.241, 141.415, -34.0414) 

# Earth Engine only
atlantis <- c(-11.70, 20.95, -11.10, 21.35)

You can try to find the bounding box for your area of interest using this bounding box tool. Make sure to select “CSV” output for easier copy and paste, and to pick an Australian location (unless using Google Earth Engine functions).
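
Because the order of the four numbers matters, it can help to check a bounding box before passing it to a download function. The following is a minimal sketch in base R; check_bbox() is a hypothetical helper written for this example and is not part of dataharvester.

```r
# Hypothetical helper (not part of dataharvester): check that a bounding box
# follows the c(xmin, ymin, xmax, ymax) convention in EPSG:4326 degrees.
check_bbox <- function(bbox) {
  stopifnot(is.numeric(bbox), length(bbox) == 4, !anyNA(bbox))
  names(bbox) <- c("xmin", "ymin", "xmax", "ymax")
  if (bbox[["xmin"]] >= bbox[["xmax"]]) stop("xmin must be smaller than xmax")
  if (bbox[["ymin"]] >= bbox[["ymax"]]) stop("ymin must be smaller than ymax")
  if (any(abs(bbox[c("xmin", "xmax")]) > 180)) stop("longitude must be within [-180, 180]")
  if (any(abs(bbox[c("ymin", "ymax")]) > 90)) stop("latitude must be within [-90, 90]")
  # Return the box unchanged (and unnamed) so it can be passed straight on
  invisible(unname(bbox))
}

llara_nsw <- c(149.769, -30.335, 149.969, -30.135)
check_bbox(llara_nsw)

# A box with its corners swapped is rejected with an informative message
swapped <- c(149.969, -30.135, 149.769, -30.335)
result <- tryCatch(check_bbox(swapped), error = function(e) conditionMessage(e))
print(result)
```

The helper only checks ordering and the valid longitude/latitude range; it does not check whether the box falls inside Australia.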

Demo 1: Manual downloads

Function syntax

Each download_*() function has similar syntax:

download_*(layer,
           out_path,
           bounding_box,
           <<other arguments>>)

where:

  • layer determines the name(s) of the layer(s) to download
  • out_path is the path to the folder where the data should be saved
  • bounding_box is the set of EPSG:4326 coordinates, given as c(xmin, ymin, xmax, ymax), that defines the area to download

For example, to download the Radiometric Grid of Australia (Radmap) v4 2019 unfiltered terrestrial dose rate, you would use:

radio <- download_radiometric(
  layer = "radmap2019_grid_dose_terr_awags_rad_2019",
  bounding_box = corrigin_wa,
  out_path = "downloads/session2/"
)
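
Whether the download_*() functions create a missing output folder is not stated here, so creating it up front is a harmless precaution. A minimal base R sketch, assuming the same folder as the example above (written with file.path() so that it has no trailing slash):

```r
# Create the output folder (and any parent folders) if it does not exist yet
out_path <- file.path("downloads", "session2")
if (!dir.exists(out_path)) {
  dir.create(out_path, recursive = TRUE)
}

# After a download, list the GeoTIFF files that ended up in the folder
tifs <- list.files(out_path, pattern = "\\.tif$", full.names = TRUE)
print(tifs)
```

Listing the folder afterwards is also a quick way to confirm the file names that a download produced.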

Finding products to download (layers)

Each download_*() function has a layer argument that accepts a pre-defined list of products. These are documented in the function help, available through help() or ?. For example, to see the available layers for the download_dem() function, you can use:

?download_dem

The documentation will show you the available layers and their descriptions. Try the following:

?download_slga
?download_landscape
?download_radiometric

There are also some additional arguments available to each download_*() function. For example, download_slga() has the arguments depth_min, depth_max and get_ci to specify the minimum and maximum soil depth and whether to download the soil confidence interval (CI) layers:

download_slga(
    layer = "Clay",
    out_path = "downloads/session2/",
    bounding_box = llara_nsw,
    depth_min = 0,
    depth_max = 5,
    get_ci = TRUE
)

Make sure to read the documentation for each function to see what additional arguments are available.
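
To download more than one depth interval, one option is to build the arguments for each call first and then loop over them. The sketch below assumes the six standard SLGA depth intervals (0-5, 5-15, 15-30, 30-60, 60-100 and 100-200 cm) and uses only the download_slga() arguments shown above. Whether every interval is available for a given layer should be checked in ?download_slga. The run_downloads switch is an addition for this example, so that nothing is downloaded unless you turn it on.

```r
llara_nsw <- c(149.769, -30.335, 149.969, -30.135)

# Standard SLGA depth intervals in cm (assumed to apply to the chosen layer)
slga_depths <- data.frame(
  depth_min = c(0, 5, 15, 30, 60, 100),
  depth_max = c(5, 15, 30, 60, 100, 200)
)

# Build one list of arguments per depth interval
slga_calls <- lapply(seq_len(nrow(slga_depths)), function(i) {
  list(
    layer = "Clay",
    out_path = "downloads/session2/",
    bounding_box = llara_nsw,
    depth_min = slga_depths$depth_min[i],
    depth_max = slga_depths$depth_max[i],
    get_ci = TRUE
  )
})

# Set to TRUE to actually download; requires dataharvester to be installed
run_downloads <- FALSE
if (run_downloads && requireNamespace("dataharvester", quietly = TRUE)) {
  for (call_args in slga_calls) {
    do.call(dataharvester::download_slga, call_args)
  }
}
```

Keeping the argument lists separate from the download step makes it easy to inspect what will be requested before any data is transferred.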

Demo 2: Google Earth Engine

The Google Earth Engine (GEE) API is a cloud-based platform for planetary-scale geospatial analysis. It provides access to a large collection of satellite imagery, elevation data, and other geospatial datasets. We provide curated access to the GEE API through the dataharvester package, which allows you to download data from the GEE Data Catalogue.

Function syntax

Accessing GEE is simple once you have an account and have initialised the dataharvester package:

  • The collect_ee() function is used to search the GEE Data Catalogue and return a list of available datasets based on pre-determined filters.
  • The preprocess_ee() function is used to calculate summaries within the datasets before downloading or mapping.
  • The map_ee() and download_ee() functions are then available to map and download the data.

Workflow

The workflow for using the GEE API is as follows:

  1. Use collect_ee() to search the GEE Data Catalogue and return a list of available datasets based on pre-determined filters.
  2. Use preprocess_ee() to generate summaries and spectral indices.
  3. Use map_ee() to map the data (optional), and download_ee() to download the data.

Important

Note that the order of processing must follow the above, or there may be unintended consequences. If you find any irregular behaviour, let us know! The dataharvester package is in early development and we are keen to fix any undocumented issues.

Let’s use a full example to demonstrate and explain these functions. Make sure that you have initialised the dataharvester package with your GEE credentials:

## Create the GEE object
img <- collect_ee(
  collection = "LANDSAT/LC09/C02/T1_L2",
  coords = llara_nsw,
  date = "2021-06-01",
  end_date = "2022-06-01"
)
## Preprocess - scaling, offsetting and cloud masking are done by default
img <- preprocess_ee(object = img, reduce = "median")

## Preview
img <- map_ee(img, bands = c("SR_B2", "SR_B3", "SR_B4"))

## Download
img <- download_ee(
  img,
  bands = c("SR_B2", "SR_B3", "SR_B4"),
  out_path = "downloads/session2/")

## Alternatively, chain the same steps with the pipe operator (%>%)
img <-
  collect_ee(
    collection = "LANDSAT/LC09/C02/T1_L2",
    coords = llara_nsw,
    date = "2021-06-01",
    end_date = "2022-06-01") %>%
  preprocess_ee(reduce = "median") %>%
  map_ee(bands = c("SR_B2", "SR_B3", "SR_B4")) %>%
  download_ee(bands = c("SR_B2", "SR_B3", "SR_B4"),
    out_path = "downloads/session2/")

Your instructor may demonstrate more examples during the session.

Demo 3: Preview rasters

Agenda

Time to preview the downloaded data. This demo will:

  • Introduce plotting of raster data using terra
  • Discuss how to plot single-band and multi-band rasters

Once data has been downloaded, it can be previewed in a few lines using the terra package, a powerful package for working with raster data in R. First, load terra, which is installed automatically when you install the dataharvester package:

library(terra)

Any image that you have downloaded using the dataharvester package will need to be converted to a SpatRaster object in R using the rast() function. For example, to convert the SLGA image that we downloaded earlier, we can use:

clay <- rast("downloads/session2/SLGA_Clay_0-5cm.tif")

Then, to plot, we can use the plot() function:

plot(clay)

If the image contains multiple bands, you can still plot each band individually using the plot() function. For example, to plot the first band of the Landsat image that we downloaded earlier, we can use:

gee_path <- paste0(img$outpath, img$filenames)
gee_llara <- rast(gee_path)
plot(gee_llara, 1)
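
If you do not have a downloaded image at hand, the same plotting calls can be tried on a synthetic raster. The sketch below builds a three-band SpatRaster with random values over the Llara bounding box; the band names are hypothetical stand-ins and the values are not real imagery.

```r
library(terra)

# Synthetic 3-band raster over the Llara bounding box
# (random values between 0 and 255; band names are hypothetical)
set.seed(1)
demo <- rast(
  nrows = 20, ncols = 20, nlyrs = 3,
  xmin = 149.769, xmax = 149.969,
  ymin = -30.335, ymax = -30.135,
  crs = "EPSG:4326"
)
values(demo) <- matrix(runif(ncell(demo) * 3, min = 0, max = 255), ncol = 3)
names(demo) <- c("blue", "green", "red")

nlyr(demo)          # number of bands in the raster
plot(demo, 1)       # a single band, selected by position
plot(demo, "red")   # a single band, selected by name
plot(demo)          # all bands, one panel each

# Colour composite: tell plotRGB() which bands to use as red, green and blue
plotRGB(demo, r = 3, g = 2, b = 1)
```

For real imagery whose values are not on a 0 to 255 scale, plotRGB() also has a stretch argument (for example stretch = "lin") that rescales the values for display.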

Plotting objects created by download_*() functions

If you use the download_*() functions and save the output to an object, then you can use the plot() function to plot the data immediately. The plots will be identical to those produced with terra, since terra is used to draw them behind the S3 plot method.

As an example, to plot the data downloaded using the download_dea() function, we can use:

dea <- download_dea(
  layer = c("landsat_barest_earth", "ga_ls_fc_pc_cyear_3"),
  bounding_box = llara_nsw,
  out_path = "downloads/session2/",
  years = 2021,
  resolution = 6
)
plot(dea)

The code above will plot all layers and their bands in the same plot.

If the file has already been downloaded and you want to plot it, then you can use the above code using terra, or use the plot_rasters() function:

plot_rasters("downloads/session2/landsat_barest_earth.tif")

Note that plot_rasters() does not currently have the option to view each band individually.

Exercise: More plots

On your own

For objects that are created using the download_*() functions, additional arguments can be used to refine the plots. For example, if you want to plot each layer separately, then you can use the plot() function with the choose argument:

plot(x, choose = 1)

Within a data layer, multiple bands can also be plotted separately. To do so, you can use the plot() function with the index argument:

plot(x, choose = 1, index = 1)
# or
# plot(x, 1, 1)

Task

Using the same code as above to download data from DEA:

dea <- download_dea(
  layer = c("landsat_barest_earth", "ga_ls_fc_pc_cyear_3"),
  bounding_box = llara_nsw,
  out_path = "downloads/session2/",
  years = 2021,
  resolution = 6
)

  1. plot the data using the plot() function
  2. plot only the second layer "ga_ls_fc_pc_cyear_3"
  3. plot only the pv_pc_50 band from the second layer

# 1
plot(dea)

# 2
plot(dea, 2)

# 3
plot(dea, 2, 2)

Wrapping up

This is the end of Session 2. In this session we looked at using manual methods to download data from various API sources and the GEE Catalog. We also briefly looked at how to preview the data. In the next session, we will look at more Google Earth Engine processing and how to extract samples from the downloaded images.