Geodata-Harvester
Jumpstart your analysis with a ready-made set of spatially and temporally aligned raster maps and dataframes.
What is it?
The Geodata-Harvester project provides researchers with reusable workflows and open-source software for automatic data extraction from a wide range of data sources, including spatial-temporal processing. User-provided data is auto-completed with a suitable set of spatially and temporally aligned covariates to form a ready-made dataset for machine learning models. All data layer maps are automatically extracted and aligned for a specific region and time period.
Data Sources
The following data sources are currently integrated:
- Soil and Landscape Grid of Australia (SLGA)
- SILO Climate Database (Australia)
- National Digital Elevation Model (DEM)
- Digital Earth Australia (DEA) Geoscience Earth Observations
- Radiometric Data (Australia)
- Google Earth Engine Data (account needed)
Functionality
The main goal of the Geodata-Harvester is to provide researchers with reusable workflows for automatic data extraction and processing:
- Retrieve: automatically access geospatial and soil data sources with minimal handling of individual APIs
- Process: apply spatial and temporal processing; filter, mask, reduce and convert data
- Output: download data as GeoTIFF files and ready-made data frames for use in additional modelling and machine learning workflows
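To make the Process and Output steps more concrete, the following is a minimal sketch of how point locations might be matched against an aligned raster layer to build table rows. It uses only the Python standard library. The grid layout (north-up, top-left origin), the function names `sample_grid` and `build_table`, and the layer name `elevation` are assumptions made for illustration; they are not the Geodata-Harvester API.

```python
def sample_grid(grid, x_min, y_max, cell_size, points):
    """Return the value of the cell containing each (x, y) point.

    Assumes a north-up grid stored as a list of rows, where grid[0][0] is the
    top-left cell and (x_min, y_max) is the top-left corner of the grid.
    Points outside the grid yield None.
    """
    n_rows = len(grid)
    n_cols = len(grid[0])
    values = []
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)  # rows count downwards from the top
        if 0 <= row < n_rows and 0 <= col < n_cols:
            values.append(grid[row][col])
        else:
            values.append(None)
    return values


def build_table(points, layers):
    """Combine point coordinates and sampled layer values into table rows."""
    rows = []
    for i, (x, y) in enumerate(points):
        row = {"x": x, "y": y}
        for name, values in layers.items():
            row[name] = values[i]
        rows.append(row)
    return rows


# Hypothetical 2 x 2 layer covering x in [0, 2) and y in (0, 2]
elevation = [[1, 2],
             [3, 4]]
points = [(0.5, 1.5), (1.5, 0.5), (5.0, 5.0)]

sampled = sample_grid(elevation, x_min=0.0, y_max=2.0, cell_size=1.0, points=points)
table = build_table(points, {"elevation": sampled})
```

Each row in `table` holds the coordinates of one location plus one column per sampled layer, which is the general shape of a data frame that can be passed on to a modelling workflow.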
Geodata-Harvester is designed as a modular and maintainable multi-stage pipeline with explicit boundaries between tasks. To encourage interaction and experimentation with the pipeline, we provide multiple frontend notebooks and use-case scenarios as Jupyter and R notebooks, as well as standalone Python and R packages. The core features are:
- automatic data retrieval from geospatial APIs for given locations and dates
- data experimentation frontends via Jupyter and R notebooks
- reusable workflows via interactive widgets and YAML files to save and load settings
- automatic geospatial-temporal processing
- support for multiple temporal aggregation options
- automatic extraction of retrieved data into ready-made dataframes for ML training
- automatic generation of ready-made aligned maps and dataframes for ML prediction models
- preview of data map layers
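As an illustration of what multiple temporal aggregation options can mean in practice, the sketch below groups dated observations by month or year and reduces each group with a selectable statistic. It uses only the Python standard library. The function name `aggregate_by_period`, the supported option names, and the sample rainfall values are assumptions for illustration; they do not reflect the package's actual interface or real data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median

# Hypothetical set of aggregation options
AGGREGATORS = {"mean": mean, "median": median, "sum": sum, "min": min, "max": max}


def aggregate_by_period(observations, period="month", how="mean"):
    """Group (date, value) pairs by year or month and reduce each group."""
    if how not in AGGREGATORS:
        raise ValueError(f"unknown aggregation: {how}")
    if period not in ("year", "month"):
        raise ValueError(f"unknown period: {period}")

    groups = defaultdict(list)
    for day, value in observations:
        key = (day.year,) if period == "year" else (day.year, day.month)
        groups[key].append(value)

    reduce_fn = AGGREGATORS[how]
    # Sort keys so that the result is in chronological order
    return {key: reduce_fn(values) for key, values in sorted(groups.items())}


# Made-up daily rainfall values for one location
rain = [
    (date(2021, 1, 1), 2.0),
    (date(2021, 1, 15), 4.0),
    (date(2021, 2, 1), 10.0),
]

monthly_mean = aggregate_by_period(rain, period="month", how="mean")
yearly_sum = aggregate_by_period(rain, period="year", how="sum")
```

Keeping the aggregation functions in a dictionary makes it straightforward to expose the choice as a single setting, for example one saved in a settings file.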
If you would like to learn more about the Geodata-Harvester, please visit our Workshop webpage.