Now that you’ve learned how Notebook-Based Workflows provide structured, step-by-step guides for agricultural machine learning, it’s time to explore the engine that powers these workflows: Prediction Workflows.
Think of this relationship like a cookbook and kitchen equipment. The notebook-based workflows are your recipe cards that guide you through each step, while prediction workflows are the specialized kitchen tools - your food processor, stand mixer, and precision thermometer - that actually do the heavy lifting to create your final dish.
Imagine you’re a photographer with three very different assignments. Each requires different camera settings, techniques, and equipment - you wouldn’t use the same approach for all three, and the same applies to soil modeling.
Agricultural researchers face similarly diverse scenarios, and each one needs a specialized approach - exactly what Prediction Workflows provide.
Like having different camera modes, AgReFed-ML provides three specialized prediction workflows:
This workflow creates detailed maps of soil properties for a single point in time. Perfect for:

- Baseline soil mapping
- Property assessment for land purchases
- Creating reference maps for crop planning
This workflow compares soil properties between two specific time periods. Ideal for:

- Carbon accounting and verification
- Monitoring soil health improvements
- Assessing impact of management practices
This workflow models how soil properties vary continuously across both space and time. Great for:

- Soil moisture forecasting
- Dynamic nutrient management
- Climate change impact studies
Let’s walk through using each workflow type with practical examples.
Suppose you want to create a soil organic carbon map for a 500-hectare farm. Here’s how you’d configure and run the static workflow:
```python
# Configure the static soil model
settings = {
    'model_function': 'rf-gp',        # Random Forest + Gaussian Process
    'name_target': 'organic_carbon',
    'axistype': 'vertical',
    'integrate_block': False          # Use point predictions
}
```

This tells the system you want to predict organic carbon using a combined Random Forest and Gaussian Process model, focusing on vertical (depth-based) soil mapping.
```python
# Run the static prediction workflow
from soilmod_predict import main

main('settings_static_carbon.yaml')
```

What you get: The system produces soil carbon maps for different depths (e.g., 0-10 cm, 10-20 cm, 20-30 cm) along with uncertainty maps showing prediction confidence.
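As a sketch of how you might post-process such output, here is a minimal NumPy example deriving a 95% confidence interval from a mean map and its uncertainty map. The array names and values are illustrative, not actual AgReFed-ML outputs:

```python
import numpy as np

# Hypothetical outputs for one depth layer: mean prediction and
# predictive standard deviation (values are illustrative).
carbon_mean = np.array([[1.8, 2.1], [2.4, 1.9]])   # % organic carbon
carbon_std = np.array([[0.2, 0.5], [0.3, 0.1]])    # predictive std. dev.

# 95% confidence interval, assuming approximately Gaussian uncertainty
lower = carbon_mean - 1.96 * carbon_std
upper = carbon_mean + 1.96 * carbon_std

# Flag cells where the interval is too wide to be useful (> 1% carbon)
too_uncertain = (upper - lower) > 1.0
```

Cells flagged as too uncertain are good candidates for additional soil sampling before relying on the map.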
Now let’s say you need to verify carbon sequestration for a carbon credit program by comparing 2018 and 2023 measurements:
```python
# Configure the change detection model
settings = {
    'model_function': 'blr-gp',       # Bayesian Linear Regression + GP
    'name_target': 'organic_carbon',
    'list_t_pred': [2018, 2023],      # Two specific years
    'axistype': 'temporal'
}

# Run the change detection workflow
from soilmod_predict_change import main

main('settings_carbon_change.yaml')
```

What you get: Maps showing carbon levels for both years, plus a change map highlighting areas of carbon gain or loss, complete with statistical significance testing.
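The idea behind the significance test can be sketched in a few lines of NumPy. Assuming independent prediction errors in the two years (a simplification; the actual workflow may model error correlation), the change map and its per-pixel significance look like this - all arrays here are illustrative:

```python
import numpy as np

# Hypothetical per-pixel outputs from the two single-year predictions
carbon_2018, std_2018 = np.array([1.2, 2.0, 1.5]), np.array([0.1, 0.3, 0.2])
carbon_2023, std_2023 = np.array([1.6, 2.1, 1.4]), np.array([0.1, 0.3, 0.2])

# Change map: positive values indicate carbon gain
change = carbon_2023 - carbon_2018

# Uncertainty of the difference, assuming independent errors
std_change = np.sqrt(std_2018**2 + std_2023**2)

# Pixels where the change exceeds the 95% significance threshold
significant = np.abs(change) > 1.96 * std_change
```

Only pixels marked significant should be counted toward verified carbon gains, since elsewhere the apparent change is within the noise of the predictions.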
For dynamic soil moisture management across a growing season:
```python
# Configure the spatial-temporal model
settings = {
    'model_function': 'rf-gp',
    'name_target': 'soil_moisture',
    'list_t_pred': [1, 30, 60, 90, 120],  # Days after planting
    'integrate_block': True               # Use block averaging
}

# Run the spatial-temporal workflow
from soilmod_predict_st import main

main('settings_moisture_timeseries.yaml')
```

What you get: A series of soil moisture maps, one for each time point, showing how moisture patterns evolve across your field throughout the growing season.
When you run a prediction workflow, the following steps occur:
1. Data Loading and Validation: The workflow first loads your soil measurements and covariate data, checking for missing values and coordinate system consistency.
2. Preprocessing: The Data Preprocessing Pipeline cleans your data, handles outliers, and prepares features for modeling.
3. Model Training: Depending on your choice, the system trains either Mean Function Models (like Random Forest) or Gaussian Process Models to learn relationships between soil properties and environmental factors.
4. Spatial Prediction: The trained model generates predictions across your study area, using the Spatial-Temporal Modeling Framework to handle geographic relationships.
5. Uncertainty Quantification: The Uncertainty Quantification System calculates prediction confidence intervals and maps areas of high/low uncertainty.
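The five stages can be sketched end-to-end in a toy pipeline. This is not AgReFed-ML's implementation - the function and its simple least-squares mean model are illustrative stand-ins - but it shows how the stages fit together:

```python
import numpy as np

def run_prediction_workflow(X_train, y_train, X_pred):
    """Toy sketch of the five workflow stages (names hypothetical)."""
    # 1. Data loading and validation: drop rows with missing values
    valid = ~np.isnan(X_train).any(axis=1) & ~np.isnan(y_train)
    X_train, y_train = X_train[valid], y_train[valid]

    # 2. Preprocessing: standardise covariates using training statistics
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-9
    X_train, X_pred = (X_train - mu) / sigma, (X_pred - mu) / sigma

    # 3. Model training: least-squares mean function with intercept
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

    # 4. Prediction across the study area
    y_pred = np.column_stack([np.ones(len(X_pred)), X_pred]) @ coef

    # 5. Uncertainty: residual spread as a crude confidence measure
    residual_std = float(np.std(y_train - A @ coef))
    return y_pred, residual_std
```

The real workflows replace stage 3 with Random Forest or Bayesian Linear Regression plus a Gaussian Process, and stage 5 with full predictive variance maps, but the orchestration pattern is the same.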
Each prediction workflow is implemented as a specialized Python module that orchestrates the complete modeling process. Here’s how the core prediction engine works:
```python
def select_workflow_type(settings):
    """Choose the appropriate workflow based on settings."""
    if len(settings.list_t_pred) == 1:
        return 'static'            # Single time point
    elif len(settings.list_t_pred) == 2:
        return 'change'            # Two time points for comparison
    else:
        return 'spatial_temporal'  # Multiple time points
```

This simple logic routes your analysis to the appropriate specialized workflow based on how many time points you want to analyze.
```python
# Train the mean function (Random Forest or Bayesian Linear Regression)
if mean_function == 'rf':
    rf_model = rf.rf_train(X_train, y_train)
    y_pred_mean, noise_pred = rf.rf_predict(X_test, rf_model)

# Add a Gaussian Process for spatial modeling (if selected)
if not calc_mean_only:
    gp_pred, gp_uncertainty = gp.train_predict_3D(
        points3D_train, points3D_pred,
        y_residuals, noise_train, gp_params
    )
```

This two-stage approach first learns the main relationships using machine learning, then uses Gaussian Processes to model spatial patterns in the residuals.
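The two-stage idea can be made concrete with a small NumPy sketch: a least-squares mean function stands in for the random forest, and an RBF-kernel Gaussian process models the residuals. All names, the kernel choice, and its parameters are illustrative, not AgReFed-ML's API:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of coordinates."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def two_stage_predict(coords_train, X_train, y_train,
                      coords_pred, X_pred, noise=1e-4):
    # Stage 1: mean function (least squares standing in for the forest)
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    mean_train = A @ coef
    mean_pred = np.column_stack([np.ones(len(X_pred)), X_pred]) @ coef

    # Stage 2: GP on the residuals captures remaining spatial structure
    resid = y_train - mean_train
    K = rbf_kernel(coords_train, coords_train) + noise * np.eye(len(resid))
    K_star = rbf_kernel(coords_pred, coords_train)
    gp_correction = K_star @ np.linalg.solve(K, resid)

    return mean_pred + gp_correction
```

Near measurement locations the GP correction pulls predictions toward the observed values; far away it decays to zero and the mean function alone drives the prediction.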
For different spatial support (points, blocks, or polygons), the workflows use different aggregation strategies:
```python
# For block predictions - average over multiple points
if settings.integrate_block:
    block_mean, block_std = averagestats(point_predictions, covariance_matrix)
# For point predictions - direct mapping
else:
    prediction_map = align_nearest_neighbor(grid_coords, predictions)
```

This ensures your predictions match the spatial resolution and support you need for your specific application.
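What block averaging does can be sketched in a few lines (the real `averagestats` may differ in detail): the block mean averages the point predictions, while the block variance averages the full covariance matrix, so spatially correlated points reduce uncertainty less than independent ones would.

```python
import numpy as np

def block_average(point_preds, cov):
    """Aggregate point predictions into one block value (illustrative)."""
    n = len(point_preds)
    block_mean = point_preds.mean()
    # Var(mean) = (1/n^2) * sum of all covariance entries
    block_var = cov.sum() / n**2
    return block_mean, np.sqrt(block_var)
```

For fully correlated points the block standard deviation equals the point standard deviation; for independent points it shrinks by a factor of sqrt(n).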
All workflows automatically generate:

- Prediction maps as both images (PNG) and GIS files (GeoTIFF)
- Uncertainty maps showing prediction confidence
- Summary statistics for model performance assessment
- Visualization plots for immediate interpretation
```python
# Save results in multiple formats
array2geotiff(prediction_map, coordinates, resolution,
              output_file + '.tif', projection)
plt.savefig(output_file + '.png', dpi=300)
np.savetxt(output_file + '.txt', prediction_values)
```

Each workflow is optimized for its specific use case.
This specialization means you get better results faster, without having to worry about configuring complex modeling parameters for each scenario.
Prediction Workflows are the specialized engines that power AgReFed-ML’s soil modeling capabilities. Like having the right camera mode for different photography scenarios, these workflows ensure you’re using the optimal approach for your specific agricultural question.
Whether you need a single snapshot of soil conditions, want to track changes over time, or require dynamic spatial-temporal predictions, there’s a workflow designed specifically for your needs. The workflows handle all the complex data processing, model training, and spatial prediction generation automatically, while giving you complete control over the modeling approach through simple configuration files.
Ready to dive deeper into how your data gets prepared for these workflows? The next chapter covers the Data Preprocessing Pipeline, where we’ll explore how AgReFed-ML transforms your raw soil measurements into analysis-ready datasets.