SmartCane Architecture Integration Guide

This document ties together all SmartCane architecture components: the unified data pipeline, client type branching, and two execution models (SOBIT production vs developer laptop).

System Architecture: Three Dimensions

SmartCane's architecture is defined by three orthogonal dimensions:

┌────────────────────────────────────────────────────────────┐
│                   SmartCane Dimensions                     │
├────────────────────────────────────────────────────────────┤
│                                                             │
│  1. PIPELINE STAGES (Data Flow)                            │
│     00 (Python) ──> 10 ──> 20 ──> ... ──> 91 (Output)      │
│                                                             │
│  2. CLIENT TYPES (Business Logic)                          │
│     Agronomic Support (AURA)  vs  Cane Supply (ANGATA)    │
│     ↓                                                       │
│     Different KPIs, data requirements, report formats      │
│                                                             │
│  3. EXECUTION MODELS (Deployment)                          │
│     SOBIT Server (Job Queue)  vs  Dev Laptop (PowerShell)  │
│     ↓                                                       │
│     Different execution paths, error handling, monitoring   │
│                                                             │
└────────────────────────────────────────────────────────────┘

Dimension 1: Pipeline Stages (Data Flow)

Unified for all projects/clients. Stages 00–40 are identical regardless of client type or execution model.

| Stage | Script | Client Type | Execution Model | Purpose |
|-------|--------|-------------|-----------------|---------|
| 00 | Python `00_download_*.py` | All | Both | Download 4-band satellite imagery |
| 10 | R `10_create_per_field_tiffs.R` | All | Both | Split merged TIFF into field tiles |
| 20 | R `20_ci_extraction_per_field.R` | All | Both | Extract Canopy Index per pixel/field |
| 30 | R `30_interpolate_growth_model.R` | All | Both | Smooth CI time series (LOESS) |
| 40 | R `40_mosaic_creation_per_field.R` | All | Both | Create weekly MAX-composites |
| 80 | R `80_calculate_kpis.R` | Specific | Both | Calculate client-type KPIs |
| 90 | R `90_CI_report_*.Rmd` | agronomic_support | Both | Render agronomic report |
| 91 | R `91_CI_report_*.Rmd` | cane_supply | Both | Render cane client report |
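
As a rough illustration of the table above, the ordered list of stages for one client type could be derived like this. It is a Python sketch only; `STAGES` and `stages_for` are hypothetical names chosen for the example and are not part of the SmartCane codebase.

```python
# Hypothetical sketch: stage numbers and client-type restrictions copied from the table above.
# None means "applies to every client type".
STAGES = [
    ("00", None),
    ("10", None),
    ("20", None),
    ("30", None),
    ("40", None),
    ("80", None),                 # runs for every client, but with client-specific KPI logic
    ("90", "agronomic_support"),  # agronomic report only
    ("91", "cane_supply"),        # cane supply report only
]


def stages_for(client_type):
    """Return the ordered stage numbers that apply to one client type."""
    return [stage for stage, only_for in STAGES if only_for in (None, client_type)]


print(stages_for("cane_supply"))
```

Both client types share everything up to Stage 80 and differ only in the final report stage.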

Dimension 2: Client Types (Business Logic)

The pipeline diverges at Stage 80, where KPI calculation and reporting become client-specific.

Client Type Configuration

# In r_app/parameters_project.R
CLIENT_TYPE_MAP <- list(
  "angata"     = "cane_supply",           # Sugarcane operations
  "chemba"     = "agronomic_support",     # Agronomic advisory
  "xinavane"   = "agronomic_support",
  "esa"        = "agronomic_support",
  "simba"      = "agronomic_support",
  "aura"       = "agronomic_support"
)
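
The lookup, and the "Client type unknown" failure listed in the troubleshooting tree, can be sketched in Python as below. The authoritative mapping remains the R list above; `resolve_client_type` is a hypothetical helper, and raising an error for unmapped projects is an assumption about the desired behaviour rather than a description of what `parameters_project.R` does.

```python
# Python mirror of the R CLIENT_TYPE_MAP above (for illustration only).
CLIENT_TYPE_MAP = {
    "angata": "cane_supply",
    "chemba": "agronomic_support",
    "xinavane": "agronomic_support",
    "esa": "agronomic_support",
    "simba": "agronomic_support",
    "aura": "agronomic_support",
}


def resolve_client_type(project):
    """Look up the client type for a project, failing loudly if it is unmapped."""
    try:
        return CLIENT_TYPE_MAP[project]
    except KeyError:
        known = ", ".join(sorted(CLIENT_TYPE_MAP))
        raise ValueError(
            f"Unknown project '{project}'; add it to CLIENT_TYPE_MAP (known: {known})"
        ) from None


print(resolve_client_type("angata"))
```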

KPI Differences

| Aspect | Agronomic Support (AURA) | Cane Supply (ANGATA) |
|--------|--------------------------|----------------------|
| Primary Audience | Agronomist / farm consultant | Mill operations manager |
| Key Question | "Is the crop healthy? Are yields on track?" | "Which fields are ready to harvest this week?" |
| Data Requirements | `pivot.geojson` (required); `harvest.xlsx` (optional) | `pivot.geojson` + `harvest.xlsx` (both required) |
| KPI Count | 6 KPIs | 4 KPIs + harvest integration |
| Utility Script | `80_utils_agronomic_support.R` | `80_utils_cane_supply.R` |
| Report Script | `90_CI_report_with_kpis_agronomic_support.Rmd` | `91_CI_report_with_kpis_cane_supply.Rmd` |
| Harvest Integration | Minimal (season grouping only) | Central (harvest readiness, phase, tonnage) |
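
One way to picture the branching is as a per-client-type profile that the later stages consult. The following Python sketch encodes the script names and data requirements from the table; `ClientProfile` and `PROFILES` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientProfile:
    """Client-specific settings, taken from the KPI differences table (illustrative only)."""
    utility_script: str
    report_template: str
    report_stage: str
    required_inputs: tuple
    optional_inputs: tuple


PROFILES = {
    "agronomic_support": ClientProfile(
        utility_script="80_utils_agronomic_support.R",
        report_template="90_CI_report_with_kpis_agronomic_support.Rmd",
        report_stage="90",
        required_inputs=("pivot.geojson",),
        optional_inputs=("harvest.xlsx",),
    ),
    "cane_supply": ClientProfile(
        utility_script="80_utils_cane_supply.R",
        report_template="91_CI_report_with_kpis_cane_supply.Rmd",
        report_stage="91",
        required_inputs=("pivot.geojson", "harvest.xlsx"),
        optional_inputs=(),
    ),
}

profile = PROFILES["cane_supply"]
print(profile.report_stage, profile.utility_script)
```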

When to Switch Client Types

| Scenario | Action |
|----------|--------|
| New project launched | Add to `CLIENT_TYPE_MAP` in `parameters_project.R`; create `pivot.geojson` + `harvest.xlsx` as needed |
| Aura switching to harvest-focused operations | Change mapping: "aura" → "cane_supply"; ensure `harvest.xlsx` exists |
| Testing both client types on same project | Update mapping; re-run Stages 80–91 |

Dimension 3: Execution Models (Deployment)

Orthogonal to pipeline stages and client types. Both SOBIT and dev laptop can run any stage for any client type.

SOBIT Server (Production)

When to use: Production deployment, scheduled runs, multi-user farm management platform

Flow:

  1. User clicks button in web UI (Laravel controller)
  2. Job dispatched to queue (database or Redis)
  3. Background queue worker picks up job
  4. Shell script wrapper executed
  5. Python/R script runs
  6. Results stored in laravel_app/storage/app/{PROJECT}/
  7. Next job automatically dispatched (chaining)
  8. User monitors progress via dashboard

Key Characteristics:

  • Async execution: Long-running stages don't block web requests
  • Job chaining: Automatic pipeline orchestration (00 → 10 → 20 → ... → 91)
  • Error handling: Failed jobs logged; retries configurable
  • Monitoring: Web dashboard shows job history, results, logs
  • Multi-user: Multiple projects can run concurrently

Infrastructure:

  • Laravel application server (PHP)
  • Queue backend (database or Redis)
  • Supervisor daemon (manages queue workers)
  • Cron for scheduled pipeline runs
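
The job-chaining behaviour described above (each successful stage dispatches the next, failures are logged and optionally retried) can be approximated with a toy in-process queue. This Python sketch does not use or imitate the Laravel queue API; `run_chain`, `handlers`, and `max_retries` are hypothetical names, and stopping the chain once retries are exhausted is an assumption.

```python
from collections import deque


def run_chain(stages, handlers, max_retries=0):
    """Run stages in order from a FIFO queue; each success enqueues the next stage.

    Returns a log of (stage, status, attempt) tuples. If a stage still fails
    after all retries, the chain stops and later stages are never dispatched.
    """
    queue = deque([0])  # indices into `stages`; start with the first stage
    log = []
    while queue:
        i = queue.popleft()
        stage = stages[i]
        for attempt in range(1, max_retries + 2):
            try:
                handlers[stage]()
                log.append((stage, "ok", attempt))
                break
            except Exception as exc:  # a real worker would log the full traceback
                log.append((stage, f"failed: {exc}", attempt))
        else:
            return log  # retries exhausted: do not dispatch the next stage
        if i + 1 < len(stages):
            queue.append(i + 1)  # "chaining": dispatch the next stage
    return log
```

Passing `max_retries=1` gives each stage a second attempt before the chain is abandoned.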

Developer Laptop (Development)

When to use: Local development, testing, one-off analysis, debugging

Flow:

  1. Developer opens PowerShell terminal
  2. Sources parameters_project.R to set PROJECT, dates, etc.
  3. Manually calls Rscript or python for each stage
  4. Waits for completion (synchronous)
  5. Reviews output in terminal
  6. Proceeds to next stage or adjusts parameters
  7. Final outputs saved to laravel_app/storage/app/{PROJECT}/

Key Characteristics:

  • Synchronous execution: Developer sees immediate output/errors
  • Manual control: Run individual stages, skip stages, rerun stages
  • No job queue: Direct shell execution
  • Simple setup: Just R, Python, and terminal
  • Single user: Developer's machine

Infrastructure:

  • R 4.4.0+
  • Python 3.9+
  • PowerShell (Windows) or Bash (Linux/Mac)
  • Text editor or RStudio
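
The synchronous, stage-by-stage flow described above could also be driven from a small script instead of typing each command. The Python sketch below builds the Stage 80 command in the positional order used by this guide's example (`r_app/80_calculate_kpis.R 2026-02-19 aura 7`); the trailing `7` is simply copied from that example, and `stage80_command`, `run_sequence`, and the `Rscript` default are assumptions made for illustration.

```python
import subprocess


def stage80_command(end_date, project, extra_args=("7",), r_exe="Rscript"):
    """Build the Stage 80 command line in the positional order shown in this guide."""
    return [r_exe, "r_app/80_calculate_kpis.R", end_date, project, *extra_args]


def run_sequence(commands):
    """Run commands one after another, blocking on each.

    check=True raises CalledProcessError on a non-zero exit code, so a failed
    stage stops the sequence before the next stage starts.
    """
    completed = 0
    for cmd in commands:
        subprocess.run(cmd, check=True)
        completed += 1
    return completed


print(stage80_command("2026-02-19", "aura"))
```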

Decision Matrix: Which Execution Model?

┌─────────────────────────────────────────────────────────────────────────┐
│ Scenario                                     │ SOBIT    │ Dev Laptop    │
├──────────────────────────────────────────────┼──────────┼───────────────┤
│ Production farm management platform          │ ✅       │ ❌            │
│ Scheduled weekly pipeline (Monday 5am)       │ ✅       │ ❌ (need cron)│
│ Multi-user concurrent projects               │ ✅       │ ❌            │
│ New feature development & debugging          │ ❌       │ ✅            │
│ Testing on specific date range               │ ❌       │ ✅            │
│ Ad-hoc "regenerate this week's report"      │ ❌       │ ✅            │
│ CI/CD pipeline (automated testing)          │ ✅       │ ✅ (both OK)  │
│ One-off analysis for farm manager            │ ⚠️ (OK) │ ✅            │
│ Minimal setup (no server)                    │ ❌       │ ✅            │
│ Persistent monitoring & alerting              │ ✅       │ ❌            │
│ Educational demo for agronomist              │ ❌       │ ✅            │
└──────────────────────────────────────────────┴──────────┴───────────────┘

System Integration: Data Flow with Client Types & Execution

Here's how all three dimensions interact in a complete workflow:

Scenario 1: SOBIT Production (Angata, Cane Supply)

1. PIPELINE DIMENSION
   User clicks "Generate Weekly Report" in web UI
   ↓
2. EXECUTION DIMENSION (SOBIT)
   Laravel ProjectReportGeneratorJob dispatched to queue
   ↓
3. CLIENT TYPE DIMENSION
   parameters_project.R loaded with PROJECT="angata"
   CLIENT_TYPE = CLIENT_TYPE_MAP[["angata"]] = "cane_supply"
   ↓
4. PIPELINE STAGES
   Stage 80 runs:
   - source("80_utils_cane_supply.R")  ← Client-specific utilities
   - Calculate 4 KPIs (acreage, phase, harvest readiness, stress)
   - Save RDS + Excel
   ↓
5. STAGE 91 REPORT
   rmarkdown::render("91_CI_report_with_kpis_cane_supply.Rmd")
   ↓
6. OUTPUT
   SmartCane_Report_cane_supply_angata_week07_2026.docx
   (Per-field status, harvest alerts, tonnage forecast)
   ↓
7. USER NOTIFICATION
   Dashboard shows report ready; email sent to mill manager

Scenario 2: Dev Laptop Testing (Aura, Agronomic Support)

1. EXECUTION DIMENSION (Dev Laptop)
   Developer opens PowerShell, sets:
   $PROJECT = "aura"
   $END_DATE = "2026-02-19"
   ↓
2. CLIENT TYPE DIMENSION
   Developer manually checks parameters_project.R:
   CLIENT_TYPE_MAP[["aura"]] = "agronomic_support"
   ↓
3. PIPELINE STAGES (Manual Run)
   $ & $R_EXE r_app/80_calculate_kpis.R 2026-02-19 aura 7
   ↓
4. STAGE 80 EXECUTION
   parameters_project.R loaded with PROJECT="aura"
   source("80_utils_agronomic_support.R")  ← Client-specific utilities
   Calculate 6 KPIs (uniformity, area change, TCH, growth decline, weeds, gap fill)
   ↓
5. STAGE 90 REPORT (Manual)
   $ & $R_EXE -e "rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd', 
                   params=list(data_dir='aura', report_date=as.Date('2026-02-19')), ...)"
   ↓
6. OUTPUT
   SmartCane_Report_agronomic_support_aura_week07_2026.docx
   (Farm-level KPI averages, uniformity trends, spatial analysis)
   ↓
7. DEVELOPER REVIEW
   Opens .docx file locally, reviews; makes adjustments to plotting code if needed

Data Dependency Map: Python ↔ R Integration

This table shows which R stages depend on Python outputs, on earlier R stages, and on external (user-uploaded) inputs.

| R Stage | Depends On | Input File | From | Notes |
|---------|------------|------------|------|-------|
| 10 | Python 00 | `merged_tif/{DATE}.tif` | Stage 00 | 4-band daily TIFF |
| 20 | Stage 10 | `field_tiles/{FIELD}/{DATE}.tif` | Stage 10 | 4-band per-field daily |
| 20 | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| 30 | Stage 20 | `combined_CI_data.rds` | Stage 20 | Wide format CI data |
| 30 | External | `Data/harvest.xlsx` | User upload | Harvest dates (optional for agronomic_support, REQUIRED for cane_supply) |
| 40 | Stage 20 | `field_tiles_CI/{FIELD}/{DATE}.tif` | Stage 20 | 5-band daily per-field |
| 80 | Stage 40 | `weekly_mosaic/{FIELD}/week_*.tif` | Stage 40 | 5-band weekly per-field |
| 80 | Stage 30 | `All_pivots_Cumulative_CI_*.rds` | Stage 30 | Interpolated growth model |
| 80 | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| 80 | External | `Data/harvest.xlsx` | User upload | Harvest dates (REQUIRED for cane_supply) |
| 80 | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest probability (optional, improves cane_supply KPI) |
| 90 | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| 90 | Stage 20 | `combined_CI_data.rds` | Stage 20 | For trend plots |
| 91 | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| 91 | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest readiness probabilities |
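
To answer "what must have run before stage X?", the stage-to-stage rows of the table can be treated as a small dependency graph. The Python sketch below is illustrative: `DEPENDS_ON` and `upstream` are hypothetical names, and Python 31 and the user-uploaded files are deliberately left out because the table does not list their own prerequisites.

```python
# Stage-to-stage dependencies read off the table above (external inputs omitted).
DEPENDS_ON = {
    "00": [],
    "10": ["00"],
    "20": ["10"],
    "30": ["20"],
    "40": ["20"],
    "80": ["30", "40"],
    "90": ["20", "80"],
    "91": ["80"],
}


def upstream(stage):
    """Return every stage that must run before `stage`, in a valid execution order."""
    order = []

    def visit(s):
        # Depth-first: place all prerequisites before the stage itself.
        for dep in DEPENDS_ON[s]:
            visit(dep)
        if s not in order:
            order.append(s)

    visit(stage)
    return order[:-1]  # drop the requested stage itself


print(upstream("90"))
```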

Configuration Checklist: Before Running Pipeline

Universal Requirements (All Projects)

  • laravel_app/storage/app/{PROJECT}/Data/pivot.geojson exists and is valid
  • At least one satellite TIFF in merged_tif/ (or Stage 00 will download)
  • R packages installed: Rscript r_app/package_manager.R (one-time)
  • Python dependencies installed if running Stage 00

Agronomic Support Projects (AURA type)

  • parameters_project.R maps project to "agronomic_support"
  • (Optional) harvest.xlsx for better season grouping

Cane Supply Projects (ANGATA type)

  • parameters_project.R maps project to "cane_supply"
  • REQUIRED: harvest.xlsx with planting/harvest dates (Stage 30 and Stage 80 need it)
  • (Optional) harvest_imminent_weekly.csv from Python 31 for better harvest predictions
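
The file-related items in this checklist lend themselves to a quick pre-flight check. The Python sketch below only tests that the required files exist under `{storage_root}/{PROJECT}/Data/`; it does not validate the GeoJSON or the spreadsheet contents. `missing_inputs` and `storage_root` are hypothetical names.

```python
from pathlib import Path


def missing_inputs(storage_root, project, client_type):
    """List required input files that are absent for this project and client type."""
    data_dir = Path(storage_root) / project / "Data"
    required = ["pivot.geojson"]          # required for every project
    if client_type == "cane_supply":
        required.append("harvest.xlsx")   # required only for cane supply projects
    return [name for name in required if not (data_dir / name).is_file()]


# Example: storage_root would be laravel_app/storage/app in this guide's layout.
print(missing_inputs("laravel_app/storage/app", "angata", "cane_supply"))
```

An empty list means the file prerequisites are satisfied; anything else names what still needs to be uploaded.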

Output Files Reference

By Client Type

Agronomic Support (AURA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_agronomic_support_*.docx` | Stage 90 | Farm KPIs, uniformity trends, spatial maps |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (6 KPIs per field) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 90 rendering |

Cane Supply (ANGATA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_cane_supply_*.docx` | Stage 91 | Harvest alerts, phase assignment, tonnage forecast |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (4 KPIs + harvest data) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 91 rendering |
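
The report names used in the scenarios (for example `SmartCane_Report_cane_supply_angata_week07_2026.docx`) suggest a `{client_type}_{project}_week{WW}_{YYYY}` pattern. The Python sketch below reproduces that pattern as inferred from those two examples; `report_filename` is a hypothetical helper, and the week number is passed in explicitly because this guide does not state which week-numbering convention the pipeline uses.

```python
def report_filename(client_type, project, week, year):
    """Build a report file name following the pattern seen in this guide's examples."""
    return f"SmartCane_Report_{client_type}_{project}_week{week:02d}_{year}.docx"


print(report_filename("cane_supply", "angata", 7, 2026))
```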

By Execution Model

SOBIT Server

  • Reports auto-saved to laravel_app/storage/app/{PROJECT}/reports/
  • User downloads via web dashboard
  • Optional email delivery (configured in Laravel)

Dev Laptop

  • Reports saved to same location (same Laravel storage directory)
  • Developer manually opens files
  • Can share via USB/cloud if needed

Troubleshooting Decision Tree

Pipeline Error?
├─ Stage 00 (Python download) fails
│  ├─ Auth error → Check Planet API credentials
│  ├─ Network → Check internet connection
│  └─ Cloud cover → That date is too cloudy; try different date
│
├─ Stage 10–40 fails → Check prerequisites
│  ├─ File not found → Run previous stage first
│  ├─ pivot.geojson invalid → Repair GeoJSON geometry
│  └─ GDAL error → Check TIFF file integrity
│
├─ Stage 80 fails
│  ├─ harvest.xlsx missing (cane_supply) → REQUIRED; upload file
│  ├─ combined_CI_data.rds missing → Run Stage 20 first
│  ├─ Client type unknown → Check parameters_project.R mapping
│  └─ KPI calculation error → Stage-specific utility function bug
│
├─ Stage 90/91 fails
│  ├─ RMarkdown not found → Check r_app/90_*.Rmd or 91_*.Rmd path
│  ├─ Missing params → Check data_dir, report_date params
│  └─ RMarkdown error → Check knitr output; may need renv reinstall
│
└─ SOBIT-specific error
   ├─ Job stuck in queue → Check job table; restart worker
   ├─ Shell script permission denied → chmod +x *.sh
   └─ r_app directory not found → Update shell scripts with correct paths

Glossary

| Term | Definition |
|------|------------|
| Canopy Index (CI) | Normalized vegetation index: (NIR/Green) - 1; ~1.0–2.0 range |
| LOESS | Locally Estimated Scatterplot Smoothing; statistical interpolation method |
| Coefficient of Variation (CV) | Standard deviation ÷ mean; measure of field uniformity |
| Moran's I | Spatial autocorrelation metric; detects clustered anomalies |
| Phase Assignment | Growth stage classification (germination, tillering, grand growth, maturation) |
| Harvest Readiness | Probability field is mature and ready for harvest (0.0–1.0) |
| MAX Composite | Weekly mosaic using maximum CI pixel across the week; handles clouds |
| Job Queue | Async task management system (Laravel); decouples web UI from long computations |
| RMarkdown | Dynamic document format; combines R code + text → Word/.html output |
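
Three of the glossary definitions are simple enough to express directly. The Python sketch below follows the formulas as stated (CI = NIR/Green - 1, CV = standard deviation ÷ mean, MAX composite = per-pixel weekly maximum). It makes two assumptions that the glossary does not specify: the sample standard deviation is used, and cloud-masked pixels are represented as `None`. The function names are hypothetical and this is not the pipeline's R implementation.

```python
from statistics import mean, stdev


def canopy_index(nir, green):
    """CI = (NIR / Green) - 1 for a single pixel."""
    return nir / green - 1.0


def coefficient_of_variation(values):
    """CV = standard deviation / mean (sample standard deviation assumed)."""
    return stdev(values) / mean(values)


def max_composite(daily_values):
    """Per-pixel maximum across days; None marks a cloud-masked observation.

    daily_values is a list of days, each a list of CI values (one per pixel).
    A pixel that is masked on every day stays None in the composite.
    """
    composite = []
    for pixel_series in zip(*daily_values):
        valid = [v for v in pixel_series if v is not None]
        composite.append(max(valid) if valid else None)
    return composite


week = [[1.0, None, 0.5], [1.2, None, 0.4]]  # two days, three pixels
print(max_composite(week))
```

Taking the maximum means a pixel only needs one clear observation in the week to contribute a value.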

Key Files Summary

| File | Purpose | Edited By |
|------|---------|-----------|
| `r_app/parameters_project.R` | Central config; project → client type mapping | Dev / DevOps |
| `r_app/00_common_utils.R` | Shared utility functions | Development |
| `r_app/80_utils_agronomic_support.R` | Agronomic KPI calculations | Development (agronomic_support) |
| `r_app/80_utils_cane_supply.R` | Cane supply KPI calculations | Development (cane_supply) |
| `r_app/90_CI_report_with_kpis_agronomic_support.Rmd` | Agronomic report template | Development (agronomic_support) |
| `r_app/91_CI_report_with_kpis_cane_supply.Rmd` | Cane supply report template | Development (cane_supply) |
| `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` | Field boundaries | GIS / User upload |
| `laravel_app/storage/app/{PROJECT}/Data/harvest.xlsx` | Harvest calendar | User upload (required for cane_supply) |

Next Steps

  1. Understand your deployment: Are you on SOBIT server or local dev laptop?

  2. Understand your client type: Are you managing agronomic advisory (AURA) or harvest operations (ANGATA)?

  3. Understand the data flow: How is data transformed across the eight pipeline stages (00 through 91)?

  4. Run your first pipeline: Start on a developer laptop and follow the commands in DEV_LAPTOP_EXECUTION.md.