# SmartCane Architecture Integration Guide

This document ties together all SmartCane architecture components: the unified data pipeline, client type branching, and two execution models (SOBIT production vs developer laptop).

## Quick Navigation

- **[ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)**: Complete Stage 00–91 pipeline with data transformations, file formats, and storage locations
- **[CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)**: How agronomic_support (AURA) and cane_supply (ANGATA) differ in KPIs, reports, and requirements
- **[SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)**: Production server execution via Laravel job queue, web UI, and shell wrappers
- **[DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)**: Manual PowerShell execution for developer testing and one-off analysis

---

## System Architecture: Three Dimensions

SmartCane's architecture is defined by three orthogonal dimensions:

```
┌────────────────────────────────────────────────────────────┐
│                    SmartCane Dimensions                    │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  1. PIPELINE STAGES (Data Flow)                            │
│     Stage 00 (Python) ──> Stage 10 ──> Stage 20 ──> ...    │
│       ──> Stage 91 (Output)                                │
│                                                            │
│  2. CLIENT TYPES (Business Logic)                          │
│     Agronomic Support (AURA) vs Cane Supply (ANGATA)       │
│              ↓                                             │
│     Different KPIs, data requirements, report formats      │
│                                                            │
│  3. EXECUTION MODELS (Deployment)                          │
│     SOBIT Server (Job Queue) vs Dev Laptop (PowerShell)    │
│              ↓                                             │
│     Different execution paths, error handling, monitoring  │
│                                                            │
└────────────────────────────────────────────────────────────┘
```

### Dimension 1: Pipeline Stages (Data Flow)

**Unified for all projects/clients**. Stages 00–40 are identical regardless of client type or execution model.

| Stage | Script | Client Type | Execution Model | Purpose |
|-------|--------|-------------|-----------------|---------|
| 00 | Python 00_download_*.py | All | Both | Download 4-band satellite imagery |
| 10 | R 10_create_per_field_tiffs.R | All | Both | Split merged TIFF into field tiles |
| 20 | R 20_ci_extraction_per_field.R | All | Both | Extract Canopy Index per pixel/field |
| 30 | R 30_interpolate_growth_model.R | All | Both | Smooth CI time series (LOESS) |
| 40 | R 40_mosaic_creation_per_field.R | All | Both | Create weekly MAX-composites |
| **80** | **R 80_calculate_kpis.R** | **Specific** | **Both** | **Calculate client-type KPIs** |
| **90** | **R 90_CI_report_*.Rmd** | **agronomic_support** | **Both** | **Render agronomic report** |
| **91** | **R 91_CI_report_*.Rmd** | **cane_supply** | **Both** | **Render cane client report** |

---

### Dimension 2: Client Types (Business Logic)

**Diverges at Stage 80 for KPI calculation and reporting**.

#### Client Type Configuration

```r
# In r_app/parameters_project.R
CLIENT_TYPE_MAP <- list(
  "angata"   = "cane_supply",        # Sugarcane operations
  "chemba"   = "agronomic_support",  # Agronomic advisory
  "xinavane" = "agronomic_support",
  "esa"      = "agronomic_support",
  "simba"    = "agronomic_support",
  "aura"     = "agronomic_support"
)
```

#### KPI Differences

| Aspect | **Agronomic Support (AURA)** | **Cane Supply (ANGATA)** |
|--------|------------------------------|------------------------|
| **Primary Audience** | Agronomist / farm consultant | Mill operations manager |
| **Key Question** | "Is the crop healthy? Are yields on track?" | "Which fields are ready to harvest this week?" |
| **Data Requirements** | pivot.geojson (required); harvest.xlsx (optional) | pivot.geojson + harvest.xlsx (both required) |
| **KPI Count** | 6 KPIs | 4 KPIs + harvest integration |
| **Utility Script** | `80_utils_agronomic_support.R` | `80_utils_cane_supply.R` |
| **Report Script** | `90_CI_report_with_kpis_agronomic_support.Rmd` | `91_CI_report_with_kpis_cane_supply.Rmd` |
| **Harvest Integration** | Minimal (season grouping only) | Central (harvest readiness, phase, tonnage) |

#### When to Switch Client Types

| Scenario | Action |
|----------|--------|
| **New project** launched | Add to `CLIENT_TYPE_MAP` in parameters_project.R; create pivot.geojson + harvest.xlsx as needed |
| **Aura switching** to harvest-focused operations | Change mapping: "aura" → "cane_supply"; ensure harvest.xlsx exists |
| **Testing** both client types on same project | Update mapping; re-run Stages 80–91 |

---

### Dimension 3: Execution Models (Deployment)

**Orthogonal to pipeline stages and client types**. Both SOBIT and dev laptop can run any stage for any client type.

#### SOBIT Server (Production)

**When to use**: Production deployment, scheduled runs, multi-user farm management platform

**Flow**:

1. User clicks button in web UI (Laravel controller)
2. Job dispatched to queue (database or Redis)
3. Background queue worker picks up job
4. Shell script wrapper executed
5. Python/R script runs
6. Results stored in `laravel_app/storage/app/{PROJECT}/`
7. Next job automatically dispatched (chaining)
8. User monitors progress via dashboard

**Key Characteristics**:

- **Async execution**: Long-running stages don't block web requests
- **Job chaining**: Automatic pipeline orchestration (00 → 10 → 20 → ... → 91)
- **Error handling**: Failed jobs logged; retries configurable
- **Monitoring**: Web dashboard shows job history, results, logs
- **Multi-user**: Multiple projects can run concurrently

**Infrastructure**:

- Laravel application server (PHP)
- Queue backend (database or Redis)
- Supervisor daemon (manages queue workers)
- Cron for scheduled pipeline runs

---

#### Developer Laptop (Development)

**When to use**: Local development, testing, one-off analysis, debugging

**Flow**:

1. Developer opens PowerShell terminal
2. Sources `parameters_project.R` to set PROJECT, dates, etc.
3. Manually calls `Rscript` or `python` for each stage
4. Waits for completion (synchronous)
5. Reviews output in terminal
6. Proceeds to next stage or adjusts parameters
7. Final outputs saved to `laravel_app/storage/app/{PROJECT}/`

**Key Characteristics**:

- **Synchronous execution**: Developer sees immediate output/errors
- **Manual control**: Run individual stages, skip stages, rerun stages
- **No job queue**: Direct shell execution
- **Simple setup**: Just R, Python, and terminal
- **Single user**: Developer's machine

**Infrastructure**:

- R 4.4.0+
- Python 3.9+
- PowerShell (Windows) or Bash (Linux/Mac)
- Text editor or RStudio

---

## Decision Matrix: Which Execution Model?

```
┌─────────────────────────────────────────────────────────────────────────┐
│ Scenario                                     │ SOBIT    │ Dev Laptop    │
├──────────────────────────────────────────────┼──────────┼───────────────┤
│ Production farm management platform          │    ✅    │      ❌       │
│ Scheduled weekly pipeline (Monday 5am)       │    ✅    │ ❌ (need cron)│
│ Multi-user concurrent projects               │    ✅    │      ❌       │
│ New feature development & debugging          │    ❌    │      ✅       │
│ Testing on specific date range               │    ❌    │      ✅       │
│ Ad-hoc "regenerate this week's report"       │    ❌    │      ✅       │
│ CI/CD pipeline (automated testing)           │    ✅    │ ✅ (both OK)  │
│ One-off analysis for farm manager            │ ⚠️ (OK)  │      ✅       │
│ Minimal setup (no server)                    │    ❌    │      ✅       │
│ Persistent monitoring & alerting             │    ✅    │      ❌       │
│ Educational demo for agronomist              │    ❌    │      ✅       │
└──────────────────────────────────────────────┴──────────┴───────────────┘
```

---

## System Integration: Data Flow with Client Types & Execution

Here's how all three dimensions interact in a complete workflow:

### Scenario 1: SOBIT Production (Angata, Cane Supply)

```
1. PIPELINE DIMENSION
   User clicks "Generate Weekly Report" in web UI
   ↓
2. EXECUTION DIMENSION (SOBIT)
   Laravel ProjectReportGeneratorJob dispatched to queue
   ↓
3. CLIENT TYPE DIMENSION
   parameters_project.R loaded with PROJECT="angata"
   CLIENT_TYPE = CLIENT_TYPE_MAP[["angata"]] = "cane_supply"
   ↓
4. PIPELINE STAGES
   Stage 80 runs:
   - source("80_utils_cane_supply.R")  ← Client-specific utilities
   - Calculate 4 KPIs (acreage, phase, harvest readiness, stress)
   - Save RDS + Excel
   ↓
5. STAGE 91 REPORT
   rmarkdown::render("91_CI_report_with_kpis_cane_supply.Rmd")
   ↓
6. OUTPUT
   SmartCane_Report_cane_supply_angata_week07_2026.docx
   (Per-field status, harvest alerts, tonnage forecast)
   ↓
7. USER NOTIFICATION
   Dashboard shows report ready; email sent to mill manager
```

### Scenario 2: Dev Laptop Testing (Aura, Agronomic Support)

```
1. EXECUTION DIMENSION (Dev Laptop)
   Developer opens PowerShell, sets:
   $PROJECT = "aura"
   $END_DATE = "2026-02-19"
   ↓
2. CLIENT TYPE DIMENSION
   Developer manually checks parameters_project.R:
   CLIENT_TYPE_MAP[["aura"]] = "agronomic_support"
   ↓
3. PIPELINE STAGES (Manual Run)
   $ & $R_EXE r_app/80_calculate_kpis.R 2026-02-19 aura 7
   ↓
4. STAGE 80 EXECUTION
   parameters_project.R loaded with PROJECT="aura"
   source("80_utils_agronomic_support.R")  ← Client-specific utilities
   Calculate 6 KPIs (uniformity, area change, TCH, growth decline, weeds, gap fill)
   ↓
5. STAGE 90 REPORT (Manual)
   $ & $R_EXE -e "rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd',
       params=list(data_dir='aura', report_date=as.Date('2026-02-19')), ...)"
   ↓
6. OUTPUT
   SmartCane_Report_agronomic_support_aura_week07_2026.docx
   (Farm-level KPI averages, uniformity trends, spatial analysis)
   ↓
7. DEVELOPER REVIEW
   Opens .docx file locally, reviews; makes adjustments to plotting code if needed
```

---

## Data Dependency Map: Python ↔ R Integration

This table shows which R stages depend on Python outputs and third-party inputs.

| R Stage | Depends On | Input File | From | Notes |
|---------|------------|------------|------|-------|
| **10** | Python 00 | `merged_tif/{DATE}.tif` | Stage 00 | 4-band daily TIFF |
| **20** | Stage 10 | `field_tiles/{FIELD}/{DATE}.tif` | Stage 10 | 4-band per-field daily |
| **20** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **30** | Stage 20 | `combined_CI_data.rds` | Stage 20 | Wide format CI data |
| **30** | External | `Data/harvest.xlsx` | User upload | Harvest dates (optional for agronomic_support, REQUIRED for cane_supply) |
| **40** | Stage 20 | `field_tiles_CI/{FIELD}/{DATE}.tif` | Stage 20 | 5-band daily per-field |
| **80** | Stage 40 | `weekly_mosaic/{FIELD}/week_*.tif` | Stage 40 | 5-band weekly per-field |
| **80** | Stage 30 | `All_pivots_Cumulative_CI_*.rds` | Stage 30 | Interpolated growth model |
| **80** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **80** | External | `Data/harvest.xlsx` | User upload | Harvest dates (REQUIRED for cane_supply) |
| **80** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest probability (optional, improves cane_supply KPI) |
| **90** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **90** | Stage 20 | `combined_CI_data.rds` | Stage 20 | For trend plots |
| **91** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **91** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest readiness probabilities |

---

## Configuration Checklist: Before Running Pipeline

### Universal Requirements (All Projects)

- [ ] `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` exists and is valid
- [ ] At least one satellite TIFF in `merged_tif/` (or Stage 00 will download)
- [ ] R packages installed: `Rscript r_app/package_manager.R` (one-time)
- [ ] Python dependencies installed if running Stage 00

### Agronomic Support Projects (AURA type)

- [ ] `parameters_project.R` maps project to "agronomic_support"
- [ ] (Optional) `harvest.xlsx` for better season grouping

### Cane Supply Projects (ANGATA type)

- [ ] `parameters_project.R` maps project to "cane_supply"
- [ ] ✅ **REQUIRED**: `harvest.xlsx` with planting/harvest dates (Stage 30 and Stage 80 need it)
- [ ] (Optional) `harvest_imminent_weekly.csv` from Python 31 for better harvest predictions

---

## Output Files Reference

### By Client Type

#### Agronomic Support (AURA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_agronomic_support_*.docx` | Stage 90 | Farm KPIs, uniformity trends, spatial maps |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (6 KPIs per field) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 90 rendering |

#### Cane Supply (ANGATA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_cane_supply_*.docx` | Stage 91 | Harvest alerts, phase assignment, tonnage forecast |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (4 KPIs + harvest data) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 91 rendering |

### By Execution Model

#### SOBIT Server

- Reports auto-saved to `laravel_app/storage/app/{PROJECT}/reports/`
- User downloads via web dashboard
- Optional email delivery (configured in Laravel)

#### Dev Laptop

- Reports saved to same location (same Laravel storage directory)
- Developer manually opens files
- Can share via USB/cloud if needed

---

## Troubleshooting Decision Tree

```
Pipeline Error?
├─ Stage 00 (Python download) fails
│  ├─ Auth error → Check Planet API credentials
│  ├─ Network → Check internet connection
│  └─ Cloud cover → That date is too cloudy; try different date
│
├─ Stage 10–40 fails → Check prerequisites
│  ├─ File not found → Run previous stage first
│  ├─ pivot.geojson invalid → Repair GeoJSON geometry
│  └─ GDAL error → Check TIFF file integrity
│
├─ Stage 80 fails
│  ├─ harvest.xlsx missing (cane_supply) → REQUIRED; upload file
│  ├─ combined_CI_data.rds missing → Run Stage 20 first
│  ├─ Client type unknown → Check parameters_project.R mapping
│  └─ KPI calculation error → Stage-specific utility function bug
│
├─ Stage 90/91 fails
│  ├─ RMarkdown not found → Check r_app/90_*.Rmd or 91_*.Rmd path
│  ├─ Missing params → Check data_dir, report_date params
│  └─ RMarkdown error → Check knitr output; may need renv reinstall
│
└─ SOBIT-specific error
   ├─ Job stuck in queue → Check job table; restart worker
   ├─ Shell script permission denied → chmod +x *.sh
   └─ r_app directory not found → Update shell scripts with correct paths
```

---

## Glossary

| Term | Definition |
|------|-----------|
| **Canopy Index (CI)** | Normalized vegetation index: (NIR/Green) - 1; ~1.0–2.0 range |
| **LOESS** | Locally Estimated Scatterplot Smoothing; statistical interpolation method |
| **Coefficient of Variation (CV)** | Standard deviation ÷ mean; measure of field uniformity |
| **Moran's I** | Spatial autocorrelation metric; detects clustered anomalies |
| **Phase Assignment** | Growth stage classification (germination, tillering, grand growth, maturation) |
| **Harvest Readiness** | Probability that a field is mature and ready for harvest (0.0–1.0) |
| **MAX Composite** | Weekly mosaic using maximum CI pixel across the week; handles clouds |
| **Job Queue** | Async task management system (Laravel); decouples web UI from long computations |
| **RMarkdown** | Dynamic document format; combines R code + text → Word or HTML output |

---

## Key Files Summary

| File | Purpose | Edited By |
|------|---------|-----------|
| `r_app/parameters_project.R` | Central config; project → client type mapping | Dev / DevOps |
| `r_app/00_common_utils.R` | Shared utility functions | Development |
| `r_app/80_utils_agronomic_support.R` | Agronomic KPI calculations | Development (agronomic_support) |
| `r_app/80_utils_cane_supply.R` | Cane supply KPI calculations | Development (cane_supply) |
| `r_app/90_CI_report_with_kpis_agronomic_support.Rmd` | Agronomic report template | Development (agronomic_support) |
| `r_app/91_CI_report_with_kpis_cane_supply.Rmd` | Cane supply report template | Development (cane_supply) |
| `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` | Field boundaries | GIS / User upload |
| `laravel_app/storage/app/{PROJECT}/Data/harvest.xlsx` | Harvest calendar | User upload (required for cane_supply) |

---

## Next Steps

1. **Understand your deployment**: Are you on SOBIT server or local dev laptop?
   - SOBIT → See [SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)
   - Dev Laptop → See [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)
2. **Understand your client type**: Are you managing agronomic advisory (AURA) or harvest operations (ANGATA)?
   - AURA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)
   - ANGATA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)
3. **Understand the data flow**: How does data transform through 8 pipeline stages?
   - See [ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)
4. **Run your first pipeline**: Choose the developer laptop model and follow the commands in [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md).
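
Before that first run, it can help to check mechanically which stages a project will go through and whether its required inputs are in place. The following is a minimal Python sketch, not part of the SmartCane codebase: the dictionary mirrors `CLIENT_TYPE_MAP` as quoted from `r_app/parameters_project.R` in this guide, while `stage_plan`, `missing_inputs`, and the `storage_root` default are hypothetical names introduced for illustration. It only checks that files exist; it does not validate GeoJSON geometry or spreadsheet contents.

```python
from pathlib import Path

# Mirrors CLIENT_TYPE_MAP from r_app/parameters_project.R as quoted in this guide.
CLIENT_TYPE_MAP = {
    "angata": "cane_supply",
    "chemba": "agronomic_support",
    "xinavane": "agronomic_support",
    "esa": "agronomic_support",
    "simba": "agronomic_support",
    "aura": "agronomic_support",
}

# Stages every project runs. Stage 80 runs for all projects but sources a
# client-specific utility script, so only the report stage ID differs.
COMMON_STAGE_IDS = ["00", "10", "20", "30", "40", "80"]
REPORT_STAGE_ID = {"agronomic_support": "90", "cane_supply": "91"}


def stage_plan(project):
    """Return the ordered stage IDs for a project (hypothetical helper)."""
    client_type = CLIENT_TYPE_MAP.get(project)
    if client_type is None:
        raise KeyError(f"Unknown project {project!r}; add it to CLIENT_TYPE_MAP")
    return COMMON_STAGE_IDS + [REPORT_STAGE_ID[client_type]]


def missing_inputs(project, storage_root="laravel_app/storage/app"):
    """List required input files that do not exist for this project.

    pivot.geojson is required for every project; harvest.xlsx is required
    only for cane_supply projects. Existence is checked, validity is not.
    """
    client_type = CLIENT_TYPE_MAP[project]
    data_dir = Path(storage_root) / project / "Data"
    required = [data_dir / "pivot.geojson"]
    if client_type == "cane_supply":
        required.append(data_dir / "harvest.xlsx")
    return [str(path) for path in required if not path.is_file()]


if __name__ == "__main__":
    for name in ("aura", "angata"):
        print(name, stage_plan(name), missing_inputs(name))
```

Run from the repository root, the script prints each project's stage plan followed by any required files that are absent. An empty list means the existence checks from the configuration checklist pass for that project.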