# SmartCane Architecture Integration Guide
This document ties together all SmartCane architecture components: the unified data pipeline, client type branching, and two execution models (SOBIT production vs developer laptop).
## Quick Navigation
- **[ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)**: Complete Stage 00-91 pipeline with data transformations, file formats, and storage locations
- **[CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)**: How agronomic_support (AURA) and cane_supply (ANGATA) differ in KPIs, reports, and requirements
- **[SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)**: Production server execution via Laravel job queue, web UI, and shell wrappers
- **[DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)**: Manual PowerShell execution for developer testing and one-off analysis
---
## System Architecture: Three Dimensions
SmartCane's architecture is defined by three orthogonal dimensions:
```
┌────────────────────────────────────────────────────────────┐
│ SmartCane Dimensions │
├────────────────────────────────────────────────────────────┤
│ │
│ 1. PIPELINE STAGES (Data Flow) │
│ Stage 00 (Python) ──> Stage 10 ──> Stage 20 ──> ... ──> Stage 91 (Output)
│ │
│ 2. CLIENT TYPES (Business Logic) │
│ Agronomic Support (AURA) vs Cane Supply (ANGATA) │
│ ↓ │
│ Different KPIs, data requirements, report formats │
│ │
│ 3. EXECUTION MODELS (Deployment) │
│ SOBIT Server (Job Queue) vs Dev Laptop (PowerShell) │
│ ↓ │
│ Different execution paths, error handling, monitoring │
│ │
└────────────────────────────────────────────────────────────┘
```
### Dimension 1: Pipeline Stages (Data Flow)
**Unified for all projects/clients**. Stages 00-40 are identical regardless of client type or execution model.
| Stage | Script | Client Type | Execution Model | Purpose |
|-------|--------|-------------|-----------------|---------|
| 00 | Python 00_download_*.py | All | Both | Download 4-band satellite imagery |
| 10 | R 10_create_per_field_tiffs.R | All | Both | Split merged TIFF into field tiles |
| 20 | R 20_ci_extraction_per_field.R | All | Both | Extract Canopy Index per pixel/field |
| 30 | R 30_interpolate_growth_model.R | All | Both | Smooth CI time series (LOESS) |
| 40 | R 40_mosaic_creation_per_field.R | All | Both | Create weekly MAX-composites |
| **80** | **R 80_calculate_kpis.R** | **Specific** | **Both** | **Calculate client-type KPIs** |
| **90** | **R 90_CI_report_*.Rmd** | **agronomic_support** | **Both** | **Render agronomic report** |
| **91** | **R 91_CI_report_*.Rmd** | **cane_supply** | **Both** | **Render cane client report** |
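
The stage table above implies a fixed stage sequence per client type: Stages 00 through 80 are shared, and only the report stage differs. A minimal Python sketch of that selection follows; `stages_for` and the two constants are hypothetical names used for illustration and are not part of the SmartCane code base.

```python
# Illustrative only: stage numbers are taken from the table above,
# the helper and constant names are assumptions.
SHARED_STAGES = ["00", "10", "20", "30", "40", "80"]
REPORT_STAGE = {"agronomic_support": "90", "cane_supply": "91"}


def stages_for(client_type: str) -> list[str]:
    """Return the ordered stage list for one client type."""
    if client_type not in REPORT_STAGE:
        raise ValueError(f"unknown client type: {client_type!r}")
    # Shared stages first, then the single client-specific report stage.
    return SHARED_STAGES + [REPORT_STAGE[client_type]]


if __name__ == "__main__":
    print(stages_for("cane_supply"))
```

Stage 80 appears in the shared list because the same script runs for every project; only its internal KPI logic branches on client type.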
---
### Dimension 2: Client Types (Business Logic)
**Diverges at Stage 80 for KPI calculation and reporting**.
#### Client Type Configuration
```r
# In r_app/parameters_project.R
CLIENT_TYPE_MAP <- list(
"angata" = "cane_supply", # Sugarcane operations
"chemba" = "agronomic_support", # Agronomic advisory
"xinavane" = "agronomic_support",
"esa" = "agronomic_support",
"simba" = "agronomic_support",
"aura" = "agronomic_support"
)
```
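
For tooling that lives outside R (for example a Python helper next to the Stage 00 download scripts), the same lookup could be mirrored as shown below. This is a sketch under assumptions: the dictionary copies the R mapping above, while `resolve_client_type` and its strict failure on unknown projects are hypothetical; the guide does not state how the R code handles an unmapped project.

```python
# Python mirror of CLIENT_TYPE_MAP from r_app/parameters_project.R.
# The function name and error behaviour are assumptions for illustration.
CLIENT_TYPE_MAP = {
    "angata": "cane_supply",
    "chemba": "agronomic_support",
    "xinavane": "agronomic_support",
    "esa": "agronomic_support",
    "simba": "agronomic_support",
    "aura": "agronomic_support",
}


def resolve_client_type(project: str) -> str:
    """Look up the client type for a project, failing loudly if unmapped."""
    try:
        return CLIENT_TYPE_MAP[project]
    except KeyError:
        known = ", ".join(sorted(CLIENT_TYPE_MAP))
        raise ValueError(
            f"project {project!r} has no client type; known projects: {known}"
        ) from None


if __name__ == "__main__":
    print(resolve_client_type("angata"))
```

Failing on an unmapped project, rather than silently defaulting, keeps a typo in the project name from producing the wrong report type.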
#### KPI Differences
| Aspect | **Agronomic Support (AURA)** | **Cane Supply (ANGATA)** |
|--------|------------------------------|------------------------|
| **Primary Audience** | Agronomist / farm consultant | Mill operations manager |
| **Key Question** | "Is the crop healthy? Are yields on track?" | "Which fields are ready to harvest this week?" |
| **Data Requirements** | pivot.geojson (required); harvest.xlsx (optional) | pivot.geojson + harvest.xlsx (both required) |
| **KPI Count** | 6 KPIs | 4 KPIs + harvest integration |
| **Utility Script** | `80_utils_agronomic_support.R` | `80_utils_cane_supply.R` |
| **Report Script** | `90_CI_report_with_kpis_agronomic_support.Rmd` | `91_CI_report_with_kpis_cane_supply.Rmd` |
| **Harvest Integration** | Minimal (season grouping only) | Central (harvest readiness, phase, tonnage) |
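
The script names and KPI counts in the table above can be read as a small dispatch table keyed by client type. The sketch below encodes exactly those values; `ClientProfile`, `PROFILES`, and `profile_for` are hypothetical names, and the real Stage 80 script may select its utilities differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientProfile:
    """Per-client-type settings copied from the KPI Differences table."""
    utility_script: str
    report_template: str
    kpi_count: int


# Values come from the table above; the structure itself is an assumption.
PROFILES = {
    "agronomic_support": ClientProfile(
        utility_script="80_utils_agronomic_support.R",
        report_template="90_CI_report_with_kpis_agronomic_support.Rmd",
        kpi_count=6,
    ),
    "cane_supply": ClientProfile(
        utility_script="80_utils_cane_supply.R",
        report_template="91_CI_report_with_kpis_cane_supply.Rmd",
        kpi_count=4,
    ),
}


def profile_for(client_type: str) -> ClientProfile:
    """Return the profile for a client type or raise if it is unknown."""
    if client_type not in PROFILES:
        raise ValueError(f"unknown client type: {client_type!r}")
    return PROFILES[client_type]


if __name__ == "__main__":
    print(profile_for("cane_supply").report_template)
```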
#### When to Switch Client Types
| Scenario | Action |
|----------|--------|
| **New project** launched | Add to `CLIENT_TYPE_MAP` in parameters_project.R; create pivot.geojson + harvest.xlsx as needed |
| **Aura switching** to harvest-focused operations | Change mapping: "aura" → "cane_supply"; ensure harvest.xlsx exists |
| **Testing** both client types on same project | Update mapping; re-run Stages 80-91 |
---
### Dimension 3: Execution Models (Deployment)
**Orthogonal to pipeline stages and client types**. Both SOBIT and dev laptop can run any stage for any client type.
#### SOBIT Server (Production)
**When to use**: Production deployment, scheduled runs, multi-user farm management platform
**Flow**:
1. User clicks button in web UI (Laravel controller)
2. Job dispatched to queue (database or Redis)
3. Background queue worker picks up job
4. Shell script wrapper executed
5. Python/R script runs
6. Results stored in `laravel_app/storage/app/{PROJECT}/`
7. Next job automatically dispatched (chaining)
8. User monitors progress via dashboard
**Key Characteristics**:
- **Async execution**: Long-running stages don't block web requests
- **Job chaining**: Automatic pipeline orchestration (00 → 10 → 20 → ... → 91)
- **Error handling**: Failed jobs logged; retries configurable
- **Monitoring**: Web dashboard shows job history, results, logs
- **Multi-user**: Multiple projects can run concurrently
**Infrastructure**:
- Laravel application server (PHP)
- Queue backend (database or Redis)
- Supervisor daemon (manages queue workers)
- Cron for scheduled pipeline runs
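
The chaining behaviour described above (each stage dispatches the next, failures are logged, retries are configurable) can be simulated in a few lines. This is not Laravel code and does not use any Laravel API; it is a plain Python sketch of the control flow, with `run_chain`, `run_stage`, and `max_attempts` as hypothetical names. It assumes that a stage which exhausts its attempts stops the chain.

```python
from typing import Callable


def run_chain(
    stages: list[str],
    run_stage: Callable[[str], bool],
    max_attempts: int = 1,
) -> list[tuple[str, int, str]]:
    """Run stages in order; stop when a stage fails all of its attempts.

    Returns a history of (stage, attempt, status) entries, similar in
    spirit to a job log.
    """
    history = []
    for stage in stages:
        for attempt in range(1, max_attempts + 1):
            ok = run_stage(stage)
            history.append((stage, attempt, "ok" if ok else "failed"))
            if ok:
                break  # stage succeeded; move on to the next one
        else:
            # Every attempt failed: later stages are never dispatched.
            return history
    return history


if __name__ == "__main__":
    # Hypothetical runner in which Stage 20 always fails.
    log = run_chain(["00", "10", "20", "30"], lambda s: s != "20", max_attempts=2)
    for entry in log:
        print(entry)
```

In the example run, Stage 30 never appears in the log because Stage 20 failed both attempts.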
---
#### Developer Laptop (Development)
**When to use**: Local development, testing, one-off analysis, debugging
**Flow**:
1. Developer opens PowerShell terminal
2. Sources `parameters_project.R` to set PROJECT, dates, etc.
3. Manually calls `Rscript` or `python` for each stage
4. Waits for completion (synchronous)
5. Reviews output in terminal
6. Proceeds to next stage or adjusts parameters
7. Final outputs saved to `laravel_app/storage/app/{PROJECT}/`
**Key Characteristics**:
- **Synchronous execution**: Developer sees immediate output/errors
- **Manual control**: Run individual stages, skip stages, rerun stages
- **No job queue**: Direct shell execution
- **Simple setup**: Just R, Python, and terminal
- **Single user**: Developer's machine
**Infrastructure**:
- R 4.4.0+
- Python 3.9+
- PowerShell (Windows) or Bash (Linux/Mac)
- Text editor or RStudio
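
A developer who prefers scripting the manual steps could wrap them in a small synchronous runner. The sketch below builds the Stage 80 command in the same argument order as the Stage 80 invocation shown in Scenario 2 of this guide (end date, project, then a trailing number). The function names, the `rscript` default, and the parameter name `offset` are assumptions; this guide does not define what the trailing number means.

```python
import subprocess


def build_stage80_command(
    end_date: str,
    project: str,
    offset: int,
    rscript: str = "Rscript",
) -> list[str]:
    """Build the argv list for Stage 80 (argument order follows the guide's example)."""
    return [rscript, "r_app/80_calculate_kpis.R", end_date, project, str(offset)]


def run_sync(command: list[str]) -> None:
    """Run one stage and block until it finishes; raises on a non-zero exit."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    cmd = build_stage80_command("2026-02-19", "aura", 7)
    print(" ".join(cmd))
    # run_sync(cmd)  # uncomment on a machine with R and the repository checked out
```

Passing the command as a list avoids shell quoting issues on both PowerShell and Bash.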
---
## Decision Matrix: Which Execution Model?
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Scenario │ SOBIT │ Dev Laptop │
├──────────────────────────────────────────────┼──────────┼───────────────┤
│ Production farm management platform │ ✅ │ ❌ │
│ Scheduled weekly pipeline (Monday 5am) │ ✅ │ ❌ (need cron)│
│ Multi-user concurrent projects │ ✅ │ ❌ │
│ New feature development & debugging │ ❌ │ ✅ │
│ Testing on specific date range │ ❌ │ ✅ │
│ Ad-hoc "regenerate this week's report" │ ❌ │ ✅ │
│ CI/CD pipeline (automated testing) │ ✅ │ ✅ (both OK) │
│ One-off analysis for farm manager │ ⚠️ (OK) │ ✅ │
│ Minimal setup (no server) │ ❌ │ ✅ │
│ Persistent monitoring & alerting │ ✅ │ ❌ │
│ Educational demo for agronomist │ ❌ │ ✅ │
└──────────────────────────────────────────────┴──────────┴───────────────┘
```
---
## System Integration: Data Flow with Client Types & Execution
Here's how all three dimensions interact in a complete workflow:
### Scenario 1: SOBIT Production (Angata, Cane Supply)
```
1. PIPELINE DIMENSION
User clicks "Generate Weekly Report" in web UI
2. EXECUTION DIMENSION (SOBIT)
Laravel ProjectReportGeneratorJob dispatched to queue
3. CLIENT TYPE DIMENSION
parameters_project.R loaded with PROJECT="angata"
CLIENT_TYPE = CLIENT_TYPE_MAP[["angata"]] = "cane_supply"
4. PIPELINE STAGES
Stage 80 runs:
- source("80_utils_cane_supply.R") ← Client-specific utilities
- Calculate 4 KPIs (acreage, phase, harvest readiness, stress)
- Save RDS + Excel
5. STAGE 91 REPORT
rmarkdown::render("91_CI_report_with_kpis_cane_supply.Rmd")
6. OUTPUT
SmartCane_Report_cane_supply_angata_week07_2026.docx
(Per-field status, harvest alerts, tonnage forecast)
7. USER NOTIFICATION
Dashboard shows report ready; email sent to mill manager
```
### Scenario 2: Dev Laptop Testing (Aura, Agronomic Support)
```
1. EXECUTION DIMENSION (Dev Laptop)
Developer opens PowerShell, sets:
$PROJECT = "aura"
$END_DATE = "2026-02-19"
2. CLIENT TYPE DIMENSION
Developer manually checks parameters_project.R:
CLIENT_TYPE_MAP[["aura"]] = "agronomic_support"
3. PIPELINE STAGES (Manual Run)
$ & $R_EXE r_app/80_calculate_kpis.R 2026-02-19 aura 7
4. STAGE 80 EXECUTION
parameters_project.R loaded with PROJECT="aura"
source("80_utils_agronomic_support.R") ← Client-specific utilities
Calculate 6 KPIs (uniformity, area change, TCH, growth decline, weeds, gap fill)
5. STAGE 90 REPORT (Manual)
$ & $R_EXE -e "rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd',
params=list(data_dir='aura', report_date=as.Date('2026-02-19')), ...)"
6. OUTPUT
SmartCane_Report_agronomic_support_aura_week07_2026.docx
(Farm-level KPI averages, uniformity trends, spatial analysis)
7. DEVELOPER REVIEW
Opens .docx file locally, reviews; makes adjustments to plotting code if needed
```
---
## Data Dependency Map: Python ↔ R Integration
This table shows which R stages depend on Python outputs, earlier R stages, and user-uploaded input files.
| R Stage | Depends On | Input File | From | Notes |
|---------|------------|------------|------|-------|
| **10** | Python 00 | `merged_tif/{DATE}.tif` | Stage 00 | 4-band daily TIFF |
| **20** | Stage 10 | `field_tiles/{FIELD}/{DATE}.tif` | Stage 10 | 4-band per-field daily |
| **20** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **30** | Stage 20 | `combined_CI_data.rds` | Stage 20 | Wide format CI data |
| **30** | External | `Data/harvest.xlsx` | User upload | Harvest dates (optional for agronomic_support, REQUIRED for cane_supply) |
| **40** | Stage 20 | `field_tiles_CI/{FIELD}/{DATE}.tif` | Stage 20 | 5-band daily per-field |
| **80** | Stage 40 | `weekly_mosaic/{FIELD}/week_*.tif` | Stage 40 | 5-band weekly per-field |
| **80** | Stage 30 | `All_pivots_Cumulative_CI_*.rds` | Stage 30 | Interpolated growth model |
| **80** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **80** | External | `Data/harvest.xlsx` | User upload | Harvest dates (REQUIRED for cane_supply) |
| **80** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest probability (optional, improves cane_supply KPI) |
| **90** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **90** | Stage 20 | `combined_CI_data.rds` | Stage 20 | For trend plots |
| **91** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **91** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest readiness probabilities |
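
The stage-to-stage rows of the table above form a dependency graph, so the stages needed for any target can be derived rather than memorised. The following sketch uses Python's standard `graphlib` module (Python 3.9+). The graph includes only the required stage dependencies from the table; user uploads and the Python 31 output are left out, and `DEPENDS_ON` and `run_order` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Stage -> stages it reads from, taken from the dependency table above.
DEPENDS_ON = {
    "00": set(),
    "10": {"00"},
    "20": {"10"},
    "30": {"20"},
    "40": {"20"},
    "80": {"30", "40"},
    "90": {"80", "20"},
    "91": {"80"},
}


def run_order(target: str) -> list[str]:
    """Return every stage required for `target`, in a valid execution order."""
    needed = set()
    stack = [target]
    while stack:  # collect the target and all of its transitive dependencies
        stage = stack.pop()
        if stage not in needed:
            needed.add(stage)
            stack.extend(DEPENDS_ON[stage])
    subgraph = {stage: DEPENDS_ON[stage] for stage in needed}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    print(run_order("91"))
```

Stages 30 and 40 both depend only on Stage 20, so their relative order in the result may vary; either order is valid.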
---
## Configuration Checklist: Before Running Pipeline
### Universal Requirements (All Projects)
- [ ] `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` exists and is valid
- [ ] At least one satellite TIFF in `merged_tif/` (or Stage 00 will download)
- [ ] R packages installed: `Rscript r_app/package_manager.R` (one-time)
- [ ] Python dependencies installed if running Stage 00
### Agronomic Support Projects (AURA type)
- [ ] `parameters_project.R` maps project to "agronomic_support"
- [ ] (Optional) `harvest.xlsx` for better season grouping
### Cane Supply Projects (ANGATA type)
- [ ] `parameters_project.R` maps project to "cane_supply"
- [ ] **REQUIRED**: `harvest.xlsx` with planting/harvest dates (Stage 30 and Stage 80 need it)
- [ ] (Optional) `harvest_imminent_weekly.csv` from Python 31 for better harvest predictions
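
The file-related items in the checklists above lend themselves to an automated pre-flight check. The sketch below tests only for the presence of the required input files per client type; it does not validate GeoJSON geometry or spreadsheet contents. `REQUIRED_FILES` and `missing_inputs` are hypothetical names, and the caller is assumed to pass the project's storage directory (`laravel_app/storage/app/{PROJECT}`).

```python
from pathlib import Path

# Required inputs per client type, relative to the project storage directory.
REQUIRED_FILES = {
    "agronomic_support": ["Data/pivot.geojson"],
    "cane_supply": ["Data/pivot.geojson", "Data/harvest.xlsx"],
}


def missing_inputs(project_dir: str, client_type: str) -> list[str]:
    """Return the required files that are absent; an empty list means ready."""
    if client_type not in REQUIRED_FILES:
        raise ValueError(f"unknown client type: {client_type!r}")
    root = Path(project_dir)
    return [rel for rel in REQUIRED_FILES[client_type] if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_inputs("laravel_app/storage/app/angata", "cane_supply")
    print("ready" if not missing else f"missing: {missing}")
```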
---
## Output Files Reference
### By Client Type
#### Agronomic Support (AURA-type projects)
| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_agronomic_support_*.docx` | Stage 90 | Farm KPIs, uniformity trends, spatial maps |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (6 KPIs per field) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 90 rendering |
#### Cane Supply (ANGATA-type projects)
| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_cane_supply_*.docx` | Stage 91 | Harvest alerts, phase assignment, tonnage forecast |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (4 KPIs + harvest data) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 91 rendering |
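
The report names in both tables follow one pattern, visible in the example `SmartCane_Report_cane_supply_angata_week07_2026.docx`. A small helper can reproduce it. In this sketch the week number is passed in explicitly, because this guide does not state which week-numbering convention the pipeline uses; `report_filename` is a hypothetical name.

```python
def report_filename(client_type: str, project: str, week: int, year: int) -> str:
    """Build a report file name in the pattern used by the examples in this guide."""
    if not 0 <= week <= 53:
        raise ValueError(f"week out of range: {week}")
    # The week is zero-padded to two digits, as in "week07".
    return f"SmartCane_Report_{client_type}_{project}_week{week:02d}_{year}.docx"


if __name__ == "__main__":
    print(report_filename("cane_supply", "angata", 7, 2026))
```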
### By Execution Model
#### SOBIT Server
- Reports auto-saved to `laravel_app/storage/app/{PROJECT}/reports/`
- User downloads via web dashboard
- Optional email delivery (configured in Laravel)
#### Dev Laptop
- Reports saved to same location (same Laravel storage directory)
- Developer manually opens files
- Can share via USB/cloud if needed
---
## Troubleshooting Decision Tree
```
Pipeline Error?
├─ Stage 00 (Python download) fails
│ ├─ Auth error → Check Planet API credentials
│ ├─ Network → Check internet connection
│ └─ Cloud cover → That date is too cloudy; try different date
├─ Stage 10-40 fails → Check prerequisites
│ ├─ File not found → Run previous stage first
│ ├─ pivot.geojson invalid → Repair GeoJSON geometry
│ └─ GDAL error → Check TIFF file integrity
├─ Stage 80 fails
│ ├─ harvest.xlsx missing (cane_supply) → REQUIRED; upload file
│ ├─ combined_CI_data.rds missing → Run Stage 20 first
│ ├─ Client type unknown → Check parameters_project.R mapping
│ └─ KPI calculation error → Stage-specific utility function bug
├─ Stage 90/91 fails
│ ├─ RMarkdown not found → Check r_app/90_*.Rmd or 91_*.Rmd path
│ ├─ Missing params → Check data_dir, report_date params
│ └─ RMarkdown error → Check knitr output; may need renv reinstall
└─ SOBIT-specific error
├─ Job stuck in queue → Check job table; restart worker
├─ Shell script permission denied → chmod +x *.sh
└─ r_app directory not found → Update shell scripts with correct paths
```
---
## Glossary
| Term | Definition |
|------|-----------|
| **Canopy Index (CI)** | Normalized vegetation index: (NIR/Green) - 1; ~1.0-2.0 range |
| **LOESS** | Locally Estimated Scatterplot Smoothing; statistical interpolation method |
| **Coefficient of Variation (CV)** | Standard deviation ÷ mean; measure of field uniformity |
| **Moran's I** | Spatial autocorrelation metric; detects clustered anomalies |
| **Phase Assignment** | Growth stage classification (germination, tillering, grand growth, maturation) |
| **Harvest Readiness** | Probability field is mature and ready for harvest (0.0-1.0) |
| **MAX Composite** | Weekly mosaic using maximum CI pixel across the week; handles clouds |
| **Job Queue** | Async task management system (Laravel); decouples web UI from long computations |
| **RMarkdown** | Dynamic document format; combines R code and text, rendered to Word or HTML output |
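
Three of the glossary definitions above are simple enough to state as code. The sketch below works on plain Python lists rather than rasters and is meant only to make the definitions concrete. The function names are hypothetical, the use of population (not sample) standard deviation for CV is an assumption, and cloud-masked pixels are represented as `None`.

```python
from statistics import mean, pstdev
from typing import Optional


def canopy_index(nir: float, green: float) -> float:
    """CI = (NIR / Green) - 1, as defined in the glossary."""
    return nir / green - 1.0


def coefficient_of_variation(values: list[float]) -> float:
    """CV = standard deviation / mean (population std is an assumption here)."""
    return pstdev(values) / mean(values)


def max_composite(daily_ci: list[list[Optional[float]]]) -> list[Optional[float]]:
    """Per-pixel maximum CI across the days of a week, skipping masked (None) values."""
    composite = []
    for pixel_series in zip(*daily_ci):  # one tuple per pixel position
        valid = [v for v in pixel_series if v is not None]
        composite.append(max(valid) if valid else None)
    return composite


if __name__ == "__main__":
    week = [[1.0, 1.2, None], [1.4, None, None]]
    print(max_composite(week))
    print(round(coefficient_of_variation([1.0, 1.2, 1.4]), 3))
```

A pixel that is masked on every day of the week stays `None` in the composite, which is why the MAX composite reduces but does not eliminate cloud gaps.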
---
## Key Files Summary
| File | Purpose | Edited By |
|------|---------|-----------|
| `r_app/parameters_project.R` | Central config; project → client type mapping | Dev / DevOps |
| `r_app/00_common_utils.R` | Shared utility functions | Development |
| `r_app/80_utils_agronomic_support.R` | Agronomic KPI calculations | Development (agronomic_support) |
| `r_app/80_utils_cane_supply.R` | Cane supply KPI calculations | Development (cane_supply) |
| `r_app/90_CI_report_with_kpis_agronomic_support.Rmd` | Agronomic report template | Development (agronomic_support) |
| `r_app/91_CI_report_with_kpis_cane_supply.Rmd` | Cane supply report template | Development (cane_supply) |
| `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` | Field boundaries | GIS / User upload |
| `laravel_app/storage/app/{PROJECT}/Data/harvest.xlsx` | Harvest calendar | User upload (required for cane_supply) |
---
## Next Steps
1. **Understand your deployment**: Are you on SOBIT server or local dev laptop?
- SOBIT → See [SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)
- Dev Laptop → See [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)
2. **Understand your client type**: Are you managing agronomic advisory (AURA) or harvest operations (ANGATA)?
- AURA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)
- ANGATA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)
3. **Understand the data flow**: How does data transform through 8 pipeline stages?
- See [ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)
4. **Run your first pipeline**: Choose development laptop and follow [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md) commands.