# SmartCane Architecture Integration Guide

This document ties together all SmartCane architecture components: the unified data pipeline, client type branching, and two execution models (SOBIT production vs developer laptop).

## Quick Navigation

- **[ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)**: Complete Stage 00–91 pipeline with data transformations, file formats, and storage locations
- **[CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)**: How agronomic_support (AURA) and cane_supply (ANGATA) differ in KPIs, reports, and requirements
- **[SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)**: Production server execution via Laravel job queue, web UI, and shell wrappers
- **[DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)**: Manual PowerShell execution for developer testing and one-off analysis

---

## System Architecture: Three Dimensions

SmartCane's architecture is defined by three orthogonal dimensions:

```
┌────────────────────────────────────────────────────────────┐
│                   SmartCane Dimensions                     │
├────────────────────────────────────────────────────────────┤
│                                                             │
│  1. PIPELINE STAGES (Data Flow)                            │
│     Stage 00 (Python) ─> 10 ─> 20 ─> ... ─> 91 (Output)    │
│                                                             │
│  2. CLIENT TYPES (Business Logic)                          │
│     Agronomic Support (AURA)  vs  Cane Supply (ANGATA)    │
│     ↓                                                       │
│     Different KPIs, data requirements, report formats      │
│                                                             │
│  3. EXECUTION MODELS (Deployment)                          │
│     SOBIT Server (Job Queue)  vs  Dev Laptop (PowerShell)  │
│     ↓                                                       │
│     Different execution paths, error handling, monitoring   │
│                                                             │
└────────────────────────────────────────────────────────────┘
```

### Dimension 1: Pipeline Stages (Data Flow)

**Unified for all projects/clients**. Stages 00–40 run the same scripts regardless of client type or execution model; only the inputs differ (Stage 30 needs `harvest.xlsx` for cane_supply projects).

| Stage | Script | Client Type | Execution Model | Purpose |
|-------|--------|-------------|-----------------|---------|
| 00 | Python 00_download_*.py | All | Both | Download 4-band satellite imagery |
| 10 | R 10_create_per_field_tiffs.R | All | Both | Split merged TIFF into field tiles |
| 20 | R 20_ci_extraction_per_field.R | All | Both | Extract Canopy Index per pixel/field |
| 30 | R 30_interpolate_growth_model.R | All | Both | Smooth CI time series (LOESS) |
| 40 | R 40_mosaic_creation_per_field.R | All | Both | Create weekly MAX-composites |
| **80** | **R 80_calculate_kpis.R** | **Client-specific** | **Both** | **Calculate client-type KPIs** |
| **90** | **R 90_CI_report_*.Rmd** | **agronomic_support** | **Both** | **Render agronomic report** |
| **91** | **R 91_CI_report_*.Rmd** | **cane_supply** | **Both** | **Render cane client report** |
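
The stage table can be read as a small registry. The sketch below is illustrative only: the `Stage` class and `stages_for` helper are hypothetical names, not part of the SmartCane codebase, and the rows are copied from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    number: str                 # stage id as written in the table, e.g. "00"
    script: str                 # script name or pattern from the table
    client_type: Optional[str]  # None means the stage runs for every client type


# Rows copied from the stage table; Stage 80 runs for every client type,
# but branches internally on the client type.
STAGES = [
    Stage("00", "00_download_*.py", None),
    Stage("10", "10_create_per_field_tiffs.R", None),
    Stage("20", "20_ci_extraction_per_field.R", None),
    Stage("30", "30_interpolate_growth_model.R", None),
    Stage("40", "40_mosaic_creation_per_field.R", None),
    Stage("80", "80_calculate_kpis.R", None),
    Stage("90", "90_CI_report_*.Rmd", "agronomic_support"),
    Stage("91", "91_CI_report_*.Rmd", "cane_supply"),
]


def stages_for(client_type: str) -> list[str]:
    """Return the stage numbers that apply to one client type, in run order."""
    return [s.number for s in STAGES if s.client_type in (None, client_type)]


print(stages_for("agronomic_support"))  # ['00', '10', '20', '30', '40', '80', '90']
print(stages_for("cane_supply"))        # ['00', '10', '20', '30', '40', '80', '91']
```

Only the final report stage differs between the two lists, which is the sense in which the pipeline is unified up to Stage 80.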

---

### Dimension 2: Client Types (Business Logic)

**Diverges at Stage 80 for KPI calculation and reporting**.

#### Client Type Configuration

```r
# In r_app/parameters_project.R
CLIENT_TYPE_MAP <- list(
  "angata"     = "cane_supply",           # Sugarcane operations
  "chemba"     = "agronomic_support",     # Agronomic advisory
  "xinavane"   = "agronomic_support",
  "esa"        = "agronomic_support",
  "simba"      = "agronomic_support",
  "aura"       = "agronomic_support"
)
```

#### KPI Differences

| Aspect | **Agronomic Support (AURA)** | **Cane Supply (ANGATA)** |
|--------|------------------------------|------------------------|
| **Primary Audience** | Agronomist / farm consultant | Mill operations manager |
| **Key Question** | "Is the crop healthy? Are yields on track?" | "Which fields are ready to harvest this week?" |
| **Data Requirements** | pivot.geojson (required); harvest.xlsx (optional) | pivot.geojson + harvest.xlsx (both required) |
| **KPI Count** | 6 KPIs | 4 KPIs + harvest integration |
| **Utility Script** | `80_utils_agronomic_support.R` | `80_utils_cane_supply.R` |
| **Report Script** | `90_CI_report_with_kpis_agronomic_support.Rmd` | `91_CI_report_with_kpis_cane_supply.Rmd` |
| **Harvest Integration** | Minimal (season grouping only) | Central (harvest readiness, phase, tonnage) |
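
As a rough illustration of how the mapping and the table above combine, the Python sketch below resolves a project to its client type and then to the scripts and inputs listed in the table. The real configuration lives in `r_app/parameters_project.R`; `CLIENT_PROFILES` and `resolve_client` are hypothetical names introduced here for illustration.

```python
# Mirror of CLIENT_TYPE_MAP from r_app/parameters_project.R.
CLIENT_TYPE_MAP = {
    "angata": "cane_supply",
    "chemba": "agronomic_support",
    "xinavane": "agronomic_support",
    "esa": "agronomic_support",
    "simba": "agronomic_support",
    "aura": "agronomic_support",
}

# Values taken from the KPI Differences table and the configuration checklist.
CLIENT_PROFILES = {
    "agronomic_support": {
        "utility_script": "80_utils_agronomic_support.R",
        "report_script": "90_CI_report_with_kpis_agronomic_support.Rmd",
        "required_inputs": ["pivot.geojson"],
        "optional_inputs": ["harvest.xlsx"],
    },
    "cane_supply": {
        "utility_script": "80_utils_cane_supply.R",
        "report_script": "91_CI_report_with_kpis_cane_supply.Rmd",
        "required_inputs": ["pivot.geojson", "harvest.xlsx"],
        "optional_inputs": ["harvest_imminent_weekly.csv"],
    },
}


def resolve_client(project: str) -> dict:
    """Look up the client type for a project and return its profile."""
    if project not in CLIENT_TYPE_MAP:
        raise KeyError(f"Project {project!r} is not in CLIENT_TYPE_MAP")
    client_type = CLIENT_TYPE_MAP[project]
    return {"client_type": client_type, **CLIENT_PROFILES[client_type]}


profile = resolve_client("angata")
print(profile["client_type"], profile["report_script"])
```

Failing loudly on an unmapped project is a design choice of this sketch; it corresponds to the "Client type unknown" branch in the troubleshooting tree.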

#### When to Switch Client Types

| Scenario | Action |
|----------|--------|
| **New project** launched | Add to `CLIENT_TYPE_MAP` in parameters_project.R; create pivot.geojson + harvest.xlsx as needed |
| **Aura switching** to harvest-focused operations | Change mapping: "aura" → "cane_supply"; ensure harvest.xlsx exists |
| **Testing** both client types on same project | Update mapping; re-run Stages 80–91 |

---

### Dimension 3: Execution Models (Deployment)

**Orthogonal to pipeline stages and client types**. Both SOBIT and dev laptop can run any stage for any client type.

#### SOBIT Server (Production)

**When to use**: Production deployment, scheduled runs, multi-user farm management platform

**Flow**:
1. User clicks button in web UI (Laravel controller)
2. Job dispatched to queue (database or Redis)
3. Background queue worker picks up job
4. Shell script wrapper executed
5. Python/R script runs
6. Results stored in `laravel_app/storage/app/{PROJECT}/`
7. Next job automatically dispatched (chaining)
8. User monitors progress via dashboard

**Key Characteristics**:
- **Async execution**: Long-running stages don't block web requests
- **Job chaining**: Automatic pipeline orchestration (00 → 10 → 20 → ... → 91)
- **Error handling**: Failed jobs logged; retries configurable
- **Monitoring**: Web dashboard shows job history, results, logs
- **Multi-user**: Multiple projects can run concurrently

**Infrastructure**:
- Laravel application server (PHP)
- Queue backend (database or Redis)
- Supervisor daemon (manages queue workers)
- Cron for scheduled pipeline runs
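
The chaining and retry behaviour described above can be pictured with a small in-memory simulation. This is a sketch in Python, not the Laravel implementation: `run_chain`, `demo_stage`, and the retry policy are assumptions made for illustration, and a real deployment would rely on Laravel's own queue and job classes.

```python
from collections import deque
from typing import Callable


def run_chain(stages: list[str],
              run_stage: Callable[[str], bool],
              max_retries: int = 0) -> list[tuple[str, str, int]]:
    """Run stages from a FIFO queue; each success enqueues the next stage.

    Returns a log of (stage, status, attempts). A stage that still fails
    after its retries stops the chain, so later stages never run.
    """
    log = []
    queue = deque([0] if stages else [])  # holds the index of the next stage
    while queue:
        index = queue.popleft()
        attempts, ok = 0, False
        while not ok and attempts <= max_retries:
            attempts += 1
            ok = run_stage(stages[index])
        log.append((stages[index], "done" if ok else "failed", attempts))
        if ok and index + 1 < len(stages):
            queue.append(index + 1)  # job chaining: dispatch the next stage
    return log


def demo_stage(stage: str) -> bool:
    """Hypothetical stand-in for a shell wrapper; pretends Stage 20 fails."""
    return stage != "20"


for entry in run_chain(["00", "10", "20", "30"], demo_stage):
    print(entry)
```

In this run the log ends at the failed Stage 20 and Stage 30 is never started, which matches the idea that a failed job is logged and the chain halts until it is retried.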

---

#### Developer Laptop (Development)

**When to use**: Local development, testing, one-off analysis, debugging

**Flow**:
1. Developer opens PowerShell terminal
2. Sources `parameters_project.R` to set PROJECT, dates, etc.
3. Manually calls `Rscript` or `python` for each stage
4. Waits for completion (synchronous)
5. Reviews output in terminal
6. Proceeds to next stage or adjusts parameters
7. Final outputs saved to `laravel_app/storage/app/{PROJECT}/`

**Key Characteristics**:
- **Synchronous execution**: Developer sees immediate output/errors
- **Manual control**: Run individual stages, skip stages, rerun stages
- **No job queue**: Direct shell execution
- **Simple setup**: Just R, Python, and terminal
- **Single user**: Developer's machine

**Infrastructure**:
- R 4.4.0+
- Python 3.9+
- PowerShell (Windows) or Bash (Linux/Mac)
- Text editor or RStudio
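
For developers who prefer a script over typing each command, the manual flow can be approximated with a thin wrapper. The sketch below is an assumption-laden illustration: `build_stage_command` and `run_stage` are hypothetical helpers, the interpreter names `Rscript` and `python` are assumed to be on `PATH`, and the positional arguments are passed through unchanged in the same order as the Stage 80 command shown in Scenario 2.

```python
import subprocess
from pathlib import Path


def build_stage_command(script: str, *args: str,
                        rscript: str = "Rscript",
                        python: str = "python") -> list[str]:
    """Pick the interpreter from the file extension and append the arguments."""
    suffix = Path(script).suffix.lower()
    if suffix == ".r":
        return [rscript, script, *args]
    if suffix == ".py":
        return [python, script, *args]
    raise ValueError(f"Unsupported script type: {script}")


def run_stage(script: str, *args: str) -> None:
    """Run one stage synchronously; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_stage_command(script, *args), check=True)


# Build (but do not execute) the Stage 80 command from Scenario 2.
print(build_stage_command("r_app/80_calculate_kpis.R", "2026-02-19", "aura", "7"))
```

Using `check=True` keeps the synchronous, fail-fast behaviour of the manual workflow: a failing stage stops the script instead of silently continuing to the next one.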

---

## Decision Matrix: Which Execution Model?

| Scenario | SOBIT | Dev Laptop |
|----------|-------|------------|
| Production farm management platform | ✅ | ❌ |
| Scheduled weekly pipeline (Monday 5am) | ✅ | ❌ (needs cron) |
| Multi-user concurrent projects | ✅ | ❌ |
| New feature development & debugging | ❌ | ✅ |
| Testing on specific date range | ❌ | ✅ |
| Ad-hoc "regenerate this week's report" | ❌ | ✅ |
| CI/CD pipeline (automated testing) | ✅ | ✅ (both OK) |
| One-off analysis for farm manager | ⚠️ (OK) | ✅ |
| Minimal setup (no server) | ❌ | ✅ |
| Persistent monitoring & alerting | ✅ | ❌ |
| Educational demo for agronomist | ❌ | ✅ |

---

## System Integration: Data Flow with Client Types & Execution

Here's how all three dimensions interact in a complete workflow:

### Scenario 1: SOBIT Production (Angata, Cane Supply)

```
1. TRIGGER
   User clicks "Generate Weekly Report" in web UI
   ↓
2. EXECUTION DIMENSION (SOBIT)
   Laravel ProjectReportGeneratorJob dispatched to queue
   ↓
3. CLIENT TYPE DIMENSION
   parameters_project.R loaded with PROJECT="angata"
   CLIENT_TYPE = CLIENT_TYPE_MAP[["angata"]] = "cane_supply"
   ↓
4. PIPELINE STAGES
   Stage 80 runs:
   - source("80_utils_cane_supply.R")  ← Client-specific utilities
   - Calculate 4 KPIs (acreage, phase, harvest readiness, stress)
   - Save RDS + Excel
   ↓
5. STAGE 91 REPORT
   rmarkdown::render("91_CI_report_with_kpis_cane_supply.Rmd")
   ↓
6. OUTPUT
   SmartCane_Report_cane_supply_angata_week07_2026.docx
   (Per-field status, harvest alerts, tonnage forecast)
   ↓
7. USER NOTIFICATION
   Dashboard shows report ready; email sent to mill manager
```

### Scenario 2: Dev Laptop Testing (Aura, Agronomic Support)

```
1. EXECUTION DIMENSION (Dev Laptop)
   Developer opens PowerShell, sets:
   $PROJECT = "aura"
   $END_DATE = "2026-02-19"
   ↓
2. CLIENT TYPE DIMENSION
   Developer manually checks parameters_project.R:
   CLIENT_TYPE_MAP[["aura"]] = "agronomic_support"
   ↓
3. PIPELINE STAGES (Manual Run)
   $ & $R_EXE r_app/80_calculate_kpis.R 2026-02-19 aura 7
   ↓
4. STAGE 80 EXECUTION
   parameters_project.R loaded with PROJECT="aura"
   source("80_utils_agronomic_support.R")  ← Client-specific utilities
   Calculate 6 KPIs (uniformity, area change, TCH, growth decline, weeds, gap fill)
   ↓
5. STAGE 90 REPORT (Manual)
   $ & $R_EXE -e "rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd', 
                   params=list(data_dir='aura', report_date=as.Date('2026-02-19')), ...)"
   ↓
6. OUTPUT
   SmartCane_Report_agronomic_support_aura_week07_2026.docx
   (Farm-level KPI averages, uniformity trends, spatial analysis)
   ↓
7. DEVELOPER REVIEW
   Opens .docx file locally, reviews; makes adjustments to plotting code if needed
```

---

## Data Dependency Map: Python ↔ R Integration

This table lists the inputs each R stage needs: outputs of upstream Python and R stages, plus externally supplied files (user uploads).

| R Stage | Depends On | Input File | From | Notes |
|---------|------------|------------|------|-------|
| **10** | Python 00 | `merged_tif/{DATE}.tif` | Stage 00 | 4-band daily TIFF |
| **20** | Stage 10 | `field_tiles/{FIELD}/{DATE}.tif` | Stage 10 | 4-band per-field daily |
| **20** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **30** | Stage 20 | `combined_CI_data.rds` | Stage 20 | Wide format CI data |
| **30** | External | `Data/harvest.xlsx` | User upload | Harvest dates (optional for agronomic_support, REQUIRED for cane_supply) |
| **40** | Stage 20 | `field_tiles_CI/{FIELD}/{DATE}.tif` | Stage 20 | 5-band daily per-field |
| **80** | Stage 40 | `weekly_mosaic/{FIELD}/week_*.tif` | Stage 40 | 5-band weekly per-field |
| **80** | Stage 30 | `All_pivots_Cumulative_CI_*.rds` | Stage 30 | Interpolated growth model |
| **80** | External | `Data/pivot.geojson` | User upload | Field boundaries (REQUIRED) |
| **80** | External | `Data/harvest.xlsx` | User upload | Harvest dates (REQUIRED for cane_supply) |
| **80** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest probability (optional, improves cane_supply KPI) |
| **90** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **90** | Stage 20 | `combined_CI_data.rds` | Stage 20 | For trend plots |
| **91** | Stage 80 | `kpi_summary_tables_week{WW}.rds` | Stage 80 | KPI summary for rendering |
| **91** | Python 31 | `harvest_imminent_weekly.csv` | Python 31 | Harvest readiness probabilities |

---

## Configuration Checklist: Before Running Pipeline

### Universal Requirements (All Projects)

- [ ] `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` exists and is valid
- [ ] At least one satellite TIFF in `merged_tif/` (otherwise run Stage 00 first to download imagery)
- [ ] R packages installed: `Rscript r_app/package_manager.R` (one-time)
- [ ] Python dependencies installed if running Stage 00

### Agronomic Support Projects (AURA type)

- [ ] `parameters_project.R` maps project to "agronomic_support"
- [ ] (Optional) `harvest.xlsx` for better season grouping

### Cane Supply Projects (ANGATA type)

- [ ] `parameters_project.R` maps project to "cane_supply"
- [ ] **REQUIRED**: `harvest.xlsx` with planting/harvest dates (needed by Stage 30 and Stage 80)
- [ ] (Optional) `harvest_imminent_weekly.csv` from Python 31 for better harvest predictions
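
Parts of this checklist can be automated before a run. The Python sketch below is a minimal preflight check under stated assumptions: `preflight` is a hypothetical helper, "valid" is approximated as "parses as JSON and is a non-empty GeoJSON FeatureCollection" (the actual pipeline may accept other geometry layouts), and no geometry repair or Excel parsing is attempted.

```python
import json
from pathlib import Path


def preflight(project_dir: Path, client_type: str) -> list[str]:
    """Return a list of problems; an empty list means the basic inputs exist."""
    problems = []
    data_dir = project_dir / "Data"

    pivot = data_dir / "pivot.geojson"
    if not pivot.is_file():
        problems.append("missing Data/pivot.geojson")
    else:
        try:
            doc = json.loads(pivot.read_text(encoding="utf-8"))
        except ValueError:  # covers JSON and text-decoding errors
            problems.append("Data/pivot.geojson is not valid JSON")
        else:
            # Assumption: field boundaries are stored as a FeatureCollection.
            is_collection = isinstance(doc, dict) and doc.get("type") == "FeatureCollection"
            if not (is_collection and doc.get("features")):
                problems.append("Data/pivot.geojson has no features")

    # harvest.xlsx is only mandatory for cane_supply projects.
    if client_type == "cane_supply" and not (data_dir / "harvest.xlsx").is_file():
        problems.append("missing Data/harvest.xlsx (required for cane_supply)")
    return problems


if __name__ == "__main__":
    # Example path only; replace {PROJECT} layout with a real project directory.
    for problem in preflight(Path("laravel_app/storage/app/angata"), "cane_supply"):
        print("PROBLEM:", problem)
```

Returning a list instead of raising on the first problem lets a developer see every missing input in one pass.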

---

## Output Files Reference

### By Client Type

#### Agronomic Support (AURA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_agronomic_support_*.docx` | Stage 90 | Farm KPIs, uniformity trends, spatial maps |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (6 KPIs per field) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 90 rendering |

#### Cane Supply (ANGATA-type projects)

| Output | Format | Location | Created By | Content |
|--------|--------|----------|------------|---------|
| Report | Word | `reports/SmartCane_Report_cane_supply_*.docx` | Stage 91 | Harvest alerts, phase assignment, tonnage forecast |
| KPI Excel | Excel | `reports/{PROJECT}_field_analysis_week*.xlsx` | Stage 80 | Field-by-field metrics (4 KPIs + harvest data) |
| KPI RDS | R object | `reports/kpis/{PROJECT}_kpi_summary_tables_week*.rds` | Stage 80 | Summary data for Stage 91 rendering |
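
The output locations in both tables follow a predictable naming scheme, which makes them easy to locate from a script. The Python sketch below is illustrative: `report_name` reproduces the pattern of the example file names shown in the scenarios (for instance `SmartCane_Report_cane_supply_angata_week07_2026.docx`), `find_outputs` simply globs the patterns from the tables, and both helper names are hypothetical.

```python
from pathlib import Path


def report_name(client_type: str, project: str, week: int, year: int) -> str:
    """Build a report file name in the style of the examples in this guide."""
    return f"SmartCane_Report_{client_type}_{project}_week{week:02d}_{year}.docx"


def find_outputs(project_dir: Path, project: str, client_type: str) -> dict[str, list[Path]]:
    """Glob the output patterns from the tables, relative to the project directory."""
    patterns = {
        "report": f"reports/SmartCane_Report_{client_type}_*.docx",
        "kpi_excel": f"reports/{project}_field_analysis_week*.xlsx",
        "kpi_rds": f"reports/kpis/{project}_kpi_summary_tables_week*.rds",
    }
    return {kind: sorted(project_dir.glob(pattern)) for kind, pattern in patterns.items()}


print(report_name("cane_supply", "angata", 7, 2026))
# SmartCane_Report_cane_supply_angata_week07_2026.docx
```

The week number is taken as an input here because this guide does not state which week-numbering convention the pipeline uses.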

### By Execution Model

#### SOBIT Server
- Reports auto-saved to `laravel_app/storage/app/{PROJECT}/reports/`
- User downloads via web dashboard
- Optional email delivery (configured in Laravel)

#### Dev Laptop
- Reports saved to same location (same Laravel storage directory)
- Developer manually opens files
- Can share via USB/cloud if needed

---

## Troubleshooting Decision Tree

```
Pipeline Error?
├─ Stage 00 (Python download) fails
│  ├─ Auth error → Check Planet API credentials
│  ├─ Network → Check internet connection
│  └─ Cloud cover → That date is too cloudy; try a different date
│
├─ Stage 10–40 fails → Check prerequisites
│  ├─ File not found → Run previous stage first
│  ├─ pivot.geojson invalid → Repair GeoJSON geometry
│  └─ GDAL error → Check TIFF file integrity
│
├─ Stage 80 fails
│  ├─ harvest.xlsx missing (cane_supply) → REQUIRED; upload file
│  ├─ combined_CI_data.rds missing → Run Stage 20 first
│  ├─ Client type unknown → Check parameters_project.R mapping
│  └─ KPI calculation error → Likely a bug in the client-specific utility script (80_utils_*.R)
│
├─ Stage 90/91 fails
│  ├─ RMarkdown not found → Check r_app/90_*.Rmd or 91_*.Rmd path
│  ├─ Missing params → Check data_dir, report_date params
│  └─ RMarkdown error → Check knitr output; may need renv reinstall
│
└─ SOBIT-specific error
   ├─ Job stuck in queue → Check job table; restart worker
   ├─ Shell script permission denied → chmod +x *.sh
   └─ r_app directory not found → Update shell scripts with correct paths
```

---

## Glossary

| Term | Definition |
|------|-----------|
| **Canopy Index (CI)** | Normalized vegetation index: (NIR/Green) - 1; ~1.0–2.0 range |
| **LOESS** | Locally Estimated Scatterplot Smoothing; local-regression method used in Stage 30 to smooth the CI time series |
| **Coefficient of Variation (CV)** | Standard deviation ÷ mean; measure of field uniformity |
| **Moran's I** | Spatial autocorrelation metric; detects clustered anomalies |
| **Phase Assignment** | Growth stage classification (germination, tillering, grand growth, maturation) |
| **Harvest Readiness** | Probability field is mature and ready for harvest (0.0–1.0) |
| **MAX Composite** | Weekly mosaic using maximum CI pixel across the week; handles clouds |
| **Job Queue** | Async task management system (Laravel); decouples web UI from long computations |
| **RMarkdown** | Dynamic document format; combines R code + text → Word (.docx) or HTML output |
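
Three of the glossary formulas are simple enough to show on toy data. The Python sketch below is illustrative only: the pipeline computes these in R on rasters, whereas here pixels are plain lists, cloud-masked pixels are represented as `None`, and the population standard deviation is used for CV (the glossary does not say which variant the pipeline uses).

```python
from statistics import mean, pstdev
from typing import Optional


def canopy_index(nir: float, green: float) -> float:
    """Canopy Index as defined in the glossary: (NIR / Green) - 1."""
    return nir / green - 1.0


def coefficient_of_variation(values: list[float]) -> float:
    """Standard deviation divided by mean; lower means a more uniform field."""
    return pstdev(values) / mean(values)


def max_composite(daily: list[list[Optional[float]]]) -> list[Optional[float]]:
    """Per-pixel maximum CI across the days of a week, skipping missing values."""
    composite = []
    for pixel_values in zip(*daily):  # iterate pixel by pixel across days
        valid = [v for v in pixel_values if v is not None]
        composite.append(max(valid) if valid else None)
    return composite


print(canopy_index(nir=0.5, green=0.25))     # 1.0
print(coefficient_of_variation([2.0, 4.0]))  # about 0.333
week = [[1.0, None], [1.5, 2.0], [None, 1.0]]
print(max_composite(week))                   # [1.5, 2.0]
```

Taking the maximum per pixel is what lets the weekly mosaic tolerate clouds: a pixel that is missing on one day is filled from any other day in the week on which it was observed.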

---

## Key Files Summary

| File | Purpose | Edited By |
|------|---------|-----------|
| `r_app/parameters_project.R` | Central config; project → client type mapping | Dev / DevOps |
| `r_app/00_common_utils.R` | Shared utility functions | Development |
| `r_app/80_utils_agronomic_support.R` | Agronomic KPI calculations | Development (agronomic_support) |
| `r_app/80_utils_cane_supply.R` | Cane supply KPI calculations | Development (cane_supply) |
| `r_app/90_CI_report_with_kpis_agronomic_support.Rmd` | Agronomic report template | Development (agronomic_support) |
| `r_app/91_CI_report_with_kpis_cane_supply.Rmd` | Cane supply report template | Development (cane_supply) |
| `laravel_app/storage/app/{PROJECT}/Data/pivot.geojson` | Field boundaries | GIS / User upload |
| `laravel_app/storage/app/{PROJECT}/Data/harvest.xlsx` | Harvest calendar | User upload (required for cane_supply) |

---

## Next Steps

1. **Understand your deployment**: Are you on SOBIT server or local dev laptop?
   - SOBIT → See [SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md)
   - Dev Laptop → See [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md)

2. **Understand your client type**: Are you managing agronomic advisory (AURA) or harvest operations (ANGATA)?
   - AURA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)
   - ANGATA → See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md)

3. **Understand the data flow**: How does data transform through 8 pipeline stages?
   - See [ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md)

4. **Run your first pipeline**: Start on a developer laptop and follow the commands in [DEV_LAPTOP_EXECUTION.md](DEV_LAPTOP_EXECUTION.md).