# Developer Laptop Manual Execution
This document explains how to manually run the SmartCane pipeline on a Windows developer machine, without the SOBIT Laravel job queue. This is the **primary workflow for development and testing**.
## Overview: Manual Execution Architecture
Instead of web UI buttons and job queues, developers execute R and Python scripts directly in PowerShell, controlling each stage manually.
```mermaid
%% Manual Execution Architecture
flowchart TD
A["Developer<br/>PowerShell Terminal"] -->|Edit params| B["parameters_project.R<br/>Set PROJECT, dates, paths"]
B -->|Run Stage 00| C["python 00_download_8band_pu_optimized.py<br/>Stage 00"]
C -->|Run Stage 10| D["Rscript 10_create_per_field_tiffs.R<br/>Stage 10"]
D -->|Run Stage 20| E["Rscript 20_ci_extraction_per_field.R<br/>Stage 20"]
E -->|Run Stage 30| F["Rscript 30_interpolate_growth_model.R<br/>Stage 30"]
F -->|Run Stage 40| G["Rscript 40_mosaic_creation_per_field.R<br/>Stage 40"]
G -->|Run Stage 80| H["Rscript 80_calculate_kpis.R<br/>Stage 80"]
H -->|Run Stage 90 OR 91| I{"Client Type?"}
I -->|agronomic_support| J["rmarkdown::render<br/>90_CI_report_*.Rmd"]
I -->|cane_supply| K["rmarkdown::render<br/>91_CI_report_*.Rmd"]
J -->|Output| L["Word Report<br/>Excel KPI Tables<br/>GeoTIFFs"]
K -->|Output| L
```
---
## Prerequisites & Environment Setup
### System Requirements
- **OS**: Windows 10+
- **R**: Version 4.4.0+ (from https://cran.r-project.org/)
- **Python**: Version 3.9+ (from https://www.python.org/)
- **RStudio**: Optional but recommended (for debugging)
### One-Time Configuration
#### Step 1: Install R Packages
```powershell
cd c:\Users\{YOUR_USERNAME}\Documents\SmartCane_code
# Run package manager to install/update all dependencies
& "C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe" r_app\package_manager.R
# This reads renv.lock and installs exact versions into renv/ folder
```
**What happens**:
- `package_manager.R` uses `renv::restore()` to install packages from `renv.lock`
- All packages isolated to project (not system-wide)
- Ensures reproducibility across team members
#### Step 2: Verify R Installation
```powershell
# Check R installation path
& "C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe" --version
# Should output: R version 4.4.3 (or similar)
```
#### Step 3: Verify Python Installation
```powershell
# Check Python
python --version
# Should output: Python 3.9.x or higher
# Create/activate virtual environment (optional but recommended)
python -m venv venv_smartcane
.\venv_smartcane\Scripts\Activate.ps1
# Install Python dependencies
pip install -r python_app\requirements_linux.txt
```
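#### Step 4: Set Session Variables (Optional)
```powershell
# Define the R executable path and run parameters as PowerShell variables (for easier copy-paste)
$R_EXE = "C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe"
$PROJECT = "angata" # or "chemba", "aura", etc.
$END_DATE = "2026-02-19"
$OFFSET = 7
# These are PowerShell session variables, not environment variables: child processes
# such as Rscript cannot read them, and they are lost when the terminal closes.
# If a script relies on Sys.getenv("PROJECT"), also set: $env:PROJECT = $PROJECT
# For persistence, add these lines to your PowerShell profile:
# C:\Users\{YOUR_USERNAME}\Documents\PowerShell\profile.ps1
```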
---
## Stage-by-Stage Execution
### Stage 00 (Optional): Download Satellite Imagery
**Purpose**: Fetch 4-band GeoTIFFs from Planet API
**When to run**:
- When you need fresh data (weekly or on-demand)
- Not needed if satellite TIFFs already in `merged_tif/` directory
**Command**:
```powershell
cd python_app
# Download for a specific date
python 00_download_8band_pu_optimized.py angata --date 2026-02-19
# Or use batch download for multiple dates
python download_planet_missing_dates.py --start 2025-12-24 --end 2026-02-19 --project angata
cd ..
```
**Expected Output**:
```
laravel_app/storage/app/angata/merged_tif/2026-02-19.tif (~200 MB)
```
**Troubleshooting**:
- **Auth error**: Check Planet API credentials in environment
- **Date not downloaded**: The download script skips dates that are already saved; check `merged_tif/` for an existing file before retrying
- **Cloud cover**: Script applies UDM1 cloud mask; may skip high-cloud days
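Before starting a download, it can help to list which dates in a range have no TIFF yet. The following is a minimal Python sketch, assuming files are named `YYYY-MM-DD.tif` as in the expected output above. `missing_dates` is a hypothetical helper, not part of the repository.

```python
from datetime import date, timedelta
from pathlib import Path


def missing_dates(merged_tif_dir, start, end):
    """Return dates in [start, end] that have no YYYY-MM-DD.tif in merged_tif_dir."""
    folder = Path(merged_tif_dir)
    # A folder that does not exist yet simply means every date is missing.
    existing = {p.stem for p in folder.glob("*.tif")} if folder.is_dir() else set()
    days = (end - start).days + 1
    wanted = [start + timedelta(days=i) for i in range(days)]
    return [d for d in wanted if d.isoformat() not in existing]


if __name__ == "__main__":
    gaps = missing_dates(
        "laravel_app/storage/app/angata/merged_tif",
        date(2026, 2, 12),
        date(2026, 2, 19),
    )
    print([d.isoformat() for d in gaps])
```

Run it from the repository root so the relative path resolves. An empty list means Stage 00 can be skipped for that range.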
---
### Stage 10: Create Per-Field Tiles
**Purpose**: Split merged farm TIFF into individual field files
**Prerequisite**:
- `merged_tif/2026-02-19.tif` must exist
- `Data/pivot.geojson` must exist and be valid
**Command**:
```powershell
$R_EXE = "C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe"
$PROJECT = "angata"
$END_DATE = "2026-02-19"
$OFFSET = 7
& $R_EXE r_app/10_create_per_field_tiffs.R $PROJECT $END_DATE $OFFSET
```
**Parameters**:
- `PROJECT`: Project name (angata, chemba, aura, etc.)
- `END_DATE`: Date in YYYY-MM-DD format
- `OFFSET`: Days to look back (7 = last week)
**Expected Output**:
```
laravel_app/storage/app/angata/field_tiles/
├── Field_001/
│ ├── 2026-02-12.tif (4-band)
│ ├── 2026-02-13.tif
│ └── 2026-02-19.tif
├── Field_002/
│ └── ...
└── ...
```
**Console Output**:
```
[1] "Loading parameters..."
[1] "Processing dates 2026-02-12 to 2026-02-19"
[1] "Field_001: splitting tile..."
[1] "Field_002: splitting tile..."
[1] "Stage 10 completed successfully"
```
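The date range in the console output above follows from `END_DATE` and `OFFSET`. A small Python sketch of that arithmetic, assuming the window is inclusive on both ends (as the sample output suggests); `date_window` is an illustrative name, not a function from the pipeline:

```python
from datetime import date, timedelta


def date_window(end_date, offset):
    """Return every date from end_date - offset through end_date, inclusive."""
    end = date.fromisoformat(end_date)
    start = end - timedelta(days=offset)
    return [start + timedelta(days=i) for i in range(offset + 1)]


window = date_window("2026-02-19", 7)
print(window[0], "to", window[-1], f"({len(window)} dates)")
# prints: 2026-02-12 to 2026-02-19 (8 dates)
```

Note that an offset of 7 covers eight calendar days under this assumption, so up to eight daily TIFFs per field may be written.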
---
### Stage 20: Extract Canopy Index (CI)
**Purpose**: Calculate CI per field and per pixel from 4-band TIFFs
**Prerequisite**:
- Stage 10 completed (`field_tiles/` populated)
- `Data/pivot.geojson` exists
**Command**:
```powershell
& $R_EXE r_app/20_ci_extraction_per_field.R $PROJECT $END_DATE $OFFSET
```
**Expected Output**:
```
laravel_app/storage/app/angata/field_tiles_CI/
├── Field_001/
│ ├── 2026-02-12.tif (5-band: R,G,B,NIR,CI)
│ └── 2026-02-19.tif
Data/extracted_ci/
├── daily_vals/
│ └── Field_001/
│ ├── 2026-02-12.rds
│ └── 2026-02-19.rds
└── cumulative_vals/
└── combined_CI_data.rds (WIDE format: fields × dates)
```
**Console Output**:
```
[1] "Computing CI index..."
[1] "Field_001: CI = 1.23 (mean), 0.45 (sd)"
[1] "Field_002: CI = 1.19 (mean), 0.38 (sd)"
[1] "Saving combined_CI_data.rds..."
[1] "Stage 20 completed successfully"
```
---
### Stage 30: Interpolate Growth Model
**Purpose**: Smooth CI time series and fill gaps (handles clouds)
**Prerequisite**:
- Stage 20 completed (`combined_CI_data.rds` exists)
- `Data/harvest.xlsx` recommended (required for cane_supply projects)
**Command**:
```powershell
# No date/offset parameters for Stage 30 — it processes all available CI data
& $R_EXE r_app/30_interpolate_growth_model.R $PROJECT
```
**Expected Output**:
```
Data/extracted_ci/cumulative_vals/
└── All_pivots_Cumulative_CI_quadrant_year_v2.rds
# (long format: field × date × interpolated_ci × daily_change × cumulative_ci)
```
**Console Output**:
```
[1] "Loading combined CI data..."
[1] "Applying LOESS interpolation (span=0.3)..."
[1] "Season 2025-10 → 2026-03: Field_001 interpolated 42 dates, filled 3 gaps"
[1] "Saving interpolated growth model..."
[1] "Stage 30 completed successfully"
```
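To make the gap-filling idea concrete, here is a simplified Python sketch. It uses plain linear interpolation as a stand-in for the LOESS smoothing the stage reports, and it assumes that `daily_change` is the day-to-day difference and `cumulative_ci` is a running sum. Both definitions are assumptions based only on the column names above.

```python
def fill_gaps(values):
    """Linearly interpolate None entries that lie between two known values.

    Leading and trailing gaps stay None because there is nothing to
    interpolate between.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def growth_columns(ci):
    """Return (daily_change, cumulative_ci) for a gap-free daily CI series."""
    daily_change = [None] + [b - a for a, b in zip(ci, ci[1:])]
    cumulative, total = [], 0.0
    for v in ci:
        total += v
        cumulative.append(total)
    return daily_change, cumulative


# Hypothetical daily CI for one field; None marks cloud-masked days.
observed = [1.0, None, None, 1.6, 1.8]
ci = fill_gaps(observed)
change, cumulative = growth_columns(ci)
print([round(v, 2) for v in ci])  # [1.0, 1.2, 1.4, 1.6, 1.8]
```

LOESS additionally smooths the observed points themselves, so real Stage 30 values will differ from this straight-line fill.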
---
### Stage 40: Create Weekly Mosaics
**Purpose**: Aggregate daily per-field TIFFs into weekly MAX-composites
**Prerequisite**:
- Stage 20 completed (`field_tiles_CI/` populated)
**Command**:
```powershell
# Process mosaics for END_DATE week, looking back OFFSET days
& $R_EXE r_app/40_mosaic_creation_per_field.R $END_DATE $OFFSET $PROJECT
```
**Expected Output**:
```
laravel_app/storage/app/angata/weekly_mosaic/
├── Field_001/
│ ├── week_07_2026.tif (5-band, MAX-aggregated for ISO week 7)
│ ├── week_06_2026.tif
│ └── ...
├── Field_002/
│ └── ...
```
**Console Output**:
```
[1] "Computing weekly mosaics for week 07 (2026-02-16 to 2026-02-22)..."
[1] "Field_001: aggregating 7 daily TIFFs..."
[1] "Field_002: aggregating 7 daily TIFFs..."
[1] "Saving weekly_mosaic/Field_001/week_07_2026.tif..."
[1] "Stage 40 completed successfully"
```
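A MAX-composite keeps, for each pixel, the highest value seen across the week's daily rasters. The Python sketch below illustrates this on nested lists. The real stage works on GeoTIFFs in R, and ignoring masked pixels (`None` here) is an assumption about how cloudy days are treated.

```python
def max_composite(daily_rasters):
    """Per-pixel maximum over a list of equally shaped 2-D rasters.

    None marks a masked (for example cloudy) pixel and is ignored; a pixel
    that is masked on every day stays None.
    """
    rows, cols = len(daily_rasters[0]), len(daily_rasters[0][0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            valid = [day[r][c] for day in daily_rasters if day[r][c] is not None]
            row.append(max(valid) if valid else None)
        out.append(row)
    return out


# Three hypothetical days of a 2x2 CI band.
days = [
    [[1.1, None], [0.9, None]],
    [[1.3, 1.0], [None, None]],
    [[1.2, 1.4], [0.8, None]],
]
print(max_composite(days))  # [[1.3, 1.4], [0.9, None]]
```

Taking the maximum favours the clearest observation of the week, since haze and thin cloud tend to lower vegetation index values.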
---
### Stage 80: Calculate KPIs
**Purpose**: Compute field-level KPIs from weekly mosaics (client-type dependent)
**Prerequisite**:
- Stage 40 completed (`weekly_mosaic/` populated)
- Stage 30 completed (growth model data for trends)
- `Data/pivot.geojson` exists
- `Data/harvest.xlsx` exists (required for cane_supply)
**Command**:
```powershell
# KPI calculation (client type determined from PROJECT name in parameters_project.R)
& $R_EXE r_app/80_calculate_kpis.R $END_DATE $PROJECT $OFFSET
```
**Expected Output**:
```
laravel_app/storage/app/angata/reports/
├── angata_field_analysis_week07_2026.xlsx (21-column spreadsheet)
└── kpis/
└── angata_kpi_summary_tables_week07.rds
```
**Console Output** (agronomic_support type):
```
[1] "Client type: agronomic_support"
[1] "Loading weekly mosaic data..."
[1] "Computing uniformity KPI (CV)..."
[1] "Computing area change KPI..."
[1] "Computing TCH forecast..."
[1] "Computing growth decline..."
[1] "Computing weed presence (Moran's I)..."
[1] "Computing gap fill quality..."
[1] "Saving kpi_summary_tables_week07.rds..."
[1] "Stage 80 completed successfully"
```
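The uniformity KPI is a coefficient of variation (CV): standard deviation divided by mean. The Python sketch below computes it and applies the `CV_*` thresholds listed in `parameters_project.R`. The banding in `uniformity_label`, including the "moderate" band, is hypothetical; the pipeline's actual cut-off logic may differ.

```python
from statistics import mean, stdev

# Thresholds copied from parameters_project.R
CV_EXCELLENT = 0.08
CV_GOOD = 0.15
CV_POOR = 0.25


def coefficient_of_variation(pixels):
    """Sample standard deviation divided by the mean (lower = more uniform)."""
    return stdev(pixels) / mean(pixels)


def uniformity_label(cv):
    """Hypothetical banding of a CV value against the three thresholds."""
    if cv <= CV_EXCELLENT:
        return "excellent"
    if cv <= CV_GOOD:
        return "good"
    if cv < CV_POOR:
        return "moderate"
    return "poor"


# Field-level shortcut using the Stage 20 console example (mean 1.23, sd 0.45).
cv_field_001 = 0.45 / 1.23
print(round(cv_field_001, 3), uniformity_label(cv_field_001))  # 0.366 poor
```

`statistics.stdev` uses the n-1 denominator, which matches R's `sd()`.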
---
### Stages 90/91: Generate Word Reports
**Purpose**: Render RMarkdown to Microsoft Word (client-type specific)
**Prerequisite**:
- Stage 80 completed (KPI summary RDS + Excel exist)
- `Data/pivot.geojson` exists
- `Data/extracted_ci/cumulative_vals/combined_CI_data.rds` exists
**Command for Agronomic Support (Aura, Chemba, etc.)**:
```powershell
& $R_EXE -e `
"rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd', `
params=list(data_dir='$PROJECT', report_date=as.Date('$END_DATE')), `
output_file='SmartCane_Report_agronomic_support_${PROJECT}_week07_2026.docx', `
output_dir='laravel_app/storage/app/$PROJECT/reports')"
```
**Command for Cane Supply (Angata)**:
```powershell
& $R_EXE -e `
"rmarkdown::render('r_app/91_CI_report_with_kpis_cane_supply.Rmd', `
params=list(data_dir='$PROJECT', report_date=as.Date('$END_DATE')), `
output_file='SmartCane_Report_cane_supply_${PROJECT}_week07_2026.docx', `
output_dir='laravel_app/storage/app/$PROJECT/reports')"
```
**Expected Output**:
```
laravel_app/storage/app/angata/reports/
└── SmartCane_Report_cane_supply_angata_week07_2026.docx (Word file with tables, charts, maps)
```
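The report file names above hard-code `week07_2026`, so the label has to be changed by hand for other dates. The Python sketch below derives it from `END_DATE` under two numbering conventions. For 2026-02-19 the Monday-based `%W` convention gives week 07, matching the examples in this document, while ISO 8601 numbering gives week 08. Which convention the R scripts use is not confirmed here, so compare against a real output file name before relying on either.

```python
from datetime import date


def week_labels(day):
    """Return (monday_based, iso) labels in the weekWW_YYYY style."""
    # %W: weeks start on Monday; days before the first Monday are week 00.
    monday_based = f"week{day.strftime('%W')}_{day.year}"
    # ISO 8601: week 01 is the week containing the year's first Thursday.
    iso_year, iso_week, _ = day.isocalendar()
    iso = f"week{iso_week:02d}_{iso_year}"
    return monday_based, iso


print(week_labels(date(2026, 2, 19)))
# ('week07_2026', 'week08_2026')
```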
**Console Output**:
```
[1] "Rendering RMarkdown..."
[1] "Loading KPI summary data..."
[1] "Loading weekly mosaics..."
[1] "Creating plots..."
[1] "Rendering Word document..."
[1] "Output: laravel_app/storage/app/angata/reports/SmartCane_Report_*.docx"
```
---
## Complete Pipeline: Single Command Sequence
### One-Liner Scripts (PowerShell)
**Setup Variables** (run once per session):
```powershell
$R_EXE = "C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe"
$PROJECT = "angata"
$END_DATE = "2026-02-19"
$OFFSET = 7
```
**Full Pipeline (if all data already downloaded)**:
```powershell
Write-Host "Starting SmartCane pipeline for $PROJECT on $END_DATE..."
# Stage 10
Write-Host "[Stage 10] Creating field tiles..."
& $R_EXE r_app/10_create_per_field_tiffs.R $PROJECT $END_DATE $OFFSET
# Stage 20
Write-Host "[Stage 20] Extracting CI..."
& $R_EXE r_app/20_ci_extraction_per_field.R $PROJECT $END_DATE $OFFSET
# Stage 30
Write-Host "[Stage 30] Interpolating growth model..."
& $R_EXE r_app/30_interpolate_growth_model.R $PROJECT
# Stage 40
Write-Host "[Stage 40] Creating weekly mosaics..."
& $R_EXE r_app/40_mosaic_creation_per_field.R $END_DATE $OFFSET $PROJECT
# Stage 80
Write-Host "[Stage 80] Calculating KPIs..."
& $R_EXE r_app/80_calculate_kpis.R $END_DATE $PROJECT $OFFSET
# Stage 90/91 (client-type dependent)
Write-Host "[Stage 90/91] Rendering report..."
$CLIENT_TYPE = "cane_supply" # Determine from parameters_project.R
if ($CLIENT_TYPE -eq "agronomic_support") {
$TEMPLATE = "r_app/90_CI_report_with_kpis_agronomic_support.Rmd"
} else {
$TEMPLATE = "r_app/91_CI_report_with_kpis_cane_supply.Rmd"
}
& $R_EXE -e `
"rmarkdown::render('$TEMPLATE', `
params=list(data_dir='$PROJECT', report_date=as.Date('$END_DATE')), `
output_file='SmartCane_Report_${CLIENT_TYPE}_${PROJECT}_week07_2026.docx', `
output_dir='laravel_app/storage/app/$PROJECT/reports')"
Write-Host "Pipeline completed! Report: laravel_app/storage/app/$PROJECT/reports/"
```
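The positional argument order is not the same for every stage: Stages 10 and 20 take project, date, offset; Stage 40 takes date, offset, project; Stage 80 takes date, project, offset. The Python sketch below collects the orders used in this document in one place. `stage_commands` is a hypothetical helper that only builds the argument lists and does not run anything.

```python
R_EXE = r"C:\Program Files\R\R-4.4.3\bin\x64\Rscript.exe"


def stage_commands(project, end_date, offset, r_exe=R_EXE):
    """Build the argv list for each R stage, in pipeline order."""
    off = str(offset)
    return [
        [r_exe, "r_app/10_create_per_field_tiffs.R", project, end_date, off],
        [r_exe, "r_app/20_ci_extraction_per_field.R", project, end_date, off],
        [r_exe, "r_app/30_interpolate_growth_model.R", project],
        [r_exe, "r_app/40_mosaic_creation_per_field.R", end_date, off, project],
        [r_exe, "r_app/80_calculate_kpis.R", end_date, project, off],
    ]


for argv in stage_commands("angata", "2026-02-19", 7):
    # Print the script and its arguments; pass argv to subprocess.run(argv, check=True)
    # to execute a stage and stop on the first failure.
    print(" ".join(argv[1:]))
```

By default, PowerShell's `&` call operator does not stop the script when a native command exits with an error. When running the sequence above in PowerShell, checking `$LASTEXITCODE` after each stage avoids rendering a report from stale data.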
### Batch Processing Multiple Weeks
**Use Batch Runner** (R script that loops weeks):
```powershell
# Aura batch processing (weeks 498, Dec 3 2025 - Feb 4 2026)
& $R_EXE -e "source('r_app/batch_pipeline_aura.R')"
# Manually loop custom date range
$startDate = [DateTime]::ParseExact("2026-01-28", "yyyy-MM-dd", $null)
$endDate = [DateTime]::ParseExact("2026-02-19", "yyyy-MM-dd", $null)
$current = $startDate
while ($current -le $endDate) {
$dateStr = $current.ToString("yyyy-MM-dd")
Write-Host "Processing week of $dateStr..."
& $R_EXE r_app/40_mosaic_creation_per_field.R $dateStr 7 "angata"
& $R_EXE r_app/80_calculate_kpis.R $dateStr "angata" 7
$current = $current.AddDays(7)
}
```
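It is worth checking which dates a fixed 7-day step actually visits. The Python sketch below mirrors the loop above; `weekly_dates` is an illustrative name.

```python
from datetime import date, timedelta


def weekly_dates(start, end, step_days=7):
    """Dates visited by a loop that starts at start and adds step_days while <= end."""
    current = start
    visited = []
    while current <= end:
        visited.append(current)
        current += timedelta(days=step_days)
    return visited


print([d.isoformat() for d in weekly_dates(date(2026, 1, 28), date(2026, 2, 19))])
# ['2026-01-28', '2026-02-04', '2026-02-11', '2026-02-18']
```

With these bounds the last run lands on 2026-02-18, not 2026-02-19. If the final run must fall exactly on the end date, choose a start date that is a whole number of weeks before it.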
---
## Configuration: parameters_project.R
**Location**: `r_app/parameters_project.R`
This file defines global settings used by all stages.
```r
# ============================================================================
# SmartCane Project Configuration
# ============================================================================
# Project settings
PROJECT <- Sys.getenv("PROJECT") # Set by calling script or manually
if (PROJECT == "") {
PROJECT <- "angata" # Default project
}
# Client type mapping
CLIENT_TYPE_MAP <- list(
"angata" = "cane_supply",
"chemba" = "agronomic_support",
"xinavane" = "agronomic_support",
"esa" = "agronomic_support",
"simba" = "agronomic_support",
"aura" = "agronomic_support"
)
CLIENT_TYPE <- CLIENT_TYPE_MAP[[PROJECT]]
# Data directory (Laravel storage)
data_dir <- file.path(
dirname(getwd()), # Up one level from r_app
"laravel_app/storage/app",
PROJECT
)
# Key file paths
pivot_path <- file.path(data_dir, "Data", "pivot.geojson")
harvest_path <- file.path(data_dir, "Data", "harvest.xlsx")
merged_tif_dir <- file.path(data_dir, "merged_tif")
field_tiles_dir <- file.path(data_dir, "field_tiles")
field_tiles_ci_dir <- file.path(data_dir, "field_tiles_CI")
weekly_mosaic_dir <- file.path(data_dir, "weekly_mosaic")
# KPI thresholds (customizable)
CI_THRESHOLD <- 1.0
CV_GOOD <- 0.15
CV_EXCELLENT <- 0.08
CV_POOR <- 0.25
# Print configuration summary
cat("\n=== SmartCane Configuration ===\n")
cat("Project:", PROJECT, "\n")
cat("Client Type:", CLIENT_TYPE, "\n")
cat("Data Directory:", data_dir, "\n\n")
```
**How to Use**:
1. All scripts start with `source("parameters_project.R")`
2. Use global variables: `PROJECT`, `CLIENT_TYPE`, `data_dir`, etc.
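3. To change project: edit this file OR set the `PROJECT` environment variable before running (in PowerShell: `$env:PROJECT = "chemba"`; a plain `$PROJECT` variable is not visible to R's `Sys.getenv()`)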
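In R, `CLIENT_TYPE_MAP[[PROJECT]]` returns `NULL` for a project name that is not in the list, so a typo can go unnoticed until a later stage. The Python sketch below shows the same lookup with an explicit failure and the matching report template. The mapping is copied from the configuration above; `report_template` is a hypothetical helper.

```python
CLIENT_TYPE_MAP = {
    "angata": "cane_supply",
    "chemba": "agronomic_support",
    "xinavane": "agronomic_support",
    "esa": "agronomic_support",
    "simba": "agronomic_support",
    "aura": "agronomic_support",
}

TEMPLATES = {
    "agronomic_support": "r_app/90_CI_report_with_kpis_agronomic_support.Rmd",
    "cane_supply": "r_app/91_CI_report_with_kpis_cane_supply.Rmd",
}


def report_template(project):
    """Return (client_type, template path); raise for an unmapped project."""
    try:
        client_type = CLIENT_TYPE_MAP[project]
    except KeyError:
        raise ValueError(f"No client type mapped for project {project!r}") from None
    return client_type, TEMPLATES[client_type]


print(report_template("angata"))
# ('cane_supply', 'r_app/91_CI_report_with_kpis_cane_supply.Rmd')
```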
---
## Troubleshooting Common Issues
### Issue: "File not found: combined_CI_data.rds"
**Cause**: Stage 20 not completed.
**Solution**:
```powershell
# Run Stage 20 again with correct date range
& $R_EXE r_app/20_ci_extraction_per_field.R angata 2026-02-19 7
```
### Issue: "Error in rmarkdown::render()"
**Cause**: RMarkdown template not found or missing dependencies.
**Solution**:
```powershell
# Check template file exists
Test-Path "r_app/90_CI_report_with_kpis_agronomic_support.Rmd"
# Reinstall R packages
& $R_EXE r_app/package_manager.R
```
### Issue: "GDAL error: Cannot open file"
**Cause**: Incorrect pivot.geojson path or file doesn't exist.
**Solution**:
```powershell
# Check pivot.geojson exists
Test-Path "laravel_app/storage/app/angata/Data/pivot.geojson"
# Verify path in parameters_project.R
```
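Several of the errors above come down to a missing input file. The Python sketch below checks the inputs named in the stage prerequisites in one pass, assuming the directory layout shown in this document. `missing_inputs` is a hypothetical helper.

```python
from pathlib import Path


def missing_inputs(project, root="laravel_app/storage/app", need_harvest=False):
    """Return the required input paths that do not exist for a project."""
    base = Path(root) / project
    required = [base / "Data" / "pivot.geojson", base / "merged_tif"]
    if need_harvest:
        # harvest.xlsx is listed as required for cane_supply projects.
        required.append(base / "Data" / "harvest.xlsx")
    return [str(p) for p in required if not p.exists()]


if __name__ == "__main__":
    for path in missing_inputs("angata", need_harvest=True):
        print("MISSING:", path)
```

Run it from the repository root; no output means all checked inputs are present.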
### Issue: Python download fails with "Cloud cover too high"
**Cause**: Planet API filtering out days with >90% clouds.
**Solution**:
```powershell
# Check available dates in merged_tif directory
Get-ChildItem laravel_app/storage/app/angata/merged_tif/
# Or edit Python script to use permissive cloud threshold
# Line: cloud_cover_threshold = 0.95 # 95% clouds allowed
```
---
## Development Workflow Best Practices
### 1. Testing Single Stage in Isolation
```powershell
# Test Stage 20 without running full pipeline
$PROJECT = "angata"
$END_DATE = "2026-02-19"
$OFFSET = 7
# Prerequisite: Stage 10 must be done, or manually create field_tiles/
& $R_EXE r_app/20_ci_extraction_per_field.R $PROJECT $END_DATE $OFFSET
# Review output
Get-ChildItem laravel_app/storage/app/$PROJECT/field_tiles_CI/
# View CI values (readRDS is R code, so run it through Rscript rather than typing it into PowerShell)
& $R_EXE -e "data <- readRDS('laravel_app/storage/app/$PROJECT/Data/extracted_ci/cumulative_vals/combined_CI_data.rds'); head(data)"
```
### 2. Debugging RMarkdown
```powershell
# Render with verbose output
& $R_EXE -e `
"rmarkdown::render('r_app/90_CI_report_with_kpis_agronomic_support.Rmd', `
params=list(data_dir='aura', report_date=as.Date('2026-02-19')), `
knit_root_dir=getwd(), clean=FALSE)"
# Check intermediate files
Get-ChildItem r_app/ -Filter "*_files" -Directory
```
### 3. Using RStudio for Interactive Development
```powershell
# Open project in RStudio
# File > Open Project > r_app/r_app.Rproj
# Then in RStudio console:
# - source("parameters_project.R")
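# - source("20_ci_extraction_per_field.R") # Runs the whole script; to step through it, open the file and run lines with Ctrl+Enter
# - debug(extract_ci_per_field) # Step through this function the next time it is called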
```
---
## Next Steps
- See [ARCHITECTURE_DATA_FLOW.md](ARCHITECTURE_DATA_FLOW.md) for understanding pipeline flow
- See [CLIENT_TYPE_ARCHITECTURE.md](CLIENT_TYPE_ARCHITECTURE.md) for client-specific KPI differences
- See [SOBIT_DEPLOYMENT.md](SOBIT_DEPLOYMENT.md) for production server alternative
- See [ARCHITECTURE_INTEGRATION_GUIDE.md](ARCHITECTURE_INTEGRATION_GUIDE.md) for choosing execution model