# SmartCane System Architecture - R Pipeline & File-Based Processing

## Overview

The SmartCane system is a file-based agricultural intelligence platform that processes satellite imagery through a multi-stage R-script pipeline. Raw satellite imagery flows through sequential processing steps (CI extraction, growth model interpolation, mosaic creation, KPI analysis) with outputs persisted as GeoTIFFs, RDS files, and Excel/Word reports.

## Processing Pipeline Overview

```
Python Download → Daily GeoTIFFs → CI Extraction (RDS) → Growth Model (RDS) → Weekly Mosaics (TIF)
                                                                  ↓                     ↓
                                                         Cumulative CI Data ←─── KPI Calculation
                                                                                        ↓
                                                                       Field Analysis & Report Generation
                                                                                        ↓
                                                                              Excel + Word Outputs
```

## Complete Processing Pipeline (Mermaid Diagram)

```mermaid
graph TD
    %% ===== INPUTS =====
    API["🔑 API Credentials"]
    Bounds["🗺️ Field Boundaries<br/>(pivot.geojson)"]
    Harvest["📊 Harvest Data<br/>(harvest.xlsx)"]

    %% ===== STAGE 1: DOWNLOAD =====
    Download["STAGE 1: Satellite Download<br/>01_planet_download.py"]
    DL_Out["📦 OUTPUT<br/>merged_tif/{date}.tif<br/>(4 bands: RGBN)"]

    %% ===== STAGE 2: CI EXTRACTION =====
    CI["STAGE 2: CI Extraction<br/>02_ci_extraction.R"]
    CI_Utils["[Utility]<br/>ci_extraction_utils.R"]
    CI_Out["📦 OUTPUT<br/>combined_CI_data.rds<br/>(wide format)"]

    %% ===== STAGE 3: GROWTH MODEL =====
    Growth["STAGE 3: Growth Model<br/>03_interpolate_growth_model.R"]
    Growth_Utils["[Utility]<br/>growth_model_utils.R"]
    Growth_Out["📦 OUTPUT<br/>All_pivots_Cumulative_CI<br/>_quadrant_year_v2.rds"]

    %% ===== STAGE 4: WEEKLY MOSAIC =====
    Mosaic["STAGE 4: Weekly Mosaic<br/>04_mosaic_creation.R"]
    Mosaic_Utils["[Utility]<br/>mosaic_creation_utils.R"]
    Mosaic_Out["📦 OUTPUT<br/>weekly_mosaic/week_{WW}_{YYYY}.tif<br/>(5 bands: RGBNCI)"]

    %% ===== STAGE 5: FIELD ANALYSIS =====
    Field["STAGE 5: Field Analysis<br/>09_field_analysis_weekly.R<br/>(or 09b parallel)"]
    Field_Utils["[Utility]<br/>field_analysis_utils.R"]
    Field_Out1["📦 OUTPUT<br/>{project}_field_analysis<br/>_week{WW}.xlsx"]
    Field_Out2["📦 OUTPUT<br/>{project}_kpi_summary<br/>_tables_week{WW}.rds"]

    %% ===== STAGE 6: REPORT =====
    Report["STAGE 6: Report Generation<br/>10_CI_report_with_kpis_simple.Rmd"]
    Report_Utils["[Utility]<br/>report_utils.R"]
    Report_Out1["📦 PRIMARY OUTPUT<br/>SmartCane_Report<br/>_week{WW}_{YYYY}.docx"]
    Report_Out2["📦 ALTERNATIVE<br/>SmartCane_Report<br/>_week{WW}_{YYYY}.html"]

    %% ===== CONFIG =====
    Config["[Utility]<br/>parameters_project.R"]

    %% ===== CONNECTIONS =====
    API --> Download
    Bounds --> Download
    Download --> DL_Out

    DL_Out --> CI
    Bounds --> CI
    Config --> CI
    CI --> CI_Utils
    CI --> CI_Out

    CI_Out --> Growth
    Harvest --> Growth
    Config --> Growth
    Growth --> Growth_Utils
    Growth --> Growth_Out

    DL_Out --> Mosaic
    Bounds --> Mosaic
    Config --> Mosaic
    Mosaic --> Mosaic_Utils
    Mosaic --> Mosaic_Out

    Mosaic_Out --> Field
    Growth_Out --> Field
    Bounds --> Field
    Harvest --> Field
    Config --> Field
    Field --> Field_Utils
    Field --> Field_Out1
    Field --> Field_Out2

    Mosaic_Out --> Report
    Field_Out2 --> Report
    Field_Out1 --> Report
    Config --> Report
    Report --> Report_Utils
    Report --> Report_Out1
    Report --> Report_Out2

    %% ===== STYLING =====
    classDef input fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    classDef stage fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    classDef output fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    classDef util fill:#fff3e0,stroke:#e65100,stroke-width:2px

    class API,Bounds,Harvest,Config input
    class Download,CI,Growth,Mosaic,Field,Report stage
    class DL_Out,CI_Out,Growth_Out,Mosaic_Out,Field_Out1,Field_Out2,Report_Out1,Report_Out2 output
    class CI_Utils,Growth_Utils,Mosaic_Utils,Field_Utils,Report_Utils util
```

## Data Processing Pipeline

### Stage 1: Satellite Data Acquisition (Python)

- **Script**: `python_app/01_planet_download.py`
- **Inputs**: API credentials, field boundaries (GeoJSON), date range
- **Outputs**: Daily merged GeoTIFFs
- **File Location**: `laravel_app/storage/app/{project}/merged_tif/`
- **File Format**: `YYYY-MM-DD.tif` (4 bands: Red, Green, Blue, NIR; uint16)
- **Processing**:
  - Downloads from Sentinel Hub BYOC collection
  - Applies cloud masking (UDM1 band)
  - Merges tiles into daily mosaics
  - Stores at 3m resolution

### Stage 2: Canopy Index (CI) Extraction

- **Script**: `r_app/02_ci_extraction.R`
- **Utility Functions**: `ci_extraction_utils.R` (handles tile detection, RDS I/O)
- **Inputs**: Daily GeoTIFFs, field boundaries (GeoJSON)
- **Outputs**:
  - Daily extractions (RDS): `Data/extracted_ci/daily_vals/extracted_{date}_{suffix}.rds`
  - Cumulative dataset (RDS): `Data/extracted_ci/cumulative_vals/combined_CI_data.rds`
- **File Format**:
  - Daily: Per-field statistics (mean CI, count, notNA pixels)
  - Cumulative: Wide format with fields as rows, dates as columns
- **Processing**:
  - Calculates CI = (NIR / Green) - 1
  - Extracts stats per field using field geometry
  - Handles missing pixels (clouds → NA values)
  - Supports both full rasters and tile-based extraction
- **Key Parameters**:
  - CI formula: `(NIR / Green) - 1`
  - Min valid pixels: 100 per field
  - Cloud masking: UDM1 != 0

### Stage 3: Growth Model Interpolation

- **Script**: `r_app/03_interpolate_growth_model.R`
- **Utility Functions**: `growth_model_utils.R` (interpolation, seasonal grouping)
- **Inputs**:
  - Combined CI data (RDS from Stage 2)
  - Harvest data with season dates (Excel)
- **Outputs**: Interpolated growth model (RDS)
- **File Location**: `Data/extracted_ci/cumulative_vals/All_pivots_Cumulative_CI_quadrant_year_v2.rds`
- **File Format**: Long-format data frame with columns:
  - `Date`, `DOY` (Day of Year), `field`, `subField`, `value`, `season`
  - `CI_per_day`, `cumulative_CI`, `FitData` (interpolated indicator)
- **Processing**:
  - Filters CI by season dates
  - Linear interpolation across gaps: `approxfun()`
  - Calculates daily changes and cumulative sums
  - Groups by field and season year
- **Key Calculations**:
  - `CI_per_day` = today's CI - yesterday's CI
  - `cumulative_CI` = running (cumulative) sum of daily CI

### Stage 4: Weekly Mosaic Creation

- **Script**: `r_app/04_mosaic_creation.R`
- **Utility Functions**: `mosaic_creation_utils.R`, `mosaic_creation_tile_utils.R`
- **Inputs**:
  - Daily VRTs or GeoTIFFs from Stage 1
  - Field boundaries
- **Outputs**: Weekly composite mosaic (GeoTIFF)
- **File Location**: `weekly_mosaic/week_{WW}_{YYYY}.tif`
- **File Format**: 5-band GeoTIFF (R, G, B, NIR, CI), same spatial extent as daily images
- **Processing**:
  - Assesses cloud coverage per daily image
  - Selects images with acceptable cloud coverage (<45%)
  - Composites using MAX function (retains highest CI value)
  - Outputs single weekly composite
- **Key Parameters**:
  - Cloud threshold (strict): <5% missing pixels
  - Cloud threshold (relaxed): <45% missing pixels
  - Composite function: MAX across selected images

### Stage 5: Field Analysis & KPI Calculation

- **Script**: `r_app/09_field_analysis_weekly.R` or `09b_field_analysis_weekly.R` (parallel version)
- **Utility Functions**: `field_analysis_utils.R`, tile extraction functions
- **Inputs**:
  - Current week mosaic (GeoTIFF)
  - Previous week mosaic (GeoTIFF)
  - Interpolated growth model (RDS)
  - Field boundaries (GeoJSON)
  - Harvest data (Excel)
- **Outputs**:
  - Excel file: `reports/{project}_field_analysis_week{WW}.xlsx`
  - RDS summary data: `reports/kpis/{project}_kpi_summary_tables_week{WW}.rds`
- **File Format (Excel)**:
  - Sheet 1: Field-level data with CI metrics, phase, status triggers
  - Sheet 2: Summary statistics (monitored area, cloud coverage, phase distribution)
- **Processing** (per field):
  - Extracts CI from current and previous week mosaics
  - Calculates field-level statistics: mean, std dev, CV (coefficient of variation)
  - Assigns growth phase based on field age (Germination, Tillering, Grand Growth, Maturation)
  - Detects status triggers (rapid growth, disease signals, weed pressure, harvest imminence)
  - Assesses cloud coverage per field
  - Parallel processing using `furrr` for 1000+ fields
- **Key Calculations**:
  - **Uniformity (CV)**: std_dev / mean, thresholds: <0.15 excellent, <0.25 good
  - **Change**: (current_mean - previous_mean) / previous_mean
  - **Phase age**: weeks since planting (from harvest.xlsx season_start)
  - **Cloud coverage %**: (non-NA pixels / total pixels in field) * 100, i.e. the share of valid (cloud-free) pixels, so higher values mean a clearer view
- **Status Triggers** (non-exclusive):
  - Germination Started: 10% of field CI > 2.0
  - Rapid Growth: CI increase > 0.5 units week-over-week
  - Slow Growth: CI increase < 0.1 units week-over-week
  - Non-Uniform Growth: CV > 0.25 (field heterogeneity)
  - Weed Pressure: Rapid increase (>2.0 CI/week) with moderate area (<25%)
  - Harvest Imminence: Age > 240 days + CI plateau detected

### Stage 6: Report Generation

- **Script**: `r_app/10_CI_report_with_kpis_simple.Rmd` (RMarkdown)
- **Utility Functions**: `report_utils.R` (doc building, table formatting)
- **Inputs**:
  - Weekly mosaics (GeoTIFFs)
  - KPI data and field analysis (RDS)
  - Field boundaries, project config
- **Outputs**:
  - **Word document** (PRIMARY OUTPUT): `reports/SmartCane_Report_week{WW}_{YYYY}.docx`
  - **HTML version** (optional): `reports/SmartCane_Report_week{WW}_{YYYY}.html`
- **Report Contents**:
  - Executive summary (KPI overview, monitored area, cloud coverage)
  - Phase distribution tables and visualizations
  - Status trigger summary (fields with active triggers)
  - Field-by-field detail pages with CI metrics
  - Interpretation guides for agronomic thresholds
- **Report Generation Technology**:
  - RMarkdown (`.Rmd`) rendered to Word via `officer` and `flextable` packages
  - Tables with automatic width/height fitting
  - Column interpretations embedded in reports
  - Areas reported in both hectares and acres

---

## File Storage Structure

All data persists to the file system. No database writes occur during analysis; the database is only read for metadata.
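The file naming conventions used in this storage layout can be illustrated with a short Python sketch. It is not taken from the pipeline, which builds its paths in the R and Python scripts themselves. The helper names `iso_week_label` and `weekly_paths` are hypothetical, and two details are assumptions: that `{WW}` is zero-padded to two digits, and that `{YYYY}` is the ISO week-numbering year rather than the calendar year.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Dict, Tuple

# Storage root taken from the layout described in this section.
STORAGE_ROOT = PurePosixPath("laravel_app/storage/app")


def iso_week_label(day: date) -> Tuple[str, str]:
    """Return (WW, YYYY) for a date using ISO week numbering.

    Assumption: WW is zero-padded and YYYY is the ISO year, which can
    differ from the calendar year around New Year.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_week:02d}", str(iso_year)


def weekly_paths(project: str, day: date) -> Dict[str, PurePosixPath]:
    """Build the main per-day and per-week file paths for one project."""
    ww, yyyy = iso_week_label(day)
    base = STORAGE_ROOT / project
    return {
        "daily_tif": base / "merged_tif" / f"{day.isoformat()}.tif",
        "weekly_mosaic": base / "weekly_mosaic" / f"week_{ww}_{yyyy}.tif",
        "field_analysis": base / "reports" / f"{project}_field_analysis_week{ww}.xlsx",
        "report_docx": base / "reports" / f"SmartCane_Report_week{ww}_{yyyy}.docx",
    }


if __name__ == "__main__":
    # "demo" is a placeholder project name.
    for name, path in weekly_paths("demo", date(2025, 10, 8)).items():
        print(f"{name}: {path}")
```

Deriving every path from one date and one project name keeps the stages decoupled: a downstream stage only needs those two values to locate its inputs.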
```
laravel_app/storage/app/{project}/
├── Data/
│   ├── pivot.geojson                        # Field boundaries (read-only)
│   ├── pivot_2.geojson                      # ESA variant with extra fields
│   ├── harvest.xlsx                         # Season dates & yield data (read-only)
│   ├── vrt/                                 # Virtual raster files (daily VRTs)
│   │   └── YYYY-MM-DD.vrt
│   ├── extracted_ci/
│   │   ├── daily_vals/
│   │   │   └── extracted_YYYY-MM-DD_{suffix}.rds                # Daily field stats
│   │   └── cumulative_vals/
│   │       ├── combined_CI_data.rds                             # Cumulative CI (wide)
│   │       └── All_pivots_Cumulative_CI_quadrant_year_v2.rds    # Interpolated
│   └── daily_tiles_split/                   # (Optional) Tile-based processing
│       ├── master_grid_5x5.geojson
│       └── YYYY-MM-DD/                      # Date-specific folders
│           └── YYYY-MM-DD_01.tif, ..., _25.tif
│
├── merged_tif/                              # Raw daily satellite images (Stage 1 output)
│   └── YYYY-MM-DD.tif                       # 4 bands: R, G, B, NIR
│
├── merged_final_tif/                        # (Optional) Processed daily images
│   └── YYYY-MM-DD.tif                       # 5 bands: R, G, B, NIR, CI
│
├── weekly_mosaic/                           # Weekly composite mosaics (Stage 4 output)
│   └── week_WW_YYYY.tif                     # 5 bands, ISO week numbering
│
└── reports/
    ├── SmartCane_Report_week{WW}_{YYYY}.docx            # PRIMARY OUTPUT (Stage 6)
    ├── SmartCane_Report_week{WW}_{YYYY}.html            # Alternative format
    ├── {project}_field_analysis_week{WW}.xlsx           # PRIMARY OUTPUT (Stage 5)
    ├── {project}_harvest_predictions_week{WW}.xlsx      # Harvest tracking
    ├── {project}_cloud_coverage_week{WW}.rds            # Per-field cloud stats
    ├── {project}_kpi_summary_tables_week{WW}.rds        # Summary data (consumed by reports)
    └── kpis/
        └── week_WW_YYYY/
            └── *.csv                                    # Individual KPI exports
```

### Data Types by File

| File Extension | Purpose | Stage | Example Files |
|---|---|---|---|
| `.tif` | Geospatial raster imagery | 1, 4 | `YYYY-MM-DD.tif`, `week_41_2025.tif` |
| `.vrt` | Virtual raster (pointer to TIFFs) | 2 | `YYYY-MM-DD.vrt` |
| `.rds` | R serialized data (binary format) | 2, 3, 5, 6 | `combined_CI_data.rds`, `kpi_results_week41.rds` |
| `.geojson` | Field boundaries (read-only) | Input | `pivot.geojson` |
| `.xlsx` | Excel reports & harvest data | 5, 6 (output), Input (harvest) | `field_analysis_week41.xlsx` |
| `.docx` | Word reports (final output) | 6 | `SmartCane_Report_week41_2025.docx` |
| `.html` | HTML reports (alternative) | 6 | `SmartCane_Report_week41_2025.html` |
| `.csv` | Summary tables (for external use) | 5, 6 | `field_details.csv`, `kpi_summary.csv` |

---

## Script Dependency Map

```
01_create_master_grid_and_split_tiffs.R (Optional)
  └→ [Utility] parameters_project.R

02_ci_extraction.R
  ├→ [Utility] parameters_project.R
  └→ [Utility] ci_extraction_utils.R
      └ Functions: find_satellite_images(), process_satellite_images(),
                   process_ci_values(), process_ci_values_from_tiles()

03_interpolate_growth_model.R
  ├→ [Utility] parameters_project.R
  └→ [Utility] growth_model_utils.R
      └ Functions: load_combined_ci_data(), generate_interpolated_ci_data(),
                   calculate_growth_metrics(), save_growth_model()

04_mosaic_creation.R
  ├→ [Utility] parameters_project.R
  └→ [Utility] mosaic_creation_utils.R
      └ Functions: create_weekly_mosaic_from_tiles(), save_mosaic(),
                   assess_cloud_coverage()

09_field_analysis_weekly.R (or 09b_field_analysis_weekly.R - parallel version)
  ├→ [Utility] parameters_project.R
  ├→ [Utility] field_analysis_utils.R
  └→ Outputs: Excel files, RDS summary files
      └ Functions: load_ci_data(), analyze_field_stats(), assign_growth_phase(),
                   detect_triggers(), export_to_excel()

10_CI_report_with_kpis_simple.Rmd (RMarkdown → rendered to .docx/.html)
  ├→ [Utility] parameters_project.R
  ├→ [Utility] report_utils.R
  └→ [Packages] officer, flextable
      └ Functions: body_add_flextable(), add_paragraph(),
                   officer::read_docx(), save_docx()
```

### Utility Files Description

- **`parameters_project.R`**: Loads project configuration (paths, field boundaries, harvest data, project metadata)
- **`ci_extraction_utils.R`**: CI calculation, field masking, RDS I/O for daily & cumulative CI data
- **`growth_model_utils.R`**: Linear interpolation, seasonal grouping, daily metrics calculation
- **`mosaic_creation_utils.R`**: Weekly mosaic compositing, cloud assessment, raster masking
- **`field_analysis_utils.R`**: Per-field statistics, phase assignment, trigger detection, Excel export
- **`report_utils.R`**: RMarkdown helpers, table formatting, Word document building via `officer` package

---

## Data Type Reference

### RDS (R Data Serialization)

RDS files store R data objects in binary format. They preserve data types, dimensions, and structure exactly. Key RDS files in the pipeline:

| File | Structure | Rows | Columns | Use |
|---|---|---|---|---|
| `combined_CI_data.rds` | Data frame (wide format) | # fields | # dates | All-time CI by field |
| `All_pivots_Cumulative_CI_quadrant_year_v2.rds` | Data frame (long format) | ~1M+ rows | 11 columns | Interpolated daily CI, used for yield models |
| `kpi_summary_tables_week{WW}.rds` | List of data frames | n/a | varies | Field KPIs, phase dist., triggers |
| `cloud_coverage_week{WW}.rds` | Data frame | # fields | 4 columns | Per-field cloud %, category |

### Excel (.xlsx)

Primary output format for stakeholder consumption:

| Sheet | Content | Rows | Columns | Key Data |
|---|---|---|---|---|
| Field Data | Field-by-field analysis | # fields | ~15 | CI mean/std, phase, status, cloud% |
| Summary | Farm-wide statistics | 10-20 | 3 | Monitored area (ha/acres), cloud dist., phases |

### Word (.docx)

Executive report format via RMarkdown → `officer`:

- Title page with metadata (project, week, date, total fields, acreage)
- Executive summary with KPIs
- Phase analysis section with distribution tables
- Status trigger summary
- Field-by-field detail pages
- Interpretation guides

---

## Key Calculations & Thresholds

### Canopy Index (CI)

```
CI = (NIR / Green) - 1

Range: -1 to +∞
Interpretation:
  CI < 0      → Non-vegetated (water, bare soil)
  0 < CI < 1  → Sparse vegetation (early growth)
  1 < CI < 2  → Moderate vegetation
  CI > 2      → Dense vegetation (mature crop)
```

### Growth Phase Assignment (Age-Based)

Based on weeks since planting (`season_start` from harvest.xlsx):

| Phase | Age Range | Characteristics |
|---|---|---|
| Germination | 0-6 weeks | Variable emergence, low CI |
| Tillering | 6-18 weeks | Shoot development, increasing CI |
| Grand Growth | 18-35 weeks | Peak growth, high CI accumulation |
| Maturation | 35+ weeks | Sugar accumulation, plateau or decline |

### Field Uniformity (Coefficient of Variation)

```
CV = std_dev / mean

Interpretation:
  CV < 0.15  → Excellent uniformity
  CV < 0.25  → Good uniformity
  CV < 0.35  → Moderate uniformity
  CV ≥ 0.35  → Poor uniformity (management attention needed)
```

### Cloud Coverage Classification (Per-Field)

Despite its name, `cloud_pct` is computed from non-NA pixels, so it measures the share of the field that is visible (cloud-free); higher values mean a clearer view.

```
cloud_pct = (non_NA_pixels / total_pixels) * 100

Categories:
  ≥99.5%          → Clear view (usable for analysis)
  >0% and <99.5%  → Partial coverage (biased estimates)
  0%              → No image available (excluded from analysis)
```

### Status Triggers (Non-Exclusive)

Fields can have multiple simultaneous triggers:

| Trigger | Detection Method | Data Used |
|---|---|---|
| **Germination Started** | 10% of field CI > 2.0 | Current week CI extraction |
| **Rapid Growth** | Week-over-week increase > 0.5 CI units | Mosaic-based extraction |
| **Slow Growth** | Week-over-week increase < 0.1 CI units | Mosaic-based extraction |
| **Non-Uniform** | CV > 0.25 | Spatial stats per field |
| **Weed Pressure** | Rapid increase (>2.0 CI/week) + area <25% | Spatial clustering analysis |
| **Harvest Imminence** | Age > 240 days + CI plateau | Temporal analysis, phase assignment |

---

## Processing Configuration & Parameters

Most parameters are configurable via command-line arguments or environment variables; values that are currently hardcoded are marked as such:

### Download Stage (Python)

- `DATE`: End date for download (YYYY-MM-DD), default: today
- `DAYS`: Days lookback, default: 7
- `resolution`: Output resolution in meters, default: 3
- `max_threads`: Concurrent download threads, default: 15
- Grid split: `(5, 5)` bounding boxes (hardcoded)

### CI Extraction Stage (R)

- `end_date`: End date (YYYY-MM-DD)
- `offset`: Days lookback (default: 7)
- `project_dir`: Project directory name (required)
- `data_source`: Source folder (merged_tif_8b, merged_tif, or merged_final_tif)
- Auto-detection: If `daily_tiles_split/` exists, uses tile-based processing

### Mosaic Creation Stage (R)

- `end_date`: End date
- `offset`: Days lookback
- `project_dir`: Project directory
- `file_name`: Custom output filename (optional)
- Cloud thresholds: 5% (strict), 45% (relaxed) - hardcoded

### Field Analysis Stage (R)

- `end_date`: End date
- `project_dir`: Project directory
- Parallel workers: Auto-detected via `future::plan()` or user-configurable
- Thresholds: CV, change, weed detection - configurable in code

---

## Database Usage

The system does NOT write to the database during analysis. Database tables (`project_reports`, `project_mosaics`, `project_mailings`) are maintained by the Laravel application for:

- Report metadata tracking
- Email delivery history
- Report version control

The file system is the single source of truth for all analysis data.

```mermaid
graph TD
    FieldBoundaries --> KPIScript
    HarvestData --> KPIScript
    InterpolatedModel --> KPIScript
    KPIScript --> KPI1
    KPIScript --> KPI2
    KPIScript --> KPI3
    KPIScript --> KPI4
    KPIScript --> KPI5
    KPIScript --> KPI6
    KPI1 & KPI2 & KPI3 & KPI4 & KPI5 & KPI6 --> KPIParams
    KPIParams --> KPIResults

    WeeklyMosaic --> ReportScript
    KPIResults --> ReportScript
    FieldBoundaries --> ReportScript
    ReportScript --> Visualizations
    Visualizations --> FinalReport
    FinalReport --> EmailDelivery
    FinalReport --> WebDashboard

    Laravel --> ShellScripts
    ShellScripts -.->|Triggers| Download
    ShellScripts -.->|Triggers| CIScript
    ShellScripts -.->|Triggers| GrowthScript
    ShellScripts -.->|Triggers| MosaicScript
    ShellScripts -.->|Triggers| KPIScript
    ShellScripts -.->|Triggers| ReportScript

    %% ===== STYLING =====
    style INPUTS fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style DOWNLOAD fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style CI_EXTRACTION fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style GROWTH_MODEL fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    style WEEKLY_MOSAIC fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    style KPI_CALC fill:#e0f2f1,stroke:#00796b,stroke-width:2px
    style REPORTING fill:#fff9c4,stroke:#f9a825,stroke-width:2px
    style OUTPUTS fill:#ffebee,stroke:#c62828,stroke-width:2px

    style DailyMosaics fill:#ffccbc,stroke:#333,stroke-width:1px
    style CombinedCI fill:#ffccbc,stroke:#333,stroke-width:1px
    style InterpolatedModel fill:#ffccbc,stroke:#333,stroke-width:1px
    style WeeklyMosaic fill:#ffccbc,stroke:#333,stroke-width:1px
    style KPIResults fill:#ffccbc,stroke:#333,stroke-width:1px
    style FinalReport fill:#ffccbc,stroke:#333,stroke-width:1px
```

### Overall System Architecture

This diagram provides a high-level overview of the complete SmartCane system, showing how major components interact. It focuses on the system boundaries and main data flows between the Python API Downloader, R Processing Engine, Laravel Web App, and data storage components. This view helps understand how the system works as a whole.
```mermaid
graph TD
    A["fa:fa-satellite External Satellite Data Providers API"] --> PyDL["fa:fa-download Python API Downloader"];
    C["fa:fa-users Users: Farm Data Input e.g., GeoJSON, Excel"] --> D{"fa:fa-laptop-code Laravel Web App"};

    subgraph SmartCane System
        PyDL --> G["fa:fa-folder-open File System: Raw Satellite Imagery, Rasters, RDS, Reports, Boundaries"];
        E["fa:fa-cogs R Processing Engine"] -- Reads --> G;
        E -- Writes --> G;

        D -- Manages/Triggers --> F["fa:fa-terminal Shell Script Orchestration"];
        F -- Executes --> PyDL;
        F -- Executes --> E;

        D -- Manages/Accesses --> G;
        D -- Reads/Writes --> H["fa:fa-database Database: Project Metadata, Users, Schedules"];

        E -- Generates --> I["fa:fa-file-alt Agronomic Reports: DOCX, HTML"];
        D -- Accesses/Delivers --> I;
    end

    D --> J["fa:fa-desktop Users: Web Interface (future)"];
    I -- Via Email (SMTP) --> K["fa:fa-envelope Users: Email Reports"];

    style E fill:#f9f,stroke:#333,stroke-width:2px
    style D fill:#bbf,stroke:#333,stroke-width:2px
    style PyDL fill:#ffdd57,stroke:#333,stroke-width:2px
```

### R Processing Engine Detail

This diagram zooms in on the R Processing Engine subsystem, detailing the internal components and data flow. It shows how raw satellite imagery and field data progress through various R scripts to produce crop indices and reports. The diagram highlights the data transformation pipeline within this analytical core of the SmartCane system.
```mermaid
graph TD
    subgraph R Processing Engine
        direction TB

        subgraph Inputs
            SatelliteImages["fa:fa-image Raw Satellite Imagery"]
            FieldBoundaries["fa:fa-map-marker-alt Field Boundaries .geojson"]
            HarvestData["fa:fa-file-excel Harvest Data .xlsx"]
            ProjectParams["fa:fa-file-code Project Parameters .R"]
        end

        subgraph Core R Scripts & Processes
            ParamConfig("fa:fa-cogs parameters_project.R")
            MosaicScript("fa:fa-images mosaic_creation.R")
            CIExtractionScript("fa:fa-microscope ci_extraction.R")
            ReportUtils("fa:fa-tools executive_report_utils.R")
            DashboardRmd("fa:fa-tachometer-alt CI_report_dashboard_planet_enhanced.Rmd")
            SummaryRmd("fa:fa-list-alt CI_report_executive_summary.Rmd")
        end

        subgraph Outputs
            WeeklyMosaics["fa:fa-file-image Weekly Mosaics .tif"]
            CIDataRDS["fa:fa-database CI Data .rds"]
            CIRasters["fa:fa-layer-group CI Rasters .tif"]
            DashboardReport["fa:fa-chart-bar Dashboard Report .docx/.html"]
            SummaryReport["fa:fa-file-invoice Executive Summary .docx/.html"]
        end

        %% Data Flow
        ProjectParams --> ParamConfig;

        SatelliteImages --> MosaicScript;
        FieldBoundaries --> MosaicScript;
        ParamConfig --> MosaicScript;
        MosaicScript --> WeeklyMosaics;

        WeeklyMosaics --> CIExtractionScript;
        FieldBoundaries --> CIExtractionScript;
        ParamConfig --> CIExtractionScript;
        CIExtractionScript --> CIDataRDS;
        CIExtractionScript --> CIRasters;

        CIDataRDS --> ReportUtils;
        CIRasters --> ReportUtils;
        HarvestData --> ReportUtils;
        ParamConfig --> ReportUtils;

        ReportUtils --> DashboardRmd;
        ReportUtils --> SummaryRmd;
        ParamConfig --> DashboardRmd;
        ParamConfig --> SummaryRmd;

        DashboardRmd --> DashboardReport;
        SummaryRmd --> SummaryReport;
    end

    ShellOrchestration["fa:fa-terminal Shell Scripts e.g., build_mosaic.sh, build_report.sh"] -->|Triggers| R_Processing_Engine["fa:fa-cogs R Processing Engine"]

    style R_Processing_Engine fill:#f9f,stroke:#333,stroke-width:2px
    style Inputs fill:#ccf,stroke:#333,stroke-width:1px
    style Outputs fill:#cfc,stroke:#333,stroke-width:1px
    style Core_R_Scripts_Processes fill:#ffc,stroke:#333,stroke-width:1px
```

### Python API Downloader Detail

This diagram focuses on the Python API Downloader subsystem, showing its internal components and workflow. It illustrates how API credentials, field boundaries, and other inputs are processed through various Python functions to download, process, and prepare satellite imagery. This view reveals the technical implementation details of the data acquisition layer.

```mermaid
graph TD
    subgraph Python API Downloader
        direction TB

        subgraph Inputs_Py [Inputs]
            APICreds["fa:fa-key API Credentials (SH_CLIENT_ID, SH_CLIENT_SECRET)"]
            DateRangeParams["fa:fa-calendar-alt Date Range Parameters (days_needed, specific_date)"]
            GeoJSONInput["fa:fa-map-marker-alt Field Boundaries (pivot.geojson)"]
            ProjectConfig["fa:fa-cogs Project Configuration (project_name, paths)"]
            EvalScripts["fa:fa-file-code Evalscripts (JS for cloud masking & band selection)"]
        end

        subgraph Core_Python_Logic_Py [Core Python Logic & Libraries]
            SetupConfig["fa:fa-cog SentinelHubConfig & BYOC Definition"]
            DateSlotGen["fa:fa-calendar-check Date Slot Generation (slots)"]
            GeoProcessing["fa:fa-map GeoJSON Parsing & BBox Splitting (geopandas, BBoxSplitter)"]
            AvailabilityCheck["fa:fa-search-location Image Availability Check (SentinelHubCatalog)"]
            RequestHandler["fa:fa-paper-plane Request Generation (SentinelHubRequest, get_true_color_request_day)"]
            DownloadClient["fa:fa-cloud-download-alt Image Download (SentinelHubDownloadClient, download_function)"]
            MergeUtility["fa:fa-object-group Tile Merging (gdal.BuildVRT, gdal.Translate, merge_files)"]
            CleanupUtility["fa:fa-trash-alt Intermediate File Cleanup (empty_folders)"]
        end

        subgraph Outputs_Py [Outputs]
            RawSatImages["fa:fa-file-image Raw Downloaded Satellite Imagery Tiles (response.tiff in dated subfolders)"]
            MergedTifs["fa:fa-images Merged TIFs (merged_tif/{slot}.tif)"]
            VirtualRasters["fa:fa-layer-group Virtual Rasters (merged_virtual/merged{slot}.vrt)"]
            DownloadLogs["fa:fa-file-alt Console Output Logs (print statements)"]
        end

        ExternalSatAPI["fa:fa-satellite External Satellite Data Providers API (Planet via Sentinel Hub)"]

        %% Data Flow for Python Downloader
        APICreds --> SetupConfig;
        DateRangeParams --> DateSlotGen;
        GeoJSONInput --> GeoProcessing;
        ProjectConfig --> SetupConfig;
        ProjectConfig --> GeoProcessing;
        ProjectConfig --> MergeUtility;
        ProjectConfig --> CleanupUtility;
        EvalScripts --> RequestHandler;

        DateSlotGen -- Available Slots --> AvailabilityCheck;
        GeoProcessing -- BBox List --> AvailabilityCheck;
        SetupConfig --> AvailabilityCheck;
        AvailabilityCheck -- Filtered Slots & BBoxes --> RequestHandler;

        RequestHandler -- Download Requests --> DownloadClient;
        SetupConfig --> DownloadClient;
        DownloadClient -- Downloads Data From --> ExternalSatAPI;
        ExternalSatAPI -- Returns Image Data --> DownloadClient;
        DownloadClient -- Writes --> RawSatImages;
        DownloadClient -- Generates --> DownloadLogs;

        RawSatImages --> MergeUtility;
        MergeUtility -- Writes --> MergedTifs;
        MergeUtility -- Writes --> VirtualRasters;
    end

    ShellOrchestratorPy["fa:fa-terminal Shell Scripts (e.g., runpython.sh triggering planet_download.ipynb)"] -->|Triggers| Python_API_Downloader["fa:fa-download Python API Downloader"];

    style Python_API_Downloader fill:#ffdd57,stroke:#333,stroke-width:2px
    style Inputs_Py fill:#cdeeff,stroke:#333,stroke-width:1px
    style Outputs_Py fill:#d4efdf,stroke:#333,stroke-width:1px
    style Core_Python_Logic_Py fill:#fff5cc,stroke:#333,stroke-width:1px
    style ExternalSatAPI fill:#f5b7b1,stroke:#333,stroke-width:2px
```

### SmartCane Engine Integration Diagram

This diagram illustrates the integration of Python and R components within the SmartCane Engine. Unlike the first diagram that shows the overall system, this one specifically focuses on how the two processing components interact with each other and the rest of the system. It emphasizes the orchestration layer and data flows between the core processing components and external systems.
```mermaid
graph TD
    %% External Systems & Users
    Users_DataInput["fa:fa-user Users: Farm Data Input (GeoJSON, Excel, etc.)"] --> Laravel_WebApp;
    ExternalSatAPI["fa:fa-satellite External Satellite Data Providers API"];

    %% Main Application Components
    Laravel_WebApp["fa:fa-globe Laravel Web App (Frontend & Control Plane)"];
    Shell_Orchestration["fa:fa-terminal Shell Script Orchestration (e.g., runcane.sh, runpython.sh, build_mosaic.sh)"];

    subgraph SmartCane_Engine ["SmartCane Engine (Data Processing Core)"]
        direction TB
        Python_Downloader["fa:fa-download Python API Downloader"];
        R_Engine["fa:fa-chart-line R Processing Engine"];
    end

    %% Data Storage
    FileSystem["fa:fa-folder File System (Raw Imagery, Rasters, RDS, Reports, Boundaries)"];
    Database["fa:fa-database Database (Project Metadata, Users, Schedules)"];

    %% User Outputs
    Users_WebView["fa:fa-desktop Users: Web Interface (future)"];
    Users_EmailReports["fa:fa-envelope Users: Email Reports (Agronomic Reports)"];
    AgronomicReports["fa:fa-file-alt Agronomic Reports (DOCX, HTML)"];

    %% --- Data Flows & Interactions ---

    %% Laravel to Orchestration & Engine
    Laravel_WebApp -- Manages/Triggers --> Shell_Orchestration;
    Shell_Orchestration -- Executes --> Python_Downloader;
    Shell_Orchestration -- Executes --> R_Engine;

    %% Python Downloader within Engine
    ExternalSatAPI -- Satellite Data --> Python_Downloader;
    Python_Downloader -- Writes Raw Data --> FileSystem;
    %% Inputs to Python (simplified for this view - details in Python-specific diagram)
    %% Laravel_WebApp -- Provides Config/Boundaries --> Python_Downloader;

    %% R Engine within Engine
    %% Inputs to R (simplified - details in R-specific diagram)
    %% Laravel_WebApp -- Provides Config/Boundaries --> R_Engine;
    R_Engine -- Reads Processed Data/Imagery --> FileSystem;
    R_Engine -- Writes Derived Products --> FileSystem;
    R_Engine -- Generates --> AgronomicReports;

    %% Laravel interaction with Data Storage
    Laravel_WebApp -- Manages/Accesses --> FileSystem;
    Laravel_WebApp -- Reads/Writes --> Database;

    %% Output Delivery
    Laravel_WebApp --> Users_WebView;
    AgronomicReports --> Users_EmailReports;
    %% Assuming a mechanism like SMTP, potentially triggered by Laravel or R-Engine completion
    Laravel_WebApp -- Delivers/Displays --> AgronomicReports;

    %% Styling
    style SmartCane_Engine fill:#e6ffe6,stroke:#333,stroke-width:2px
    style Python_Downloader fill:#ffdd57,stroke:#333,stroke-width:2px
    style R_Engine fill:#f9f,stroke:#333,stroke-width:2px
    style Laravel_WebApp fill:#bbf,stroke:#333,stroke-width:2px
    style Shell_Orchestration fill:#f0ad4e,stroke:#333,stroke-width:2px
    style FileSystem fill:#d1e0e0,stroke:#333,stroke-width:1px
    style Database fill:#d1e0e0,stroke:#333,stroke-width:1px
    style ExternalSatAPI fill:#f5b7b1,stroke:#333,stroke-width:2px
    style AgronomicReports fill:#d4efdf,stroke:#333,stroke-width:1px
```

## Future Directions

Several enhancements and new capabilities are planned for the SmartCane platform:

- **Advanced Management Dashboard**: Development of a more comprehensive and interactive management dashboard to provide users with deeper insights and greater control over their operations.
- **Enhanced Yield Prediction Models**: Improving the accuracy and granularity of yield predictions by incorporating more variables and advanced machine learning techniques.
- **Integrated Weather and Irrigation Advice**: Leveraging weather forecast data and soil moisture information (potentially from new data sources) to provide precise irrigation scheduling and weather-related agronomic advice.
- **AI-Guided Agronomic Advice**: Implementing AI algorithms that analyze integrated data (satellite, weather, soil, farm practices) and offer tailored, actionable agronomic recommendations.
- **Automated Advice Generation**: Developing capabilities for the system to automatically generate and disseminate critical advice and alerts to users based on real-time data analysis.
- **Expanded Data Source Integration**:
  - **Radar Data**: Incorporating radar satellite imagery (e.g., Sentinel-1) for all-weather monitoring capabilities, particularly useful during cloudy seasons for assessing crop structure, soil moisture, and biomass.
  - **IoT and Ground Sensors**: Integrating data from in-field IoT devices and soil sensors for highly localized and continuous monitoring of environmental and soil conditions.
- **Client-Facing Portal**: Exploration and potential development of a client-facing portal to allow end-users direct access to their data, dashboards, and reports, complementing the current internal management interface.

Together, these developments aim to make SmartCane a more capable decision support system for sustainable and efficient agricultural practices.

## Conclusion and Integration Summary

The SmartCane architecture combines several technologies and subsystems into one processing chain. The key subsystems work together as follows:

### Subsystem Integration

1. **Data Flow Sequence**
   - The Laravel Web App initiates the workflow and manages user interactions
   - Shell scripts orchestrate the execution sequence of the processing subsystems
   - The Python API Downloader acquires raw data from external sources
   - The R Processing Engine transforms this data into actionable insights
   - Results flow back to users through email reports and, in future, the web interface

2. **Technology Integration**
   - **Python + R**: Each language is used for its strengths: Python for API communication and data acquisition, R for statistical analysis and report generation
   - **Laravel + Processing Engine**: Clear separation between web presentation layer and computational backend
   - **File System + Database**: Hybrid data storage approach with file system for imagery and reports, database for metadata and user information

3. **Key Integration Mechanisms**
   - **File System Bridge**: The different subsystems primarily communicate through standardized file formats (GeoTIFF, GeoJSON, RDS, DOCX)
   - **Shell Script Orchestration**: Acts as the "glue" between subsystems, ensuring proper execution sequence and environment setup
   - **Standardized Data Formats**: Use of widely-accepted geospatial and data formats enables interoperability

4. **Extensibility and Scalability**
   - The modular architecture allows for replacement or enhancement of individual components
   - The clear subsystem boundaries enable parallel development and testing
   - Standard interfaces simplify integration of new data sources, algorithms, or output methods

The SmartCane architecture balances complexity with maintainability by using well-established technologies and clear boundaries between subsystems. Separating data acquisition, processing, and presentation means that changes in one area have minimal impact on the others, while the file-based hand-offs give every stage a consistent way to pass data to the next.
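As a closing illustration of the file-system bridge described above, the following Python sketch derives a valid execution order purely from which artifacts each stage reads and writes. It is a simplified model, not the project's shell orchestration: the artifact labels are shortened stand-ins for the real paths, `execution_order` is a hypothetical helper, and non-file inputs such as API credentials are left out.

```python
from typing import Dict, List, Set, Tuple

Stage = Tuple[Set[str], Set[str]]  # (input artifacts, output artifacts)

GROWTH_RDS = "All_pivots_Cumulative_CI_quadrant_year_v2.rds"

# Inputs and outputs paraphrased from the stage descriptions in this document.
# Labels such as "merged_tif" stand for whole folders of dated files.
STAGES: Dict[str, Stage] = {
    "01_planet_download.py": ({"pivot.geojson"}, {"merged_tif"}),
    "02_ci_extraction.R": ({"merged_tif", "pivot.geojson"}, {"combined_CI_data.rds"}),
    "03_interpolate_growth_model.R": ({"combined_CI_data.rds", "harvest.xlsx"}, {GROWTH_RDS}),
    "04_mosaic_creation.R": ({"merged_tif", "pivot.geojson"}, {"weekly_mosaic"}),
    "09_field_analysis_weekly.R": (
        {"weekly_mosaic", GROWTH_RDS, "pivot.geojson", "harvest.xlsx"},
        {"field_analysis.xlsx", "kpi_summary_tables.rds"},
    ),
    "10_CI_report_with_kpis_simple.Rmd": (
        {"weekly_mosaic", "kpi_summary_tables.rds", "field_analysis.xlsx"},
        {"SmartCane_Report.docx"},
    ),
}


def execution_order(stages: Dict[str, Stage], available: Set[str]) -> List[str]:
    """Order stages so that every input exists before its stage runs.

    Raises ValueError if some stage can never run with the given files.
    """
    have = set(available)
    pending = dict(stages)
    order: List[str] = []
    while pending:
        # A stage is ready once all of its inputs are present.
        ready = sorted(name for name, (inputs, _) in pending.items() if inputs <= have)
        if not ready:
            raise ValueError("missing inputs for: " + ", ".join(sorted(pending)))
        for name in ready:
            order.append(name)
            have |= pending.pop(name)[1]  # outputs become available downstream
    return order


if __name__ == "__main__":
    for step in execution_order(STAGES, {"pivot.geojson", "harvest.xlsx"}):
        print(step)
```

Because the mosaic stage depends only on the daily imagery and field boundaries, this model schedules it independently of the growth model, which matches the connections in the pipeline diagram. A missing input file, such as `harvest.xlsx`, surfaces as an explicit error rather than a failure partway through a script.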