updated sc-91

This commit is contained in:
Timon 2026-01-29 17:26:03 +01:00
parent 4445f72e6f
commit d1f352f21c
5 changed files with 1523 additions and 541 deletions

CODE_REVIEW_FINDINGS.md (new file)

@@ -0,0 +1,751 @@
# SmartCane Pipeline Code Review
## Efficiency, Cleanup, and Architecture Analysis
**Date**: January 29, 2026
**Scope**: `run_full_pipeline.R` + all called scripts (10, 20, 21, 30, 31, 40, 80, 90, 91) + utility files
**Status**: Comprehensive review completed
---
## EXECUTIVE SUMMARY
Your pipeline is **well-structured and intentional**, but has accumulated significant technical debt through development iterations. The main issues are:
1. **🔴 HIGH IMPACT**: **3 separate mosaic mode detection functions** doing identical work
2. **🔴 HIGH IMPACT**: **Week/year calculations duplicated 13+ times** across 6+ files
3. **🟡 MEDIUM IMPACT**: **40+ debug statements** cluttering output
4. **🟡 MEDIUM IMPACT**: **File existence checks repeated** in multiple places (especially KPI checks)
5. **🟢 LOW IMPACT**: Minor redundancy in command construction, but manageable
**Estimated cleanup effort**: 2-3 hours for core refactoring; significant code quality gains.
**Workflow clarity issue**: The split between `merged_tif` vs `merged_tif_8b` and `weekly_mosaic` vs `weekly_tile_max` is **not clearly documented**. This should be clarified.
---
## 1. DUPLICATED FUNCTIONS & LOGIC
### 1.1 Mosaic Mode Detection (CRITICAL REDUNDANCY)
**Problem**: Three identical implementations of `detect_mosaic_mode()`:
| Location | Function Name | Lines | Issue |
|----------|---------------|-------|-------|
| `run_full_pipeline.R` | `detect_mosaic_mode_early()` | ~20 lines | Detects tiled vs single-file |
| `run_full_pipeline.R` | `detect_mosaic_mode_simple()` | ~20 lines | Detects tiled vs single-file (duplicate) |
| `parameters_project.R` | `detect_mosaic_mode()` | ~30 lines | Detects tiled vs single-file (different signature) |
**Impact**: If you change the detection logic, you must update 3 places. Bug risk is high.
**Solution**: Create **single canonical function in `parameters_project.R`**:
```r
# SINGLE SOURCE OF TRUTH
detect_mosaic_mode <- function(project_dir) {
  weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
  if (dir.exists(weekly_tile_max)) {
    subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
    if (length(grep("^\\d+x\\d+$", subfolders)) > 0) return("tiled")
  }
  weekly_mosaic <- file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
  if (dir.exists(weekly_mosaic) &&
      length(list.files(weekly_mosaic, pattern = "^week_.*\\.tif$")) > 0) {
    return("single-file")
  }
  return("unknown")
}
```
Then replace all three calls in `run_full_pipeline.R` with this single function.
---
### 1.2 Week/Year Calculations (CRITICAL REDUNDANCY)
**Problem**: The pattern `week_num <- as.numeric(format(..., "%V"))` + `year_num <- as.numeric(format(..., "%G"))` appears **13+ times** across multiple files.
**Locations**:
- `run_full_pipeline.R`: Lines 82, 126-127, 229-230, 630, 793-794 (5 times)
- `80_calculate_kpis.R`: Lines 323-324 (1 time)
- `80_weekly_stats_utils.R`: Lines 829-830 (1 time)
- `kpi_utils.R`: Line 45 (1 time)
- `80_kpi_utils.R`: Lines 177-178 (1 time)
- Plus inline in sprintf statements: ~10+ additional times
**Impact**:
- High maintenance burden
- Risk of inconsistency (%V vs %Y confusion noted at line 82 in `run_full_pipeline.R`)
- Code verbosity
**Solution**: Create **utility function in `parameters_project.R`**:
```r
get_iso_week_year <- function(date) {
  list(
    week = as.numeric(format(date, "%V")),
    year = as.numeric(format(date, "%G"))  # ISO year, not calendar year
  )
}

# Usage:
wwy <- get_iso_week_year(end_date)
cat(sprintf("Week %02d/%d\n", wwy$week, wwy$year))
```
**Also add convenience function**:
```r
format_week_year <- function(date, separator = "_") {
  wwy <- get_iso_week_year(date)
  sprintf("week_%02d%s%d", wwy$week, separator, wwy$year)
}

# Usage: format_week_year(end_date)  # e.g. "week_02_2026"
```
---
### 1.3 File Path Construction (MEDIUM REDUNDANCY)
**Problem**: Repeated patterns like:
```r
file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", kpi_subdir)
```
**Solution**: Centralize in `parameters_project.R`:
```r
# Project-agnostic path builders
get_project_storage_path <- function(project_dir, subdir = NULL) {
  base <- file.path("laravel_app", "storage", "app", project_dir)
  if (!is.null(subdir)) file.path(base, subdir) else base
}

get_mosaic_dir <- function(project_dir, mosaic_mode = "auto") {
  if (mosaic_mode == "auto") mosaic_mode <- detect_mosaic_mode(project_dir)
  if (mosaic_mode == "tiled") {
    get_project_storage_path(project_dir, "weekly_tile_max/5x5")
  } else {
    get_project_storage_path(project_dir, "weekly_mosaic")
  }
}

get_kpi_dir <- function(project_dir, client_type) {
  subdir <- if (client_type == "agronomic_support") "field_level" else "field_analysis"
  get_project_storage_path(project_dir, file.path("reports", "kpis", subdir))
}
```
---
## 2. DEBUG STATEMENTS & LOGGING CLUTTER
### 2.1 Excessive Debug Output
The pipeline prints **40+ debug statements** that pollute the terminal output. Examples:
**In `run_full_pipeline.R`**:
```r
Line 82:  cat(sprintf(" Running week: %02d / %d\n", ...))  # Bug: format() uses "%Y" (calendar year) instead of "%G"
Line 218: cat(sprintf("[KPI_DIR_CREATED] Created directory: %s\n", ...))
Line 223: cat(sprintf("[KPI_DIR_EXISTS] %s\n", ...))
Line 224: cat(sprintf("[KPI_DEBUG] Total files in directory: %d\n", ...))
Line 225: cat(sprintf("[KPI_DEBUG] Sample files: %s\n", ...))
Line 240: cat(sprintf("[KPI_DEBUG_W%02d_%d] Pattern: '%s' | Found: %d files\n", ...))
Line 630: cat("DEBUG: Running command:", cmd, "\n")  # Script 31 execution - prints the full conda command
```
**In `80_calculate_kpis.R`**:
```
Line 323: message(paste("Calculating statistics for all fields - Week", week_num, year))
Line 417: # Plus many more ...
```
**Impact**:
- Makes output hard to scan for real issues
- Trains developers to skim past output, so important messages get missed
- Production logs become noise
**Solution**: Replace with **structured logging** (3 levels):
```r
# Add to parameters_project.R
smartcane_log <- function(message, level = "INFO") {
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  prefix <- sprintf("[%s] %s", level, timestamp)
  cat(sprintf("%s | %s\n", prefix, message))
}

smartcane_debug <- function(message) {
  if (Sys.getenv("SMARTCANE_DEBUG") == "TRUE") {
    smartcane_log(message, level = "DEBUG")
  }
}

smartcane_warn <- function(message) {
  smartcane_log(message, level = "WARN")
}
```
**Usage**:
```r
# Keep important messages
smartcane_log(sprintf("Downloaded %d dates, %d failed", download_count, download_failed))
# Hide debug clutter (only show if DEBUG=TRUE)
smartcane_debug(sprintf("KPI directory exists: %s", kpi_dir))
# Warnings stay visible
smartcane_warn("Some downloads failed, but continuing pipeline")
```
---
### 2.2 Redundant Status Checks in KPI Section
**Lines 218-270 in `run_full_pipeline.R`**: The KPI requirement check has **deeply nested debug statements**.
```r
if (dir.exists(kpi_dir)) {
  cat(sprintf("[KPI_DIR_EXISTS] %s\n", kpi_dir))
  all_kpi_files <- list.files(kpi_dir)
  cat(sprintf("[KPI_DEBUG] Total files in directory: %d\n", length(all_kpi_files)))
  if (length(all_kpi_files) > 0) {
    cat(sprintf("[KPI_DEBUG] Sample files: %s\n", ...))
  }
} else {
  cat(sprintf("[KPI_DIR_MISSING] Directory does not exist: %s\n", kpi_dir))
}
```
**Solution**: Simplify to:
```r
if (!dir.exists(kpi_dir)) {
  dir.create(kpi_dir, recursive = TRUE, showWarnings = FALSE)
}
all_kpi_files <- list.files(kpi_dir)
smartcane_debug(sprintf("KPI directory: %d files found", length(all_kpi_files)))
```
---
## 3. DOUBLE CALCULATIONS & INEFFICIENCIES
### 3.1 KPI Existence Check (Calculated Twice)
**Problem**: KPI existence is checked **twice** in `run_full_pipeline.R`:
1. **First check (Lines 228-270)**: Initial KPI requirement check that calculates `kpis_needed` dataframe
2. **Second check (Lines 786-810)**: Verification after Script 80 runs (almost identical logic)
Both loops do:
```r
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
  check_date <- end_date - (weeks_back * 7)
  week_num <- as.numeric(format(check_date, "%V"))
  year_num <- as.numeric(format(check_date, "%G"))
  week_pattern <- sprintf("week%02d_%d", week_num, year_num)
  kpi_files_this_week <- list.files(kpi_dir, pattern = week_pattern)
  has_kpis <- length(kpi_files_this_week) > 0
  # ... same logic again
}
```
**Impact**: Slower pipeline execution, code duplication
**Solution**: Create **reusable function in utility file**:
```r
check_kpi_completeness <- function(project_dir, client_type, end_date, reporting_weeks_needed) {
  kpi_dir <- get_kpi_dir(project_dir, client_type)
  kpis_needed <- data.frame()
  for (weeks_back in 0:(reporting_weeks_needed - 1)) {
    check_date <- end_date - (weeks_back * 7)
    wwy <- get_iso_week_year(check_date)
    week_pattern <- sprintf("week%02d_%d", wwy$week, wwy$year)
    has_kpis <- any(grepl(week_pattern, list.files(kpi_dir)))
    kpis_needed <- rbind(kpis_needed, data.frame(
      week = wwy$week,
      year = wwy$year,
      date = check_date,
      has_kpis = has_kpis
    ))
  }
  return(list(
    kpis_df = kpis_needed,
    missing_count = sum(!kpis_needed$has_kpis),
    all_complete = all(kpis_needed$has_kpis)
  ))
}

# Then in run_full_pipeline.R:
initial_kpi_check <- check_kpi_completeness(project_dir, client_type, end_date, reporting_weeks_needed)
# ... after Script 80 runs:
final_kpi_check <- check_kpi_completeness(project_dir, client_type, end_date, reporting_weeks_needed)
if (final_kpi_check$all_complete) {
  smartcane_log("✓ All KPIs available")
}
```
---
### 3.2 Mosaic Mode Detection (Called Multiple Times per Run)
**Current code**:
- Line 99-117: `detect_mosaic_mode_early()` called once
- Line 301-324: `detect_mosaic_mode_simple()` called again
- Result: **Same detection logic runs twice unnecessarily**
**Solution**: Call once, store result:
```r
mosaic_mode <- detect_mosaic_mode(project_dir)  # Once at top

# Then reuse throughout:
if (mosaic_mode == "tiled") {
  ...
} else if (mosaic_mode == "single-file") {
  ...
}
```
---
### 3.3 Missing Weeks Calculation Inefficiency
**Lines 126-170**: The loop builds `weeks_needed` dataframe, then **immediately** iterates again to find which ones are missing.
**Current code**:
```r
# First: build all weeks
weeks_needed <- data.frame()
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
  # ... build weeks_needed
}

# Then: check which are missing (loop again)
missing_weeks <- data.frame()
for (i in 1:nrow(weeks_needed)) {
  # ... check each week
}
```
**Solution**: Combine into **single loop**:
```r
weeks_needed <- data.frame()
missing_weeks <- data.frame()
earliest_missing_date <- end_date
mosaic_dir <- get_mosaic_dir(project_dir, mosaic_mode)  # hoisted out of the loop
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
  check_date <- end_date - (weeks_back * 7)
  wwy <- get_iso_week_year(check_date)
  # Add to weeks_needed
  weeks_needed <- rbind(weeks_needed, data.frame(
    week = wwy$week, year = wwy$year, date = check_date
  ))
  # Check if missing, add to missing_weeks if so
  week_pattern <- sprintf("week_%02d_%d", wwy$week, wwy$year)
  if (length(list.files(mosaic_dir, pattern = week_pattern)) == 0) {
    missing_weeks <- rbind(missing_weeks, data.frame(
      week = wwy$week, year = wwy$year, week_end_date = check_date
    ))
    if (check_date - 6 < earliest_missing_date) {
      earliest_missing_date <- check_date - 6
    }
  }
}
```
---
### 3.4 Data Source Detection Logic
**Lines 58-84**: The `data_source_used` detection is overly complex:
```r
data_source_used <- "merged_tif_8b"  # Default
if (dir.exists(merged_tif_path)) {
  tif_files <- list.files(merged_tif_path, pattern = "\\.tif$")
  if (length(tif_files) > 0) {
    data_source_used <- "merged_tif"
    # ...
  } else if (dir.exists(merged_tif_8b_path)) {
    tif_files_8b <- list.files(merged_tif_8b_path, pattern = "\\.tif$")
    # ...
  }
} else if (dir.exists(merged_tif_8b_path)) {
  # ...
}
```
**Issues**:
- Multiple nested conditions doing the same check
- `tif_files` and `tif_files_8b` are listed, but only their counts are checked; the file lists themselves are never used
- Logic could be cleaner
**Solution**: Create utility function:
```r
detect_data_source <- function(project_dir) {
  storage_dir <- get_project_storage_path(project_dir)
  for (source in c("merged_tif", "merged_tif_8b")) {
    source_dir <- file.path(storage_dir, source)
    if (dir.exists(source_dir)) {
      tifs <- list.files(source_dir, pattern = "\\.tif$")
      if (length(tifs) > 0) return(source)
    }
  }
  smartcane_warn("No data source found - defaulting to merged_tif_8b")
  return("merged_tif_8b")
}
```
---
## 4. WORKFLOW CLARITY ISSUES
### 4.1 TIFF Data Format Confusion
**Problem**: Why are there TWO different TIFF folders?
- `merged_tif`: 4-band data (RGB + NIR)
- `merged_tif_8b`: 8-band data (appears to include UDM cloud masking from Planet)
**Currently in code**:
```r
data_source <- if (project_dir == "angata") "merged_tif_8b" else "merged_tif"
```
**Issues**:
- Hard-coded per project, not based on what's actually available
- Not documented **why** angata uses 8-band
- Unclear what the 8-band data adds (cloud masking? extra bands?)
- Scripts handle both, but it's not clear when to use which
**Recommendation**:
1. **Document in `parameters_project.R`** what each data source contains:
```r
DATA_SOURCE_FORMATS <- list(
  "merged_tif" = list(
    bands = 4,
    description = "4-band PlanetScope: Red, Green, Blue, NIR",
    projects = c("aura", "chemba", "xinavane"),
    note = "Standard format from Planet API"
  ),
  "merged_tif_8b" = list(
    bands = 8,
    description = "8-band PlanetScope with UDM: RGB+NIR + 4-band cloud mask",
    projects = c("angata"),
    note = "Enhanced with cloud confidence from UDM2 (Unusable Data Mask)"
  )
)
```
2. **Update hard-coded assignment** to be data-driven:
```r
# OLD: data_source <- if (project_dir == "angata") "merged_tif_8b" else "merged_tif"
# NEW: detect what's actually available
data_source <- detect_data_source(project_dir)
```
---
### 4.2 Mosaic Storage Format Confusion
**Problem**: Why are there TWO different mosaic storage styles?
- `weekly_mosaic/`: Single TIF file per week (monolithic)
- `weekly_tile_max/5x5/`: Tiled TIFFs per week (25+ files per week)
**Currently in code**:
- Detected automatically via `detect_mosaic_mode()`
- But **no documentation** on when/why each is used
**Recommendation**:
1. **Document the trade-offs in `parameters_project.R`**:
```r
MOSAIC_MODES <- list(
  "single-file" = list(
    description = "One TIF per week",
    storage_path = "weekly_mosaic/",
    files_per_week = 1,
    pros = c("Simpler file management", "Easier to load full mosaic"),
    cons = c("Slower for field-specific analysis", "Large file I/O"),
    suitable_for = c("agronomic_support", "dashboard visualization")
  ),
  "tiled" = list(
    description = "5x5 grid of tiles per week",
    storage_path = "weekly_tile_max/5x5/",
    files_per_week = 25,
    pros = c("Parallel field processing", "Faster per-field queries", "Scalable to 1000+ fields"),
    cons = c("More file management", "Requires tile_grid metadata"),
    suitable_for = c("cane_supply", "large-scale operations")
  )
)
```
2. **Document why angata uses tiled, aura uses single-file**:
- Is it a function of field count? (Angata = cane_supply, large fields → tiled)
- Is it historical? (Legacy decision?)
- Should new projects choose based on client type?
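If the answer turns out to be "choose by client type", the policy could be made explicit in code rather than left implicit per project. A minimal sketch (note: `choose_mosaic_mode` is a hypothetical helper, not existing pipeline code):

```r
# Hypothetical helper: pick the mosaic mode for a NEW project from its client type.
# Existing projects would keep whatever detect_mosaic_mode() finds on disk.
choose_mosaic_mode <- function(client_type) {
  if (client_type == "cane_supply") "tiled" else "single-file"
}

choose_mosaic_mode("cane_supply")        # "tiled"
choose_mosaic_mode("agronomic_support")  # "single-file"
```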
---
### 4.3 Client Type Mapping Clarity
**Current structure** in `parameters_project.R`:
```r
CLIENT_TYPE_MAP <- list(
  "angata" = "cane_supply",
  "aura" = "agronomic_support",
  "chemba" = "cane_supply",
  "xinavane" = "cane_supply",
  "esa" = "cane_supply"
)
```
**Issues**:
- Not clear **why** aura is agronomic_support while angata/chemba are cane_supply
- No documentation of what each client type needs
- Scripts branch heavily on `skip_cane_supply_only` logic
**Recommendation**:
Add metadata to explain the distinction:
```r
CLIENT_TYPES <- list(
  "cane_supply" = list(
    description = "Sugar mill supply chain optimization",
    requires_harvest_prediction = TRUE,  # Script 31
    requires_phase_assignment = TRUE,    # Based on planting date
    per_field_detail = TRUE,             # Script 91 Excel report
    data_sources = c("merged_tif", "merged_tif_8b"),
    mosaic_mode = "tiled",
    projects = c("angata", "chemba", "xinavane", "esa")
  ),
  "agronomic_support" = list(
    description = "Farm-level decision support for agronomists",
    requires_harvest_prediction = FALSE,
    requires_phase_assignment = FALSE,
    per_field_detail = FALSE,
    farm_level_kpis = TRUE,              # Script 90 Word report
    data_sources = c("merged_tif"),
    mosaic_mode = "single-file",
    projects = c("aura")
  )
)
```
---
## 5. COMMAND CONSTRUCTION REDUNDANCY
### 5.1 Rscript Path Repetition
**Problem**: The Rscript path is hard-coded 5 times; three examples:
```r
Line 519: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
Line 676: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
Line 685: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
```
**Solution**: Define once in `parameters_project.R`:
```r
RSCRIPT_PATH <- "C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"
# Usage:
cmd <- sprintf('"%s" --vanilla r_app/20_ci_extraction.R ...', RSCRIPT_PATH)
```
---
## 6. SPECIFIC LINE-BY-LINE ISSUES
### 6.1 Line 82 Bug: Wrong Format Code
```r
cat(sprintf(" Running week: %02d / %d\n",
            as.numeric(format(end_date, "%V")),
            as.numeric(format(end_date, "%Y"))))  # ❌ Should be %G, not %Y
```
**Issue**: Uses calendar year `%Y` instead of ISO week year `%G`. On dates like 2025-12-30 (week 1 of 2026), this will print "Week 01 / 2025" (confusing).
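The discrepancy is easy to reproduce in a plain R session (both conversion codes are documented in `?strptime`):

```r
# 2025-12-30 falls in ISO week 1 of ISO year 2026
d <- as.Date("2025-12-30")
iso_week <- as.numeric(format(d, "%V"))  # 1
iso_year <- as.numeric(format(d, "%G"))  # 2026 (ISO week-based year)
cal_year <- as.numeric(format(d, "%Y"))  # 2025 (calendar year - the bug)
cat(sprintf("Buggy: Week %02d / %d   Correct: Week %02d / %d\n",
            iso_week, cal_year, iso_week, iso_year))
```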
**Fix**:
```r
wwy <- get_iso_week_year(end_date)
cat(sprintf(" Running week: %02d / %d\n", wwy$week, wwy$year))
```
---
### 6.2 Line 630 Debug Statement
```r
cmd <- sprintf('conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s', project_dir)
cat("DEBUG: Running command:", cmd, "\n") # ❌ Prints full conda command
```
**Solution**: Use `smartcane_debug()` function:
```r
cmd <- sprintf('conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s', project_dir)
smartcane_debug(sprintf("Running Python 31: %s", cmd))
```
---
### 6.3 Lines 719-723: Verbose Script 31 Verification
```r
# Check for THIS WEEK's specific file
current_week <- as.numeric(format(end_date, "%V"))
current_year <- as.numeric(format(end_date, "%Y"))
expected_file <- file.path(...)
```
**Issue**: Calculates week twice (already done earlier). Also uses `%Y` (should be `%G`).
**Solution**: Reuse earlier `wwy` calculation or create helper.
---
## 7. REFACTORING ROADMAP
### Phase 1: Foundation (1 hour)
- [ ] Consolidate `detect_mosaic_mode()` into single function in `parameters_project.R`
- [ ] Create `get_iso_week_year()` and `format_week_year()` utilities
- [ ] Create `get_project_storage_path()`, `get_mosaic_dir()`, `get_kpi_dir()` helpers
- [ ] Add logging functions (`smartcane_log()`, `smartcane_debug()`, `smartcane_warn()`)
### Phase 2: Deduplication (1 hour)
- [ ] Replace all 13+ week_num/year_num calculations with `get_iso_week_year()`
- [ ] Replace all 3 `detect_mosaic_mode_*()` calls with single function
- [ ] Combine duplicate KPI checks into `check_kpi_completeness()` function
- [ ] Fix line 82 and 630 format bugs
### Phase 3: Cleanup (1 hour)
- [ ] Remove all debug statements (40+), replace with `smartcane_debug()`
- [ ] Simplify nested conditions in data_source detection
- [ ] Combine missing weeks detection into single loop
- [ ] Extract Rscript path to constant
### Phase 4: Documentation (30 min)
- [ ] Add comments explaining `merged_tif` vs `merged_tif_8b` trade-offs
- [ ] Document `single-file` vs `tiled` mosaic modes and when to use each
- [ ] Clarify client type mapping in `CLIENT_TYPE_MAP`
- [ ] Add inline comments for non-obvious logic
---
## 8. ARCHITECTURE & WORKFLOW RECOMMENDATIONS
### 8.1 Clear Data Flow Diagram
Add to `r_app/system_architecture/system_architecture.md`:
```
INPUT SOURCES:
├── Planet API 4-band or 8-band imagery
├── Field boundaries (pivot.geojson)
└── Harvest data (harvest.xlsx, required for cane_supply)
STORAGE TIERS:
├── Tier 1: Raw data (merged_tif/ or merged_tif_8b/)
├── Tier 2: Daily tiles (daily_tiles_split/{grid_size}/{dates}/)
├── Tier 3: Extracted CI (Data/extracted_ci/daily_vals/*.rds)
├── Tier 4: Weekly mosaics (weekly_mosaic/ OR weekly_tile_max/5x5/)
└── Tier 5: KPI outputs (reports/kpis/{field_level|field_analysis}/)
DECISION POINTS:
└─ Client type (cane_supply vs agronomic_support)
├─ Drives script selection (Scripts 21, 22, 23, 31, 90/91)
├─ Drives data source (merged_tif_8b for cane_supply, merged_tif for agronomic)
├─ Drives mosaic mode (tiled for cane_supply, single-file for agronomic)
└─ Drives KPI subdirectory (field_analysis vs field_level)
```
### 8.2 .sh Scripts Alignment
You mention `.sh` scripts in the online environment. If they're **not calling the R pipeline**, there's a **split responsibility** issue:
**Question**: Are the `.sh` scripts:
- (A) Independent duplicates of the R pipeline logic? (BAD - maintenance nightmare)
- (B) Wrappers calling the R pipeline? (GOOD - single source of truth)
- (C) Different workflow for online vs local? (RED FLAG - they diverge)
**Recommendation**: If using `.sh` for production, ensure they **call the same R scripts** (`run_full_pipeline.R`). Example:
```bash
#!/bin/bash
# Wrapper that ensures the R pipeline is called
cd /path/to/smartcane
Rscript r_app/run_full_pipeline.R   # assumes Rscript is on PATH in the online environment
```
---
## 9. SUMMARY TABLE: Issues by Severity
| Issue | Type | Impact | Effort | Priority |
|-------|------|--------|--------|----------|
| 3 mosaic detection functions | Duplication | HIGH | 30 min | P0 |
| 13+ week/year calculations | Duplication | HIGH | 1 hour | P0 |
| 40+ debug statements | Clutter | MEDIUM | 1 hour | P1 |
| KPI check run twice | Inefficiency | LOW | 30 min | P2 |
| Line 82: %Y should be %G | Bug | LOW | 5 min | P2 |
| Data source confusion | Documentation | MEDIUM | 30 min | P1 |
| Mosaic mode confusion | Documentation | MEDIUM | 30 min | P1 |
| Client type mapping | Documentation | MEDIUM | 30 min | P1 |
| Data source detection complexity | Code style | LOW | 15 min | P3 |
---
## 10. RECOMMENDED NEXT STEPS
1. **Review this report** with your team to align on priorities
2. **Create Linear issues** for each phase of refactoring
3. **Start with Phase 1** (foundation utilities) - builds confidence for Phase 2
4. **Test thoroughly** after each phase - the pipeline is complex and easy to break
5. **Update `.sh` scripts** if they duplicate R logic
6. **Document data flow** in `system_architecture/system_architecture.md`
---
## Questions for Clarification
Before implementing, please clarify:
1. **Data source split**: Why does angata use `merged_tif_8b` (8-band with cloud mask) while aura uses `merged_tif` (4-band)? Is this:
- A function of client need (cane_supply requires cloud masking)?
- Historical (legacy decision for angata)?
- Should new projects choose based on availability?
2. **Mosaic mode split**: Why tiled for angata but single-file for aura? Should this be:
- Hard-coded per project?
- Based on field count/client type?
- Auto-detected from first run?
3. **Production vs local**: Are the `.sh` scripts in the online environment:
- Calling this same R pipeline?
- Duplicating logic independently?
- A different workflow entirely?
4. **Client type growth**: Are there other client types planned beyond `cane_supply` and `agronomic_support`? (e.g., extension_service?)
---
**Report prepared**: January 29, 2026
**Total code reviewed**: ~2,500 lines across 10 files
**Estimated refactoring time**: 3-4 hours
**Estimated maintenance savings**: 5-10 hours/month (fewer bugs, easier updates)


@@ -188,7 +188,7 @@ main <- function() {
   if (!exists("use_tile_mosaic")) {
     # Fallback detection if flag not set (shouldn't happen)
     merged_final_dir <- file.path(laravel_storage, "merged_final_tif")
-    tile_detection <- detect_mosaic_mode(merged_final_dir)
+    tile_detection <- detect_tile_structure_from_merged_final(merged_final_dir)
     use_tile_mosaic <- tile_detection$has_tiles
   }


@@ -3,12 +3,12 @@
 # Utility functions for creating weekly mosaics from daily satellite imagery.
 # These functions support cloud cover assessment, date handling, and mosaic creation.
-#' Detect whether a project uses tile-based or single-file mosaic approach
+#' Detect whether a project uses tile-based or single-file mosaic approach (utility version)
 #'
 #' @param merged_final_tif_dir Directory containing merged_final_tif files
 #' @return List with has_tiles (logical), detected_tiles (vector), total_files (count)
 #'
-detect_mosaic_mode <- function(merged_final_tif_dir) {
+detect_tile_structure_from_files <- function(merged_final_tif_dir) {
   # Check if directory exists
   if (!dir.exists(merged_final_tif_dir)) {
     return(list(has_tiles = FALSE, detected_tiles = character(), total_files = 0))


@@ -114,7 +114,7 @@ get_client_kpi_config <- function(client_type) {
 # 3. Smart detection for tile-based vs single-file mosaic approach
 # ----------------------------------------------------------------
-detect_mosaic_mode <- function(merged_final_tif_dir, daily_tiles_split_dir = NULL) {
+detect_tile_structure_from_merged_final <- function(merged_final_tif_dir, daily_tiles_split_dir = NULL) {
   # PRIORITY 1: Check for tiling_config.json metadata file from script 10
   # This is the most reliable source since script 10 explicitly records its decision
@@ -223,7 +223,7 @@ setup_project_directories <- function(project_dir, data_source = "merged_tif_8b"
   merged_final_dir <- here(laravel_storage_dir, "merged_final_tif")
   daily_tiles_split_dir <- here(laravel_storage_dir, "daily_tiles_split")
-  tile_detection <- detect_mosaic_mode(
+  tile_detection <- detect_tile_structure_from_merged_final(
     merged_final_tif_dir = merged_final_dir,
     daily_tiles_split_dir = daily_tiles_split_dir
   )
@@ -498,6 +498,279 @@ setup_logging <- function(log_dir) {
))
}
# 8. HELPER FUNCTIONS FOR COMMON CALCULATIONS
# -----------------------------------------------
# Centralized functions to reduce duplication across scripts
# Get ISO week and year from a date
get_iso_week <- function(date) {
  as.numeric(format(date, "%V"))
}

get_iso_year <- function(date) {
  as.numeric(format(date, "%G"))
}

# Get both ISO week and year as a list
get_iso_week_year <- function(date) {
  list(
    week = as.numeric(format(date, "%V")),
    year = as.numeric(format(date, "%G"))
  )
}

# Format week/year into a readable label
format_week_label <- function(date, separator = "_") {
  wwy <- get_iso_week_year(date)
  sprintf("week%02d%s%d", wwy$week, separator, wwy$year)
}
# Auto-detect mosaic mode (tiled vs single-file)
# Returns: "tiled", "single-file", or "unknown"
detect_mosaic_mode <- function(project_dir) {
  # Check for tile-based approach: weekly_tile_max/{grid_size}/week_*.tif
  weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
  if (dir.exists(weekly_tile_max)) {
    subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
    grid_patterns <- grep("^\\d+x\\d+$", subfolders, value = TRUE)
    if (length(grid_patterns) > 0) {
      return("tiled")
    }
  }
  # Check for single-file approach: weekly_mosaic/week_*.tif
  weekly_mosaic <- file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
  if (dir.exists(weekly_mosaic)) {
    files <- list.files(weekly_mosaic, pattern = "^week_.*\\.tif$")
    if (length(files) > 0) {
      return("single-file")
    }
  }
  return("unknown")
}
# Auto-detect grid size from tile directory structure
# Returns: e.g., "5x5", "10x10", or "unknown"
detect_grid_size <- function(project_dir) {
  weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
  if (dir.exists(weekly_tile_max)) {
    subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
    grid_patterns <- grep("^\\d+x\\d+$", subfolders, value = TRUE)
    if (length(grid_patterns) > 0) {
      return(grid_patterns[1])  # Return first match (usually only one)
    }
  }
  return("unknown")
}
# Build storage paths consistently across all scripts
get_project_storage_path <- function(project_dir, subdir = NULL) {
  base <- file.path("laravel_app", "storage", "app", project_dir)
  if (!is.null(subdir)) file.path(base, subdir) else base
}

get_mosaic_dir <- function(project_dir, mosaic_mode = "auto") {
  if (mosaic_mode == "auto") {
    mosaic_mode <- detect_mosaic_mode(project_dir)
  }
  if (mosaic_mode == "tiled") {
    grid_size <- detect_grid_size(project_dir)
    if (grid_size != "unknown") {
      get_project_storage_path(project_dir, file.path("weekly_tile_max", grid_size))
    } else {
      get_project_storage_path(project_dir, "weekly_tile_max/5x5")  # Fallback default
    }
  } else {
    get_project_storage_path(project_dir, "weekly_mosaic")
  }
}

get_kpi_dir <- function(project_dir, client_type) {
  subdir <- if (client_type == "agronomic_support") "field_level" else "field_analysis"
  get_project_storage_path(project_dir, file.path("reports", "kpis", subdir))
}
# Logging functions for clean output
smartcane_log <- function(message, level = "INFO", verbose = TRUE) {
  if (!verbose) return(invisible(NULL))
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  prefix <- sprintf("[%s] %s", level, timestamp)
  cat(sprintf("%s | %s\n", prefix, message))
}

smartcane_debug <- function(message, verbose = FALSE) {
  if (!verbose && Sys.getenv("SMARTCANE_DEBUG") != "TRUE") {
    return(invisible(NULL))
  }
  smartcane_log(message, level = "DEBUG", verbose = TRUE)
}

smartcane_warn <- function(message) {
  smartcane_log(message, level = "WARN", verbose = TRUE)
}
# ============================================================================
# PHASE 3 & 4: OPTIMIZATION & DOCUMENTATION
# ============================================================================
# System Constants
# ----------------
# Define once, use everywhere
RSCRIPT_PATH <- "C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"
# Used in run_full_pipeline.R for calling R scripts via system()
# Data Source Documentation
# ---------------------------
# Explains the two satellite data formats and when to use each
#
# SmartCane uses PlanetScope imagery from Planet Labs API in two formats:
#
# 1. merged_tif (4-band):
# - Standard format: Red, Green, Blue, Near-Infrared
# - Size: ~150-200 MB per date
# - Use case: Agronomic support, general crop health monitoring
# - Projects: aura, xinavane
# - Cloud handling: Basic cloud masking from Planet metadata
#
# 2. merged_tif_8b (8-band with cloud confidence):
# - Enhanced format: 4-band imagery + 4-band UDM2 cloud mask
# - UDM2 bands: Clear, Snow, Shadow, Light Haze
# - Size: ~250-350 MB per date
# - Use case: Harvest prediction, supply chain optimization
# - Projects: angata, chemba, esa (cane_supply clients)
# - Cloud handling: Per-pixel cloud confidence from Planet UDM2
# - Why: Cane supply chains need precise confidence to predict harvest dates
# (don't want to predict based on cloudy data)
#
# The system auto-detects which is available via detect_data_source()
# Mosaic Mode Documentation
# --------------------------
# SmartCane supports two ways to store and process weekly mosaics:
#
# 1. Single-file mosaic ("single-file"):
# - One GeoTIFF per week: weekly_mosaic/week_02_2026.tif
# - 5 bands per file: R, G, B, NIR, CI (Canopy Index)
# - Size: ~300-500 MB per week
# - Pros: Simpler file management, easier full-field visualization
# - Cons: Slower for field-specific queries, requires loading full raster
# - Best for: Agronomic support (aura) with <100 fields
# - Script 04 output: 5-band single-file mosaic
#
# 2. Tiled mosaic ("tiled"):
# - Grid of tiles per week: weekly_tile_max/5x5/week_02_2026_{TT}.tif
# - Example: 25 files (5×5 grid) × 5 bands = 125 individual tiffs
# - Size: ~15-20 MB per tile, organized in folders
# - Pros: Parallel processing, fast field lookups, scales to 1000+ fields
# - Cons: More file I/O, requires tile-to-field mapping metadata
# - Best for: Cane supply (angata, chemba) with 500+ fields
# - Script 04 output: Per-tile tiff files in weekly_tile_max/{grid}/
# - Tile assignment: Field boundaries mapped to grid coordinates
#
# The system auto-detects which is available via detect_mosaic_mode()
# Client Type Documentation
# --------------------------
# SmartCane runs different analysis pipelines based on client_type:
#
# CLIENT_TYPE: cane_supply
# Purpose: Optimize sugar mill supply chain (harvest scheduling)
# Scripts run: 20 (CI), 21 (RDS to CSV), 30 (Growth), 31 (Harvest pred), 40 (Mosaic), 80 (KPI), 91 (Excel)
# Outputs:
# - Per-field analysis: field status, growth phase, harvest readiness
# - Excel reports (Script 91): Detailed metrics for logistics planning
# - KPI directory: reports/kpis/field_analysis/ (one RDS per week)
# Harvest data: Required (harvest.xlsx - planting dates for phase assignment)
#   Data source: merged_tif_8b (per-pixel UDM2 cloud confidence feeds prediction confidence)
# Mosaic mode: tiled (scales to 500+ fields)
# Projects: angata, chemba, xinavane, esa
#
# CLIENT_TYPE: agronomic_support
# Purpose: Provide weekly crop health insights to agronomists
# Scripts run: 80 (KPI), 90 (Word report)
# Outputs:
# - Farm-level KPI summaries (no per-field breakdown)
# - Word reports (Script 90): Charts and trends for agronomist decision support
# - KPI directory: reports/kpis/field_level/ (one RDS per week)
# Harvest data: Not used
# Data source: merged_tif (simpler, smaller)
# Mosaic mode: single-file (100-200 fields)
# Projects: aura
#
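# Sketch: the per-client settings above, expressed as one lookup. Values are
# copied from this documentation block; treat them as defaults, not guarantees.
client_settings_sketch <- function(client_type) {
  if (client_type == "cane_supply") {
    list(kpi_subdir = "field_analysis", data_source = "merged_tif_8b",
         mosaic_mode = "tiled", report_script = "91_excel")
  } else { # agronomic_support
    list(kpi_subdir = "field_level", data_source = "merged_tif",
         mosaic_mode = "single-file", report_script = "90_word")
  }
}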
# Detect data source (merged_tif vs merged_tif_8b) based on availability
# Returns the first available source; defaults to merged_tif_8b if neither exists
detect_data_source <- function(project_dir) {
storage_dir <- get_project_storage_path(project_dir)
# Preferred order: check merged_tif first, fall back to merged_tif_8b
for (source in c("merged_tif", "merged_tif_8b")) {
source_dir <- file.path(storage_dir, source)
if (dir.exists(source_dir)) {
tifs <- list.files(source_dir, pattern = "\\.tif$")
if (length(tifs) > 0) {
smartcane_log(sprintf("Detected data source: %s (%d TIF files)", source, length(tifs)))
return(source)
}
}
}
smartcane_warn(sprintf("No data source found for %s - defaulting to merged_tif_8b", project_dir))
return("merged_tif_8b")
}
# Check KPI completeness for a reporting period
# Returns: List with kpis_df (data.frame), missing_count, and all_complete (boolean)
# This replaces duplicate KPI checking logic in run_full_pipeline.R (lines ~228-270, ~786-810)
check_kpi_completeness <- function(project_dir, client_type, end_date, reporting_weeks_needed) {
kpi_dir <- get_kpi_dir(project_dir, client_type)
kpis_needed <- data.frame()
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
wwy <- get_iso_week_year(check_date)
# Build week pattern and check for KPI files (CSV/JSON) from that week,
# matching the extension filter used elsewhere in the pipeline
week_pattern <- sprintf("week%02d_%d", wwy$week, wwy$year)
all_files <- list.files(kpi_dir, pattern = "\\.csv$|\\.json$")
files_this_week <- all_files[grepl(week_pattern, all_files, fixed = TRUE)]
has_kpis <- length(files_this_week) > 0
# Record this week's status
kpis_needed <- rbind(kpis_needed, data.frame(
week = wwy$week,
year = wwy$year,
date = check_date,
has_kpis = has_kpis,
pattern = week_pattern,
file_count = length(files_this_week)
))
# Debug logging
smartcane_debug(sprintf(
"Week %02d/%d (%s): %s (%d files)",
wwy$week, wwy$year, format(check_date, "%Y-%m-%d"),
if (has_kpis) "✓ FOUND" else "✗ MISSING",
length(files_this_week)
))
}
# Summary statistics
missing_count <- sum(!kpis_needed$has_kpis)
all_complete <- missing_count == 0
return(list(
kpis_df = kpis_needed,
kpi_dir = kpi_dir,
missing_count = missing_count,
missing_weeks = kpis_needed[!kpis_needed$has_kpis, ],
all_complete = all_complete
))
}
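# Usage sketch (values illustrative, kept commented so sourcing this file has
# no side effects): gate the report scripts (90/91) on KPI completeness;
# Script 80 then fills whatever is listed in missing_weeks.
# kpi_check <- check_kpi_completeness("angata", "cane_supply",
#                                     as.Date("2026-01-07"), reporting_weeks_needed = 4)
# if (!kpi_check$all_complete) {
#   smartcane_log(sprintf("%d week(s) missing KPIs", kpi_check$missing_count))
#   print(kpi_check$missing_weeks[, c("week", "year", "pattern")])
# }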
# 9. Initialize the project
# ----------------------
# Export project directories and settings
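# The refactors in run_full_pipeline.R replace hardcoded Rscript paths with
# RSCRIPT_PATH. Its definition is not shown in this commit; a plausible sketch
# (SMARTCANE_RSCRIPT is a hypothetical env-var override, not a confirmed name):
if (!exists("RSCRIPT_PATH")) {
  RSCRIPT_PATH <- Sys.getenv(
    "SMARTCANE_RSCRIPT",
    unset = file.path(R.home("bin"),
                      if (.Platform$OS.type == "windows") "Rscript.exe" else "Rscript")
  )
}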


@@ -31,7 +31,7 @@
# *** EDIT THESE VARIABLES ***
end_date <- as.Date("2026-01-07") # or specify: as.Date("2026-01-27") , Sys.Date()
project_dir <- "angata" # project name: "esa", "aura", "angata", "chemba"
project_dir <- "aura" # project name: "esa", "aura", "angata", "chemba"
data_source <- if (project_dir == "angata") "merged_tif_8b" else "merged_tif"
force_rerun <- FALSE # Set to TRUE to force all scripts to run even if outputs exist
# ***************************
@@ -45,30 +45,11 @@ cat(sprintf("\nProject: %s → Client Type: %s\n", project_dir, client_type))
# DETECT WHICH DATA SOURCE IS AVAILABLE (merged_tif vs merged_tif_8b)
# ==============================================================================
# Check which merged_tif folder actually has files for this project
laravel_storage_dir <- file.path("laravel_app", "storage", "app", project_dir)
merged_tif_path <- file.path(laravel_storage_dir, "merged_tif")
merged_tif_8b_path <- file.path(laravel_storage_dir, "merged_tif_8b")
data_source_used <- "merged_tif_8b" # Default
if (dir.exists(merged_tif_path)) {
tif_files <- list.files(merged_tif_path, pattern = "\\.tif$")
if (length(tif_files) > 0) {
data_source_used <- "merged_tif"
cat(sprintf("[INFO] Detected data source: %s (%d TIF files)\n", data_source_used, length(tif_files)))
} else if (dir.exists(merged_tif_8b_path)) {
tif_files_8b <- list.files(merged_tif_8b_path, pattern = "\\.tif$")
if (length(tif_files_8b) > 0) {
data_source_used <- "merged_tif_8b"
cat(sprintf("[INFO] Detected data source: %s (%d TIF files)\n", data_source_used, length(tif_files_8b)))
}
}
} else if (dir.exists(merged_tif_8b_path)) {
tif_files_8b <- list.files(merged_tif_8b_path, pattern = "\\.tif$")
if (length(tif_files_8b) > 0) {
data_source_used <- "merged_tif_8b"
cat(sprintf("[INFO] Detected data source: %s (%d TIF files)\n", data_source_used, length(tif_files_8b)))
}
}
# Uses centralized detection function from parameters_project.R
# NOTE: Old code below commented out - now handled by detect_data_source()
# laravel_storage_dir <- file.path("laravel_app", "storage", "app", project_dir)
# merged_tif_path <- file.path(laravel_storage_dir, "merged_tif")
data_source_used <- detect_data_source(project_dir)
# ==============================================================================
# DETERMINE REPORTING WINDOW (auto-calculated based on KPI requirements)
@@ -79,9 +60,11 @@ reporting_weeks_needed <- 4 # Default: KPIs need current week + 3 weeks history
offset <- (reporting_weeks_needed - 1) * 7 # Convert weeks to days
cat(sprintf("\n[INFO] Reporting window: %d weeks (%d days of data)\n", reporting_weeks_needed, offset))
cat(sprintf(" Running week: %02d / %d\n", as.numeric(format(end_date, "%V")), as.numeric(format(end_date, "%Y"))))
wwy_current <- get_iso_week_year(end_date)
cat(sprintf(" Running week: %02d / %d\n", wwy_current$week, wwy_current$year))
cat(sprintf(" Date range: %s to %s\n", format(end_date - offset, "%Y-%m-%d"), format(end_date, "%Y-%m-%d")))
# Format dates
end_date_str <- format(as.Date(end_date), "%Y-%m-%d")
@@ -95,37 +78,15 @@ pipeline_success <- TRUE
# Run this BEFORE downloads so we can download ONLY missing dates upfront
cat("\n========== EARLY CHECK: MOSAIC REQUIREMENTS FOR REPORTING WINDOW ==========\n")
# Detect mosaic mode early (before full checking section)
detect_mosaic_mode_early <- function(project_dir) {
weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
if (dir.exists(weekly_tile_max)) {
subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
grid_patterns <- grep("^\\d+x\\d+$", subfolders, value = TRUE)
if (length(grid_patterns) > 0) {
return("tiled")
}
}
weekly_mosaic <- file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
if (dir.exists(weekly_mosaic)) {
files <- list.files(weekly_mosaic, pattern = "^week_.*\\.tif$")
if (length(files) > 0) {
return("single-file")
}
}
return("unknown")
}
mosaic_mode <- detect_mosaic_mode_early(project_dir)
# Detect mosaic mode early (centralized function in parameters_project.R)
mosaic_mode <- detect_mosaic_mode(project_dir)
# Check what mosaics we NEED
weeks_needed <- data.frame()
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
week_num <- as.numeric(format(check_date, "%V"))
year_num <- as.numeric(format(check_date, "%G")) # %G = ISO week year (not calendar year %Y)
weeks_needed <- rbind(weeks_needed, data.frame(week = week_num, year = year_num, date = check_date))
wwy <- get_iso_week_year(check_date)
weeks_needed <- rbind(weeks_needed, data.frame(week = wwy$week, year = wwy$year, date = check_date))
}
missing_weeks_dates <- c() # Will store the earliest date of missing weeks
@@ -144,7 +105,7 @@ for (i in 1:nrow(weeks_needed)) {
files_this_week <- c()
if (mosaic_mode == "tiled") {
mosaic_dir_check <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max", "5x5")
mosaic_dir_check <- get_mosaic_dir(project_dir, mosaic_mode = "tiled")
if (dir.exists(mosaic_dir_check)) {
files_this_week <- list.files(mosaic_dir_check, pattern = week_pattern_check)
}
@@ -155,8 +116,10 @@ }
}
}
cat(sprintf(" Week %02d/%d (%s): %s\n", week_num, year_num, format(check_date, "%Y-%m-%d"),
if(length(files_this_week) > 0) "✓ EXISTS" else "✗ MISSING"))
cat(sprintf(
" Week %02d/%d (%s): %s\n", week_num, year_num, format(check_date, "%Y-%m-%d"),
if (length(files_this_week) > 0) "✓ EXISTS" else "✗ MISSING"
))
# If week is missing, track its date range for downloading/processing
if (length(files_this_week) == 0) {
@@ -175,8 +138,10 @@ if (earliest_missing_date < end_date) {
# Adjust offset to cover only the gap (from earliest missing week to end_date)
dynamic_offset <- as.numeric(end_date - earliest_missing_date)
cat(sprintf("[INFO] Will download/process ONLY missing dates: %d days (from %s to %s)\n",
dynamic_offset, format(earliest_missing_date, "%Y-%m-%d"), format(end_date, "%Y-%m-%d")))
cat(sprintf(
"[INFO] Will download/process ONLY missing dates: %d days (from %s to %s)\n",
dynamic_offset, format(earliest_missing_date, "%Y-%m-%d"), format(end_date, "%Y-%m-%d")
))
# Use dynamic offset for data generation scripts (10, 20, 30, 40)
# But Script 80 still uses full reporting_weeks_needed offset for KPI calculations
@@ -193,80 +158,39 @@ if (earliest_missing_date < end_date) {
# ==============================================================================
# Scripts 90 (Word report) and 91 (Excel report) require KPIs for full reporting window
# Script 80 ALWAYS runs and will CALCULATE missing KPIs, so this is just for visibility
# Uses centralized check_kpi_completeness() function from parameters_project.R
cat("\n========== KPI REQUIREMENT CHECK ==========\n")
cat(sprintf("KPIs needed for reporting: %d weeks (current week + %d weeks history)\n",
reporting_weeks_needed, reporting_weeks_needed - 1))
cat(sprintf(
"KPIs needed for reporting: %d weeks (current week + %d weeks history)\n",
reporting_weeks_needed, reporting_weeks_needed - 1
))
# Determine KPI directory based on client type
# - agronomic_support: field_level/ (6 farm-level KPIs)
# - cane_supply: field_analysis/ (per-field analysis)
kpi_subdir <- if (client_type == "agronomic_support") "field_level" else "field_analysis"
kpi_dir <- file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", kpi_subdir)
# Check KPI completeness (replaces duplicate logic from lines ~228-270 and ~786-810)
kpi_check <- check_kpi_completeness(project_dir, client_type, end_date, reporting_weeks_needed)
kpi_dir <- kpi_check$kpi_dir
kpis_needed <- kpi_check$kpis_df
kpis_missing_count <- kpi_check$missing_count
# Create KPI directory if it doesn't exist
if (!dir.exists(kpi_dir)) {
dir.create(kpi_dir, recursive = TRUE, showWarnings = FALSE)
cat(sprintf("[KPI_DIR_CREATED] Created directory: %s\n", kpi_dir))
}
kpis_needed <- data.frame()
kpis_missing_count <- 0
# Debug: Check if KPI directory exists
if (dir.exists(kpi_dir)) {
cat(sprintf("[KPI_DIR_EXISTS] %s\n", kpi_dir))
all_kpi_files <- list.files(kpi_dir)
cat(sprintf("[KPI_DEBUG] Total files in directory: %d\n", length(all_kpi_files)))
if (length(all_kpi_files) > 0) {
cat(sprintf("[KPI_DEBUG] Sample files: %s\n", paste(head(all_kpi_files, 3), collapse = ", ")))
}
} else {
cat(sprintf("[KPI_DIR_MISSING] Directory does not exist: %s\n", kpi_dir))
}
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
week_num <- as.numeric(format(check_date, "%V"))
year_num <- as.numeric(format(check_date, "%G"))
# Check for any KPI file from that week - use more flexible pattern matching
week_pattern <- sprintf("week%02d_%d", week_num, year_num)
kpi_files_this_week <- c()
if (dir.exists(kpi_dir)) {
# List all files and manually check for pattern match
all_files <- list.files(kpi_dir, pattern = "\\.csv$|\\.json$")
kpi_files_this_week <- all_files[grepl(week_pattern, all_files, fixed = TRUE)]
# Debug output for first week
if (weeks_back == 0) {
cat(sprintf("[KPI_DEBUG_W%02d_%d] Pattern: '%s' | Found: %d files\n",
week_num, year_num, week_pattern, length(kpi_files_this_week)))
if (length(kpi_files_this_week) > 0) {
cat(sprintf("[KPI_DEBUG_W%02d_%d] Files: %s\n",
week_num, year_num, paste(kpi_files_this_week, collapse = ", ")))
}
}
}
has_kpis <- length(kpi_files_this_week) > 0
kpis_needed <- rbind(kpis_needed, data.frame(
week = week_num,
year = year_num,
date = check_date,
has_kpis = has_kpis
# Display status for each week
for (i in 1:nrow(kpis_needed)) {
row <- kpis_needed[i, ]
cat(sprintf(
" Week %02d/%d (%s): %s (%d files)\n",
row$week, row$year, format(row$date, "%Y-%m-%d"),
if (row$has_kpis) "✓ EXISTS" else "✗ WILL BE CALCULATED",
row$file_count
))
if (!has_kpis) {
kpis_missing_count <- kpis_missing_count + 1
}
cat(sprintf(" Week %02d/%d (%s): %s\n",
week_num, year_num, format(check_date, "%Y-%m-%d"),
if(has_kpis) "✓ EXISTS" else "✗ WILL BE CALCULATED"))
}
cat(sprintf("\nKPI Summary: %d/%d weeks exist, %d week(s) will be calculated by Script 80\n",
nrow(kpis_needed) - kpis_missing_count, nrow(kpis_needed), kpis_missing_count))
cat(sprintf(
"\nKPI Summary: %d/%d weeks exist, %d week(s) will be calculated by Script 80\n",
nrow(kpis_needed) - kpis_missing_count, nrow(kpis_needed), kpis_missing_count
))
# Define conditional script execution based on client type
# Client types:
@@ -297,31 +221,7 @@ run_modern_report <- (client_type == "cane_supply") # Script 91 for cane supply
# ==============================================================================
cat("\n========== CHECKING EXISTING OUTPUTS ==========\n")
# Detect mosaic mode (tile-based vs single-file) automatically
detect_mosaic_mode_simple <- function(project_dir) {
# Check for tile-based approach: weekly_tile_max/{grid_size}/week_*.tif
weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
if (dir.exists(weekly_tile_max)) {
subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
grid_patterns <- grep("^\\d+x\\d+$", subfolders, value = TRUE)
if (length(grid_patterns) > 0) {
return("tiled")
}
}
# Check for single-file approach: weekly_mosaic/week_*.tif
weekly_mosaic <- file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
if (dir.exists(weekly_mosaic)) {
files <- list.files(weekly_mosaic, pattern = "^week_.*\\.tif$")
if (length(files) > 0) {
return("single-file")
}
}
return("unknown")
}
mosaic_mode <- detect_mosaic_mode_simple(project_dir)
# Use centralized mosaic mode detection from parameters_project.R
cat(sprintf("Auto-detected mosaic mode: %s\n", mosaic_mode))
# Check Script 10 outputs - FLEXIBLE: look for tiles either directly OR in grid subdirs
@@ -363,8 +263,7 @@ skip_40 <- (nrow(missing_weeks) == 0 && !force_rerun) # Only skip if NO missing
cat(sprintf("Script 40: %d missing week(s) to create\n", nrow(missing_weeks)))
# Check Script 80 outputs (KPIs in reports/kpis/{field_level|field_analysis})
# Use the same kpi_subdir logic to find the right directory
kpi_dir <- file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", kpi_subdir)
# kpi_dir already set by check_kpi_completeness() above
kpi_files <- if (dir.exists(kpi_dir)) {
list.files(kpi_dir, pattern = "\\.csv$|\\.json$")
} else {
@@ -400,7 +299,8 @@ cat(sprintf(" Script 91: %s %s\n", if(!run_modern_report) "SKIP" else "RUN", if
# PYTHON: DOWNLOAD PLANET IMAGES (MISSING DATES ONLY)
# ==============================================================================
cat("\n========== DOWNLOADING PLANET IMAGES (MISSING DATES ONLY) ==========\n")
tryCatch({
tryCatch(
{
# Setup paths
base_path <- file.path("laravel_app", "storage", "app", project_dir)
merged_tifs_dir <- file.path(base_path, data_source)
@@ -467,18 +367,20 @@ tryCatch({
if (download_count > 0) {
skip_10 <- FALSE
}
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in planet download:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
# ==============================================================================
# SCRIPT 10: CREATE MASTER GRID AND SPLIT TIFFs
# ==============================================================================
if (pipeline_success && !skip_10) {
cat("\n========== RUNNING SCRIPT 10: CREATE MASTER GRID AND SPLIT TIFFs ==========\n")
tryCatch({
tryCatch(
{
# CRITICAL: Save global variables before sourcing Script 10 (it overwrites end_date, offset, etc.)
saved_end_date <- end_date
saved_offset <- offset # Use FULL offset for tiling (not dynamic_offset)
@@ -501,19 +403,26 @@ if (pipeline_success && !skip_10) {
project_dir <- saved_project_dir
data_source <- saved_data_source
# Verify output
tiles_dir <- file.path("laravel_app", "storage", "app", project_dir, "daily_tiles_split", "5x5")
# Verify output - auto-detect grid size
grid_size <- detect_grid_size(project_dir)
tiles_dir <- if (grid_size != "unknown") {
file.path("laravel_app", "storage", "app", project_dir, "daily_tiles_split", grid_size)
} else {
file.path("laravel_app", "storage", "app", project_dir, "daily_tiles_split", "5x5")
}
if (dir.exists(tiles_dir)) {
subdirs <- list.dirs(tiles_dir, full.names = FALSE, recursive = FALSE)
cat(sprintf("✓ Script 10 completed - created tiles for %d dates\n", length(subdirs)))
} else {
cat("✓ Script 10 completed\n")
}
}, error = function(e) {
},
error = function(e) {
sink()
cat("✗ Error in Script 10:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
} else if (skip_10) {
cat("\n========== SKIPPING SCRIPT 10 (tiles already exist) ==========\n")
}
@@ -523,12 +432,16 @@ if (pipeline_success && !skip_10) {
# ==============================================================================
if (pipeline_success && !skip_20) {
cat("\n========== RUNNING SCRIPT 20: CI EXTRACTION ==========\n")
tryCatch({
tryCatch(
{
# Run Script 20 via system() to pass command-line args just like from terminal
# Arguments: end_date offset project_dir data_source
# Use FULL offset so CI extraction covers entire reporting window (not just new data)
cmd <- sprintf('"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe" --vanilla r_app/20_ci_extraction.R "%s" %d "%s" "%s"',
format(end_date, "%Y-%m-%d"), offset, project_dir, data_source)
cmd <- sprintf(
'"%s" --vanilla r_app/20_ci_extraction.R "%s" %d "%s" "%s"',
RSCRIPT_PATH,
format(end_date, "%Y-%m-%d"), offset, project_dir, data_source
)
result <- system(cmd)
if (result != 0) {
@@ -543,10 +456,12 @@ if (pipeline_success && !skip_20) {
} else {
cat("✓ Script 20 completed\n")
}
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 20:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
} else if (skip_20) {
cat("\n========== SKIPPING SCRIPT 20 (CI already extracted) ==========\n")
}
@@ -556,7 +471,8 @@ if (pipeline_success && !skip_20) {
# ==============================================================================
if (pipeline_success && !skip_21) {
cat("\n========== RUNNING SCRIPT 21: CONVERT CI RDS TO CSV ==========\n")
tryCatch({
tryCatch(
{
# Set environment variables for the script
assign("end_date", end_date, envir = .GlobalEnv)
assign("offset", offset, envir = .GlobalEnv)
@@ -573,10 +489,12 @@ if (pipeline_success && !skip_21) {
} else {
cat("✓ Script 21 completed\n")
}
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 21:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
} else if (skip_21) {
cat("\n========== SKIPPING SCRIPT 21 (CSV already created) ==========\n")
}
@@ -586,12 +504,16 @@ if (pipeline_success && !skip_21) {
# ==============================================================================
if (pipeline_success && !skip_30) {
cat("\n========== RUNNING SCRIPT 30: INTERPOLATE GROWTH MODEL ==========\n")
tryCatch({
tryCatch(
{
# Run Script 30 via system() to pass command-line args just like from terminal
# Script 30 expects: project_dir data_source as arguments
# Pass the same data_source that Script 20 is using
cmd <- sprintf('"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe" --vanilla r_app/30_interpolate_growth_model.R "%s" "%s"',
project_dir, data_source_used)
cmd <- sprintf(
'"%s" --vanilla r_app/30_interpolate_growth_model.R "%s" "%s"',
RSCRIPT_PATH,
project_dir, data_source_used
)
result <- system(cmd)
if (result != 0) {
@@ -606,10 +528,12 @@ if (pipeline_success && !skip_30) {
} else {
cat("✓ Script 30 completed\n")
}
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 30:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
}
# ==============================================================================
@@ -617,33 +541,36 @@ if (pipeline_success && !skip_30) {
# ==============================================================================
if (pipeline_success && !skip_31) {
cat("\n========== RUNNING PYTHON 31: HARVEST IMMINENT WEEKLY ==========\n")
tryCatch({
tryCatch(
{
# Run Python script in pytorch_gpu conda environment
# Script expects positional project name (not --project flag)
# Run from smartcane root so conda can find the environment
cmd <- sprintf('conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s', project_dir)
cat("DEBUG: Running command:", cmd, "\n")
cmd <- sprintf("conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s", project_dir)
result <- system(cmd)
if (result == 0) {
# Verify harvest output - check for THIS WEEK's specific file
current_week <- as.numeric(format(end_date, "%V"))
current_year <- as.numeric(format(end_date, "%Y"))
expected_file <- file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", "field_stats",
sprintf("%s_harvest_imminent_week_%02d_%d.csv", project_dir, current_week, current_year))
wwy_current_31 <- get_iso_week_year(end_date)
expected_file <- file.path(
"laravel_app", "storage", "app", project_dir, "reports", "kpis", "field_stats",
sprintf("%s_harvest_imminent_week_%02d_%d.csv", project_dir, wwy_current_31$week, wwy_current_31$year)
)
if (file.exists(expected_file)) {
cat(sprintf("✓ Script 31 completed - generated harvest imminent file for week %02d\n", current_week))
cat(sprintf("✓ Script 31 completed - generated harvest imminent file for week %02d\n", wwy_current_31$week))
} else {
cat("✓ Script 31 completed (check if harvest.xlsx is available)\n")
}
} else {
cat("⚠ Script 31 completed with errors (check harvest.xlsx availability)\n")
}
}, error = function(e) {
},
error = function(e) {
setwd(original_dir)
cat("⚠ Script 31 error:", e$message, "\n")
})
}
)
} else if (skip_31) {
cat("\n========== SKIPPING SCRIPT 31 (non-cane_supply client type) ==========\n")
}
@@ -665,15 +592,21 @@ if (pipeline_success && !skip_40) {
year_num <- missing_week$year
week_end_date <- as.Date(missing_week$week_end_date)
cat(sprintf("--- Creating mosaic for week %02d/%d (ending %s) ---\n",
week_num, year_num, format(week_end_date, "%Y-%m-%d")))
cat(sprintf(
"--- Creating mosaic for week %02d/%d (ending %s) ---\n",
week_num, year_num, format(week_end_date, "%Y-%m-%d")
))
tryCatch({
tryCatch(
{
# Run Script 40 with offset=7 (one week only) for this specific week
# The end_date is the last day of the week, and offset=7 covers the full 7-day week
# IMPORTANT: Pass data_source so Script 40 uses the correct folder (not auto-detect which can be wrong)
cmd <- sprintf('"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe" --vanilla r_app/40_mosaic_creation.R "%s" 7 "%s" "" "%s"',
format(week_end_date, "%Y-%m-%d"), project_dir, data_source)
cmd <- sprintf(
'"%s" --vanilla r_app/40_mosaic_creation.R "%s" 7 "%s" "" "%s"',
RSCRIPT_PATH,
format(week_end_date, "%Y-%m-%d"), project_dir, data_source
)
result <- system(cmd)
if (result != 0) {
@@ -683,7 +616,7 @@ if (pipeline_success && !skip_40) {
# Verify mosaic was created for this specific week
mosaic_created <- FALSE
if (mosaic_mode == "tiled") {
mosaic_dir <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max", "5x5")
mosaic_dir <- get_mosaic_dir(project_dir, mosaic_mode = "tiled")
if (dir.exists(mosaic_dir)) {
week_pattern <- sprintf("week_%02d_%d\\.tif", week_num, year_num)
mosaic_files <- list.files(mosaic_dir, pattern = week_pattern)
@@ -703,10 +636,12 @@
} else {
cat(sprintf("✓ Week %02d/%d processing completed (verify output)\n\n", week_num, year_num))
}
}, error = function(e) {
},
error = function(e) {
cat(sprintf("✗ Error creating mosaic for week %02d/%d: %s\n", week_num, year_num, e$message), "\n")
pipeline_success <<- FALSE
})
}
)
}
if (pipeline_success) {
@@ -733,46 +668,59 @@ if (pipeline_success && !skip_80) {
# Sort by date (oldest to newest) for sequential processing
weeks_to_calculate <- weeks_to_calculate[order(weeks_to_calculate$date), ]
cat(sprintf("Looping through %d missing week(s) in reporting window (from %s back to %s):\n\n",
cat(sprintf(
"Looping through %d missing week(s) in reporting window (from %s back to %s):\n\n",
nrow(weeks_to_calculate),
format(max(weeks_to_calculate$date), "%Y-%m-%d"),
format(min(weeks_to_calculate$date), "%Y-%m-%d")))
format(min(weeks_to_calculate$date), "%Y-%m-%d")
))
tryCatch({
tryCatch(
{
for (week_idx in 1:nrow(weeks_to_calculate)) {
week_row <- weeks_to_calculate[week_idx, ]
calc_date <- week_row$date
# Run Script 80 for this specific week with offset=7 (one week only)
# This ensures Script 80 calculates KPIs for THIS week with proper trend data
cmd <- sprintf('"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe" --vanilla r_app/80_calculate_kpis.R "%s" "%s" %d',
format(calc_date, "%Y-%m-%d"), project_dir, 7) # offset=7 for single week
cmd <- sprintf(
'"%s" --vanilla r_app/80_calculate_kpis.R "%s" "%s" %d',
RSCRIPT_PATH,
format(calc_date, "%Y-%m-%d"), project_dir, 7
) # offset=7 for single week
cat(sprintf(" [Week %02d/%d] Running Script 80 with end_date=%s...\n",
week_row$week, week_row$year, format(calc_date, "%Y-%m-%d")))
cat(sprintf(
" [Week %02d/%d] Running Script 80 with end_date=%s...\n",
week_row$week, week_row$year, format(calc_date, "%Y-%m-%d")
))
result <- system(cmd, ignore.stdout = TRUE, ignore.stderr = TRUE)
if (result == 0) {
cat(sprintf(" ✓ KPIs calculated for week %02d/%d\n", week_row$week, week_row$year))
} else {
cat(sprintf(" ✗ Error calculating KPIs for week %02d/%d (exit code: %d)\n",
week_row$week, week_row$year, result))
cat(sprintf(
" ✗ Error calculating KPIs for week %02d/%d (exit code: %d)\n",
week_row$week, week_row$year, result
))
}
}
# Verify total KPI output
kpi_dir <- file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", kpi_subdir)
# Verify total KPI output (kpi_dir defined by check_kpi_completeness() earlier)
if (dir.exists(kpi_dir)) {
files <- list.files(kpi_dir, pattern = "\\.csv$|\\.json$")
cat(sprintf("\n✓ Script 80 loop completed - total %d KPI files in %s/\n", length(files), kpi_subdir))
# Extract subdir name from kpi_dir path for display
subdir_name <- basename(kpi_dir)
cat(sprintf("\n✓ Script 80 loop completed - total %d KPI files in %s/\n", length(files), subdir_name))
} else {
cat("\n✓ Script 80 loop completed\n")
}
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 80 loop:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
} else {
cat(sprintf("✓ All %d weeks already have KPIs - skipping calculation\n", nrow(kpis_needed)))
}
@@ -819,7 +767,8 @@ if (pipeline_success && run_legacy_report) {
if (!kpis_complete) {
cat("⚠ Skipping Script 90 - KPIs not available for full reporting window\n")
} else {
tryCatch({
tryCatch(
{
# Script 90 is an RMarkdown file - compile it with rmarkdown::render()
output_dir <- file.path("laravel_app", "storage", "app", project_dir, "reports")
@@ -828,9 +777,11 @@ if (pipeline_success && run_legacy_report) {
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
}
output_filename <- sprintf("CI_report_week%02d_%d.docx",
output_filename <- sprintf(
"CI_report_week%02d_%d.docx",
as.numeric(format(end_date, "%V")),
as.numeric(format(end_date, "%G")))
as.numeric(format(end_date, "%G"))
)
# Render the RMarkdown document
rmarkdown::render(
@@ -845,10 +796,12 @@ if (pipeline_success && run_legacy_report) {
)
cat(sprintf("✓ Script 90 completed - generated Word report: %s\n", output_filename))
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 90:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
}
} else if (run_legacy_report) {
cat("\n========== SKIPPING SCRIPT 90 (pipeline error or KPIs incomplete) ==========\n")
@@ -863,7 +816,8 @@ if (pipeline_success && run_modern_report) {
if (!kpis_complete) {
cat("⚠ Skipping Script 91 - KPIs not available for full reporting window\n")
} else {
tryCatch({
tryCatch(
{
# Script 91 is an RMarkdown file - compile it with rmarkdown::render()
output_dir <- file.path("laravel_app", "storage", "app", project_dir, "reports")
@@ -872,9 +826,11 @@ if (pipeline_success && run_modern_report) {
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
}
output_filename <- sprintf("CI_report_week%02d_%d.docx",
output_filename <- sprintf(
"CI_report_week%02d_%d.docx",
as.numeric(format(end_date, "%V")),
as.numeric(format(end_date, "%G")))
as.numeric(format(end_date, "%G"))
)
# Render the RMarkdown document
rmarkdown::render(
@@ -889,10 +845,12 @@ if (pipeline_success && run_modern_report) {
)
cat(sprintf("✓ Script 91 completed - generated Word report: %s\n", output_filename))
}, error = function(e) {
},
error = function(e) {
cat("✗ Error in Script 91:", e$message, "\n")
pipeline_success <<- FALSE
})
}
)
}
} else if (run_modern_report) {
cat("\n========== SKIPPING SCRIPT 91 (pipeline error or KPIs incomplete) ==========\n")