SmartCane/r_app/91_CI_report_with_kpis_Angata.Rmd

---
params:
  ref: "word-styles-reference-var1.docx"
  output_file: CI_report.docx
  report_date: "2026-01-25"
  data_dir: "angata"
  mail_day: "Wednesday"
  borders: FALSE
  ci_plot_type: "both"  # options: "absolute", "cumulative", "both"
  colorblind_friendly: TRUE  # use colorblind-friendly palettes (viridis/plasma)
  facet_by_season: FALSE  # facet CI trend plots by season instead of overlaying
  x_axis_unit: "days"  # x-axis unit for trend plots: "days" or "weeks"
output:
   word_document:
    reference_docx: !expr file.path("word-styles-reference-var1.docx")
    toc: no
editor_options:
  chunk_output_type: console
---

```{r setup_parameters, include=FALSE}
# Set up basic report parameters from input values
report_date <- params$report_date
mail_day <- params$mail_day
borders <- params$borders
ci_plot_type <- params$ci_plot_type
colorblind_friendly <- params$colorblind_friendly
facet_by_season <- params$facet_by_season
x_axis_unit <- params$x_axis_unit
```

```{r load_libraries, message=FALSE, warning=FALSE, include=FALSE}
# Configure knitr options
knitr::opts_chunk$set(warning = FALSE, message = FALSE)

# Set flag for reporting scripts to use pivot.geojson instead of pivot_2.geojson
reporting_script <- TRUE

# Load all packages at once with suppressPackageStartupMessages
suppressPackageStartupMessages({
  library(here)
  library(sf)
  library(terra)
  library(tidyverse)
  library(tmap)
  library(lubridate)
  library(zoo)
  library(rsample)
  library(caret)
  library(randomForest)
  library(CAST)
  library(knitr)
  library(tidyr)
  library(flextable)
})

# Load custom utility functions
tryCatch({
  source("r_app/report_utils.R")
}, error = function(e) {
  message(paste("Error loading report_utils.R:", e$message))
  # Try alternative path if the first one fails
  tryCatch({
    source(here::here("r_app", "report_utils.R"))
  }, error = function(e) {
    stop("Could not load report_utils.R from either location: ", e$message)
  })
})

# Function to determine field priority level based on CV and Moran's I
# Returns: 1=Urgent, 2=Monitor, 3=No stress
get_field_priority_level <- function(cv, morans_i) {
  # Handle NA values
  if (is.na(cv) || is.na(morans_i)) return(3)  # Default to no stress

  # Determine priority based on thresholds
  if (cv < 0.1) {
    if (morans_i < 0.7) {
      return(3)  # No stress
    } else if (morans_i <= 0.9) {
      return(2)  # Monitor (young field with some clustering)
    } else {
      return(1)  # Urgent
    }
  } else if (cv <= 0.15) {
    if (morans_i < 0.7) {
      return(2)  # Monitor
    } else {
      return(1)  # Urgent
    }
  } else {  # cv > 0.15
    return(1)  # Urgent
  }
}
```

```{r initialize_project_config, message=FALSE, warning=FALSE, include=FALSE}
# Set the project directory from parameters
project_dir <- params$data_dir

# Source project parameters with error handling
tryCatch({
  source(here::here("r_app", "parameters_project.R"))
}, error = function(e) {
  stop("Error loading parameters_project.R: ", e$message)
})

# Log initial configuration
safe_log("Starting the R Markdown script with KPIs")
safe_log(paste("mail_day params:", params$mail_day))
safe_log(paste("report_date params:", params$report_date))
safe_log(paste("mail_day variable:", mail_day))
```

```{r load_kpi_data, eval=TRUE, message=FALSE, warning=FALSE, include=FALSE}
## SIMPLE KPI LOADING - robust lookup with fallbacks
# Primary expected directory inside the laravel storage
kpi_data_dir <- file.path("..", "laravel_app", "storage", "app", project_dir, "reports", "kpis")
date_suffix <- format(as.Date(report_date), "%Y%m%d")

# Calculate current week from report_date using ISO 8601 week numbering
current_week <- as.numeric(format(as.Date(report_date), "%V"))
week_suffix <- paste0("week", current_week)

# Candidate filenames we expect (exact and common variants)
expected_summary_names <- c(
  paste0(project_dir, "_kpi_summary_tables_", week_suffix, ".rds"),
  paste0(project_dir, "_kpi_summary_tables_", date_suffix, ".rds"),
  paste0(project_dir, "_kpi_summary_tables.rds"),
  "kpi_summary_tables.rds",
  paste0("kpi_summary_tables_", week_suffix, ".rds"),
  paste0("kpi_summary_tables_", date_suffix, ".rds")
)

expected_field_details_names <- c(
  paste0(project_dir, "_field_details_", week_suffix, ".rds"),
  paste0(project_dir, "_field_details_", date_suffix, ".rds"),
  paste0(project_dir, "_field_details.rds"),
  "field_details.rds"
)

# Helper to attempt loading a file from the directory or fallback to a workspace-wide search
try_load_from_dir <- function(dir, candidates) {
  if (!dir.exists(dir)) return(NULL)
  for (name in candidates) {
    f <- file.path(dir, name)
    if (file.exists(f)) return(f)
  }
  return(NULL)
}

# Try primary directory first
summary_file <- try_load_from_dir(kpi_data_dir, expected_summary_names)
field_details_file <- try_load_from_dir(kpi_data_dir, expected_field_details_names)

# If not found, perform a workspace-wide search (slower) limited to laravel_app storage
if (is.null(summary_file) || is.null(field_details_file)) {
  safe_log(paste("KPI files not found in", kpi_data_dir, "—searching workspace for RDS files"))
  # List rds files under laravel_app/storage/app recursively
  files <- list.files(path = file.path("laravel_app", "storage", "app"), pattern = "\\.rds$", recursive = TRUE, full.names = TRUE)
  # Try to match by expected names
  if (is.null(summary_file)) {
    matched <- files[basename(files) %in% expected_summary_names]
    if (length(matched) > 0) summary_file <- matched[1]
  }
  if (is.null(field_details_file)) {
    matched2 <- files[basename(files) %in% expected_field_details_names]
    if (length(matched2) > 0) field_details_file <- matched2[1]
  }
}

# Final checks and load with safe error messages
kpi_files_exist <- FALSE
if (!is.null(summary_file) && file.exists(summary_file)) {
  safe_log(paste("Loading KPI summary from:", summary_file))
  summary_data <- tryCatch(readRDS(summary_file), error = function(e) { safe_log(paste("Failed to read summary RDS:", e$message), "ERROR"); NULL })

  # Convert new RDS structure (field_analysis, field_analysis_summary) to legacy summary_tables format
  if (!is.null(summary_data)) {
    if (is.list(summary_data) && !is.data.frame(summary_data)) {
      # New format from 09_field_analysis_weekly.R - just pass it through
      if ("field_analysis_summary" %in% names(summary_data)) {
        # Keep the new structure intact - combined_kpi_table will use it directly
        kpi_files_exist <- TRUE
      } else {
        # Old format - keep as is
        summary_tables <- summary_data
        if (!is.null(summary_tables)) kpi_files_exist <- TRUE
      }
    } else {
      # Data frame format or direct tables
      summary_tables <- summary_data
      if (!is.null(summary_tables)) kpi_files_exist <- TRUE
    }
  }
} else {
  safe_log(paste("KPI summary file not found. Searched:", paste(expected_summary_names, collapse=", ")), "WARNING")
}

if (!is.null(field_details_file) && file.exists(field_details_file)) {
  safe_log(paste("Loading field details from:", field_details_file))
  field_details_table <- tryCatch(readRDS(field_details_file), error = function(e) { safe_log(paste("Failed to read field details RDS:", e$message), "ERROR"); NULL })
  if (!is.null(field_details_table)) kpi_files_exist <- kpi_files_exist && TRUE
} else {
  safe_log(paste("Field details file not found. Searched:", paste(expected_field_details_names, collapse=", ")), "WARNING")
  # Try to extract field_details from summary_data if available
  if (exists("summary_data") && !is.null(summary_data) && "field_analysis" %in% names(summary_data)) {
    field_details_table <- summary_data$field_analysis %>%
      rename(`Mean CI` = Acreage, `CV Value` = CV, Field = Field_id)
    safe_log("Extracted field details from field_analysis data")
  }
}

if (kpi_files_exist) {
  safe_log("✓ KPI summary tables loaded successfully")
} else {
  safe_log("KPI files could not be located or loaded. KPI sections will be skipped.", "WARNING")
}
```

```{r load_cloud_coverage_data, eval=TRUE, message=FALSE, warning=FALSE, include=FALSE}
## LOAD PER-FIELD CLOUD COVERAGE DATA
# Cloud coverage calculated from the mosaic by script 09
# Expected filename pattern: [project_dir]_cloud_coverage_week[N].rds or _cloud_coverage_[date].rds

expected_cloud_names <- c(
  paste0(project_dir, "_cloud_coverage_week", week_suffix, ".rds"),
  paste0(project_dir, "_cloud_coverage_week", current_week, ".rds"),
  paste0(project_dir, "_cloud_coverage_", date_suffix, ".rds"),
  paste0(project_dir, "_cloud_coverage.rds"),
  paste0(project_dir, "_per_field_cloud_coverage.rds"),
  "cloud_coverage.rds",
  "per_field_cloud_coverage.rds"
)

# Try to load cloud coverage from KPI directory
cloud_file <- try_load_from_dir(kpi_data_dir, expected_cloud_names)

# If not found in KPI dir, search workspace
if (is.null(cloud_file)) {
  files <- list.files(path = file.path("laravel_app", "storage", "app"), pattern = "\\.rds$", recursive = TRUE, full.names = TRUE)
  matched <- files[basename(files) %in% expected_cloud_names]
  if (length(matched) > 0) cloud_file <- matched[1]
}

# Load cloud coverage data if file exists
per_field_cloud_coverage <- NULL
cloud_coverage_available <- FALSE

if (!is.null(cloud_file) && file.exists(cloud_file)) {
  safe_log(paste("Loading cloud coverage data from:", cloud_file))
  per_field_cloud_coverage <- tryCatch(
    readRDS(cloud_file),
    error = function(e) {
      safe_log(paste("Failed to read cloud coverage RDS:", e$message), "WARNING");
      NULL
    }
  )

  if (!is.null(per_field_cloud_coverage) && nrow(per_field_cloud_coverage) > 0) {
    cloud_coverage_available <- TRUE
    safe_log("✓ Per-field cloud coverage data loaded successfully")
  }
} else {
  safe_log("Per-field cloud coverage file not found. Cloud sections will be skipped.", "WARNING")
}
```

```{r generate_field_kpi_summary_function, include=FALSE, eval=TRUE}
#' Generate field-specific KPI summary for display in reports
#' @param field_name Name of the field to summarize
#' @param field_details_table Data frame with field-level KPI details
#' @return Formatted text string with field KPI summary
generate_field_kpi_summary <- function(field_name, field_details_table, CI_quadrant) {
  tryCatch({
    # Get field age from CI quadrant data for the CURRENT SEASON only
    # First identify the current season for this field
    current_season <- CI_quadrant %>%
      filter(field == field_name, Date <= as.Date(report_date)) %>%
      group_by(season) %>%
      summarise(season_end = max(Date), .groups = 'drop') %>%
      filter(season == max(season)) %>%
      pull(season)

    # Get the most recent DOY from the current season
    field_age <- CI_quadrant %>%
      filter(field == field_name, season == current_season) %>%
      pull(DOY) %>%
      max(na.rm = TRUE)

    # Filter data for this specific field
    field_data <- field_details_table %>%
      filter(Field == field_name)

    if (nrow(field_data) == 0) {
      return(paste("**Field", field_name, "KPIs:** Data not available"))
    }

    # Aggregate sub-field data for field-level summary
    # For categorical data, take the most common value or highest risk level
    field_summary <- field_data %>%
      summarise(
        field_size = sum(`Field Size (ha)`, na.rm = TRUE),
        uniformity_levels = paste(unique(`Growth Uniformity`), collapse = "/"),
        avg_yield_forecast = ifelse(is.na(`Yield Forecast (t/ha)`[1]), NA, mean(`Yield Forecast (t/ha)`, na.rm = TRUE)),
        max_gap_score = max(`Gap Score`, na.rm = TRUE),
        highest_decline_risk = case_when(
          any(`Decline Risk` == "Very-high") ~ "Very-high",
          any(`Decline Risk` == "High") ~ "High",
          any(`Decline Risk` == "Moderate") ~ "Moderate",
          any(`Decline Risk` == "Low") ~ "Low",
          TRUE ~ "Unknown"
        ),
        highest_weed_risk = case_when(
          any(`Weed Risk` == "High") ~ "High",
          any(`Weed Risk` == "Moderate") ~ "Moderate",
          any(`Weed Risk` == "Low") ~ "Low",
          TRUE ~ "Unknown"
        ),
        avg_mean_ci = mean(`Mean CI`, na.rm = TRUE),
        avg_cv = mean(`CV Value`, na.rm = TRUE),
        .groups = 'drop'
      )

    # Apply age-based filtering to yield forecast
    if (is.na(field_age) || field_age < 240) {
      field_summary$avg_yield_forecast <- NA_real_
    }

    # Format the summary text
    yield_text <- if (is.na(field_summary$avg_yield_forecast)) {
      "Yield Forecast: NA"
    } else {
      paste0("Yield Forecast: ", round(field_summary$avg_yield_forecast, 1), " t/ha")
    }

    kpi_text <- paste0(
      "Size: ", round(field_summary$field_size, 1), " ha • Growth Uniformity: ", field_summary$uniformity_levels,
      " • ", yield_text, " • Gap Score: ", round(field_summary$max_gap_score, 1),
      " • Decline Risk: ", field_summary$highest_decline_risk, " • Weed Risk: ", field_summary$highest_weed_risk,
      " • Mean CI: ", round(field_summary$avg_mean_ci, 2)
    )

    # Wrap in smaller text HTML tags for Word output
    #kpi_text <- paste0("<small>", kpi_text, "</small>")
    kpi_text <- paste0("<span style='font-size:10pt'>", kpi_text, "</span>")

    # Add alerts based on risk levels (smaller font too)
 #   alerts <- c()
 #   if (field_summary$highest_decline_risk %in% c("High", "Very-high")) {
 #     alerts <- c(alerts, "<span style='font-size: 8pt; color: red;'>🚨 High risk of growth decline detected</span>")
 #   }
 #   if (field_summary$highest_weed_risk == "High") {
 #     alerts <- c(alerts, "<span style='font-size: 8pt; color: orange;'>⚠️ High weed presence detected</span>")
 #   }
 #   if (field_summary$max_gap_score > 20) {
 #     alerts <- c(alerts, "<span style='font-size: 8pt; color: blue;'>💡 Significant gaps detected - monitor closely</span>")
 #   }
 #   if (field_summary$avg_cv > 0.25) {
 #     alerts <- c(alerts, "<span style='font-size: 8pt; color: orange;'>⚠️ Poor field uniformity - check irrigation/fertility</span>")
 #  }

 #   if (length(alerts) > 0) {
 #     kpi_text <- paste0(kpi_text, "\n\n", paste(alerts, collapse = "\n"))
 #   }

    return(kpi_text)

  }, error = function(e) {
    safe_log(paste("Error generating KPI summary for field", field_name, ":", e$message), "ERROR")
    return(paste("**Field", field_name, "KPIs:** Error generating summary"))
  })
}
```

```{r calculate_dates_and_weeks, message=FALSE, warning=FALSE, include=FALSE, eval=TRUE}
# Set locale for consistent date formatting
Sys.setlocale("LC_TIME", "C")

# Initialize date variables from parameters
today <- as.character(report_date)
mail_day_as_character <- as.character(mail_day)

# Calculate report dates and weeks using ISO 8601 week numbering
report_date_obj <- as.Date(today)
current_week <- as.numeric(format(report_date_obj, "%V"))
year <- as.numeric(format(report_date_obj, "%Y"))

# Calculate dates for weekly analysis
week_start <- report_date_obj - ((as.numeric(format(report_date_obj, "%w")) + 1) %% 7)
week_end <- week_start + 6

# Calculate week days (copied from 05 script for compatibility)
report_date_as_week_day <- weekdays(lubridate::ymd(today))
days_of_week <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

# Calculate initial week number
week <- lubridate::week(today)
safe_log(paste("Initial week calculation:", week, "today:", today))

# Calculate previous dates for comparisons
today_minus_1 <- as.character(lubridate::ymd(today) - 7)
today_minus_2 <- as.character(lubridate::ymd(today) - 14)
today_minus_3 <- as.character(lubridate::ymd(today) - 21)

# Adjust week calculation based on mail day
if (which(days_of_week == report_date_as_week_day) > which(days_of_week == mail_day_as_character)) {
  safe_log("Adjusting weeks because of mail day")
  week <- lubridate::week(today) + 1
  today_minus_1 <- as.character(lubridate::ymd(today))
  today_minus_2 <- as.character(lubridate::ymd(today) - 7)
  today_minus_3 <- as.character(lubridate::ymd(today) - 14)
}

# Calculate week numbers for previous weeks
week_minus_1 <- week - 1
week_minus_2 <- week - 2
week_minus_3 <- week - 3

# Format current week with leading zeros
week <- sprintf("%02d", week)

safe_log(paste("Report week:", current_week, "Year:", year))
safe_log(paste("Week range:", week_start, "to", week_end))
```

```{r load_ci_data, message=FALSE, warning=FALSE, include=FALSE, eval=TRUE}
# Load CI index data with error handling
tryCatch({
  CI_quadrant <- readRDS(here::here(cumulative_CI_vals_dir, "All_pivots_Cumulative_CI_quadrant_year_v2.rds"))

  safe_log("Successfully loaded CI quadrant data")
}, error = function(e) {
  stop("Error loading CI quadrant data: ", e$message)
})
```

```{r compute_benchmarks_once, include=FALSE, eval=TRUE}
# Compute CI benchmarks once for the entire estate
benchmarks <- compute_ci_benchmarks(CI_quadrant, project_dir, c(10, 50, 90))
if (!is.null(benchmarks)) {
  safe_log("Benchmarks computed successfully for the report")
} else {
  safe_log("Failed to compute benchmarks", "WARNING")
}
```


<!-- Dynamic cover page  -->
::: {custom-style="Cover_title" style="text-align:center; margin-top:120px;"}
<span style="font-size:100pt; line-height:1.0; font-weight:700;">Satellite Based Field Reporting</span>
:::

::: {custom-style="Cover_subtitle" style="text-align:center; margin-top:18px;"}
<span style="font-size:20pt; font-weight:600;">Chlorophyll Index (CI) Monitoring Report — `r toupper(params$data_dir)` Estate (Week `r (if (!is.null(params$week)) params$week else format(as.Date(params$report_date), '%V'))`, `r format(as.Date(params$report_date), '%Y')`)</span>
:::

\newpage


## Report Generated

**Farm Location:** `r toupper(project_dir)` Estate
**Report Period:** Week `r current_week` of `r year`
**Report Generated on:** `r format(Sys.time(), "%B %d, %Y at %H:%M")`
**Farm Size Included in Analysis:**
**Data Source:** Planet Labs Satellite Imagery
**Analysis Type:** Chlorophyll Index (CI) Monitoring


## Key Insights

```{r key_insights, echo=FALSE, results='asis', eval=TRUE}
# Calculate key insights from KPI data
if (exists("summary_data") && !is.null(summary_data) && "field_analysis" %in% names(summary_data)) {
  field_analysis_df <- summary_data$field_analysis
  field_analysis_summary <- summary_data$field_analysis_summary

  # Field uniformity insights
  field_cv <- field_analysis_df$CV
  excellent_fields <- sum(field_cv < 0.08, na.rm = TRUE)
  good_fields <- sum(field_cv >= 0.08 & field_cv < 0.15, na.rm = TRUE)
  total_fields <- sum(!is.na(field_cv))
  excellent_pct <- ifelse(total_fields > 0, round(excellent_fields / total_fields * 100, 1), 0)
  good_pct <- ifelse(total_fields > 0, round(good_fields / total_fields * 100, 1), 0)

  # Area change insights - extract from field_analysis_summary
  parse_ci_change <- function(change_str) {
    if (is.na(change_str)) return(NA)
    match <- regexpr("^[+-]?[0-9]+\\.?[0-9]*", change_str)
    if (match > 0) {
      return(as.numeric(substr(change_str, match, attr(match, "match.length"))))
    }
    return(NA)
  }

  field_analysis_df$ci_change_numeric <- sapply(field_analysis_df$Weekly_ci_change, parse_ci_change)
  total_acreage <- sum(field_analysis_df$Acreage, na.rm = TRUE)
  improving_acreage <- sum(field_analysis_df$Acreage[field_analysis_df$ci_change_numeric > 0.2], na.rm = TRUE)
  declining_acreage <- sum(field_analysis_df$Acreage[field_analysis_df$ci_change_numeric < -0.2], na.rm = TRUE)
  improving_pct <- ifelse(total_acreage > 0, round(improving_acreage / total_acreage * 100, 1), 0)
  declining_pct <- ifelse(total_acreage > 0, round(declining_acreage / total_acreage * 100, 1), 0)

  cat("- ", excellent_pct, "% of fields have excellent uniformity (CV < 0.08)\n", sep="")
  cat("- ", good_pct, "% of fields have good uniformity (CV < 0.15)\n", sep="")
  cat("- ", round(improving_acreage, 1), " acres (", improving_pct, "%) of farm area is improving week-over-week\n", sep="")
  cat("- ", round(declining_acreage, 1), " acres (", declining_pct, "%) of farm area is declining week-over-week\n", sep="")

} else {
  cat("KPI data not available for key insights.\n")
}
```


## Report Structure

**Section 1:** Cane supply zone analyses, summaries and Key Performance Indicators (KPIs)
**Section 2:** Explanation of the report, definitions, methodology, and CSV export structure


\newpage

# Section 1: Farm-wide Analyses and KPIs

## 1.1 Overview of cane supply area, showing zones with number of acres being harvest ready

```{r overview_map, fig.width=9, fig.height=7, fig.align="center", echo=FALSE}
# Create a hexbin overview map with ggplot
tryCatch({
  # Try to load in the field analysis from this week
  tryCatch({
    # Ensure week is zero-padded (e.g., 04) to match filenames like *_week04_2026.xlsx
    week_padded <- sprintf("%02d", as.numeric(current_week))
    analysis_path <- file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", "field_analysis", paste0(params$data_dir, "_field_analysis_week", week_padded, "_", year, ".xlsx"))
    analysis_data <- read_excel(analysis_path)
  }, error = function(e) {
    warning("Error loading field analysis data:", e$message)
  })

  # Define constants
  ACRE_CONV <- 4046.856
  TARGET_CRS <- 32736

  # Process polygons into points
  points_processed <- field_boundaries_sf %>%
    st_make_valid() %>%
    mutate(
      # Calculate area, convert to numeric to strip units, divide by conversion factor
      area_ac = round(as.numeric(st_area(geometry)) / ACRE_CONV, 2)
    ) %>%
    filter(
      # Filter polygons with no surface area
      !is.na(area_ac), area_ac > 0
    ) %>%
    left_join (
      # Add the status_trigger information
      analysis_data %>% select(Field_id, Status_trigger),
      by = c("field" = "Field_id")
    ) %>%
    st_transform(crs = TARGET_CRS) %>%
    st_centroid() %>%
    bind_cols(st_coordinates(.))

  # Subsetting
  points_ready <- points_processed %>%
    filter(Status_trigger == "harvest_ready")
  points_not_ready <- points_processed %>%
    filter(Status_trigger != "harvest_ready" | is.na(Status_trigger))

  # Generate breaks for the plotting
  breaks_vec <- c(0, 5, 10, 15, 20, 30, 35)
  labels_vec <- as.character(breaks_vec)
  labels_vec[length(labels_vec)] <- ">30"
  labels_vec[1] <- "0.1"

  # Set CRS
  map_crs <- st_crs(points_processed)

  # Create dummy point to anchor hexbin grids
  dummy_point <- data.frame(
    field = NA,
    sub_field = NA,
    area_ac = 0,
    Status_trigger = NA,
    X = min(points_processed$X, na.rm = TRUE),
    Y = min(points_processed$Y, na.rm = TRUE),
    geometry = NA
  )

  # Add dummy point
  dummy_point <- st_as_sf(dummy_point, coords = c("X", "Y"), crs = st_crs(points_ready))
  dummy_point <- cbind(dummy_point, st_coordinates(dummy_point))
  points_ready <- rbind(points_ready, dummy_point)
  points_not_ready <- rbind(points_not_ready, dummy_point)

  # Create the plot
  hexbin <- ggplot() +
    # Add OSM basemap
    annotation_map_tile(type = "osm", zoom = 11, progress = "none", alpha = 0.5) +
    # Add the hexbins for not ready points
    geom_hex(
        data = points_not_ready,
        aes(x = X, y = Y, weight = area_ac, alpha = "Not harvest ready within 1 month."),
        binwidth = c(1500, 1500),
        fill = "#ffffff",
        colour = "#0000009a",
        linewidth = 0.1
    ) +
    # Add the hexbins for ready points
    geom_hex(
        data = points_ready,
        aes(x = X, y = Y, weight = area_ac),
        binwidth = c(1500, 1500),
        alpha = 0.9,
        colour = "#0000009a",
        linewidth = 0.1
    ) +
    # Create colour bins
    scale_fill_viridis_b(
        option = "viridis",
        direction = -1,
        breaks = breaks_vec,        # Use our 0-50 sequence
        labels = labels_vec,        # Use our custom ">50" labels
        limits = c(0, 35),          # Limit the scale
        oob = scales::squish,       # Squish higher values into the top bin
        name = "Total Acres"
    ) +
    # Titles
    labs(
        subtitle = "Acres of fields 'harvest ready within a month'"
    ) +
    # Set the CRS
    coord_sf(crs = map_crs) +
    theme_minimal() +
     # Legend trick to add the gray colours
    scale_alpha_manual(
        name = NULL,    # No title needed for this specific legend item
        values = 0.8    # This sets the actual transparency for the map
    ) +
    # Legend customisation
    theme(
        legend.position = "right",
        legend.box = "vertical",
        legend.title.align = 0.5,    # Center the legend title
        plot.title = element_text(face = "bold", size = 16),
        plot.subtitle = element_text(size = 11)
    ) +
    # Customise the look of the horizontal bar
    guides(
        # The colour bar
        fill = guide_coloursteps(
            barwidth = 1,
            barheight = 20,
            title.position = "top",
            order = 1
        ),
        # The not ready box
        alpha = guide_legend(
            override.aes = list(
                fill = "#ffffff",
                colour = "#0000009a",
                shape = 22
            ),
            order = 2
        )
    )

  hexbin

}, error = function(e) {
  warning("Error creating hexbin map:", e$message)
})
```


\newpage
## 1.2 Key Performance Indicators

```{r combined_kpi_table, echo=FALSE, eval=TRUE}
# Create summary KPI table from field_analysis_summary data
# This shows: Phases, Triggers, Area Change, and Total Farm acreage

if (exists("summary_data") && !is.null(summary_data) && "field_analysis_summary" %in% names(summary_data)) {
  field_analysis_summary <- summary_data$field_analysis_summary
  field_analysis_df <- summary_data$field_analysis

  # Phase names and trigger names to extract from summary
  phase_names <- c("Germination", "Tillering", "Grand Growth", "Maturation", "Unknown Phase")
  trigger_names <- c("Harvest Ready", "Strong Recovery", "Growth On Track", "Stress Detected",
                     "Germination Complete", "Germination Started", "No Active Trigger")

  # Extract phase distribution - match on category names directly
  phase_rows <- field_analysis_summary %>%
    filter(Category %in% phase_names) %>%
    select(Category, Acreage) %>%
    mutate(KPI_Group = "PHASE DISTRIBUTION", .before = 1)

  # Extract status triggers - match on category names directly
  trigger_rows <- field_analysis_summary %>%
    filter(Category %in% trigger_names) %>%
    select(Category, Acreage) %>%
    mutate(KPI_Group = "STATUS TRIGGERS", .before = 1)

  # Calculate area change from field_analysis data
  total_acreage <- sum(field_analysis_df$Acreage, na.rm = TRUE)

  # Parse Weekly_ci_change to determine improvement/decline
  parse_ci_change <- function(change_str) {
    if (is.na(change_str)) return(NA)
    match <- regexpr("^[+-]?[0-9]+\\.?[0-9]*", change_str)
    if (match > 0) {
      return(as.numeric(substr(change_str, match, attr(match, "match.length"))))
    }
    return(NA)
  }

  field_analysis_df$ci_change_numeric <- sapply(field_analysis_df$Weekly_ci_change, parse_ci_change)

  improving_acreage <- sum(field_analysis_df$Acreage[field_analysis_df$ci_change_numeric > 0.2], na.rm = TRUE)
  declining_acreage <- sum(field_analysis_df$Acreage[field_analysis_df$ci_change_numeric < -0.2], na.rm = TRUE)
  stable_acreage <- sum(field_analysis_df$Acreage[field_analysis_df$ci_change_numeric >= -0.2 &
                                                   field_analysis_df$ci_change_numeric <= 0.2], na.rm = TRUE)

  improving_pct <- ifelse(total_acreage > 0, round(improving_acreage / total_acreage * 100, 1), 0)
  declining_pct <- ifelse(total_acreage > 0, round(declining_acreage / total_acreage * 100, 1), 0)
  stable_pct <- ifelse(total_acreage > 0, round(stable_acreage / total_acreage * 100, 1), 0)

  # Calculate percentages for phases and triggers
  phase_pcts <- phase_rows %>%
    mutate(Percent = paste0(round(Acreage / total_acreage * 100, 1), "%"))

  trigger_pcts <- trigger_rows %>%
    mutate(Percent = paste0(round(Acreage / total_acreage * 100, 1), "%"))

  area_change_rows <- data.frame(
    KPI_Group = "AREA CHANGE",
    Category = c("Improving", "Stable", "Declining"),
    Acreage = c(round(improving_acreage, 2), round(stable_acreage, 2), round(declining_acreage, 2)),
    Percent = c(paste0(improving_pct, "%"), paste0(stable_pct, "%"), paste0(declining_pct, "%")),
    stringsAsFactors = FALSE
  )

  # Total farm row
  total_row <- data.frame(
    KPI_Group = "TOTAL FARM",
    Category = "Total Acreage",
    Acreage = round(total_acreage, 2),
    Percent = "100%",
    stringsAsFactors = FALSE
  )

  # Combine all rows with percentages for all
  combined_df <- bind_rows(
    phase_pcts,
    trigger_pcts,
    area_change_rows,
    total_row
  )

  # Create grouped display where KPI_Group name appears only once per group
  combined_df <- combined_df %>%
    group_by(KPI_Group) %>%
    mutate(
      KPI_display = if_else(row_number() == 1, KPI_Group, "")
    ) %>%
    ungroup() %>%
    select(KPI_display, Category, Acreage, Percent)

  # Render as flextable with merged cells
  ft <- flextable(combined_df) %>%
    set_header_labels(
      KPI_display = "KPI Category",
      Category = "Item",
      Acreage = "Acreage",
      Percent = "Percent"
    ) %>%
    merge_v(j = "KPI_display") %>%
    autofit()

  # Add horizontal lines after each KPI group (at cumulative row positions)
  # Calculate row positions: row 1 is header, then data rows follow
  phase_count <- nrow(phase_rows)
  trigger_count <- nrow(trigger_rows)
  area_count <- nrow(area_change_rows)

  # Add lines after phases, triggers, and area change groups (before totals)
  if (phase_count > 0) {
    ft <- ft %>% hline(i = phase_count, border = officer::fp_border(width = 1))
  }
  if (trigger_count > 0) {
    ft <- ft %>% hline(i = phase_count + trigger_count, border = officer::fp_border(width = 1))
  }
  if (area_count > 0) {
    ft <- ft %>% hline(i = phase_count + trigger_count + area_count, border = officer::fp_border(width = 1))
  }

  ft
} else {
  cat("KPI summary data not available.\n")
}
```

## Cloud Coverage Summary

```{r cloud_coverage_summary, echo=FALSE}
# Display per-field cloud coverage summary
if (cloud_coverage_available && !is.null(per_field_cloud_coverage)) {
  # Prepare cloud coverage table for display
  # Handle both old and new column naming conventions
  cloud_display <- per_field_cloud_coverage %>%
    mutate(
      Field = if_else(exists("field", list(per_field_cloud_coverage)), field_id,
                      if_else(exists("Field", list(per_field_cloud_coverage)), Field, field_id)),
      Clear_Percent = pct_clear,
      Cloud_Acreage = if_else(exists("Cloud_Acreage", list(per_field_cloud_coverage)), Cloud_Acreage,
                              as.numeric(NA)),
      Total_Acreage = if_else(exists("Total_Acreage", list(per_field_cloud_coverage)), Total_Acreage,
                              as.numeric(NA))
    ) %>%
    select(Field, Cloud_category, Clear_Percent, missing_pixels, clear_pixels, total_pixels) %>%
    rename(
      "Field" = Field,
      "Cloud Status" = Cloud_category,
      "Clear %" = Clear_Percent,
      "Cloud Pixels" = missing_pixels,
      "Clear Pixels" = clear_pixels,
      "Total Pixels" = total_pixels
    ) %>%
    arrange(Field)

  # Create flextable
  ft <- flextable(cloud_display) %>%
    autofit()

  ft
} else if (exists("cloud_coverage_available") && !cloud_coverage_available) {
  cat("Cloud coverage data not available for this week.\n")
} else {
  cat("Cloud coverage data not loaded.\n")
}
```

## Field Alerts

```{r field_alerts_table, echo=FALSE}
# Generate alerts table from field analysis status triggers
if (exists("summary_data") && !is.null(summary_data) && "field_analysis" %in% names(summary_data)) {
  field_analysis_table <- summary_data$field_analysis

  # Extract fields with status triggers (non-null)
  alerts_data <- field_analysis_table %>%
    filter(!is.na(Status_trigger), Status_trigger != "") %>%
    select(Field_id, Status_trigger) %>%
    rename(Field = Field_id, Alert = Status_trigger)

  if (nrow(alerts_data) > 0) {
    # Format alert messages for display
    alerts_data <- alerts_data %>%
      mutate(
        Alert = case_when(
          Alert == "germination_started" ~ "🌱 Germination started - crop emerging",
          Alert == "germination_complete" ~ "✓ Germination complete - established",
          Alert == "stress_detected_whole_field" ~ "🚨 Stress detected - check irrigation/disease",
          Alert == "strong_recovery" ~ "📈 Strong recovery - growth accelerating",
          Alert == "growth_on_track" ~ "✓ Growth on track - normal progression",
          Alert == "maturation_progressing" ~ "🌾 Maturation progressing - ripening phase",
          Alert == "harvest_ready" ~ "✂️ Harvest ready - 45+ weeks old",
          TRUE ~ Alert
        )
      )

    ft <- flextable(alerts_data) %>%
      autofit()
    ft
  } else {
    cat("No active status triggers this week.\n")
  }
} else {
  cat("Field analysis data not available for alerts.\n")
}
```

```{r data, message=TRUE, warning=TRUE, include=FALSE}
# All data comes from the field analysis performed in 09_field_analysis_weekly.R
# The report renders KPI tables and field summaries from that data
```

```{r load_field_boundaries, message=TRUE, warning=TRUE, include=FALSE, eval=TRUE}
# Load field boundaries from parameters
field_boundaries_sf <- sf::st_make_valid(field_boundaries_sf)
tryCatch({
  AllPivots0 <- field_boundaries_sf %>%
    dplyr::filter(!is.na(field), !is.na(sub_field))  # Filter out NA field names
  safe_log("Successfully loaded field boundaries")

  # Prepare merged field list for use in summaries
  AllPivots_merged <- AllPivots0 %>%
    dplyr::filter(!is.na(field), !is.na(sub_field)) %>%  # Filter out NA field names
    dplyr::group_by(field) %>%
    dplyr::summarise(.groups = 'drop')

}, error = function(e) {
  stop("Error loading field boundaries: ", e$message)
})
```
\newpage

# Section 2: Support Document for weekly SmartCane data package.

## 1. About This Document

This document is the support document to the SmartCane data file. It includes the definitions, explanatory calculations and suggestions for interpretations of the data as provided. For additional questions please feel free to contact SmartCane support, through your contact person, or via info@smartcane.org.


## 2. About the Data File

The data file is automatically populated based on normalized and indexed remote sensing images of provided polygons. Specific SmartCane algorithms provide tailored calculation results developed to support the sugarcane operations by:

• Supporting harvest planning mill-field logistics to ensure optimal tonnage and sucrose levels
• Monitoring of the crop growth rates on the farm, providing evidence of performance
• Identifying growth-related issues that are in need of attention
• Enabling timely actions to minimize negative impact

Key Features of the data file: - High-resolution satellite imagery analysis - Week-over-week change detection - Individual field performance metrics - Actionable insights for crop management.


#### *What is the Chlorophyll Index (CI)?*

The Chlorophyll Index (CI) is a vegetation index that measures the relative amount of chlorophyll in plant leaves. Chlorophyll is the green pigment responsible for photosynthesis in plants. Higher CI values indicate:
• Greater photosynthetic activity
• Healthier plant tissue
• Better nitrogen uptake
• More vigorous crop growth

CI values typically range from 0 (bare soil or severely stressed vegetation) to 7+ (very healthy, dense vegetation). For sugarcane, values between 3-7 generally indicate good crop health, depending on the growth stage.


```{r ci_fig, echo=FALSE, fig.align='right', out.width='40%', fig.cap="Chlorophyll Index Example"}
knitr::include_graphics("CI_graph_example.png")
```


### Data File Structure and Columns
The data file is organized in rows, one row per agricultural field (polygon), and columns, providing field data, actual measurements, calculation results and descriptions. The data file can be directly integration with existing farm management systems for further analysis. Each column is described hereunder:

| **Nr.** | **Column**     | **Description**                                                       | **Example**     |
|-----|---------------|----------------------------------------------------------------------------|-------------|
|-----|---------------|----------------------------------------------------------------------------|-------------|
| **1** | **Field_id** | Unique identifier for a cane field combining field name and sub-field number. This can be the same as Field_Name but is also helpful in keeping track of cane fields should they change, split or merge. | "00110" |
| **2** | **Farm_Section** | Sub-area or section name | "Section a" |
| **3** | **Field_name** | Client Name or label assigned to a cane field. | "Tinga1" |
| **4** | **Acreage** | Field size in acres | "4.5" |
| **5** | **Status_trigger** | Shows changes in crop status worth alerting. More detailed explanation of the possible alerts is written down  under key concepts. | "Harvest_ready" |
| **6** | **Last_harvest_or_planting_date** | Date of most recent harvest as per satellite detection algorithm / or manual entry | “2025-03-14” |
| **7** |**Age_week** | Time elapsed since planting/harvest in weeks; used to predict expected growth phases. Reflects planting/harvest date (left). | "40" |
| **8** | **Phase (age based)** | Current growth phase (e.g., germination, tillering, stem elongation, grain fill, mature) inferred from crop age | "Maturation" |
| **9** | **Germination_progress** | Estimated percentage or stage of germination/emergence based on CI patterns and age. This goes for young fields (age < 4 months). Remain at 100% when finished. | "maturation_progressing" |
| **10** | **Mean_CI** | Average Chlorophyll Index value across the field; higher values indicate healthier, greener vegetation. Calculated on a 7-day merged weekly image  | "3.95" |
| **11** | **Weekly CI Change** | Week-over-week change in Mean_CI; positive values indicate greening/growth, negative values indicate yellowing/decline | "0.79" |
| **12** | **Four_week_trend** | Long term change in mean CI; smoothed trend (strong growth, growth, no growth, decline, strong decline) | "0.87" |
| **13** | **CI_range** | Min-max Chlorophyll Index values within the field; wide ranges indicate spatial heterogeneity/patches. Derived from week mosaic | "3.6-5.6" |
| **14** | **CI_Percentiles** | The CI-range without border effects | "3.5-4.4" |
| **15** | **CV** | Coefficient of variation of CI; measures field uniformity (lower = more uniform, >0.25 = poor uniformity). Derived from week mosaic. In percentages | "10.01%" |
| **16** | **CV_Trend_Short_Term** | Trend of CV over two weeks. Indicating short-term heterogeneity  | "0.15" |
| **17** | **CV_Trend_Long_Term** | Slope of 8-week trend line. | "0.32" |
| **18** | **Imminent_prob** | Probability (0-1) that the field is ready for harvest based on LSTM harvest model predictions | "0.8" |
| **19** | **Cloud_pct_clear** | Percentage of field visible in the satellite image (unobstructed by clouds); lower values indicate poor data quality | "70%" |
| **20** | **Cloud_category** | Classification of cloud cover level (e.g., clear, partial, heavy); indicates confidence in CI measurements | "Partial Coverage" |


\newpage
# 3. Key Concepts

#### *Growth Phases (Age-Based)*

Each field is assigned to one of four growth phases based on age in weeks since planting:

| **Phase** | **Age Range** | **Characteristics** |
|-------|-----------|-----------------|
| Germination | 0-6 weeks | Crop emergence and early establishment; high variability expected |
| Tillering | 4-16 weeks | Shoot multiplication and plant establishment; rapid growth phase |
| Grand Growth | 17-39 weeks | Peak vegetative growth; maximum height and biomass accumulation |
| Maturation | 39+ weeks | Ripening phase; sugar accumulation and preparation for harvest |


#### *Status Alert*

Status alerts indicate the current field condition based on CI and age-related patterns. Each field receives **one alert** reflecting its most relevant status:

| **Alert** | **Condition** | **Phase** | **Messaging** |
|---------|-----------|-------|-----------|
| Ready for harvest-check | Harvest model > 0.50 and crop is mature | Active from 52 weeks onwards | Ready for harvest-check |
| harvested/bare | Field of 50 weeks or older either shows mean CI values lower than 1.5 (for a maximum of three weeks) OR drops from higher CI to lower than 1.5. Alert drops if CI rises and passes 1.5 again | Maturation (39+) | Harvested or bare field |
| stress_detected_whole_field | Mean CI on field drops by 2+ points but field mean CI remains higher than 1.5 | Any | Strong decline in crop health |


#### *Harvest Date and Harvest Imminent*
The SmartCane algorithm calculates the last harvest date and the probability of harvest approaching in the next 4 weeks. Two different algorithms are used.
The **last harvest date** is a timeseries analyses of the CI levels of the past years, based on clean factory managed fields as data set for the machine learning, a reliability of over 90% has been reached. Smallholder managed fields of small size (0.3 acres) have specific side effects and field management characteristics, that influence the model results.
**Imminent_probability** of harvest is a prediction algorithm, estimating the likelihood of a crop ready to be harvested in the near future. This prediction takes the CI-levels into consideration, building on the vegetative development of sugarcane in the last stage of Maturation, where all sucrose is pulled into the stalk, depleting the leaves from energy and productive function, reducing the levels of CI in the leave tissue.
Both algorithms are not always in sync, and can have contradictory results. Wider field characteristics analyses is suggested if such contradictory calculation results occur.


\newpage
## Report Metadata

```{r report_metadata, echo=FALSE}
metadata_info <- data.frame(
  Metric = c("Report Generated", "Data Source", "Analysis Period", "Total Fields", "Next Update"),
  Value = c(
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    paste("Project", toupper(project_dir)),
    paste("Week", current_week, "of", year),
    ifelse(exists("AllPivots0"), nrow(AllPivots0 %>% filter(!is.na(field)) %>% group_by(field) %>% summarise()), "Unknown"),
    "Next Wednesday"
  )
)

ft <- flextable(metadata_info) %>%
  set_caption("Report Metadata") %>%
  autofit()

ft
```

*This report was automatically generated by the SmartCane monitoring system. For questions or additional analysis, please contact the technical team.*