SmartCane Pipeline Code Review
Efficiency, Cleanup, and Architecture Analysis
Date: January 29, 2026
Scope: run_full_pipeline.R + all called scripts (10, 20, 21, 30, 31, 40, 80, 90, 91) + utility files
Status: Comprehensive review completed
EXECUTIVE SUMMARY
Your pipeline is well-structured and intentional, but has accumulated significant technical debt through development iterations. The main issues are:
- 🔴 HIGH IMPACT: 3 separate mosaic mode detection functions doing identical work
- 🔴 HIGH IMPACT: Week/year calculations duplicated 10+ times across 6+ files
- 🟡 MEDIUM IMPACT: 40+ debug statements cluttering output
- 🟡 MEDIUM IMPACT: File existence checks repeated in multiple places (especially KPI checks)
- 🟢 LOW IMPACT: Minor redundancy in command construction, but manageable
Estimated cleanup effort: 2-3 hours for core refactoring; significant code quality gains.
Workflow clarity issue: The split between merged_tif and merged_tif_8b, and between weekly_mosaic and weekly_tile_max, is not clearly documented and should be clarified.
1. DUPLICATED FUNCTIONS & LOGIC
1.1 Mosaic Mode Detection (CRITICAL REDUNDANCY)
Problem: Three near-identical implementations of mosaic-mode detection:
| Location | Function Name | Lines | Issue |
|---|---|---|---|
| run_full_pipeline.R | detect_mosaic_mode_early() | ~20 lines | Detects tiled vs single-file |
| run_full_pipeline.R | detect_mosaic_mode_simple() | ~20 lines | Detects tiled vs single-file (duplicate) |
| parameters_project.R | detect_mosaic_mode() | ~30 lines | Detects tiled vs single-file (different signature) |
Impact: If you change the detection logic, you must update 3 places. Bug risk is high.
Solution: Create single canonical function in parameters_project.R:
# SINGLE SOURCE OF TRUTH
detect_mosaic_mode <- function(project_dir) {
weekly_tile_max <- file.path("laravel_app", "storage", "app", project_dir, "weekly_tile_max")
if (dir.exists(weekly_tile_max)) {
subfolders <- list.dirs(weekly_tile_max, full.names = FALSE, recursive = FALSE)
if (length(grep("^\\d+x\\d+$", subfolders)) > 0) return("tiled")
}
weekly_mosaic <- file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
if (dir.exists(weekly_mosaic) &&
length(list.files(weekly_mosaic, pattern = "^week_.*\\.tif$")) > 0) {
return("single-file")
}
return("unknown")
}
Then replace all three calls in run_full_pipeline.R with this single function.
1.2 Week/Year Calculations (CRITICAL REDUNDANCY)
Problem: The paired calls week_num <- as.numeric(format(..., "%V")) and year_num <- as.numeric(format(..., "%G")) appear 13+ times across multiple files.
Locations:
- run_full_pipeline.R: lines 82, 126-127, 229-230, 630, 793-794 (5 times)
- 80_calculate_kpis.R: lines 323-324 (1 time)
- 80_weekly_stats_utils.R: lines 829-830 (1 time)
- kpi_utils.R: line 45 (1 time)
- 80_kpi_utils.R: lines 177-178 (1 time)
- Plus inline in sprintf statements: ~10+ additional times
Impact:
- High maintenance burden
- Risk of inconsistency (%V vs %Y confusion noted at line 82 in run_full_pipeline.R)
- Code verbosity
Solution: Create utility function in parameters_project.R:
get_iso_week_year <- function(date) {
list(
week = as.numeric(format(date, "%V")),
year = as.numeric(format(date, "%G")) # ISO year, not calendar year
)
}
# Usage:
wwy <- get_iso_week_year(end_date)
cat(sprintf("Week %02d/%d\n", wwy$week, wwy$year))
Also add convenience function:
format_week_year <- function(date, separator = "_") {
wwy <- get_iso_week_year(date)
sprintf("week_%02d%s%d", wwy$week, separator, wwy$year)
}
# Usage: format_week_year(end_date) # "week_02_2026"
1.3 File Path Construction (MEDIUM REDUNDANCY)
Problem: Repeated patterns like:
file.path("laravel_app", "storage", "app", project_dir, "weekly_mosaic")
file.path("laravel_app", "storage", "app", project_dir, "reports", "kpis", kpi_subdir)
Solution: Centralize in parameters_project.R:
# Project-agnostic path builders
get_project_storage_path <- function(project_dir, subdir = NULL) {
base <- file.path("laravel_app", "storage", "app", project_dir)
if (!is.null(subdir)) file.path(base, subdir) else base
}
get_mosaic_dir <- function(project_dir, mosaic_mode = "auto") {
if (mosaic_mode == "auto") mosaic_mode <- detect_mosaic_mode(project_dir)
if (mosaic_mode == "tiled") {
get_project_storage_path(project_dir, "weekly_tile_max/5x5")
} else {
get_project_storage_path(project_dir, "weekly_mosaic")
}
}
get_kpi_dir <- function(project_dir, client_type) {
subdir <- if (client_type == "agronomic_support") "field_level" else "field_analysis"
get_project_storage_path(project_dir, file.path("reports", "kpis", subdir))
}
2. DEBUG STATEMENTS & LOGGING CLUTTER
2.1 Excessive Debug Output
The pipeline prints 40+ debug statements that pollute the terminal output. Examples:
In run_full_pipeline.R:
Line 82: cat(sprintf("  Running week: %02d / %d\n", ...))  # Note: fed with %Y (calendar year) where %G is needed
Line 218: cat(sprintf("[KPI_DIR_CREATED] Created directory: %s\n", ...))
Line 223: cat(sprintf("[KPI_DIR_EXISTS] %s\n", ...))
Line 224: cat(sprintf("[KPI_DEBUG] Total files in directory: %d\n", ...))
Line 225: cat(sprintf("[KPI_DEBUG] Sample files: %s\n", ...))
Line 240: cat(sprintf("[KPI_DEBUG_W%02d_%d] Pattern: '%s' | Found: %d files\n", ...))
Line 630: cat("DEBUG: Running command:", cmd, "\n")  # prints the full conda command for Script 31
In 80_calculate_kpis.R:
Line 323: message(paste("Calculating statistics for all fields - Week", week_num, year))
Line 417 and beyond: many more similar messages
Impact:
- Makes output hard to scan for real issues
- Test developers skip important messages
- Production logs become noise
Solution: Replace with structured logging (3 levels):
# Add to parameters_project.R
smartcane_log <- function(message, level = "INFO") {
timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
prefix <- sprintf("[%s] %s", level, timestamp)
cat(sprintf("%s | %s\n", prefix, message))
}
smartcane_debug <- function(message) {
if (Sys.getenv("SMARTCANE_DEBUG") == "TRUE") {
smartcane_log(message, level = "DEBUG")
}
}
smartcane_warn <- function(message) {
smartcane_log(message, level = "WARN")
}
Usage:
# Keep important messages
smartcane_log(sprintf("Downloaded %d dates, %d failed", download_count, download_failed))
# Hide debug clutter (only show if DEBUG=TRUE)
smartcane_debug(sprintf("KPI directory exists: %s", kpi_dir))
# Warnings stay visible
smartcane_warn("Some downloads failed, but continuing pipeline")
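During troubleshooting, the hidden messages can be switched on per session via the environment variable that the smartcane_debug() sketch above reads:

```r
# Enable debug output for the current R session only
Sys.setenv(SMARTCANE_DEBUG = "TRUE")
smartcane_debug("This message is now visible")

# Return to quiet mode
Sys.setenv(SMARTCANE_DEBUG = "FALSE")
```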
2.2 Redundant Status Checks in KPI Section
Lines 218-270 in run_full_pipeline.R: The KPI requirement check has deeply nested debug statements.
if (dir.exists(kpi_dir)) {
cat(sprintf("[KPI_DIR_EXISTS] %s\n", kpi_dir))
all_kpi_files <- list.files(kpi_dir)
cat(sprintf("[KPI_DEBUG] Total files in directory: %d\n", length(all_kpi_files)))
if (length(all_kpi_files) > 0) {
cat(sprintf("[KPI_DEBUG] Sample files: %s\n", ...))
}
} else {
cat(sprintf("[KPI_DIR_MISSING] Directory does not exist: %s\n", kpi_dir))
}
Solution: Simplify to:
if (!dir.exists(kpi_dir)) {
dir.create(kpi_dir, recursive = TRUE, showWarnings = FALSE)
}
all_kpi_files <- list.files(kpi_dir)
smartcane_debug(sprintf("KPI directory: %d files found", length(all_kpi_files)))
3. DOUBLE CALCULATIONS & INEFFICIENCIES
3.1 KPI Existence Check (Calculated Twice)
Problem: KPI existence is checked twice in run_full_pipeline.R:
- First check (lines 228-270): initial KPI requirement check that builds the kpis_needed dataframe
- Second check (lines 786-810): verification after Script 80 runs (almost identical logic)
Both loops do:
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
week_num <- as.numeric(format(check_date, "%V"))
year_num <- as.numeric(format(check_date, "%G"))
week_pattern <- sprintf("week%02d_%d", week_num, year_num)
kpi_files_this_week <- list.files(kpi_dir, pattern = week_pattern)
has_kpis <- length(kpi_files_this_week) > 0
# ... same logic again
}
Impact: Slower pipeline execution, code duplication
Solution: Create reusable function in utility file:
check_kpi_completeness <- function(project_dir, client_type, end_date, reporting_weeks_needed) {
kpi_dir <- get_kpi_dir(project_dir, client_type)
kpis_needed <- data.frame()
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
wwy <- get_iso_week_year(check_date)
week_pattern <- sprintf("week%02d_%d", wwy$week, wwy$year)
has_kpis <- any(grepl(week_pattern, list.files(kpi_dir)))
kpis_needed <- rbind(kpis_needed, data.frame(
week = wwy$week,
year = wwy$year,
date = check_date,
has_kpis = has_kpis
))
}
return(list(
kpis_df = kpis_needed,
missing_count = sum(!kpis_needed$has_kpis),
all_complete = all(kpis_needed$has_kpis)
))
}
# Then in run_full_pipeline.R:
initial_kpi_check <- check_kpi_completeness(project_dir, client_type, end_date, reporting_weeks_needed)
# ... after Script 80 runs:
final_kpi_check <- check_kpi_completeness(project_dir, client_type, end_date, reporting_weeks_needed)
if (final_kpi_check$all_complete) {
smartcane_log("✓ All KPIs available")
}
3.2 Mosaic Mode Detection (Called 3+ Times per Run)
Current code:
- Lines 99-117: detect_mosaic_mode_early() called once
- Lines 301-324: detect_mosaic_mode_simple() called again
- Result: the same detection logic runs multiple times unnecessarily
Solution: Call once, store result:
mosaic_mode <- detect_mosaic_mode(project_dir) # Once at top
# Then reuse throughout:
if (mosaic_mode == "tiled") { ... }
else if (mosaic_mode == "single-file") { ... }
3.3 Missing Weeks Calculation Inefficiency
Lines 126-170: The loop builds weeks_needed dataframe, then immediately iterates again to find which ones are missing.
Current code:
# First: build all weeks
weeks_needed <- data.frame()
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
# ... build weeks_needed
}
# Then: check which are missing (loop again)
missing_weeks <- data.frame()
for (i in 1:nrow(weeks_needed)) {
# ... check each week
}
Solution: Combine into single loop:
weeks_needed <- data.frame()
missing_weeks <- data.frame()
earliest_missing_date <- end_date
for (weeks_back in 0:(reporting_weeks_needed - 1)) {
check_date <- end_date - (weeks_back * 7)
wwy <- get_iso_week_year(check_date)
# Add to weeks_needed
weeks_needed <- rbind(weeks_needed, data.frame(
week = wwy$week, year = wwy$year, date = check_date
))
# Check if missing, add to missing_weeks if so
week_pattern <- sprintf("week_%02d_%d", wwy$week, wwy$year)
mosaic_dir <- get_mosaic_dir(project_dir, mosaic_mode)
if (length(list.files(mosaic_dir, pattern = week_pattern)) == 0) {
missing_weeks <- rbind(missing_weeks, data.frame(
week = wwy$week, year = wwy$year, week_end_date = check_date
))
if (check_date - 6 < earliest_missing_date) {
earliest_missing_date <- check_date - 6
}
}
}
3.4 Data Source Detection Logic
Lines 58-84: The data_source_used detection is overly complex:
data_source_used <- "merged_tif_8b" # Default
if (dir.exists(merged_tif_path)) {
tif_files <- list.files(merged_tif_path, pattern = "\\.tif$")
if (length(tif_files) > 0) {
data_source_used <- "merged_tif"
# ...
} else if (dir.exists(merged_tif_8b_path)) {
tif_files_8b <- list.files(merged_tif_8b_path, pattern = "\\.tif$")
# ...
}
} else if (dir.exists(merged_tif_8b_path)) {
# ...
}
Issues:
- Multiple nested conditions doing the same check
- tif_files and tif_files_8b are listed, but only their counts are checked (the file lists are not used later)
- Logic could be cleaner
Solution: Create utility function:
detect_data_source <- function(project_dir, preferred = "auto") {
storage_dir <- get_project_storage_path(project_dir)
for (source in c("merged_tif", "merged_tif_8b")) {
source_dir <- file.path(storage_dir, source)
if (dir.exists(source_dir)) {
tifs <- list.files(source_dir, pattern = "\\.tif$")
if (length(tifs) > 0) return(source)
}
}
smartcane_warn("No data source found - defaulting to merged_tif_8b")
return("merged_tif_8b")
}
4. WORKFLOW CLARITY ISSUES
4.1 TIFF Data Format Confusion
Problem: Why are there TWO different TIFF folders?
- merged_tif: 4-band data (RGB + NIR)
- merged_tif_8b: 8-band data (appears to include UDM cloud masking from Planet)
Currently in code:
data_source <- if (project_dir == "angata") "merged_tif_8b" else "merged_tif"
Issues:
- Hard-coded per project, not based on what's actually available
- Not documented why angata uses 8-band
- Unclear what the 8-band data adds (cloud masking? extra bands?)
- Scripts handle both, but it's not clear when to use which
Recommendation:
- Document in parameters_project.R what each data source contains:
DATA_SOURCE_FORMATS <- list(
"merged_tif" = list(
bands = 4,
description = "4-band PlanetScope: Red, Green, Blue, NIR",
projects = c("aura", "chemba", "xinavane"),
note = "Standard format from Planet API"
),
"merged_tif_8b" = list(
bands = 8,
description = "8-band PlanetScope with UDM: RGB+NIR + 4-band cloud mask",
projects = c("angata"),
note = "Enhanced with cloud confidence from UDM2 (Unusable Data Mask)"
)
)
- Update hard-coded assignment to be data-driven:
# OLD: data_source <- if (project_dir == "angata") "merged_tif_8b" else "merged_tif"
# NEW: detect what's actually available
data_source <- detect_data_source(project_dir)
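If both pieces land in parameters_project.R, the metadata list and the detector can be combined so downstream scripts branch on band count rather than project name. A sketch, assuming the DATA_SOURCE_FORMATS list and detect_data_source() proposed above; get_data_source_info() is a hypothetical helper name:

```r
get_data_source_info <- function(project_dir) {
  source <- detect_data_source(project_dir)   # falls back to "merged_tif_8b"
  meta <- DATA_SOURCE_FORMATS[[source]]
  list(source = source, bands = meta$bands, description = meta$description)
}

# Usage: branch on bands instead of hard-coded project names
info <- get_data_source_info(project_dir)
if (info$bands == 8) {
  # apply the UDM2 cloud mask before index calculation
}
```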
4.2 Mosaic Storage Format Confusion
Problem: Why are there TWO different mosaic storage styles?
- weekly_mosaic/: single TIF file per week (monolithic)
- weekly_tile_max/5x5/: tiled TIFFs per week (25+ files per week)
Currently in code:
- Detected automatically via detect_mosaic_mode()
- But no documentation on when/why each is used
Recommendation:
- Document the trade-offs in parameters_project.R:
MOSAIC_MODES <- list(
"single-file" = list(
description = "One TIF per week",
storage_path = "weekly_mosaic/",
files_per_week = 1,
pros = c("Simpler file management", "Easier to load full mosaic"),
cons = c("Slower for field-specific analysis", "Large file I/O"),
suitable_for = c("agronomic_support", "dashboard visualization")
),
"tiled" = list(
description = "5×5 grid of tiles per week",
storage_path = "weekly_tile_max/5x5/",
files_per_week = 25,
pros = c("Parallel field processing", "Faster per-field queries", "Scalable to 1000+ fields"),
cons = c("More file management", "Requires tile_grid metadata"),
suitable_for = c("cane_supply", "large-scale operations")
)
)
- Document why angata uses tiled, aura uses single-file:
- Is it a function of field count? (Angata = cane_supply, large fields → tiled)
- Is it historical? (Legacy decision?)
- Should new projects choose based on client type?
4.3 Client Type Mapping Clarity
Current structure in parameters_project.R:
CLIENT_TYPE_MAP <- list(
"angata" = "cane_supply",
"aura" = "agronomic_support",
"chemba" = "cane_supply",
"xinavane" = "cane_supply",
"esa" = "cane_supply"
)
Issues:
- Not clear why aura is agronomic_support while angata/chemba are cane_supply
- No documentation of what each client type needs
- Scripts branch heavily on skip_cane_supply_only logic
Recommendation: Add metadata to explain the distinction:
CLIENT_TYPES <- list(
"cane_supply" = list(
description = "Sugar mill supply chain optimization",
requires_harvest_prediction = TRUE, # Script 31
requires_phase_assignment = TRUE, # Based on planting date
per_field_detail = TRUE, # Script 91 Excel report
data_sources = c("merged_tif", "merged_tif_8b"),
mosaic_mode = "tiled",
projects = c("angata", "chemba", "xinavane", "esa")
),
"agronomic_support" = list(
description = "Farm-level decision support for agronomists",
requires_harvest_prediction = FALSE,
requires_phase_assignment = FALSE,
per_field_detail = FALSE,
farm_level_kpis = TRUE, # Script 90 Word report
data_sources = c("merged_tif"),
mosaic_mode = "single-file",
projects = c("aura")
)
)
5. COMMAND CONSTRUCTION REDUNDANCY
5.1 Rscript Path Repetition
Problem: The Rscript path is hard-coded in multiple places, e.g.:
Line 519: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
Line 676: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
Line 685: '"C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"'
Solution: Define once in parameters_project.R:
RSCRIPT_PATH <- "C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\Rscript.exe"
# Usage:
cmd <- sprintf('"%s" --vanilla r_app/20_ci_extraction.R ...', RSCRIPT_PATH)
6. SPECIFIC LINE-BY-LINE ISSUES
6.1 Line 82 Bug: Wrong Format Code
cat(sprintf(" Running week: %02d / %d\n",
as.numeric(format(end_date, "%V")),
as.numeric(format(end_date, "%Y")))) # ❌ Should be %G, not %Y
Issue: Uses calendar year %Y instead of ISO week year %G. On dates like 2025-12-30 (week 1 of 2026), this will print "Week 01 / 2025" (confusing).
Fix:
wwy <- get_iso_week_year(end_date)
cat(sprintf(" Running week: %02d / %d\n", wwy$week, wwy$year))
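The boundary behaviour is easy to confirm in an R console:

```r
d <- as.Date("2025-12-30")   # falls in ISO week 1 of 2026
format(d, "%V")              # "01"   ISO week number
format(d, "%G")              # "2026" ISO week-based year (pairs correctly with %V)
format(d, "%Y")              # "2025" calendar year (mismatched with %V)
```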
6.2 Line 630 Debug Statement
cmd <- sprintf('conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s', project_dir)
cat("DEBUG: Running command:", cmd, "\n") # ❌ Prints full conda command
Solution: Use smartcane_debug() function:
cmd <- sprintf('conda run -n pytorch_gpu python python_app/31_harvest_imminent_weekly.py %s', project_dir)
smartcane_debug(sprintf("Running Python 31: %s", cmd))
6.3 Lines 719-723: Verbose Script 31 Verification
# Check for THIS WEEK's specific file
current_week <- as.numeric(format(end_date, "%V"))
current_year <- as.numeric(format(end_date, "%Y"))
expected_file <- file.path(...)
Issue: Calculates week twice (already done earlier). Also uses %Y (should be %G).
Solution: Reuse earlier wwy calculation or create helper.
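A minimal sketch of that fix, assuming the section 1.2 helpers are in place (the exact Script 31 file naming is elided in the original, so the pattern below is illustrative):

```r
# Reuse the ISO helpers instead of recomputing with %Y
wwy <- get_iso_week_year(end_date)
expected_pattern <- sprintf("week_%02d_%d", wwy$week, wwy$year)  # illustrative naming
```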
7. REFACTORING ROADMAP
Phase 1: Foundation (1 hour)
- Consolidate detect_mosaic_mode() into a single function in parameters_project.R
- Create get_iso_week_year() and format_week_year() utilities
- Create get_project_storage_path(), get_mosaic_dir(), get_kpi_dir() helpers
- Add logging functions (smartcane_log(), smartcane_debug(), smartcane_warn())
Phase 2: Deduplication (1 hour)
- Replace all 13+ week_num/year_num calculations with get_iso_week_year()
- Replace all 3 detect_mosaic_mode_*() calls with the single function
- Combine the duplicate KPI checks into check_kpi_completeness()
- Fix the line 82 and line 630 format bugs
Phase 3: Cleanup (1 hour)
- Remove all debug statements (40+); replace with smartcane_debug()
- Combine missing weeks detection into single loop
- Extract Rscript path to constant
Phase 4: Documentation (30 min)
- Add comments explaining merged_tif vs merged_tif_8b trade-offs
- Document single-file vs tiled mosaic modes and when to use each
- Clarify the client type mapping in CLIENT_TYPE_MAP
- Add inline comments for non-obvious logic
8. ARCHITECTURE & WORKFLOW RECOMMENDATIONS
8.1 Clear Data Flow Diagram
Add to r_app/system_architecture/system_architecture.md:
INPUT SOURCES:
├── Planet API 4-band or 8-band imagery
├── Field boundaries (pivot.geojson)
└── Harvest data (harvest.xlsx, optional for cane_supply)
STORAGE TIERS:
├── Tier 1: Raw data (merged_tif/ or merged_tif_8b/)
├── Tier 2: Daily tiles (daily_tiles_split/{grid_size}/{dates}/)
├── Tier 3: Extracted CI (Data/extracted_ci/daily_vals/*.rds)
├── Tier 4: Weekly mosaics (weekly_mosaic/ OR weekly_tile_max/5x5/)
└── Tier 5: KPI outputs (reports/kpis/{field_level|field_analysis}/)
DECISION POINTS:
└─ Client type (cane_supply vs agronomic_support)
├─ Drives script selection (Scripts 21, 22, 23, 31, 90/91)
├─ Drives data source (merged_tif_8b for cane_supply, merged_tif for agronomic)
├─ Drives mosaic mode (tiled for cane_supply, single-file for agronomic)
└─ Drives KPI subdirectory (field_analysis vs field_level)
8.2 .sh Scripts Alignment
You mention .sh scripts in the online environment. If they're not calling the R pipeline, there's a split responsibility issue:
Question: Are the .sh scripts:
- (A) Independent duplicates of the R pipeline logic? (BAD - maintenance nightmare)
- (B) Wrappers calling the R pipeline? (GOOD - single source of truth)
- (C) Different workflow for online vs local? (RED FLAG - they diverge)
Recommendation: If using .sh for production, ensure they call the same R scripts (run_full_pipeline.R). Example:
#!/bin/bash
# Wrapper that keeps the R pipeline as the single source of truth
cd /path/to/smartcane || exit 1
Rscript --vanilla r_app/run_full_pipeline.R "$@"
9. SUMMARY TABLE: Issues by Severity
| Issue | Type | Impact | Effort | Priority |
|---|---|---|---|---|
| 3 mosaic detection functions | Duplication | HIGH | 30 min | P0 |
| 13+ week/year calculations | Duplication | HIGH | 1 hour | P0 |
| 40+ debug statements | Clutter | MEDIUM | 1 hour | P1 |
| KPI check run twice | Inefficiency | LOW | 30 min | P2 |
| Line 82: %Y should be %G | Bug | LOW | 5 min | P2 |
| Data source confusion | Documentation | MEDIUM | 30 min | P1 |
| Mosaic mode confusion | Documentation | MEDIUM | 30 min | P1 |
| Client type mapping | Documentation | MEDIUM | 30 min | P1 |
| Data source detection complexity | Code style | LOW | 15 min | P3 |
10. RECOMMENDED NEXT STEPS
- Review this report with your team to align on priorities
- Create Linear issues for each phase of refactoring
- Start with Phase 1 (foundation utilities) - builds confidence for Phase 2
- Test thoroughly after each phase - the pipeline is complex and easy to break
- Update .sh scripts if they duplicate R logic
- Document data flow in system_architecture/system_architecture.md
Questions for Clarification
Before implementing, please clarify:
- Data source split: Why does angata use merged_tif_8b (8-band with cloud mask) while aura uses merged_tif (4-band)? Is this:
  - A function of client need (cane_supply requires cloud masking)?
  - Historical (a legacy decision for angata)?
  - Should new projects choose based on availability?
- Mosaic mode split: Why tiled for angata but single-file for aura? Should this be:
  - Hard-coded per project?
  - Based on field count/client type?
  - Auto-detected from the first run?
- Production vs local: Are the .sh scripts in the online environment:
  - Calling this same R pipeline?
  - Duplicating logic independently?
  - A different workflow entirely?
- Client type growth: Are there other client types planned beyond cane_supply and agronomic_support? (e.g., extension_service?)
Report prepared: January 29, 2026
Total code reviewed: ~2,500 lines across 10 files
Estimated refactoring time: 3-4 hours
Estimated maintenance savings: 5-10 hours/month (fewer bugs, easier updates)