Quick Start Guide - Harvest Detection Experiments

Get Started in 3 Steps

1. Navigate to Framework

cd "c:\Users\timon\Resilience BV\4020 SCane ESA DEMO - Documenten\General\4020 SCDEMO Team\4020 TechnicalData\WP3\smartcane_v2\smartcane\python_app\harvest_detection_experiments\experiment_framework"

2. Run Your First Experiment

python run_experiment.py --exp exp_001

This runs the baseline (4 trend features only). Takes ~30-60 min on GPU.

3. Compare Results

python analyze_results.py --experiments all

Run All Phase 1 Experiments (Overnight)

# Run all 10 feature selection experiments
python run_experiment.py --exp exp_001,exp_002,exp_003,exp_004,exp_005,exp_006,exp_007,exp_008,exp_009,exp_010

Expected time: 5-10 hours total (10 experiments × 30-60 min each)
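For an unattended overnight run, the same batch can be driven from a small Python wrapper that launches one `run_experiment.py` call per experiment, so a crash in one experiment does not abort the rest. This is a minimal sketch; it assumes `--exp` accepts a single ID per call (as in step 2 above).

```python
import subprocess

# exp_001 .. exp_010: the ten Phase 1 feature-selection experiments
EXPERIMENTS = [f"exp_{i:03d}" for i in range(1, 11)]

def build_command(exp_id: str) -> list[str]:
    """Command line for one experiment run."""
    return ["python", "run_experiment.py", "--exp", exp_id]

def run_all(experiments=EXPERIMENTS):
    """Run each experiment in sequence; keep going if one fails."""
    for exp_id in experiments:
        subprocess.run(build_command(exp_id), check=False)
```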

What Gets Created

After running exp_001, you'll see:

results/001_trends_only/
├── config.json                # Exact configuration used
├── model.pt                   # Trained model weights
├── metrics.json               # All performance metrics
├── training_curves.png        # Training/validation loss
├── roc_curves.png             # ROC curves (imminent + detected)
└── confusion_matrices.png     # Confusion matrices
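A quick way to confirm a run finished cleanly is to check that every artifact in the tree above was written. The file names below come from that listing; the helper itself is illustrative.

```python
from pathlib import Path

# Artifacts every finished experiment should produce (per the tree above)
EXPECTED = ["config.json", "model.pt", "metrics.json",
            "training_curves.png", "roc_curves.png", "confusion_matrices.png"]

def missing_artifacts(result_dir) -> list[str]:
    """Return the expected files that are absent from a results folder."""
    root = Path(result_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty return value (e.g. from `missing_artifacts("results/001_trends_only")`) means the run completed.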

Check Results

Open results/001_trends_only/metrics.json:

{
  "cv_results": {
    "imminent_auc_mean": 0.6344,
    "imminent_auc_std": 0.0213,
    "detected_auc_mean": 0.6617,
    "detected_auc_std": 0.0766
  },
  "test_results": {
    "imminent_auc": 0.4850,
    "imminent_f1": 0.00,
    "detected_auc": 0.6007,
    "detected_f1": 0.16
  }
}

Interpretation:

  • CV AUC (cross-validation) = How well model learns patterns
  • Test AUC = How well model generalizes to unseen data
  • Gap between CV and test = Overfitting indicator
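The CV-test gap from the last bullet can be computed directly from `metrics.json`. A sketch using the sample exp_001 values shown above; the key names match that JSON, the helper function is illustrative:

```python
def cv_test_gap(metrics: dict, task: str = "imminent") -> float:
    """Difference between cross-validation AUC and test AUC for one task."""
    cv = metrics["cv_results"][f"{task}_auc_mean"]
    test = metrics["test_results"][f"{task}_auc"]
    return cv - test

# Sample values from the metrics.json shown above; in practice load it with
# json.load(open("results/001_trends_only/metrics.json"))
metrics = {
    "cv_results": {"imminent_auc_mean": 0.6344, "detected_auc_mean": 0.6617},
    "test_results": {"imminent_auc": 0.4850, "detected_auc": 0.6007},
}

imminent_gap = cv_test_gap(metrics, "imminent")  # ~0.15: large, likely overfitting
```

A gap this large suggests exp_001 overfits; the Pro Tips section uses 0.05 as a rough threshold for an acceptable gap.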

Find Best Model

# Show top 3 by imminent AUC
python analyze_results.py --rank-by imminent_auc --top 3

# Or by F1 score
python analyze_results.py --rank-by imminent_f1 --top 3

Output:

Experiment                      Imm AUC    Det AUC     Imm F1     Det F1
--------------------------------------------------------------------------------
009_combined_best               0.7821     0.8456     0.6234     0.7123
002_trends_velocity             0.7654     0.8234     0.5987     0.6891
003_trends_velocity_accel       0.7543     0.8123     0.5876     0.6745

Visualizations

After running analyze_results.py, check:

  • results/comparison_imminent_auc.png - Bar chart of AUC scores
  • results/comparison_all_metrics.png - Multi-metric comparison
  • results/comparison_table.csv - Full results table (open in Excel)
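The CSV can also be re-ranked in pandas without re-running the analysis script. A sketch using the sample ranking values from the table above; the column names are an assumption about the CSV layout:

```python
import pandas as pd

# Rows mirror the sample ranking table above; in practice:
# df = pd.read_csv("results/comparison_table.csv")
df = pd.DataFrame({
    "experiment": ["009_combined_best", "002_trends_velocity", "003_trends_velocity_accel"],
    "imminent_auc": [0.7821, 0.7654, 0.7543],
    "detected_auc": [0.8456, 0.8234, 0.8123],
})

# Rank by any metric, e.g. detected AUC, highest first
ranked = df.sort_values("detected_auc", ascending=False).reset_index(drop=True)
```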

Next Steps

  1. Run Phase 1 (all 10 experiments)
  2. Identify best features (highest test AUC with small CV-test gap)
  3. Configure Phase 2 (test model architectures with best features)
  4. Run Phase 2 (optimize hidden_size, num_layers, try GRU)
  5. Configure Phase 3 (fine-tune hyperparameters)
  6. Select final model for production

Troubleshooting

Error: "No module named 'yaml'"

pip install pyyaml

CUDA out of memory:

python run_experiment.py --exp exp_001 --device cpu

Want to test faster? Edit config/experiments.yaml, reduce:

  • num_epochs: 150 → 50
  • k_folds: 5 → 3

Pro Tips

  • Start with just exp_001 to verify the setup works
  • Run an overnight batch for all Phase 1 experiments
  • Always check the CV vs test AUC gap (should be < 0.05)
  • Look at the confusion matrices to understand failure modes
  • Export the best model: the model.pt file can be loaded for production

Questions?

Check README.md for full documentation.