SmartCane/python_app/experiments/planet_download_with_ocm.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "bee51aa9",
   "metadata": {},
   "source": [
    "# Planet Data Download & Processing with OmniCloudMask\n",
    "\n",
    "This notebook extends the functionality of the original `planet_download.ipynb` by incorporating OmniCloudMask (OCM) for improved cloud and shadow detection in PlanetScope imagery. OCM is a state-of-the-art cloud masking tool that was originally trained on Sentinel-2 data but generalizes exceptionally well to PlanetScope imagery.\n",
    "\n",
    "## Key Features Added:\n",
    "- OmniCloudMask integration for advanced cloud and shadow detection\n",
    "- Comparison visualization between standard UDM masks and OCM masks\n",
    "- Options for both local processing and direct integration with SentinelHub\n",
    "- Support for batch processing multiple images"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e8cbe80",
   "metadata": {},
   "source": [
    "## 1. Load packages and connect to SentinelHub\n",
    "First, we'll install required packages and import dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "88d787b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Standard packages from original notebook\n",
    "import os\n",
    "import json\n",
    "import datetime\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from pathlib import Path\n",
    "from osgeo import gdal\n",
    "\n",
    "from sentinelhub import MimeType, CRS, BBox, SentinelHubRequest, SentinelHubDownloadClient, \\\n",
    "    DataCollection, bbox_to_dimensions, DownloadRequest, SHConfig, BBoxSplitter, read_data, Geometry, SentinelHubCatalog\n",
    "\n",
    "import time\n",
    "import shutil\n",
    "import geopandas as gpd\n",
    "from shapely.geometry import MultiLineString, MultiPolygon, Polygon, box, shape\n",
    "\n",
    "# Install OmniCloudMask if not present\n",
    "# Uncomment these lines to install dependencies\n",
    "# %pip install omnicloudmask rasterio"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "967d917d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "OmniCloudMask successfully loaded\n"
     ]
    }
   ],
   "source": [
    "# Import OmniCloudMask after installation\n",
    "try:\n",
    "    from omnicloudmask import predict_from_array, load_multiband, predict_from_load_func\n",
    "    from functools import partial\n",
    "    import rasterio as rio\n",
    "    HAS_OCM = True\n",
    "    print(\"OmniCloudMask successfully loaded\")\n",
    "except ImportError:\n",
    "    print(\"OmniCloudMask not installed. Run the cell above to install it or install manually with pip.\")\n",
    "    HAS_OCM = False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "39bd6361",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure SentinelHub connection\n",
    "config = SHConfig()\n",
    "config.sh_client_id = '1a72d811-4f0e-4447-8282-df09608cff44'\n",
    "config.sh_client_secret = 'FcBlRL29i9ZmTzhmKTv1etSMFs5PxSos'\n",
    "\n",
    "catalog = SentinelHubCatalog(config=config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "99f4f255",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure BYOC data collection\n",
    "collection_id = 'c691479f-358c-46b1-b0f0-e12b70a9856c'\n",
    "byoc = DataCollection.define_byoc(\n",
    "    collection_id,\n",
    "    name='planet_data2',\n",
    "    is_timeless=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04ad9f39",
   "metadata": {},
   "source": [
    "## 2. Configure project settings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "672bd92c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Project selection\n",
    "project = 'chemba'  # Change this to your project name\n",
    "\n",
    "# Number of days to process\n",
    "days = 30\n",
    "\n",
    "# Set this to True to delete intermediate files after processing\n",
    "empty_folder_question = True\n",
    "\n",
    "# Output directories setup\n",
    "BASE_PATH = Path('../laravel_app/storage/app') / os.getenv('PROJECT_DIR', project) \n",
    "BASE_PATH_SINGLE_IMAGES = Path(BASE_PATH / 'single_images')\n",
    "OCM_MASKS_DIR = Path(BASE_PATH / 'ocm_masks')  # Directory for OmniCloudMask results\n",
    "folder_for_merged_tifs = str(BASE_PATH / 'merged_tif')\n",
    "folder_for_virtual_raster = str(BASE_PATH / 'merged_virtual')\n",
    "geojson_file = Path(BASE_PATH /'Data'/ 'pivot.geojson')\n",
    "\n",
    "# Create directories if they don't exist\n",
    "for directory in [BASE_PATH_SINGLE_IMAGES, OCM_MASKS_DIR, \n",
    "                 Path(folder_for_merged_tifs), Path(folder_for_virtual_raster)]:\n",
    "    directory.mkdir(exist_ok=True, parents=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a69df5ab",
   "metadata": {},
   "source": [
    "## 3. Define OmniCloudMask Functions\n",
    "\n",
    "Here we implement the functionality to use OmniCloudMask for cloud/shadow detection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "51f33368",
   "metadata": {},
   "outputs": [],
   "source": [
    "def process_with_ocm(image_path, output_dir=None, save_mask=True, resample_res=10):\n",
    "    \"\"\"\n",
    "    Process a PlanetScope image with OmniCloudMask\n",
    "    \n",
    "    Parameters:\n",
    "    -----------\n",
    "    image_path : str or Path\n",
    "        Path to the PlanetScope image (TIFF format)\n",
    "    output_dir : str or Path, optional\n",
    "        Directory to save the mask, if None, uses same directory as image\n",
    "    save_mask : bool, default=True\n",
    "        Whether to save the mask to disk\n",
    "    resample_res : int, default=10\n",
    "        Resolution in meters to resample the image to (OCM works best at 10m)\n",
    "        \n",
    "    Returns:\n",
    "    --------\n",
    "    tuple: (mask_array, profile)\n",
    "        The cloud/shadow mask as a numpy array and the rasterio profile\n",
    "    \"\"\"\n",
    "    if not HAS_OCM:\n",
    "        print(\"OmniCloudMask not available. Please install with pip install omnicloudmask\")\n",
    "        return None, None\n",
    "    \n",
    "    # Ensure image_path is a Path object\n",
    "    image_path = Path(image_path)\n",
    "    \n",
    "    # If no output directory specified, use same directory as image\n",
    "    if output_dir is None:\n",
    "        output_dir = image_path.parent\n",
    "    else:\n",
    "        output_dir = Path(output_dir)\n",
    "        output_dir.mkdir(exist_ok=True, parents=True)\n",
    "    \n",
    "    # Define output path for mask\n",
    "    mask_path = output_dir / f\"{image_path.stem}_ocm_mask.tif\"\n",
    "    \n",
    "    try:\n",
    "        # For PlanetScope 4-band images, bands are [B,G,R,NIR]\n",
    "        # We need [R,G,NIR] for OmniCloudMask in this order\n",
    "        # Set band_order=[3, 2, 4] for the standard 4-band PlanetScope imagery\n",
    "        band_order = [3, 2, 4]  # For 4-band images: [R,G,NIR]\n",
    "        \n",
    "        # Load and resample image\n",
    "        print(f\"Loading image: {image_path}\")\n",
    "        rgn_data, profile = load_multiband(\n",
    "            input_path=image_path,\n",
    "            resample_res=resample_res,\n",
    "            band_order=band_order\n",
    "        )\n",
    "        \n",
    "        # Generate cloud and shadow mask\n",
    "        print(\"Applying OmniCloudMask...\")\n",
    "        prediction = predict_from_array(rgn_data)\n",
    "        \n",
    "        # Save the mask if requested\n",
    "        if save_mask:\n",
    "            profile.update(count=1, dtype='uint8')\n",
    "            with rio.open(mask_path, 'w', **profile) as dst:\n",
    "                dst.write(prediction.astype('uint8'), 1)\n",
    "            print(f\"Saved mask to: {mask_path}\")\n",
    "            \n",
    "        # Summary of detected features\n",
    "        n_total = prediction.size\n",
    "        n_clear = np.sum(prediction == 0)\n",
    "        n_thick = np.sum(prediction == 1)\n",
    "        n_thin = np.sum(prediction == 2)\n",
    "        n_shadow = np.sum(prediction == 3)\n",
    "        \n",
    "        print(f\"OCM Classification Results:\")\n",
    "        print(f\"  Clear pixels: {n_clear} ({100*n_clear/n_total:.1f}%)\")\n",
    "        print(f\"  Thick clouds: {n_thick} ({100*n_thick/n_total:.1f}%)\")\n",
    "        print(f\"  Thin clouds: {n_thin} ({100*n_thin/n_total:.1f}%)\")\n",
    "        print(f\"  Cloud shadows: {n_shadow} ({100*n_shadow/n_total:.1f}%)\")\n",
    "        \n",
    "        return prediction, profile\n",
    "        \n",
    "    except Exception as e:\n",
    "        print(f\"Error processing image with OmniCloudMask: {str(e)}\")\n",
    "        return None, None"
   ]
  },
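  {
   "cell_type": "markdown",
   "id": "c4a91b7e",
   "metadata": {},
   "source": [
    "The per-class summary printed by `process_with_ocm` can also be computed in a single pass. The cell below is a minimal sketch on a synthetic mask, assuming the class codes used elsewhere in this notebook (0 = clear, 1 = thick cloud, 2 = thin cloud, 3 = cloud shadow). `summarize_mask` and `OCM_CLASSES` are hypothetical helper names, not part of OmniCloudMask."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d5b02c8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Class codes as used in this notebook (assumed, see process_with_ocm)\n",
    "OCM_CLASSES = {0: \"clear\", 1: \"thick cloud\", 2: \"thin cloud\", 3: \"cloud shadow\"}\n",
    "\n",
    "def summarize_mask(mask):\n",
    "    \"\"\"Return the fraction of pixels per OCM class for an integer class mask.\"\"\"\n",
    "    flat = np.asarray(mask).astype(np.int64).ravel()\n",
    "    counts = np.bincount(flat, minlength=len(OCM_CLASSES))\n",
    "    return {name: counts[code] / flat.size for code, name in OCM_CLASSES.items()}\n",
    "\n",
    "# Synthetic 2 x 4 mask: 4 clear, 2 thick cloud, 1 thin cloud, 1 shadow\n",
    "demo_mask = np.array([[0, 0, 1, 1],\n",
    "                      [0, 0, 2, 3]], dtype=np.uint8)\n",
    "fractions = summarize_mask(demo_mask)\n",
    "for name, fraction in fractions.items():\n",
    "    print(f\"{name}: {100 * fraction:.1f}%\")"
   ]
  },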
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "bac2f620",
   "metadata": {},
   "outputs": [],
   "source": [
    "def apply_ocm_mask_to_image(image_path, mask_array, output_path=None):\n",
    "    \"\"\"\n",
    "    Apply an OmniCloudMask to a Planet image and save the masked version\n",
    "    \n",
    "    Parameters:\n",
    "    -----------\n",
    "    image_path : str or Path\n",
    "        Path to the input image\n",
    "    mask_array : numpy.ndarray\n",
    "        The cloud/shadow mask from OmniCloudMask\n",
    "    output_path : str or Path, optional\n",
    "        Path to save the masked image, if None, uses image_path with '_masked' suffix\n",
    "        \n",
    "    Returns:\n",
    "    --------\n",
    "    str: Path to the masked image\n",
    "    \"\"\"\n",
    "    image_path = Path(image_path)\n",
    "    \n",
    "    if output_path is None:\n",
    "        output_path = image_path.parent / f\"{image_path.stem}_masked.tif\"\n",
    "    \n",
    "    try:\n",
    "        # Open the original image\n",
    "        with rio.open(image_path) as src:\n",
    "            data = src.read()\n",
    "            profile = src.profile.copy()\n",
    "            \n",
    "        # Check dimensions match or make them match\n",
    "        if data.shape[1:] != mask_array.shape:\n",
    "            # Need to resample the mask\n",
    "            from rasterio.warp import reproject, Resampling\n",
    "            # TODO: Implement resampling if needed\n",
    "            print(\"Warning: Mask and image dimensions don't match\")\n",
    "        \n",
    "        # Create a binary mask (0 = cloud/shadow, 1 = clear)\n",
    "        # OmniCloudMask: 0=clear, 1=thick cloud, 2=thin cloud, 3=shadow\n",
    "        binary_mask = np.ones_like(mask_array)\n",
    "        binary_mask[mask_array > 0] = 0  # Set non-clear pixels to 0\n",
    "        \n",
    "        # Apply the mask to all bands\n",
    "        masked_data = data.copy()\n",
    "        for i in range(data.shape[0]):\n",
    "            # Where mask is 0, set the pixel to nodata\n",
    "            masked_data[i][binary_mask == 0] = profile.get('nodata', 0)\n",
    "        \n",
    "        # Write the masked image\n",
    "        with rio.open(output_path, 'w', **profile) as dst:\n",
    "            dst.write(masked_data)\n",
    "            \n",
    "        print(f\"Masked image saved to: {output_path}\")\n",
    "        return str(output_path)\n",
    "        \n",
    "    except Exception as e:\n",
    "        print(f\"Error applying mask to image: {str(e)}\")\n",
    "        return None"
   ]
  },
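  {
   "cell_type": "markdown",
   "id": "e6c13d90",
   "metadata": {},
   "source": [
    "`apply_ocm_mask_to_image` needs the mask on the same pixel grid as the image. As a minimal sketch of how a coarser mask could be brought to the image grid, the cell below does a nearest-neighbour resize with plain NumPy. It assumes the mask and the image cover exactly the same extent. `resize_mask_nearest` is a hypothetical helper, not part of OmniCloudMask or rasterio. When the extents differ, a georeferenced reprojection is needed instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f7d24ea1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def resize_mask_nearest(mask, target_shape):\n",
    "    \"\"\"Nearest-neighbour resize of a 2D class mask to target_shape (rows, cols).\"\"\"\n",
    "    mask = np.squeeze(mask)\n",
    "    src_rows, src_cols = mask.shape\n",
    "    dst_rows, dst_cols = target_shape\n",
    "    # Map the centre of every target pixel back to the source pixel that contains it\n",
    "    row_idx = ((np.arange(dst_rows) + 0.5) * src_rows / dst_rows).astype(int)\n",
    "    col_idx = ((np.arange(dst_cols) + 0.5) * src_cols / dst_cols).astype(int)\n",
    "    row_idx = np.minimum(row_idx, src_rows - 1)\n",
    "    col_idx = np.minimum(col_idx, src_cols - 1)\n",
    "    return mask[np.ix_(row_idx, col_idx)]\n",
    "\n",
    "# Synthetic example: a 2 x 2 mask upsampled to 4 x 4\n",
    "demo_mask = np.array([[0, 1],\n",
    "                      [3, 0]], dtype=np.uint8)\n",
    "demo_resized = resize_mask_nearest(demo_mask, (4, 4))\n",
    "print(demo_resized)"
   ]
  },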
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "ad6770ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "def process_all_images_with_ocm(directory, output_dir=None, pattern=\"*.tif\"):\n",
    "    \"\"\"\n",
    "    Process all images in a directory with OmniCloudMask\n",
    "    \n",
    "    Parameters:\n",
    "    -----------\n",
    "    directory : str or Path\n",
    "        Directory containing PlanetScope images\n",
    "    output_dir : str or Path, optional\n",
    "        Directory to save results, defaults to a subfolder of input directory\n",
    "    pattern : str, default=\"*.tif\"\n",
    "        Glob pattern to match image files\n",
    "        \n",
    "    Returns:\n",
    "    --------\n",
    "    list: Paths to processed images\n",
    "    \"\"\"\n",
    "    directory = Path(directory)\n",
    "    \n",
    "    if output_dir is None:\n",
    "        output_dir = directory / \"ocm_processed\"\n",
    "    else:\n",
    "        output_dir = Path(output_dir)\n",
    "        \n",
    "    output_dir.mkdir(exist_ok=True, parents=True)\n",
    "    \n",
    "    # Find all matching image files\n",
    "    image_files = list(directory.glob(pattern))\n",
    "    \n",
    "    if not image_files:\n",
    "        print(f\"No files matching pattern '{pattern}' found in {directory}\")\n",
    "        return []\n",
    "    \n",
    "    print(f\"Found {len(image_files)} images to process\")\n",
    "    processed_images = []\n",
    "    \n",
    "    # Process each image\n",
    "    for img_path in image_files:\n",
    "        print(f\"\\nProcessing: {img_path.name}\")\n",
    "        mask_array, profile = process_with_ocm(img_path, output_dir=output_dir)\n",
    "        \n",
    "        if mask_array is not None:\n",
    "            # Apply mask to create cloud-free image\n",
    "            output_path = output_dir / f\"{img_path.stem}_masked.tif\"\n",
    "            masked_path = apply_ocm_mask_to_image(img_path, mask_array, output_path)\n",
    "            if masked_path:\n",
    "                processed_images.append(masked_path)\n",
    "    \n",
    "    return processed_images"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46e34d74",
   "metadata": {},
   "source": [
    "## 4. Define functions from the original notebook (modified for OCM integration)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "85e07fa8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define evalscripts (from original notebook)\n",
    "\n",
    "# Original evalscript without cloud/shadow detection (for comparison)\n",
    "evalscript_original = \"\"\"\n",
    "    //VERSION=3\n",
    "    function setup() {\n",
    "        return {\n",
    "            input: [{\n",
    "                bands: [\"red\", \"green\", \"blue\", \"nir\", \"udm1\"]\n",
    "            }],\n",
    "            output: {\n",
    "                bands: 4,\n",
    "                sampleType: \"FLOAT32\"\n",
    "            }\n",
    "        };\n",
    "    }\n",
    "\n",
    "    function evaluatePixel(sample) {\n",
    "        // Scale the bands\n",
    "        var scaledBlue = 2.5 * sample.blue / 10000;\n",
    "        var scaledGreen = 2.5 * sample.green / 10000;\n",
    "        var scaledRed = 2.5 * sample.red / 10000;\n",
    "        var scaledNIR = 2.5 * sample.nir / 10000;\n",
    "        \n",
    "        // Only use udm1 mask (Planet's usable data mask)\n",
    "        if (sample.udm1 == 0) {\n",
    "            return [scaledRed, scaledGreen, scaledBlue, scaledNIR];\n",
    "        } else {\n",
    "            return [NaN, NaN, NaN, NaN];\n",
    "        }\n",
    "    }\n",
    "\"\"\"\n",
    "\n",
    "# Placeholder for code to be replaced by OCM-processed imagery later\n",
    "evalscript_true_color = evalscript_original"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "9dee95dd",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_true_color_request_day(time_interval, bbox, size):\n",
    "    \"\"\"Request with original evalscript (will be replaced by OCM results later)\"\"\"\n",
    "    return SentinelHubRequest(\n",
    "        evalscript=evalscript_true_color,\n",
    "        input_data=[\n",
    "            SentinelHubRequest.input_data(\n",
    "                data_collection=DataCollection.planet_data2,\n",
    "                time_interval=(time_interval, time_interval)\n",
    "            )\n",
    "        ],\n",
    "        responses=[\n",
    "            SentinelHubRequest.output_response('default', MimeType.TIFF)\n",
    "        ],\n",
    "        bbox=bbox,\n",
    "        size=size,\n",
    "        config=config,\n",
    "        data_folder=str(BASE_PATH_SINGLE_IMAGES / time_interval),\n",
    "    )\n",
    "\n",
    "def get_original_request_day(time_interval, bbox, size):\n",
    "    \"\"\"Request with Planet UDM-only mask (for comparison)\"\"\"\n",
    "    return SentinelHubRequest(\n",
    "        evalscript=evalscript_original,\n",
    "        input_data=[\n",
    "            SentinelHubRequest.input_data(\n",
    "                data_collection=DataCollection.planet_data2,\n",
    "                time_interval=(time_interval, time_interval)\n",
    "            )\n",
    "        ],\n",
    "        responses=[\n",
    "            SentinelHubRequest.output_response('default', MimeType.TIFF)\n",
    "        ],\n",
    "        bbox=bbox,\n",
    "        size=size,\n",
    "        config=config,\n",
    "    )\n",
    "\n",
    "def download_function(slot, bbox, size):\n",
    "    \"\"\"Download imagery for a given date and bbox\"\"\"\n",
    "    list_of_requests = [get_true_color_request_day(slot, bbox, size)]\n",
    "    list_of_requests = [request.download_list[0] for request in list_of_requests]\n",
    "    data = SentinelHubDownloadClient(config=config).download(list_of_requests, max_threads=15)\n",
    "    print(f'Image downloaded for {slot} and bbox {str(bbox)}')\n",
    "    time.sleep(.1)\n",
    "    \n",
    "def merge_files(slot):\n",
    "    \"\"\"Merge downloaded tiles into a single image\"\"\"\n",
    "    # Get all response.tiff files\n",
    "    slot_folder = Path(BASE_PATH_SINGLE_IMAGES / slot)\n",
    "    if not slot_folder.exists():\n",
    "        raise ValueError(f\"Folder not found: {slot_folder}\")\n",
    "    \n",
    "    file_list = [f\"{x}/response.tiff\" for x in slot_folder.iterdir() if Path(f\"{x}/response.tiff\").exists()]\n",
    "    \n",
    "    if not file_list:\n",
    "        raise ValueError(f\"No response.tiff files found in {slot_folder}\")\n",
    "    \n",
    "    print(f\"Found {len(file_list)} files to merge\")\n",
    "    \n",
    "    folder_for_merged_tifs = str(BASE_PATH / 'merged_tif' / f\"{slot}.tif\")\n",
    "    folder_for_virtual_raster = str(BASE_PATH / 'merged_virtual' / f\"merged{slot}.vrt\")\n",
    "    \n",
    "    # Make sure parent directories exist\n",
    "    Path(folder_for_merged_tifs).parent.mkdir(exist_ok=True, parents=True)\n",
    "    Path(folder_for_virtual_raster).parent.mkdir(exist_ok=True, parents=True)\n",
    "\n",
    "    try:\n",
    "        # Create a virtual raster\n",
    "        print(f\"Building VRT from {len(file_list)} files\")\n",
    "        vrt_all = gdal.BuildVRT(folder_for_virtual_raster, file_list)\n",
    "        \n",
    "        if vrt_all is None:\n",
    "            raise ValueError(f\"Failed to create virtual raster: {folder_for_virtual_raster}\")\n",
    "        \n",
    "        # Write VRT to disk\n",
    "        vrt_all.FlushCache()\n",
    "        \n",
    "        # Convert to GeoTIFF\n",
    "        print(f\"Translating VRT to GeoTIFF: {folder_for_merged_tifs}\")\n",
    "        result = gdal.Translate(\n",
    "            folder_for_merged_tifs,\n",
    "            folder_for_virtual_raster,\n",
    "            xRes=10,\n",
    "            yRes=10,\n",
    "            resampleAlg=\"bilinear\"  # or \"nearest\" if you prefer\n",
    "        )\n",
    "        \n",
    "        if result is None:\n",
    "            raise ValueError(f\"Failed to translate VRT to GeoTIFF: {folder_for_merged_tifs}\")\n",
    "        \n",
    "        # Make sure the file was created\n",
    "        if not Path(folder_for_merged_tifs).exists():\n",
    "            raise ValueError(f\"Output GeoTIFF file was not created: {folder_for_merged_tifs}\")\n",
    "            \n",
    "        return folder_for_merged_tifs\n",
    "    except Exception as e:\n",
    "        print(f\"Error during merging: {str(e)}\")\n",
    "        # If we have individual files but merging failed, return the first one as a fallback\n",
    "        if file_list:\n",
    "            print(f\"Returning first file as fallback: {file_list[0]}\")\n",
    "            return file_list[0]\n",
    "        raise"
   ]
  },
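  {
   "cell_type": "markdown",
   "id": "a8e35fb2",
   "metadata": {},
   "source": [
    "The merged tiles are in EPSG:4326, where pixel sizes are expressed in degrees, so a target resolution in metres has to be converted before it can be passed to GDAL as `xRes`/`yRes`. The cell below is a rough sketch using a spherical approximation of about 111,320 m per degree. `metres_to_degrees` is a hypothetical helper and its result is only approximate. Reprojecting to a metric CRS (for example the local UTM zone) is the more accurate route."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9f46ac3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "METRES_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude\n",
    "\n",
    "def metres_to_degrees(resolution_m, latitude_deg):\n",
    "    \"\"\"Approximate (x_res, y_res) in degrees for a pixel size given in metres.\"\"\"\n",
    "    y_res = resolution_m / METRES_PER_DEGREE\n",
    "    # A degree of longitude shrinks with the cosine of the latitude\n",
    "    x_res = resolution_m / (METRES_PER_DEGREE * math.cos(math.radians(latitude_deg)))\n",
    "    return x_res, y_res\n",
    "\n",
    "# Example: 3 m pixels at about 17.3 degrees south\n",
    "x_res, y_res = metres_to_degrees(3, -17.3)\n",
    "print(f\"xRes = {x_res:.9f} deg, yRes = {y_res:.9f} deg\")"
   ]
  },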
  {
   "cell_type": "markdown",
   "id": "d21a132b",
   "metadata": {},
   "source": [
    "## 5. Setup date ranges and test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "c00fc762",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time windows to process:\n",
      "\n",
      "2025-04-17\n",
      "2025-04-18\n",
      "2025-04-19\n",
      "...\n",
      "2025-05-14\n",
      "2025-05-15\n",
      "2025-05-16\n"
     ]
    }
   ],
   "source": [
    "# Configure date ranges (from original notebook)\n",
    "days_needed = int(os.environ.get(\"DAYS\", days))\n",
    "date_str = os.environ.get(\"DATE\")\n",
    "\n",
    "if date_str:\n",
    "    end = datetime.datetime.strptime(date_str, \"%Y-%m-%d\").date()\n",
    "else:\n",
    "    end = datetime.date.today()    \n",
    "\n",
    "start = end - datetime.timedelta(days=days_needed - 1)\n",
    "slots = [(start + datetime.timedelta(days=i)).strftime('%Y-%m-%d') for i in range(days_needed)]\n",
    "\n",
    "print('Time windows to process:\\n')\n",
    "if len(slots) > 10:\n",
    "    for slot in slots[:3]:\n",
    "        print(slot)\n",
    "    print(\"...\")\n",
    "    for slot in slots[-3:]:\n",
    "        print(slot)\n",
    "else:\n",
    "    for slot in slots:\n",
    "        print(slot)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "8947de86",
   "metadata": {},
   "outputs": [],
   "source": [
    "# For testing, use a specific date with known clouds/shadows\n",
    "# Comment this out to process all dates defined above\n",
    "slots = ['2024-10-22']  # Change to a date with clouds/shadows in your area"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ede9e761",
   "metadata": {},
   "source": [
    "## 6. Load geospatial data and prepare for processing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "485e5fa1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Area bounding box: BBox(((-47.09879025717693, -22.67132809994226), (-47.09188307701802, -22.66813642658124)), crs=CRS('4326'))\n"
     ]
    }
   ],
   "source": [
    "# Load field boundaries and prepare bounding boxes\n",
    "geo_json = gpd.read_file(str(geojson_file))\n",
    "geometries = [Geometry(geometry, crs=CRS.WGS84) for geometry in geo_json.geometry]\n",
    "shapely_geometries = [geometry.geometry for geometry in geometries]\n",
    "\n",
    "# Split area into manageable bounding boxes\n",
    "bbox_splitter = BBoxSplitter(\n",
    "    shapely_geometries, CRS.WGS84, (1, 1), reduce_bbox_sizes=True\n",
    ")\n",
    "print(\"Area bounding box:\", bbox_splitter.get_area_bbox().__repr__())\n",
    "bbox_list = bbox_splitter.get_bbox_list()\n",
    "info_list = bbox_splitter.get_info_list()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "0eb2ccf1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['2024-12-30']\n",
      "Total slots: 1\n",
      "Available slots: 1\n",
      "Excluded slots due to empty dates: 0\n"
     ]
    }
   ],
   "source": [
    "# Function to check if images are available for each date\n",
    "def is_image_available(date):\n",
    "    for bbox in bbox_list:\n",
    "        search_iterator = catalog.search(\n",
    "            collection=byoc,\n",
    "            bbox=bbox,\n",
    "            time=(date, date)\n",
    "        )\n",
    "        if len(list(search_iterator)) > 0:\n",
    "            return True\n",
    "    return False\n",
    "\n",
    "# Filter slots to only include dates with available images\n",
    "available_slots = [slot for slot in slots if is_image_available(slot)]\n",
    "comparison_slots = available_slots[:min(5, len(available_slots))]\n",
    "\n",
    "print(available_slots)\n",
    "print(f\"Total slots: {len(slots)}\")\n",
    "print(f\"Available slots: {len(available_slots)}\")\n",
    "print(f\"Excluded slots due to empty dates: {len(slots) - len(available_slots)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d628f797",
   "metadata": {},
   "source": [
    "## 7. Download and process images"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "8966f944",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Downloading images for date: 2024-12-30\n",
      "  Processing bbox 1/1\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "c:\\Users\\timon\\anaconda3\\Lib\\site-packages\\sentinelhub\\geometry.py:137: SHDeprecationWarning: Initializing `BBox` objects from `BBox` objects will no longer be possible in future versions.\n",
      "  return cls._tuple_from_bbox(bbox)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Image downloaded for 2024-12-30 and bbox -47.09879025717693,-22.67132809994226,-47.09188307701802,-22.66813642658124\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\timon\\AppData\\Local\\Temp\\ipykernel_25312\\3091203660.py:43: SHDeprecationWarning: The string representation of `BBox` will change to match its `repr` representation.\n",
      "  print(f'Image downloaded for {slot} and bbox {str(bbox)}')\n"
     ]
    }
   ],
   "source": [
    "# Download images\n",
    "resolution = 10  # Using 10m resolution for better OmniCloudMask results\n",
    "\n",
    "for slot in available_slots:\n",
    "    print(f\"\\nDownloading images for date: {slot}\")\n",
    "    \n",
    "    for i, bbox in enumerate(bbox_list):\n",
    "        bbox_obj = BBox(bbox=bbox, crs=CRS.WGS84)\n",
    "        size = bbox_to_dimensions(bbox_obj, resolution=resolution)\n",
    "        print(f\"  Processing bbox {i+1}/{len(bbox_list)}\")\n",
    "        download_function(slot, bbox_obj, size)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "43a8b55e",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "c:\\Users\\timon\\anaconda3\\Lib\\site-packages\\sentinelhub\\geometry.py:137: SHDeprecationWarning: Initializing `BBox` objects from `BBox` objects will no longer be possible in future versions.\n",
      "  return cls._tuple_from_bbox(bbox)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Image downloaded for 2024-12-30 and bbox -47.09879025717693,-22.67132809994226,-47.09188307701802,-22.66813642658124\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\timon\\AppData\\Local\\Temp\\ipykernel_25312\\3091203660.py:43: SHDeprecationWarning: The string representation of `BBox` will change to match its `repr` representation.\n",
      "  print(f'Image downloaded for {slot} and bbox {str(bbox)}')\n"
     ]
    }
   ],
   "source": [
    "resolution = 3\n",
    "\n",
    "for slot in available_slots:\n",
    "    for bbox in bbox_list:\n",
    "        bbox = BBox(bbox=bbox, crs=CRS.WGS84)\n",
    "        size = bbox_to_dimensions(bbox, resolution=resolution)\n",
    "        download_function(slot, bbox, size)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "f15f04f3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 2 files to merge\n",
      "Building VRT from 2 files\n",
      "Translating VRT to GeoTIFF: ..\\laravel_app\\storage\\app\\citrus_brazil_trial\\merged_tif\\2024-12-30.tif\n",
      "Error during merging: Failed to translate VRT to GeoTIFF: ..\\laravel_app\\storage\\app\\citrus_brazil_trial\\merged_tif\\2024-12-30.tif\n",
      "Returning first file as fallback: ..\\laravel_app\\storage\\app\\citrus_brazil_trial\\single_images\\2024-12-30\\0aeb88ec276c5a05278127eb769d73ec/response.tiff\n"
     ]
    }
   ],
   "source": [
    "for slot in available_slots:\n",
    "    merge_files(slot)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee0ae99e",
   "metadata": {},
   "source": [
    "## 8. Clean up intermediate files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0fe25a4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Clean up intermediate files if requested\n",
    "folders_to_empty = [BASE_PATH / 'merged_virtual', BASE_PATH_SINGLE_IMAGES]\n",
    "\n",
    "def empty_folders(folders, run=True):\n",
    "    if not run:\n",
    "        print(\"Skipping empty_folders function.\")\n",
    "        return\n",
    "    \n",
    "    for folder in folders:\n",
    "        try:\n",
    "            for filename in os.listdir(folder):\n",
    "                file_path = os.path.join(folder, filename)\n",
    "                try:\n",
    "                    if os.path.isfile(file_path):\n",
    "                        os.unlink(file_path)\n",
    "                    elif os.path.isdir(file_path):\n",
    "                        shutil.rmtree(file_path)\n",
    "                except Exception as e:\n",
    "                    print(f\"Error: {e}\")\n",
    "            print(f\"Emptied folder: {folder}\")\n",
    "        except OSError as e:\n",
    "            print(f\"Error: {e}\")\n",
    "\n",
    "# Call the function to empty folders only if requested\n",
    "empty_folders(folders_to_empty, run=False)  # Change to True if you want to clean up"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25638297",
   "metadata": {},
   "source": [
    "## 9. Visualize and compare cloud masks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "7d3a73e4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processing ..\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif with c:\\\\Users\\\\timon\\\\Resilience BV\\\\4020 SCane ESA DEMO - Documenten\\\\General\\\\4020 SCDEMO Team\\\\4020 TechnicalData\\\\WP3\\\\smartcane\\\\python_app\\\\planet_ocm_processor.py...\n",
      "Input image: ..\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif\n",
      "Output directory: ..\\laravel_app\\storage\\app\\chemba\\ocm_masks\n",
      "--- Running gdalinfo for 2024-10-22.tif ---\n",
      "--- gdalinfo STDOUT ---\n",
      "Driver: GTiff/GeoTIFF\n",
      "Files: ..\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif\n",
      "Size is 3605, 2162\n",
      "Coordinate System is:\n",
      "GEOGCRS[\"WGS 84\",\n",
      "    ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n",
      "        MEMBER[\"World Geodetic System 1984 (Transit)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G730)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G873)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G1150)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G1674)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G1762)\"],\n",
      "        MEMBER[\"World Geodetic System 1984 (G2139)\"],\n",
      "        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n",
      "            LENGTHUNIT[\"metre\",1]],\n",
      "        ENSEMBLEACCURACY[2.0]],\n",
      "    PRIMEM[\"Greenwich\",0,\n",
      "        ANGLEUNIT[\"degree\",0.0174532925199433]],\n",
      "    CS[ellipsoidal,2],\n",
      "        AXIS[\"geodetic latitude (Lat)\",north,\n",
      "            ORDER[1],\n",
      "            ANGLEUNIT[\"degree\",0.0174532925199433]],\n",
      "        AXIS[\"geodetic longitude (Lon)\",east,\n",
      "            ORDER[2],\n",
      "            ANGLEUNIT[\"degree\",0.0174532925199433]],\n",
      "    USAGE[\n",
      "        SCOPE[\"Horizontal component of 3D system.\"],\n",
      "        AREA[\"World.\"],\n",
      "        BBOX[-90,-180,90,180]],\n",
      "    ID[\"EPSG\",4326]]\n",
      "Data axis to CRS axis mapping: 2,1\n",
      "Origin = (34.883117060422094,-17.291731714592061)\n",
      "Pixel Size = (0.000027942347249,-0.000027653607237)\n",
      "Metadata:\n",
      "  AREA_OR_POINT=Area\n",
      "Image Structure Metadata:\n",
      "  INTERLEAVE=PIXEL\n",
      "Corner Coordinates:\n",
      "Upper Left  (  34.8831171, -17.2917317) ( 34d52'59.22\"E, 17d17'30.23\"S)\n",
      "Lower Left  (  34.8831171, -17.3515188) ( 34d52'59.22\"E, 17d21' 5.47\"S)\n",
      "Upper Right (  34.9838492, -17.2917317) ( 34d59' 1.86\"E, 17d17'30.23\"S)\n",
      "Lower Right (  34.9838492, -17.3515188) ( 34d59' 1.86\"E, 17d21' 5.47\"S)\n",
      "Center      (  34.9334831, -17.3216253) ( 34d56' 0.54\"E, 17d19'17.85\"S)\n",
      "Band 1 Block=3605x1 Type=Byte, ColorInterp=Gray\n",
      "Band 2 Block=3605x1 Type=Byte, ColorInterp=Undefined\n",
      "Band 3 Block=3605x1 Type=Byte, ColorInterp=Undefined\n",
      "Band 4 Block=3605x1 Type=Byte, ColorInterp=Undefined\n",
      "\n",
      "--- Attempting to run OCM processor for 2024-10-22.tif ---\n",
      "--- Script STDOUT ---\n",
      "--- Starting OCM processing for 2024-10-22.tif ---\n",
      "Input 3m image (absolute): C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif\n",
      "Output base directory (absolute): C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\n",
      "Intermediate 10m image path: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Resampling C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif to (10, 10)m resolution -> C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Reprojected raster saved to: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m_reprojected.tif\n",
      "Successfully resampled image saved to: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Loading 10m image for OCM: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Applying OmniCloudMask...\n",
      "Error processing 10m image with OmniCloudMask: Source shape (1, 1, 673, 1078) is inconsistent with given indexes 1\n",
      "OCM processing failed. Exiting.\n",
      "\n",
      "--- Script STDERR ---\n",
      "c:\\Users\\timon\\anaconda3\\Lib\\site-packages\\omnicloudmask\\cloud_mask.py:145: UserWarning: Significant no-data areas detected. Adjusting patch size to 336px and overlap to 168px to minimize no-data patches.\n",
      "  warnings.warn(\n",
      "\n",
      "Successfully processed 2024-10-22.tif with c:\\\\Users\\\\timon\\\\Resilience BV\\\\4020 SCane ESA DEMO - Documenten\\\\General\\\\4020 SCDEMO Team\\\\4020 TechnicalData\\\\WP3\\\\smartcane\\\\python_app\\\\planet_ocm_processor.py\n",
      "--- Script STDOUT ---\n",
      "--- Starting OCM processing for 2024-10-22.tif ---\n",
      "Input 3m image (absolute): C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif\n",
      "Output base directory (absolute): C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\n",
      "Intermediate 10m image path: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Resampling C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\merged_tif\\2024-10-22.tif to (10, 10)m resolution -> C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Reprojected raster saved to: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m_reprojected.tif\n",
      "Successfully resampled image saved to: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Loading 10m image for OCM: C:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\laravel_app\\storage\\app\\chemba\\ocm_masks\\intermediate_ocm_files\\2024-10-22_10m.tif\n",
      "Applying OmniCloudMask...\n",
      "Error processing 10m image with OmniCloudMask: Source shape (1, 1, 673, 1078) is inconsistent with given indexes 1\n",
      "OCM processing failed. Exiting.\n",
      "\n",
      "--- Script STDERR ---\n",
      "c:\\Users\\timon\\anaconda3\\Lib\\site-packages\\omnicloudmask\\cloud_mask.py:145: UserWarning: Significant no-data areas detected. Adjusting patch size to 336px and overlap to 168px to minimize no-data patches.\n",
      "  warnings.warn(\n",
      "\n",
      "Successfully processed 2024-10-22.tif with c:\\\\Users\\\\timon\\\\Resilience BV\\\\4020 SCane ESA DEMO - Documenten\\\\General\\\\4020 SCDEMO Team\\\\4020 TechnicalData\\\\WP3\\\\smartcane\\\\python_app\\\\planet_ocm_processor.py\n"
     ]
    }
   ],
   "source": [
    "import subprocess\n",
    "import sys  # Provides sys.executable, so the script runs with the notebook's interpreter\n",
    "\n",
    "# Path to the OCM processing script (raw string, so each backslash is written once)\n",
    "script_path = r\"c:\\Users\\timon\\Resilience BV\\4020 SCane ESA DEMO - Documenten\\General\\4020 SCDEMO Team\\4020 TechnicalData\\WP3\\smartcane\\python_app\\planet_ocm_processor.py\"\n",
    "\n",
    "# Directory containing the recently downloaded images (merged TIFFs)\n",
    "images_dir = BASE_PATH / 'merged_tif'\n",
    "\n",
    "# Output directory for OCM processor (defined in cell 8)\n",
    "# OCM_MASKS_DIR should be defined earlier in your notebook, e.g.,\n",
    "# OCM_MASKS_DIR = Path(BASE_PATH / 'ocm_masks')\n",
    "# OCM_MASKS_DIR.mkdir(exist_ok=True, parents=True) # Ensure it exists\n",
    "available_slots = [\"2024-10-22\"]  # Change this to the available slots you want to process\n",
    "# Run the script for each available slot (date)\n",
    "for slot in available_slots:\n",
    "    image_file = images_dir / f\"{slot}.tif\"\n",
    "    if image_file.exists():\n",
    "        print(f\"Processing {image_file} with {script_path}...\")\n",
    "        print(f\"Input image: {str(image_file)}\")\n",
    "        print(f\"Output directory: {str(OCM_MASKS_DIR)}\")\n",
    "        \n",
    "        try:\n",
    "            # Run gdalinfo to inspect the image before processing\n",
    "            print(f\"--- Running gdalinfo for {image_file.name} ---\")\n",
    "            gdalinfo_result = subprocess.run(\n",
    "                [\"gdalinfo\", str(image_file)],\n",
    "                capture_output=True,\n",
    "                text=True,\n",
    "                check=True\n",
    "            )\n",
    "            print(\"--- gdalinfo STDOUT ---\")\n",
    "            print(gdalinfo_result.stdout)\n",
    "            if gdalinfo_result.stderr:\n",
    "                print(\"--- gdalinfo STDERR ---\")\n",
    "                print(gdalinfo_result.stderr)\n",
    "        except subprocess.CalledProcessError as e:\n",
    "            print(f\"gdalinfo failed for {image_file.name}:\")\n",
    "            print(\"--- gdalinfo STDOUT ---\")\n",
    "            print(e.stdout)\n",
    "            print(\"--- gdalinfo STDERR ---\")\n",
    "            print(e.stderr)\n",
    "            # Decide if you want to continue to the next image or stop\n",
    "            # continue \n",
    "        except FileNotFoundError:\n",
    "            print(\"Error: gdalinfo command not found. Make sure GDAL is installed and in your system's PATH.\")\n",
    "            # Decide if you want to continue or stop\n",
    "            # break # or continue\n",
    "        \n",
    "        print(f\"--- Attempting to run OCM processor for {image_file.name} ---\")\n",
    "        try:\n",
    "            # Run the script, passing the image file and OCM_MASKS_DIR as arguments\n",
    "            process_result = subprocess.run(\n",
    "                [sys.executable, str(script_path), str(image_file), str(OCM_MASKS_DIR)], \n",
    "                capture_output=True, # Capture stdout and stderr\n",
    "                text=True, # Decode output as text\n",
    "                check=False # Do not raise an exception for non-zero exit codes, we'll check manually\n",
    "            )\n",
    "            \n",
    "            # Print the output from the script\n",
    "            print(\"--- Script STDOUT ---\")\n",
    "            print(process_result.stdout)\n",
    "            \n",
    "            if process_result.stderr:\n",
    "                print(\"--- Script STDERR ---\")\n",
    "                print(process_result.stderr)\n",
    "                \n",
    "            if process_result.returncode != 0:\n",
    "                print(f\"Error: Script {script_path} failed for {image_file.name} with exit code {process_result.returncode}\")\n",
    "            else:\n",
    "                print(f\"Successfully processed {image_file.name} with {script_path}\")\n",
    "                \n",
    "        except subprocess.CalledProcessError as e:\n",
    "            # This block will be executed if check=True and the script returns a non-zero exit code\n",
    "            print(f\"Error running script {script_path} for {image_file.name}:\")\n",
    "            print(\"--- Script STDOUT ---\")\n",
    "            print(e.stdout) # stdout from the script\n",
    "            print(\"--- Script STDERR ---\")\n",
    "            print(e.stderr) # stderr from the script (this will contain the GDAL error)\n",
    "        except Exception as e:\n",
    "            print(f\"An unexpected error occurred while trying to run {script_path} for {image_file.name}: {e}\")\n",
    "            \n",
    "    else:\n",
    "        print(f\"Image file not found: {image_file}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7cb00e6a",
   "metadata": {},
   "source": [
    "## 10. Understanding OmniCloudMask Results\n",
    "\n",
    "OmniCloudMask produces a classified raster in which each pixel takes one of four values:\n",
    "- **0 = Clear**: No clouds or shadows detected\n",
    "- **1 = Thick Cloud**: Dense clouds that completely obscure the ground\n",
    "- **2 = Thin Cloud**: Semi-transparent clouds or haze\n",
    "- **3 = Shadow**: Cloud shadows on the ground\n",
    "\n",
    "In the masked images, every pixel with a non-zero class (cloud or shadow) has been removed, leaving only clear pixels for the analysis of crop conditions. Because clouds and shadows distort the measured surface reflectance, excluding them makes vegetation indices and other agricultural metrics derived from the imagery more reliable.\n",
    "\n",
    "For more information about OmniCloudMask, see:\n",
    "- GitHub repository: https://github.com/DPIRD-DMA/OmniCloudMask\n",
    "- Paper: https://www.sciencedirect.com/science/article/pii/S0034425725000987"
   ]
  },
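  {
   "cell_type": "markdown",
   "id": "ocm-class-sketch",
   "metadata": {},
   "source": [
    "The class values listed above translate directly into a boolean clear-sky mask. The snippet below is a minimal sketch of that idea using NumPy on small synthetic arrays. `ocm_classes` and `band` are hypothetical stand-ins for illustration, not variables defined elsewhere in this notebook, and using NaN as the fill value is an assumption.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical OCM output: 0 = clear, 1 = thick cloud, 2 = thin cloud, 3 = shadow\n",
    "ocm_classes = np.array(\n",
    "    [[0, 0, 1, 1],\n",
    "     [0, 2, 2, 3],\n",
    "     [0, 0, 0, 3]],\n",
    "    dtype=np.uint8,\n",
    ")\n",
    "\n",
    "# Hypothetical single-band image on the same grid as the mask\n",
    "band = np.arange(12, dtype=np.float32).reshape(3, 4)\n",
    "\n",
    "# Keep only class 0; every cloud or shadow pixel becomes NaN\n",
    "clear = ocm_classes == 0\n",
    "masked_band = np.where(clear, band, np.nan)\n",
    "\n",
    "# Share of usable pixels, a quick check before computing vegetation indices\n",
    "clear_fraction = clear.mean()\n",
    "print(f'Clear pixels: {clear_fraction:.0%}')\n",
    "```\n",
    "\n",
    "Checking the clear fraction first can help decide whether an acquisition date is worth keeping at all; any threshold for that decision would need to be chosen for the fields in question."
   ]
  },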
  {
   "cell_type": "markdown",
   "id": "2837be37",
   "metadata": {},
   "source": [
    "### 9a. Upsample the OCM mask to 3x3 m and apply it to the original high-resolution image\n",
    "\n",
    "The OCM cloud/shadow mask is generated at 10x10 m resolution. This step upsamples it to the 3x3 m grid of the original PlanetScope image, so that the final masked output preserves the native resolution for downstream analysis."
   ]
  }
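  ,
  {
   "cell_type": "markdown",
   "id": "ocm-upsample-sketch",
   "metadata": {},
   "source": [
    "The upsampling step described above can be sketched with plain NumPy. This is an illustrative example under stated assumptions, not the code used by `planet_ocm_processor.py`: `upsample_nearest`, `mask_10m` and `image_3m` are hypothetical names, and both rasters are assumed to cover exactly the same extent. Nearest-neighbour lookup is used because the mask holds class labels, which must be copied rather than interpolated.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def upsample_nearest(mask, target_shape):\n",
    "    # Map every target row/column back to the source row/column it falls in,\n",
    "    # so class labels are copied and never averaged\n",
    "    rows = np.arange(target_shape[0]) * mask.shape[0] // target_shape[0]\n",
    "    cols = np.arange(target_shape[1]) * mask.shape[1] // target_shape[1]\n",
    "    return mask[np.ix_(rows, cols)]\n",
    "\n",
    "# Hypothetical 10 m class mask (3 x 3) and 3 m image (4 bands, 10 x 10)\n",
    "mask_10m = np.array([[0, 1, 0], [0, 0, 3], [2, 0, 0]], dtype=np.uint8)\n",
    "image_3m = np.ones((4, 10, 10), dtype=np.float32)\n",
    "\n",
    "mask_3m = upsample_nearest(mask_10m, image_3m.shape[1:])\n",
    "\n",
    "# Broadcast the 2-D mask across all bands; non-clear pixels become NaN\n",
    "masked_3m = np.where(mask_3m == 0, image_3m, np.nan)\n",
    "```\n",
    "\n",
    "For georeferenced files, a resampling routine that respects the raster transform (for example `rasterio.warp.reproject` with `Resampling.nearest`) is likely the more robust choice, since it does not rely on the two grids sharing an extent."
   ]
  }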
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}