This document outlines the current limitations of our single-image optical cloud removal model and establishes the technical path forward to address them.
As observed in Test Case 2 (Row 2) of our Validation Results:
- When the input is fully cloud-obscured (100% cloud cover), the optical sensors receive zero signal from the ground.
- Rather than hallucinating detailed field structures, the model outputs a flat, featureless dark-blue patch.
- L1/L2 Loss & MSE Optimization: Our training loss minimizes pixel-wise error. Under total occlusion the input carries no information about the ground, so the error-minimizing output is a summary of all plausible clear pixels: the conditional mean under L2 (MSE) and the conditional median under L1. This is "regression to the mean," which produces a flat, blurred average rather than sharp features.
- Metric Blindness (The PSNR/SSIM Trap): Even though the output is flat and missing all agricultural boundaries, it still scored a relatively high PSNR of 28.69 dB and SSIM of 0.7513. This is because the ground truth happened to be mostly uniform dark-blue vegetation. The missing field lines represent a small fraction of the total pixels, so the mathematical penalty for missing them is low.
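The regression-to-the-mean argument can be checked numerically. The sketch below is illustrative only: the `plausible` values are made-up stand-ins for the clear-sky values one occluded pixel could take, not numbers from our validation set. It scans constant predictions and reports which one minimizes each loss.

```python
import numpy as np

# Hypothetical, equally likely clear-sky values for a single occluded pixel
# (illustrative numbers, not taken from the validation data).
plausible = np.array([0.05, 0.06, 0.07, 0.08, 0.60])

# Score every candidate constant prediction against every plausible value.
candidates = np.linspace(0.0, 1.0, 1001)
diff = candidates[:, None] - plausible[None, :]

expected_l2 = (diff ** 2).mean(axis=1)   # expected squared error per candidate
expected_l1 = np.abs(diff).mean(axis=1)  # expected absolute error per candidate

best_l2 = candidates[expected_l2.argmin()]
best_l1 = candidates[expected_l1.argmin()]

print(f"L2-optimal prediction: {best_l2:.3f}, mean of plausible values:   {plausible.mean():.3f}")
print(f"L1-optimal prediction: {best_l1:.3f}, median of plausible values: {np.median(plausible):.3f}")
```

Under L2 the best prediction lands on the mean, a value that matches none of the plausible outcomes. Averaged over a whole patch, that is the flat output described above.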
Warning
High aggregate PSNR/SSIM scores can mask complete reconstruction failures in heavily clouded regions.
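A small synthetic experiment shows how this masking happens. This is a sketch under stated assumptions: the 256x256 image, the background and boundary values, and the 32-pixel line spacing are all invented for illustration, and `psnr` is a plain textbook implementation rather than our evaluation code.

```python
import numpy as np

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB over whatever pixels are passed in."""
    mse = np.mean((pred - target) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(data_range ** 2 / mse))

# Synthetic "ground truth": uniform vegetation with thin field boundaries.
truth = np.full((256, 256), 0.20)
boundary = np.zeros(truth.shape, dtype=bool)
boundary[::32, :] = True   # horizontal field lines
boundary[:, ::32] = True   # vertical field lines
truth[boundary] = 0.35

# A "regression to the mean" prediction: one flat value everywhere.
flat = np.full_like(truth, truth.mean())

overall = psnr(flat, truth)                       # all pixels
on_boundaries = psnr(flat[boundary], truth[boundary])  # boundary pixels only

print(f"boundary pixels: {boundary.mean():.1%} of the image")
print(f"PSNR over all pixels:      {overall:.2f} dB")
print(f"PSNR over boundary pixels: {on_boundaries:.2f} dB")
```

The flat image contains none of the field lines, yet its whole-image PSNR stays high because the boundaries cover only a few percent of the pixels. Scoring the boundary pixels separately exposes the failure.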
Our validation dataset relies on real temporal pairs taken roughly 10 days apart (e.g., cloudy image on March 28, clear ground truth on April 7).
In highly active agricultural regions (such as Assam), 10 days is enough time for:
- Crops to grow or be harvested.
- Soil to dry out or be flooded.
- Tillage or land clearing to occur.
Thus, a portion of the reconstruction error is not a model failure; it reflects real ground changes between the two acquisitions.
When applying the model directly to raw LISS-IV data from the Bhoonidhi portal, we observed major scaling issues:
- The model was trained on Sentinel-2 data scaled to reflectance values (0–10,000).
- Raw LISS-IV images are delivered in Digital Numbers (0–1,023).
Direct inference without radiometric conversion produced unusable checkerboard outputs. To resolve this, a production-ready pipeline must include automated radiometric calibration using the metadata file (BAND_META.txt).
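A minimal sketch of such a calibration step follows, using the standard DN to radiance to top-of-atmosphere reflectance conversion. Everything here is an assumption rather than our production code: the function names are hypothetical, the coefficient values are placeholders, and reading `lmin`, `lmax` and the sun elevation out of BAND_META.txt is omitted because the field names are not listed in this document. Whether the model expects top-of-atmosphere or surface reflectance also needs to be confirmed against the training data.

```python
import numpy as np

def dn_to_toa_reflectance(dn, lmin, lmax, esun, sun_elevation_deg,
                          earth_sun_dist_au=1.0, dn_max=1023):
    """Convert raw Digital Numbers to top-of-atmosphere reflectance (unitless).

    lmin / lmax : band radiance at DN = 0 and DN = dn_max
    esun        : band mean solar exo-atmospheric irradiance
    Radiance and irradiance must use consistent units.
    """
    dn = np.asarray(dn, dtype=np.float64)
    # Linear DN -> radiance mapping over the sensor's 10-bit range.
    radiance = lmin + (lmax - lmin) * dn / dn_max
    # Solar zenith angle is the complement of the sun elevation.
    sun_zenith = np.deg2rad(90.0 - sun_elevation_deg)
    return np.pi * radiance * earth_sun_dist_au ** 2 / (esun * np.cos(sun_zenith))

def to_model_scale(reflectance, scale=10_000):
    """Rescale reflectance to the 0-10,000 range the model was trained on."""
    return np.clip(np.asarray(reflectance) * scale, 0, scale)

# Placeholder coefficients for illustration only (NOT real LISS-IV values).
dn = np.array([[0, 256], [512, 1023]])
reflectance = dn_to_toa_reflectance(dn, lmin=0.0, lmax=40.0, esun=185.0,
                                    sun_elevation_deg=60.0)
model_input = to_model_scale(reflectance)
print(model_input)
```

Each band needs its own coefficients, so in practice this would run once per band before the bands are stacked for inference.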
Because optical wavelengths cannot penetrate thick clouds, no optical-only model working from a single acquisition can recover ground detail under 100% cloud occlusion. The path forward is SAR-Optical Fusion.
```mermaid
graph LR
    subgraph Input Sensors
        Cloudy_Optical[Cloudy LISS-IV Optical: Green, Red, NIR]
        Clear_SAR[Sentinel-1 SAR: Radar VV/VH]
    end
    subgraph AI Fusion Model
        S1_S2_Concat[Sensor Fusion & Concatenation] --> Denoise_Loop[Iterative Reverse Diffusion]
    end
    Cloudy_Optical --> S1_S2_Concat
    Clear_SAR -->|Radar penetrates clouds to provide ground structure| S1_S2_Concat
    Denoise_Loop --> Clear_Output[Analysis-Ready Cloud-Free Output]
```
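The concatenation stage of the diagram can be sketched as a channel-wise stack. This is an illustration under assumptions, not the final design: `build_fusion_input` is a hypothetical helper, the SAR input is assumed to be backscatter in dB already co-registered to the optical grid, and the dB clipping range is a placeholder.

```python
import numpy as np

def build_fusion_input(optical, sar_db, optical_scale=10_000.0,
                       sar_db_range=(-25.0, 0.0)):
    """Stack optical (3, H, W) and SAR (2, H, W) into one (5, H, W) array in [0, 1].

    optical : Green, Red, NIR reflectance scaled 0-10,000
    sar_db  : VV, VH backscatter in dB (assumed units)
    """
    optical = np.asarray(optical, dtype=np.float32)
    sar_db = np.asarray(sar_db, dtype=np.float32)
    if optical.shape[1:] != sar_db.shape[1:]:
        raise ValueError("optical and SAR rasters must share the same grid")

    # Bring both sensors to a common [0, 1] range before stacking.
    opt = np.clip(optical / optical_scale, 0.0, 1.0)
    lo, hi = sar_db_range
    sar = (np.clip(sar_db, lo, hi) - lo) / (hi - lo)

    # Channel order: Green, Red, NIR, VV, VH.
    return np.concatenate([opt, sar], axis=0).astype(np.float32)

# Random stand-in rasters, only to show the shapes involved.
rng = np.random.default_rng(0)
optical = rng.uniform(0, 10_000, size=(3, 64, 64))
sar_db = rng.uniform(-30, 5, size=(2, 64, 64))
fused = build_fusion_input(optical, sar_db)
print(fused.shape, fused.dtype)
```

The stacked array is what the denoising model would receive as conditioning. Under full cloud cover the three optical channels carry little information, and the two SAR channels supply the structure.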
- Integrate Sentinel-1 SAR: Synthetic Aperture Radar (SAR) operates at microwave frequencies, which pass through clouds. Fusing the VV/VH polarization channels gives the model ground structure (edges, fields, rivers) even under thick clouds.
- Implement Perceptual Loss: Adding LPIPS (Learned Perceptual Image Patch Similarity) or adversarial loss terms will penalize the model for outputting flat/blurred averages, forcing it to generate sharp, realistic textures.
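The effect of adding a structure-aware term can be illustrated without a deep network. In the sketch below, a simple image-gradient difference stands in for the perceptual term. It is explicitly not LPIPS, which compares deep network features, and the function names, synthetic image, and `weight` value are all assumptions chosen for illustration.

```python
import numpy as np

def pixel_loss(pred, target):
    """Plain L1 pixel loss."""
    return float(np.mean(np.abs(pred - target)))

def structure_loss(pred, target):
    """Stand-in for a perceptual term: L1 distance between image gradients.
    NOT LPIPS; real LPIPS compares features from a pretrained network."""
    dx = np.mean(np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1)))
    dy = np.mean(np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0)))
    return float(dx + dy)

def combined_loss(pred, target, weight=1.0):
    """Pixel loss plus a weighted structure term (weight is a placeholder)."""
    return pixel_loss(pred, target) + weight * structure_loss(pred, target)

# Synthetic ground truth: uniform fields separated by thin boundaries.
truth = np.full((256, 256), 0.20)
truth[::32, :] = 0.35
truth[:, ::32] = 0.35

flat = np.full_like(truth, truth.mean())  # blurred average, no field lines
sharp = truth + 0.02                      # correct structure, small brightness offset

for name, pred in [("flat", flat), ("sharp", sharp)]:
    print(f"{name:5s} pixel={pixel_loss(pred, truth):.4f} "
          f"combined={combined_loss(pred, truth):.4f}")
```

With these made-up values, the pixel loss alone ranks the flat output ahead of the structurally correct one, while the combined loss reverses that ranking. A learned perceptual or adversarial term is intended to apply the same kind of pressure during training.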