# Executive Summary

Modern satellite and climate data products are generated by large,
evolving software systems. These systems effectively act as the
“measurement instruments” in today’s workflows, and their outputs are
used in high‑stakes decisions about climate trends, sensor performance,
and long‑term reprocessing. For this reason, decision‑makers increasingly
need uncertainty information that is not only scientifically sound, but
also remains in step with the software as it changes.

Traditional approaches to uncertainty assume a fixed mathematical model
that experts can write down and analyse by hand. In practice, however,
the real behaviour of an operational processing chain is defined by the
implementation itself: many modules, multiple dependencies, and regular
updates. Keeping hand‑crafted formulas aligned with such a codebase is
difficult and expensive, and tends to break down exactly in the large‑scale,
long‑lived projects where robust uncertainty information is most important.

The approach adopted here treats the software implementation as the primary
object and derives uncertainty information directly from it. Instead of
manually deriving sensitivities, we rely on capabilities that are already
built into modern machine‑learning libraries: they can automatically compute
how small changes in the inputs affect the outputs, even for complex models
and large data volumes. These capabilities are mature, openly available,
and widely used in industry and research, which reduces technical risk and
avoids locking the project into niche or proprietary tooling.

At the same time, the method works with data in the way it actually appears
in remote sensing and climate science: as images, spectra, and spatiotemporal
fields, not just as flat tables. This allows uncertainty information to
reflect spatial and temporal patterns and correlations, rather than
collapsing everything into a single number per pixel or per product. The
result is an uncertainty description that is richer, more realistic, and
better aligned with how the data are used.

> [!NOTE]
> For management, the key benefits are: traceable and up‑to‑date uncertainty
> estimates that follow the evolution of the software; reuse of robust
> open‑source technology with a large user base; and an uncertainty framework
> that scales to future data volumes and product complexity without requiring
> continual manual rework of analytical models.

# Technical Summary

GUM, the Guide to the Expression of Uncertainty in Measurement, assumes
that measurement procedures can be described by relatively simple, fixed
analytical models. A measurand is expressed as a function of a small set
of input quantities, this function is written down explicitly, and local
linearisation around a working point yields partial derivatives that are
then used in the law of propagation of uncertainty. In this setting,
Jacobians are available in closed form, and uncertainty propagation
reduces to comparatively low‑dimensional matrix operations.
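In this classical setting the whole machinery fits in a few lines. The sketch below applies the law of propagation of uncertainty to a two‑input measurand with a closed‑form Jacobian; the model `y = a*x1 + x2**2` and all numbers are invented purely for illustration:

```python
import numpy as np

# Classical GUM setting: an explicit, hand-written measurand y = f(x).
# The model y = a*x1 + x2**2 is an invented example, not from the text.
def f(x, a=2.0):
    return a * x[0] + x[1] ** 2

# Closed-form Jacobian at a working point: dy/dx = [a, 2*x2].
def jacobian(x, a=2.0):
    return np.array([[a, 2.0 * x[1]]])

x0 = np.array([1.0, 3.0])           # working point
U_x = np.diag([0.1**2, 0.2**2])     # input covariance (uncorrelated inputs)

# Law of propagation of uncertainty: U_y = J U_x J^T
J = jacobian(x0)                    # here [[2.0, 6.0]]
U_y = J @ U_x @ J.T
u_y = float(np.sqrt(U_y[0, 0]))     # combined standard uncertainty of y
```

Because the model is explicit and low‑dimensional, both the Jacobian and the matrix product are trivial; the difficulties described next arise when neither assumption holds.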

This picture becomes strained once measurement procedures are implemented
as large, evolving software systems. In many modern applications—remote
sensing retrieval chains, satellite calibration pipelines, high‑resolution
climate data processing—the “measurement device” is not a static laboratory
instrument with a simple response model, but a complex codebase that changes
as algorithms, parameterisations, and dependencies are updated. The genuine
forward map from inputs to outputs is whatever the current version of the
code computes. Maintaining an analytical model and its hand‑derived Jacobians
in sync with such a codebase is labour‑intensive and error‑prone, and often
not feasible at all.

Algorithmic differentiation (AD) offers a way to recover the GUM
idea—propagation via sensitivities—without requiring a separate analytical
model. Instead of differentiating formulas, AD differentiates the program
itself. Given an implementation that computes outputs from inputs, AD
systematically applies the chain rule across all elementary operations
to obtain exact derivatives up to machine precision. This makes it possible
to extract Jacobians and higher‑order derivatives from complex, nonlinear
algorithms, and to keep sensitivity information automatically aligned with
the current state of the code as it evolves.
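As a concrete sketch, the snippet below differentiates a small program rather than a formula. The toy processing step is a made‑up stand‑in, and JAX is only one of several AD‑capable libraries that could be used:

```python
import jax
import jax.numpy as jnp

# A toy "processing chain": the measurement model is simply this code.
# The operations are invented for illustration.
def process(x):
    return jnp.array([jnp.sin(x[0]) * x[1],
                      jnp.exp(-x[1]) + x[0] ** 2])

x0 = jnp.array([0.5, 1.5])

# AD applies the chain rule through every elementary operation of the
# program; the resulting Jacobian is exact to machine precision and
# automatically tracks any edit to the code above.
J = jax.jacobian(process)(x0)       # shape (2, 2)
```

If `process` is later refactored or extended, `jax.jacobian(process)` differentiates the new version with no manual re‑derivation.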

At the same time, the data structures involved in these applications are
no longer simple vectors and matrices. Remote sensing images are naturally
two‑dimensional, spectral imagers add a third dimension, and long‑term
Earth system records add time and possibly additional geophysical
dimensions. Treating such fields as flattened vectors discards the explicit
spatial and temporal structure of both the data and their uncertainties.
A more faithful approach is to treat all quantities—inputs, parameters,
outputs, sensitivities, and covariance descriptions—as tensors, and to
formulate the law of propagation of uncertainty directly in tensor form.
In that formulation, propagated uncertainties arise from contractions of
Jacobian tensors with input uncertainty tensors over appropriate index sets,
preserving the inherent multidimensional relationships.
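A minimal numerical sketch of this tensor formulation (shapes and values are arbitrary illustrations) writes the propagation as an explicit index contraction and checks it against the flattened vector–matrix form:

```python
import numpy as np

# Tensor-form propagation: for an image-valued output y[i,j] depending on
# an image-valued input x[k,l], the Jacobian has four indices, J[i,j,k,l],
# and the input covariance U_x[k,l,p,q] relates pixels (k,l) and (p,q).
H, W = 2, 3
rng = np.random.default_rng(0)
J = rng.normal(size=(H, W, H, W))    # Jacobian tensor dy[i,j]/dx[k,l]
U_x = np.zeros((H, W, H, W))         # uncorrelated pixels, variance 0.04
for k in range(H):
    for l in range(W):
        U_x[k, l, k, l] = 0.2 ** 2

# Law of propagation of uncertainty as a tensor contraction:
# U_y[i,j,m,n] = sum_{k,l,p,q} J[i,j,k,l] U_x[k,l,p,q] J[m,n,p,q]
U_y = np.einsum("ijkl,klpq,mnpq->ijmn", J, U_x, J)

# Sanity check: this reproduces the flattened form J_f U_f J_f^T.
Jf, Uf = J.reshape(H * W, H * W), U_x.reshape(H * W, H * W)
assert np.allclose(U_y.reshape(H * W, H * W), Jf @ Uf @ Jf.T)
```

The contraction is mathematically equivalent to the flattened computation, but the four‑index result keeps pixel‑to‑pixel covariances addressable by their spatial indices instead of by a flattening convention.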

Together, these ideas point to a conceptual need: a framework that
(i) regards the executable code as the measurement model,
(ii) obtains sensitivities via algorithmic differentiation rather than
manual calculus, and
(iii) works natively with tensor‑valued data and uncertainty descriptions
instead of forcing everything into vector–matrix form.
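These three requirements can be combined in a few lines. The `propagate` helper below is a hypothetical sketch, not an existing API: the code itself is the measurement model (i), JAX supplies the Jacobian by AD (ii), and the propagation is a tensor contraction over the input index sets (iii):

```python
import jax
import jax.numpy as jnp

def propagate(model, x, U_x):
    """Propagate the covariance tensor U_x through `model` at point x.

    x has shape S_in, U_x has shape S_in + S_in, and the result has
    shape S_out + S_out, where S_out is the shape of model(x).
    Hypothetical sketch; names and signature are assumptions.
    """
    J = jax.jacobian(model)(x)              # shape S_out + S_in
    k = x.ndim                              # number of input indices
    m = J.ndim - k                          # number of output indices
    in_axes = tuple(range(m, m + k))        # input-index positions in J
    # U_y = J . U_x . J^T, contracted over the input index sets:
    tmp = jnp.tensordot(J, U_x, axes=(in_axes, tuple(range(k))))
    return jnp.tensordot(tmp, J, axes=(tuple(range(m, m + k)), in_axes))

# Usage: for a linear model the result reproduces the familiar A U A^T.
A = jnp.array([[1.0, 2.0], [3.0, 4.0]])
U_y = propagate(lambda x: A @ x, jnp.zeros(2), 0.01 * jnp.eye(2))
```

Because `propagate` takes the executable model as an argument, the same few lines apply unchanged to vector‑, image‑, or spectrum‑valued quantities, and re‑differentiate whatever version of the code is currently deployed.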

> [!NOTE]
> After reading this, a technically inclined reader is naturally led to
> ask: how can one implement such a tensor‑aware, AD‑based uncertainty
> propagation framework for real‑world measurement and calibration
> workflows?