LRAUV Data Workflow

The sequence of steps to process LRAUV data is as follows:

flowchart LR
    Z[(Original log<br/>.nc4 file from<br>/mbari/LRAUV/)]
    A[nc42netcdfs.py]
    B[(\_Group\_*<br/>.nc files)]
    C[combine.py]
    D[(_combined.nc4)]
    E[align.py]
    
    Z --> A
    A --> B
    B --> C
    C --> D
    D --> E
    
    style Z fill:#e1f5ff
    style B fill:#e1f5ff
    style D fill:#e1f5ff
flowchart LR
    E[align.py]
    F[(_align.nc4)]
    G[resample.py]
    H[(_nS.nc)]
    I[archive.py]
    J[(archived files in<br>/mbari/LRAUV/)]
    
    E --> F
    F --> G
    G --> H
    H --> I
    I --> J
    
    style F fill:#e1f5ff
    style H fill:#e1f5ff
    style J fill:#e1f5ff

Details of each step are given in the respective scripts and in the descriptions of the output netCDF files below. The output directory structure within the local file system's work directory is as follows:

├── data
│   ├── lrauv_data
│   │   ├── <auv_name>           <- e.g.: ahi, brizo, pontus, tethys, ...
│   │   │   ├── missionlogs/year/dlist_dir
│   │   │   │   ├── <log_dir>    <- e.g.: ahi/missionlogs/2025/20250908_20250912/20250911T201546/202509112015_202509112115.nc4
│   │   │   │   │   ├── <nc4>    <- .nc4 file containing original data - created by unserialize
│   │   │   │   │   ├── <nc>     <- .nc files, one for each group from the .nc4 file
│   │   │   │   │   │               data identical to the original groups,
│   │   │   │   │   │               saved in NetCDF4 format
│   │   │   │   │   │               - created by nc42netcdfs.py
│   │   │   │   │   ├── <_combined>     <- A single NetCDF4 .nc4 file containing all the
│   │   │   │   │   │                      variables from the .nc files along with nudged
│   │   │   │   │   │                      latitudes and longitudes - created by combine.py
│   │   │   │   │   ├── <_align> <- .nc4 file with all measurement variables
│   │   │   │   │   │               having associated coordinate variables
│   │   │   │   │   │               at original instrument sampling rate
│   │   │   │   │   │               - created by align.py
│   │   │   │   │   ├── <_nS>    <- .nc file with all measurement variables
│   │   │   │   │   │               resampled to a common time grid at n
│   │   │   │   │   │               Second intervals - created by resample.py
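
As a rough illustration of the layout above, the sketch below derives the per-step output paths for one log file. The exact naming (the log file's stem followed by `_combined.nc4`, `_align.nc4`, or `_<n>S.nc`) is an assumption inferred from the suffixes shown in the tree and the flowcharts, and `expected_outputs` is a hypothetical helper, not a function from the scripts.

```python
from pathlib import PurePosixPath


def expected_outputs(log_file: str, interval_s: int = 1) -> dict:
    """Return hypothetical output paths for the combine, align and resample steps.

    Assumes each product sits next to the original .nc4 log file and is named
    by appending a suffix to the log file's stem.
    """
    log = PurePosixPath(log_file)
    stem = log.stem  # e.g. "202509112015_202509112115"
    return {
        "combined": log.with_name(f"{stem}_combined.nc4"),
        "align": log.with_name(f"{stem}_align.nc4"),
        "resampled": log.with_name(f"{stem}_{interval_s}S.nc"),
    }


outputs = expected_outputs(
    "ahi/missionlogs/2025/20250908_20250912/20250911T201546/"
    "202509112015_202509112115.nc4"
)
for step, path in outputs.items():
    print(f"{step:10s} {path}")
```

The `.nc4` versus `.nc` extensions follow the tree: the combined and aligned files carry `.nc4`, while the resampled file carries `.nc`.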

nc42netcdfs.py
    Extract the groups, and the variables we want from each group, into
    individual .nc files. The files are saved in NetCDF4 format because
    the groups contain several unlimited dimensions, whereas NetCDF3
    allows only one per file. The data in the .nc files are identical to
    the data in the .nc4 groups.

combine.py
    Combine all group data into a single NetCDF file with consolidated
    time coordinates. When GPS fix data is available, this step includes
    nudging the underwater portions of the navigation positions to the
    GPS fixes done at the surface. GPS fixes are filtered to ensure
    monotonically increasing timestamps before nudging. Some minimal QC
    is done in this step, namely removal of non-monotonic times. The
    nudged coordinates are added as separate variables (nudged_longitude,
    nudged_latitude) with their own time dimension. For missions without
    GPS data, the combine step completes successfully but without nudged
    coordinates. Because xarray writes time as int64 values, the files are
    saved with the .nc4 extension so that Hyrax/OPeNDAP can serve them.
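
The timestamp filtering described above can be sketched in plain Python. This is only an illustration under stated assumptions: it keeps a fix when its time is strictly greater than every previously kept time, which may differ from the rule combine.py actually applies, and `monotonic_mask` and the sample values are hypothetical.

```python
def monotonic_mask(times):
    """Flag samples whose timestamp is later than every previously kept one."""
    mask = []
    latest = float("-inf")
    for t in times:
        keep = t > latest
        mask.append(keep)
        if keep:
            latest = t
    return mask


# Made-up GPS fix times in epoch seconds: the third fix steps backwards
# and the fourth repeats a time that was already kept.
fix_times = [100.0, 160.0, 150.0, 160.0, 220.0]
fix_lats = [36.80, 36.81, 36.79, 36.82, 36.83]

mask = monotonic_mask(fix_times)
clean_times = [t for t, keep in zip(fix_times, mask) if keep]
clean_lats = [v for v, keep in zip(fix_lats, mask) if keep]
print(clean_times)
print(clean_lats)
```

Applying the same mask to every variable that shares the time axis keeps the fixes and their timestamps in step.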

align.py
    Interpolate nudged lat/lon variables to the original sampling
    intervals for each instrument's record variables. This step requires
    nudged coordinates from combine.py and will fail with an informative
    error if they are not present (as in missions without GPS data).
    This format is analogous to the .nc4 files produced by the LRAUV
    unserialize process. These are the best files to use for the highest
    temporal resolution of the data. Unlike the .nc4 files, align.py's
    output files use a naming convention rather than netCDF4 groups for
    each instrument. Because xarray writes time as int64 values, the files are
    saved with the .nc4 extension so that Hyrax/OPeNDAP can serve them.
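
To make the interpolation step concrete, here is a minimal sketch that linearly interpolates a nudged latitude record onto an instrument's own sample times. Linear interpolation is an assumption, as are the function name `interp_to_times`, the sample values, and the error messages; align.py may differ on each of these.

```python
from bisect import bisect_right


def interp_to_times(src_times, src_values, target_times):
    """Linearly interpolate src_values, sampled at src_times, onto target_times.

    src_times must be strictly increasing and contain at least two samples.
    """
    if len(src_times) < 2:
        # Hypothetical message; align.py's actual error text is not shown here.
        raise ValueError("nudged coordinates missing: was GPS data available to combine.py?")
    out = []
    for t in target_times:
        if not src_times[0] <= t <= src_times[-1]:
            raise ValueError(f"time {t} lies outside the nudged coordinate record")
        # Index of the first source sample after t, clamped to the last sample.
        i = min(bisect_right(src_times, t), len(src_times) - 1)
        t0, t1 = src_times[i - 1], src_times[i]
        v0, v1 = src_values[i - 1], src_values[i]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


# Made-up values: nudged positions every 10 s, CTD samples at its own times.
nudged_time = [0.0, 10.0, 20.0]
nudged_latitude = [36.0, 36.1, 36.3]
ctd_time = [0.0, 5.0, 15.0, 20.0]

ctd_latitude = interp_to_times(nudged_time, nudged_latitude, ctd_time)
print([round(v, 3) for v in ctd_latitude])
```

Each instrument keeps its original time axis; only the coordinates are brought onto it, which is why these files preserve the highest temporal resolution.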

resample.py
    Produce a netCDF file with all of the instruments' record variables
    resampled to the same temporal interval. The coordinate variables are
    also resampled to the same temporal interval and named with the standard
    depth, latitude, and longitude names. These are the best files to
    use for loading data into STOQS and for analyses that require all the
    data to be on the same spatiotemporal grid.
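
The idea of putting every variable on a common n-second grid can be sketched as follows. Averaging the samples that fall in each interval is an assumption made for illustration; resample.py may aggregate differently (a median, for instance), and `resample_mean` and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def resample_mean(times, values, interval_s):
    """Average values within consecutive bins of interval_s seconds.

    Returns (bin_start_times, bin_means); bins without samples are omitted.
    """
    bins = defaultdict(list)
    for t, v in zip(times, values):
        bins[int(t // interval_s)].append(v)
    starts = sorted(bins)
    return [k * interval_s for k in starts], [fmean(bins[k]) for k in starts]


# Made-up records from two instruments sampling at different rates.
ctd_time = [0.2, 0.7, 1.1, 1.6, 2.4]
temperature = [10.0, 12.0, 11.0, 13.0, 9.0]
o2_time = [0.5, 2.5]
oxygen = [200.0, 210.0]

grid_t, temp_1s = resample_mean(ctd_time, temperature, 1)
grid_o, oxy_1s = resample_mean(o2_time, oxygen, 1)
print(grid_t, temp_1s)
print(grid_o, oxy_1s)
```

Because both results are indexed by multiples of the same interval, variables from different instruments can be compared sample by sample wherever both have data.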

archive.py
    Copy the netCDF files to the archive directory. The archive directory
    is initially in the AUVCTD share on atlas, which is shared with the
    data from the Dorado Gulper vehicle, but it can also be on the M3
    share on thalassa, near the original log data.