feat: New loader for XDI files #203

@pbeaucage

Description

While working on something else, I whipped up the function below, which loads XDI-format files. These particular files come from BMM at NSLS-II, but XDI seems to be a real standard. It would be cool to turn this into a new loader, roughly mirroring the ESRF ID2 loader. The gist is that each file has a large, semantically meaningful header that can point to external HDF5 data sources.


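import pathlib

import h5py
import pandas as pd  # DataFrame.to_xarray() below also needs xarray installed


def loadSingleScan(filepath):
    filepath = pathlib.Path(filepath)
    basepath = filepath.parent  # the external HDF5 path from the header is joined onto this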
    
    metadata = {}
    header = None
    data_lines = []
    
    with open(filepath, 'r') as f:
        for line in f:
            if line.startswith('#'):
                # Extract metadata lines like "# Key: Value"
                if ':' in line:
                    key_value = line[1:].strip().split(':', 1)
                    if len(key_value) == 2:
                        key, value = key_value
                        metadata[key.strip()] = value.strip()
                header = line  # Will keep updating, so the last # line becomes the header
            elif line.strip() and not line.startswith('//') and not line.startswith('--'):
                data_lines.append(line)
    
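    # Column names come from the last '#' line, which is what `header` holds after the loop
    if header is None:
        raise ValueError(f"No '#' header lines found in {filepath}")
    column_names = header[1:].strip().split()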
    
    # Read data into a DataFrame
    from io import StringIO
    df = pd.read_csv(StringIO(''.join(data_lines)), sep=r'\s+', names=column_names)
    
    # Now `df` contains your data and `metadata` has all the XDI metadata
    # Step 1: Promote DataFrame to xarray Dataset
    ds = df.set_index('energy').to_xarray()
    
    # Step 2: Load HDF5 image stack
    with h5py.File(basepath/metadata['Scan.pilatus100k_hdf5_file'], 'r') as f:
        image_data = f['entry/data/data'][()]  # shape: (293, 195, 487)
    # Confirm the image stack has one frame per row of the table
    if image_data.shape[0] != len(df):
        raise ValueError("Image stack and dataframe do not align in length!")
    # Step 3: Insert image as new variable
    ds['pilatus100k'] = (('energy', 'pix_y', 'pix_x'), image_data)
    ds.attrs.update(metadata)
    ds.energy.attrs['unit'] = 'eV'
    return ds
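
For anyone picking this up, here is a minimal, standard-library-only sketch of just the header-parsing step, run against a made-up in-memory sample so it can be tried without a real BMM file. The name `parse_xdi_text` and the `SAMPLE` contents are hypothetical, not part of any existing loader, and the pandas/xarray/h5py steps are left out on purpose.

```python
# Made-up sample in the general shape of an XDI file: '#' header lines,
# with the last '#' line holding the column labels, then numeric rows.
SAMPLE = """\
# XDI/1.0
# Column.1: energy eV
# Column.2: i0
# Scan.pilatus100k_hdf5_file: images.h5
# ///
# free-form user comment
# ---
# energy i0
8000.0 1.0
8001.0 1.1
"""


def parse_xdi_text(text):
    """Hypothetical helper: split XDI-style text into metadata, column names, rows."""
    metadata = {}
    last_comment = None
    rows = []
    for line in text.splitlines():
        if line.startswith('#'):
            body = line[1:].strip()
            # "Key: Value" lines become metadata; other '#' lines are skipped
            key, sep, value = body.partition(':')
            if sep:
                metadata[key.strip()] = value.strip()
            last_comment = body  # keeps updating; ends as the column-label line
        elif line.strip():
            rows.append([float(token) for token in line.split()])
    if last_comment is None:
        raise ValueError("no '#' header lines found")
    column_names = last_comment.split()
    return metadata, column_names, rows


metadata, column_names, rows = parse_xdi_text(SAMPLE)
```

Splitting the parsing from the file and HDF5 access like this might also make the eventual loader easier to unit test, since the header logic can be exercised on plain strings.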
