Add Support for Multifile Distribution Downloads #167

Merged
kylstevenson merged 2 commits into main from feature/mdl
Feb 9, 2026

@kylstevenson kylstevenson commented Jan 14, 2026

This adds support for downloading multiple files from a single distribution. Previously, the SDK assumed each distribution contained exactly one file. This update enables distributions to contain multiple files and provides methods to list, filter, and download them individually or in bulk.

What's Changed

New Features

1. List Distribution Files

A new method to discover all files available in a distribution:

Map<String, DistributionFile> files = fusion.listDistributionFiles(
    "my-catalog",       // catalog identifier
    "my-dataset",       // dataset identifier
    "my-series-member", // series member identifier
    "csv",              // distribution/file format
    0                   // maxResults (0 = all files)
);
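Once the listing is in hand, a caller will often want to narrow it before downloading. The sketch below is a minimal illustration, assuming the map keys are file names (the description does not confirm this). `DistributionFileFilter` and `byPrefix` are hypothetical helpers, not part of the SDK, and the value type is left generic so the example does not depend on the fields of `DistributionFile`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the SDK.
class DistributionFileFilter {
    // Returns the entries whose key starts with the given prefix,
    // preserving the iteration order of the input map.
    // Assumption: keys are file names.
    static <T> Map<String, T> byPrefix(Map<String, T> files, String prefix) {
        Map<String, T> matches = new LinkedHashMap<>();
        for (Map.Entry<String, T> entry : files.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                matches.put(entry.getKey(), entry.getValue());
            }
        }
        return matches;
    }
}
```

The key set of the filtered map could then be passed as the `fileNames` argument of a selective download.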

2. Selective File Downloads

Download specific files from a distribution instead of all files:

List<String> fileNames = Arrays.asList("file1", "file2");
fusion.download("my-catalog", "my-dataset", "my-series-member", "csv", "/downloads", fileNames);

3. Multifile Streaming

Stream multiple files with individual InputStream access:

Map<String, InputStream> streams = fusion.downloadStream("my-catalog", "my-dataset", "my-series-member", "csv");
InputStream fileStream = streams.get("file1");
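Because the caller now receives several open streams, each one needs to be consumed and closed. The following is a rough sketch of one way to do that, assuming the caller owns the returned streams and is responsible for closing them (the description does not say). `StreamDrainer` and `readAll` are hypothetical names, and reading everything into memory is only reasonable for small files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the SDK.
class StreamDrainer {
    // Reads each stream fully into memory and closes it once read.
    // The result is keyed by the same names as the input map.
    static Map<String, byte[]> readAll(Map<String, InputStream> streams) {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        for (Map.Entry<String, InputStream> entry : streams.entrySet()) {
            // try-with-resources closes this stream even if reading fails.
            try (InputStream in = entry.getValue()) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                contents.put(entry.getKey(), buffer.toByteArray());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return contents;
    }
}
```

If reading one stream fails, the streams after it in the map are left open, so production code would want to close the remainder in a cleanup step.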

4. Checksum Validation Changes for Glue Delivery

For datasets delivered via AWS Glue (currently the only delivery channel supporting multifile distributions), checksum validation is automatically skipped when checksums are not provided by the API. This prevents download failures for Glue-delivered multifile distributions where checksums may not be available.

The SDK now:

  • Detects if a dataset uses the glue delivery channel
  • Automatically skips checksum validation when no checksum is provided
  • Maintains strict validation for all other delivery channels and single-file distributions
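The rule in the list above reduces to a single boolean decision. This is a minimal sketch of that decision, not the SDK's actual code: `ChecksumPolicy`, `shouldValidate`, and the channel string `"glue"` are assumptions made for illustration, and the sketch does not model the single-file versus multifile distinction.

```java
// Hypothetical sketch of the decision described above, not the SDK implementation.
class ChecksumPolicy {
    // Returns false only for a glue-delivered file with no checksum.
    // Every other combination keeps validation on.
    static boolean shouldValidate(String deliveryChannel, String checksum) {
        boolean checksumMissing = checksum == null || checksum.isEmpty();
        // Assumption: the channel is identified by the string "glue".
        boolean isGlue = "glue".equalsIgnoreCase(deliveryChannel);
        return !(isGlue && checksumMissing);
    }
}
```

Under this rule, a missing checksum on any non-glue channel still goes through validation and can therefore fail, which matches the strict behavior described for other delivery channels.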

Breaking Changes

1. downloadStream() Return Type Changed

Before:

InputStream stream = fusion.downloadStream("catalog", "dataset", "series", "csv");

After:

Map<String, InputStream> streams = fusion.downloadStream("catalog", "dataset", "series", "csv");
// Look up a specific file by name:
InputStream stream = streams.get("filename");
// Or, for single-file distributions, take the only entry:
InputStream onlyStream = streams.values().iterator().next();

Impact: All existing code using downloadStream() must be updated to handle the Map<String, InputStream> return type.
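Callers that only ever deal with single-file distributions may prefer a small adapter over calling `iterator().next()` at every call site, since that call silently picks an arbitrary entry when more than one file is present. The sketch below shows one possible adapter. `SingleStream` and `only` are hypothetical names, not part of the SDK.

```java
import java.io.InputStream;
import java.util.Map;

// Hypothetical migration helper, not part of the SDK.
class SingleStream {
    // Returns the sole stream in the map, or fails loudly when the
    // distribution contains zero files or more than one file.
    static InputStream only(Map<String, InputStream> streams) {
        if (streams.size() != 1) {
            throw new IllegalStateException(
                "Expected exactly one file but found " + streams.size());
        }
        return streams.values().iterator().next();
    }
}
```

Existing code could then wrap the new call, for example `SingleStream.only(fusion.downloadStream(...))`, and keep working with a single `InputStream`.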

2. download() File Naming Changed

Before:

  • Single file downloaded with generated name: {catalog}_{dataset}_{series}.{distribution}
  • Example: common_API_TEST_20230308.csv

After:

  • All files downloaded with their original API names (unless no file name is provided)
  • Files saved to: {downloadPath}/{filename}
  • Example: downloads/file1.csv, downloads/file2.csv

Impact: Any code that relies on the previous file naming convention must be updated to use the new one. As a future improvement, download() should return a File object for each downloaded file so callers do not have to reconstruct the path.
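To make the difference concrete, the two conventions can be written as small functions. This is an illustrative sketch only: `DownloadNames`, `legacyName`, and `currentPath` are hypothetical helpers, and splitting the example name into catalog `common`, dataset `API_TEST`, and series `20230308` is an assumption read off the example above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helpers that mirror the two naming schemes, not part of the SDK.
class DownloadNames {
    // Old scheme: {catalog}_{dataset}_{series}.{distribution}
    static String legacyName(String catalog, String dataset, String series, String distribution) {
        return catalog + "_" + dataset + "_" + series + "." + distribution;
    }

    // New scheme: {downloadPath}/{filename}, using the name reported by the API.
    static Path currentPath(String downloadPath, String fileName) {
        return Paths.get(downloadPath).resolve(fileName);
    }
}
```

Code that previously opened the file at the legacy name would instead need the file names returned by the listing call to locate the downloaded files.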

@kylstevenson kylstevenson requested a review from a team as a code owner January 14, 2026 14:04
@kylstevenson kylstevenson force-pushed the feature/mdl branch 4 times, most recently from 11f49da to 5c34813 Compare January 15, 2026 16:40
Update method signature for distribution files to match existing methods

Skip checksum validation for multifile downloads where no checksum is provided

@Ramadevie Ramadevie left a comment

LGTM

@kylstevenson kylstevenson merged commit 47fca57 into main Feb 9, 2026
1 check passed