feat(storage): add datasets resource-type prefix to dataset logical paths #5911

Open: aicam wants to merge 37 commits into apache:main from aicam:feat/repo-type-update-logical-path

@aicam (Contributor) commented Jun 23, 2026

Summary

Introduces a datasets resource-type prefix to dataset logical file paths, changing the format from:

/ownerEmail/datasetName/versionName/fileRelativePath

to:

/datasets/ownerEmail/datasetName/versionName/fileRelativePath

This namespaces dataset paths under an explicit resource-type segment, leaving room for other resource types in the future.

Changes

Backend (Scala)

  • FileResolver: strips a leading segment found in RESOURCE_TYPE_PREFIXES = Set("datasets") before parsing; docs/examples updated.
  • DatasetFileNode.fromLakeFSRepositoryCommittedObjects: builds an intermediate datasets directory node as the parent of owner nodes.
  • DatasetResource: descends through the new datasets node to reach the owner node; the total-size calculation is now rooted at the datasets node.
  • FileResolverSpec: test fixtures updated to the prefixed path format.
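
The tree change described for DatasetFileNode and DatasetResource can be sketched roughly as follows. This is a TypeScript illustration only, not the Scala code from this PR: the `FileNode` shape, the helpers `makeDir`, `insertFile`, and `totalSize`, and the example paths are all hypothetical. It shows a `datasets` directory node sitting between the root and the owner nodes, with the total size computed from that node.

```typescript
// Hypothetical node shape; the real DatasetFileNode is a Scala class not shown here.
interface FileNode {
  name: string;
  size: number; // bytes; directories hold 0 and get their size from children
  children: Record<string, FileNode>;
}

function makeDir(name: string): FileNode {
  return { name: name, size: 0, children: {} };
}

// Insert a file at /datasets/owner/dataset/version/..., creating missing directories.
function insertFile(root: FileNode, logicalPath: string, size: number): void {
  const segments = logicalPath.split("/").filter(s => s.length > 0);
  if (segments.length === 0) {
    return; // nothing to insert for an empty path
  }
  let node = root;
  for (let i = 0; i < segments.length; i++) {
    const segment = segments[i];
    if (!Object.prototype.hasOwnProperty.call(node.children, segment)) {
      node.children[segment] = makeDir(segment);
    }
    node = node.children[segment];
  }
  node.size = size; // the last segment is the file itself
}

// Sum the sizes of a node and everything below it.
function totalSize(node: FileNode): number {
  let sum = node.size;
  Object.keys(node.children).forEach(key => {
    sum += totalSize(node.children[key]);
  });
  return sum;
}

// Example with made-up paths: the only child of the root is the datasets node.
const root = makeDir("");
insertFile(root, "/datasets/alice@example.com/sales/v1/data/a.csv", 100);
insertFile(root, "/datasets/alice@example.com/sales/v1/b.csv", 50);
const datasetsNode = root.children["datasets"];
const ownerNode = datasetsNode.children["alice@example.com"];
console.log(Object.keys(root.children), ownerNode.name, totalSize(datasetsNode));
```

Code that previously read owner nodes directly from the root needs one extra descent step under this layout, which is the adjustment the DatasetResource bullet describes.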

Frontend (TypeScript)

  • dataset-file.ts: parseFilePathToDatasetFile strips the datasets prefix if present; parseDatasetFileToFilePath emits it.
  • datasetVersionFileTree.ts: relative-path extraction now strips four leading segments (datasets/owner/dataset/version) instead of three.
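
The parse/emit behavior in the two frontend bullets can be sketched as below. This is a minimal illustration under stated assumptions, not the code in dataset-file.ts or datasetVersionFileTree.ts: the `DatasetFile` field names and the functions `parseFilePath`, `toFilePath`, and `relativePath` are hypothetical stand-ins. Parsing accepts both the old and the prefixed format, emitting always writes the prefix, and relative-path extraction skips four leading segments.

```typescript
// Hypothetical shape; field names follow the path segments named in the PR summary.
interface DatasetFile {
  ownerEmail: string;
  datasetName: string;
  versionName: string;
  fileRelativePath: string;
}

const RESOURCE_TYPE_PREFIX = "datasets";

// Parse either /owner/dataset/version/file or /datasets/owner/dataset/version/file.
function parseFilePath(path: string): DatasetFile | null {
  let segments = path.split("/").filter(s => s.length > 0);
  // Drop the prefix only when present, so old-format paths still parse.
  // Assumption: no owner segment is literally "datasets" (owners are emails).
  if (segments.length > 0 && segments[0] === RESOURCE_TYPE_PREFIX) {
    segments = segments.slice(1);
  }
  if (segments.length < 4) {
    return null; // need owner, dataset, version, and at least one file segment
  }
  return {
    ownerEmail: segments[0],
    datasetName: segments[1],
    versionName: segments[2],
    fileRelativePath: segments.slice(3).join("/"),
  };
}

// Always emit the prefixed format.
function toFilePath(file: DatasetFile): string {
  return "/" + [
    RESOURCE_TYPE_PREFIX,
    file.ownerEmail,
    file.datasetName,
    file.versionName,
    file.fileRelativePath,
  ].join("/");
}

// For an already-prefixed path, skip datasets/owner/dataset/version (four segments).
function relativePath(prefixedPath: string): string {
  return prefixedPath.split("/").filter(s => s.length > 0).slice(4).join("/");
}

const parsed = parseFilePath("/alice@example.com/sales/v1/data/a.csv");
if (parsed !== null) {
  console.log(toFilePath(parsed));
}
```

Because the parser tolerates both formats while the emitter writes only the new one, a path that passes through a parse/emit round trip comes out in the prefixed form.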

Notes

  • Net diff is small (6 files, +45/-21). The branch was synced with the latest apache/main (via the aicam fork's main) before opening this PR.

🤖 Generated with Claude Code

aicam and others added 30 commits February 24, 2026 13:58
…ent via public cluster services

- Add CloudMapperSourceOpDesc, ReferenceGenome, ReferenceGenomeEnum operator classes
- Add FileResolver.resolveDirectory for resolving dataset directories by path
- Add DatasetFileDocument directory mode: downloads all files as a zip via LakeFS/FileService
- Add DocumentFactory.openReadonlyDocument isDirectory parameter
- Add ENV_FILE_SERVICE_LIST_DIRECTORY_OBJECTS_ENDPOINT env var
- Add Kubernetes Helm chart and PVC for the cloudmapper service

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ntend integration

- Add ClusterResource, ClusterCallbackResource, ClusterServiceClient, ClusterUtils backend API for managing EC2 clusters
- Add cluster dashboard component with launch/stop/terminate/start actions and management modal
- Add ClusterSelectionComponent and ClusterAutoCompleteComponent for operator property panel
- Add DirectoryPathInput and DirectorySelection components for dataset directory selection
- Add cluster route in app-routing, cluster declarations in app.module
- Add cluster_enabled feature flag to gui-config, dashboard sidebar, and admin settings
- Add clusterautocomplete and directorypathinput formly field types
- Register cluster/directoryName/fastQFiles/fastAFiles/gtfFile fields in operator property editor
- Add SQL schema for cluster and cluster_activity tables
- Add dknet logo, CloudBioMapper operator icon, and sequence-alignment workflow assets
- Add DatasetDirectoryDocument and PathUtils storage utilities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions (Contributor)

👋 Thanks for opening this pull request, @aicam!

It looks like the pull request description doesn't quite follow our template yet:

  • The "What changes were proposed in this PR?" section is missing; please keep the template's headings.
  • The "How was this PR tested?" section is missing; please keep the template's headings.
  • The "Was this PR authored or co-authored using generative AI tooling?" section is missing; please keep the template's headings.

Filling out the template helps reviewers understand and triage your contribution faster. Please edit the description to complete it. This message will disappear automatically once the template is followed.

You can find the template prompts by editing the description, or see CONTRIBUTING.md for the full contribution flow.

@github-actions (bot) added the labels feature, frontend (Changes related to the frontend GUI), common, and platform (Non-amber Scala service paths) on Jun 23, 2026
@github-actions (Contributor)

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @Yicong-Huang, @aglinxinyuan, @Ma77Ball
    You can notify them by mentioning @Yicong-Huang, @aglinxinyuan, @Ma77Ball in a comment.

@codecov-commenter commented Jun 23, 2026

Codecov Report

❌ Patch coverage is 30.00000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.06%. Comparing base (8803d08) to head (e38a355).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ache/texera/service/resource/DatasetResource.scala | 0.00% | 3 Missing ⚠️ |
| ...tend/src/app/common/type/datasetVersionFileTree.ts | 0.00% | 3 Missing ⚠️ |
| ...pache/texera/amber/core/storage/FileResolver.scala | 75.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5911      +/-   ##
============================================
- Coverage     54.11%   54.06%   -0.05%     
+ Complexity     2819     2811       -8     
============================================
  Files          1103     1103              
  Lines         42650    42654       +4     
  Branches       4588     4589       +1     
============================================
- Hits          23079    23061      -18     
- Misses        18226    18245      +19     
- Partials       1345     1348       +3     

| Flag | Coverage Δ | *Carryforward flag |
| --- | --- | --- |
| access-control-service | 70.44% <ø> (ø) | |
| agent-service | 34.36% <ø> (ø) | Carriedforward from 8803d08 |
| amber | 55.55% <75.00%> (-0.09%) ⬇️ | |
| computing-unit-managing-service | 1.65% <ø> (ø) | |
| config-service | 57.35% <ø> (ø) | |
| file-service | 58.45% <0.00%> (-0.15%) ⬇️ | |
| frontend | 48.09% <0.00%> (-0.03%) ⬇️ | |
| pyamber | 90.20% <ø> (ø) | Carriedforward from 8803d08 |
| python | 90.76% <ø> (ø) | Carriedforward from 8803d08 |
| workflow-compiling-service | 58.69% <ø> (ø) | |

*This pull request uses carry forward flags.

Labels: common, feature, frontend (Changes related to the frontend GUI), platform (Non-amber Scala service paths)

4 participants