Improved study and experiment handling, paired end data handling, multiple experiment handling#142

Draft
bedroesb wants to merge 17 commits into main from study-level
@bedroesb bedroesb commented May 27, 2026

Member
  • Moved part of the ENA-specific study metadata into assay comments, while keeping the core study title and description on the ISA Study itself. The adapter now reads study.title and study.description for the submitted ENA study title and description; assay comments still carry ENA-specific fields such as STUDY_ABSTRACT, STUDY_TYPE, and new_study_type. Closes "Include extra metadata for target repositories" (#3).
  • Updated the ENA study adapter so assay comments are no longer the source of truth for everything: core descriptive metadata now comes from the ISA study, and only ENA-specific study fields are read from the assay level.
  • Removed the explicit ENA PROJECT_SET submission from our adapter flow and deleted the unused project adapter. We now submit STUDY_SET, EXPERIMENT_SET, and RUN_SET; any linked PRJEB... project is created by ENA/Webin rather than by MARS. Closes "ISA Investigation is being submitted as a ENA Study/Project" (#80).
  • Switched ENA experiment generation to read ENA-native process parameter names directly from the assay workflow, instead of relying on hardcoded platform values or custom parameter-name mappings. Closes "assays > processSequence > parameterValues are not being parsed by the ENA endpoint" (#73).
  • Updated experiment-to-sample linking so each ENA experiment resolves the BioSamples accession from the specific study sample its library derives from, rather than reusing one global sample accession.
  • Preserved and clarified the original bottom-up experiment-building flow from data files to sequencing process to library to experiment.
  • Updated ENA run generation so one sequencing process produces one ENA RUN, allowing paired-end runs to contain both FASTQ files in a single DATA_BLOCK while single-end runs still produce one file per run.
  • Updated ENA receipt mapping so top-level ENA study accessions resolve back to the assay path in the MARS receipt, rather than the parent study/investigation path.
  • Updated ENA run receipt mapping so grouped runs are expanded back onto the corresponding ISA dataFiles, including assigning one paired-end ERR... accession to both paired FASTQ entries.
  • Refreshed the example ISA JSON files with more realistic valid ENA metadata values, moved study title/description back onto the ISA Study, and added a richer multi-file example covering two source/sample chains, two experiments, one paired-end run, and one single-end run.
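
To illustrate the study metadata split described in the first two items above, here is a minimal Python sketch. It is not the adapter's code: the function name `ena_study_metadata`, the output keys, and the sample values are hypothetical. It only assumes the usual ISA-JSON layout in which a study has `title` and `description` and an assay carries `comments` as `name`/`value` pairs.

```python
# Hypothetical helper, for illustration only.
# ENA-specific fields that are read from assay comments (names from this PR).
ENA_STUDY_FIELDS = ("STUDY_ABSTRACT", "STUDY_TYPE", "new_study_type")


def ena_study_metadata(study, assay):
    """Combine core ISA study fields with ENA-specific assay comments."""
    comments = {c["name"]: c["value"] for c in assay.get("comments", [])}
    # Core descriptive metadata comes from the ISA study itself.
    metadata = {"title": study["title"], "description": study["description"]}
    # Only the ENA-specific fields are taken from the assay level.
    for field in ENA_STUDY_FIELDS:
        if field in comments:
            metadata[field] = comments[field]
    return metadata


# Placeholder values, not taken from the example ISA JSON files.
study = {"title": "Example study", "description": "Example description"}
assay = {
    "comments": [
        {"name": "STUDY_TYPE", "value": "Other"},
        {"name": "unrelated", "value": "ignored"},
    ]
}
metadata = ena_study_metadata(study, assay)
```

Comments that are not in the ENA field list are ignored, so unrelated assay comments never leak into the submitted study.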
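
The run grouping and the run receipt mapping from the list above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the adapter's implementation: the record layout, the `sequencing_process` field, the helper names `group_runs` and `expand_receipt`, and the accession values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical, simplified records: each data file names the sequencing
# process that produced it. Field names are illustrative only.
data_files = [
    {"id": "#data/1", "name": "s1_R1.fastq.gz", "sequencing_process": "#process/seq1"},
    {"id": "#data/2", "name": "s1_R2.fastq.gz", "sequencing_process": "#process/seq1"},
    {"id": "#data/3", "name": "s2.fastq.gz", "sequencing_process": "#process/seq2"},
]


def group_runs(files):
    """Group data files so that one sequencing process yields one run."""
    runs = defaultdict(list)
    for f in files:
        runs[f["sequencing_process"]].append(f)
    return dict(runs)


def expand_receipt(runs, accessions):
    """Assign each run accession to every data file grouped into that run."""
    return {
        f["id"]: accessions[process_id]
        for process_id, files in runs.items()
        for f in files
    }


runs = group_runs(data_files)
# Placeholder accessions, not real ENA records.
file_accessions = expand_receipt(
    runs, {"#process/seq1": "ERR0000001", "#process/seq2": "ERR0000002"}
)
```

In this sketch the paired-end process yields one run holding both FASTQ files, and both files receive the same accession when the receipt is expanded; the single-end process yields a run with one file.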
