Implement OntologySourceReference usage #36

@Zalfsten

Summary

Currently, all OntologyAnnotation objects are created with tsr = "" and the xxx_version field from DB views is silently dropped. This is a known limitation (see arc-building/design.md, Key Decision 5). ARCs serialize with "ontologySourceReferences": [], losing ontology version provenance.

Motivation

The DB views expose (xxx_term, xxx_uri, xxx_version) triples for every ontology field. The version refers to the ontology source version (e.g. "2024-01-01" for ENVO), not to the term itself. ARCtrl models this via OntologySourceReference objects registered on ArcInvestigation.OntologySourceReferences, back-referenced from OntologyAnnotation.tsr by name. Without this, consuming tools cannot resolve ontology source metadata.

Proposed Implementation

  1. After all mapper functions run, collect all distinct (xxx_term, xxx_uri, xxx_version) combinations from the investigation's contacts, publications, and annotation tables.
  2. Group by ontology source name (xxx_term) and create one OntologySourceReference(name=..., version=..., file=..., description=...) per source.
  3. Append these to ArcInvestigation.OntologySourceReferences.
  4. Update _make_oa() to set tsr to the matching OntologySourceReference.name instead of "".
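The steps above could be sketched roughly as follows. This is a minimal illustration, not the mapper's actual code: `SourceRef` is a stand-in dataclass rather than ARCtrl's `OntologySourceReference`, and `collect_source_refs`, `tsr_for`, and `source_name_from_uri` are hypothetical helpers. In particular, deriving the source name from an OBO-style URI prefix is an assumption made for the example; the grouping key is passed in as a parameter so a different rule can be swapped in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

Triple = tuple[str, str, str]  # (xxx_term, xxx_uri, xxx_version)


@dataclass(frozen=True)
class SourceRef:
    """Stand-in for ARCtrl's OntologySourceReference (not the real class)."""
    name: str
    version: str
    file: str = ""
    description: str = ""


def source_name_from_uri(triple: Triple) -> Optional[str]:
    """Assumed rule: take the prefix of an OBO-style URI (.../obo/ENVO_00001998)."""
    _term, uri, _version = triple
    local = (uri or "").rstrip("/").rsplit("/", 1)[-1]
    prefix, sep, _ = local.partition("_")
    return prefix if sep and prefix else None


def collect_source_refs(
    triples: Iterable[Triple],
    source_name: Callable[[Triple], Optional[str]] = source_name_from_uri,
) -> list[SourceRef]:
    """Steps 1-2: deduplicate triples and build one SourceRef per source."""
    refs: dict[str, SourceRef] = {}
    for triple in set(triples):
        name, version = source_name(triple), triple[2]
        if name is None or not version:
            continue  # nothing to register without a source name and version
        # If one source appears with several versions, keep the highest string.
        current = refs.get(name)
        if current is None or version > current.version:
            refs[name] = SourceRef(name=name, version=version)
    return sorted(refs.values(), key=lambda r: r.name)


def tsr_for(triple: Triple, refs: Iterable[SourceRef]) -> str:
    """Step 4: return the matching source name, or "" when none is registered."""
    name = source_name_from_uri(triple)
    return name if name in {r.name for r in refs} else ""


rows = [
    ("soil", "http://purl.obolibrary.org/obo/ENVO_00001998", "2024-01-01"),
    ("forest", "http://purl.obolibrary.org/obo/ENVO_01000174", "2024-01-01"),
    ("free text", "", ""),
]
refs = collect_source_refs(rows)
```

In the real mapper, step 3 would append the resulting objects to `ArcInvestigation.OntologySourceReferences`, and `_make_oa()` would use a lookup like `tsr_for` in place of the hard-coded `""`. How conflicting versions for the same source should be resolved is left open here; the sketch simply keeps the lexicographically largest version string.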

Acceptance Criteria

  • ArcInvestigation.OntologySourceReferences is non-empty when the DB provides ontology version data.
  • OntologyAnnotation.tsr matches the corresponding OntologySourceReference.name.
  • Produced JSON-LD contains a populated ontologySourceReferences array.
  • Unit tests cover _make_oa() with a non-empty tsr value.
  • Key Decision 5 in arc-building/design.md is updated to reflect the new behaviour.
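The second criterion can be phrased as an invariant that a unit test might check. The sketch below is only illustrative: `Annotation` is a stand-in with just the `name` and `tsr` fields rather than ARCtrl's `OntologyAnnotation`, and `dangling_tsrs` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Annotation:
    """Stand-in for OntologyAnnotation; only the fields needed here."""
    name: str
    tsr: str


def dangling_tsrs(annotations: Iterable[Annotation], source_names: Iterable[str]) -> list[str]:
    """Return every non-empty tsr that matches no registered source name."""
    known = set(source_names)
    return sorted({a.tsr for a in annotations if a.tsr and a.tsr not in known})


annotations = [
    Annotation(name="soil", tsr="ENVO"),
    Annotation(name="free text", tsr=""),      # no ontology data: tsr stays empty
    Annotation(name="human", tsr="NCBITaxon"),
]
missing = dangling_tsrs(annotations, source_names=["ENVO"])
```

A test would then assert that the returned list is empty for an investigation built from DB rows that carry version data.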

References

  • spec/features/arc-building/design.md, Key Decision 5
  • spec/skills/arctrl.md, OntologySourceReference section
  • middleware/sql_to_arc/src/middleware/sql_to_arc/mapper.py, _make_oa()
