ArcStore (arc_store/__init__.py) is an abstract base class that defines the
Git persistence interface. Two implementations exist:
- GitRepo (arc_store/git_repo.py): primary implementation. Clones the repository via SSH or HTTPS using the Git CLI (through GitPython), writes the ISA file structure generated by arctrl, and pushes the result. Works with any Git-compatible server.
- GitlabApi (arc_store/gitlab_api.py): deprecated. Used the GitLab REST API to write files directly without a local clone. Retained for backwards compatibility only; new deployments should use GitRepo.
The caller (ArcManager.sync_to_gitlab) is responsible for parsing the ARC from
JSON, selecting the configured backend, and recording CouchDB events. The store
itself only handles Git.
ArcManager.sync_to_gitlab(rdi, arc_json_string)
  ├─→ ARC.from_rocrate_json_string(arc_json_string)   ← arctrl parse
  └─→ ArcStore.create_or_update(arc_id, arc_obj)
        └─→ GitRepo (or GitlabApi, deprecated)
              ├─→ clone / pull
              ├─→ write ISA files via arctrl WriteAsync
              └─→ commit + push
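
The split between caller and store can be illustrated with a small sketch. Only the names ArcStore and create_or_update come from the description above; the parameter types, the InMemoryStore backend, and the sync helper are assumptions made so the example runs without arctrl or Git.

```python
from abc import ABC, abstractmethod
from typing import Any


class ArcStore(ABC):
    """Persistence interface: implementations only deal with storage."""

    @abstractmethod
    def create_or_update(self, arc_id: str, arc_obj: Any) -> None:
        """Persist arc_obj under arc_id, creating or updating as needed."""


class InMemoryStore(ArcStore):
    """Hypothetical stand-in backend, used only for this illustration."""

    def __init__(self) -> None:
        self.saved = {}

    def create_or_update(self, arc_id: str, arc_obj: Any) -> None:
        self.saved[arc_id] = arc_obj


def sync(store: ArcStore, arc_id: str, arc_obj: Any) -> None:
    # The caller selects the backend and passes an already-parsed ARC;
    # the store never parses JSON or records events.
    store.create_or_update(arc_id, arc_obj)
```

Because the caller depends only on the abstract interface, swapping GitRepo for another backend would not change the calling code.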
GitRepo uses GitPython to manage a temporary local clone:
- Clone or pull the remote repository to a temp directory.
- Call arctrl.ARC.WriteAsync to write the ISA/ARC file structure.
- Stage all changes, commit, and push.
- Clean up the temp directory.
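
The four steps above can be sketched as follows. To stay dependency-free, this sketch shells out to the git CLI through subprocess instead of using GitPython, so it is not the project's code. The names sync_once, write_files, and the injectable run parameter are hypothetical; write_files stands in for the arctrl write step.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def sync_once(
    remote_url: str,
    write_files: Callable[[Path], None],
    message: str,
    run=subprocess.run,
) -> bool:
    """Clone, write, commit, push, and always remove the temp clone.

    Returns True if a commit was pushed, False if nothing changed.
    """
    workdir = tempfile.mkdtemp(prefix="arc-sync-")
    try:
        # 1. Fresh clone into the (still empty) temp directory.
        run(["git", "clone", remote_url, workdir], check=True)
        # 2. Let the caller write the file structure.
        write_files(Path(workdir))
        # 3. Stage everything; skip the commit when the tree is unchanged.
        run(["git", "-C", workdir, "add", "--all"], check=True)
        status = run(
            ["git", "-C", workdir, "status", "--porcelain"],
            check=True, capture_output=True, text=True,
        )
        if not status.stdout.strip():
            return False
        run(["git", "-C", workdir, "commit", "-m", message], check=True)
        run(["git", "-C", workdir, "push", "origin", "HEAD"], check=True)
        return True
    finally:
        # 4. Clean up even when one of the steps above raised.
        shutil.rmtree(workdir, ignore_errors=True)
```

Putting the cleanup in a finally block means a failed push does not leave a stale clone behind for the next run.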
SSH and HTTPS authentication are both supported via RemoteGitProvider
(arc_store/remote_git_provider.py), which injects credentials into the remote
URL or SSH command.
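
As a rough illustration of the two injection styles, the sketch below builds an HTTPS URL with embedded credentials and an environment mapping that sets GIT_SSH_COMMAND, which git reads to choose its ssh invocation. The function names and the example host are assumptions; the real RemoteGitProvider may be structured differently.

```python
import shlex
from urllib.parse import quote, urlsplit, urlunsplit


def https_url_with_credentials(url: str, username: str, token: str) -> str:
    """Return url with percent-encoded credentials in the authority part."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Encode every reserved character so tokens containing '/' or '@' are safe.
    userinfo = f"{quote(username, safe='')}:{quote(token, safe='')}"
    return urlunsplit(parts._replace(netloc=f"{userinfo}@{host}"))


def ssh_environment(key_path: str) -> dict:
    """Return environment variables that make git use one specific key."""
    # IdentitiesOnly stops ssh from offering other keys held by an agent.
    command = f"ssh -i {shlex.quote(key_path)} -o IdentitiesOnly=yes"
    return {"GIT_SSH_COMMAND": command}
```

For example, https_url_with_credentials("https://git.example.org/group/arc.git", "oauth2", "s3cr/et") returns "https://oauth2:s3cr%2Fet@git.example.org/group/arc.git". Keeping both helpers behind one provider is what lets the Git code stay unaware of the scheme in use.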
Git errors are classified at push time:
- is_transient_git_error(exc) → raise ArcStoreTransientError (network, 50x)
- is_soft_git_error(exc) → repo or branch not found; treated as permanent
- All other GitCommandError → permanent
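
The classification above might look roughly like the sketch below. The helper names match the text, but their bodies are guesses: the matched substrings are assumed examples of git's stderr output, and GitCommandFailure is a local stand-in for GitPython's GitCommandError so the example has no third-party dependency.

```python
class ArcStoreTransientError(Exception):
    """Raised for failures that are worth retrying."""


class GitCommandFailure(Exception):
    """Stand-in for GitPython's GitCommandError; carries git's stderr."""

    def __init__(self, stderr: str) -> None:
        super().__init__(stderr)
        self.stderr = stderr


# Assumed substrings; the real helpers may match different messages.
TRANSIENT_MARKERS = (
    "could not resolve host",
    "connection timed out",
    "error: 502",
    "error: 503",
    "error: 504",
)
SOFT_MARKERS = ("repository not found", "src refspec")


def _stderr_text(exc: Exception) -> str:
    return str(getattr(exc, "stderr", "")).lower()


def is_transient_git_error(exc: Exception) -> bool:
    return any(marker in _stderr_text(exc) for marker in TRANSIENT_MARKERS)


def is_soft_git_error(exc: Exception) -> bool:
    return any(marker in _stderr_text(exc) for marker in SOFT_MARKERS)


def push_with_classification(push) -> None:
    """Run push() and translate retryable failures."""
    try:
        push()
    except GitCommandFailure as exc:
        if is_transient_git_error(exc):
            raise ArcStoreTransientError(str(exc)) from exc
        # Soft errors and everything else propagate unchanged: permanent.
        raise
```

Only the transient branch changes the exception type, so a caller can treat anything that is not ArcStoreTransientError as permanent without inspecting it further.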
- ArcStoreTransientError vs permanent errors: Callers (ArcManager) need to distinguish retryable failures from permanent ones to decide whether to schedule a Celery retry. The store raises ArcStoreTransientError for network and availability issues; all other exceptions are treated as permanent by the caller.
- GitRepo preferred over GitlabApi: The REST API approach required chunking file actions and had limits on commit size. A real Git clone-and-push is simpler, more reliable, and server-agnostic. GitlabApi is kept only to avoid breaking existing deployments and will be removed in a future release.
- Temporary local clone, not persistent workspace: Each sync operation clones to a fresh temp directory and deletes it afterwards. This avoids stale state from concurrent workers or failed previous runs.
- RemoteGitProvider injects credentials: Credential injection is isolated in RemoteGitProvider so that GitRepo itself has no knowledge of authentication schemes. SSH and HTTPS credential formats differ; the provider abstracts that difference.
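
The caller-side retry decision from the first point can be sketched without Celery. Here run_sync and the schedule_retry callback are hypothetical; in a bound Celery task the callback would plausibly correspond to raising self.retry(exc=exc), but that wiring is an assumption and is not shown.

```python
class ArcStoreTransientError(Exception):
    """Raised by the store for network and availability problems."""


def run_sync(sync, schedule_retry) -> str:
    """Run one sync attempt and report how the caller handled it."""
    try:
        sync()
    except ArcStoreTransientError as exc:
        # Retryable: hand the error to the scheduler and stop here.
        schedule_retry(exc)
        return "retry"
    except Exception:
        # Everything else is treated as permanent; no retry is scheduled.
        return "failed"
    return "ok"
```

Ordering the except clauses from specific to general is what makes the single exception type sufficient as a retry signal.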