refactor(backend): centralize accession version parsing#6638
Conversation
|
Claude finished @corneliusroemer's task in 2m 38s —— View job Code Review
Clean refactor. Consolidating the two independent parsing implementations into a single A few minor notes left as inline comments:
The 400→422 status change is intentional and documented in the PR description; the test updates correctly reflect it. |
| ) | ||
| } | ||
|
|
||
| val version = versionString.toLongOrNull() |
There was a problem hiding this comment.
Negative versions (e.g. "LOC_123.-1") pass this check since toLongOrNull() accepts -1L. If negative versions are semantically invalid, consider adding a version < 1 guard here. This may be pre-existing behaviour and outside this PR's scope, but worth a conscious decision.
| val version = it.substring(lastDot + 1).toLongOrNull() | ||
| ?: throw BadRequestException("Invalid version in accession version '$it', expected a number") | ||
| AccessionVersion(it.substring(0, lastDot), version) | ||
| AccessionVersion.fromString(it) |
There was a problem hiding this comment.
The getUnprocessedMetadata endpoint now returns 422 on an invalid filter but has no @ApiResponse(responseCode = "422", ...) annotation. The old 400 case was also undocumented, but since the status changed it's worth adding one to keep the OpenAPI spec accurate.
There was a problem hiding this comment.
Pull request overview
This PR centralizes parsing and validation of accession.version strings into a single strict parser (AccessionVersion.fromString) and reuses it in backend endpoints/services that previously implemented their own parsing logic, making validation rules consistent and improving error reporting.
Changes:
- Added
AccessionVersion.fromString()to strictly parseaccession.version(exactly one dot, both parts present, numeric version, no dots in accession). - Replaced duplicated parsing logic in
SubmissionControllerandSeqSetCitationsDatabaseServicewith the centralized parser. - Updated/added tests to reflect the new parsing behavior and HTTP 422 responses for invalid accession-version inputs.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| backend/src/main/kotlin/org/loculus/backend/api/SubmissionTypes.kt | Adds centralized strict AccessionVersion.fromString() parsing/validation logic. |
| backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt | Delegates query filter parsing to AccessionVersion.fromString() (invalid filters now surface as 422). |
| backend/src/main/kotlin/org/loculus/backend/service/seqsetcitations/SeqSetCitationsDatabaseService.kt | Replaces manual split/parse with AccessionVersion.fromString() for accession-version handling. |
| backend/src/test/kotlin/org/loculus/backend/api/SubmissionTypesTest.kt | Adds unit tests for the centralized parser (valid, invalid-format, non-numeric version). |
| backend/src/test/kotlin/org/loculus/backend/controller/submission/GetUnprocessedMetadataEndpointTest.kt | Updates expectation from HTTP 400 to 422 for invalid accessionVersions filter. |
| backend/src/test/kotlin/org/loculus/backend/controller/seqsetcitations/SeqSetValidationEndpointsTest.kt | Updates expected validation error message for non-numeric version to match centralized parser output. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| val accession = value.substringBeforeLast('.', missingDelimiterValue = "") | ||
| val versionString = value.substringAfterLast('.', missingDelimiterValue = "") | ||
|
|
||
| if (accession.isEmpty() || versionString.isEmpty() || '.' in accession) { |
There was a problem hiding this comment.
The '.' in accession condition rejects inputs like "a.b.1" by checking the substring before the last dot. This is correct, but the logic may be slightly non-obvious at a glance. Since accession = value.substringBeforeLast('.'), the only way '.' in accession is true is if there are 2+ dots in value. A comment here could help future readers understand the invariant, or alternatively you could rewrite it as an explicit dot-count check.
| ) | ||
| } | ||
|
|
||
| val version = versionString.toLongOrNull() |
There was a problem hiding this comment.
Negative versions (e.g. "LOC_123.-1") pass this check since toLongOrNull() accepts -1L. If negative versions are semantically invalid, consider adding if (version < 1) (or < 0) here. This may be pre-existing behaviour and outside this PR's scope, but worth a conscious decision.
|
|
||
| @Test | ||
| fun `GIVEN accession version with invalid format THEN throws format exception`() { | ||
| listOf("LOC_123", ".1", "LOC_123.", "LOC.123.1").forEach { |
There was a problem hiding this comment.
Consider adding "" (empty string) and "." (single dot only) to the invalid format list. Both are rejected by the current logic, and explicit tests make the boundary conditions clear.
| listOf("LOC_123", ".1", "LOC_123.", "LOC.123.1").forEach { | |
| listOf("LOC_123", ".1", "LOC_123.", "LOC.123.1", "", ".").forEach { |
| val version = it.substring(lastDot + 1).toLongOrNull() | ||
| ?: throw BadRequestException("Invalid version in accession version '$it', expected a number") | ||
| AccessionVersion(it.substring(0, lastDot), version) | ||
| AccessionVersion.fromString(it) |
There was a problem hiding this comment.
This endpoint no longer has a @ApiResponse(responseCode = "422", ...) annotation to document the new 422 behaviour. The old 400 case was also undocumented on this method, but since the status is now different from 400 and the method already exposes 422 responses elsewhere in the controller, it's worth adding one here to keep the OpenAPI spec accurate. Fix this →
Summary
This introduces a single strict parser for Loculus accession-version strings and uses it in both backend locations that previously parsed
accession.versionvalues independently.AccessionVersion.fromString()now validates that the input contains exactly one separator, that both accession and version are present, and that the version is numeric. Accessions containing dots are rejected as invalid input. Parsing failures surface asUnprocessableEntityExceptionwith a message that distinguishes malformed structure from a non-numeric version.The submission metadata filter and sequence-set validation now delegate to this centralized parser. This removes duplicated parsing logic and ensures both endpoints apply the same validation rules. Invalid accession-version filters consequently return HTTP 422 rather than HTTP 400.
Validation
./gradlew ktlintFormat --console=plain./gradlew test --console=plain -x checkDocker --tests org.loculus.backend.api.SubmissionTypesTestThe endpoint test suites could not run in this environment: Docker is unavailable, while
USE_NONDOCKER_INFRA=truerequires the missingrunuserbinary.🚀 Preview: Add
previewlabel to enable