Commit bbbfac5

Merge pull request #230 from fairagro/feature/ARC_as_string
Feature/arc as string
2 parents d937e4f + 17fd852 commit bbbfac5

7 files changed

Lines changed: 134 additions & 63 deletions

.github/workflows/reusable-build.yml

Lines changed: 4 additions & 10 deletions
@@ -74,16 +74,10 @@ jobs:
   NEW_VERSION="${BASE_VERSION}-rc.${BRANCH_LABEL}.${{ github.run_number }}"
 
   # PEP 440 compliant variant for Python package builds (hatch-vcs / setuptools-scm).
-  # Format: X.Y.Z.devN+branch.name
-  PEP440_BRANCH=$(echo "$GITHUB_REF_NAME" \
-    | sed 's|feature/||' \
-    | tr '[:upper:]' '[:lower:]' \
-    | sed -E 's/[^a-z0-9]+/./g; s/^\.+//; s/\.+$//; s/\.{2,}/./g' \
-    | cut -c1-30)
-  if [[ -z "$PEP440_BRANCH" ]]; then
-    PEP440_BRANCH="feature"
-  fi
-  PEP440_VERSION="${BASE_VERSION}.dev${{ github.run_number }}+${PEP440_BRANCH}"
+  # For pre-releases, use simpler format that hatchling can parse: X.Y.ZaN.branch
+  # This is PEP 440 compliant and works with hatchling's version parser
+  # Simple .devN format without branch information for maximum compatibility
+  PEP440_VERSION="${BASE_VERSION}.dev${{ github.run_number }}"
 else
   NEW_VERSION="${BASE_VERSION}"
   PEP440_VERSION="${BASE_VERSION}"
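
The version logic in the hunk above can be sketched in Python as follows. This is an illustration only, not code from the repository: the function name `derive_versions` and its parameters are hypothetical, and the regular expression is a simplified shape check rather than the full PEP 440 grammar. Note that the added shell comment mentions `X.Y.ZaN.branch`, whereas the assignment actually produces `X.Y.Z.devN`. Dropping the `+branch` local segment is relevant for publishing, since PEP 440 says local version identifiers should not be used when uploading to a public index.

```python
import re

# Simplified check for the X.Y.Z and X.Y.Z.devN shapes used in this workflow.
# Assumption: this is NOT the complete PEP 440 grammar.
_SIMPLE_VERSION = re.compile(r"^\d+\.\d+\.\d+(\.dev\d+)?$")


def derive_versions(
    base_version: str,
    branch_label: str,
    run_number: int,
    prerelease: bool,
) -> tuple[str, str]:
    """Return (docker_version, pep440_version) built from one numeric baseline."""
    if prerelease:
        # Docker tag keeps the branch label; the Python version does not.
        docker_version = f"{base_version}-rc.{branch_label}.{run_number}"
        pep440_version = f"{base_version}.dev{run_number}"
    else:
        docker_version = base_version
        pep440_version = base_version
    if not _SIMPLE_VERSION.match(pep440_version):
        raise ValueError(f"unexpected version shape: {pep440_version!r}")
    return docker_version, pep440_version


if __name__ == "__main__":
    print(derive_versions("1.2.3", "arc-as-string", 42, prerelease=True))
```

Because the branch label no longer appears in the Python version, uniqueness across branches rests entirely on the run number.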

.github/workflows/reusable-release.yml

Lines changed: 3 additions & 3 deletions
@@ -147,11 +147,11 @@ jobs:
           UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
         run: uv publish --publish-url https://upload.pypi.org/legacy/ dist/*
 
-      - name: Publish to TestPyPI (pre-release)
+      - name: Publish to PyPI (pre-release)
         if: inputs.release_type == 'feature'
         env:
-          UV_PUBLISH_TOKEN: ${{ secrets.TEST_PYPI_TOKEN }}
-        run: uv publish --publish-url https://test.pypi.org/legacy/ dist/*
+          UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
+        run: uv publish --publish-url https://upload.pypi.org/legacy/ dist/*
 
   create-release-tag:
     name: CreateReleaseTag/Release

middleware/api_client/README.md

Lines changed: 9 additions & 25 deletions
@@ -51,13 +51,13 @@ async def main():
         # Send a single ARC
         response = await client.create_or_update_arc(
             rdi="my-rdi",
-            arc=arc,
+            arc=arc,  # Can be ARC object, dict, or JSON string
         )
         print(f"ARC status: {response.status}")
 
         # Or run a harvest workflow
         async def arc_stream():
-            yield arc
+            yield arc  # Can yield ARC objects, dicts, or JSON strings
 
         harvest = await client.harvest_arcs(
             rdi="my-rdi",
@@ -85,22 +85,22 @@ asyncio.run(main())
 
 ## API Methods
 
-### `create_or_update_arc(rdi: str, arc: ARC | dict) -> ArcResult`
+### `create_or_update_arc(rdi: str, arc: ARC | dict | str) -> ArcResult`
 
 Create or update one ARC in the Middleware API.
 
 **Parameters:**
 
 - `rdi` (str): The RDI identifier (e.g., "edaphobase").
-- `arc` (ARC | dict): ARC object from arctrl or pre-serialised RO-Crate dict.
+- `arc` (ARC | dict | str): ARC object from arctrl, pre-serialised RO-Crate dict, or JSON string.
 
 **Returns:**
 
 - `ArcResult`: Contains the result of the operation.
 
 **Raises:**
 
-- `ApiClientError`: If the request fails due to HTTP errors or network issues.
+- `ApiClientError`: If the request fails due to HTTP errors, network issues, or invalid JSON.
 
 **Example:**
 
@@ -112,17 +112,18 @@ arc = ARC.from_arc_investigation(inv)
 
 response = await client.create_or_update_arc(
     rdi="edaphobase",
-    arc=arc,
+    arc=arc,  # Can also be dict or JSON string
 )
 ```
 
-### `harvest_arcs(rdi: str, arcs: AsyncIterator[ARC | dict], expected_datasets: int | None = None) -> HarvestResult`
+### `harvest_arcs(rdi: str, arcs: AsyncIterator[ARC | dict | str], expected_datasets: int | None = None) -> HarvestResult`
 
 Convenience workflow to create a harvest, upload all ARCs from an async iterator, and complete the harvest.
 
 - Uses `config.max_concurrency` by default.
 - Continues on item-level submission errors and skips failed items.
 - Cancels the harvest only for catastrophic errors.
+- Supports ARC objects, pre-serialised RO-Crate dicts, and JSON strings.
 
 All errors are raised as `ApiClientError` exceptions:
 
@@ -132,24 +133,7 @@ from middleware.api_client import ApiClientError
 try:
     response = await client.create_or_update_arc(
         rdi="my-rdi",
-        arc=arc,
+        arc=arc,  # Can be ARC object, dict, or JSON string
     )
 except ApiClientError as e:
     print(f"API Error: {e}")
-```
-
-## Configuration via Environment Variables
-
-You can override configuration values using environment variables:
-
-```bash
-export API_URL="https://production-api:8000"
-export CLIENT_CERT_PATH="/secure/certs/prod-cert.pem"
-export CLIENT_KEY_PATH="/secure/certs/prod-key.pem"
-```
-
-Or use Docker secrets in `/run/secrets/`.
-
-## License
-
-This is part of the FAIRagro Advanced Middleware project.
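
Since the README now documents JSON strings as valid input, a caller that already has RO-Crate JSON files on disk could stream their raw text without parsing it first. The following is a minimal sketch under that assumption: `arc_stream_from_files` and the directory layout are hypothetical, and the call to `client.harvest_arcs` appears only as a comment because it needs a configured client.

```python
import asyncio
from collections.abc import AsyncIterator
from pathlib import Path


async def arc_stream_from_files(directory: Path) -> AsyncIterator[str]:
    """Yield the raw text of each *.json file in the directory, in sorted order."""
    for path in sorted(directory.glob("*.json")):
        # Read in a worker thread so disk I/O does not block the event loop.
        yield await asyncio.to_thread(path.read_text, encoding="utf-8")


# Hypothetical usage with an already-open client:
#     harvest = await client.harvest_arcs(
#         rdi="my-rdi",
#         arcs=arc_stream_from_files(Path("arcs")),
#     )
```

The strings are passed through unparsed, so any malformed file would only be detected when the client deserialises it.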

middleware/api_client/src/middleware/api_client/api_client.py

Lines changed: 20 additions & 12 deletions
@@ -262,13 +262,13 @@ def _process_completed_arc_tasks(
     async def _submit_arcs_parallel(
         self,
         harvest_id: str,
-        arcs: "AsyncGenerator[ARC | dict[str, Any], None] | AsyncIterator[ARC | dict[str, Any]]",
+        arcs: "AsyncGenerator[ARC | dict[str, Any] | str, None] | AsyncIterator[ARC | dict[str, Any] | str]",
     ) -> int:
         """Submit all ARCs in bounded parallelism and return number of skipped ARC submissions."""
         pending_tasks: set[asyncio.Task[None]] = set()
         failed_submissions = 0
 
-        async def submit_one(arc_item: "ARC | dict[str, Any]") -> None:
+        async def submit_one(arc_item: "ARC | dict[str, Any] | str") -> None:
             await self.submit_arc_in_harvest(harvest_id, arc_item)
 
         async for arc in arcs:
@@ -446,10 +446,18 @@ async def _delete(self, path: str) -> None:
     # ------------------------------------------------------------------
 
     @classmethod
-    def _serialize_arc(cls, arc: "ARC | dict[str, Any]") -> dict[str, Any]:
-        """Serialize an ARC object to a plain RO-Crate JSON dict."""
+    def _serialize_arc(cls, arc: "ARC | dict[str, Any] | str") -> dict[str, Any]:
+        """Serialize an ARC object, dict, or JSON string to a plain RO-Crate JSON dict."""
         if isinstance(arc, dict):
             return arc
+        if isinstance(arc, str):
+            try:
+                data = json.loads(arc)
+                if not isinstance(data, dict):
+                    raise ApiClientError(f"JSON string must represent a dictionary, got {type(data).__name__}")
+                return cast(dict[str, Any], data)
+            except json.JSONDecodeError as e:
+                raise ApiClientError(f"Invalid JSON string provided for ARC: {e}") from e
         return cast(dict[str, Any], json.loads(arc.ToROCrateJsonString()))
 
     @classmethod
@@ -473,7 +481,7 @@ def _parse_harvest_response(cls, data: Any) -> HarvestResult:
     async def create_or_update_arc(
         self,
         rdi: str,
-        arc: "ARC | dict[str, Any]",
+        arc: "ARC | dict[str, Any] | str",
     ) -> ArcResult:
         """Create or update an ARC.
 
@@ -483,7 +491,7 @@ async def create_or_update_arc(
 
         Args:
             rdi: RDI identifier.
-            arc: ARC object or a pre-serialised RO-Crate JSON dict.
+            arc: ARC object, a pre-serialised RO-Crate JSON dict, or a JSON string.
 
         Returns:
             :class:`ArcResult` with the result of the operation.
@@ -579,7 +587,7 @@ async def cancel_harvest(self, harvest_id: str) -> None:
     async def submit_arc_in_harvest(
         self,
         harvest_id: str,
-        arc: "ARC | dict[str, Any]",
+        arc: "ARC | dict[str, Any] | str",
     ) -> ArcResult:
         """Submit an ARC within an active harvest run.
 
@@ -588,7 +596,7 @@ async def submit_arc_in_harvest(
 
         Args:
             harvest_id: Harvest identifier.
-            arc: ARC object or a pre-serialised RO-Crate JSON dict.
+            arc: ARC object, a pre-serialised RO-Crate JSON dict, or a JSON string.
 
         Returns:
             :class:`ArcResult` with the result of the operation.
@@ -601,7 +609,7 @@ async def submit_arc_in_harvest(
     async def harvest_arcs(
         self,
         rdi: str,
-        arcs: "AsyncGenerator[ARC | dict[str, Any], None] | AsyncIterator[ARC | dict[str, Any]]",
+        arcs: "AsyncGenerator[ARC | dict[str, Any] | str, None] | AsyncIterator[ARC | dict[str, Any] | str]",
         expected_datasets: int | None = None,
     ) -> HarvestResult:
         """Create a harvest, upload all ARCs from an async generator, then complete it.
@@ -618,8 +626,8 @@ async def harvest_arcs(
 
         Args:
             rdi: RDI identifier for the harvest.
-            arcs: Async generator or async iterator yielding ARC objects or
-                pre-serialised RO-Crate dicts.
+            arcs: Async generator or async iterator yielding ARC objects,
+                pre-serialised RO-Crate dicts, or JSON strings.
             expected_datasets: Optional hint about the total number of ARCs.
 
         Returns:
@@ -631,7 +639,7 @@ async def harvest_arcs(
 
         Example::
 
-            async def my_arcs() -> AsyncGenerator[dict, None]:
+            async def my_arcs() -> AsyncGenerator[dict | str, None]:
                 for arc in source:
                     yield arc
 
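
The string branch added to `_serialize_arc` can be exercised in isolation. The sketch below mirrors that logic outside the client and is not the repository's code: `ApiClientError` is a local stand-in for the real exception class, `normalize_arc` is a hypothetical name, and `FakeArc` stands in for an arctrl `ARC`, of which only the `ToROCrateJsonString()` method seen in the diff is assumed.

```python
import json
from typing import Any


class ApiClientError(Exception):
    """Local stand-in for the client's exception type (assumption)."""


def normalize_arc(arc: Any) -> dict[str, Any]:
    """Turn a dict, a JSON string, or an ARC-like object into a plain dict."""
    if isinstance(arc, dict):
        # Already serialised: pass through unchanged.
        return arc
    if isinstance(arc, str):
        try:
            data = json.loads(arc)
        except json.JSONDecodeError as e:
            raise ApiClientError(f"Invalid JSON string provided for ARC: {e}") from e
        if not isinstance(data, dict):
            # Valid JSON, but not an object (for example a list or a number).
            raise ApiClientError(
                f"JSON string must represent a dictionary, got {type(data).__name__}"
            )
        return data
    # Fall back to the ARC object's own RO-Crate serialisation.
    return json.loads(arc.ToROCrateJsonString())


class FakeArc:
    """Hypothetical stand-in exposing only the method used above."""

    def ToROCrateJsonString(self) -> str:
        return '{"id": "from-object"}'
```

Malformed JSON and well-formed JSON that is not an object are reported with different messages, which matches the two `raise` statements in the diff.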

middleware/api_client/tests/unit/test_client.py

Lines changed: 92 additions & 2 deletions
@@ -226,6 +226,28 @@ async def test_create_or_update_arc_with_dict(client_config: Config) -> None:
     assert isinstance(response, ArcResult)
 
 
+@pytest.mark.asyncio
+@respx.mock
+async def test_create_or_update_arc_with_json_string(client_config: Config) -> None:
+    """Test create_or_update_arc with a JSON string."""
+    route = respx.post(f"{client_config.api_url}v3/arcs").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=_ARC_RESPONSE)
+    )
+    async with ApiClient(client_config) as client:
+        response = await client.create_or_update_arc(rdi="test-rdi", arc='{"id": "mock-arc"}')
+    assert route.called
+    assert isinstance(response, ArcResult)
+    assert response.arc_id == "arc-123"
+
+
+@pytest.mark.asyncio
+async def test_create_or_update_arc_with_invalid_json_string(client_config: Config) -> None:
+    """Test create_or_update_arc with an invalid JSON string."""
+    async with ApiClient(client_config) as client:
+        with pytest.raises(ApiClientError, match="Invalid JSON string provided for ARC"):
+            await client.create_or_update_arc(rdi="test-rdi", arc='{"id": "mock-arc"')
+
+
 @pytest.mark.asyncio
 @respx.mock
 async def test_create_or_update_arc_http_error(client_config: Config) -> None:
@@ -495,13 +517,35 @@ async def test_submit_arc_in_harvest_invalid_response(client_config: Config) ->
             await client.submit_arc_in_harvest("harvest-456", arc={"id": "mock"})
 
 
+@pytest.mark.asyncio
+@respx.mock
+async def test_submit_arc_in_harvest_with_json_string(client_config: Config) -> None:
+    """Test submit_arc_in_harvest with a JSON string."""
+    route = respx.post(f"{client_config.api_url}v3/harvests/harvest-456/arcs").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=_ARC_RESPONSE)
+    )
+    async with ApiClient(client_config) as client:
+        response = await client.submit_arc_in_harvest("harvest-456", arc='{"id": "mock-arc"}')
+    assert route.called
+    assert isinstance(response, ArcResult)
+    assert response.arc_id == "arc-123"
+
+
+@pytest.mark.asyncio
+async def test_submit_arc_in_harvest_with_invalid_json_string(client_config: Config) -> None:
+    """Test submit_arc_in_harvest with an invalid JSON string."""
+    async with ApiClient(client_config) as client:
+        with pytest.raises(ApiClientError, match="Invalid JSON string provided for ARC"):
+            await client.submit_arc_in_harvest("harvest-456", arc='{"id": "mock-arc"')
+
+
 # ---------------------------------------------------------------------------
 # harvest_arcs
 # ---------------------------------------------------------------------------
 
 
-async def _arc_gen(*arcs: "dict[str, Any]") -> AsyncGenerator["dict[str, Any]", None]:
-    """Yield the provided arc dicts as an async generator."""
+async def _arc_gen(*arcs: "dict[str, Any] | str | ARC") -> AsyncGenerator["dict[str, Any] | str | ARC", None]:
+    """Yield the provided arc dicts, JSON strings, or ARC objects as an async generator."""
     for arc in arcs:
         yield arc
 
@@ -650,6 +694,52 @@ async def test_harvest_arcs_cancels_on_catastrophic_error(client_config: Config)
     assert cancel_route.called
 
 
+@pytest.mark.asyncio
+@respx.mock
+async def test_harvest_arcs_with_json_string(client_config: Config) -> None:
+    """harvest_arcs supports JSON strings in async generator."""
+    completed_response = {**_HARVEST_RESPONSE, "status": "COMPLETED", "completed_at": "2024-01-01T01:00:00Z"}
+    respx.post(f"{client_config.api_url}v3/harvests").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=_HARVEST_RESPONSE)
+    )
+    respx.post(f"{client_config.api_url}v3/harvests/harvest-456/arcs").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=_ARC_RESPONSE)
+    )
+    respx.post(f"{client_config.api_url}v3/harvests/harvest-456/complete").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=completed_response)
+    )
+
+    arcs = _arc_gen(
+        '{"id": "arc-1-string"}',
+        {"id": "arc-2-dict"},
+        ARC.from_arc_investigation(ArcInvestigation.create(identifier="test", title="Test")),
+    )
+    async with ApiClient(client_config) as client:
+        result = await client.harvest_arcs("test-rdi", arcs, expected_datasets=3)
+
+    assert isinstance(result, HarvestResult)
+    assert result.status == "COMPLETED"
+
+
+@pytest.mark.asyncio
+@respx.mock
+async def test_harvest_arcs_with_invalid_json_string(client_config: Config) -> None:
+    """harvest_arcs raises ApiClientError when JSON string is invalid."""
+    # Mock the harvest creation endpoint to prevent actual HTTP requests
+    respx.post(f"{client_config.api_url}v3/harvests").mock(
+        return_value=httpx.Response(http.HTTPStatus.OK, json=_HARVEST_RESPONSE)
+    )
+    # Mock the harvest cancellation endpoint
+    respx.delete(f"{client_config.api_url}v3/harvests/harvest-456").mock(
+        return_value=httpx.Response(http.HTTPStatus.NO_CONTENT)
+    )
+
+    async with ApiClient(client_config) as client:
+        arcs = _arc_gen('{"id": "arc-1"')  # Single invalid JSON string
+        with pytest.raises(ApiClientError, match="Invalid JSON string provided for ARC"):
+            await client.harvest_arcs("test-rdi", arcs)
+
+
 @pytest.mark.asyncio
 @respx.mock
 async def test_harvest_arcs_cancel_failure_does_not_mask_original_error(client_config: Config) -> None:

spec/ci-cd/design.md

Lines changed: 4 additions & 8 deletions
@@ -138,17 +138,13 @@ On push feature/* or schedule:
 14. **PEP 440 parallel version for Python packages**
     — Docker semver pre-release format (`1.2.3-rc.branch.42`) is not valid
     PEP 440. The build phase computes a parallel `pep440_version` in the format
-    `1.2.3.dev42+branch.name` and injects it via
+    `1.2.3.dev42` and injects it via
     `SETUPTOOLS_SCM_PRETEND_VERSION` to override hatch-vcs version discovery,
     so Docker and Python packages share the same numeric baseline.
+    — This simple `.devN` format was chosen for maximum compatibility with both
+    hatchling and PyPI, using a global run number for uniqueness across all branches.
 
-15. **Registry selection via `release_type` input**
-    — The `publish-pypi` job selects `https://upload.pypi.org/legacy/` for
-    `release_type == 'final'` and `https://test.pypi.org/legacy/` for
-    `release_type == 'feature'`. Separate secrets (`PYPI_TOKEN`,
-    `TEST_PYPI_TOKEN`) are used for each registry.
-
-16. **Python packages built once in the build phase, reused in release**
+15. **Python packages built once in the build phase, reused in release**
     `reusable-build.yml` includes a `python-build` job that produces wheels
     and sdists for both publishable packages and uploads them as the artifact
     `python-packages-{version}`. This mirrors the Docker transfer-artifact

spec/ci-cd/spec.md

Lines changed: 2 additions & 3 deletions
@@ -97,8 +97,7 @@ pipelines run on GitHub Actions.
 
 - [ ] Publish Python packages to PyPI for `middleware/api_client` and `middleware/shared` components.
 - [ ] PyPI packages must be published whenever a Docker image is successfully pushed to a registry.
-- [ ] Final releases from `main` branch must publish packages to **PyPI** (<https://pypi.org>).
-- [ ] Feature branch pre-releases must publish packages to **TestPyPI** (<https://test.pypi.org/>).
+- [ ] Both final releases from `main` and feature branch pre-releases must publish packages to **PyPI** (<https://pypi.org>).
 - [ ] Packages must be published only after the `reusable-check.yml` security scans have passed.
 - [ ] Publish the `middleware/api_client` component under the name `fairagro-middleware-api-client`.
 - [ ] Publish the `middleware/shared` component under the name `fairagro-middleware-shared`.
@@ -109,7 +108,7 @@ pipelines run on GitHub Actions.
 - [ ] Each package must include required dependencies from `pyproject.toml`.
 - [ ] PyPI packages must use the exact same semantic version as the Docker image.
   - [ ] Final release from `main`: `MAJOR.MINOR.PATCH`.
-  - [ ] Feature branch pre-release: `MAJOR.MINOR.PATCH-rc.{branch-label}.{run_number}`.
+  - [ ] Feature branch pre-release: `MAJOR.MINOR.PATCH.dev{RUN_NUMBER}` (PEP 440 compliant format using global run number for uniqueness).
 - [ ] If a GitHub release is created, the packages must be added to the artifact list.
 - [ ] If a github release is created, include `pip install` commands for each package with exact version information.
 - [ ] If a GitHub release is created, provide fallback instructions for local installation from source.
