
Commit da91f1b

Feature/validate_output_structure (#1284)

* validate output on initialize and startCompute
* update docs
* validate storage

1 parent 3ae9595 · commit da91f1b

6 files changed · 286 additions & 121 deletions

docs/API.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -1463,6 +1463,7 @@ starts a free compute job and returns jobId if successful
 | additionalViewers | object | | optional array of addresses that are allowed to fetch the result |
 | queueMaxWaitTime | number | | optional max time in seconds a job can wait in the queue before being started |
 | encryptedDockerRegistryAuth | string | | Ecies encrypted docker auth schema for image (see [Private Docker Registries with Per-Job Authentication](../env.md#private-docker-registries-with-per-job-authentication)) |
+| output | string | | ECIES-encrypted string with instructions for uploading compute results (see [C2D result upload to remote storage](../Storage.md#c2d-result-upload-to-remote-storage)) |

 #### Request
```

docs/Storage.md

Lines changed: 143 additions & 36 deletions

@@ -4,13 +4,13 @@ Ocean Node supports five storage backends for assets (e.g. algorithm or data fil

## Supported types

| Type        | `type` value | Description                                            |
| ----------- | ------------ | ------------------------------------------------------ |
| **URL**     | `url`        | File served via HTTP/HTTPS                             |
| **IPFS**    | `ipfs`       | File identified by IPFS CID                            |
| **Arweave** | `arweave`    | File identified by Arweave transaction ID              |
| **S3**      | `s3`         | File in S3-compatible storage (AWS, Ceph, MinIO, etc.) |
| **FTP**     | `ftp`        | File served via FTP or FTPS                            |

All file objects can optionally include encryption metadata: `encryptedBy` and `encryptMethod` (e.g. `AES`, `ECIES`).

@@ -31,12 +31,12 @@ Files are fetched from a given URL using HTTP GET or POST.

| Field     | Required | Description                                 |
| --------- | -------- | ------------------------------------------- |
| `type`    | Yes      | Must be `"url"`                             |
| `url`     | Yes      | Full HTTP/HTTPS URL to the file             |
| `method`  | Yes      | `"get"` or `"post"`                         |
| `headers` | No       | Optional request headers (key-value object) |

### Validation

@@ -64,10 +64,10 @@ Files are resolved via an IPFS gateway using a content identifier (CID).

| Field  | Required | Description                   |
| ------ | -------- | ----------------------------- |
| `type` | Yes      | Must be `"ipfs"`              |
| `hash` | Yes      | IPFS content identifier (CID) |

The node builds the download URL as: `{ipfsGateway}/ipfs/{hash}` (e.g. `https://ipfs.io/ipfs/QmXoy...`).

@@ -96,10 +96,10 @@ Files are identified by an Arweave transaction ID and fetched via an Arweave gat

| Field           | Required | Description            |
| --------------- | -------- | ---------------------- |
| `type`          | Yes      | Must be `"arweave"`    |
| `transactionId` | Yes      | Arweave transaction ID |

The node builds the download URL as: `{arweaveGateway}/{transactionId}`.
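The two gateway URL schemes (IPFS and Arweave) can be sketched as small helpers. These function names are illustrative, not part of Ocean Node's API; the node builds these URLs internally:

```typescript
// Illustrative helpers mirroring the documented URL construction.
// Names are hypothetical, not Ocean Node exports.

function buildIpfsUrl(ipfsGateway: string, hash: string): string {
  // {ipfsGateway}/ipfs/{hash}, tolerating a trailing slash on the gateway
  return `${ipfsGateway.replace(/\/+$/, '')}/ipfs/${hash}`
}

function buildArweaveUrl(arweaveGateway: string, transactionId: string): string {
  // {arweaveGateway}/{transactionId}
  return `${arweaveGateway.replace(/\/+$/, '')}/${transactionId}`
}
```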

@@ -135,21 +135,21 @@ Files are stored in S3-compatible object storage. The node uses the AWS SDK and

| Field      | Required | Description                                                            |
| ---------- | -------- | ---------------------------------------------------------------------- |
| `type`     | Yes      | Must be `"s3"`                                                         |
| `s3Access` | Yes      | Object with endpoint, bucket, object key, and credentials (see below). |

**`s3Access` fields:**

| Field             | Required | Description |
| ----------------- | -------- | ----------- |
| `endpoint`        | Yes      | S3 endpoint URL (e.g. `https://s3.amazonaws.com`, `https://nyc3.digitaloceanspaces.com`, or `https://my-ceph.example.com`) |
| `bucket`          | Yes      | Bucket name |
| `objectKey`       | Yes      | Object key (path within the bucket) |
| `accessKeyId`     | Yes      | Access key for the S3-compatible API |
| `secretAccessKey` | Yes      | Secret key for the S3-compatible API |
| `region`          | No       | Region (e.g. `us-east-1`). Optional; defaults to `us-east-1` if omitted. Some backends (e.g. Ceph) may ignore it. |
| `forcePathStyle`  | No       | If `true`, use path-style addressing (e.g. `endpoint/bucket/key`). Required for some S3-compatible services (e.g. MinIO). Default `false` (virtual-host style, e.g. `bucket.endpoint/key`, standard for AWS S3). |
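The `forcePathStyle` distinction can be illustrated with a small sketch. This is a hypothetical helper, not Ocean Node code; in practice the AWS SDK performs this resolution itself based on the flag:

```typescript
// Hypothetical illustration of S3 addressing styles.
function objectUrl(
  endpoint: string, // e.g. 'https://s3.amazonaws.com'
  bucket: string,
  key: string,
  forcePathStyle = false
): string {
  const u = new URL(endpoint)
  if (forcePathStyle) {
    // path-style: endpoint/bucket/key (needed by e.g. MinIO)
    return `${u.origin}/${bucket}/${key}`
  }
  // virtual-host style: bucket.endpoint/key (AWS S3 default)
  return `${u.protocol}//${bucket}.${u.host}/${key}`
}
```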
### Validation
@@ -186,9 +186,9 @@ For FTPS (TLS):

| Field  | Required | Description |
| ------ | -------- | ----------- |
| `type` | Yes | Must be `"ftp"` |
| `url`  | Yes | Full FTP or FTPS URL. Supports `ftp://` and `ftps://`. May include credentials as `ftp://user:password@host:port/path`. Default port is 21 for FTP and 990 for FTPS. |
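The default-port rule from the table can be sketched as follows (illustrative helper only, not part of Ocean Node):

```typescript
// Resolve the effective port of an FTP/FTPS URL per the documented
// defaults: 21 for ftp://, 990 for ftps://.
function ftpPort(rawUrl: string): number {
  const u = new URL(rawUrl)
  if (u.port !== '') return Number(u.port)
  return u.protocol === 'ftps:' ? 990 : 21
}
```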
### Validation
@@ -207,6 +207,113 @@ FTPStorage supports `upload(filename, stream)`. If the file object’s `url` end

---

## C2D result upload to remote storage

Compute-to-Data jobs can upload their output archive to a remote backend instead of keeping it only on the local node's disk.

### How it works

1. You build a `ComputeOutput` JSON object with:
   - `remoteStorage`: one of the storage objects from this document (`url`, `s3`, `ftp`, etc.)
   - optional `encryption`: currently only `AES` is accepted, with a hex key
2. You ECIES-encrypt that JSON into a string and send it in the compute command as `output`.
3. When the job finishes:
   - if `output` is present and the remote storage supports upload, Ocean Node uploads the tar archive remotely
   - otherwise, Ocean Node falls back to the local `outputs.tar` behavior
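Steps 1 and 2 can be sketched on the client side like this (values are examples; the ECIES encryption itself happens separately, e.g. via the node's `/api/services/encrypt` endpoint shown below):

```typescript
import { randomBytes } from 'node:crypto'

// Step 1: build the plaintext ComputeOutput object.
// Field values here are placeholders, not real credentials.
const computeOutput = {
  remoteStorage: {
    type: 'ftp',
    url: 'ftp://user:password@ftp.example.com:21/results/'
  },
  encryption: {
    encryptMethod: 'AES',
    // 32 random bytes -> 64 hex chars, satisfying the key-length rule
    key: randomBytes(32).toString('hex')
  }
}

// Step 2 input: the JSON string that gets ECIES-encrypted and sent as `output`.
const plaintext = JSON.stringify(computeOutput)
```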
### `ComputeOutput` shape

```json
{
  "remoteStorage": {
    "type": "s3",
    "s3Access": {
      "endpoint": "https://s3.amazonaws.com",
      "region": "us-east-1",
      "bucket": "my-c2d-results",
      "objectKey": "jobs/result.tar",
      "accessKeyId": "AKIA...",
      "secretAccessKey": "..."
    }
  },
  "encryption": {
    "encryptMethod": "AES",
    "key": "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
  }
}
```
Notes:

- `output` itself is **not plain JSON** in the compute request; it must be an ECIES-encrypted string.
- `encryption.key` must be at least 32 bytes (64 hex chars).
- `encryption.encryptMethod` must be `AES` if provided.
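These constraints can be checked client-side before encrypting. A minimal sketch mirroring the documented rules (a hypothetical helper, not the node's actual validator):

```typescript
interface ComputeOutputLike {
  remoteStorage?: { type?: string }
  encryption?: { encryptMethod?: string; key?: string }
}

// Returns a list of problems per the documented rules; empty array = OK.
function checkComputeOutput(o: ComputeOutputLike): string[] {
  const errors: string[] = []
  if (!o.remoteStorage?.type) errors.push('remoteStorage.type is required')
  if (o.encryption) {
    if (o.encryption.encryptMethod !== 'AES')
      errors.push('encryptMethod must be "AES"')
    // at least 32 bytes => at least 64 hex characters
    if (!/^[0-9a-fA-F]{64,}$/.test(o.encryption.key ?? ''))
      errors.push('key must be at least 64 hex chars')
  }
  return errors
}
```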
### End-to-end example

#### 1) Create plaintext output instructions

```json
{
  "remoteStorage": {
    "type": "ftp",
    "url": "ftp://user:password@ftp.example.com:21/results/"
  },
  "encryption": {
    "encryptMethod": "AES",
    "key": "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
  }
}
```
#### 2) Encrypt the JSON

You can use `POST /api/services/encrypt` to encrypt the JSON string for Ocean Node:

```bash
curl -X POST "https://<node>/api/services/encrypt?consumerAddress=<0xAddress>&nonce=<nonce>&signature=<signature>" \
  -H "Content-Type: text/plain" \
  --data-raw '{"remoteStorage":{"type":"ftp","url":"ftp://user:password@ftp.example.com:21/results/"},"encryption":{"encryptMethod":"AES","key":"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"}}'
```

The response is the encrypted blob (hex string). If the response includes a `0x` prefix, remove it before sending the value as the compute `output` (compute handlers decode `output` as raw hex bytes).
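Stripping a possible `0x` prefix is a one-liner (illustrative helper):

```typescript
// Remove a leading "0x" so the value decodes as raw hex bytes.
function stripHexPrefix(blob: string): string {
  return blob.startsWith('0x') ? blob.slice(2) : blob
}
```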
#### 3) Send compute command with `output`

Example for `freeStartCompute`:

```json
{
  "command": "freeStartCompute",
  "consumerAddress": "0x...",
  "signature": "0x...",
  "nonce": "123",
  "environment": "<env-id>",
  "datasets": [],
  "algorithm": {
    "meta": {
      "rawcode": "print('hello')",
      "container": {
        "image": "python",
        "tag": "3.10",
        "entrypoint": "python",
        "checksum": "..."
      }
    }
  },
  "output": "<ecies-encrypted-output-string>"
}
```
### Uploaded filename and fallback behavior

- For remote upload, Ocean Node writes: `outputs-<clusterHash>-<jobId>.tar`
- If `output` is missing or empty, or the chosen storage does not support upload, Ocean Node stores the output locally (`outputs.tar`) as before.
- If the remote upload fails, the job status is set to `ResultsUploadFailed`.
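The remote archive name follows the documented pattern, which can be sketched as (illustrative; the actual cluster hash and job id come from the node):

```typescript
// Mirrors the documented pattern: outputs-<clusterHash>-<jobId>.tar
function remoteResultName(clusterHash: string, jobId: string): string {
  return `outputs-${clusterHash}-${jobId}.tar`
}
```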
---
## Summary

- **URL**: flexible HTTP(S) endpoints; optional custom headers and `unsafeURLs` filtering.

src/@types/commands.ts

Lines changed: 1 addition & 0 deletions

```diff
@@ -220,6 +220,7 @@ export interface ComputeInitializeCommand extends Command {
   policyServer?: any // object to pass to policy server
   queueMaxWaitTime?: number // max time in seconds a job can wait in the queue before being started
   encryptedDockerRegistryAuth?: string
+  output?: string // always an ECIES-encrypted string that decodes to the ComputeOutput interface
 }

 export interface FreeComputeStartCommand extends Command {
```

src/components/core/compute/initialize.ts

Lines changed: 14 additions & 2 deletions

```diff
@@ -33,7 +33,12 @@ import { getAlgorithmImage } from '../../c2d/compute_engine_docker.js'
 import { Credentials, DDOManager } from '@oceanprotocol/ddo-js'
 import { checkCredentials } from '../../../utils/credentials.js'
 import { PolicyServer } from '../../policyServer/index.js'
-import { generateUniqueID, getAlgoChecksums, validateAlgoForDataset } from './utils.js'
+import {
+  generateUniqueID,
+  getAlgoChecksums,
+  validateAlgoForDataset,
+  validateOutput
+} from './utils.js'

 export class ComputeInitializeHandler extends CommandHandler {
   validate(command: ComputeInitializeCommand): ValidateParams {
@@ -207,7 +212,14 @@ export class ComputeInitializeHandler extends CommandHandler {
         )
       }
     }
-
+    const isValidOutput = await validateOutput(
+      node,
+      task.output,
+      await getConfiguration()
+    )
+    if (isValidOutput.status.httpStatus !== 200) {
+      return isValidOutput
+    }
     // check algo
     let index = 0
     const policyServer = new PolicyServer()
```
