Skip to content

feat(rest): add scan plan endpoints to REST catalog client#614

Merged
wgtmac merged 5 commits into
apache:mainfrom
gsandeep1241:sandeepg-scan-plan-endpoint-for-rest-catalog-client-2
Jun 9, 2026
Merged

feat(rest): add scan plan endpoints to REST catalog client#614
wgtmac merged 5 commits into
apache:mainfrom
gsandeep1241:sandeepg-scan-plan-endpoint-for-rest-catalog-client-2

Conversation

@gsandeep1241

@gsandeep1241 gsandeep1241 commented Apr 5, 2026

Copy link
Copy Markdown
Contributor
  • Introduced 4 endpoints: planning table scan, fetching a scan result, cancelling a scan plan and fetching scan planning tasks (endpoint.h)
  • Introduced corresponding request and response objects (types.h)
  • Implemented handlers (catalog.h). The handlers construct a request object and return a response object
    • Handlers use newly introduced helper methods to serialize and deserialize request response json objects
  • Added unit tests for all the above helper methods and handler methods
  • Added integration tests for the handlers

@gsandeep1241 gsandeep1241 force-pushed the sandeepg-scan-plan-endpoint-for-rest-catalog-client-2 branch 2 times, most recently from 4fde2a5 to ab6c8d4 Compare April 5, 2026 20:47
Comment thread src/iceberg/catalog/rest/error_handlers.cc
Comment thread src/iceberg/catalog/rest/json_serde.cc Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.cc Outdated
Comment thread src/iceberg/catalog/rest/json_serde.cc Outdated
@gsandeep1241 gsandeep1241 force-pushed the sandeepg-scan-plan-endpoint-for-rest-catalog-client-2 branch from 17f12b3 to 86714b9 Compare May 3, 2026 05:07
@gsandeep1241 gsandeep1241 requested a review from zhjwpku May 3, 2026 05:08
@wgtmac

wgtmac commented May 4, 2026

Copy link
Copy Markdown
Member

I've just finished my first round of review. Thanks @gsandeep1241 for working on this and everyone for the review! Let me know what you think.

Comment thread src/iceberg/catalog/rest/error_handlers.cc Outdated
Comment thread src/iceberg/catalog/rest/error_handlers.cc Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.h Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.cc Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.cc Outdated
Comment thread src/iceberg/catalog/rest/types.h Outdated
Comment thread src/iceberg/catalog/rest/types.h Outdated
Comment thread src/iceberg/test/rest_json_serde_test.cc
Comment thread src/iceberg/json_serde_internal.h Outdated
Comment thread src/iceberg/json_serde_internal.h Outdated
Comment thread src/iceberg/catalog/rest/types.h Outdated

@gsandeep1241 gsandeep1241 left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wgtmac for the detailed review. I am part way into addressing your comments. I'm posting the partial comments early because there was one open question I wanted to check - changes to catalog.h because if we go that route, we can simplify the PR.

The rest of your comments I don't have a conflict with and will address them.

Comment thread src/iceberg/catalog/rest/error_handlers.cc Outdated
Comment thread src/iceberg/catalog/rest/error_handlers.cc Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.cc Outdated
Comment thread src/iceberg/catalog/rest/resource_paths.h Outdated
Comment thread src/iceberg/catalog/rest/types.h Outdated
bool use_snapshot_schema = false;
std::optional<int64_t> start_snapshot_id;
std::optional<int64_t> end_snapshot_id;
std::vector<std::string> statsFields;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread src/iceberg/catalog/rest/types.h Outdated
std::optional<int64_t> start_snapshot_id;
std::optional<int64_t> end_snapshot_id;
std::vector<std::string> statsFields;
std::optional<int64_t> min_rows_required;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread src/iceberg/catalog/rest/types.h Outdated
Comment thread src/iceberg/catalog.h Outdated
/// \param table The table to scan.
/// \param context The scan context containing snapshot, filter, and other options.
/// \return A PlanTableScanResponse with the plan status and initial scan tasks.
virtual Result<rest::PlanTableScanResponse> PlanTableScan(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated this and I felt that a catalog handler should really honor a Request-Response contract. (That said, I'm not doing the "Request" part as such). The java implementation I felt handled core logic directly coupled with the handlers.

But if we want to keep it similar, it makes sense to split into two PRs to prevent this from getting complicated. I'll remove the stuff in catalog.h and rest_catalog.h.

Agree?

@wgtmac

wgtmac commented May 21, 2026

Copy link
Copy Markdown
Member

Is this ready to review? I see there are CI failures and unaddressed comments. Please also rebase it to the latest main branch to make sure namespace separator is respected.

@wgtmac wgtmac added the awaiting author action Author needs to address comments, resolve conflicts, answer questions, etc. label May 28, 2026
@gsandeep1241 gsandeep1241 force-pushed the sandeepg-scan-plan-endpoint-for-rest-catalog-client-2 branch from 5f7c8f8 to d388f81 Compare June 4, 2026 05:46
@gsandeep1241 gsandeep1241 requested a review from wgtmac June 5, 2026 06:25
@gsandeep1241

Copy link
Copy Markdown
Contributor Author

@wgtmac Addressed all comments (except one, where I left a question for you). Please take a look when you get the chance. Thank you!

Comment thread src/iceberg/catalog/rest/json_serde.cc Outdated
Comment thread src/iceberg/catalog/rest/types.h
Comment thread src/iceberg/catalog/rest/json_serde.cc Outdated
Sandeep Gottimukkala added 4 commits June 6, 2026 23:37
- Add PlanTableScan, FetchPlanningResult, CancelPlanning, FetchScanTasks
  methods to RestCatalog, wiring them to the corresponding REST endpoints
- Add scan plan endpoint definitions and resource path helpers
- Add ScanPlanErrorHandler with proper 404/406 error dispatch
  (NoSuchPlanIdException, NoSuchPlanTaskException, etc.)
- Add PlanTableScanRequest/FetchScanTasksRequest serialization helpers
- Add kNoSuchPlanId and kNoSuchPlanTask ErrorKind values
- Add DataFileFromJson and FileScanTasksFromJson overloads with
  partition spec and schema support
- Add endpoint and integration tests for all 4 scan plan operations
- Flatten BaseScanTaskResponse hierarchy into three independent structs
- Add error payload to PlanTableScanResponse and FetchPlanningResultResponse for failed status
- Key delete file index by file_path instead of pointer identity
- Add partition serialization to ToJson(DataFile)
- Update tests for all changes
@wgtmac wgtmac force-pushed the sandeepg-scan-plan-endpoint-for-rest-catalog-client-2 branch from c2235ce to 3961bd7 Compare June 9, 2026 03:46
@wgtmac wgtmac removed the awaiting author action Author needs to address comments, resolve conflicts, answer questions, etc. label Jun 9, 2026
@wgtmac

wgtmac commented Jun 9, 2026

Copy link
Copy Markdown
Member

I've pushed a commit to tighten the REST scan JSON serde to match Java/spec behavior more closely. This includes preserving residual filters, distinguishing missing vs empty scan task fields, requiring contextual DataFile serialization/deserialization with partition spec/schema validation, adding request-side FromJson support for scan requests, and simplifying delete-file deduplication by reusing DeleteFileSet. I also adjusted the tests to cover these parity cases in the existing style.

Thanks @gsandeep1241 for working on this!

@wgtmac wgtmac force-pushed the sandeepg-scan-plan-endpoint-for-rest-catalog-client-2 branch from 3961bd7 to e884cb4 Compare June 9, 2026 04:02
@wgtmac wgtmac merged commit b650822 into apache:main Jun 9, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants