fix(azure): fall back to client-side filtering for list_with_offset on OneLake #696
kevinjqliu wants to merge 5 commits into
|
Need to rebase on #694 to pass CI |
|
@tustvold could you take a look? |
9b2d474 to c014f56
|
rebased to pull in #694 |
rtyler left a comment
My only suggestion is to provide a true listing of the OneLake location used for the manual testing, in case anybody motivated enough comes along later and wants to try that manual test 😄
Is there a publicly linkable reference to the bug with the OneLake service that can be referenced as well?
I am cautiously optimistic that this could get into a 0.13.3, so some downstream breaks can be resolved ahead of 0.14.
|
Apologies for the delay in getting to this; I've been absolutely swamped at work. Is this still required, or has OneLake added support for this? |
@tustvold It's still required for OneLake! |
|
@crepererum and I are working on a release. I put this PR on the list and will try to move it along |
|
I verified that this is fixed and deployed on the OneLake side. So this PR is no longer needed; we want users to take advantage of |
@kevinjqliu Delta reads still fail (I last tested on Sunday), were you perhaps testing with an unreleased version of Object Store? Or, was there a very recent deploy/fix in OneLake within the last day? EDIT: |
|
The fix is getting rolled out to production environments. It might not be there yet for some regions. Give it another try in a few days. I'll follow up next week and test it again. |
|
@shehabgamin I checked internally; this is now deployed |
@kevinjqliu Still doesn't work for me. Here's a reproducible example: |
|
Just hit this too, so adding another data point. Still broken for me as of 2026-06-30, different tenant from LakeSail, North Europe region.
I replayed that same List Blobs call by hand with the same token, only changing
So any non-empty
Same result on two table types: a Lakehouse schema-shortcut table (GUID abfss URL) and a Warehouse table (friendly-name abfss URL). So whatever got deployed,
Update 2026-07-03: Further narrowed this down. Same code, same auth, both GUID abfss URLs:
Also reproduced independently of delta-rs with DuckDB's delta extension (which uses delta-kernel-rs / object_store as well). |
|
Thank you both for verifying again. Seems like there was a deployment issue and we had to roll back. I will update again once the new deployment is complete |
Which issue does this PR close?
Closes #695.
Rationale for this change
list_with_offset returns empty results on OneLake when both of the following conditions are met:
- Friendly-name URL: the workspace and item are addressed by name (e.g. onelake.blob.fabric.microsoft.com/MyWorkspace/lakehouse.Lakehouse/...) rather than by GUID
- startFrom query parameter: the List Blobs request includes startFrom, which OneLake silently ignores in this case, returning 200 OK with zero results rather than rejecting the request
(Note that when using GUID-based URLs (onelake.blob.fabric.microsoft.com/{workspace-guid}), startFrom works correctly.)
This regression was introduced in #623, which implemented optimizations using startFrom pushdown.
We (the OneLake team) are actively fixing startFrom with friendly-name URLs on our side.
In the meantime, object_store can fall back to client-side filtering for OneLake endpoints.
I created a tracking issue (#697) to re-enable this optimization for OneLake once the fix is complete.
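As a rough illustration of what the fallback means (this is a sketch, not the code in this PR): client-side filtering amounts to listing without startFrom and then dropping every entry that is not strictly greater than the offset. The helper name filter_after_offset and the use of plain strings for paths are assumptions made for the example; object_store itself works with its own path and metadata types.

```rust
/// Hypothetical helper, not part of object_store: keeps only the paths that
/// sort strictly after `offset`, mirroring the exclusive semantics expected
/// from `list_with_offset`.
fn filter_after_offset(paths: Vec<String>, offset: &str) -> Vec<String> {
    paths
        .into_iter()
        // Strict comparison: an entry equal to the offset is excluded.
        .filter(|p| p.as_str() > offset)
        .collect()
}
```

The trade-off is that the full listing is still transferred and only trimmed afterwards, which is why the pushdown should be re-enabled once the service-side fix is available.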
What changes are included in this PR?
- Detect OneLake endpoints (*.fabric.microsoft.com) and skip startFrom, falling back to client-side filtering; the same approach is already used for Azurite
- Add an #[ignore] integration test (test_onelake_list_with_offset) that verifies offset exclusivity, boundary cases, and ordering against a live OneLake endpoint
Tested locally
GUID-based URI works:
Friendly-name-based URI fails:
Are there any user-facing changes?
No, this is a behavioral change for list_with_offset against OneLake endpoints.
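For readers who want a picture of the endpoint check described under "What changes are included in this PR?", it could look roughly like the sketch below. Both function names are hypothetical and chosen for illustration only; the actual detection logic in this PR may be structured differently. The only detail taken from the description is the *.fabric.microsoft.com host pattern.

```rust
/// Hypothetical check: true when `host` is a subdomain of
/// fabric.microsoft.com, the pattern this PR uses to recognize OneLake.
fn is_onelake_host(host: &str) -> bool {
    // Host names are case-insensitive, so normalize before matching.
    host.to_ascii_lowercase().ends_with(".fabric.microsoft.com")
}

/// Hypothetical decision helper: push the offset down as `startFrom` only
/// when the endpoint is not OneLake; otherwise filter on the client.
fn use_start_from_pushdown(host: &str) -> bool {
    !is_onelake_host(host)
}
```

Matching on the host suffix disables the pushdown for GUID-based OneLake URLs as well, even though startFrom works there; that keeps the check simple at the cost of some listing efficiency.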