fix(azure): fall back to client-side filtering for list_with_offset on OneLake#696

Closed
kevinjqliu wants to merge 5 commits into apache:main from kevinjqliu:kevinjqliu/onelake-startFrom

@kevinjqliu kevinjqliu commented Apr 23, 2026

Contributor

Which issue does this PR close?

Closes #695.

Rationale for this change

list_with_offset returns empty results on OneLake when both of the following conditions are met:

  1. Friendly-name URLs — the endpoint addresses the workspace and lakehouse by display name (e.g. onelake.blob.fabric.microsoft.com/MyWorkspace/lakehouse.Lakehouse/...) rather than by GUID
  2. startFrom query parameter — the List Blobs request includes startFrom, which OneLake silently ignores in this case, returning 200 OK with zero results rather than rejecting the request

Note: with GUID-based URLs (onelake.blob.fabric.microsoft.com/{workspace-guid}), startFrom works correctly.

This regression was introduced in #623, which implemented optimizations using startFrom pushdown.

We (the OneLake team) are actively fixing startFrom with friendly-name URLs on our side.
In the meantime, object_store can fall back to client-side filtering for OneLake endpoints.
I created a tracking issue (#697) to re-enable this optimization for OneLake once the fix is complete.

What changes are included in this PR?

  • Auto-detect Fabric/OneLake endpoints (*.fabric.microsoft.com) and skip startFrom, falling back to client-side filtering -- same approach already used for Azurite
  • Add #[ignore] integration test (test_onelake_list_with_offset) that verifies offset exclusivity, boundary cases, and ordering against a live OneLake endpoint
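
For illustration, the fallback described above could look roughly like the following. This is a minimal Python sketch, not the Rust code in this PR: `is_fabric_endpoint`, `list_with_offset`, and the `list_blobs` callable are hypothetical names, and it assumes the offset is exclusive (as the test description suggests) while the server-side startFrom may return the offset key itself.

```python
from urllib.parse import urlparse

# Host suffix taken from the PR description (*.fabric.microsoft.com).
FABRIC_SUFFIX = ".fabric.microsoft.com"


def is_fabric_endpoint(url: str) -> bool:
    """Return True if the URL's host looks like a Fabric/OneLake endpoint."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(FABRIC_SUFFIX)


def list_with_offset(url, prefix, offset, list_blobs):
    """Return the keys under `prefix` that sort strictly after `offset`.

    `list_blobs(prefix, start_from)` is a hypothetical callable standing in
    for the List Blobs request; start_from=None means "no pushdown".
    """
    if is_fabric_endpoint(url):
        # Skip startFrom entirely and list the whole prefix.
        keys = list_blobs(prefix, None)
    else:
        # Let the server skip keys that sort before the offset.
        keys = list_blobs(prefix, offset)
    # Filter client-side in both cases so the offset stays exclusive.
    return [key for key in keys if key > offset]
```

The client-side filter is applied on both paths, so dropping the pushdown only costs extra listing work and does not change the result.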

Tested locally

Friendly-name URI fails:

export AZURE_STORAGE_TOKEN=$(az account get-access-token --resource https://storage.azure.com/ --query accessToken -o tsv) && \
ONELAKE_URL="https://msit-onelake.blob.fabric.microsoft.com/OLSTeamWorkspace/lh.Lakehouse/Files/test_startfrom" \
cargo test --features azure test_onelake_list_with_offset -- --ignored --no-capture

GUID-based URI works:

export AZURE_STORAGE_TOKEN=$(az account get-access-token --resource https://storage.azure.com/ --query accessToken -o tsv) && \
ONELAKE_URL="https://msit-onelake.blob.fabric.microsoft.com/45a753bc-c074-42cf-8b30-5dfa920b241f/3e56af9e-3832-47ed-b18c-dcc58b562e87/Files/test_startfrom" \
cargo test --features azure test_onelake_list_with_offset -- --ignored --no-capture

Are there any user-facing changes?

No API changes. The only behavioral change is that list_with_offset against OneLake endpoints now filters client-side instead of sending startFrom.

@kevinjqliu

Contributor Author

Needs a rebase onto #694 to pass CI

@kevinjqliu

Contributor Author

@tustvold could you take a look?

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/onelake-startFrom branch from 9b2d474 to c014f56 on April 23, 2026 21:06

@kevinjqliu

Contributor Author

rebased to pull in #694

@rtyler rtyler left a comment

Contributor

My only suggestion is to include an actual listing of the OneLake location used for the manual testing, in case anybody motivated enough comes along later and wants to repeat that manual test 😄

Is there a publicly linkable reference to the bug with the OneLake service that can be referenced as well?

I am cautiously optimistic that this could get into a 0.13.3 so some downstream breaks can be resolved ahead of 0.14

@tustvold

tustvold commented Jun 2, 2026

Contributor

Apologies for the delay in getting to this; I've been absolutely swamped at work. Is this still required, or has OneLake added support for this?

@shehabgamin

> is this still required or has onelake added support for this?

@tustvold It's still required for OneLake!

@alamb

alamb commented Jun 9, 2026

Contributor

@crepererum and I are working on a release -- I put this PR on the list and will try to move it along

@kevinjqliu

Contributor Author

I verified that this is fixed and deployed on the OneLake side. So this PR is no longer needed; we want users to take advantage of startFrom whenever possible

@kevinjqliu kevinjqliu closed this Jun 9, 2026
@shehabgamin

shehabgamin commented Jun 9, 2026

> I verified that this is fixed and deployed on the OneLake side. So this PR is no longer needed; we want users to take advantage of startFrom whenever possible

@kevinjqliu Delta reads still fail (I last tested on Sunday). Were you perhaps testing with an unreleased version of Object Store?

Or was there a very recent deploy/fix in OneLake within the last day?

EDIT:
Confirmed Delta reads are still broken as of today. Delta writes work though.

@kevinjqliu

Contributor Author

The fix is getting rolled out to production environments. It might not be there yet for some regions. Give it another try in a few days. I'll follow up next week and test it again.

@kevinjqliu kevinjqliu deleted the kevinjqliu/onelake-startFrom branch June 17, 2026 05:37
@kevinjqliu

Contributor Author

@shehabgamin I checked internally, this is now deployed

@shehabgamin

> @shehabgamin I checked internally, this is now deployed

@kevinjqliu Still doesn't work for me. Here's a reproducible example:

>>> import os
... os.environ["SAIL_CATALOG__LIST"] = (
...     '[{type="onelake", name="ls-onelake", '
...     'url="SOME_ONELAKE_URL/LakeSail.Lakehouse", '
...     'bearer_token="' + BEARER_TOKEN + '"}]'
... )
... os.environ["AZURE_STORAGE_TOKEN"] = BEARER_TOKEN
...
>>> from pysail.spark import SparkConnectServer
... from pyspark.sql import SparkSession
... server = SparkConnectServer(); server.start()
... _, port = server.listening_address
... spark = SparkSession.builder.remote(f"sc://localhost:{port}").getOrCreate()
...
[2026-06-23T21:41:59Z INFO sail_python::spark::server] Starting the Spark Connect server on 127.0.0.1:49728...
>>> spark.sql("SHOW NAMESPACES").show()
[2026-06-23T21:42:04Z INFO sail_session::session_manager::actor::handler] creating session bef5cd8d-c4fe-4bac-bf25-c98e7233c9dd

+----+----------+-----------+-----------+
|name|   catalog|description|locationUri|
+----+----------+-----------+-----------+
| dbo|ls-onelake|       NULL|       NULL|
+----+----------+-----------+-----------+

>>>
>>> spark.sql("SHOW TABLES IN dbo").show()

+--------+--------------------+-----------+
|database|           tableName|isTemporary|
+--------+--------------------+-----------+
|     dbo|iceberg_dim_products|      false|
|     dbo|  delta_dim_products|      false|
+--------+--------------------+-----------+

>>>
>>> df = spark.read.format("csv").option("header", "true").load(
...     "abfss://SOME_ONELAKE_URL@onelake.dfs.fabric.microsoft.com/"
...     "LakeSail.Lakehouse/Files/Dim_Products.csv"
... )
>>>
>>> df.write.format("delta").save(
...     "abfss://SOME_ONELAKE_URL@onelake.dfs.fabric.microsoft.com/"
...     "LakeSail.Lakehouse/Tables/dbo/delta_dim_products_test"
... )

[2026-06-23T21:42:46Z WARN sail_delta_lake::kernel::transaction] Post-commit: failed to load state for version 0 (post-commit activities skipped): Missing commit file: expected final version 0, replay reached -1
>>>
>>> spark.sql("SHOW TABLES IN dbo").show()
+--------+--------------------+-----------+
|database|           tableName|isTemporary|
+--------+--------------------+-----------+
|     dbo|delta_dim_product...|      false|
|     dbo|iceberg_dim_products|      false|
|     dbo|  delta_dim_products|      false|
+--------+--------------------+-----------+

>>> spark.read.format("delta").load(
...     "abfss://SOME_ONELAKE_URL@onelake.dfs.fabric.microsoft.com/"
...     "LakeSail.Lakehouse/Tables/dbo/delta_dim_products_test"
... ).show(5)
Traceback (most recent call last):
... [TRUNCATED]
pyspark.errors.exceptions.connect.AnalysisException: Invalid table location: No commit files found in _delta_log
>>>

@maartenhubrechts-anb

maartenhubrechts-anb commented Jun 30, 2026

Just hit this too, so adding another data point. Still broken for me as of 2026-06-30, different tenant from LakeSail, North Europe region.

deltalake 1.6.0 (object_store Azure backend) fails with Generic delta kernel error: No files in log segment. I put a TLS-intercepting proxy in front of it to see what it actually sends. The log listing request is this:

GET /<workspace>?restype=container&comp=list
    &prefix=<lakehouse>/Tables/<schema>/<table>/_delta_log/
    &startFrom=<lakehouse>/Tables/<schema>/<table>/_delta_log/00000000000000000000

I replayed that same List Blobs call by hand with the same token, only changing startFrom:

| startFrom | blobs returned |
|---|---|
| (omitted) | …/_delta_log/00000000000000000000.json, …0001.json |
| …/_delta_log/00000000000000000000 (what object_store sends) | none |
| …/_delta_log/ (the prefix itself) | none |
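
For anyone wanting to repeat the replay, the three request URLs could be built roughly like this. It is a sketch only: `list_blobs_url` is a hypothetical helper, the workspace and table path in the usage note are placeholders, and actually sending the request still needs the same bearer token as in the capture.

```python
from urllib.parse import urlencode


def list_blobs_url(endpoint, workspace, prefix, start_from=None):
    """Build a List Blobs URL shaped like the captured request.

    `start_from=None` leaves the startFrom parameter out entirely.
    """
    params = {"restype": "container", "comp": "list", "prefix": prefix}
    if start_from is not None:
        params["startFrom"] = start_from
    # urlencode percent-encodes "/" in the values, which is valid in a query string.
    return f"{endpoint}/{workspace}?{urlencode(params)}"
```

For example, with a placeholder prefix `log = "lh.Lakehouse/Tables/dbo/t/_delta_log/"`, the three rows correspond to `start_from=None`, `start_from=log + "00000000000000000000"`, and `start_from=log`.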

So any non-empty startFrom comes back empty, even values that sort before the actual files (…00000000000000000000 is less than …00000000000000000000.json, so those should still be returned). And the empty response is HTTP 200 with <Blobs /> and an empty <NextMarker />, so it's a final page, not a pagination thing that object_store is failing to follow.
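
The ordering argument can be checked with a small model of what a listing should return. This is a sketch under assumptions: `expected_listing` is a hypothetical helper, keys are compared lexicographically, and startFrom is treated as inclusive. It models the expected semantics, not OneLake's implementation.

```python
def expected_listing(keys, prefix, start_from=None):
    """Return keys under `prefix` that sort at or after `start_from`."""
    selected = sorted(key for key in keys if key.startswith(prefix))
    if start_from is not None:
        # A shorter string sorts before any longer string it is a prefix of,
        # so "…/000" sorts before "…/000.json".
        selected = [key for key in selected if key >= start_from]
    return selected
```

Under this model, both a startFrom equal to the prefix and one equal to the version stem without `.json` should return every commit file, which is the opposite of the empty pages observed.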

Same result on two table types: a Lakehouse schema-shortcut table (GUID abfss URL) and a Warehouse table (friendly-name abfss URL).

So whatever got deployed, startFrom on _delta_log is still returning nothing here. Can grab more captures if that helps.

Update 2026-07-03: Further narrowed this down. Same code, same auth, both GUID abfss URLs:

  • native Lakehouse table → reads fine
  • schema-shortcut table → No files in log segment

Also reproduced independently of delta-rs with DuckDB's delta extension (which uses delta-kernel-rs / object_store as well).

@kevinjqliu

Contributor Author

Thank you both for verifying again. It seems there was a deployment issue and we had to roll back. I will update again once the new deployment is complete.

Development

Successfully merging this pull request may close these issues.

MicrosoftAzure::list_with_offset returns empty on OneLake since 0.13.0 (regression from #623)

6 participants