<!--
Thanks for opening a pull request!
-->
<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->
# Rationale for this change
Something changed that caused the CI to fail:
https://github.com/apache/iceberg-python/commits/main/
This is a first attempt to isolate the issue by checking whether it is related to coverage.
# Are these changes tested?
# Are there any user-facing changes?
<!-- In the case of user-facing changes, please add the changelog label.
-->
While working on apache#2004 I've noticed some small discrepancies that I think would be good to address in a separate PR.
# Rationale for this change
# Are these changes tested?
# Are there any user-facing changes?
---------
Co-authored-by: Kevin Liu <kevin.jq.liu@gmail.com>
# Rationale for this change
Resolves apache#1946
# Are these changes tested?
Yes, using a test that used to fail before :)
# Are there any user-facing changes?
Closes apache#2028
# Rationale for this change
Return the expected result, aligned with the Spark implementation.
This PR fixes a bug where predicate evaluation for a column that is
missing from the Parquet file schema returned no results. This was due
to the `_ColumnNameTranslator` visitor returning `AlwaysFalse` when the
column could not be found in the file schema. The solution is to pass in
the projected field value for evaluation. This follows the order of
operations described in
https://iceberg.apache.org/spec/#column-projection
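To make the behaviour change concrete, here is a minimal, self-contained sketch of the idea. It is a toy model, not the actual `_ColumnNameTranslator` code: the `EqualTo`, `AlwaysTrue` and `AlwaysFalse` classes below are local stand-ins, and `translate`, `file_columns` and `projected` are hypothetical names chosen for illustration. It assumes that a column with no projected value is treated as null.

```python
from dataclasses import dataclass
from typing import Any, Dict, Set, Union


@dataclass(frozen=True)
class EqualTo:
    """Toy predicate: column == literal (stand-in, not the library class)."""
    column: str
    literal: Any


@dataclass(frozen=True)
class AlwaysTrue:
    """Toy constant predicate that matches every row."""


@dataclass(frozen=True)
class AlwaysFalse:
    """Toy constant predicate that matches no row."""


Predicate = Union[EqualTo, AlwaysTrue, AlwaysFalse]


def translate(pred: EqualTo, file_columns: Set[str], projected: Dict[str, Any]) -> Predicate:
    """Resolve a predicate against one data file (hypothetical helper).

    If the column exists in the file, the predicate is kept and evaluated
    against the file's rows. If it is missing, the predicate is evaluated
    once against the projected constant instead of being rejected outright.
    """
    if pred.column in file_columns:
        return pred
    # Assumption: a missing column without a projected value behaves as null,
    # and null never equals a literal.
    value = projected.get(pred.column)
    if value is not None and value == pred.literal:
        return AlwaysTrue()
    return AlwaysFalse()
```

With this model, `translate(EqualTo("region", "eu"), {"id"}, {"region": "eu"})` yields `AlwaysTrue()`, so the file's rows are kept, whereas returning `AlwaysFalse` unconditionally for a missing column would drop them.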
# Are these changes tested?
I checked it with the script attached to the issue, and a new test was added.
Yes, added some unit tests for
`_ColumnNameTranslator`/`translate_column_names`
Added a test for predicate evaluation for projected columns.
# Are there any user-facing changes?
Kind of, yes: the results of some scans are now different.
---------
Co-authored-by: Roman Shanin <rshanin@bhft.com>
Co-authored-by: Kevin Liu <kevin.jq.liu@gmail.com>
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Bumps [mypy-boto3-dynamodb](https://github.com/youtype/mypy_boto3_builder) from 1.39.0 to 1.40.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/youtype/mypy_boto3_builder/releases">mypy-boto3-dynamodb's releases</a>.</em></p> <blockquote> <h2>8.8.0 - Python 3.8 runtime is back</h2> <h3>Changed</h3> <ul> <li><code>[services]</code> <code>install_requires</code> section is calculated based on dependencies in use, so <code>typing-extensions</code> version is set properly</li> <li><code>[all]</code> Replaced <code>typing</code> imports with <code>collections.abc</code> with a fallback to <code>typing</code> for Python <3.9</li> <li><code>[all]</code> Added aliases for <code>builtins.list</code>, <code>builtins.set</code>, <code>builtins.dict</code>, and <code>builtins.type</code>, so Python 3.8 runtime should work as expected again (reported by <a href="https://github.com/YHallouard"><code>@YHallouard</code></a> in <a href="https://redirect.github.com/youtype/mypy_boto3_builder/issues/340">#340</a> and <a href="https://github.com/Omri-Ben-Yair"><code>@Omri-Ben-Yair</code></a> in <a href="https://redirect.github.com/youtype/mypy_boto3_builder/issues/336">#336</a>)</li> <li><code>[all]</code> Unions use the same type annotations as the rest of the structures due to proper fallbacks</li> </ul> <h3>Fixed</h3> <ul> <li><code>[services]</code> Universal input/output shapes were not replaced properly in service subresources</li> <li><code>[docs]</code> Simplified doc links rendering for services</li> <li><code>[services]</code> Cleaned up unnecessary imports in <code>client.pyi</code></li> <li><code>[builder]</code> Import records with fallback are always rendered</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li>See full diff in <a href="https://github.com/youtype/mypy_boto3_builder/commits">compare view</a></li> </ul> </details> <br /> 
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 3.0.1 to 3.1.3. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pypa/cibuildwheel/releases">pypa/cibuildwheel's releases</a>.</em></p> <blockquote> <h2>v3.1.3</h2> <ul> <li>🐛 Fix bug where "latest" dependencies couldn't update to pip 25.2 on Windows (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2537">#2537</a>)</li> <li>🛠 Use pytest-rerunfailures to improve some of our iOS/Android tests (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2527">#2527</a>, <a href="https://redirect.github.com/pypa/cibuildwheel/issues/2539">#2539</a>)</li> <li>🛠 Remove some GraalPy Windows workarounds in our tests (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2501">#2501</a>)</li> </ul> <h2>v3.1.2</h2> <ul> <li>⚠️ Add an error if <code>CIBW_FREE_THREADING_SUPPORT</code> is set; you are likely missing 3.13t wheels, please use the <code>enable</code>/<code>CIBW_ENABLE</code> (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2520">#2520</a>)</li> <li>🛠 <code>riscv64</code> now enabled if you target that architecture, it's now supported on PyPI (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2509">#2509</a>)</li> <li>🛠 Add warning when using <code>cpython-experimental-riscv64</code> (no longer needed) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2526">#2526</a>, <a href="https://redirect.github.com/pypa/cibuildwheel/issues/2528">#2528</a>)</li> <li>🛠 iOS versions bumped, fixing issues with 3.14 (now RC 1) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2530">#2530</a>)</li> <li>🐛 Fix bug in Android running wheel from our GitHub Action (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2517">#2517</a>)</li> <li>🐛 Fix warning when using <code>test-skip</code> of <code>"*-macosx_universal2:arm64"</code> (<a 
href="https://redirect.github.com/pypa/cibuildwheel/issues/2522">#2522</a>)</li> <li>🐛 Fix incorrect number of wheels reported in logs, again (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2517">#2517</a>)</li> <li>📚 We welcome our Android platform maintainer (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2516">#2516</a>)</li> </ul> <h2>v3.1.1</h2> <ul> <li>🐛 Fix a bug showing an incorrect wheel count at the end of execution, and misrepresenting test-only runs in the GitHub Action summary (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2512">#2512</a>)</li> <li>📚 Docs fix (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2510">#2510</a>)</li> </ul> <h2>v3.1.0</h2> <ul> <li>🌟 CPython 3.14 wheels are now built by default - without the <code>"cpython-prerelease"</code> <code>enable</code> set. It's time to build and upload these wheels to PyPI! This release includes CPython 3.14.0rc1, which is guaranteed to be ABI compatible with the final release. (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2507">#2507</a>) Free-threading is no longer experimental in 3.14, so you have to skip it explicitly with <code>'cp31?t-*'</code> if you don't support it yet. (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2503">#2503</a>)</li> <li>🌟 Adds the ability to <a href="https://cibuildwheel.pypa.io/en/stable/platforms/#android">build wheels for Android</a>! Set the <a href="https://cibuildwheel.pypa.io/en/stable/options/#platform"><code>platform</code> option</a> to <code>android</code> on Linux or macOS to try it out! 
(<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2349">#2349</a>)</li> <li>🌟 Adds Pyodide 0.28, which builds 3.13 wheels (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2487">#2487</a>)</li> <li>✨ Support for 32-bit <code>manylinux_2_28</code> (now a consistent default) and <code>manylinux_2_34</code> added (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2500">#2500</a>)</li> <li>🛠 Improved summary, will also use markdown summary output on GHA (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2469">#2469</a>)</li> <li>🛠 The riscv64 images now have a working default (as they are now part of pypy/manylinux), but are still experimental (and behind an <code>enable</code>) since you can't push them to PyPI yet (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2506">#2506</a>)</li> <li>🛠 Fixed a typo in the 3.9 MUSL riscv64 identifier (<code>cp39-musllinux_ricv64</code> -> <code>cp39-musllinux_riscv64</code>) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2490">#2490</a>)</li> <li>🛠 Mistyping <code>--only</code> now shows the correct possibilities, and even suggests near matches on Python 3.14+ (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2499">#2499</a>)</li> <li>🛠 Only support one output from the repair step on linux like other platforms; auditwheel fixed this over four years ago! (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2478">#2478</a>)</li> <li>🛠 We now use pattern matching extensively (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2434">#2434</a>)</li> <li>📚 We now have platform maintainers for our special platforms and interpreters! 
(<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2481">#2481</a>)</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md">pypa/cibuildwheel's changelog</a>.</em></p> <blockquote> <h3>v3.1.3</h3> <p><em>1 August 2025</em></p> <ul> <li>🐛 Fix bug where "latest" dependencies couldn't update to pip 25.2 on Windows (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2537">#2537</a>)</li> <li>🛠 Use pytest-rerunfailures to improve some of our iOS/Android tests (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2527">#2527</a>, <a href="https://redirect.github.com/pypa/cibuildwheel/issues/2539">#2539</a>)</li> <li>🛠 Remove some GraalPy Windows workarounds in our tests (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2501">#2501</a>)</li> </ul> <h3>v3.1.2</h3> <p><em>29 July 2025</em></p> <ul> <li>⚠️ Add an error if <code>CIBW_FREE_THREADING_SUPPORT</code> is set; you are likely missing 3.13t wheels, please use the <code>enable</code>/<code>CIBW_ENABLE</code> (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2520">#2520</a>)</li> <li>🛠 <code>riscv64</code> now enabled if you target that architecture, it's now supported on PyPI (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2509">#2509</a>)</li> <li>🛠 Add warning when using <code>cpython-experimental-riscv64</code> (no longer needed) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2526">#2526</a>, <a href="https://redirect.github.com/pypa/cibuildwheel/issues/2528">#2528</a>)</li> <li>🛠 iOS versions bumped, fixing issues with 3.14 (now RC 1) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2530">#2530</a>)</li> <li>🐛 Fix bug in Android running wheel from our GitHub Action (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2517">#2517</a>)</li> <li>🐛 Fix warning when using <code>test-skip</code> 
of <code>"*-macosx_universal2:arm64"</code> (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2522">#2522</a>)</li> <li>🐛 Fix incorrect number of wheels reported in logs, again (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2517">#2517</a>)</li> <li>📚 We welcome our Android platform maintainer (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2516">#2516</a>)</li> </ul> <h3>v3.1.1</h3> <p><em>24 July 2025</em></p> <ul> <li>🐛 Fix a bug showing an incorrect wheel count at the end of execution, and misrepresenting test-only runs in the GitHub Action summary (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2512">#2512</a>)</li> <li>📚 Docs fix (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2510">#2510</a>)</li> </ul> <h3>v3.1.0</h3> <p><em>23 July 2025</em></p> <ul> <li>🌟 CPython 3.14 wheels are now built by default - without the <code>"cpython-prerelease"</code> <code>enable</code> set. It's time to build and upload these wheels to PyPI! This release includes CPython 3.14.0rc1, which is guaranteed to be ABI compatible with the final release. (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2507">#2507</a>) Free-threading is no longer experimental in 3.14, so you have to skip it explicitly with <code>'cp31?t-*'</code> if you don't support it yet. (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2503">#2503</a>)</li> <li>🌟 Adds the ability to <a href="https://cibuildwheel.pypa.io/en/stable/platforms/#android">build wheels for Android</a>! Set the <a href="https://cibuildwheel.pypa.io/en/stable/options/#platform"><code>platform</code> option</a> to <code>android</code> on Linux or macOS to try it out! 
(<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2349">#2349</a>)</li> <li>🌟 Adds Pyodide 0.28, which builds 3.13 wheels (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2487">#2487</a>)</li> <li>✨ Support for 32-bit <code>manylinux_2_28</code> (now a consistent default) and <code>manylinux_2_34</code> added (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2500">#2500</a>)</li> <li>🛠 Improved summary, will also use markdown summary output on GHA (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2469">#2469</a>)</li> <li>🛠 The riscv64 images now have a working default (as they are now part of pypy/manylinux), but are still experimental (and behind an <code>enable</code>) since you can't push them to PyPI yet (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2506">#2506</a>)</li> <li>🛠 Fixed a typo in the 3.9 MUSL riscv64 identifier (<code>cp39-musllinux_ricv64</code> -> <code>cp39-musllinux_riscv64</code>) (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2490">#2490</a>)</li> <li>🛠 Mistyping <code>--only</code> now shows the correct possibilities, and even suggests near matches on Python 3.14+ (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2499">#2499</a>)</li> <li>🛠 Only support one output from the repair step on linux like other platforms; auditwheel fixed this over four years ago! (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2478">#2478</a>)</li> <li>🛠 We now use pattern matching extensively (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2434">#2434</a>)</li> <li>📚 We now have platform maintainers for our special platforms and interpreters! 
(<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2481">#2481</a>)</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pypa/cibuildwheel/commit/352e01339f0a173aa2a3eb57f01492e341e83865"><code>352e013</code></a> Bump version: v3.1.3</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/c463e56ba22f7f7e6c8871b006a06384c08cff34"><code>c463e56</code></a> tests: another iOS flaky spot (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2539">#2539</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/8c5c738023fee8aad6412105b42ea798066b1438"><code>8c5c738</code></a> docs(project): add Falcon to working examples (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2538">#2538</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/feeb3992a7ea36ffbc9d4446debea40f9aa24861"><code>feeb399</code></a> tests: add flaky test handling (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2527">#2527</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/60b9cc95db51f9f5e48562fcb1b3f7ac3f9cb4a1"><code>60b9cc9</code></a> fix: never call pip directly (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2537">#2537</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/e2c7102ed7981cd79d28a5eb0a196f8242b1adab"><code>e2c7102</code></a> chore: remove some GraalPy Windows workarounds. 
(<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2501">#2501</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/9e4e50bd76b3190f55304387e333f6234823ea9b"><code>9e4e50b</code></a> Bump version: v3.1.2</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/8ef9414f60b366420233447f0abd96586ed394c7"><code>8ef9414</code></a> [pre-commit.ci] pre-commit autoupdate (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2532">#2532</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/1953c0497215dcf2711e1fbfd3ae8952e8ad604c"><code>1953c04</code></a> Adding <a href="https://github.com/mhsmith"><code>@mhsmith</code></a> as platform maintainer for Android (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2516">#2516</a>)</li> <li><a href="https://github.com/pypa/cibuildwheel/commit/46a6d279953e2947496fa28a22ded264f4027a5f"><code>46a6d27</code></a> Bump iOS support package versions. (<a href="https://redirect.github.com/pypa/cibuildwheel/issues/2530">#2530</a>)</li> <li>Additional commits viewable in <a href="https://github.com/pypa/cibuildwheel/compare/v3.0.1...v3.1.3">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [griffe](https://github.com/mkdocstrings/griffe) from 1.7.3 to 1.9.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/mkdocstrings/griffe/releases">griffe's releases</a>.</em></p> <blockquote> <h2>1.9.0</h2> <h2><a href="https://github.com/mkdocstrings/griffe/releases/tag/1.9.0">1.9.0</a> - 2025-07-28</h2> <p><!-- raw HTML omitted --><a href="https://github.com/mkdocstrings/griffe/compare/1.8.0...1.9.0">Compare with 1.8.0</a><!-- raw HTML omitted --></p> <h3>Features</h3> <ul> <li>Support PEP 695 generics (<a href="https://github.com/mkdocstrings/griffe/commit/be28e9c9835a709fca0a78990c56e8d652a71a8c">be28e9c</a> by Victor Westerhuis). <a href="https://redirect.github.com/mkdocstrings/griffe/issues/342">Issue-342</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/348">PR-348</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> </ul> <h2>1.8.0</h2> <h2><a href="https://github.com/mkdocstrings/griffe/releases/tag/1.8.0">1.8.0</a> - 2025-07-23</h2> <p><!-- raw HTML omitted --><a href="https://github.com/mkdocstrings/griffe/compare/1.7.3...1.8.0">Compare with 1.7.3</a><!-- raw HTML omitted --></p> <h3>Features</h3> <ul> <li>Add method to functions and classes to build and return a stringified signature (<a href="https://github.com/mkdocstrings/griffe/commit/8ef1486e9b1f0872cca3b1cd2419144b702a0c1e">8ef1486</a> by ISOREX). <a href="https://github.com/mkdocstrings/griffe/discussions/376">Discussion-376</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/381">PR-381</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> <li>Enhance Sphinx-style parameter parsing to handle invalid type info (<a href="https://github.com/mkdocstrings/griffe/commit/cbce5a2c2429dc92e15ac3a8fe53db55825ebd6c">cbce5a2</a> by Edouard Choinière). 
<a href="https://redirect.github.com/mkdocstrings/griffe/pull/396">PR-396</a></li> <li>Parse Sphinx parameter types as expressions (<a href="https://github.com/mkdocstrings/griffe/commit/70dda21d15dfdf5807dde370fb636d69eea6272b">70dda21</a> by Edouard Choinière). <a href="https://redirect.github.com/mkdocstrings/griffe/pull/392">PR-392</a></li> </ul> <h3>Bug Fixes</h3> <ul> <li>Avoid SyntaxError when loading modules encoded in UTF8 with BOM (<a href="https://github.com/mkdocstrings/griffe/commit/b3461901ae08204ea6184025a006f5d34152d30d">b346190</a> by John Hennig). <a href="https://redirect.github.com/mkdocstrings/griffe/issues/386">Issue-386</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/387">PR-387</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> <li>Correctly parenthesize expressions (<a href="https://github.com/mkdocstrings/griffe/commit/a8c5585c8a45a4d6b67bd5dc36d7054478d3873d">a8c5585</a> by Abraham Cheung). <a href="https://redirect.github.com/mkdocstrings/griffe/pull/389">PR-389</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> </ul> <h3>Code Refactoring</h3> <ul> <li>Be more consistent regarding not overriding submodules with aliases (<a href="https://github.com/mkdocstrings/griffe/commit/be1963cca6d7d49bcc41fdf05570b1bfba934330">be1963c</a> by Timothée Mazzucotelli).</li> <li>Allow <code>ExprName.parent</code> to be of type <code>griffe.Function</code> (<a href="https://github.com/mkdocstrings/griffe/commit/acafbd8b6d97fe8370f3eb730e2154e19b2c1a54">acafbd8</a> by Edouard Choinière). 
<a href="https://github.com/mkdocstrings/griffe/discussions/391">Issue-391</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/395">PR-395</a></li> <li>Normalize labels for attributes (<a href="https://github.com/mkdocstrings/griffe/commit/1b376cd39ce99730910d8344abbfd5c53ce28300">1b376cd</a> by Timothée Mazzucotelli).</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/mkdocstrings/griffe/blob/main/CHANGELOG.md">griffe's changelog</a>.</em></p> <blockquote> <h2><a href="https://github.com/mkdocstrings/griffe/releases/tag/1.9.0">1.9.0</a> - 2025-07-28</h2> <p><!-- raw HTML omitted --><a href="https://github.com/mkdocstrings/griffe/compare/1.8.0...1.9.0">Compare with 1.8.0</a><!-- raw HTML omitted --></p> <h3>Features</h3> <ul> <li>Support PEP 695 generics (<a href="https://github.com/mkdocstrings/griffe/commit/be28e9c9835a709fca0a78990c56e8d652a71a8c">be28e9c</a> by Victor Westerhuis). <a href="https://redirect.github.com/mkdocstrings/griffe/issues/342">Issue-342</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/348">PR-348</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> </ul> <h2><a href="https://github.com/mkdocstrings/griffe/releases/tag/1.8.0">1.8.0</a> - 2025-07-23</h2> <p><!-- raw HTML omitted --><a href="https://github.com/mkdocstrings/griffe/compare/1.7.3...1.8.0">Compare with 1.7.3</a><!-- raw HTML omitted --></p> <h3>Features</h3> <ul> <li>Add method to functions and classes to build and return a stringified signature (<a href="https://github.com/mkdocstrings/griffe/commit/8ef1486e9b1f0872cca3b1cd2419144b702a0c1e">8ef1486</a> by ISOREX). 
<a href="https://github.com/mkdocstrings/griffe/discussions/376">Discussion-376</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/381">PR-381</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> <li>Enhance Sphinx-style parameter parsing to handle invalid type info (<a href="https://github.com/mkdocstrings/griffe/commit/cbce5a2c2429dc92e15ac3a8fe53db55825ebd6c">cbce5a2</a> by Edouard Choinière). <a href="https://redirect.github.com/mkdocstrings/griffe/pull/396">PR-396</a></li> <li>Parse Sphinx parameter types as expressions (<a href="https://github.com/mkdocstrings/griffe/commit/70dda21d15dfdf5807dde370fb636d69eea6272b">70dda21</a> by Edouard Choinière). <a href="https://redirect.github.com/mkdocstrings/griffe/pull/392">PR-392</a></li> </ul> <h3>Bug Fixes</h3> <ul> <li>Avoid SyntaxError when loading modules encoded in UTF8 with BOM (<a href="https://github.com/mkdocstrings/griffe/commit/b3461901ae08204ea6184025a006f5d34152d30d">b346190</a> by John Hennig). <a href="https://redirect.github.com/mkdocstrings/griffe/issues/386">Issue-386</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/387">PR-387</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> <li>Correctly parenthesize expressions (<a href="https://github.com/mkdocstrings/griffe/commit/a8c5585c8a45a4d6b67bd5dc36d7054478d3873d">a8c5585</a> by Abraham Cheung). 
<a href="https://redirect.github.com/mkdocstrings/griffe/pull/389">PR-389</a>, Co-authored-by: Timothée Mazzucotelli <a href="mailto:dev@pawamoy.fr">dev@pawamoy.fr</a></li> </ul> <h3>Code Refactoring</h3> <ul> <li>Be more consistent regarding not overriding submodules with aliases (<a href="https://github.com/mkdocstrings/griffe/commit/be1963cca6d7d49bcc41fdf05570b1bfba934330">be1963c</a> by Timothée Mazzucotelli).</li> <li>Allow <code>ExprName.parent</code> to be of type <code>griffe.Function</code> (<a href="https://github.com/mkdocstrings/griffe/commit/acafbd8b6d97fe8370f3eb730e2154e19b2c1a54">acafbd8</a> by Edouard Choinière). <a href="https://github.com/mkdocstrings/griffe/discussions/391">Issue-391</a>, <a href="https://redirect.github.com/mkdocstrings/griffe/pull/395">PR-395</a></li> <li>Normalize labels for attributes (<a href="https://github.com/mkdocstrings/griffe/commit/1b376cd39ce99730910d8344abbfd5c53ce28300">1b376cd</a> by Timothée Mazzucotelli).</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/mkdocstrings/griffe/commit/032779aa3bdfbdaeb6411da4f8853318ff2e8424"><code>032779a</code></a> chore: Prepare release 1.9.0</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/be28e9c9835a709fca0a78990c56e8d652a71a8c"><code>be28e9c</code></a> feat: Support PEP 695 generics</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/27a385b435d3c09eedeffe980416f7012e65ed6f"><code>27a385b</code></a> chore: Prepare release 1.8.0</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/8ef1486e9b1f0872cca3b1cd2419144b702a0c1e"><code>8ef1486</code></a> feat: Add method to functions and classes to build and return a stringified s...</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/b3461901ae08204ea6184025a006f5d34152d30d"><code>b346190</code></a> fix: Avoid SyntaxError when loading modules encoded in UTF8 with BOM</li> <li><a 
href="https://github.com/mkdocstrings/griffe/commit/a8c5585c8a45a4d6b67bd5dc36d7054478d3873d"><code>a8c5585</code></a> fix: Correctly parenthesize expressions</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/0a051861d4fc3e064f2d19d53e1abab112316771"><code>0a05186</code></a> ci: Ignore Mypy warnings</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/cbce5a2c2429dc92e15ac3a8fe53db55825ebd6c"><code>cbce5a2</code></a> feat: Enhance Sphinx-style parameter parsing to handle invalid type info</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/2d77bf16a0b6f823e2bbe54a49f2c26e4cd1e290"><code>2d77bf1</code></a> Merge branch 'main' of github.com:mkdocstrings/griffe</li> <li><a href="https://github.com/mkdocstrings/griffe/commit/be1963cca6d7d49bcc41fdf05570b1bfba934330"><code>be1963c</code></a> refactor: Be more consistent regarding not overriding submodules with aliases</li> <li>Additional commits viewable in <a href="https://github.com/mkdocstrings/griffe/compare/1.7.3...1.9.0">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.6.15 to 9.6.16. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/squidfunk/mkdocs-material/releases">mkdocs-material's releases</a>.</em></p> <blockquote> <h2>mkdocs-material-9.6.16</h2> <ul> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8349">#8349</a>: Info plugin doesn't correctly detect virtualenv in some cases</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8334">#8334</a>: Find-in-page detects matches in hidden search result list</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG">mkdocs-material's changelog</a>.</em></p> <blockquote> <p>mkdocs-material-9.6.16 (2025-07-26)</p> <ul> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8349">#8349</a>: Info plugin doesn't correctly detect virtualenv in some cases</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8334">#8334</a>: Find-in-page detects matches in hidden search result list</li> </ul> <p>mkdocs-material-9.6.15 (2025-07-01)</p> <ul> <li>Updated Mongolian translations</li> <li>Improved semantic markup of "edit this page" button</li> <li>Improved info plugin virtual environment resolution</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8291">#8291</a>: Large font size setting throws of breakpoints in JavaScript</li> </ul> <p>mkdocs-material-9.6.14 (2025-05-13)</p> <ul> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8215">#8215</a>: Social plugin crashes when CairoSVG is updated to 2.8</li> </ul> <p>mkdocs-material-9.6.13 (2025-05-10)</p> <ul> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8204">#8204</a>: Annotations showing list 
markers in print view</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8153">#8153</a>: Improve style of cardinality symbols in Mermaid.js ER diagrams</li> </ul> <p>mkdocs-material-9.6.12 (2025-04-17)</p> <ul> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8158">#8158</a>: Flip footnote back reference icon for right-to-left languages</li> </ul> <p>mkdocs-material-9.6.11 (2025-04-01)</p> <ul> <li>Updated Docker image to latest Alpine Linux</li> <li>Bump required Jinja version to 3.1</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8133">#8133</a>: Jinja filter <code>items</code> not available (9.6.10 regression)</li> <li>Fixed <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8128">#8128</a>: Search plugin not entirely disabled via enabled setting</li> </ul> <p>mkdocs-material-9.6.10 (2025-03-30)</p> <p>This version is a pure refactoring release, and does not contain new features or bug fixes. It strives to improve the compatibility of our templates with alternative Jinja-like template engines that we're currently exploring, including minijinja.</p> <p>Additionally, it replaces several instances of Python function invocations with idiomatic use of template filters. All instances where variables have been mutated inside templates have been replaced. Most changes have been made in partials, and only a few in blocks, and all of them are fully backward compatible, so no changes to overrides are necessary.</p> <p>Note that this release does not replace the Jinja template engine with minijinja. However, our templates are now 99% compatible with minijinja, which means we can explore alternative Jinja-compatible implementations. Additionally, immutability and removal of almost all Python function invocations means much more idiomatic templating.</p> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/8b949816ba16ad46e7c4c72912e5bd8c6254dcfd"><code>8b94981</code></a> Prepare 9.6.16 release</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/e5c7ab542745013397b7e7b284960ed57de306c1"><code>e5c7ab5</code></a> Updated dependencies</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/53385529f0481e89ff6fba49f855fd0dbbd16cdb"><code>5338552</code></a> Fixed handling of inconsistent drive letter case</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/6d4f756461dd556e459d4769d6e1ec7ce7bce4ae"><code>6d4f756</code></a> Fixed dotpath venv guessing</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/88bdcf5f16696eb540fcbf8bc24244dc1c5f965f"><code>88bdcf5</code></a> Fixed empty username fallback</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/d0c4bd618fd01b849ba85aee5275c23481214eea"><code>d0c4bd6</code></a> Merge pull request <a href="https://redirect.github.com/squidfunk/mkdocs-material/issues/8346">#8346</a> from squidfunk/dependabot/npm_and_yarn/form-data-3.0.4</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/9c1e4deb5c96c6baa076431f0cb4ede20ba35fab"><code>9c1e4de</code></a> Bump form-data from 3.0.1 to 3.0.4</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/b2d235eb65d717202fedbe70c6856f2560e66c40"><code>b2d235e</code></a> Updated Premium sponsors</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/e54ff0632c46e811e6adc7a104c0bd567ed22fc0"><code>e54ff06</code></a> Updated Premium sponsors</li> <li><a href="https://github.com/squidfunk/mkdocs-material/commit/212b7ab8f74a78a56e8f09b032633bc69289639a"><code>212b7ab</code></a> Updated Premium sponsors</li> <li>Additional commits viewable in <a href="https://github.com/squidfunk/mkdocs-material/compare/9.6.15...9.6.16">compare view</a></li> 
</ul> </details> <br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pyiceberg-core](https://rust.iceberg.apache.org) from 0.5.1 to 0.6.0.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Rationale for this change
Just a small bump, noticed this while working on some other stuff
# Are these changes tested?
# Are there any user-facing changes?
# Rationale for this change
Added this wayyyy back: apache/iceberg#5570. I think this is nicer than the full Java classpath
# Are these changes tested?
# Are there any user-facing changes?
Identified two issues that can be worked on in parallel
---------
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
Closes apache#2123
# Rationale for this change
Fixing sanitization behaviour to match specification and Java
implementation
# Are these changes tested?
Yes - Unit and integration tests
# Are there any user-facing changes?
Yes - Field names will be sanitized to be Avro compatible if not already
---------
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
# Rationale for this change
This refactors `_get_column_projection_values` to rely on field IDs rather than names. Field IDs never change, while partition and column names can be updated during a table's lifetime.
# Are these changes tested?
# Are there any user-facing changes?
# Rationale for this change
I noticed we just passed in the value without setting the type explicitly. By default, PyArrow will, for example, upcast 1 to an int64 field, while the column is of type int32 in the table.
# Are these changes tested?
# Are there any user-facing changes?
---------
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
# Rationale for this change
Missed this in another PR
# Are these changes tested?
# Are there any user-facing changes?
)
Closes apache#1853
This adds a new repr function that ensures that `initial-default` and `write-default` will not appear if they are None. Unfortunately, this functionality isn't baked into Pydantic.
# Rationale for this change
`__repr__` changes may be breaking.
# Are these changes tested?
Tests included.
# Are there any user-facing changes?
Closes apache#2270
Related to apache#1045
# Rationale for this change
This allows us to read nanosecond information from pyarrow. Right now,
we always downcast to microseconds or throw an error. By passing through
the format-version, we can grab nanosecond precision *just for v3
tables*
# Are these changes tested?
Included a test. I can't do a test involving writing since we don't
support v3 writing yet (there's a PR out for that)
# Are there any user-facing changes?
---------
Co-authored-by: Fokko Driesprong <fokko@apache.org>
# Rationale for this change
Similar to apache#2299
This PR adds the rest of the parameters to
[`pyarrow.fs.AzureFileSystem`](https://arrow.apache.org/docs/python/generated/pyarrow.fs.AzureFileSystem.html)
Note the [Azure Data Lake configuration
page](https://github.com/apache/iceberg-python/blob/main/mkdocs/docs/configuration.md#azure-data-lake)
already has these 3 parameters
# Are these changes tested?
# Are there any user-facing changes?
…he#2143)
<!-- Closes apache#2150 -->
# Rationale for this change
- Consolidates snapshot expiration functionality from the standalone `ExpireSnapshots` class into the `MaintenanceTable` class for a unified maintenance API.
- Resolves planned work left over from apache#1880, and closes apache#2142
- Achieves feature and API parity with the Java implementation for snapshot retention and table maintenance.
# Features & Enhancements
- Introduces `table.maintenance.expire_snapshots()` as the unified entry point for snapshot expiration and future maintenance operations.
- Retains the existing `ExpireSnapshots` implementation internally. The `expire_snapshots()` method on `MaintenanceTable` now returns an `ExpireSnapshots` object, preserving transaction semantics and supporting context manager usage:
```python
with table.maintenance.expire_snapshots() as expire_snapshots:
    expire_snapshots.by_id(1)
    expire_snapshots.by_id(2)
```
- Focuses this PR on refactoring and documentation improvements, while maintaining compatibility with the prior `ExpireSnapshots` interface.
- Sets a foundation for future expansion of the `MaintenanceTable` abstraction to encapsulate additional maintenance operations.
# Bug Fixes & Cleanups
- **ManageSnapshots Cleanup (apache#2151):** Removes an unrelated instance variable from the `ManageSnapshots` class, aligning with the Java reference implementation.
# Testing & Documentation
- **Testing:**
  - Tested the new API interface including:
    - Expiration by ID
    - Protection of branch/tag snapshots
- **Documentation:**
  - Added and updated documentation to describe:
    - API usage examples

Preview:
<img width="1686" height="1015" alt="Screenshot 2025-08-11 at 1 37 04 PM" src="https://github.com/user-attachments/assets/f469f3fc-b4b1-4ec9-b1ca-b9185e22643e" />
# Are these changes tested?
Yes. All changes are tested.~~, with this PR predicated on the final changes from apache#1200.~~
This work builds on the framework introduced by @jayceslesar in apache#1200 for the `MaintenanceTable`.
# Are there any user-facing changes?
---
**Closes:**
- Closes apache#2151
- Closes apache#2142
---------
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Kevin Liu <kevin.jq.liu@gmail.com>
# Rationale for this change
I was looking into this, and took the liberty of changing the API to a datetime rather than milliseconds to avoid anyone passing in seconds or microseconds.
# Are these changes tested?
# Are there any user-facing changes?
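The unit ambiguity mentioned in this commit can be illustrated with a small sketch: a `datetime` carries its own unit, so any conversion to epoch milliseconds happens in exactly one place. The names `EPOCH` and `to_epoch_millis` are hypothetical and not part of PyIceberg, and treating naive datetimes as UTC is an assumption made only for this example.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the conversion (hypothetical helper constant).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt: datetime) -> int:
    """Convert a datetime to whole milliseconds since the Unix epoch.

    Assumption for this sketch: naive datetimes are interpreted as UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


one_second = to_epoch_millis(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc))
```

A caller who passes a `datetime` cannot accidentally supply seconds or microseconds, which is the failure mode the commit describes.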
/gemini review
Summary of Changes
Hello @kevinjqliu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a suite of enhancements to PyIceberg, primarily focusing on expanding authentication capabilities for REST catalogs and cloud storage, refining table maintenance operations, and bolstering schema evolution and column projection logic. Key updates include the addition of Google authentication for REST catalogs, improved ADLS token and credential handling, and a new, more intuitive API for managing table snapshots. Furthermore, the changes significantly improve how PyIceberg processes schemas and data, especially concerning missing columns and nanosecond timestamp precision, ensuring greater compatibility and data integrity across various Iceberg table versions and data sources.
Highlights
- Google Authentication for REST Catalog: The REST catalog now supports Google authentication, allowing users to configure their catalog to authenticate using Google Application Default Credentials or a specified service account key file. This significantly enhances security and integration with Google Cloud services.
- Enhanced ADLS Authentication: Expanded Azure Data Lake Storage (ADLS) authentication options now include direct token and credential support, as well as authentication via client ID, client secret, and tenant ID. This provides greater flexibility and security for connecting to ADLS Gen2.
- Refactored Table Maintenance API: The table maintenance operations, particularly snapshot expiration, have been refactored into a new `table.maintenance` API. This provides a more intuitive and organized interface for managing table lifecycle, allowing for operations like expiring snapshots by ID or by age.
- Improved Schema Evolution and Column Projection: Significant improvements have been made to how PyIceberg handles schema evolution and column projection. The system can now correctly evaluate expressions against columns that might be missing in the physical data file by leveraging projected values or default field values, ensuring data consistency across schema changes. Additionally, PyArrow schema conversion now properly accounts for Iceberg table format versions, including support for nanosecond timestamp precision in format version 3.
- Robust Avro Field Name Sanitization: To ensure broader compatibility and correct data representation, Avro field names are now comprehensively sanitized. This includes handling special characters and emojis, with the original field name preserved using a new `iceberg-field-name` property in the Avro schema.
Code Review
This pull request introduces significant enhancements and refactoring across the PyIceberg library. Key additions include authentication support for ADLS using tokens and for the REST catalog using Google credentials. The table maintenance API has been improved by moving snapshot expiration into a dedicated maintenance submodule, which clarifies the API structure. The implementation of column projection for filters is a major feature that will improve query performance. There are also important updates for nanosecond timestamp precision and Avro schema sanitization to improve cross-platform compatibility. The codebase has been improved with various refactorings and bug fixes, and the test suite has been substantially expanded to cover the new functionality.
    def __init__(self, token_string: str) -> None:
        self._token = token_string
        # If no expiry provided, set 1 hour from now
        self._expires_on = int(time.time()) + 3600
    if ADLS_TOKEN in properties:
        token = properties.get(ADLS_TOKEN)
        if token is not None:
            credential = StaticTokenCredential(token)
        else:
            credential = None
    elif ADLS_CREDENTIAL in properties:
        credential = properties.get(ADLS_CREDENTIAL)
    else:
        credential = None
This block of code for determining the credential can be simplified for better readability and conciseness. Using an assignment expression (walrus operator) can make the code more direct.
    credential = None
    if token := properties.get(ADLS_TOKEN):
        credential = StaticTokenCredential(token)
    elif ADLS_CREDENTIAL in properties:
        credential = properties.get(ADLS_CREDENTIAL)

    # Validate that all three are provided together for ClientSecretCredential
    credential_keys = ["client_id", "client_secret", "tenant_id"]
    provided_keys = [key for key in credential_keys if key in client_kwargs]
    if provided_keys and len(provided_keys) != len(credential_keys):
        missing_keys = [key for key in credential_keys if key not in client_kwargs]
        raise ValueError(
            f"client_id, client_secret, and tenant_id must all be provided together "
            f"to use ClientSecretCredential for Azure authentication. "
            f"Provided: {provided_keys}, Missing: {missing_keys}"
        )
The logic for validating that all required credential keys are provided can be expressed more concisely and efficiently using set operations. This would also make the intent of the code clearer.
    # Validate that all three are provided together for ClientSecretCredential
    credential_keys = {"client_id", "client_secret", "tenant_id"}
    provided_keys = credential_keys.intersection(client_kwargs)
    if provided_keys and provided_keys != credential_keys:
        missing_keys = credential_keys - provided_keys
        raise ValueError(
            f"client_id, client_secret, and tenant_id must all be provided together "
            f"to use ClientSecretCredential for Azure authentication. "
            f"Provided: {sorted(list(provided_keys))}, Missing: {sorted(list(missing_keys))}"
        )
Code Review
This pull request introduces a wide range of improvements and new features across the PyIceberg library. Key additions include support for Google Cloud and enhanced ADLS authentication, providing more flexibility for users on different cloud platforms. The table maintenance API has been refactored, with operations like snapshot expiration now accessible through a cleaner `table.maintenance` interface. There are also significant enhancements to column projection and initial support for nanosecond precision timestamps, aligning with the latest Iceberg specifications. The transaction management has been made more robust. Overall, these are excellent changes that improve the library's capabilities, API design, and correctness. I have a couple of suggestions for improvement.
    if partition_value := accessors[partition_field.field_id].get(file.partition):
        projected_missing_fields[field_id] = partition_value
This logic for column projection has a couple of potential issues. First, it's missing `IndexError` handling for schemas that have evolved, which could cause crashes when processing older data files that don't have a new partition field. Second, the `if partition_value := ...` check will incorrectly skip falsy but valid partition values (like `0` or `False`), leading to incorrect data projection.
I suggest restoring the `try...except` block and ensuring all non-missing values are projected, regardless of their truthiness.
    try:
        projected_missing_fields[field_id] = accessors[partition_field.field_id].get(file.partition)
    except IndexError:
        # This can happen when a new partition field was added after the file was written.
        # In this case, we can't project the value.
        pass
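The truthiness pitfall described in this comment can be reproduced without PyIceberg. The sketch below uses a plain dictionary as a stand-in for partition values keyed by field ID; the function names and sample data are hypothetical. Whether a `None` partition value should be projected is not settled in the thread, so the second function simply skips it as an assumption.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in: field ID -> partition value read from a data file.
partition_values: Dict[int, Optional[Any]] = {1: 0, 2: False, 3: "2024-01-01", 4: None}


def project_truthy(values: Dict[int, Optional[Any]]) -> Dict[int, Any]:
    """Walrus-style truthiness check: silently drops 0 and False."""
    projected = {}
    for field_id, raw in values.items():
        if value := raw:
            projected[field_id] = value
    return projected


def project_not_none(values: Dict[int, Optional[Any]]) -> Dict[int, Any]:
    """Explicit None check: keeps falsy but valid values such as 0 and False."""
    projected = {}
    for field_id, value in values.items():
        if value is not None:
            projected[field_id] = value
    return projected


truthy_result = project_truthy(partition_values)
not_none_result = project_not_none(partition_values)
```

With the sample data, the truthiness version keeps only field 3, while the explicit check also keeps fields 1 and 2.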
    class StaticTokenCredential(AsyncTokenCredential):
        def __init__(self, token_string: str) -> None:
            self._token = token_string
            # If no expiry provided, set 1 hour from now
            self._expires_on = int(time.time()) + 3600

        async def get_token(self, *scopes: str, **kwargs: Any) -> AccessToken:
            return AccessToken(self._token, self._expires_on)
The `StaticTokenCredential` hardcodes the token expiry to one hour. This could be problematic if the provided token has a shorter lifetime. It would be more robust to allow configuring the expiry time, or at least clearly document this fixed expiry behavior in the class's docstring.
    class StaticTokenCredential(AsyncTokenCredential):
        """A credential that uses a static token.

        The token expiry is set to 1 hour from initialization.
        """

        def __init__(self, token_string: str) -> None:
            self._token = token_string
            self._expires_on = int(time.time()) + 3600

        async def get_token(self, *scopes: str, **kwargs: Any) -> AccessToken:
            return AccessToken(self._token, self._expires_on)
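The other option raised in this comment, a configurable expiry, might look like the sketch below. It is self-contained so it can run without the Azure SDK: `AccessToken` here is a local stand-in with the same two fields (`token`, `expires_on`) as `azure.core.credentials.AccessToken`, the class does not subclass `AsyncTokenCredential`, and the `expires_in_seconds` parameter is a hypothetical addition that is not in the PR.

```python
import asyncio
import time
from typing import Any, NamedTuple


class AccessToken(NamedTuple):
    """Local stand-in mirroring the (token, expires_on) fields of the Azure type."""

    token: str
    expires_on: int


class StaticTokenCredential:
    """A credential that returns a fixed token.

    `expires_in_seconds` is a hypothetical parameter; it defaults to the
    one-hour expiry used in the reviewed code.
    """

    def __init__(self, token_string: str, expires_in_seconds: int = 3600) -> None:
        self._token = token_string
        # Expiry is computed once, at construction time.
        self._expires_on = int(time.time()) + expires_in_seconds

    async def get_token(self, *scopes: str, **kwargs: Any) -> AccessToken:
        return AccessToken(self._token, self._expires_on)


credential = StaticTokenCredential("example-token", expires_in_seconds=60)
access_token = asyncio.run(credential.get_token("hypothetical-scope"))
```

A caller who knows the real lifetime of the token can pass it in, so the expiry reported to the SDK does not overstate how long the token is valid.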