Bump unstructured from 0.15.13 to 0.22.21 #938

Closed
dependabot[bot] wants to merge 1 commit into development from dependabot/pip/unstructured-0.22.21
dependabot[bot] commented on behalf of github on Apr 20, 2026

Bumps unstructured from 0.15.13 to 0.22.21.

Release notes

Sourced from unstructured's releases.

0.22.21

What's Changed

Full Changelog: Unstructured-IO/unstructured@0.22.20...0.22.21

0.22.20

What's Changed

New Contributors

Full Changelog: Unstructured-IO/unstructured@0.22.18...0.22.20

0.22.18

What's Changed

Full Changelog: Unstructured-IO/unstructured@0.22.16...0.22.18

0.22.16

Enhancements

  • Formula markdown export (element_to_md / elements_to_md): New keyword-only formula_markdown_style ("auto", "display_math", "plain"; default "auto"). In "auto", display math ($$ ... $$) is used only when the text looks like notation (heuristic score) and contains no $/$$ (avoids breaking Markdown and noisy OCR captions). "display_math" wraps whenever safe (still falls back to plain if $ would corrupt fences). "plain" emits text only. Optional normalize_formula (default True) maps common Unicode operators to LaTeX-like tokens; normalize_formula stays before keyword-only options so positional encoding / no_group_by_page callers are unchanged. Unicode is never mapped to \\sqrt{}. Module constants: FORMULA_MARKDOWN_AUTO, FORMULA_MARKDOWN_DISPLAY_MATH, FORMULA_MARKDOWN_PLAIN.
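
The style selection described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: `formula_to_md_sketch` and `looks_like_notation` are hypothetical names, the regex is a made-up stand-in for the real heuristic score, the constant values are assumed to equal the style strings, and the exact layout of the `$$` block is an assumption.

```python
import re

# Assumed to mirror the module constants named in the release note.
FORMULA_MARKDOWN_AUTO = "auto"
FORMULA_MARKDOWN_DISPLAY_MATH = "display_math"
FORMULA_MARKDOWN_PLAIN = "plain"
_STYLES = {FORMULA_MARKDOWN_AUTO, FORMULA_MARKDOWN_DISPLAY_MATH, FORMULA_MARKDOWN_PLAIN}

# Hypothetical stand-in for the heuristic: count operator-like characters and digits.
_NOTATION_RE = re.compile(r"[=^_\\+\-*/<>]|\d")


def looks_like_notation(text: str) -> bool:
    """Return True when the text contains at least two notation-like characters."""
    return len(_NOTATION_RE.findall(text)) >= 2


def formula_to_md_sketch(text: str, *, formula_markdown_style: str = FORMULA_MARKDOWN_AUTO) -> str:
    """Render formula text as Markdown according to the requested style."""
    if formula_markdown_style not in _STYLES:
        raise ValueError(f"unknown style: {formula_markdown_style!r}")
    text = text.strip()
    # A literal "$" would corrupt the $$ fences, so every style falls back to plain text.
    if formula_markdown_style == FORMULA_MARKDOWN_PLAIN or "$" in text:
        return text
    if formula_markdown_style == FORMULA_MARKDOWN_DISPLAY_MATH:
        return f"$$\n{text}\n$$"
    # "auto": wrap only when the text looks like mathematical notation.
    return f"$$\n{text}\n$$" if looks_like_notation(text) else text
```

Keeping the style keyword-only, as the release note describes, means existing positional callers are unaffected by the new option.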

0.22.15

Security

  • security: fix(deps): upgrade vulnerable transitive dependencies [security]

0.22.14

Enhancements

  • Deduplicate PDF rendering: Remove _render_pdf_pages and delegate to unstructured-inference's convert_pdf_to_image (which already has lazy per-page rendering). Peak memory for path_only=True drops from O(n_pages) to O(1 page) — 97% reduction on a 100-page PDF. Bumps inference dep to >=1.6.2.
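
The memory difference between eager and lazy per-page rendering can be illustrated with a small sketch. All names here are hypothetical and `render_page` is a placeholder, not the actual `convert_pdf_to_image` call.

```python
from typing import Iterator, List


def render_page(page_number: int) -> bytes:
    """Placeholder for an expensive page render (hypothetical)."""
    return f"page-{page_number}".encode()


def render_pages_eager(n_pages: int) -> List[bytes]:
    """Hold every rendered page in memory at once: O(n_pages) peak memory."""
    return [render_page(i) for i in range(1, n_pages + 1)]


def render_pages_lazy(n_pages: int) -> Iterator[bytes]:
    """Yield one page at a time, so the caller needs to hold only the current page."""
    for i in range(1, n_pages + 1):
        yield render_page(i)
```

A caller that writes each page to disk inside a `for page in render_pages_lazy(n)` loop never needs more than one rendered page alive at a time.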

0.22.13

Enhancements

  • Speed up standardize_quotes: Replace loop-based character replacement with a single str.translate() call using a pre-computed translation table. Also fixes a pre-existing bug where left smart quotes were never normalized due to duplicate dictionary keys.
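
The technique named above, a pre-computed table passed to a single `str.translate()` call, can be sketched as below. The mapping is a small hypothetical subset chosen for illustration; the library's actual table and function body may differ.

```python
# Hypothetical subset of quote characters; the real mapping is likely larger.
QUOTE_MAP = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}

# Build the table once at import time. str.maketrans accepts a dict whose
# keys are single characters.
_QUOTE_TABLE = str.maketrans(QUOTE_MAP)


def standardize_quotes_sketch(text: str) -> str:
    """Replace smart quotes with ASCII quotes in one pass over the string."""
    return text.translate(_QUOTE_TABLE)
```

Keying the table by the characters themselves also sidesteps the general hazard behind duplicate-key bugs: a Python dict literal with a repeated key silently keeps only the last value.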

0.22.12

What's Changed

... (truncated)

Changelog

Sourced from unstructured's changelog.

0.22.21

Enhancements

  • Skip table chunking option: Add skip_table_chunking to basic/title chunking options. When True, Table elements are passed through unchanged without being split into TableChunk elements, regardless of their size. Defaults to False to preserve existing behavior.
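
The pass-through behavior can be pictured with a toy chunker. This is a simplified sketch under stated assumptions: `Element` and `chunk_sketch` are hypothetical, and the real chunkers also combine small elements and track metadata, which is omitted here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    kind: str  # e.g. "Table", "TableChunk", "NarrativeText"
    text: str


def chunk_sketch(
    elements: List[Element], max_characters: int, *, skip_table_chunking: bool = False
) -> List[Element]:
    """Split oversized elements into fixed-width pieces (toy logic)."""
    chunks: List[Element] = []
    for el in elements:
        # With the option enabled, tables pass through unchanged regardless of size.
        if el.kind == "Table" and skip_table_chunking:
            chunks.append(el)
            continue
        if len(el.text) <= max_characters:
            chunks.append(el)
            continue
        # Oversized tables become TableChunk pieces; other kinds keep their kind.
        piece_kind = "TableChunk" if el.kind == "Table" else el.kind
        for start in range(0, len(el.text), max_characters):
            chunks.append(Element(piece_kind, el.text[start : start + max_characters]))
    return chunks
```

Defaulting the flag to `False` keeps the existing splitting behavior for callers that do not opt in.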

0.22.20

Enhancements

  • Auto-detect vertical text for rotated PDFs: Add detect_vertical field to PDFMinerConfig and auto-enable it when rendered pages have /Rotate metadata, so pdfminer groups rotated text into proper words instead of per-character regions.

0.22.19

Security

  • security: fix(deps): upgrade vulnerable transitive dependencies [security]

0.22.18

Fixes

  • Make ingest-test-fixtures-update-pr CI job also update the markdown versions of the fixtures.

Enhancements

  • Add page number support to v1 HTML parser: The v1 HTML parser now reads data-page-number attributes from ancestor elements and includes the page number in element metadata, consistent with the v2 parser behavior.
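
Reading a page number from the nearest ancestor that carries a `data-page-number` attribute might look like the sketch below. It uses the standard-library `html.parser` rather than the library's own parser, and `PageNumberParser` is a hypothetical name.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class PageNumberParser(HTMLParser):
    """Collect (text, page_number) pairs, inheriting the page from ancestors."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: List[Tuple[str, Optional[int]]] = []  # open tags and their page
        self.elements: List[Tuple[str, Optional[int]]] = []

    def handle_starttag(self, tag, attrs):
        page = dict(attrs).get("data-page-number")
        self._stack.append((tag, int(page) if page and page.isdigit() else None))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, discarding unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The nearest enclosing tag with a page number wins.
        page = next((p for _, p in reversed(self._stack) if p is not None), None)
        self.elements.append((text, page))
```

Text outside any element with the attribute gets `None`, which corresponds to leaving the page number out of the metadata.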

0.22.17

Fixes

  • Preserve semantic table headers across carried chunks: Carried rows in split table chunks now keep original header semantics (th stays th, including section header rows and wrapped header text), preventing header cells from degrading to data cells in continuation chunks.

0.22.16

Enhancements

  • Formula markdown export (element_to_md / elements_to_md): New keyword-only formula_markdown_style ("auto", "display_math", "plain"; default "auto"). In "auto", display math ($$ ... $$) is used only when the text looks like notation (heuristic score) and contains no $/$$ (avoids breaking Markdown and noisy OCR captions). "display_math" wraps whenever safe (still falls back to plain if $ would corrupt fences). "plain" emits text only. Optional normalize_formula (default True) maps common Unicode operators to LaTeX-like tokens; normalize_formula stays before keyword-only options so positional encoding / no_group_by_page callers are unchanged. Unicode is never mapped to \\sqrt{}. Module constants: FORMULA_MARKDOWN_AUTO, FORMULA_MARKDOWN_DISPLAY_MATH, FORMULA_MARKDOWN_PLAIN.

0.22.15

Security

  • security: fix(deps): upgrade vulnerable transitive dependencies [security]

0.22.14

Enhancements

... (truncated)

Commits
  • 3ac4443 feat: add option to skip table chunking (#4338)
  • dfb1653 Enable vertical text detection for rotated images (#4328)
  • d0aa8eb feat: add GHA workflow to build opencv wheels without ffmpeg (#4335)
  • 029f491 fix(deps): upgrade vulnerable transitive dependencies [security] (#4334)
  • 2437078 Fix fixtures update CI to regenerate markdown (#4332)
  • d299095 feat: add page number support to v1 html partition (#4327)
  • 615782a fix(chunking): preserve semantic headers in carried table chunks (#4313)
  • 264d569 feat: render Formula elements as $$ blocks with optional normalization (#4308)
  • 051b358 fix(deps): upgrade vulnerable transitive dependencies [security] (#4318)
  • affb9d6 refactor: deduplicate PDF rendering by delegating to unstructured-inference (...
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.15.13 to 0.22.21.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@0.15.13...0.22.21)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-version: 0.22.21
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels on Apr 20, 2026
dependabot[bot] commented on behalf of github on Apr 27, 2026

Superseded by #942.

dependabot[bot] closed this on Apr 27, 2026
dependabot[bot] deleted the dependabot/pip/unstructured-0.22.21 branch on April 27, 2026 00:30