Commit fd9d796

potter-potter, badGarnet, awalker4, and ryannikolaidis authored
fix cve (#3989)
fix critical cve for h11. supposedly 0.16.0 fixes it.

---------

Co-authored-by: Yao You <yao@unstructured.io>
Co-authored-by: Austin Walker <austin@unstructured.io>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: badGarnet <badGarnet@users.noreply.github.com>
1 parent 27f503c commit fd9d796

79 files changed

Lines changed: 761 additions & 7760 deletions

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,4 @@
-## 0.17.6-dev1
+## 0.17.6-dev2

 ### Enhancements

@@ -9,6 +9,8 @@
 Two executions of the same code, on the same file, produce different results. The order of the elements is random.
 This makes it impossible to write stable unit tests, for example, or to obtain reproducible results.
 - **Do not use NLP to determine element types for extracted elements with hi_res.** This avoids extraneous Title elements in hi_res outputs. This only applies to *extracted* elements, meaning text objects that are found outside of Object Detection objects which get mapped to *inferred* elements. (*extracted* and *inferred* elements get merged together to form the list of `Element`s returned by `pdf_partition()`)
+- Resolve open CVEs
+

 ## 0.17.5

requirements/base.txt

Lines changed: 11 additions & 11 deletions
@@ -8,9 +8,9 @@ anyio==4.9.0
 # via httpx
 backoff==2.2.1
 # via -r ./base.in
-beautifulsoup4==4.13.3
+beautifulsoup4==4.13.4
 # via -r ./base.in
-certifi==2025.1.31
+certifi==2025.4.26
 # via
 # httpcore
 # httpx
@@ -42,11 +42,11 @@ exceptiongroup==1.2.2
 # via anyio
 filetype==1.2.0
 # via -r ./base.in
-h11==0.14.0
+h11==0.16.0
 # via httpcore
 html5lib==1.1
 # via -r ./base.in
-httpcore==1.0.7
+httpcore==1.0.9
 # via httpx
 httpx==0.28.1
 # via unstructured-client
@@ -62,13 +62,13 @@ jsonpath-python==1.0.6
 # via unstructured-client
 langdetect==1.0.9
 # via -r ./base.in
-lxml==5.3.1
+lxml==5.4.0
 # via -r ./base.in
 marshmallow==3.26.1
 # via
 # dataclasses-json
 # unstructured-client
-mypy-extensions==1.0.0
+mypy-extensions==1.1.0
 # via
 # typing-inspect
 # unstructured-client
@@ -80,9 +80,9 @@ numpy==2.0.2
 # via -r ./base.in
 olefile==0.47
 # via python-oxmsg
-orderly-set==5.3.0
+orderly-set==5.4.0
 # via deepdiff
-packaging==24.2
+packaging==25.0
 # via
 # marshmallow
 # unstructured-client
@@ -100,7 +100,7 @@ python-magic==0.4.27
 # via -r ./base.in
 python-oxmsg==0.0.2
 # via -r ./base.in
-rapidfuzz==3.12.2
+rapidfuzz==3.13.0
 # via -r ./base.in
 regex==2024.11.6
 # via nltk
@@ -119,13 +119,13 @@ six==1.17.0
 # unstructured-client
 sniffio==1.3.1
 # via anyio
-soupsieve==2.6
+soupsieve==2.7
 # via beautifulsoup4
 tqdm==4.67.1
 # via
 # -r ./base.in
 # nltk
-typing-extensions==4.13.0
+typing-extensions==4.13.2
 # via
 # -r ./base.in
 # anyio
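
Reviewer note: the h11 bump in this file is the change the commit message is about. A minimal sketch of how the pin could be checked automatically, assuming (as the commit message suggests but does not confirm) that 0.16.0 is the first fixed release. The names `parse_pins`, `h11_is_patched`, and `MIN_FIXED_H11` are hypothetical helpers for illustration, not part of the repository.

```python
import re

# Assumption: 0.16.0 is the first fixed h11 release, as the commit message suggests.
MIN_FIXED_H11 = (0, 16, 0)

# Matches simple `name==X.Y.Z` pins; comment lines and non-numeric versions are skipped.
PIN_PATTERN = re.compile(r"([A-Za-z0-9_.\-]+)==(\d+(?:\.\d+)*)")


def parse_pins(requirements_text):
    """Return {package: version tuple} for each numeric `==` pin in the text."""
    pins = {}
    for line in requirements_text.splitlines():
        match = PIN_PATTERN.fullmatch(line.strip())
        if match:
            name, version = match.groups()
            parts = tuple(int(part) for part in version.split("."))
            # Pad short versions such as "25.0" so tuples compare component-wise.
            pins[name.lower()] = parts + (0,) * (3 - len(parts))
    return pins


def h11_is_patched(requirements_text):
    """True only if h11 is pinned at or above the assumed fixed version."""
    version = parse_pins(requirements_text).get("h11")
    return version is not None and version >= MIN_FIXED_H11
```

Running `h11_is_patched` over the text of each compiled requirements file would flag any file that still pins the old version. Pins with non-numeric suffixes (for example `1.3.0.post6`) are deliberately ignored by this sketch.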

requirements/deps/constraints.txt

Lines changed: 2 additions & 0 deletions
@@ -22,3 +22,5 @@ importlib-metadata>=8.5.0
 unstructured-client>=0.23.0,<0.26.0
 # paddle constrains protobuf; maybe we should put paddle here since its version is pinned in .in file
 protobuf>=6.30.0
+# (yao) issues with pdfminer-six above 20250416
+pdfminer.six<20250416
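
Reviewer note: the new constraint is an exclusive upper bound, so version 20250416 itself is also excluded. A rough sketch of the comparison, assuming pdfminer.six version strings begin with an eight-digit date (YYYYMMDD) that can be compared as an integer. `satisfies_constraint` and `installed_pdfminer_ok` are hypothetical names, and the version strings in the usage note are illustrative inputs only.

```python
import re
from importlib import metadata

# Exclusive upper bound copied from the constraint `pdfminer.six<20250416`.
UPPER_BOUND = 20250416


def satisfies_constraint(version):
    """True if a date-style version string (YYYYMMDD) is below the bound."""
    match = re.match(r"\d{8}", version)
    if match is None:
        raise ValueError(f"not a date-style version: {version!r}")
    return int(match.group()) < UPPER_BOUND


def installed_pdfminer_ok():
    """Check the installed pdfminer.six, or return None when it is absent."""
    try:
        return satisfies_constraint(metadata.version("pdfminer.six"))
    except metadata.PackageNotFoundError:
        return None
```

For example, `satisfies_constraint("20250327")` passes, while `satisfies_constraint("20250416")` does not, because the bound is strict.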

requirements/dev.txt

Lines changed: 3 additions & 3 deletions
@@ -17,15 +17,15 @@ distlib==0.3.9
 # via virtualenv
 filelock==3.18.0
 # via virtualenv
-identify==2.6.9
+identify==2.6.10
 # via pre-commit
 importlib-metadata==8.6.1
 # via
 # -c ././deps/constraints.txt
 # build
 nodeenv==1.9.1
 # via pre-commit
-packaging==24.2
+packaging==25.0
 # via
 # -c ./base.txt
 # -c ./test.txt
@@ -49,7 +49,7 @@ tomli==2.2.1
 # -c ./test.txt
 # build
 # pip-tools
-virtualenv==20.29.3
+virtualenv==20.30.0
 # via pre-commit
 wheel==0.45.1
 # via pip-tools

requirements/extra-docx.txt

Lines changed: 2 additions & 2 deletions
@@ -4,13 +4,13 @@
 #
 # pip-compile ./extra-docx.in
 #
-lxml==5.3.1
+lxml==5.4.0
 # via
 # -c ./base.txt
 # python-docx
 python-docx==1.1.2
 # via -r ./extra-docx.in
-typing-extensions==4.13.0
+typing-extensions==4.13.2
 # via
 # -c ./base.txt
 # python-docx

requirements/extra-markdown.txt

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ importlib-metadata==8.6.1
 # via
 # -c ././deps/constraints.txt
 # markdown
-markdown==3.7
+markdown==3.8
 # via -r ./extra-markdown.in
 zipp==3.21.0
 # via importlib-metadata

requirements/extra-odt.txt

Lines changed: 2 additions & 2 deletions
@@ -4,15 +4,15 @@
 #
 # pip-compile ./extra-odt.in
 #
-lxml==5.3.1
+lxml==5.4.0
 # via
 # -c ./base.txt
 # python-docx
 pypandoc==1.15
 # via -r ./extra-odt.in
 python-docx==1.1.2
 # via -r ./extra-odt.in
-typing-extensions==4.13.0
+typing-extensions==4.13.2
 # via
 # -c ./base.txt
 # python-docx

requirements/extra-paddleocr.txt

Lines changed: 18 additions & 15 deletions
@@ -18,11 +18,11 @@ anyio==4.9.0
 # httpx
 astor==0.8.1
 # via paddlepaddle
-beautifulsoup4==4.13.3
+beautifulsoup4==4.13.4
 # via
 # -c ./base.txt
 # unstructured-paddleocr
-certifi==2025.1.31
+certifi==2025.4.26
 # via
 # -c ./base.txt
 # httpcore
@@ -44,13 +44,13 @@ exceptiongroup==1.2.2
 # anyio
 fire==0.7.0
 # via unstructured-paddleocr
-fonttools==4.56.0
+fonttools==4.57.0
 # via unstructured-paddleocr
-h11==0.14.0
+h11==0.16.0
 # via
 # -c ./base.txt
 # httpcore
-httpcore==1.0.7
+httpcore==1.0.9
 # via
 # -c ./base.txt
 # httpx
@@ -68,7 +68,7 @@ imageio==2.37.0
 # via scikit-image
 lazy-loader==0.4
 # via scikit-image
-lxml==5.3.1
+lxml==5.4.0
 # via
 # -c ./base.txt
 # python-docx
@@ -102,14 +102,14 @@ opencv-python-headless==4.11.0.86
 # albumentations
 opt-einsum==3.3.0
 # via paddlepaddle
-packaging==24.2
+packaging==25.0
 # via
 # -c ./base.txt
 # lazy-loader
 # scikit-image
 paddlepaddle==3.0.0
 # via -r ./extra-paddleocr.in
-pillow==11.1.0
+pillow==11.2.1
 # via
 # imageio
 # paddlepaddle
@@ -121,17 +121,17 @@ protobuf==6.30.2
 # paddlepaddle
 pyclipper==1.3.0.post6
 # via unstructured-paddleocr
-pydantic==2.10.6
+pydantic==2.11.3
 # via albumentations
-pydantic-core==2.27.2
+pydantic-core==2.33.1
 # via pydantic
 python-docx==1.1.2
 # via unstructured-paddleocr
 pyyaml==6.0.2
 # via
 # albumentations
 # unstructured-paddleocr
-rapidfuzz==3.12.2
+rapidfuzz==3.13.0
 # via
 # -c ./base.txt
 # unstructured-paddleocr
@@ -153,21 +153,21 @@ sniffio==1.3.1
 # via
 # -c ./base.txt
 # anyio
-soupsieve==2.6
+soupsieve==2.7
 # via
 # -c ./base.txt
 # beautifulsoup4
-stringzilla==3.12.3
+stringzilla==3.12.5
 # via albucore
-termcolor==2.5.0
+termcolor==3.0.1
 # via fire
 tifffile==2024.8.30
 # via scikit-image
 tqdm==4.67.1
 # via
 # -c ./base.txt
 # unstructured-paddleocr
-typing-extensions==4.13.0
+typing-extensions==4.13.2
 # via
 # -c ./base.txt
 # albucore
@@ -178,6 +178,9 @@ typing-extensions==4.13.0
 # pydantic
 # pydantic-core
 # python-docx
+# typing-inspection
+typing-inspection==0.4.0
+# via pydantic
 unstructured-paddleocr==2.10.0
 # via -r ./extra-paddleocr.in
 urllib3==1.26.20
