Commit fa49907

chore: resolve cves (#534)
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Security/Deps**
> - Broad dependency upgrades in `requirements/base.txt`, `requirements/test.txt`, and `requirements/constraints.txt` (e.g., `fastapi`, `transformers`, `torch`, `unstructured[all-docs]`, `urllib3`, etc.) to remediate CVEs.
>
> **CI**
> - Enhances `ci.yml` Docker test job with aggressive disk cleanup (remove preinstalled SDKs, `docker system prune -af --volumes`) and prints disk usage before/after to stabilize Docker builds.
>
> **Versioning**
> - Bumps version to `0.0.91` in `__version__.py` and `preprocessing-pipeline-family.yaml`.
> - Adds `CHANGELOG.md` entry for 0.0.91 noting CVE-driven upgrades.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 5d0b9d8. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
1 parent abe7b8e commit fa49907

7 files changed

Lines changed: 95 additions & 80 deletions

.github/workflows/ci.yml

Lines changed: 17 additions & 1 deletion
@@ -121,7 +121,23 @@ jobs:
     - name: Free up disk space
       run: |
         # Clear some space (https://github.com/actions/runner-images/issues/2840)
-        sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/share/boost
+        echo "Disk usage before cleanup:"
+        df -h
+
+        # Remove unnecessary pre-installed software
+        sudo rm -rf /usr/share/dotnet
+        sudo rm -rf /opt/ghc
+        sudo rm -rf /usr/local/share/boost
+        sudo rm -rf /usr/local/lib/android
+        sudo rm -rf /opt/hostedtoolcache/CodeQL
+        sudo rm -rf /usr/local/.ghcup
+        sudo rm -rf /usr/share/swift
+
+        # Clean up docker to ensure we start fresh
+        docker system prune -af --volumes
+
+        echo "Disk usage after cleanup:"
+        df -h
     - name: Test Dockerfile
       run: |
         python${{ env.PYTHON_VERSION }} -m venv .venv

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
+## 0.0.91
+* Upgrade packages to resolve CVEs
+
 ## 0.0.90
 * Upgrade version to pull in latest unstructured verison and bump versions of dependancies.
 

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.0.90" # pragma: no cover
+__version__ = "0.0.91" # pragma: no cover

preprocessing-pipeline-family.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 name: general
-version: 0.0.90
+version: 0.0.91
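
The version string is bumped in two places, so the two can drift apart. As a minimal sketch (not part of this commit), a check like the following could confirm that they agree. The helper names `read_python_version`, `read_yaml_version`, and `versions_agree` are hypothetical. The functions take file contents as strings because the path of `__version__.py` is not shown on this page.

```python
import re


def read_python_version(source: str) -> str:
    """Extract the string assigned to __version__ in Python source text."""
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', source, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def read_yaml_version(source: str) -> str:
    """Extract a top-level `version:` value using a regex, not a YAML parser."""
    match = re.search(r"^version:\s*(\S+)\s*$", source, re.MULTILINE)
    if match is None:
        raise ValueError("no top-level version key found")
    return match.group(1).strip("\"'")


def versions_agree(python_source: str, yaml_source: str) -> bool:
    """Return True when both sources declare the same version string."""
    return read_python_version(python_source) == read_yaml_version(yaml_source)


if __name__ == "__main__":
    # Sample contents copied from the two hunks in this commit.
    py_text = '__version__ = "0.0.91" # pragma: no cover\n'
    yaml_text = "name: general\nversion: 0.0.91\n"
    print(versions_agree(py_text, yaml_text))
```

The regex-based YAML read is a deliberate simplification to keep the sketch dependency-free. A real check would more likely load the file with a YAML parser.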

requirements/base.txt

Lines changed: 42 additions & 44 deletions
@@ -4,29 +4,29 @@
 #
 #    pip-compile --no-strip-extras ./requirements/base.in
 #
-accelerate==1.11.0
+accelerate==1.12.0
     # via unstructured-inference
 aiofiles==25.1.0
     # via unstructured-client
-annotated-doc==0.0.3
+annotated-doc==0.0.4
     # via fastapi
 annotated-types==0.7.0
     # via pydantic
 antlr4-python3-runtime==4.9.3
     # via omegaconf
-anyio==4.11.0
+anyio==4.12.0
     # via
     #   httpx
     #   starlette
 backoff==2.2.1
     # via
     #   -r requirements/base.in
     #   unstructured
-beautifulsoup4==4.14.2
+beautifulsoup4==4.14.3
     # via unstructured
-cachetools==6.2.1
+cachetools==6.2.4
     # via google-auth
-certifi==2025.10.5
+certifi==2025.11.12
     # via
     #   httpcore
     #   httpx
@@ -38,7 +38,7 @@ charset-normalizer==3.4.4
     #   pdfminer-six
     #   requests
     #   unstructured
-click==8.3.0
+click==8.3.1
     # via
     #   -r requirements/base.in
     #   nltk
@@ -65,26 +65,26 @@ emoji==2.15.0
     # via unstructured
 et-xmlfile==2.0.0
     # via openpyxl
-fastapi==0.121.0
+fastapi==0.128.0
     # via -r requirements/base.in
-filelock==3.20.0
+filelock==3.20.1
     # via
     #   huggingface-hub
     #   torch
     #   transformers
 filetype==1.2.0
     # via unstructured
-flatbuffers==25.9.23
+flatbuffers==25.12.19
     # via onnxruntime
-fonttools==4.60.1
+fonttools==4.61.1
     # via matplotlib
-fsspec==2025.10.0
+fsspec==2025.12.0
     # via
     #   huggingface-hub
     #   torch
 google-api-core[grpc]==2.28.1
     # via google-cloud-vision
-google-auth==2.43.0
+google-auth==2.45.0
     # via
     #   google-api-core
     #   google-cloud-vision
@@ -131,7 +131,7 @@ idna==3.11
     #   requests
 jinja2==3.1.6
     # via torch
-joblib==1.5.2
+joblib==1.5.3
     # via nltk
 kiwisolver==1.4.9
     # via matplotlib
@@ -147,19 +147,19 @@ markdown==3.10
     # via unstructured
 markupsafe==3.0.3
     # via jinja2
-marshmallow==3.26.1
+marshmallow==3.26.2
     # via dataclasses-json
-matplotlib==3.10.7
+matplotlib==3.10.8
     # via unstructured-inference
-ml-dtypes==0.5.3
+ml-dtypes==0.5.4
     # via onnx
 mpmath==1.3.0
     # via sympy
 msoffcrypto-tool==5.4.2
     # via unstructured
 mypy-extensions==1.1.0
     # via typing-inspect
-networkx==3.5
+networkx==3.6.1
     # via
     #   torch
     #   unstructured
@@ -188,7 +188,7 @@ olefile==0.47
     #   python-oxmsg
 omegaconf==2.3.0
     # via effdet
-onnx==1.19.1
+onnx==1.20.0
     # via
     #   unstructured
     #   unstructured-inference
@@ -216,13 +216,13 @@ pandas==2.3.3
     #   unstructured-inference
 pdf2image==1.17.0
     # via unstructured
-pdfminer-six==20250506
+pdfminer-six==20251230
     # via
     #   unstructured
     #   unstructured-inference
 pi-heif==1.1.1
     # via unstructured
-pikepdf==10.0.0
+pikepdf==10.1.0
     # via unstructured
 pillow==12.0.0
     # via
@@ -233,11 +233,11 @@ pillow==12.0.0
     #   python-pptx
     #   torchvision
     #   unstructured-pytesseract
-proto-plus==1.26.1
+proto-plus==1.27.0
     # via
     #   google-api-core
     #   google-cloud-vision
-protobuf==6.33.0
+protobuf==6.33.2
     # via
     #   google-api-core
     #   google-cloud-vision
@@ -246,7 +246,7 @@ protobuf==6.33.0
     #   onnx
     #   onnxruntime
     #   proto-plus
-psutil==7.1.3
+psutil==7.2.1
     # via
     #   -r requirements/base.in
     #   accelerate
@@ -257,40 +257,40 @@ pyasn1==0.6.1
     #   rsa
 pyasn1-modules==0.4.2
     # via google-auth
-pycocotools==2.0.10
+pycocotools==2.0.11
     # via effdet
 pycparser==2.23
     # via cffi
 pycryptodome==3.23.0
     # via -r requirements/base.in
-pydantic==2.12.4
+pydantic==2.12.5
     # via
     #   fastapi
     #   unstructured-client
 pydantic-core==2.41.5
     # via pydantic
-pypandoc==1.15
+pypandoc==1.16.2
     # via unstructured
-pyparsing==3.2.5
+pyparsing==3.3.1
     # via matplotlib
-pypdf==6.1.3
+pypdf==6.5.0
     # via
     #   -r requirements/base.in
     #   unstructured
     #   unstructured-client
-pypdfium2==5.0.0
+pypdfium2==5.2.0
     # via unstructured-inference
 python-dateutil==2.9.0.post0
     # via
     #   matplotlib
     #   pandas
 python-docx==1.2.0
     # via unstructured
-python-iso639==2025.2.18
+python-iso639==2025.11.16
     # via unstructured
 python-magic==0.4.27
     # via unstructured
-python-multipart==0.0.20
+python-multipart==0.0.21
     # via unstructured-inference
 python-oxmsg==0.0.2
     # via unstructured
@@ -327,7 +327,7 @@ requests-toolbelt==1.0.0
     # via unstructured-client
 rsa==4.9.1
     # via google-auth
-safetensors==0.6.2
+safetensors==0.7.0
     # via
     #   accelerate
     #   timm
@@ -339,9 +339,7 @@ six==1.17.0
     #   html5lib
     #   langdetect
     #   python-dateutil
-sniffio==1.3.1
-    # via anyio
-soupsieve==2.8
+soupsieve==2.8.1
     # via beautifulsoup4
 starlette==0.41.2
     # via
@@ -357,14 +355,14 @@ timm==1.0.22
     #   unstructured-inference
 tokenizers==0.22.1
     # via transformers
-torch==2.9.0
+torch==2.9.1
     # via
     #   accelerate
     #   effdet
     #   timm
     #   torchvision
     #   unstructured-inference
-torchvision==0.24.0
+torchvision==0.24.1
     # via
     #   effdet
     #   timm
@@ -374,7 +372,7 @@ tqdm==4.67.1
     #   nltk
     #   transformers
     #   unstructured
-transformers==4.57.1
+transformers==4.57.3
     # via unstructured-inference
 typing-extensions==4.15.0
     # via
@@ -397,19 +395,19 @@ typing-inspect==0.9.0
     # via dataclasses-json
 typing-inspection==0.4.2
     # via pydantic
-tzdata==2025.2
+tzdata==2025.3
     # via pandas
-unstructured[all-docs]==0.18.18
+unstructured[all-docs]==0.18.24
     # via -r requirements/base.in
-unstructured-client==0.42.3
+unstructured-client==0.42.6
     # via unstructured
 unstructured-inference==1.1.1
     # via unstructured
 unstructured-pytesseract==0.3.15
     # via unstructured
-urllib3==2.5.0
+urllib3==2.6.2
     # via requests
-uvicorn==0.38.0
+uvicorn==0.40.0
     # via -r requirements/base.in
 webencodings==0.5.1
     # via html5lib

requirements/constraints.txt

Lines changed: 1 addition & 3 deletions
@@ -4,14 +4,12 @@
 #
 #    pip-compile --no-strip-extras ./requirements/constraints.in
 #
-anyio==4.11.0
+anyio==4.12.0
     # via starlette
 idna==3.11
     # via anyio
 numpy==1.26.4
     # via -r requirements/constraints.in
-sniffio==1.3.1
-    # via anyio
 starlette==0.41.2
     # via -r requirements/constraints.in
 typing-extensions==4.15.0
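
When reviewing a pin bump like the ones above, it can help to summarize which packages changed, appeared, or disappeared between two compiled requirements files. The following is a rough sketch under stated assumptions, not tooling from this repository. `parse_pins` and `compare_pins` are hypothetical names, and only exact `name==version` pins are handled.

```python
def parse_pins(text: str) -> dict[str, str]:
    """Map package name -> pinned version from pip-compile style text."""
    pins: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines such as "# via ...", and non-pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Drop any trailing inline comment or environment marker.
        version = version.split("#", 1)[0].split(";", 1)[0].strip()
        # Extras such as "[grpc]" stay part of the key, as written in the file.
        pins[name.strip().lower()] = version
    return pins


def compare_pins(old_text: str, new_text: str) -> dict:
    """Summarize changed, removed, and added pins between two files."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    return {
        "changed": {
            name: (old[name], new[name])
            for name in sorted(old.keys() & new.keys())
            if old[name] != new[name]
        },
        "removed": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
    }


if __name__ == "__main__":
    # Sample pins taken from the requirements/constraints.txt hunk above.
    old_constraints = (
        "anyio==4.11.0\n    # via starlette\n"
        "sniffio==1.3.1\n    # via anyio\n"
        "starlette==0.41.2\n    # via -r requirements/constraints.in\n"
    )
    new_constraints = (
        "anyio==4.12.0\n    # via starlette\n"
        "starlette==0.41.2\n    # via -r requirements/constraints.in\n"
    )
    print(compare_pins(old_constraints, new_constraints))
```

On the sample above, the summary reports `anyio` as changed from 4.11.0 to 4.12.0 and `sniffio` as removed, which matches the hunk.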
