Skip to content

Commit 144d88f

Browse files
robodev-r2d2a-klos
andauthored
chore: refactor with black and flake8 (#156)
This pull request focuses on improving logging practices, code documentation, and minor code quality enhancements across both the `admin-api-lib` and `extractor-api-lib` libraries. The main changes include replacing generic error logging with `logger.exception()` for better stack trace visibility, adding or improving docstrings for classes and methods, and updating project dependencies. Additionally, some minor code cleanups and consistency improvements were made. **Logging improvements:** * Replaced `logger.error` (often with explicit `traceback.format_exc()`) with `logger.exception()` in multiple files to ensure stack traces are logged automatically and consistently. This affects error handling in file and source uploaders, document retrievers, and extractors. [[1]](diffhunk://#diff-5cf6e4adf466c99d1afc322897549f6cda78687be53f7d1d2f43ca7551e32aa4L57-R58) [[2]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL112-R113) [[3]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL127-R137) [[4]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL199-R199) [[5]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL232-R231) [[6]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L114-R121) [[7]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L153-R155) [[8]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L200-R204) [[9]](diffhunk://#diff-6d8a274cc6bd15025ebdd55cdb7525c2b01da207871c44440632c9ebea0ce4deL79-R77) **Documentation enhancements:** * Added or improved docstrings for classes and methods, including module-level docstrings in several files (e.g., `default_file_uploader.py`, `default_source_uploader.py`, `file_extractor.py`, `source_extractor.py`). [[1]](diffhunk://#diff-7dddc401b91c2ff4593c718a6211b203b1f5e15eb78cb33ec9e6da5756a0d173R11) [[2]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455R33) [[3]](diffhunk://#diff-6788b6d236b1f7d9e8ddcd74ed1f7c265ba82fa4b193f2eaa9d2ff9790a35c67R1-R2) [[4]](diffhunk://#diff-73a4b2010a5a6312fd5a5496546dca10ff9fc480e4c4be037c96372647366c7eR1-R2) [[5]](diffhunk://#diff-bcba3b03484e8a08bc3addb98d2b8f50cdaa5cf8e8d9605270bc69a6231599b7R25) [[6]](diffhunk://#diff-8980643c87d06c313959eb7f400a57787367a482168872181e875355cb55d64bL16-R17) [[7]](diffhunk://#diff-2b5524f0cb01b11e336def1a99356a243662de61a73be6dd5da1be89227cf112R39) [[8]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeR1-L3) [[9]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455R1-R2) **Dependency and configuration updates:** * Updated the version of both `admin-api-lib` and `extractor-api-lib` to `v3.3.0` in their respective `pyproject.toml` files. [[1]](diffhunk://#diff-9c5aeb0db77c2eec077d07ddc3b3810ae1a4a1e50ee7061fba37a46706c513fbL7-R7) [[2]](diffhunk://#diff-dede389bcfb615c4b45cd1da7ac14cbe9535305f41f19cce09e321c91a8bb323L7-R7) * Enabled `flake8-logging-format` and `flake8-docstrings` in both libraries for improved code linting and documentation enforcement. [[1]](diffhunk://#diff-9c5aeb0db77c2eec077d07ddc3b3810ae1a4a1e50ee7061fba37a46706c513fbL102-R103) [[2]](diffhunk://#diff-dede389bcfb615c4b45cd1da7ac14cbe9535305f41f19cce09e321c91a8bb323L141-R142) **Code quality and consistency:** * Improved method and parameter documentation for several upload and extraction methods, and made minor adjustments to method docstrings for clarity and consistency. [[1]](diffhunk://#diff-7dddc401b91c2ff4593c718a6211b203b1f5e15eb78cb33ec9e6da5756a0d173L19-R20) [[2]](diffhunk://#diff-07e3a38bb2ae6f51f1ce64c6e8e1898be13dc441c1c43071c8c2ecc5f352531eL24-R24) [[3]](diffhunk://#diff-8065840d5a5800fba5b7e366084521208dd6acdb106779806da0a04443d18667L152-R152) [[4]](diffhunk://#diff-8065840d5a5800fba5b7e366084521208dd6acdb106779806da0a04443d18667L184-R184) [[5]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL81-R82) [[6]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L81-R84) [[7]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L98) [[8]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL127-R137) [[9]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L114-R121) * Minor code cleanups, such as removing unnecessary imports (`traceback`), and improving warning message punctuation. [[1]](diffhunk://#diff-5cf6e4adf466c99d1afc322897549f6cda78687be53f7d1d2f43ca7551e32aa4L5) [[2]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeR1-L3) [[3]](diffhunk://#diff-6d8a274cc6bd15025ebdd55cdb7525c2b01da207871c44440632c9ebea0ce4deL6-L7) [[4]](diffhunk://#diff-7c45d126e034917dfbd2dbd921a73cd39c2b82a91752a1b6d3d60f9a01a04287L336-R336) These changes collectively improve maintainability, error visibility, and code documentation across the codebase. --------- Co-authored-by: Andreas Klos <andreas.klos@stackit.cloud>
1 parent 8937351 commit 144d88f

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

61 files changed

+817
-112
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
**/.nx/*
12
*.tgz
23
*/Chart.lock
34
helm-chart/charts/admin-backend-0.0.1.tgz

libs/admin-api-lib/poetry.lock

Lines changed: 63 additions & 2 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

libs/admin-api-lib/pyproject.toml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
44

55
[tool.poetry]
66
name = "admin-api-lib"
7-
version = "v3.2.1"
7+
version = "v3.3.0"
88
description = "The admin backend is responsible for the document management. This includes deletion, upload and returning the source document."
99
authors = [
1010
"STACKIT GmbH & Co. KG <data-ai@stackit.cloud>",
@@ -99,8 +99,8 @@ flake8-wot = "^0.2.0"
9999
flake8-function-order = "^0.0.5"
100100
flake8-tidy-imports = "^4.10.0"
101101
black = "^25.1.0"
102-
# flake8-logging-format = "^2024.24.12"
103-
# flake8-docstrings = "^1.7.0"
102+
flake8-logging-format = "^2024.24.12"
103+
flake8-docstrings = "^1.7.0"
104104
pytest-asyncio = "^1.0.0"
105105

106106
[tool.poetry.dependencies]

libs/admin-api-lib/src/admin_api_lib/api_endpoints/file_uploader.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88

99

1010
class FileUploader(UploaderBase):
11+
"""File uploader endpoint of the admin API."""
1112

1213
@abstractmethod
1314
async def upload_file(
@@ -16,7 +17,7 @@ async def upload_file(
1617
file: UploadFile,
1718
) -> None:
1819
"""
19-
Uploads a source file for content extraction.
20+
Upload a source file for content extraction.
2021
2122
Parameters
2223
----------

libs/admin-api-lib/src/admin_api_lib/api_endpoints/source_uploader.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ async def upload_source(
2121
timeout: Optional[float],
2222
) -> None:
2323
"""
24-
Uploads the parameters for source content extraction.
24+
Upload the parameters for source content extraction.
2525
2626
Parameters
2727
----------

libs/admin-api-lib/src/admin_api_lib/api_endpoints/uploader_base.py

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,9 +7,7 @@ class UploaderBase:
77
"""Base class for uploader API endpoints."""
88

99
def __init__(self):
10-
"""
11-
Initialize the UploaderBase.
12-
"""
10+
"""Initialize the UploaderBase."""
1311
self._background_threads = []
1412

1513
def _prune_background_threads(self) -> list[Thread]:

libs/admin-api-lib/src/admin_api_lib/apis/admin_api.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -149,7 +149,7 @@ async def upload_file(
149149
request: Request,
150150
) -> None:
151151
"""
152-
Uploads user selected sources.
152+
Upload user selected sources.
153153
154154
Parameters
155155
----------
@@ -181,7 +181,7 @@ async def upload_source(
181181
key_value_pair: List[KeyValuePair] = Body(None, description="The key-value pairs for the source"),
182182
) -> None:
183183
"""
184-
Uploads user selected sources.
184+
Upload user selected sources.
185185
186186
Parameters
187187
----------

libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_document_reference_retriever.py

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22

33
import io
44
import logging
5-
import traceback
65

76
from fastapi import HTTPException, Response, status
87

@@ -54,10 +53,9 @@ async def adocument_reference_id_get(self, identification: str) -> Response:
5453
self._file_service.download_file(identification, document_buffer)
5554
logger.debug("DONE retrieving document with id: %s", identification)
5655
document_data = document_buffer.getvalue()
57-
except Exception as e:
58-
logger.error(
59-
"Error retrieving document with id: %s. Error: %s %s", identification, e, traceback.format_exc()
60-
)
56+
except Exception:
57+
# Log full stack trace without embedding the exception object in the message (G200)
58+
logger.exception("Error retrieving document with id: %s.", identification)
6159
raise ValueError(f"Document with id '{identification}' not found.")
6260
finally:
6361
document_buffer.close()

libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_file_uploader.py

Lines changed: 12 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
1+
"""Module for the default file uploader implementation."""
2+
13
import logging
24
from pathlib import Path
3-
import traceback
45
import urllib
56
import tempfile
67
import asyncio
@@ -78,7 +79,7 @@ async def upload_file(
7879
file: UploadFile,
7980
) -> None:
8081
"""
81-
Uploads a source file for content extraction.
82+
Upload a source file for content extraction.
8283
8384
Parameters
8485
----------
@@ -109,7 +110,7 @@ async def upload_file(
109110
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
110111
except Exception as e:
111112
self._key_value_store.upsert(source_name, Status.ERROR)
112-
logger.error("Error while uploading %s = %s", source_name, str(e))
113+
logger.exception("Error while uploading %s", source_name)
113114
raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e))
114115

115116
def _log_task_exception(self, task: asyncio.Task) -> None:
@@ -124,19 +125,16 @@ def _log_task_exception(self, task: asyncio.Task) -> None:
124125
if task.done() and not task.cancelled():
125126
try:
126127
task.result() # This will raise the exception if one occurred
127-
except Exception as e:
128-
logger.error("Background task failed with exception: %s", str(e))
129-
logger.debug("Background task exception traceback: %s", traceback.format_exc())
128+
except Exception:
129+
logger.exception("Background task failed with exception.")
130130

131131
def _prune_background_tasks(self) -> None:
132-
"""
133-
Remove completed background tasks from the list.
134-
"""
132+
"""Remove completed background tasks from the list."""
135133
self._background_tasks = [task for task in self._background_tasks if not task.done()]
136134

137135
def _check_if_already_in_processing(self, source_name: str) -> None:
138136
"""
139-
Checks if the source is already in processing state.
137+
Check if the source is already in processing state.
140138
141139
Parameters
142140
----------
@@ -196,9 +194,9 @@ async def _handle_source_upload(
196194
await asyncio.to_thread(self._rag_api.upload_information_piece, rag_information_pieces)
197195
self._key_value_store.upsert(source_name, Status.READY)
198196
logger.info("Source uploaded successfully: %s", source_name)
199-
except Exception as e:
197+
except Exception:
200198
self._key_value_store.upsert(source_name, Status.ERROR)
201-
logger.error("Error while uploading %s = %s", source_name, str(e))
199+
logger.exception("Error while uploading %s", source_name)
202200

203201
def _add_file_url(self, file_name: str, base_url: str, chunked_documents: list[Document]):
204202
document_url = f"{base_url.rstrip('/')}/document_reference/{urllib.parse.quote_plus(file_name)}"
@@ -229,6 +227,6 @@ async def _asave_new_document(
229227

230228
self._file_service.upload_file(Path(temp_file_path), filename)
231229
return filename
232-
except Exception as e:
233-
logger.error("Error during document saving: %s %s", e, traceback.format_exc())
230+
except Exception:
231+
logger.exception("Error during document saving")
234232
self._key_value_store.upsert(source_name, Status.ERROR)

libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_source_uploader.py

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
"""Module for the default source uploader implementation."""
2+
13
import logging
24
import asyncio
35
from threading import Thread
@@ -28,6 +30,7 @@
2830

2931

3032
class DefaultSourceUploader(SourceUploader):
33+
"""The DefaultSourceUploader is responsible for uploading source files for content extraction."""
3134

3235
def __init__(
3336
self,
@@ -78,7 +81,7 @@ async def upload_source(
7881
kwargs: list[KeyValuePair],
7982
) -> None:
8083
"""
81-
Uploads the parameters for source content extraction.
84+
Upload the parameters for source content extraction.
8285
8386
Parameters
8487
----------
@@ -95,7 +98,6 @@ async def upload_source(
9598
-------
9699
None
97100
"""
98-
99101
self._prune_background_threads()
100102

101103
source_name = f"{source_type}:{sanitize_document_name(name)}"
@@ -111,12 +113,12 @@ async def upload_source(
111113
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
112114
except Exception as e:
113115
self._key_value_store.upsert(source_name, Status.ERROR)
114-
logger.error("Error while uploading %s = %s", source_name, str(e))
116+
logger.exception("Error while uploading %s", source_name)
115117
raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e))
116118

117119
def _check_if_already_in_processing(self, source_name: str) -> None:
118120
"""
119-
Checks if the source is already in processing state.
121+
Check if the source is already in processing state.
120122
121123
Parameters
122124
----------
@@ -150,7 +152,7 @@ def _thread_worker(self, source_name, source_type, kwargs, timeout):
150152
logger.error("Upload of %s timed out after %s seconds", source_name, timeout)
151153
self._key_value_store.upsert(source_name, Status.ERROR)
152154
except Exception:
153-
logger.error("Error while uploading %s", source_name)
155+
logger.exception("Error while uploading %s", source_name)
154156
self._key_value_store.upsert(source_name, Status.ERROR)
155157
finally:
156158
loop.close()
@@ -197,6 +199,6 @@ async def _handle_source_upload(
197199
await asyncio.to_thread(self._rag_api.upload_information_piece, rag_information_pieces)
198200
self._key_value_store.upsert(source_name, Status.READY)
199201
logger.info("Source uploaded successfully: %s", source_name)
200-
except Exception as e:
202+
except Exception:
201203
self._key_value_store.upsert(source_name, Status.ERROR)
202-
logger.error("Error while uploading %s = %s", source_name, str(e))
204+
logger.exception("Error while uploading %s", source_name)

0 commit comments

Comments
 (0)