
Commit b9708b3

Add import lifecycle event callback
1 parent 12b6ff0 commit b9708b3

9 files changed

Lines changed: 668 additions & 59 deletions

File tree

docs/architecture.md

Lines changed: 11 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -46,6 +46,8 @@ flowchart LR
4646
- owns the user-facing workflow
4747
- coordinates import/export operations
4848
- keeps the top-level API compact
49+
- exposes `import_data(..., on_event=...)` as an additive progress-reporting
50+
hook for import runs
4951

5052
### Schema
5153

@@ -77,6 +79,15 @@ flowchart LR
7779
- dispatches create/update/upsert logic
7880
- isolates backend execution from parsing concerns
7981

82+
### Import Session
83+
84+
`src/excelalchemy/core/import_session.py`
85+
86+
- owns one import run's lifecycle and mutable runtime state
87+
- emits structured lifecycle events when `on_event=...` is supplied
88+
- keeps those events on the same synchronous path as header validation, row
89+
execution, and result workbook rendering
90+
8091
### Rendering
8192

8293
`src/excelalchemy/core/rendering.py`

docs/domain-model.md

Lines changed: 14 additions & 0 deletions
@@ -33,6 +33,7 @@ For component structure, see [`docs/architecture.md`](architecture.md).
3333
| Worksheet table | `src/excelalchemy/core/table.py` | Lightweight internal 2D table abstraction used for workbook import/export flow instead of pandas. | Internal, but important to understand |
3434
| Import session | `src/excelalchemy/core/import_session.py` | Owns one import run’s lifecycle, state, counts, header table, worksheet table, and result rendering decisions. | Internal |
3535
| Import session snapshot | `src/excelalchemy/core/import_session.py` | Immutable summary of the current import session phase and counts. | Internal |
36+
| Import lifecycle event callback | `src/excelalchemy/core/alchemy.py`, `src/excelalchemy/core/import_session.py` | Optional per-run callback passed to `ExcelAlchemy.import_data(...)` for synchronous lifecycle events. | Public concept |
3637
| Row aggregator | `src/excelalchemy/core/rows.py` | Reconstructs flattened worksheet rows back into model-shaped payloads. | Internal |
3738
| Import issue tracker | `src/excelalchemy/core/rows.py` | Maps cell and row issues back into workbook coordinates and result columns. | Internal |
3839
| Import executor | `src/excelalchemy/core/executor.py` | Validates row payloads and dispatches configured create/update/upsert callbacks. | Internal |
@@ -69,6 +70,8 @@ For component structure, see [`docs/architecture.md`](architecture.md).
6970
### Execution responsibilities
7071

7172
- `ExcelAlchemy` turns a config and schema into a usable workflow object.
73+
- `ExcelAlchemy.import_data(..., on_event=...)` can report lifecycle progress
74+
to a job or service layer while keeping the import itself synchronous.
7275
- `ExcelSchemaLayout` turns schema declarations into a flattened Excel layout.
7376
- `ExcelHeaderParser` and `ExcelHeaderValidator` decide whether an uploaded workbook matches that layout.
7477
- `RowAggregator` reconstructs model-shaped data from worksheet rows.
@@ -106,6 +109,7 @@ For component structure, see [`docs/architecture.md`](architecture.md).
106109
- `ExcelStorage` provides workbook input as `WorksheetTable` and accepts rendered workbook output for upload.
107110
- During import:
108111
  - `ImportSession` coordinates the lifecycle
112+
  - an optional `on_event` callback can observe lifecycle milestones inline
109113
  - `ExcelHeaderParser` parses header rows
110114
  - `ExcelHeaderValidator` validates them against `ExcelSchemaLayout`
111115
  - `RowAggregator` reconstructs row payloads
@@ -137,6 +141,7 @@ For component structure, see [`docs/architecture.md`](architecture.md).
137141
- `ImportResult`
138142
- `CellErrorMap`
139143
- `RowIssueMap`
144+
- `ExcelAlchemy.import_data(..., on_event=...)`
140145

141146
### Internal concepts
142147

@@ -179,6 +184,9 @@ The import flow is the richest lifecycle in the repository.
179184
- Start point:
180185
  - `ExcelAlchemy.import_data(...)`
181186
  - implemented in `src/excelalchemy/core/alchemy.py`
187+
- Optional public progress hook:
188+
  - `ExcelAlchemy.import_data(..., on_event=...)`
189+
  - emits simple event dictionaries during the same synchronous import run
182190
- Runtime owner:
183191
  - `ImportSession`
184192
  - `src/excelalchemy/core/import_session.py`
@@ -197,6 +205,12 @@ The import flow is the richest lifecycle in the repository.
197205
- `HEADER_INVALID`
198206
- `DATA_INVALID`
199207
- `SUCCESS`
208+
- Event vocabulary:
209+
  - `started`
210+
  - `header_validated`
211+
  - `row_processed`
212+
  - `completed`
213+
  - `failed`
200214
- Workbook-facing row result concept:
201215
- `ValidateRowResult`
202216
- values:
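
The event vocabulary added in this hunk (`started`, `header_validated`, `row_processed`, `completed`, `failed`) could be pinned down as a type on the consumer side. The sketch below is illustrative only: `ImportEventName`, `read_event_name`, and `is_terminal` are hypothetical helpers, and treating `completed` and `failed` as the run-ending events is an assumption the diff does not state.

```python
from typing import Literal, cast, get_args

# The five names come from the documented event vocabulary.
ImportEventName = Literal['started', 'header_validated', 'row_processed', 'completed', 'failed']

# Assumption: these two events end an import run.
TERMINAL_EVENT_NAMES: frozenset[str] = frozenset({'completed', 'failed'})


def read_event_name(event: dict[str, object]) -> ImportEventName:
    """Return the event name, rejecting anything outside the vocabulary."""
    name = event.get('event')
    if name not in get_args(ImportEventName):
        raise ValueError(f'unknown import event: {name!r}')
    return cast(ImportEventName, name)


def is_terminal(event: dict[str, object]) -> bool:
    """True when the event is assumed to close the run."""
    return read_event_name(event) in TERMINAL_EVENT_NAMES
```

Validating the name up front means a callback fails loudly if the vocabulary grows, instead of silently ignoring a new event.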

docs/public-api.md

Lines changed: 32 additions & 0 deletions
@@ -49,6 +49,9 @@ These modules are the recommended import paths for application code:
4949
The recommended backend configuration pattern in the 2.x line.
5050
- `ExcelArtifact`
5151
The recommended return shape when you need bytes, base64, or data URLs.
52+
- `ExcelAlchemy.import_data(..., on_event=...)`
53+
The additive public hook for synchronous import lifecycle events during one
54+
import run.
5255
- import inspection names:
5356
Prefer `worksheet_table`, `header_table`, `cell_error_map`, and
5457
`row_error_map` when reading import-run state from the facade.
@@ -113,6 +116,35 @@ For most application code, these are the recommended import paths:
113116
- `from excelalchemy.results import ...`
114117
Use this if you need result models or richer error-map helper types directly.
115118

119+
For synchronous job-style progress reporting, you can attach an event callback
120+
to the existing import call:
121+
122+
```python
123+
job_state = {'status': 'pending', 'processed_rows': 0, 'total_rows': 0}
124+
125+
def handle_import_event(event: dict[str, object]) -> None:
126+
    if event['event'] == 'started':
127+
        job_state['status'] = 'running'
128+
    elif event['event'] == 'row_processed':
129+
        job_state['processed_rows'] = event['processed_row_count']
130+
        job_state['total_rows'] = event['total_row_count']
131+
    elif event['event'] == 'completed':
132+
        job_state['status'] = 'completed'
133+
        job_state['result'] = event['result']
134+
    elif event['event'] == 'failed':
135+
        job_state['status'] = 'failed'
136+
137+
result = await alchemy.import_data(
138+
    'employees.xlsx',
139+
    'employee-import-result.xlsx',
140+
    on_event=handle_import_event,
141+
)
142+
```
143+
144+
This is still a synchronous import. The callback runs inline during normal
145+
header validation, row execution, and result rendering, which makes it useful
146+
for service-layer progress tracking without introducing a new execution model.
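
For service-layer progress tracking, the callback state can also live in a small object instead of a bare dictionary. The following is a minimal sketch under stated assumptions: `JobProgress` and its `percent` property are hypothetical and not part of ExcelAlchemy, and only the event keys shown in the example above (`event`, `processed_row_count`, `total_row_count`) are relied on.

```python
from dataclasses import dataclass


@dataclass
class JobProgress:
    """Hypothetical progress holder; pass `progress.handle` as `on_event`."""

    status: str = 'pending'
    processed_rows: int = 0
    total_rows: int = 0

    def handle(self, event: dict[str, object]) -> None:
        name = event['event']
        if name == 'started':
            self.status = 'running'
        elif name == 'row_processed':
            processed = event['processed_row_count']
            total = event['total_row_count']
            # Only accept integer counts; anything else leaves the state untouched.
            if isinstance(processed, int) and isinstance(total, int):
                self.processed_rows = processed
                self.total_rows = total
        elif name in ('completed', 'failed'):
            self.status = str(name)

    @property
    def percent(self) -> float:
        # Guard against division by zero before any row event has arrived.
        if self.total_rows == 0:
            return 0.0
        return 100.0 * self.processed_rows / self.total_rows


progress = JobProgress()
progress.handle({'event': 'started'})
progress.handle({'event': 'row_processed', 'processed_row_count': 1, 'total_row_count': 2})
```

A bound method such as `progress.handle` has the same one-argument shape as `handle_import_event`, so it could be passed as `on_event=progress.handle` if the callback signature is as shown in the example.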
147+
116148
If you are building API responses from import failures, the recommended public
117149
result helpers are:
118150

examples/employee_import_workflow.py

Lines changed: 41 additions & 5 deletions
@@ -82,9 +82,17 @@ async def create_employee(row: dict[str, object], context: dict[str, object] | N
8282
    return row
8383

8484

85-
async def run_workflow() -> tuple[ImportResult, InMemoryImportStorage, dict[str, object]]:
85+
async def run_workflow() -> tuple[ImportResult, InMemoryImportStorage, dict[str, object], list[dict[str, object]]]:
8686
    storage = InMemoryImportStorage()
87-
    context: dict[str, object] = {'created_rows': []}
87+
    context: dict[str, object] = {
88+
        'created_rows': [],
89+
        'job_progress': {
90+
            'status': 'pending',
91+
            'processed_rows': 0,
92+
            'total_rows': 0,
93+
        },
94+
    }
95+
    events: list[dict[str, object]] = []
8896

8997
    alchemy = ExcelAlchemy(
9098
        ImporterConfig.for_create(
@@ -98,14 +106,40 @@ async def run_workflow() -> tuple[ImportResult, InMemoryImportStorage, dict[str,
98106

99107
    template = alchemy.download_template_artifact(filename='employee-template.xlsx')
100108
    _build_import_fixture(storage, template.as_bytes())
101-
    result = await alchemy.import_data('employee-import.xlsx', 'employee-import-result.xlsx')
102-
    return result, storage, context
109+
110+
    def handle_import_event(event: dict[str, object]) -> None:
111+
        events.append(event)
112+
        job_progress = context['job_progress']
113+
        assert isinstance(job_progress, dict)
114+
115+
        match event['event']:
116+
            case 'started':
117+
                job_progress['status'] = 'running'
118+
            case 'row_processed':
119+
                job_progress['processed_rows'] = event['processed_row_count']
120+
                job_progress['total_rows'] = event['total_row_count']
121+
            case 'completed':
122+
                job_progress['status'] = 'completed'
123+
                job_progress['result'] = event['result']
124+
                job_progress['result_workbook_url'] = event['url']
125+
            case 'failed':
126+
                job_progress['status'] = 'failed'
127+
                job_progress['error'] = event['error_message']
128+
129+
    result = await alchemy.import_data(
130+
        'employee-import.xlsx',
131+
        'employee-import-result.xlsx',
132+
        on_event=handle_import_event,
133+
    )
134+
    return result, storage, context, events
103135

104136

105137
def main() -> None:
106-
    result, storage, context = asyncio.run(run_workflow())
138+
    result, storage, context, events = asyncio.run(run_workflow())
107139
    created_rows = context['created_rows']
140+
    job_progress = context['job_progress']
108141
    assert isinstance(created_rows, list)
142+
    assert isinstance(job_progress, dict)
109143

110144
    print('Employee import workflow completed')
111145
    print(f'Result: {result.result}')
@@ -114,6 +148,8 @@ def main() -> None:
114148
    print(f'Result workbook URL: {result.url}')
115149
    print(f'Created rows: {len(created_rows)}')
116150
    print(f'Uploaded artifacts: {sorted(storage.uploaded)}')
151+
    print(f'Observed events: {[event["event"] for event in events]}')
152+
    print(f'Job progress: {job_progress}')
117153

118154

119155
if __name__ == '__main__':
