CHANGELOG.md (19 additions & 0 deletions)
@@ -9,6 +9,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
**Parallel Execution**
- Added parallel task execution with `num_workers` parameter in `Benchmark.run()` using `ThreadPoolExecutor` (PR: #14)
- Added `ComponentRegistry` class for thread-safe component registration with thread-local storage (PR: #14)
- Added `TaskContext` for cooperative timeout checking with `check_timeout()`, `elapsed`, `remaining`, and `is_expired` properties (PR: #14)
- Added `TaskProtocol` dataclass with `timeout_seconds`, `timeout_action`, `max_retries`, `priority`, and `tags` fields for task-level execution control (PR: #14)
- Added `TaskTimeoutError` exception with `elapsed`, `timeout`, and `partial_traces` attributes (PR: #14)
- Added `TASK_TIMEOUT` to `TaskExecutionStatus` enum for timeout classification (PR: #14)
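To illustrate how cooperative timeout checking can combine with a thread pool, here is a minimal sketch. It does not use the project's classes: `SketchTaskContext`, `SketchTimeoutError`, `run_task`, and `run_all` are hypothetical stand-ins, and the real `TaskContext`, `TaskTimeoutError`, and `Benchmark.run()` may behave differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SketchTimeoutError(Exception):
    """Hypothetical stand-in for a task timeout error."""

    def __init__(self, elapsed, timeout):
        super().__init__(f"task exceeded {timeout:.2f}s (elapsed {elapsed:.2f}s)")
        self.elapsed = elapsed
        self.timeout = timeout


class SketchTaskContext:
    """Hypothetical context that a task polls to honor its deadline."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self._start = time.monotonic()

    @property
    def elapsed(self):
        return time.monotonic() - self._start

    @property
    def remaining(self):
        return max(0.0, self.timeout_seconds - self.elapsed)

    @property
    def is_expired(self):
        return self.elapsed >= self.timeout_seconds

    def check_timeout(self):
        # Cooperative: nothing interrupts the thread; the task must call this.
        if self.is_expired:
            raise SketchTimeoutError(self.elapsed, self.timeout_seconds)


def run_task(steps, step_seconds, timeout_seconds):
    """Simulated task that checks its deadline before every step."""
    ctx = SketchTaskContext(timeout_seconds)
    try:
        for _ in range(steps):
            ctx.check_timeout()
            time.sleep(step_seconds)  # placeholder for real work
        return "success"
    except SketchTimeoutError:
        return "task_timeout"


def run_all(task_specs, num_workers):
    """Run tasks on a thread pool and return statuses in submission order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(run_task, *spec) for spec in task_specs]
        return [future.result() for future in futures]
```

Because threads cannot be safely killed from outside, a timeout in this style only takes effect at the points where the task calls `check_timeout()`, so long-running steps should poll it regularly.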
**Task Queue Abstraction**
- Added `TaskQueue` abstract base class with iterator interface for flexible task scheduling (PR: #14)
- Added `SequentialQueue` for simple FIFO task ordering (PR: #14)
- Added `PriorityQueue` for priority-based task scheduling using `TaskProtocol.priority` (PR: #14)
- Added `AdaptiveTaskQueue` abstract base class for feedback-based adaptive scheduling with `initial_state()`, `select_next_task(remaining, state)`, and `update_state(task, report, state)` methods (PR: #14)
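The three-method shape described for adaptive scheduling can be sketched as follows. This is an illustration under assumptions, not the project's code: `SketchAdaptiveQueue`, `FailFirstQueue`, and `drain` are hypothetical names, tasks and reports are plain dicts, and `update_state` is assumed to return the new state rather than mutate it in place.

```python
from abc import ABC, abstractmethod


class SketchAdaptiveQueue(ABC):
    """Hypothetical base class mirroring the three methods named above."""

    @abstractmethod
    def initial_state(self):
        """Return the scheduler state used before any task has run."""

    @abstractmethod
    def select_next_task(self, remaining, state):
        """Pick one task from the remaining list given the current state."""

    @abstractmethod
    def update_state(self, task, report, state):
        """Return the state updated with the outcome of the finished task."""


class FailFirstQueue(SketchAdaptiveQueue):
    """Example policy: prefer tasks whose tag has failed most often so far."""

    def initial_state(self):
        return {}  # maps tag -> failure count

    def select_next_task(self, remaining, state):
        # max() returns the first maximal element, so ties keep input order.
        return max(remaining, key=lambda task: state.get(task["tag"], 0))

    def update_state(self, task, report, state):
        if not report["ok"]:
            state = dict(state)
            state[task["tag"]] = state.get(task["tag"], 0) + 1
        return state


def drain(queue, tasks, run):
    """Drive the queue: select, run, feed the report back, repeat."""
    remaining = list(tasks)
    state = queue.initial_state()
    order = []
    while remaining:
        task = queue.select_next_task(remaining, state)
        remaining.remove(task)
        report = run(task)
        state = queue.update_state(task, report, state)
        order.append(task["id"])
    return order
```

The point of the split is that selection depends only on the remaining tasks and the accumulated state, so a scheduling policy can be swapped without touching the loop that runs tasks.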
**ModelAdapter Chat Interface**
- Added `chat()` method to `ModelAdapter` as the primary interface for LLM inference, accepting a list of messages in OpenAI format and optional tools, and returning a `ChatResponse` object
@@ -48,6 +65,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
**Benchmark**
- `Benchmark.agent_data` parameter is now optional (defaults to empty dict) (PR: #16)
- Refactored `Benchmark` to delegate registry operations to `ComponentRegistry` class (PR: #)
- `Benchmark.run()` now accepts an optional `queue` parameter (`BaseTaskQueue`) for custom task scheduling (PR: #14)
Tasks define individual benchmark scenarios including inputs, expected outputs, and any metadata needed for evaluation. TaskCollections group related tasks together.
Tasks define individual benchmark scenarios including inputs, expected outputs, and metadata for evaluation. Task queues control execution order and scheduling strategy.