Skip to content

feat: experimental API#3

Open
jolovicdev wants to merge 2 commits intomasterfrom
dev
Open

feat: experimental API#3
jolovicdev wants to merge 2 commits intomasterfrom
dev

Conversation

@jolovicdev
Copy link
Copy Markdown
Owner

Summary of Changes

Added

  • FastAPI-based REST API for the blackgeorge framework
  • Worker management endpoints (create, list, get, delete)
  • Workforce management endpoints (create, list, get, delete)
  • Job execution endpoints for workers and workforces
  • Run status and resume endpoints
  • Event tracking endpoints
  • Health check endpoint
  • Pydantic request/response models
  • Structured JSON logging middleware
  • Custom exception classes with proper HTTP status codes
  • OpenAPI/Swagger auto-documentation at /docs
  • 32 API tests with full coverage

Changed

  • Added list_runs() method to RunStore base class and implementations
  • Added httpx to dev dependencies for test client support
  • Configured ruff to ignore B008 for API routes (standard FastAPI pattern)

Documentation

  • Added docs/api.md with API usage examples

@jolovicdev jolovicdev self-assigned this Jan 10, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 908ee1aa38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/blackgeorge/api/routes/runs.py
Comment thread src/blackgeorge/api/routes/workers.py
@jolovicdev
Copy link
Copy Markdown
Owner Author

@ds-review

Copy link
Copy Markdown

@ds-review ds-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review

PR: feat: experimental API

Important

Verdict: Request changes - 13 actionable findings, highest severity P0.

Findings (13)

P0 Critical No authentication or authorization

src/blackgeorge/api/main.py:11

The API has no authentication mechanism, allowing anyone with network access to create workers, run jobs, and access all endpoints. This is a critical vulnerability for production deployment.

P0 Critical Resume discards paused report state

src/blackgeorge/api/routes/runs.py:88

The resume endpoint constructs a new Report with empty lists for pending_action, errors, messages, tool_calls, and events, losing the suspended state needed for resumption. Use the original paused report instead of reconstructing it from the run record.

P0 Critical worker.tools is not callable

src/blackgeorge/api/routes/workers.py:33

worker.tools is a list attribute, not a method. Calling it with parentheses will raise TypeError. Use worker.tools instead of worker.tools().

P1 High Default CORS allows all origins

src/blackgeorge/api/config.py:9

The default cors_origins = ['*'] is permissive and a security risk. Restrict to a safe default (e.g., localhost) and require explicit configuration for production.

P1 High Blocking I/O in async route handler

src/blackgeorge/api/routes/runs.py:38

desk.run is likely a synchronous blocking call. In an async FastAPI route, it will block the event loop, degrading performance. Offload it with await run_in_threadpool(desk.run, worker, job, stream=request.stream).

P2 Medium list_runs fetches all records before filtering

src/blackgeorge/api/routes/status.py:38

Fetching all runs and then filtering in Python is inefficient and memory-heavy for large datasets. The store's list_runs should support filtering and pagination at the query level.

P2 Medium Missing pagination on run events endpoint

src/blackgeorge/api/routes/status.py:66

The run events endpoint returns all events for a run without limit or offset. Runs with many events can produce large responses and impact client and server performance.

P2 Medium Direct access to private Desk._workers

src/blackgeorge/api/routes/workers.py:20

The route directly accesses desk._workers, a private attribute. Use a public accessor method (e.g., desk.get_worker(name)) to avoid coupling and future breakage.

P2 Medium Missing pagination on list workers endpoint

src/blackgeorge/api/routes/workers.py:39

The list workers endpoint returns all registered workers without any limit or offset. In large deployments this can lead to large response payloads and excessive memory usage.

P2 Medium Missing pagination on list workforces endpoint

src/blackgeorge/api/routes/workforces.py:56

The list workforces endpoint returns all workforces without pagination, risking large responses and memory pressure when many workforces exist.

P3 Low Duplicate get_config function

src/blackgeorge/api/dependencies.py:13

dependencies.py defines a get_config function that mirrors the one in config.py. Remove the duplicate and import from config.py to avoid confusion.

P3 Low Hot-path logging may degrade performance

src/blackgeorge/api/middleware.py:14

Logging every incoming request and completed request at INFO level introduces overhead on every HTTP call. In high-traffic scenarios this can become a bottleneck.

P3 Low Use custom InvalidResumeError

src/blackgeorge/api/routes/runs.py:72

The resume endpoint throws a generic HTTPException for non-paused runs. The custom InvalidResumeError is better suited and ensures consistent error responses.

How To Recheck

Reply @ds-review recheck under the relevant inline finding after pushing a fix.



def create_app() -> FastAPI:
config = get_config()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Critical No authentication or authorization

The API has no authentication mechanism, allowing anyone with network access to create workers, run jobs, and access all endpoints. This is a critical vulnerability for production deployment.

Suggested change
config = get_config()
Integrate FastAPI's security utilities to add authentication (e.g., API key, OAuth2) and enforce access control.

name=worker.name,
model=worker.model,
instructions=worker.instructions,
tools=[t.name for t in worker.tools()],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Critical worker.tools is not callable

worker.tools is a list attribute, not a method. Calling it with parentheses will raise TypeError. Use worker.tools instead of worker.tools().

Suggested change
tools=[t.name for t in worker.tools()],
tools=[t.name for t in worker.tools],

Comment on lines +77 to +88
report = Report(
run_id=run_id,
status=record.status,
content=record.output,
data=record.output_json,
pending_action=None,
metrics={},
errors=[],
messages=[],
tool_calls=[],
events=[],
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Critical Resume discards paused report state

The resume endpoint constructs a new Report with empty lists for pending_action, errors, messages, tool_calls, and events, losing the suspended state needed for resumption. Use the original paused report instead of reconstructing it from the run record.

Copy link
Copy Markdown
Owner Author

@jolovicdev jolovicdev May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this further ? @ds-review

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the resume endpoint (line 77), you're creating a new Report with pending_action=None, errors=[], messages=[], tool_calls=[], events=[]—all blank. The paused run’s report had accumulated these fields to track progress (e.g., pending actions, errors, tool calls). Reconstructing it from limited record fields loses that state, so the desk.resume call can't properly pick up where it left off. Instead, the original Report object from the run should be stored and reused.

metadata=request.metadata,
)

report = desk.run(worker, job, stream=request.stream)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 High Blocking I/O in async route handler

desk.run is likely a synchronous blocking call. In an async FastAPI route, it will block the event loop, degrading performance. Offload it with await run_in_threadpool(desk.run, worker, job, stream=request.stream).

Suggested change
report = desk.run(worker, job, stream=request.stream)
report = await run_in_threadpool(desk.run, worker, job, stream=request.stream)


title: str = "Blackgeorge API"
version: str = "0.1.0"
cors_origins: list[str] = ["*"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 High Default CORS allows all origins

The default cors_origins = ['*'] is permissive and a security risk. Restrict to a safe default (e.g., localhost) and require explicit configuration for production.

Suggested change
cors_origins: list[str] = ["*"]
cors_origins: list[str] = ["http://localhost:3000"]

manager=workforce.manager.name if workforce.manager else None,
)


Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Medium Missing pagination on list workforces endpoint

The list workforces endpoint returns all workforces without pagination, risking large responses and memory pressure when many workforces exist.

Suggested change
async def list_workforces(desk: Desk = Depends(get_desk), limit: int = Query(100, ge=1, le=1000), offset: int = Query(0, ge=0)) -> list[WorkforceResponse]:
workforces = list(desk._workforces.values())
return [
WorkforceResponse(
name=w.name,
workers=[worker.name for worker in w.workers],
mode=w.mode,
manager=w.manager.name if w.manager else None,
)
for w in workforces[offset : offset + limit]
]

async def get_run_events(run_id: str, desk: Desk = Depends(get_desk)) -> list[EventResponse]:
record = desk.run_store.get_run(run_id)
if not record:
raise RunNotFoundError(run_id)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Medium Missing pagination on run events endpoint

The run events endpoint returns all events for a run without limit or offset. Runs with many events can produce large responses and impact client and server performance.

Suggested change
raise RunNotFoundError(run_id)
async def get_run_events(run_id: str, desk: Desk = Depends(get_desk), limit: int = Query(100, ge=1, le=1000), offset: int = Query(0, ge=0)) -> list[EventResponse]:
record = desk.run_store.get_run(run_id)
if not record:
raise RunNotFoundError(run_id)
events = desk.run_store.get_events(run_id)
page = events[offset : offset + limit]
return [
EventResponse(
event_id=e.event_id,
type=e.type,
timestamp=e.timestamp,
source=e.source,
payload=e.payload,
)
for e in page
]

Comment on lines +9 to +13

def get_config() -> APIConfig:
global _config
if _config is None:
_config = APIConfig()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Low Duplicate get_config function

dependencies.py defines a get_config function that mirrors the one in config.py. Remove the duplicate and import from config.py to avoid confusion.

Suggested change
def get_config() -> APIConfig:
global _config
if _config is None:
_config = APIConfig()
from blackgeorge.api.config import get_config

def register_middleware(app: Any) -> None:
@app.middleware("http")
async def log_requests(
request: Request, call_next: Callable[[Request], Awaitable[Response]]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Low Hot-path logging may degrade performance

Logging every incoming request and completed request at INFO level introduces overhead on every HTTP call. In high-traffic scenarios this can become a bottleneck.

Suggested change
request: Request, call_next: Callable[[Request], Awaitable[Response]]
logger.debug(
"Incoming request",
method=request.method,
path=request.url.path,
client=str(request.client),
)


if record.status != "paused":
raise HTTPException(
status_code=400, detail=f"Run '{run_id}' is not paused (status: {record.status})"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Low Use custom InvalidResumeError

The resume endpoint throws a generic HTTPException for non-paused runs. The custom InvalidResumeError is better suited and ensures consistent error responses.

Suggested change
status_code=400, detail=f"Run '{run_id}' is not paused (status: {record.status})"
raise InvalidResumeError(f"Run '{run_id}' is not paused (status: {record.status})")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant