ArchitectAI


A senior engineer interrogates your idea. Live research catches the market.
You get a structured architecture plan you can actually ship from.



Demo

Discovery → Plan — type a raw idea, answer a few targeted questions, get a research-backed architecture plan with hard calls on stack, what not to skip, and what you are likely to over-build.

[Image: Discovery to plan demo]

Refinement — change a constraint mid-session and re-run the pipeline.

[Image: Refinement demo]


How it works

Three agents, one optional follow-up research pass, zero LLM-controlled branching.

```mermaid
sequenceDiagram
    actor User
    participant FE as Next.js Frontend
    participant API as FastAPI
    participant SM as SessionManager
    participant DA as Discovery Agent
    participant RA as Research Agent
    participant PA as Planning Agent
    participant RD as Redis

    User->>FE: Submits raw idea
    FE->>API: POST /v1/sessions
    API->>SM: create(raw_idea)
    SM->>DA: next_turn([], raw_idea)
    DA-->>SM: AskQuestion
    SM->>RD: store SessionState
    API-->>FE: { type: "question" }

    loop Discovery (1 to N turns)
        User->>FE: Answers question
        FE->>API: POST /v1/sessions/{id}/turn
        SM->>DA: next_turn(history, raw_idea)
        DA-->>SM: AskQuestion | CompleteBrief
        SM->>RD: store SessionState
        API-->>FE: { type: "question" } | { type: "plan" }
    end

    Note over SM,PA: Brief complete, pipeline starts
    SM->>RA: run(brief)
    RA-->>SM: ResearchFindings
    SM->>PA: run(brief, findings, iteration=0)
    PA-->>SM: Plan

    alt coverage_assessment.unresolved non-empty
        SM->>RA: run(brief, follow_up_queries)
        RA-->>SM: ResearchFindings (targeted)
        SM->>PA: run(brief, findings, iteration=1)
        PA-->>SM: Plan (final)
    end

    SM->>RD: store complete SessionState
    FE->>API: polls /v1/sessions/{id}/state
    API-->>FE: plan_ready
    FE-->>User: Renders Plan page
```

Discovery interrogates the idea → Research gathers citable evidence → Planning makes hard calls. If Planning returns unresolved decisions, a second targeted Research pass fires — once. The orchestrator owns all branching; agents are stateless callables with no session knowledge.


What makes it technically interesting

Control flow in Python, not prompts

Loop gate, cost caps, and outcome branching are isinstance() checks and named constants in orchestrator/. No LLM is ever asked to decide whether to loop or stop.
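As a rough illustration of type-based branching, the sketch below dispatches on a Discovery outcome with isinstance(). AskQuestion and CompleteBrief are named in the sequence diagram; their fields, the handle_discovery_outcome function, and the response payloads are assumptions made for this example, not the code in orchestrator/.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AskQuestion:
    """Discovery wants one more answer from the user (field name is assumed)."""
    question: str


@dataclass(frozen=True)
class CompleteBrief:
    """Discovery has enough information (field name is assumed)."""
    brief: dict


def handle_discovery_outcome(outcome: AskQuestion | CompleteBrief) -> dict:
    """Branch on the outcome's Python type; no model decides what happens next."""
    if isinstance(outcome, AskQuestion):
        return {"type": "question", "question": outcome.question}
    if isinstance(outcome, CompleteBrief):
        # Hypothetical payload: a real orchestrator would start Research here.
        return {"type": "plan", "brief": outcome.brief}
    # Anything else is a programming error, not something to ask an LLM about.
    raise TypeError(f"unexpected discovery outcome: {type(outcome).__name__}")
```

Because the branch is an ordinary type check, an unexpected outcome fails with a TypeError instead of silently taking a path chosen by model output.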

Research loop bounded by a constant

Second Research pass fires once. The cap is MAX_RESEARCH_ITERATIONS in constants.py. Changing the loop bound is one line of Python, not a prompt edit.
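A minimal sketch of a constant-bounded loop follows. MAX_RESEARCH_ITERATIONS and coverage_assessment.unresolved come from this README; the value 2, the run_pipeline function, the dict-shaped plan, and the way unresolved items become follow-up queries are all assumptions for illustration.

```python
from typing import Callable

# Assumed value: one initial pass plus one targeted follow-up.
MAX_RESEARCH_ITERATIONS = 2


def run_pipeline(brief: dict, research: Callable, plan: Callable) -> dict:
    """Run Research then Planning, at most MAX_RESEARCH_ITERATIONS times."""
    queries = None  # the first pass has no follow-up queries
    result = {}
    for iteration in range(MAX_RESEARCH_ITERATIONS):
        findings = research(brief, queries)
        result = plan(brief, findings, iteration)
        unresolved = result["coverage_assessment"]["unresolved"]
        if not unresolved:
            break  # nothing left to research; stop early
        # Assumption: unresolved decisions are reused as follow-up queries.
        queries = unresolved
    return result
```

The loop exits either when Planning reports nothing unresolved or when the range is exhausted, so the number of Research calls can never exceed the constant regardless of what the agents return.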

Cost caps enforced before every agent call

CostTracker accumulates spend across the session and gates each call against a per-run and daily cap. Caps are set at construction time, not described in a prompt.
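The idea can be sketched roughly as below. Only the CostTracker name and the two caps are taken from this README; the field names, the check/record split, and the CostCapExceeded exception are hypothetical.

```python
from dataclasses import dataclass


class CostCapExceeded(RuntimeError):
    """Raised when the next call would push spend past a cap (hypothetical name)."""


@dataclass
class CostTracker:
    # Caps are fixed when the tracker is constructed.
    per_run_cap_usd: float
    daily_cap_usd: float
    run_spend_usd: float = 0.0
    daily_spend_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> None:
        """Gate a call before it is made; raise if either cap would be exceeded."""
        if self.run_spend_usd + estimated_cost_usd > self.per_run_cap_usd:
            raise CostCapExceeded("per-run cap would be exceeded")
        if self.daily_spend_usd + estimated_cost_usd > self.daily_cap_usd:
            raise CostCapExceeded("daily cap would be exceeded")

    def record(self, actual_cost_usd: float) -> None:
        """Accumulate what a completed call actually cost."""
        self.run_spend_usd += actual_cost_usd
        self.daily_spend_usd += actual_cost_usd
```

In this sketch the orchestrator would call check() before each agent call and record() after it, so a cap violation stops the run before any further spend is incurred.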

Refinement as a distributed-systems problem

Re-running the pipeline requires a Redis lock (SET NX PX), background task dispatch, SSE progress streaming, and no-op detection when the brief hasn't changed. All in session/manager.py.
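A sketch of two of these pieces, the lock and the no-op check, under stated assumptions: the client is expected to expose a redis-py style set(name, value, nx=..., px=...) call, and the key format, TTL value, function names, and return shape are invented for the example. InMemoryClient is a stand-in for local experimentation that ignores expiry; it is not part of the project.

```python
import hashlib
import json
import uuid

LOCK_TTL_MS = 30_000  # assumed value; the real TTL is not stated in this README


def brief_fingerprint(brief: dict) -> str:
    """Stable hash of a brief, used to detect that nothing changed."""
    canonical = json.dumps(brief, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def try_start_refinement(client, session_id: str, old_brief: dict, new_brief: dict):
    """Return (status, token) where status is 'noop', 'busy', or 'started'."""
    if brief_fingerprint(old_brief) == brief_fingerprint(new_brief):
        return ("noop", None)  # brief unchanged: skip the pipeline entirely
    token = uuid.uuid4().hex
    # SET key token NX PX ttl: succeeds only if no other run holds the lock.
    acquired = client.set(f"lock:refine:{session_id}", token, nx=True, px=LOCK_TTL_MS)
    if not acquired:
        return ("busy", None)
    return ("started", token)


class InMemoryClient:
    """Stand-in that mimics the call shape of set(); px expiry is ignored."""

    def __init__(self):
        self.data = {}

    def set(self, name, value, nx=False, px=None):
        if nx and name in self.data:
            return None  # redis-py also returns None when NX fails
        self.data[name] = value
        return True
```

The token stored under the lock key lets the holder verify ownership before releasing; with a real Redis server that compare-and-delete is usually done atomically in a small Lua script, which is omitted here.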

Schemas are the contract

ProjectBrief, ResearchFindings, and Plan are frozen Pydantic v2 models. Agent outputs are validated against them on the way out. The orchestrator never reads raw LLM text — it reads typed objects.
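To show the shape of that contract without extra dependencies, the sketch below swaps in a frozen stdlib dataclass for a Pydantic model; it is a stand-in, not the project's schema. StackChoice and its fields are hypothetical, borrowed from the column names of the example plan table.

```python
import json
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class StackChoice:
    """Hypothetical slice of a plan: one row of the recommended stack."""
    category: str
    choice: str
    why: str

    def __post_init__(self):
        # Validate at construction so an invalid object can never exist.
        for name in ("category", "choice", "why"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


def parse_stack_choice(raw_llm_text: str) -> StackChoice:
    """Turn raw model output into a typed object, or fail loudly."""
    payload = json.loads(raw_llm_text)  # malformed JSON raises ValueError
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    # Missing or unexpected keys raise TypeError from the constructor.
    return StackChoice(**payload)
```

Downstream code then only ever handles StackChoice instances: raw text that fails to parse never reaches it, and an accepted object cannot be mutated after validation (assignment raises FrozenInstanceError).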


Example plan output

A small recruiting agency ATS — one of the three built-in example ideas. See examples/ for three full session outputs.

Idea: "An ATS for small recruiting agencies replacing spreadsheets — Python developer, $99/seat/month, GDPR and CCPA apply."


Recommended stack

| Category | Choice | Why |
| --- | --- | --- |
| Language / Framework | Python 3.12 + FastAPI | Pydantic integration makes schema validation free; founder knows the stack. |
| Database | Postgres 16 with RLS on Neon | Managed Postgres removes ops burden; RLS enforces multi-tenancy at the database layer. |
| Auth | WorkOS or Clerk | B2B SaaS auth means SSO and team invitations; building it yourself is six months of distraction. |

Hard calls

Multi-tenancy: Postgres row-level security, one database
Use Postgres RLS in a single shared database with tenant_id on every row.
Why: At 50 tenants × 5 seats, schema-per-tenant makes migrations a distributed-systems problem for no benefit. RLS enforces isolation at the database layer.
Revisit when: a single tenant exceeds 500 GB or signs a contract requiring physical isolation.

Email integration: Gmail and Microsoft Graph OAuth webhooks
OAuth into Gmail and Microsoft Graph; subscribe to push notifications; fetch on webhook delivery.
Why: IMAP polling is fragile and slow. Forwarding addresses require recruiter workflow changes. OAuth webhooks give near-real-time delivery with maintained auth refresh.
Revisit when: a third email provider covers more than 20% of the customer base.

PII handling: Soft delete with 30-day grace, hard delete, export endpoint, audit log
Build right-to-delete and data portability on day one; append-only audit_events table per access.
Why: GDPR and CCPA apply from the first EU candidate record. Retrofitting deletion and audit after launch is painful and creates compliance exposure.


What you will over-build
Microservices: one service, keep it one service.

What not to skip
PII handling: soft delete, hard delete, export, audit log on day one.


Tech stack

| Layer | Choice | Why |
| --- | --- | --- |
| Backend | Python 3.11 + FastAPI | Pydantic v2, async-native, SSE support |
| Frontend | Next.js 15 + TypeScript | App Router, svg-pan-zoom for Mermaid |
| Session storage | Redis | TTL expiry, distributed lock (SET NX PX) |
| LLM | GPT-4.1 via Azure OpenAI | Structured output, function calling |
| Research | Tavily API | Citable web search with source metadata |
| Testing | pytest + vitest | 375 backend + 7 frontend tests, 92% coverage, fakeredis for integration tests |
| Linting | ruff + black + isort + mypy + bandit | Pre-commit enforced |

Local setup

Requirements: Python 3.11, uv, Node.js 18+, Redis.

```shell
# Clone and install
git clone git@github.com:CodeNinjaSarthak/architectai.git
cd architectai
uv sync --all-groups
uv run pre-commit install

# Environment
cp .env.example .env
# Fill in: AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT,
#          AZURE_OPENAI_DEPLOYMENT_NAME, TAVILY_API_KEY

# Run (three terminals)
redis-server
uv run uvicorn architectai.main:app --reload --port 8000
cd frontend && npm install && npm run dev   # → localhost:3000
```

Tests:

```shell
# Backend commands run from the repository root
uv run pytest tests/ --no-cov                # backend
uv run pytest tests/ --cov=src/architectai   # backend, with coverage report
cd frontend && npm test                      # frontend
```

Scope decisions

Deliberate choices, not oversights.

| Decision | Details |
| --- | --- |
| No auth | Sessions are ephemeral Redis TTLs. No accounts, no login, no history beyond the active window. |
| No deployment config | No Docker, no Terraform. Runs locally with uv + Redis. Cloud deployment is straightforward but was not the focus. |
| No rate limiting | Cost caps enforce spend per run. Nothing prevents concurrent sessions from a single caller. |
| No persistent history | Redis TTL is the only retention. Expired means gone: no database, no audit log. |
| check_mermaid.py not in CI | Validates Mermaid output from real sessions. Runs manually. Wiring it into pre-commit is a filed issue. |
| No OpenAPI for /state | The endpoint works but is not in the spec. |
| SSE no ping dedup | Client doesn't deduplicate plan_progress events. Currently harmless because the server emits distinct messages. |

License

MIT
