Commit 15a7a42

Merge pull request #5 from harsh-kr11/fix/metrics-and-results
Full audit: dedup fix, PV tests, README rewrite, release workflow
2 parents 7bc31c4 + a3c579a commit 15a7a42

11 files changed

Lines changed: 392 additions & 454 deletions

.env.example

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ LANGFUSE_HOST=https://cloud.langfuse.com
 FEW_SHOT_K=3 # Number of traces to retrieve per query
 MAX_PROMPT_TOKENS=3500 # Token budget for the prompt
 SIMILARITY_DEDUP_THRESHOLD=0.95 # Cosine threshold for deduplication (Section III.E.3)
+SANDBOX_TIMEOUT_SECONDS=30 # Gatekeeper sandbox execution timeout
 
 # === Feedback Loop ===
 FEEDBACK_SCORE_NAME=quality # Langfuse score name to watch
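The new `SANDBOX_TIMEOUT_SECONDS` setting caps how long sandboxed execution may run. The Gatekeeper code itself is not part of this diff, so the following is only a minimal sketch of how such a limit might be enforced with the standard library; `sandbox_timeout_seconds` and `run_in_sandbox` are hypothetical names, not functions from this repository.

```python
import os
import subprocess
import sys
from typing import Optional, Tuple


def sandbox_timeout_seconds(default: float = 30.0) -> float:
    """Read SANDBOX_TIMEOUT_SECONDS from the environment (hypothetical helper)."""
    raw = os.environ.get("SANDBOX_TIMEOUT_SECONDS", "")
    # .env-style values may carry a trailing comment; keep only the part before '#'.
    value = raw.split("#", 1)[0].strip()
    return float(value) if value else default


def run_in_sandbox(code: str, timeout: Optional[float] = None) -> Tuple[bool, str]:
    """Run a Python snippet in a child process; return (finished_ok, output).

    Assumption: a plain subprocess stands in for the real sandbox here.
    """
    limit = sandbox_timeout_seconds() if timeout is None else timeout
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=limit,  # subprocess.run kills the child once this elapses
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {limit} seconds"
    return proc.returncode == 0, proc.stdout


if __name__ == "__main__":
    print(run_in_sandbox("print('hello')"))
```

A child process is used so that a runaway snippet can be killed from outside; a timeout inside the same interpreter could not interrupt a tight loop reliably.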

.github/workflows/release.yml

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,7 @@ on:
     tags: ["v*"]
 
 permissions:
+  contents: read
   id-token: write
 
 jobs:
@@ -16,3 +17,6 @@ jobs:
       - uses: astral-sh/setup-uv@v4
       - run: uv build
       - uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: dist/
+

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -31,6 +31,6 @@ Thumbs.db
 
 uv.lock
 
-benchmark_results.json
+benchmark_results*.json
 .chainlit/
 chainlit.md
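The `.gitignore` change widens an exact filename into a glob. A quick way to see which files each pattern covers is shell-style matching from the standard library; this is a rough sketch, since `fnmatch` only approximates gitignore semantics (it agrees for slash-free patterns like these), and the candidate filenames are made up for illustration.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "benchmark_results.json"
NEW_PATTERN = "benchmark_results*.json"

# Hypothetical filenames, not taken from the repository.
candidates = [
    "benchmark_results.json",
    "benchmark_results_run2.json",
    "benchmark_results-gpt4.json",
    "benchmark.json",
]


def ignored_by(pattern: str, names: list) -> list:
    """Return the names that the given glob pattern matches (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    print("old:", ignored_by(OLD_PATTERN, candidates))
    print("new:", ignored_by(NEW_PATTERN, candidates))
```

Because `*` also matches the empty string, the new pattern still ignores the original `benchmark_results.json`.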

CONTRIBUTING.md

Lines changed: 11 additions & 1 deletion
@@ -59,6 +59,16 @@ git commit -m "your message"
 To run checks manually at any time:
 
 ```bash
+# Using make (recommended)
+make lint # Lint
+make format # Auto-format
+make typecheck # Type check (mypy strict)
+make test # Run all tests
+make ci # All checks (lint + format + typecheck + test)
+make validate # Pipeline validation (no API keys needed)
+make demo # Offline demo
+
+# Or directly
 uv run ruff check src/ tests/ agent/ # Lint only
 uv run ruff check --fix src/ tests/ agent/ # Lint + auto-fix
 uv run ruff format src/ tests/ agent/ # Format
@@ -69,7 +79,7 @@ uv run pytest # Run all tests
 ## Running Tests
 
 ```bash
-# All 104 tests (no external services needed)
+# All tests (no external services needed)
 uv run pytest
 
 # With verbose output
