quickwit-oss
diff --git a/‎.claude/skills/sesh-mode/SKILL.md‎
Lines changed: 82 additions & 0 deletions b/‎.claude/skills/sesh-mode/SKILL.md‎
Lines changed: 82 additions & 0 deletions
diff --git a/‎.github/CODEOWNERS‎
Lines changed: 10 additions & 5 deletions b/‎.github/CODEOWNERS‎
Lines changed: 10 additions & 5 deletions
diff --git a/‎.github/workflows/ci.yml‎
Lines changed: 0 additions & 1 deletion b/‎.github/workflows/ci.yml‎
Lines changed: 0 additions & 1 deletion
diff --git a/‎.github/workflows/coverage.yml‎
Lines changed: 0 additions & 1 deletion b/‎.github/workflows/coverage.yml‎
Lines changed: 0 additions & 1 deletion
diff --git a/‎.github/workflows/datafusion-ci.yml‎
Lines changed: 0 additions & 1 deletion b/‎.github/workflows/datafusion-ci.yml‎
Lines changed: 0 additions & 1 deletion
diff --git a/‎.github/workflows/dependency.yml‎
Lines changed: 8 additions & 1 deletion b/‎.github/workflows/dependency.yml‎
Lines changed: 8 additions & 1 deletion
diff --git a/‎.gitignore‎
Lines changed: 3 additions & 0 deletions b/‎.gitignore‎
Lines changed: 3 additions & 0 deletions
diff --git a/‎CLAUDE.md‎
Lines changed: 1 addition & 0 deletions b/‎CLAUDE.md‎
Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,88 @@ Stateright      DST Tests      Production Metrics
 (exhaustive)    (simulation)   (Observability)
 ```
 
+## STOP, Don't Weaken — Specs are Load-Bearing
+
+Whether the load-bearing claim is a formal property (TLA+ invariant,
+Stateright property, DST assertion) or an English-language user
+requirement, the rule is the same: **MUST NOT** silently weaken it.
+First action is diagnosis and an explicit conversation with the user,
+not modification.
+
+### Verification check fails
+
+When a TLA+ invariant, Stateright property, or DST assertion fails, the
+failure has exactly two possible causes:
+
+1. **The implementation/model has a real bug.** Fix the bug.
+2. **The property is over-strong** — it asserts something the design does not
+   actually guarantee. *Sometimes* the right answer is to weaken or replace
+   the property — but the failing trace was probably also revealing the
+   real, weaker safety property the design *does* guarantee. Almost never
+   should the answer be just "remove the property."
+
+**Required protocol**:
+- Read the failing trace. State out loud what the property meant to claim,
+  and what the trace shows.
+- If unsure whether (1) or (2) applies — **stop and ask the user**.
+  Do not silently rewrite the property.
+- If (2) applies and you propose to weaken/replace, present the candidate
+  replacement *and* the safety property the original was reaching for, then
+  ask before changing. The replacement should usually preserve the spirit
+  via a different formulation (action property, liveness, narrower precondition)
+  — not just delete the constraint.
+
+**Forbidden** without explicit user approval:
+- Renaming an invariant to make its negation trivially true
+- Deleting an invariant that just produced a counter-example
+- Adding `=> TRUE` or other no-op weakenings
+- Changing `\A` to `\E`, `[]` to `<>`, or similar quantifier flips, when the
+  motive is to suppress a violation rather than to capture a different claim
+
+### User requirement seems hard or out-of-scope
+
+The same rule applies when translating from the user's spec into a plan,
+or from the plan into implementation. If a stated requirement looks hard,
+expensive, or "more than this PR needs," **stop and ask first** — do not
+decide unilaterally that "close enough" is acceptable.
+
+Silent-weakening moves to watch for:
+- **Granularity downgrade**: user asked for page-by-page streaming;
+  plan says "column-chunk granularity is the natural unit." The memory
+  bound the user actually wanted is gone.
+- **Constraint dropping**: user said "must not OOM under load"; plan
+  treats it as "bounded in the typical case." The qualifier was doing
+  load-bearing work and you removed it.
+- **Strength reduction**: user said "byte-identical metadata"; plan
+  says "logically equivalent metadata." Test passes; spec is gone.
+- **MUST → SHOULD**: user said "one GET per file"; plan allows "two in
+  the rare retry case." May be reasonable, but only the user can
+  authorize it.
+- **Reframing as out-of-scope**: user described a property as part of
+  the goal; plan declares it a follow-up PR. If the user didn't say
+  "split it," you don't get to.
+
+**Required protocol** mirrors the verification case: quote the user's
+original phrasing back to yourself; if you cannot point at the source
+phrase that authorizes the weaker version, you are not authorized to
+ship it. Surface the gap explicitly: *"you asked for X; my plan does
+Y; here's why I considered Y; ok to weaken, or do you want X?"* The
+default answer is X.
+
+**Plan-document approval is not spec approval.** When the plan you write
+diverges from the spec at write time, the user reviews the diverged
+plan, not the original requirement. Accountability for the divergence
+is on you — they approved what you wrote, not what they asked for.
+Re-read the user's original message before writing code.
+
+### Why this matters
+
+Specs — formal or English — describe the system's promise to its
+users. When a check fails or a requirement seems too strict, that has
+just told you either that the implementation is wrong, or that you've
+been claiming a stronger promise than you actually keep. Both deserve
+a conscious decision, never a silent edit.
+
 ## Testing Through Production Path
 
 **MUST NOT** claim a feature works unless tested through the actual network stack.
 
@@ -1,9 +1,7 @@
 # CODEOWNERS — see https://docs.github.com/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
 #
-# Last matching rule per file wins. Approval from any listed owner satisfies
-# the requirement. quickwit-dev is listed on every rule so it can always
-# approve; an additional team is listed on the metrics Parquet pipeline paths
-# so PRs scoped to those paths can be approved by either team.
+# Last matching rule per file wins. Multiple owners listed on the same line
+# means approval from ANY ONE of them satisfies the requirement.
 
 # Default: quickwit-core owns everything
 *                                                        @quickwit-oss/quickwit-core
@@ -13,4 +11,11 @@
 /quickwit/quickwit-datafusion/                           @quickwit-oss/byoc-metrics
 /quickwit/quickwit-df-core/                              @quickwit-oss/byoc-metrics
 /quickwit/quickwit-dst/                                  @quickwit-oss/byoc-metrics
-/quickwit/quickwit-indexing/src/actors/metrics_pipeline/ @quickwit-oss/byoc-metrics
+/quickwit/quickwit-indexing/src/actors/parquet_pipeline/ @quickwit-oss/byoc-metrics
+
+# Shared paths — either team can approve. `docs/internals/` is shared
+# architecture + verification docs that any team working on the codebase
+# may need to update. `Cargo.lock` churns on routine dependency bumps
+# and doesn't carry domain-specific review value.
+/docs/internals/                                         @quickwit-oss/quickwit-core @quickwit-oss/byoc-metrics
+/quickwit/Cargo.lock                                     @quickwit-oss/quickwit-core @quickwit-oss/byoc-metrics
@@ -16,7 +16,6 @@ permissions:
 
 env:
   CARGO_INCREMENTAL: 0
-  QW_DISABLE_TELEMETRY: 1
   QW_TEST_DATABASE_URL: postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev
   RUST_BACKTRACE: 1
   RUSTDOCFLAGS: -Dwarnings -Arustdoc::private_intra_doc_links
 
@@ -20,7 +20,6 @@ env:
   AWS_SECRET_ACCESS_KEY: "placeholder"
   CARGO_INCREMENTAL: 0
   PUBSUB_EMULATOR_HOST: "localhost:8681"
-  QW_DISABLE_TELEMETRY: 1
   QW_S3_ENDPOINT: "http://localhost:4566" # Services are exposed as localhost because we are not running coverage in a container.
   QW_S3_FORCE_PATH_STYLE_ACCESS: 1
   QW_TEST_DATABASE_URL: postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev
 
@@ -18,7 +18,6 @@ permissions:
 
 env:
   CARGO_INCREMENTAL: 0
-  QW_DISABLE_TELEMETRY: 1
   RUST_BACKTRACE: 1
   RUSTFLAGS: -Dwarnings --cfg tokio_unstable
 
 
@@ -20,4 +20,11 @@ jobs:
         with:
           # GHSA-c38w-74pg-36hr, GHSA-4grx-2x9w-596c: minor vuln on the rsa crate, used for google storage.
           # GHSA-cq8v-f236-94qc: rand 0.8.6 unsound with custom logger + rand::rng(), not affected (log feature disabled, transitive dep from fail/sqlx).
-          allow-ghsas: GHSA-c38w-74pg-36hr,GHSA-4grx-2x9w-596c,GHSA-cq8v-f236-94qc
+          # GHSA-2f9f-gq7v-9h6m: thrift 0.17 excessive-allocation on untrusted input. Already a transitive
+          #   dep via parquet on main; PR-4 (streaming Parquet reader) adds it as a direct dep for
+          #   `parquet::format::PageHeader::read_from_in_protocol`. Inputs are parquet files we wrote to
+          #   our own object store (trusted source). Defense-in-depth: the streaming reader caps each
+          #   Thrift parse buffer at `max_page_header_bytes` (1 MiB default) BEFORE handing bytes to
+          #   thrift, so the outer buffer thrift can read from is bounded regardless of any inner
+          #   length prefix. No newer thrift release fixes this — 0.17.0 is the latest on crates.io.
+          allow-ghsas: GHSA-c38w-74pg-36hr,GHSA-4grx-2x9w-596c,GHSA-cq8v-f236-94qc,GHSA-2f9f-gq7v-9h6m
@@ -32,3 +32,6 @@ qwdata
 
 # GSD planning artifacts (local only)
 .planning
+
+# TLC model checker working directory (state dumps from `tla2tools.jar`)
+states/
@@ -43,6 +43,7 @@ See `quickwit/CLAUDE.md` for architecture overview, crate descriptions, and buil
 | Recreates futures in `select!` loops | Use `&mut fut` to resume, not recreate — dropping loses data | GAP-002 |
 | Holds locks across await points | Invariant violations on cancel. Use message passing or synchronous critical sections | GAP-002 |
 | Silently swallows unexpected state | If a condition "shouldn't happen," return an error or assert — don't silently return Ok. Skipping optional/missing data is fine; pretending a bug didn't occur is not | Code quality |
+| Replies to PR review comments as standalone inline comments | Use `gh api repos/.../pulls/{pr}/comments/{codex_comment_id}/replies` (or `in_reply_to_id` in the POST body) so the reply is **threaded under** the original review comment. Standalone inline comments at the same line are NOT replies; they show as separate threads. Verify with the GitHub API that `in_reply_to_id` is set on your reply. | Review hygiene |
 
 ## Engineering Priority