Skip to content

Commit cd52042

Browse files
Fix column mask and cross-policy row filter bugs
Bug 1 — column_mask had no effect: parse_mask_expr used ctx.sql() to build a standalone plan, causing a double alias (Alias(Alias(...))) and table-qualified column references that didn't resolve in the real plan. Rewrote parse_mask_expr as a sync function using sql_ast_to_df_expr with FunctionRegistry lookup, eliminating the standalone plan entirely. Extended sql_ast_to_df_expr with an optional registry parameter to support all built-in scalar functions (RIGHT, LEFT, UPPER, LOWER, CONCAT, COALESCE, etc.) in mask expressions. Bug 2 — cross-policy row filters were OR'd instead of AND'd: multiple permit policies on the same table produced a union of rows instead of an intersection. Fixed collect() to use lit(true) seed and .and() combinator so each policy adds a restriction (intersection semantics). Added regression tests for both bugs: - test_exec_permit_column_mask: literal mask applied to all rows - test_exec_column_mask_with_row_filter: mask + row filter combined - test_exec_two_permits_row_filter_and: disjoint filters → 0 rows - test_exec_two_permits_row_filter_and_overlapping: overlapping → intersection Updated docs/permission-system.md, proxy/CLAUDE.md, and docs/permission-security-tests.md to reflect correct semantics.
1 parent 6c9fe66 commit cd52042

5 files changed

Lines changed: 1410 additions & 341 deletions

File tree

docs/permission-security-tests.md

Lines changed: 35 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -156,7 +156,7 @@ async fn policy_required_no_policy_returns_empty() { ... }
156156

157157
---
158158

159-
### 9. `column_access deny` on deny-effect policies ignored at query time
159+
### 11. `column_access deny` on deny-effect policies ignored at query time
160160

161161
**Vector**: Admin creates a **deny-effect** policy with a `column_access deny` obligation on `ssn`, expecting the column to be stripped from query results immediately.
162162

@@ -168,7 +168,7 @@ async fn policy_required_no_policy_returns_empty() { ... }
168168

169169
---
170170

171-
### 10. Disabled policies still enforced in visibility layer
171+
### 12. Disabled policies still enforced in visibility layer
172172

173173
**Vector**: Admin disables a policy with `column_access deny`, expecting the column to reappear in `information_schema.columns` on next reconnect.
174174

@@ -183,7 +183,39 @@ async fn policy_required_no_policy_returns_empty() { ... }
183183

184184
---
185185

186-
### 11. `SELECT <denied-column>` returns silent empty rows instead of an error
186+
### 13. Column mask had no effect — original values returned
187+
188+
**Vector**: Admin creates a `column_mask` obligation expecting `ssn` values to be masked (e.g. `'***-**-' || RIGHT(ssn, 4)`). Data is queried and original SSN values are returned as-is.
189+
190+
**Bug**: `parse_mask_expr` built a standalone SQL plan (`SELECT {mask} AS {col} FROM {schema}.{table}`) via `ctx.sql()`, then extracted the first `Projection` expression. Two problems:
191+
1. **Double alias**: the extracted expression was already `Alias(inner, "ssn")` from the `AS ssn` clause; `apply_projection` then wrapped it again with `.alias(col_name)` producing `Alias(Alias(...))`, which DataFusion silently resolved by dropping the inner alias — causing column not found or type mismatches at execution time.
192+
2. **Qualified column references**: the inner expression carried table-qualified references (e.g. `public.customers.ssn`) bound to the standalone plan's `TableScan`. These did not resolve against the actual query plan, so the mask evaluated to NULL or errored.
193+
194+
**Defense**: `parse_mask_expr` is now sync and uses `sql_ast_to_df_expr(..., Some(ctx))` — the same sqlparser → DataFusion AST converter used for row filter expressions, extended with `FunctionRegistry` lookup. No standalone plan is built. Column references are unqualified (`col("ssn")`), resolving correctly against the real query plan. No alias wrapping occurs — `apply_projection` provides the alias.
195+
196+
**Test**:
197+
- Unit: `hooks::policy::tests::test_exec_permit_column_mask` — literal mask `'REDACTED'` applied; all SSN values in result equal `"REDACTED"`.
198+
- Unit: `hooks::policy::tests::test_exec_column_mask_with_row_filter` — row filter (3 rows) + mask combined; 3 rows returned with `ssn = "***"`.
199+
- Unit: `hooks::policy::tests::test_deny_overrides_mask` — column denied and masked; deny takes priority, column absent from result.
200+
201+
---
202+
203+
### 14. Two permit policies with row_filter produced a union (OR) instead of intersection (AND)
204+
205+
**Vector**: Two permit policies both have `row_filter` obligations on the same table with different conditions (e.g. Policy A: `org_id = 'acme'`, Policy B: `status = 'active'`). A user assigned both policies can see ALL rows matching either condition — including rows from other tenants or inactive records that neither policy alone intended to expose.
206+
207+
**Bug**: In `ObligationEffects::collect()`, cross-policy row filters were combined with OR semantics (seed `lit(false)`, combinator `.or()`). The intent was "any permit match grants access", but this allows a user assigned multiple narrow policies to see the union of all their allowed sets — potentially broader than any single policy intended.
208+
209+
**Defense**: Cross-policy row filters are now combined with AND semantics (seed `lit(true)`, combinator `.and()`). Each permit policy adds a restriction; users see the intersection. Within a single policy, multiple `row_filter` obligations are still AND'd (unchanged). Deny policies are unaffected — the deny short-circuit on first match is equivalent to OR across denies.
210+
211+
**Test**:
212+
- Unit: `hooks::policy::tests::test_exec_two_permits_row_filter_and` — two disjoint filters (`acme` / `globex`) → AND → 0 rows.
213+
- Unit: `hooks::policy::tests::test_exec_two_permits_row_filter_and_overlapping` — overlapping filters (`org_id = 'acme'``name != 'Charlie'`) → 2 rows (Alice + Bob only).
214+
- Unit: `hooks::policy::tests::test_row_filters_and_across_policies` — plan structure shows AND expression containing both filter values.
215+
216+
---
217+
218+
### 15. `SELECT <denied-column>` returns silent empty rows instead of an error
187219

188220
**Vector**: User runs `SELECT ssn FROM customers` where `ssn` is denied. They receive many rows with empty/null values and incorrectly conclude the column is empty in the database.
189221

docs/permission-system.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ Injects a `WHERE` clause into queries that touch the specified table.
4040
Use `"schema": "*"` and/or `"table": "*"` to match all schemas or tables.
4141

4242
Multiple `row_filter` obligations on the **same policy** targeting the same table are **AND**ed together.
43-
`row_filter` obligations from **different permit policies** are **OR**ed together — any permit match grants access.
43+
`row_filter` obligations from **different permit policies** are also **AND**ed together — each policy adds a restriction, and users see the intersection of all matching policies.
4444

4545
### column_mask
4646

@@ -131,7 +131,7 @@ Each policy assignment has a `priority` (integer, lower = higher precedence, def
131131

132132
| Situation | Resolution |
133133
|---|---|
134-
| Multiple permit policies, same table | Row filters are OR'd |
134+
| Multiple permit policies, same table | Row filters are AND'd (intersection) |
135135
| Multiple column masks, same column | Lowest priority number wins |
136136
| column_access deny from any enabled policy (permit or deny) | Column is always removed |
137137
| Equal priority, user-specific vs wildcard | User-specific assignment wins |

ideas.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
- Policy creation can be hard without deep understanding
2727
- Consider: policy templates, policy examples, or AI-assisted policy creation
2828
- Allow create (paste) and/or import policy from json/yaml, as a way to help users to quickly copy polices from testing env to prod env.
29+
- auto scan upstream tables, fetch a few rows, then use LLM to suggest a bunch of policies, users can pick to add/enable them, and/or future tweak them manually or by iterating with LLM.
2930

3031
### Performance of PolicyHook
3132

@@ -235,6 +236,8 @@ Given complexity of new policy system (interaction with DataFusion and PostgreSQ
235236
- 2026-03-04: DataFusion query error - table 'postgres.pg_catalog.pg_statio_user_tables' not found
236237
- 2026-03-04: DataFusion query error - table 'postgres.information_schema.table_constraints' not found
237238
- 2026-03-04: DataFusion query error - Invalid function 'quote_ident'. Did you mean 'date_bin'?
239+
- 2026-03-08: Column masking obligation doesn't work - tested with SSN column, still see the whole value instead of masked
240+
- 2026-03-08: Row filter policy interaction bug - when two separate row filter policies are enabled (e.g., tenant filter on tenant='foo' AND state filter on state!='WY'), the result contains more rows than either policy alone. Both tenant 'foo' rows AND non-WY state rows appear, rather than rows satisfying BOTH conditions.
238241
- Sometimes SQL queries take long time and cause UI to hang - need performance testing, may be missing indexes
239242

240243
## Frontend Architecture Guideline: Future-Proofing UI

proxy/CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,7 @@ Policy CRUD handlers also call `state.proxy_handler.rebuild_contexts_for_datasou
6464
`PolicyHook` replaces the old hardcoded `RLSHook`. It loads policies from the DB, caches per `(datasource_id, username)` for 60 seconds, and applies three obligation types:
6565

6666
- **row_filter**`Filter(expr)` node injected below the matching `TableScan` via `transform_up`. Template variables (`{user.tenant}`, `{user.username}`, `{user.id}`) are substituted as `Expr::Literal` after parsing — never interpolated as raw SQL.
67-
- **column_mask** — replaces the column `Expr` in the top-level `Projection` with an aliased mask expression. Async-parsed before `transform_up` via `ctx.sql("SELECT {mask} AS {col} FROM ...")`.
67+
- **column_mask** — replaces the column `Expr` in the top-level `Projection` with an aliased mask expression. Parsed synchronously via `sql_ast_to_df_expr(..., Some(ctx))` — sqlparser converts the mask template to a DataFusion `Expr` using the session's `FunctionRegistry` for built-in function lookup (RIGHT, LEFT, UPPER, LOWER, CONCAT, COALESCE, etc.). No standalone SQL plan is created.
6868
- **column_access deny** — strips denied columns from the top-level `Projection`. Wildcards (`schema: "*"`, `table: "*"`) match any schema/table.
6969

7070
**Deny policies** short-circuit on the first match — query is rejected with a descriptive error before plan execution.

0 commit comments

Comments
 (0)