
Commit 245191e

Centralize obligation structs, fix column glob matching in column_access deny
- Make `matches_pattern` pub and add suffix glob support (`*_suffix`)
- Move shared obligation structs (RowFilterDef, ColumnMaskDef, ColumnAccessDef, ObjectAccessDef) to policy_match.rs as the single source of truth
- Fix column glob expansion in engine: expand patterns against Arrow schema fields at connect time so `*_name`, `secret_*`, `*` resolve to exact column names
- Add `column_glob_patterns: Vec<String>` to ObligationEffects; split exact vs glob column denies at collection time for O(1) common-case path
- Add 21 new tests across policy_match, engine, and hooks covering suffix globs, prefix globs, total blackout, cross-table isolation, large schema, and known join-collision limitation
- Document column glob patterns and known limitations in permission-system.md
1 parent f32f5b0 commit 245191e

5 files changed

Lines changed: 852 additions & 104 deletions


docs/permission-system.md

Lines changed: 28 additions & 0 deletions
@@ -151,6 +151,34 @@ This matches `raw_events`, `raw_orders`, `raw_customers`, etc. Useful for naming
151151

152152
Glob support applies to all obligation types: `row_filter`, `column_mask`, `column_access`, and `object_access`.
153153

154+
### Column glob patterns (`columns` field)
155+
156+
The `columns` field of `column_access` obligations also supports glob patterns:
157+
158+
| Column pattern | Denies | Keeps |
159+
|----------------|--------|-------|
160+
| `["*"]` | all columns in the matched table | none |
161+
| `["secret_*"]` | `secret_key`, `secret_token` | `email`, `id`, `ssn` |
162+
| `["*_name"]` | `first_name`, `last_name` | `email`, `id`, `created_at` |
163+
| `["ssn"]` | `ssn` only (exact match) | all others |
164+
| `["*_at", "secret_*"]` | `created_at`, `secret_key`, `secret_token` | `email`, `id`, `ssn` |
165+
166+
Both prefix globs (`col_*`) and suffix globs (`*_col`) are supported for column names. Patterns are **case-sensitive**, which is consistent with PostgreSQL folding unquoted identifiers to lowercase by default. Column glob matching is applied both when schema metadata is built (at connect) and when the query projection is evaluated (at execute), so denied columns are hidden from both `information_schema.columns` and `SELECT` results.
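The matching rules in the table above can be sketched in a few lines of Rust. This is a minimal illustration, not the code from `policy_match.rs`: the signature of `matches_pattern` shown here and the helper `expand_denied_columns` are assumptions, and column names are passed as plain string slices rather than read from an Arrow schema.

```rust
/// Hypothetical sketch of the glob predicate. The real `matches_pattern`
/// in policy_match.rs may have a different signature and behaviour.
/// Supports `*`, a single trailing `*` (prefix glob), a single leading `*`
/// (suffix glob), and exact names. Matching is case-sensitive.
pub fn matches_pattern(pattern: &str, name: &str) -> bool {
    if pattern == "*" {
        return true; // bare wildcard matches every name
    }
    if let Some(prefix) = pattern.strip_suffix('*') {
        return name.starts_with(prefix); // prefix glob, e.g. `secret_*`
    }
    if let Some(suffix) = pattern.strip_prefix('*') {
        return name.ends_with(suffix); // suffix glob, e.g. `*_name`
    }
    pattern == name // no wildcard: exact match only
}

/// Hypothetical helper: resolve deny patterns to exact column names,
/// preserving the order in which the columns appear in the table.
pub fn expand_denied_columns<'a>(patterns: &[&str], columns: &[&'a str]) -> Vec<&'a str> {
    columns
        .iter()
        .copied()
        .filter(|col| patterns.iter().any(|p| matches_pattern(p, col)))
        .collect()
}
```

With columns `id`, `email`, `ssn`, `secret_key`, `secret_token`, and `created_at`, the patterns `["*_at", "secret_*"]` resolve to `secret_key`, `secret_token`, and `created_at`, matching the last row of the table. Resolving patterns to exact names once, up front, means later checks can be plain name lookups.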
167+
168+
## Known limitations
169+
170+
### Column deny is not table-qualified at query time
171+
172+
`column_access deny` obligations identify denied columns by **name only**, not by `schema.table.column`. In a query that JOINs two tables both containing a column named `id`, a deny on `id` in `table_a` will also strip `id` from `table_b` in the same result set.
173+
174+
The column is correctly hidden from schema metadata (`information_schema.columns`) on a per-table basis at connect time. Query-time stripping in `SELECT *` or explicit projections is name-based across the full projection.
175+
176+
*Workaround:* use more specific column names to avoid collisions (e.g. `orders_id` instead of `id`), or restrict access at the table level with `object_access deny` when full table hiding is needed.
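To make the collision concrete, the sketch below models name-only stripping over a joined projection. It is an illustration under assumptions, not the engine's code: `ProjectedColumn` and `strip_by_name` are hypothetical names, and the deny set is assumed to hold bare column names.

```rust
use std::collections::HashSet;

/// Hypothetical model of one column in a query's output projection.
#[derive(Debug, Clone, PartialEq)]
pub struct ProjectedColumn {
    pub table: &'static str,
    pub name: &'static str,
}

/// Name-only stripping: the source table is ignored, so a deny that was
/// written for one table removes same-named columns from every table
/// in the projection.
pub fn strip_by_name(
    projection: &[ProjectedColumn],
    denied: &HashSet<&str>,
) -> Vec<ProjectedColumn> {
    projection
        .iter()
        .filter(|col| !denied.contains(col.name))
        .cloned()
        .collect()
}
```

Given a projection of `table_a.id`, `table_a.total`, `table_b.id`, and `table_b.email` with a deny set containing only `id`, this returns `table_a.total` and `table_b.email`: `table_b.id` is removed even though the deny targeted `table_a`.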
177+
178+
### `object_access deny` uses the upstream (source) schema name, not the alias
179+
180+
If a schema has been aliased in the datasource configuration, the `object_access deny` obligation must use the original upstream schema name — not the display alias. Using the alias will silently fail to deny access.
181+
154182
## Join-based row filters
155183

156184
For tables that don't directly contain a tenant column, use `join_through` to filter via a parent table:

docs/roadmap.md

Lines changed: 0 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -55,45 +55,12 @@
5555

5656
> **See also:** DM-05 (verbose mode to explain why row filtered/masked), DM-04 (canary rollout for testing policies on subset of users)
5757
58-
### Glob Pattern Matching for Schema/Table/Column Names in Obligations ✅ **Completed 2026-03-09**
59-
60-
- **Current**: obligation `schema`/`table`/`column` fields support exact match or `*` (match all). No prefix/suffix patterns.
61-
- **Use case**: naming conventions are common — `raw_*` schemas, `tmp_*` tables, `*_id` columns. Forcing a separate obligation per object is verbose and fragile (breaks when new tables are added).
62-
- **Recommendation**: support trailing-`*` glob only (e.g., `table: "raw_*"`). Not `*_foo` or mid-string patterns.
63-
- Rationale: trailing prefix match covers ~90% of real naming conventions; trivial to implement (`starts_with`); unambiguous to read and write.
64-
- Full regex: reject — footgun for policy authors, no standard representation across tools.
65-
- `starts_with`/`ends_with`/`contains` keywords: possible future extension if trailing-`*` proves insufficient.
66-
- **Implementation**: add pattern matching in `visibility_matches()` (`engine/mod.rs`) and `matches_schema_table()` (`hooks/policy.rs`) — same predicate in both to stay consistent.
67-
6858
### Wildcard `*` Support for Column Names in `column_access`
6959

7060
- **Current**: `column_access` columns field is a plain list of exact column names — no wildcard support.
7161
- **Use case**: `columns: ["*"]` combined with `schema: "*"` / `table: "orders"` would deny all columns in a table, effectively making it invisible at the column-metadata level without needing `policy_required` mode.
7262
- **Implementation**: check for `"*"` in the columns list inside the `column_access` deny block in `compute_user_visibility()` (`engine/mod.rs`) and `PolicyHook` (`hooks/policy.rs`). If present, expand to all column names from the matched table's Arrow schema.
7363

74-
### Schema & Table-level Deny Obligation (`object_access: deny`) ✅ **Completed 2026-03-09**
75-
76-
- **Problem**: In `open` mode with a global tenant-isolation policy (wildcard `row_filter`), there is no way to hide a specific schema or table from a specific user without switching the entire datasource to `policy_required` mode. `column_access: deny` hides individual columns within a table but cannot hide entire schemas or tables from the catalog.
77-
- **Current workaround**: Switch to `policy_required` — but this forces every user to have explicit permit assignments for every schema and table they need.
78-
- **What this is NOT**: This is not about hiding columns (that's `column_access: deny`). This is about hiding entire schemas or entire tables from the catalog — they become invisible in `information_schema.schemata`/`information_schema.tables`, SQL client sidebars, and query execution.
79-
- **Proposed**: Add a new obligation type that feeds into `compute_user_visibility()` at connect time (alongside the existing `column_access: deny`):
80-
- **Schema deny**: `schema: "analytics"` → entire schema and all its tables are excluded from the user's filtered `SessionContext`
81-
- **Table deny**: `schema: "public", table: "payments"` → specific table is excluded while the rest of the schema remains visible
82-
- **Use cases**:
83-
- Hide an internal `analytics` schema from external partners in `open` mode
84-
- Hide a `payments` table from support agents who only need `orders` and `customers`
85-
- Combine with glob patterns (if implemented): hide all `raw_*` schemas from non-engineering users
86-
- This lets operators stack targeted schema/table hiding on top of an `open`-mode datasource without restructuring all assignments.
87-
88-
> **See also:** DS-14 (schema-level deny), DS-15 (table-level deny) in `permission_stories.md`
89-
90-
### Validate `deny` + `column_mask` Combination ✅ **Completed 2026-03-09**
91-
92-
- **Problem**: The codebase silently ignores `column_mask` obligations on deny-effect policies. The `PolicyHook` only processes `column_mask` from permit policies, so if a user creates a deny policy with a `column_mask` obligation, it has no effect — the column is not masked.
93-
- **Recommendation**: Add validation in both API and UI to prevent this invalid combination:
94-
- **API**: Reject policy creation/update if `effect: "deny"` and `obligation_type: "column_mask"` are both present. Return a clear validation error.
95-
- **UI**: When "deny" effect is selected, hide `column_mask` from the available obligation types in the policy creation form. Show a tooltip or help text explaining why (e.g., "Column masking is not supported on deny policies").
96-
9764
### Conditional Column Masking
9865

9966
- **Use case**: Mask sensitive columns only when certain user attributes match a condition. For example:
@@ -345,8 +312,6 @@ Given complexity of new policy system (interaction with DataFusion and PostgreSQ
345312
- 2026-03-04: DataFusion query error - table 'postgres.pg_catalog.pg_statio_user_tables' not found
346313
- 2026-03-04: DataFusion query error - table 'postgres.information_schema.table_constraints' not found
347314
- 2026-03-04: DataFusion query error - Invalid function 'quote_ident'. Did you mean 'date_bin'?
348-
- 2026-03-08: Column masking obligation doesn't work ✅ **Completed 2026-03-08** - tested with SSN column, still see the whole value instead of masked
349-
- 2026-03-08: Row filter policy interaction bug ✅ **Completed 2026-03-08** - when two separate row filter policies are enabled (e.g., tenant filter on tenant='foo' AND state filter on state!='WY'), the result contains more rows than either policy alone. Both tenant 'foo' rows AND non-WY state rows appear, rather than rows satisfying BOTH conditions.
350315
- Sometimes SQL queries take long time and cause UI to hang - need performance testing, may be missing indexes
351316

352317
### Git Commit Hook Improvements
@@ -415,10 +380,6 @@ Benefit: This ensures that if we need to override a "v1" style for a specific ed
415380

416381
Specific areas identified from 2026-03-08 bug fixes — worth revisiting in a dedicated refactoring pass:
417382

418-
#### Duplicated matching logic ✅ **Completed 2026-03-08**
419-
420-
`matches_schema_table()` in `hooks/policy.rs` and `visibility_matches()` in `engine/mod.rs` implement the same schema/table wildcard matching predicate. They were written independently and must be kept in sync by hand. A shared utility (e.g., `engine::policy_match::matches_schema_table`) should replace both.
421-
422383
#### `column_access deny` logic is in three places
423384

424385
After the recent fixes, column deny logic lives in:
