Skip to content

Orm datascope#27188

Open
Jengro777 wants to merge 6 commits into
vlang:masterfrom
Jengro777:orm-scope
Open

Orm datascope#27188
Jengro777 wants to merge 6 commits into
vlang:masterfrom
Jengro777:orm-scope

Conversation

@Jengro777
Copy link
Copy Markdown
Contributor

fix #27154

Summary

Replaces the old global, thread-unsafe tenant_filter system with a per-connection, decorator-based DataScope system. This is the V ORM equivalent of Java's DelegatingConnection, Go's connector wrappers, and Rust's Tower middleware pattern.

Problem (Old System)

// Global mutable state — shared across ALL goroutines/requests
__global tenant_filter_state = TenantFilterState{...}

orm.set_current_tenant_id(orm.Primitive(5))
sql db { select from User }!  // implicitly filtered by global state

Three fatal flaws:

  1. Not thread-safe — one request's set_current_tenant_id() leaks to others
  2. Single-tenant only — global state means only one tenant filter active process-wide
  3. Invasive — every DB driver (sqlite/pg/mysql) had to call apply_tenant_filter() internally

Solution (New System)

// Per-request DB — isolated, immutable configuration
db := orm.new_db(raw_conn, orm.DataScope{
    filters: [orm.QueryFilter{field: 'tenant_id', value: orm.Primitive(5)}]
})

sql db { select from User }!  // filtered by db's own DataScope, no global state

Architecture comparison

        OLD (v1)                              NEW (v2)
   ┌─────────────────┐                 ┌─────────────────┐
   │  __global state │                 │  per-request DB │
   │  tenant_id = 5  │                 │  DataScope{...} │
   └────────┬────────┘                 └────────┬────────┘
            │                                    │
    ┌───────▼────────┐                  ┌───────▼────────┐
    │ sqlite.select() │                  │  orm.DB.select()│
    │  ┌───────────┐  │                  │  ┌───────────┐  │
    │  │apply_     │  │                  │  │apply_data │  │
    │  │tenant_    │  │                  │  │_scope()   │  │
    │  │filter()   │──┤ reads global     │  └─────┬─────┘  │
    │  └───────────┘  │                  │        │        │
    │       │         │                  │  ┌─────▼─────┐  │
    │  actual query   │                  │  │conn.select│  │
    └─────────────────┘                  │  └───────────┘  │
        ⚠ thread-unsafe                  └─────────────────┘
        ⚠ single scope                       ✅ thread-safe
        ⚠ couples backends                   ✅ multi-scope
                                             ✅ backends clean

Public API

// Create scoped DB — accepts any Connection
pub fn new_db(conn Connection, scope DataScope) DB

// Skip scope filters. Returns a new DB (immutable builder).
db.unscoped()              // skip ALL scope filters
db.unscoped('org_id')      // skip only 'org_id' filter

// Mark table permanently exempt from DataScope
@[unscoped]
struct Config { ... }

Key design decisions

Decision Rationale
new_db(conn Connection, ...) accepts base interface Decorator pattern. JDBC/Go/Rust all accept the widest interface
Transaction methods use runtime is check Go's .(ConnBeginTx) equivalent. Returns error when unsupported
unscoped() returns new DB (immutable) Go's context.WithValue() pattern. No mutation
conn field is module-private Encapsulation. Callers use the interface

Migration

Basic setup

- orm.set_current_tenant_id(orm.Primitive(5))
+ db := orm.new_db(raw_conn, orm.DataScope{
+     filters: [orm.QueryFilter{field: 'tenant_id', value: orm.Primitive(5)}]
+ })

Skip filters

- orm.without_tenant_filter(fn () ![]User {
-     return sql db { select from User }!
- })!
+ unscoped_db := db.unscoped('tenant_id')
+ users := sql unscoped_db { select from User }!

Middleware pattern

fn request_handler(mut ctx Context) {
    base := orm.new_db(ctx.pool.acquire()!, orm.DataScope{
        filters: [orm.QueryFilter{field: 'tenant_id', value: orm.Primitive(ctx.tenant_id)}]
    })
    ctx.db = match ctx.role {
        .admin   { base.unscoped() }        // no scope
        .manager { base.unscoped('org_id') } // partial scope
        .normal  { base }                    // full scope
    }
    // handlers use ctx.db — scope-unaware
}

Removed API

Removed Replacement
configure_tenant_filter() / set_current_tenant_id() orm.new_db(conn, DataScope{...})
with_tenant() / without_tenant_filter() db.unscoped(...)
@[tenant_filter] / @[ignore_tenant_filter] @[unscoped]
apply_tenant_filter() Handled internally by orm.DB

Tests

  • 84 assertions in orm_scope_test.v covering SELECT/INSERT/UPDATE/DELETE × scope/unscoped/transaction
  • All 38 existing ORM tests continue to pass
  • All three DB backends (sqlite/pg/mysql) pass
  • Compiled and tested under tcc, gcc, clang × debug + `-prod

* Orm datascope v2 (#7)

* orm: use struct field names in Table.fields for DataScope validation, rename test/example files

- orm_table_field_names() now returns V struct field names (field.name)
  instead of SQL column names (get_orm_column_name_from_struct_field),
  so apply_data_scope() field existence check matches QueryFilter.field
  values correctly when @[sql:] column aliases are used
- rename vlib/orm/orm_no_scope_test.v -> orm_scope_test.v
- rename examples/orm/orm_middleware.v -> orm_scope_middleware.v

orm: fix three DataScope issues from code review

1. Fix P1: append is_and marker for unary scope operators (.is_null/.is_not_null)
   - Remove !is_unary() guard from is_and and parentheses appending
   - Unary filters now properly append AND connector with existing WHERE

2. Fix P1: translate scope field names to SQL column names
   - Add columns []string to Table struct (parallel to fields)
   - Add orm_table_column_names() in cgen to collect SQL column names
   - write_orm_table_struct() now generates .columns C array
   - apply_data_scope() resolves struct field name -> SQL column name via table.columns

3. Fix P2: guard empty typed arrays in primitive_type
   - Add len > 0 checks to all typed array variants ([]int, []string, etc.)
   - Empty arrays fall back to type_idx['int'] matching old behavior

* orm: replace global tenant_filter with per-connection DataScope system, fix cgen bugs

== Core change: DataScope replaces global tenant_filter ==

Remove the old `tenant_filter` global-state system (`tenant_filter_state`
global, `set_tenant_filter_enabled`, `set_current_tenant_id`,
`with_tenant`, `without_tenant_filter`, etc.) and replace with a
per-connection `DataScope` system:

- Add `orm.DB` struct wrapping `orm.Connection` with `DataScope` scope
  and `unscoped_fields` for selective scope skipping
- Add `orm.new_db(conn, scope)` to create a scoped DB wrapper
- Add `orm.DB.unscoped(fields...)` to skip scope filters per-field or
  all with `unscoped()`
- `DB` implements all `Connection` interface methods (select, insert,
  update, delete, create, drop, last_id), applying DataScope before
  delegating to the underlying connection
- Add `apply_data_scope()` and `apply_data_scope_insert()` for WHERE
  and INSERT scope filtering
- Scope filters support multi-field scopes, deduplication, unscoped
  field skipping, and table-level scope opt-out via
  @[orm:scope_filter(ignore)] attribute
- Remove `apply_tenant_filter()` calls from all DB drivers (sqlite, pg,
  mysql); scope handling is centralized in `DB` wrapper

== Fix: cgen `(orm__QueryData){}` missing .fields ==

Two compound literals in `write_orm_select` were missing the `.fields`
member: the DATA (limit/offset) QueryData and the no-WHERE WHERE
QueryData.  In C, uninitialized compound literal members are
zero-initialized, giving the `.fields` array `element_size=0`.  This
propagated through `builtin__array_clone_to_depth`, causing
`builtin__array_push` to copy 0 bytes per string, resulting in empty
column names in WHERE clauses.

== Fix: cgen skip @[skip]/@[sql:-] fields ==

`orm_table_field_names` and `orm_table_column_names` now skip fields
with `@[skip]` or `@[sql:-]` attributes, preventing them from being
included in `Table.fields`/`Table.columns`.

== Miscellaneous ==

- Add `.clone()` on column name string in `apply_data_scope` to prevent
  shared backing data issues
- Rename `tenant_filter_primitive_type` to `primitive_type` (made public)
- Remove old tenant_filter tests from `orm_fn_test.v`
- Clean up trailing whitespace and unnecessary blank lines

All 38 ORM tests pass.

* orm: address Copilot review on PR #4

Fix 6 review comments:
- apply_data_scope_insert: resolve SQL column names from struct field
  names via Table.fields/Table.columns mapping (HIGH)
- apply_data_scope: build field-to-column lookup map for O(1)
  resolution instead of O(filters * fields) linear scan (MEDIUM)
- apply_data_scope: wrap original WHERE in parentheses once before
  the scope filter loop, avoiding redundant nesting (MEDIUM)
- QueryFilter.field: document that field must be a struct field
  name (not SQL column name) (MEDIUM)
- orm_scope_middleware.v: fix filename in comment and run cmd (LOW)
- Table: add new_table() constructor to mitigate future breakage
  from struct field additions (MEDIUM)

* Update .gitignore

* orm: address Copilot review fixes for DataScope

* cgen: revert mods_with_c_libs changes

Reverts c809d93 and aa709b7.

* orm: add TransactionalConnection proxy to DB wrapper

- Change orm.DB.conn type from Connection to TransactionalConnection
- Add 6 proxy methods: orm_begin, orm_commit, orm_rollback,
  orm_savepoint, orm_rollback_to, orm_release_savepoint
- Update new_db() to accept TransactionalConnection
- Add 7 transaction proxy tests covering commit, rollback,
  savepoint, scope+transaction interaction, and interface
  satisfaction

Fixes: scoped DB wrapper now supports transaction workflows
via the TransactionalConnection interface.

* orm: remove stale @[has_globals] attribute

No __global variables remain in the orm module after the
global tenant_filter_state was removed in DataScope v2.

* orm: accept Connection in new_db, use runtime interface cast

- Change DB.conn back to Connection (was TransactionalConnection)
- new_db(conn Connection, ...) accepts any Connection again
- Transaction proxy methods use runtime is+unsafe cast:
  if db.conn is TransactionalConnection { ... } else { error }
- Works around V codegen limitation with &struct_field as Interface
  by copying the interface to a local variable first
- Compiles and passes under both debug and -prod

* orm: extract apply_scope_filters common core, make primitive_type private

- Extract shared filter-iteration logic from apply_data_scope and
  apply_data_scope_insert into a single apply_scope_filters helper.
  reduce ~80 duplicate lines. The two pub functions are now thin
  wrappers, keeping the public API unchanged.

- Change primitive_type from pub fn to fn — no external callers.

* orm: address review findings F1, F2, F5

F1: Remove ['*'] magic sentinel. Replace unscoped_fields with
    skip_all_scopes bool + skip_fields []string. The 'skip all'
    case is checked before calling apply_data_scope, eliminating
    the wildcard string from the filter pipeline.

F2: Make DB.conn module-private (mut: instead of pub mut:).
    Callers access the connection through the Connection
    interface, not by direct field access.

F5: Replace apply_scope_filters where_mode bool with
    ScopeMode enum (.where / .insert) for clarity.

---------

Co-authored-by: Alexander Medvednikov <alexander@medvednikov.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89b7385e65

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread vlib/orm/orm.v
Comment on lines +448 to +450
if !filter.operator.is_unary() {
result.data << filter.value
result.types << primitive_type(filter.value)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Replicate scoped INSERT values for every batch row

When scope filters are applied in insert mode, this appends each filter value only once, but batch inserts (QueryData.batch_rows > 1) require row-major data with one value per field per row. As a result, using a scoped orm.DB with insert users into User can misalign/break parameter binding (wrong values per row or missing bind values), because orm_stmt_gen/prepare_insert_query_data expect fields.len * batch_rows values.

Useful? React with 👍 / 👎.

Comment thread vlib/v/gen/c/fn.v Outdated
return false
}
return file_links_c_source(node.source_file) || node.mod in g.mods_with_c_libs
return file_links_c_source(node.source_file)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve extern emission for modules linked via #flag -l

This now emits C extern declarations only when the source file links .c/.o artifacts, dropping the previous module-level #flag -l... path. That breaks C interop modules that link prebuilt libraries and declare fn C.* without local #includes: the generated C can call undeclared functions, which fails on toolchains that treat implicit declarations as errors.

Useful? React with 👍 / 👎.

Jengro777 added 5 commits May 18, 2026 17:44
apply_scope_filters in insert mode appended scope values only once,
but batch inserts (batch_rows > 1) require one value per row in
row-major order: row * fields.len + col. After the scope loop,
rebuild the data array with scope values interleaved per row.

Add two batch insert tests with a struct lacking the scope fields
to validate scope injection works for multi-row inserts.
6 transaction methods (orm_begin, orm_commit, orm_rollback,
orm_savepoint, orm_rollback_to, orm_release_savepoint) each had
the same 2-line pattern:
    mut conn := db.conn
    mut tc := unsafe { &conn as TransactionalConnection }

Extract it into unwrap_to_tx() so there is a single place to
maintain the unsafe cast workaround for V's codegen limitation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

@[tenant_filter]` global state is unsafe — need a thread-safe, request-level design

1 participant