
feat: index-friendly sort and search via per-filter case/join modes #729

Merged
binaryk merged 6 commits into BinarCode:10.x from george-todica:index-friendly-sort-and-search
May 5, 2026

Contributor

@george-todica george-todica commented Apr 30, 2026

Problem

Two SQL patterns produced by SearchableFilter and SortableFilter bypass database indexes on consumer apps' hot endpoints:

1. Sort by relation column → correlated subquery

ORDER BY (
    SELECT vendors.code FROM vendors
    WHERE vendors.id = invoices.vendor_id ...
    ORDER BY vendors.code ASC LIMIT 1
) ASC

Per-row evaluation. The optimizer can't push the relation into the main scan. Asymmetric with the search side, which has had a LEFT JOIN strategy (restify.search.use_joins_for_belongs_to) for some time.

2. Case-insensitive search → UPPER() on every column

WHERE (
    UPPER(invoices.gross_amount) LIKE UPPER(?) OR    -- DECIMAL!
    UPPER(invoices.number) LIKE UPPER(?) OR
    UPPER(vendors.code) LIKE UPPER(?) OR              -- stored uppercase
    UPPER(vendors.name) LIKE UPPER(?)
)

UPPER(numeric) casts a DECIMAL to text per row. UPPER(vendors.code) is wasteful when codes are stored uppercase by convention; the same applies to status enums stored lowercase. Because the column is wrapped in a function, a plain index on it is never used.

Production data (Flare APM, last 7d)

Sort hotspots:

| Endpoint | p95 | avg | calls/7d |
|---|---|---|---|
| GET billings ?sort=customers.code | 1440 ms | 989 ms | 4 |
| GET invoices ?sort=vendors.code | 1125 ms | 293 ms | 4 |
| invoice_adjustments default sort vendors.code | 800 ms | 800 ms | 1 |

Search hotspots — most frequent UPPER(column) in slow query set:

| Count | Column |
|---|---|
| 11 | vendors.name, vendors.code |
| 5 | jobs.number, invoices.number, invoices.gross_amount, customers.name, customers.code |
| 3 | work_orders.number, invoice_payments.number, billings.number, employees.code, purchase_orders.* |
| 2 | journals.*, work_sites.*, line_items.* |

Audit across 461 searchable columns in 10 domain repositories of one consumer app: ~70% would benefit from raw (numerics, FKs, dates, known-case codes/enums); ~29% are human text genuinely needing case-insensitive matching.


Solution

Two independent, opt-in, backward-compatible mechanisms:

Feature 1 — Sortable JOIN strategy

LEFT JOIN strategy for sort-by-BelongsTo / sort-by-HasOne, gated by:

  1. New config flag restify.sort.use_joins_for_belongs_to (default false).
  2. Two per-filter chainable methods: ->useJoin() / ->useSubquery().

JOIN dedup logic extracted to a shared Filters/Concerns/AppliesRelationJoin trait — used by both the new sort path and the existing RepositorySearchService::applyBelongsToJoins(). Search-side behavior preserved exactly.

Feature 2 — Searchable per-field case modes

Five chainable case-mode methods on SearchableFilter + a flexible transform(callable) escape hatch. Single class, no subclasses. Per-column choice of column expression and value transformation:

| Method | Column | Value | Use when |
|---|---|---|---|
| ->caseRaw() | raw | raw | numerics, FK ids, dates, blind-indexed |
| ->upperValue() | raw | UPPER | column always stored UPPER (vendors.code) |
| ->lowerValue() | raw | LOWER | column always stored lowercase (status) |
| ->upperBoth() | UPPER | UPPER | legacy case-insensitive (today's default with case_sensitive=false) |
| ->lowerBoth() | LOWER | LOWER | mirror |
| ->transform(callable) | raw | callable | custom (trim+lower, unicode fold, digit-strip) |

The five named methods are convenience wrappers around transform() plus a column-wrap setting. No new config key — default behavior derives from existing restify.search.case_sensitive.
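
As a rough illustration of "named methods are wrappers around transform() plus a column-wrap setting", the sketch below uses a hypothetical `CaseModeSketch` class. It is not the package's SearchableFilter: the class name, `toPredicate()`, and the property names are assumptions, and it transforms the bound value in PHP rather than emitting `UPPER(?)` in SQL as the legacy path does.

```php
<?php

// Hypothetical stand-in for the case-mode part of a searchable filter.
final class CaseModeSketch
{
    /** @var callable(string): string Applied to the search term before binding. */
    private $valueTransform;

    /** SQL function wrapped around the column, or null for the raw column. */
    private ?string $columnWrap = null;

    public function __construct(private string $column)
    {
        $this->valueTransform = static fn (string $v): string => $v;
    }

    /** The single primitive: set the value transform, leave the column raw. */
    public function transform(callable $fn): static
    {
        $this->valueTransform = $fn;
        $this->columnWrap = null;

        return $this;
    }

    public function caseRaw(): static
    {
        return $this->transform(static fn (string $v): string => $v);
    }

    public function upperValue(): static
    {
        return $this->transform(static fn (string $v): string => strtoupper($v));
    }

    public function lowerValue(): static
    {
        return $this->transform(static fn (string $v): string => strtolower($v));
    }

    public function upperBoth(): static
    {
        $this->upperValue();
        $this->columnWrap = 'UPPER';

        return $this;
    }

    public function lowerBoth(): static
    {
        $this->lowerValue();
        $this->columnWrap = 'LOWER';

        return $this;
    }

    /** @return array{0: string, 1: string} Predicate SQL and the value to bind. */
    public function toPredicate(string $term): array
    {
        $column = $this->columnWrap !== null
            ? "{$this->columnWrap}({$this->column})"
            : $this->column;

        return ["{$column} LIKE ?", '%' . ($this->valueTransform)($term) . '%'];
    }
}
```

With this shape, `(new CaseModeSketch('vendors.code'))->upperValue()->toPredicate('ab1')` leaves the column bare and only changes the bound value, which is the property the table above relies on.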

BelongsTo::searchable([...]) extended to accept mixed string | SearchableFilter arrays, so per-column overrides work for joined searchables too.


Examples

Sort — before

SortableFilter::make()
    ->setColumn('vendors.code')
    ->usingRelation(BelongsTo::make('vendor', VendorRepository::class));

emits:

ORDER BY (SELECT vendors.code FROM vendors WHERE vendors.id = invoices.vendor_id LIMIT 1) ASC

Sort — after (config flag on, or per-filter ->useJoin())

LEFT JOIN vendors ON invoices.vendor_id = vendors.id
ORDER BY vendors.code ASC

Sort — dedup with search

When search has already left-joined vendors (from BelongsTo->searchable(['vendors.name']) + restify.search.use_joins_for_belongs_to=true), the sort path detects the existing join and reuses it — exactly one LEFT JOIN vendors in final SQL.
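
The dedup behaviour can be pictured with a framework-free sketch. `ensureLeftJoinSketch()` and the plain-array join list are hypothetical stand-ins, assumed here only for illustration; the real trait works against the query builder's registered joins.

```php
<?php

/**
 * Hypothetical stand-in for an ensureLeftJoin()-style helper.
 * $joins is a plain list of ['table' => string, 'on' => string] rows.
 */
function ensureLeftJoinSketch(array $joins, string $table, string $on): array
{
    foreach ($joins as $join) {
        if ($join['table'] === $table) {
            return $joins; // table already joined: reuse it, add nothing
        }
    }

    $joins[] = ['table' => $table, 'on' => $on];

    return $joins;
}

$joins = [];
// The search path registers the join first.
$joins = ensureLeftJoinSketch($joins, 'vendors', 'invoices.vendor_id = vendors.id');
// The sort path then asks for the same table and finds it already present.
$joins = ensureLeftJoinSketch($joins, 'vendors', 'invoices.vendor_id = vendors.id');

echo count($joins), PHP_EOL; // prints 1
```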

Search — before (case_sensitive=false)

public static function searchables(): array
{
    return ['gross_amount', 'number'];
}

'vendor' => BelongsTo::make('vendor', VendorRepository::class)->searchable([
    'vendors.code',
    'vendors.name',
]),

emits:

WHERE (
    UPPER(invoices.gross_amount) LIKE UPPER('%500%') OR
    UPPER(invoices.number) LIKE UPPER('%500%') OR
    UPPER(vendors.code) LIKE UPPER('%500%') OR
    UPPER(vendors.name) LIKE UPPER('%500%')
)

None of the four predicates use an index.

Search — after, opt critical columns in

public static function searchables(): array
{
    return [
        SearchableFilter::make('gross_amount')->caseRaw(),    // numeric
        SearchableFilter::make('number')->upperValue(),       // stored UPPER
    ];
}

'vendor' => BelongsTo::make('vendor', VendorRepository::class)->searchable([
    SearchableFilter::make('vendors.code')->upperValue(),    // stored UPPER
    'vendors.name',                                            // human text — keep upperBoth default
]),

emits:

WHERE (
    invoices.gross_amount LIKE '%500%' OR             -- index usable
    invoices.number LIKE '%500%' OR                    -- index usable
    vendors.code LIKE '%500%' OR                       -- index usable
    UPPER(vendors.name) LIKE UPPER('%500%')            -- intentional case-insensitive on human text
)

Search — custom value normalization

SearchableFilter::make('email')->transform(
    fn (string $v): string => trim(strtolower($v))
),
SearchableFilter::make('phone')->transform(
    fn (string $v): string => preg_replace('/\D/', '', $v)
),
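
To see what those two closures actually hand to the query, here they are run on made-up sample input, outside any filter. The `?? ''` fallback is an addition in this sketch: `preg_replace()` can return null on a regex error, which would otherwise violate the `string` return type.

```php
<?php

// Same normalization as the email example above.
$normalizeEmail = fn (string $v): string => trim(strtolower($v));

// Same digit-strip as the phone example, with a null fallback added for this sketch.
$digitsOnly = fn (string $v): string => preg_replace('/\D/', '', $v) ?? '';

echo $normalizeEmail('  Jane.Doe@Example.COM '), PHP_EOL; // prints jane.doe@example.com
echo $digitsOnly('+1 (555) 010-9999'), PHP_EOL;           // prints 15550109999
```

Since transform() leaves the column raw, this presumably only matches when the stored values are already normalized the same way (lowercase emails, digits-only phone numbers).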

Impact

Sort: production SQL shape changes from a correlated subquery evaluated per row to a single LEFT JOIN. Expected ≥ 2× p95 reduction on the listed routes based on plan analysis.

Search: 70% of audited columns can stop wrapping the column in UPPER() and become directly indexable. The 29% needing case-insensitive matching opt into ->upperBoth() explicitly — same SQL as today by default. Net: per-column rollout, zero regression risk, every opt-in is a strict perf win.


API summary

// Sort — config (new)
'sort' => [
    'use_joins_for_belongs_to' => env('RESTIFY_SORT_USE_JOINS_FOR_BELONGS_TO', false),
],

// Sort — per-filter (new)
SortableFilter::make()->useJoin();          // strategy = JOIN
SortableFilter::make()->useSubquery();      // strategy = subquery

// Search — per-filter (new)
SearchableFilter::make('col')->caseRaw();
SearchableFilter::make('col')->upperValue();
SearchableFilter::make('col')->lowerValue();
SearchableFilter::make('col')->upperBoth();
SearchableFilter::make('col')->lowerBoth();
SearchableFilter::make('col')->transform(fn (string $v): string => /* ... */);

// BelongsTo — accepts mixed strings + SearchableFilter instances
BelongsTo::make('vendor', VendorRepository::class)->searchable([
    'vendors.name',
    SearchableFilter::make('vendors.code')->upperValue(),
]),

Resolution order (sort): per-filter > config > legacy subquery.
Resolution order (search): per-filter method > legacy case_sensitive.
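
The sort resolution order can be read as a small decision function. The sketch below is an assumption about shape only: `resolveSortStrategy()` and its string return values are hypothetical, and `$configFlag` stands in for the value of restify.sort.use_joins_for_belongs_to (null when the key is absent).

```php
<?php

/**
 * Hypothetical resolver: per-filter > config > legacy subquery.
 *
 * @param string|null $perFilter  'join', 'subquery', or null when neither
 *                                ->useJoin() nor ->useSubquery() was chained
 * @param bool|null   $configFlag config value, or null when not configured
 */
function resolveSortStrategy(?string $perFilter, ?bool $configFlag): string
{
    if ($perFilter !== null) {
        return $perFilter; // 1. an explicit per-filter call always wins
    }

    if ($configFlag !== null) {
        return $configFlag ? 'join' : 'subquery'; // 2. then the config flag
    }

    return 'subquery'; // 3. legacy behaviour when nothing is set
}

echo resolveSortStrategy('subquery', true), PHP_EOL; // prints subquery
echo resolveSortStrategy(null, true), PHP_EOL;       // prints join
echo resolveSortStrategy(null, null), PHP_EOL;       // prints subquery
```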


Why this design, not others

Sort

| Alternative | Rejected because |
|---|---|
| Always JOIN, no flag | Breaking change for projects depending on subquery semantics. Strict back-compat is non-negotiable for a perf-only PR. |
| Default true in package config | Same upgrade risk. Default false lets every consumer evaluate the trade-off explicitly. |
| Config-only, no per-filter override | Loses "flip one column at a time" workflow. ->useJoin() makes single-hotspot validation cheap. |
| Cover HasMany / MorphTo / BelongsToMany | Row multiplication under JOIN: silent data corruption. Restricted to BelongsTo + HasOne (which usingRelation() already accepts). |
| Inline dedup logic in SortableFilter | Same dedup already lives in applyBelongsToJoins. Duplication would diverge over time. Trait = one source of truth across sort and search. |
| Skip the trait, call into the search service | Cross-class coupling between filters and a service. Trait is more idiomatic and keeps callers at their natural locations. |
| Add tenant predicates to JOIN ON clause | Existing search-side JOINs don't either; they rely on the outer WHERE. Preserving the existing contract; changing it is separate, riskier scope. |

Search

| Alternative | Rejected because |
|---|---|
| New case_mode config key (raw\|upper_both\|lower_both) | Adds env knob and precedence layer for marginal benefit. Existing case_sensitive covers the global. Per-field methods cover the rest. Less surface to mis-configure. |
| Constructor param: SearchableFilter::make('col', CASE_UPPER_VALUE) | Inconsistent with SortableFilter::make()->useJoin() and broader field()->rules()->matchableText() style. Breaks IDE autocomplete discovery. Less self-documenting at the call site. |
| Dedicated subclasses (RawSearchableFilter, UpperValueSearchableFilter…) NaturalSort-style | Initial draft. Rejected as "too many classes, confusing." Single class with 5 chainable methods + escape hatch matches the fluent style used everywhere else in the package and Filament/Nova/Eloquent. |
| matchUpper / matchLower vs bothUpper / bothLower | Names too similar: easy to confuse "which side gets uppered?" Renamed to upperValue / upperBoth: prefix is action (upper), suffix is scope (value-only vs both). Parallel and explicit. |
| Don't extend BelongsTo::searchable([]) | Biggest wins (vendors.code, customers.code, jobs.number) live behind BelongsTo searchables. Skipping leaves half the perf gain on the table. Two small relaxations + per-entry unwrap make it transparent. |
| Bake transformer into named methods only | Real cases need custom normalization (trim+lower, unicode fold, digit-strip). ->transform(callable) subsumes the named ones AND opens the door to anything else. Single internal primitive; named methods are wrappers around it. |
| Flip global default to case_sensitive=true | Audit verdict: ~35-50 callsites in one consumer would need explicit upperBoth() to avoid regressing human-text searches. Feasible follow-up, but not in this PR. API ships first, sweep follows at consumer pace. |
| Auto-detect column case from schema/data | Brittle: requires reading column metadata or sampling, expensive at request time, unreliable. Per-column declarative intent at the call site is clearer and free at runtime. |

Backward compatibility

  • Sort: new config key defaults to false → no SQL change for any consumer that doesn't flip it. Per-filter methods opt-in.
  • Search: no new config keys. A repository that does not call any new chainable methods produces byte-identical SQL to before (verified by parity tests).
  • BelongsTo searchable(['col_a', 'col_b']) (string-only) keeps working unchanged. Mixed-array form is purely additive.
  • RepositorySearchService::applyBelongsToJoins() refactor (trait extraction) is behavior-preserving — same dedup, same column qualification, same exception handling.
  • All existing tests unchanged.

Test coverage

Sort — 8 new cases in tests/Feature/Filters/SortableJoinStrategyTest.php:

  1. Subquery preserved when flag off.
  2. LEFT JOIN emitted when flag on.
  3. ->useJoin() overrides config=false.
  4. ->useSubquery() overrides config=true.
  5. Dedup: when search already joined the same table, sort doesn't double-join.
  6. Multiple sortables on same related table share one join.
  7. HasOne sortable supported.
  8. Authorization respected — no JOIN, no ORDER BY change when relation unauthorized.

Search — 8 new cases in tests/Feature/Filters/SearchableCaseModeTest.php:

  1. case_sensitive=false default still emits UPPER both — strict back-compat.
  2. case_sensitive=true default emits raw — strict back-compat.
  3. ->caseRaw() overrides global insensitive.
  4. ->upperValue() — column raw, value uppercased.
  5. ->lowerValue() — column raw, value lowercased.
  6. ->lowerBoth() — both sides lowered.
  7. ->transform(callable) — custom closure (verified with trim + strtolower).
  8. BelongsTo searchable mix — strings + SearchableFilter instances coexist.

Vendor regression: 53 tests / 207 assertions green across tests/Feature/Filters, tests/Feature/BelongsToJoinConfigTest.php, tests/Fields/BelongsToFieldTest.php.


Files changed

| File | Change |
|---|---|
| src/Filters/SortableFilter.php | useJoin(), useSubquery(), applyJoinSort(), mode resolution |
| src/Filters/SearchableFilter.php | Six chainable case methods, mode resolution, applyValueAndColumn() helper, BelongsTo per-entry unwrap |
| src/Filters/Concerns/AppliesRelationJoin.php | new; ensureLeftJoin(...) trait shared by sort and search |
| src/Services/Search/RepositorySearchService.php | applyBelongsToJoins() uses trait; behavior unchanged |
| src/Fields/BelongsTo.php | searchable([]) accepts mixed string \| SearchableFilter arrays |
| config/restify.php | new sort.use_joins_for_belongs_to block |
| tests/Feature/Filters/SortableJoinStrategyTest.php | new, 8 cases |
| tests/Feature/Filters/SearchableCaseModeTest.php | new, 8 cases |

Migration

None required. Both features are off-by-default / opt-in. Consumers can:

  • Leave everything unchanged → zero behavior change.
  • Set RESTIFY_SORT_USE_JOINS_FOR_BELONGS_TO=true for global sort JOIN rollout.
  • Sprinkle ->useJoin() / ->useSubquery() on perf-critical sortables for surgical sort rollout.
  • Wrap individual searchables in SearchableFilter::make('col')->caseRaw() (or any other method) when ready for the perf win.
  • Audit-and-sweep at any pace; each opt-in is self-contained with clear test surface.


what-the-diff Bot commented Apr 30, 2026

PR Summary

  • Enhanced Configuration for Sorting
    Added a new sort configuration block to config/restify.php that enables JOIN-based sorting on related columns.

  • Improved Searchable Method
    The searchable method within BelongsTo.php is improved to accommodate both string attributes and instances of SearchableFilter. This enhancement provides more flexibility.

  • New Trait for Ensuring LEFT JOINs
    The AppliesRelationJoin trait introduced within src/Filters/Concerns provides methods to ensure LEFT JOINs are added only if they haven't been applied to the query previously. This addition ensures no redundant operations occur.

  • More Capable SearchableFilter
    The SearchableFilter.php file was restructured to support transformed values for searching, including case transformations. This also enhances the handling of table joins.

  • SortableFilter with JOINs Support
    The SortableFilter.php file was updated to support sorting through JOINs and helper methods were added for determining when to use subqueries instead of JOINs.

  • Efficient JOINs Application
    New functionality was added to the RepositorySearchService.php file that applies JOINs for related models efficiently using the newly added AppliesRelationJoin trait.

Test Case Implementations

  • SearchableCaseModeTest Class
    A new test class that validates case sensitivity behavior in searchable filters for a Post model.

  • Various Test Cases for Search Functionality
    Multiple test scenarios were added, including verification of search responses, examination of how transformations affect search behavior, and checks that related models are properly included during searches.

  • SortableJoinStrategyTest Class
    Added a new test class that validates sorting behavior, specifically when dealing with related models.

  • Sorting Functionality Tests
    Introduced numerous test methods that verify whether the system uses subqueries or joins for sorting related fields, validates configuration overrides and checks correct application of relationships during sorting.

  • New Utility Methods for Testing
    Both of the aforementioned Test classes include several utility methods for capturing the last executed SQL queries, which is helpful to verify the correctness of various database interactions.

@arthurkirkosa arthurkirkosa requested a review from binaryk April 30, 2026 18:07
Collaborator

@binaryk binaryk left a comment


Review

Tight scope, careful back-compat, real perf rationale with measured production data. The trait extraction is the right call and the parity test (test_subquery_and_join_strategies_produce_identical_id_order) is exactly the assertion that makes the JOIN switch safe to land.

A few items worth addressing before merge — none are blocking.

Items I'd want addressed

1. SearchableFilter::make() overrides the Make trait and silently drops args: src/Filters/SearchableFilter.php

public static function make(...$arguments): static
{
    $filter = new static;                                  // args dropped
    if (isset($arguments[0]) && is_string($arguments[0])) {
        $filter->setColumn($arguments[0]);
    }
    return $filter;
}

The base Make trait forwards all args to the constructor (new static(...$arguments)). The override only handles the string first arg and discards the rest. Existing callers like CustomSearchableFilter::make() (no args) still work, but any subclass that relies on constructor-arg forwarding will silently lose them. Suggest new static(...$arguments) or document the override.

2. column() ?? '' swallows missing columns: src/Filters/SearchableFilter.php (resolveBelongsToEntry) and src/Fields/BelongsTo.php lines 58 / 78 / 92

If a caller passes SearchableFilter::make()->upperValue() (no column set) into BelongsTo->searchable([...]), the ?? '' coalesce produces broken SQL like where '' like '%foo%' instead of throwing. Fail-fast on null/empty column would prevent silent SQL corruption.
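
For item 1, the difference between the two `make()` shapes is easy to reproduce in isolation. The classes below are hypothetical and only mirror the two factory bodies discussed; they are not the package's filter or its Make trait.

```php
<?php

// Hypothetical filter whose factory ignores its arguments, as in the override under review.
class DroppingFilter
{
    public array $args;

    public function __construct(...$args)
    {
        $this->args = $args;
    }

    public static function make(...$arguments): static
    {
        return new static; // constructor receives nothing
    }
}

// Same class with the suggested fix: forward everything to the constructor.
class ForwardingFilter extends DroppingFilter
{
    public static function make(...$arguments): static
    {
        return new static(...$arguments); // constructor receives every argument
    }
}

echo count(DroppingFilter::make('title', 'extra')->args), PHP_EOL;   // prints 0
echo count(ForwardingFilter::make('title', 'extra')->args), PHP_EOL; // prints 2
```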

Nits

3. applyBelongsToSubquery hard-codes 'like' instead of $likeOperator in the wrapped branch. The RAW branch a few lines up uses $likeOperator (which becomes ilike on pgsql), so the two branches are inconsistent. Pre-existing behavior, but worth noting that caseRaw() on pgsql still uses ilike and is therefore not strictly case-sensitive.

4. flatten() → flatten(1) in BelongsTo::searchable: necessary so SearchableFilter instances aren't dissolved by deep flattening, but a one-line comment explaining why would prevent a future "let's clean this up" regression.

5. ensureLeftJoin() table-name dedup ignores join type — if anything upstream registered an INNER JOIN users (e.g., a custom repository hook), this returns early and the sort path silently relies on the inner join. A $join->type === 'left' guard would be more defensive. Unlikely in practice.

6. Test gap. The BelongsTo case-mode mix test only verifies upperValue. One assertion on lowerValue (or caseRaw) on a BelongsTo searchable would round out the matrix.

7. Doc/test mismatch. The basic-filters doc shows SearchableFilter::make('gross_amount')->caseRaw() (short factory form), but every test uses SearchableFilter::make()->setColumn('gross_amount')->caseRaw(). Add one test in the short form to lock the public API shown in docs.
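
For nit 5, the effect of adding a join-type check can be shown with plain arrays. Both functions and the `['table', 'type']` record shape are hypothetical; the real code inspects the query builder's join objects.

```php
<?php

/** Table-name-only check: any existing join on the table counts. */
function hasJoinOnTable(array $joins, string $table): bool
{
    foreach ($joins as $join) {
        if ($join['table'] === $table) {
            return true;
        }
    }

    return false;
}

/** Stricter check: only an existing LEFT JOIN on the table counts. */
function hasLeftJoinOnTable(array $joins, string $table): bool
{
    foreach ($joins as $join) {
        if ($join['table'] === $table && $join['type'] === 'left') {
            return true;
        }
    }

    return false;
}

// A custom hook has already registered an INNER JOIN on users.
$joins = [['table' => 'users', 'type' => 'inner']];

var_dump(hasJoinOnTable($joins, 'users'));     // bool(true): sort would reuse the inner join
var_dump(hasLeftJoinOnTable($joins, 'users')); // bool(false): sort would add its own left join
```

Joining the same table a second time generally needs an alias in SQL; how the package handles that case is not covered in this thread and is left out of the sketch.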

Risk

  • Sort: low — opt-in two ways, parity test asserts identical IDs.
  • Search: low — no new config keys, byte-identical SQL when no new methods are called (covered by the two test_default_* tests).
  • BelongsTo::searchable polymorphism: low — string-only arrays unchanged; mixed form covered.

Nice work on the Flare APM data in the description — the production hotspot table makes the case for the design choices much easier to evaluate.

@george-todica george-todica requested a review from binaryk May 4, 2026 15:17
Collaborator

@binaryk binaryk left a comment


Re-review

All seven items addressed. Quick verification:

| # | Item | Resolution |
|---|---|---|
| 1 | SearchableFilter::make() arg-forwarding | new static(...$arguments): args now reach the constructor ✓ |
| 2 | Empty-column fail-fast | InvalidArgumentException thrown in BelongsTo::extractColumn() and SearchableFilter::resolveBelongsToEntry() with a helpful "Pass it as SearchableFilter::make('column') or call setColumn()" message ✓ |
| 3 | pgsql like vs $likeOperator mismatch | Acknowledged with an inline comment in applyBelongsToSubquery explaining the legacy contract and the pgsql ILIKE-on-RAW consequence ✓ |
| 4 | flatten(1) rationale | Comment added: "flatten(1) preserves SearchableFilter instances inside the array; deeper flattening would dissolve them into their internal arrays." |
| 5 | ensureLeftJoin join-type guard | Now matches $join->table === $relatedTable && $join->type === 'left', with a comment on INNER vs LEFT semantics ✓ |
| 6 | Test gap: lowerValue on BelongsTo | New test_belongs_to_searchable_supports_lower_value_per_entry |
| 7 | Short-form factory tested | New test_searchable_filter_short_form_make_with_column_argument exercises SearchableFilter::make('title')->upperValue() |

Bonus: test_belongs_to_searchable_throws_when_filter_has_no_column locks the fail-fast contract from item 2.
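
A fail-fast check of the kind described in item 2 might look roughly like the sketch below. `requireColumn()` is a hypothetical helper; the bodies of BelongsTo::extractColumn() and SearchableFilter::resolveBelongsToEntry() are not shown in this thread, and only the second sentence of the message is taken from the re-review table (the first sentence is invented for illustration).

```php
<?php

/**
 * Hypothetical guard: return the column, or throw instead of letting
 * an empty string reach the SQL builder.
 */
function requireColumn(?string $column): string
{
    if ($column === null || trim($column) === '') {
        throw new InvalidArgumentException(
            "Searchable filter has no column. " // illustrative wording
            . "Pass it as SearchableFilter::make('column') or call setColumn()."
        );
    }

    return $column;
}

try {
    requireColumn(null);
} catch (InvalidArgumentException $e) {
    echo 'rejected', PHP_EOL; // prints rejected
}
```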

LGTM — happy to see this merged.


Generated by Claude Code

@binaryk binaryk merged commit e61f3af into BinarCode:10.x May 5, 2026
16 checks passed

3 participants