Skip to content

Latest commit

 

History

History
153 lines (124 loc) · 13.5 KB

File metadata and controls

153 lines (124 loc) · 13.5 KB
title Customer Meta Query Optimization (fix-customer-meta-query-cast)
status planning
created 2026-06-10
updated 2026-06-10
deliverable standalone mu-plugin — wp-content/mu-plugins/fix-customer-meta-query-cast.php
related Hypercart Query Guard (sibling, NOT a dependency — see Decision Record)

Customer Meta Query Optimization — Product Plan

Most Recently Completed Phase What's Next
Phase 1 — Build the mu-plugin (wrote fix-customer-meta-query-cast.php with defensive shape-checking and logging) Phase 2 — Local Verification (install on local WooCommerce site, capture SQL, run EXPLAIN, verify result equivalence)

Table of Contents

Background

WooCommerce's REST API translates ?customer=X order queries into a WP_Meta_Query with type => 'NUMERIC', producing:

CAST(wp_postmeta.meta_value AS SIGNED) = X

The CAST() makes the query non-sargable — MySQL cannot use any index on meta_value. It can still narrow by the stock meta_key index, but that leaves it scanning every _customer_user row (roughly one per order) on every call. On a high-volume store with a multi-million-row wp_postmeta table, this was the root cause of a production database saturation incident: a third-party service's REST API callbacks issuing ?customer=X order queries consumed thousands of seconds of DB time over a 12-hour window.

The fix is a single filter that changes the _customer_user meta query type from NUMERIC to CHAR, removing the CAST() so the query becomes meta_value = 'X' and can use a companion composite index (idx_customer_user_post). Safety: _customer_user stores integer user IDs as strings, so CHAR comparison is semantically identical to NUMERIC for all valid user IDs.

Scope: REST only (deliberate). The same _customer_user NUMERIC meta query is generated by other WooCommerce code paths — notably wc_get_orders( [ 'customer' => X ] ) / WC_Order_Query and some admin order-list lookups — and those are intentionally not touched: the incident traffic was entirely REST. If _customer_user slow queries reappear from a non-REST origin, that is out-of-scope traffic, not the filter failing (see the Phase 2 check that records which non-REST paths emit the CAST()).

Decision Record

This ships as a standalone mu-plugin, not as part of Hypercart Query Guard.

  • QueryGuard's contract is defensive only — it enforces limits, observes, and logs, but never alters what queries execute. This fix rewrites query semantics, which would muddy that trust proposition.
  • The fix is deployment-specific in practice: it is coupled to the idx_customer_user_post index applied manually to the target database. QueryGuard is a generic multi-site product.
  • The fix has its own obsolescence path (HPOS migration — see Phase 5) and should not be entangled with QueryGuard's release cadence.
  • Revisit only if a family of sargability fixes accumulates; at that point a small "query optimizations" companion mu-plugin is the right home.

It also avoids forking the third-party plugin that triggered the incident — the fix is WooCommerce infrastructure, benefits any REST consumer querying orders by customer, and removes the burden of maintaining a fork against the vendor's update cycle.

Phase 0 — Incident Response & Index (Completed)

  • Root cause identified: non-sargable CAST(meta_value AS SIGNED) from WooCommerce REST ?customer=X queries, triggered at volume by a third-party service's REST API callbacks.
  • Composite index applied to production:
    ALTER TABLE wp_postmeta
      ADD INDEX idx_customer_user_post (meta_key, meta_value(20), post_id);
  • Decision made: standalone mu-plugin, not a QueryGuard feature, not a third-party plugin fork.

Phase 1 — Build the mu-plugin

Write fix-customer-meta-query-cast.php (~45 lines, no settings, no admin UI, no dependencies beyond WooCommerce).

  • Plugin header includes: purpose, companion-index note, and an explicit obsolescence line: "Obsolete after HPOS migration — _customer_user lookups leave wp_postmeta; remove this file at that point."
  • Filter function hcqo_optimize_customer_order_meta_query() implemented and hooked to both woocommerce_rest_shop_order_object_query and woocommerce_rest_orders_prepare_object_query at priority 99.
  • Filter only touches _customer_user clauses; all other meta query clauses pass through byte-identical.
  • Defensive shape-checking: before mutating, verify the args are an array, the meta query key exists, and the clause matches the exact shape WooCommerce REST emits: flat _customer_user clause, type NUMERIC, equality compare (= or absent), scalar non-negative integer value. CHAR is equivalent to NUMERIC only for integer equality — range compares (>, BETWEEN) are lexicographic under CHAR ('10' < '9'), so any other shape must stay on the slow-but-correct CAST() path. On any unexpected shape, return the input unmodified (never fatal, never warn-spam) — WooCommerce or a REST consumer could change the structure in a future version.
  • One-time confirmation logging: on the first request where the rewrite actually fires, emit a single Hypercart_Logger::info() — e.g. event customer_meta_query_cast_fixed. Guard with a 30-day transient that stores the WooCommerce version (WC_VERSION), plus a static flag as a per-request short-circuit — a static flag alone dedupes only within one PHP process and would log once per request, failing the Phase 2 test. Version-keying matters: a plain transient would suppress the event for 30 days regardless of whether the filter still works, so an upgrade inside that window would be silent either way. Keyed to the WC version, an upgrade re-fires the event immediately — silence after an upgrade unambiguously means the filter stopped matching (the Phase 5 health signal). Falls back silently if Hypercart_Logger is unavailable. This gives the same "confirm it's active in production" path used for QueryGuard.
  • Guard clause: bail early (return input unchanged) if WooCommerce is not active.
  • File passes php -l and PHPCS (match house style used in QueryGuard; copy QueryGuard's PHPCS config into this repo rather than referencing the sibling checkout, so linting runs standalone).

Phase 2 — Local Verification

  • Standalone smoke test (php tests/test-filter.php, no WordPress needed) passes: canonical clause rewritten; range compares, array values, malformed clauses, and nested groups all left untouched. Re-run after major WooCommerce upgrades — it verifies the shape-guard logic, not the live query WooCommerce emits.
  • Install the mu-plugin on a local WooCommerce site with sample orders and at least one customer with multiple orders.
  • Capture the SQL generated by GET /wp-json/wc/v3/orders?customer=X without the mu-plugin; confirm it contains CAST(... AS SIGNED).
  • Capture the same SQL with the mu-plugin; confirm it is meta_value = 'X' with no CAST().
  • Run EXPLAIN on the rewritten query against a table with the composite index present; confirm the index is selected (key = idx_customer_user_post, no full scan).
  • Result-equivalence check: response body for ?customer=X is identical with and without the mu-plugin (same order IDs, same count, same pagination totals).
  • Edge cases return identical results with and without the fix: a customer with zero orders, a non-existent user ID, and customer combined with other query params (status, after, per_page).
  • customer=0 returns identical results provided no malformed _customer_user values exist — NUMERIC casts garbage values to 0 and matches them; CHAR does not. Verify the precondition first:
    SELECT COUNT(*) FROM wp_postmeta
     WHERE meta_key = '_customer_user'
       AND meta_value NOT REGEXP '^(0|[1-9][0-9]*)$';
    Expect 0 (the strict pattern also catches leading zeros and whitespace). If any exist, the CHAR behavior — not matching them — is the accepted/correct one; record the difference rather than treating it as a failure.
  • Check whether wc_get_orders( [ 'customer' => X ] ) on this WooCommerce version also emits the CAST() (the filter deliberately covers only REST — see Scope note in Background); record the answer either way so future slow-query triage knows what is in scope.
  • Confirm the customer_meta_query_cast_fixed log event appears exactly once in debug.log across multiple consecutive requests (deduped by the version-keyed transient — once per TTL window per WooCommerce version, not per request). Bonus check: change the stored transient value (simulating a WC upgrade) and confirm the event re-fires on the next request.
  • Confirm a deliberately malformed meta query (simulated via a test filter at priority 98) passes through unmodified with no error.

Phase 3 — Production Deployment

  • Pre-flight: confirm idx_customer_user_post exists on the target production wp_postmeta (SHOW INDEX FROM wp_postmeta WHERE Key_name = 'idx_customer_user_post').
  • Pre-flight: run the malformed-value check from Phase 2 (meta_value NOT REGEXP '^(0|[1-9][0-9]*)$') against production and confirm it returns 0 — this is the empirical proof of the semantic-equivalence claim in Risks & Safety Notes.
  • Upload fix-customer-meta-query-cast.php to wp-content/mu-plugins/ on the target site (no activation step — mu-plugins load automatically). Record how it was uploaded (SFTP, host file manager, etc.) for the Phase 4 provisioning item.
  • Verify the file appears under Plugins → Must-Use in wp-admin. If the host caches the mu-plugins file list or runs opcache with timestamp validation off, flush/restart PHP so the new file is actually loaded.
  • Issue a manual authenticated GET /wp-json/wc/v3/orders?customer=X (or wait for a known REST consumer to call it) and confirm the customer_meta_query_cast_fixed log event fires.
  • Confirm via slow-query observation (QueryGuard observe mode or the host's query monitor) that ?customer=X REST queries no longer appear in the slow-query log.
  • Spot-check any service that queries orders by customer via REST to confirm its results are unchanged.

Phase 4 — Post-Deploy Monitoring & Cleanup

  • Monitor production DB time for 7 days post-deploy; record before/after numbers against the pre-fix baseline in this file.
  • If a forked third-party plugin was used as an interim workaround, restore the stock vendor version and archive the fork with a README pointing to this plan.
  • Update the originating incident record with resolution notes linking to this document.
  • Add the mu-plugin file to whatever backup/provisioning process rebuilds mu-plugins/ (so a host migration doesn't silently drop it).

Phase 5 — Future / Obsolescence

  • HPOS migration trigger: when the site migrates to WooCommerce HPOS, order customer lookups move to wp_wc_orders.customer_id (indexed natively). At that point: remove fix-customer-meta-query-cast.php, drop idx_customer_user_post if wp_postmeta order meta has been fully migrated, and close this plan.
  • Family-of-fixes trigger: if a second sargability/meta-query fix of this kind is needed, consolidate into a small "query optimizations" companion mu-plugin with one file per fix or one registry — revisit the Decision Record above.
  • Until either trigger fires: re-verify the fix after major WooCommerce upgrades — run php tests/test-filter.php and watch for the customer_meta_query_cast_fixed event. The transient is keyed to WC_VERSION, so the event re-fires on the first REST ?customer=X hit after any upgrade; its absence after an upgrade unambiguously means the query shape changed and the performance protection silently stopped (the filter degrades gracefully — correctness is unaffected).

Risks & Safety Notes

  • Semantic equivalence holds because _customer_user values are always integer strings written by WooCommerce. If a third party ever wrote a non-numeric value, CHAR comparison would simply not match it (NUMERIC would have cast it to 0 and potentially matched customer=0) — CHAR is the safer comparison. The Phase 2/3 REGEXP pre-checks verify this assumption empirically rather than taking it on faith.
  • Leading zeros / whitespace: NUMERIC would match '05' or ' 5' for customer=5; CHAR will not. WooCommerce never writes such values, but a past import or migration could have — the strict REGEXP '^(0|[1-9][0-9]*)$' pre-check catches these alongside non-numeric garbage.
  • Index prefix length: meta_value(20) covers any realistic user ID; no truncation risk.
  • Failure mode is performance, not correctness: if the filter stops firing (shape change, WooCommerce refactor), queries revert to the slow CAST path — site stays correct, but watch for slow-query log reappearance.
  • The index and the mu-plugin are a pair. The mu-plugin without the index removes the CAST but still scans; the index without the mu-plugin is unused by these queries. Pre-flight check in Phase 3 enforces this.