THREESCALE-14244 Optimize Oracle Enhanced adapter performance#4261
jlledom wants to merge 1 commit into
Conversation
The oracle-enhanced adapter fires expensive Oracle data-dictionary queries on every DDL operation and for every table during db:schema:dump.

DDL fixes:
- Cache describe() results per connection to avoid repeated 4-way UNION queries across all_tables/all_views/all_synonyms on every add_index
- Simplify data_source_exists? to use table_exists? (single all_tables query) instead of the full describe() UNION
- Skip redundant table_exists?/index_name_exists? validation in add_index_options; Oracle raises ORA-00955 on duplicates anyway

Schema dump prefetch (db:schema:dump):
- Prefetch columns, indexes, primary keys, table comments, and foreign keys in 5 bulk queries before iterating tables, replacing ~450 per-table data-dictionary queries with 5 single queries

Measured on Oracle XE (90 tables):
- DDL operations (create table + index): 0.92s → 0.07s (~12x faster)
- db:schema:dump: 4+ minutes → ~9s (~40x faster)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
Wow, this is really too complicated. It seems like people have found a couple of simpler solutions: rsim/oracle-enhanced#2467 but didn't follow through with either of them.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #4261 +/- ##
==========================================
+ Coverage 82.12% 88.21% +6.09%
==========================================
Files 204 1765 +1561
Lines 3888 44451 +40563
Branches 686 686
==========================================
+ Hits 3193 39213 +36020
- Misses 679 5222 +4543
Partials 16 16
|
I tried this and it doesn't work. Regrettably, that PR is wrong.
|
I don't think it's worth keeping this PR open. I'll apply changes from here every time I need to work with oracle to make the schema dump faster, but I don't want to take the time to really review this in order to merge it. Also, I'm not going to merge it without review.
|
Did you try the just-updated rsim/oracle-enhanced#2521?
Note: the task `db:schema:dump` is extremely slow when running against an Oracle DB. Apparently the problem is not on our side; it comes from the `oracle-enhanced` adapter. I didn't want to spend time investigating this, so I told Claude to fix it. It worked on it for a while and actually fixed it, but I haven't reviewed this and have no idea what it did. Below are its code and explanation, for reference, in case we consider committing this, or at least want to understand better what the problem is here.

What this PR does / why we need it
The `oracle-enhanced` ActiveRecord adapter fires expensive Oracle data-dictionary queries on every DDL operation (`CREATE TABLE`, `CREATE INDEX`) and for every table during `db:schema:dump`. On Oracle XE this made DDL ~12x slower and `db:schema:dump` effectively broken (4+ minute timeout on a 90-table schema).

This PR adds performance patches to `config/initializers/oracle.rb`, the same file where all our other oracle-enhanced monkey-patches live.

Root cause analysis
Why DDL was slow
Every `add_index` call fired 3 data-dictionary queries before the actual `CREATE INDEX`:
- `table_exists?` → `SELECT owner, table_name FROM all_tables WHERE ...`
- `index_name_exists?` → calls `describe()` first (see below), then `SELECT 1 FROM all_indexes WHERE ...`
- `describe()` → a 4-way UNION: `all_tables UNION ALL all_views UNION ALL all_synonyms UNION ALL all_synonyms (PUBLIC)`

Every `create_table` with `force: true` also called `data_source_exists?`, which calls `describe()`.

On Oracle XE, `describe()` alone takes ~300ms. So 3 queries × ~300ms each = ~0.9s of overhead per index before any DDL runs.

Benchmark (10 iterations, 4 operations: create table + add column + add index + drop table):
Why db:schema:dump was slow
`db:schema:dump` iterates every table and calls these methods per table:
- `indexes(table)`: complex 5-way JOIN across `all_indexes`, `all_ind_columns`, `all_ind_expressions`, `all_tab_cols`, `all_constraints`
- `column_definitions(table)`: JOIN across `all_tab_cols` + `all_col_comments`
- `foreign_keys(table)`: 4-table JOIN across `all_constraints` + `all_cons_columns`
- `primary_keys(table)`: calls `describe()` + queries `all_constraints`/`all_cons_columns`
- `table_comment(table)`: calls `describe()` + queries `all_tab_comments`

With 90 tables, measured times:
- `indexes()`
- `column_definitions()`
- `foreign_keys()`
- `primary_keys()` + `table_comment()`

The schema dump effectively never finished.
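To make the scale of the problem concrete, the per-table access pattern can be mimicked with a toy object that only counts queries. This is a minimal sketch, not the adapter's code: `FakeConnection`, `fetch_for_table`, and `fetch_for_all_tables` are hypothetical names, and each call simply stands in for one data-dictionary query. It also counts a bulk variant (one query per kind of metadata) for comparison.

```ruby
# Toy stand-in for the adapter (hypothetical): every metadata call is
# counted as one data-dictionary query. No database is involved.
class FakeConnection
  METADATA_KINDS = %i[indexes column_definitions foreign_keys primary_keys table_comment].freeze

  attr_reader :query_count

  def initialize(tables)
    @tables = tables
    @query_count = 0
  end

  # Per-table lookup: one query per (kind, table) pair.
  def fetch_for_table(kind, table)
    @query_count += 1
    "#{kind} of #{table}"
  end

  # Bulk lookup: one query returns the metadata for every table at once.
  def fetch_for_all_tables(kind)
    @query_count += 1
    @tables.to_h { |table| [table, "#{kind} of #{table}"] }
  end
end

tables = (1..90).map { |i| "table_#{i}" }

# Pattern described above: each metadata method is called once per table.
per_table = FakeConnection.new(tables)
tables.each do |table|
  FakeConnection::METADATA_KINDS.each { |kind| per_table.fetch_for_table(kind, table) }
end

# Bulk alternative: one call per kind of metadata, covering all tables.
bulk = FakeConnection.new(tables)
FakeConnection::METADATA_KINDS.each { |kind| bulk.fetch_for_all_tables(kind) }

puts "per-table queries: #{per_table.query_count}" # 90 tables x 5 kinds = 450
puts "bulk queries:      #{bulk.query_count}"      # one per metadata kind = 5
```

The 450 figure matches the "~450 per-table data-dictionary queries" mentioned in the commit message; the real cost per query varies by method, which this count does not capture.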
The fix (6 patches)
All changes are in `config/initializers/oracle.rb`, using the same `module_eval`/`prepend`/`alias_method` patterns already used throughout that file.

Patch 1: Cache `describe()` per connection

`describe()` is the most expensive single call (~300ms). It's called by `index_name_exists?`, `data_source_exists?`, `foreign_keys`, `primary_keys`, `column_definitions`, and `table_comment`. The cache is a per-connection instance variable, which is safe because table metadata doesn't change mid-connection during migrations or schema dumps.

Patch 2: Simplify `data_source_exists?`

The original called `describe()` (4-way UNION). We only need to know if a table exists, so a direct `all_tables` query suffices.

Patch 3: Skip validation in `add_index_options`

The original `add_index_options` called `table_exists?` and `index_name_exists?` before every index creation to raise a friendly Ruby error on duplicates. We remove these checks: Oracle itself raises `ORA-00955: name is already used by an existing object` if an index name is duplicated. The guard is pure overhead during migrations.

The patched method is a verbatim copy of the upstream method with the guard block removed.
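The idea behind Patch 1 can be illustrated without an Oracle connection. The sketch below is not the code from this PR: `FakeAdapter` and `DescribeCache` are hypothetical names, and the fake `describe()` just counts calls instead of querying the data dictionary. It assumes a prepended module that memoizes results in an instance variable, which is one way to get a cache scoped to a single connection object.

```ruby
# Hypothetical stand-in for the adapter. The real describe() runs a
# 4-way UNION; this one only counts how often it is reached.
class FakeAdapter
  attr_reader :describe_calls

  def initialize
    @describe_calls = 0
  end

  def describe(name)
    @describe_calls += 1
    ["OWNER", name.to_s.upcase]
  end
end

# Prepended module (hypothetical name): memoizes describe() results in an
# instance variable, so the cache lives and dies with the connection object.
module DescribeCache
  def describe(name)
    @describe_cache ||= {}
    key = name.to_s.upcase
    return @describe_cache[key] if @describe_cache.key?(key)

    @describe_cache[key] = super
  end
end

FakeAdapter.prepend(DescribeCache)

conn = FakeAdapter.new
3.times { conn.describe(:accounts) } # first call misses, the next two hit the cache
conn.describe(:users)                # different name, so another miss
puts conn.describe_calls             # 2 underlying lookups for 4 calls
```

Because the cache is an instance variable, a second `FakeAdapter.new` starts with an empty cache, which mirrors the per-connection scoping described in Patch 1.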
Patches 4–6: Schema dump batch prefetch
Instead of calling `indexes()`, `columns()`, `foreign_keys()`, `primary_keys()`, and `table_comment()` 90 times each, we hook into `SchemaDumper#tables` to run 5 bulk queries upfront and cache the results.

`prefetch_schema_dump!` runs:
- `prefetch_schema_dump_columns!`: `all_tab_cols` + `all_col_comments` for all tables, populates `@columns_cache`
- `prefetch_schema_dump_indexes!`: full indexes query for all tables at once, stores in `@prefetched_indexes`
- `prefetch_schema_dump_primary_keys!`: all PKs from `all_constraints`/`all_cons_columns`, stores in `@prefetched_primary_keys`
- `prefetch_schema_dump_table_comments!`: all table comments from `all_tab_comments`, stores in `@prefetched_table_comments`
- `prefetch_schema_dump_foreign_keys!`: all FKs from `all_constraints`/`all_cons_columns`, stores in `@prefetched_foreign_keys`

The patched `indexes()`, `table_comment()`, and `foreign_keys()` check for the prefetched instance variables and return cached data immediately. `primary_keys()` is patched on `OracleEnhancedAdapter` (where it's defined, not on `SchemaStatements`).

Result:
- `indexes()` per table
- `column_definitions()` per table
- `foreign_keys()` per table
- `db:schema:dump` total

The generated `db/oracle_schema.rb` is byte-for-byte identical to the original (verified by diff).

Verification steps
Prerequisites: Oracle XE running locally (`oracle-enhanced://rails:railspass@127.0.0.1:1521/systempdb`)

To reproduce the slow schema dump (before this PR):
To verify the fix:
To verify the schema is correct:
To verify migrations still work:
Oracle CI pipeline: Please trigger the Oracle pipeline on CircleCI to run the full test suite.
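The byte-for-byte comparison of schema dumps could also be scripted in Ruby instead of running diff by hand. A minimal sketch, assuming two dump files saved before and after the patch; the helper name `schema_dumps_identical?` and the file names are hypothetical, and the temporary files only stand in for real copies of `db/oracle_schema.rb`.

```ruby
require "fileutils"
require "tmpdir"

# Returns true when both files contain exactly the same bytes.
# (Hypothetical helper; FileUtils.compare_file does the comparison.)
def schema_dumps_identical?(before_path, after_path)
  FileUtils.compare_file(before_path, after_path)
end

# Demo with throwaway files standing in for two schema dumps.
Dir.mktmpdir do |dir|
  before_dump = File.join(dir, "schema_before.rb")
  after_dump  = File.join(dir, "schema_after.rb")

  File.write(before_dump, "create_table :accounts\n")
  File.write(after_dump,  "create_table :accounts\n")
  puts schema_dumps_identical?(before_dump, after_dump) # true

  File.write(after_dump, "create_table :users\n")
  puts schema_dumps_identical?(before_dump, after_dump) # false
end
```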
Special notes for your reviewer
- The patches use `module_eval`, `prepend`, and `alias_method`, the same patterns used in the ~15 other monkey-patches already in `oracle.rb`
- The `describe()` cache is never invalidated within a connection. This is intentional and safe: the cache is only used during schema-introspection operations (DDL and dump), not during normal query execution where stale metadata would matter
- The `add_index_options` patch (Patch 3) is a verbatim copy of the upstream method minus the duplicate-check guard. If the adapter is upgraded, this method should be re-checked for changes
- The prefetch cache (`@prefetched_indexes`, `@prefetched_primary_keys`, etc.) is stored on the connection/adapter instance. It's populated once per `db:schema:dump` invocation and not used during normal app operation
- The `indexes_with_prefetch`/`indexes_without_prefetch` alias chain means the existing `add_index` override in this same file still calls through correctly: `add_index` → `add_index_options` (patched, no guard) → `execute CREATE INDEX`

Jira: https://issues.redhat.com/browse/THREESCALE-14244