THREESCALE-14244 Optimize Oracle Enhanced adapter performance by jlledom · Pull Request #4261 · 3scale/porta

jlledom · 2026-03-27T10:18:52Z

Note: the db:schema:dump task is extremely slow when running against an Oracle DB. Apparently the problem is not on our side; it comes from the oracle-enhanced adapter. I didn't want to spend time investigating this, so I told Claude to fix it. It worked on it for a while and actually fixed it, but I haven't reviewed the result and have no idea what it did. Below are its code and explanation, for reference, in case we consider committing this, or at least want to understand the problem better.

What this PR does / why we need it

The oracle-enhanced ActiveRecord adapter fires expensive Oracle data-dictionary queries on every DDL operation (CREATE TABLE, CREATE INDEX) and for every table during db:schema:dump. On Oracle XE this made DDL ~12x slower and db:schema:dump effectively broken (4+ minute timeout on a 90-table schema).

This PR adds performance patches to config/initializers/oracle.rb — the same file where all our other oracle-enhanced monkey-patches live.

Root cause analysis

Why DDL was slow

Every add_index call fired 3 data-dictionary queries before the actual CREATE INDEX:

table_exists? → SELECT owner, table_name FROM all_tables WHERE ...
index_name_exists? → calls describe() first (see below), then SELECT 1 FROM all_indexes WHERE ...
describe() → a 4-way UNION: all_tables UNION ALL all_views UNION ALL all_synonyms UNION ALL all_synonyms (PUBLIC)

Every create_table with force: true also called data_source_exists? which calls describe().

On Oracle XE, describe() alone takes ~300ms. So 3 queries × ~300ms each = ~0.9s of overhead per index before any DDL runs.

Benchmark (10 iterations, 4 operations: create table + add column + add index + drop table):

Oracle (vanilla):  0.924s/iter
Oracle (patched):  0.074s/iter  →  12.5x faster
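
A per-iteration figure like the ones above can be collected with Ruby's standard Benchmark module. The following is only a sketch of such a harness, not the script used for this PR: seconds_per_iteration and fake_ddl are hypothetical names, and the workload is a stand-in sleep rather than real DDL against Oracle.

```ruby
require "benchmark"

# Hypothetical helper: average wall-clock seconds per iteration of a block.
def seconds_per_iteration(iterations)
  total = Benchmark.realtime { iterations.times { |i| yield i } }
  total / iterations
end

# Stand-in workload. In a real run this would be the four DDL calls
# (create table, add column, add index, drop table) on an Oracle connection.
fake_ddl = ->(_i) { sleep 0.001 }

per_iter = seconds_per_iteration(10) { |i| fake_ddl.call(i) }
puts format("%.3fs/iter", per_iter)
```

Running the same harness once on master and once on the patched branch would give the two numbers to compare.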

Why db:schema:dump was slow

db:schema:dump iterates every table and calls these methods per table:

indexes(table) — complex 5-way JOIN across all_indexes, all_ind_columns, all_ind_expressions, all_tab_cols, all_constraints
column_definitions(table) — JOIN across all_tab_cols + all_col_comments
foreign_keys(table) — 4-table JOIN across all_constraints + all_cons_columns
primary_keys(table) — calls describe() + queries all_constraints/all_cons_columns
table_comment(table) — calls describe() + queries all_tab_comments

With 90 tables, measured times:

Method	Per table	× 90 tables
`indexes()`	2.93s	264s
`column_definitions()`	0.27s	24s
`foreign_keys()`	0.29s	26s
`primary_keys()` + `table_comment()`	~0.3s	~27s
Total		~341s

The schema dump effectively never finished.
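
The total in the table is simply the per-table cost multiplied by the table count. As a quick check of that arithmetic, using only the measured values listed above (the ~0.3s entry is taken as 0.3):

```ruby
TABLES = 90

# Per-table seconds as reported in the measurements above.
per_table = {
  "indexes" => 2.93,
  "column_definitions" => 0.27,
  "foreign_keys" => 0.29,
  "primary_keys + table_comment" => 0.3
}

# Scale each per-table cost to the whole schema and add them up.
totals = per_table.transform_values { |secs| secs * TABLES }
totals.each { |name, secs| puts format("%-30s %6.1fs", name, secs) }

grand_total = totals.values.sum
puts format("%-30s %6.1fs (about %.1f minutes)", "total", grand_total, grand_total / 60)
```

indexes() accounts for roughly three quarters of the total, which is why it benefits most from the prefetch.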

The fix (6 patches)

All changes are in config/initializers/oracle.rb, using the same module_eval/prepend/alias_method patterns already used throughout that file.

Patch 1: Cache `describe()` per connection

ActiveRecord::ConnectionAdapters::OracleEnhanced::Connection.prepend(Module.new do
  private
  def describe(name)
    @describe_cache ||= {}
    key = name.to_s.upcase
    return @describe_cache[key] if @describe_cache.key?(key)
    @describe_cache[key] = super
  end
end)

describe() is the most expensive single call (~300ms). It's called by index_name_exists?, data_source_exists?, foreign_keys, primary_keys, column_definitions, and table_comment. The cache is a per-connection instance variable, which is safe because table metadata doesn't change mid-connection during migrations or schema dumps.
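
To see the memoization pattern in isolation, here is a minimal self-contained sketch. FakeConnection and DescribeCache are hypothetical stand-ins, not the real oracle-enhanced classes, and the fake describe() only counts calls instead of querying the data dictionary.

```ruby
# Stand-in for the adapter's connection class; NOT the real oracle-enhanced code.
class FakeConnection
  attr_reader :dictionary_queries

  def initialize
    @dictionary_queries = 0
  end

  # Pretend this is the expensive 4-way UNION lookup.
  def describe(name)
    @dictionary_queries += 1
    name.to_s.upcase == "MISSING" ? nil : ["RAILS", name.to_s.upcase]
  end
end

# Same shape as Patch 1: memoize per instance, keyed by the upcased name.
module DescribeCache
  def describe(name)
    @describe_cache ||= {}
    key = name.to_s.upcase
    return @describe_cache[key] if @describe_cache.key?(key)
    @describe_cache[key] = super
  end
end

FakeConnection.prepend(DescribeCache)

conn = FakeConnection.new
conn.describe(:accounts)
conn.describe("ACCOUNTS") # cache hit: same key after upcasing
conn.describe(:missing)
conn.describe(:missing)   # a nil result is cached too, because of key?
puts conn.dictionary_queries # prints 2
```

Checking key? rather than the stored value means that, in this sketch, even a nil result is looked up only once.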

Patch 2: Simplify `data_source_exists?`

def data_source_exists?(table_name)
  table_exists?(table_name)
end

The original called describe() (4-way UNION). We only need to know if a table exists, so a direct all_tables query suffices.

Patch 3: Skip validation in `add_index_options`

The original add_index_options called table_exists? and index_name_exists? before every index creation to raise a friendly Ruby error on duplicates. We remove these checks — Oracle itself raises ORA-00955: name is already used by an existing object if an index name is duplicated. The guard is pure overhead during migrations.

This is a verbatim copy of the upstream method with the guard block removed:

# Removed:
# if table_exists?(table_name) && index_name_exists?(table_name, index_name)
#   raise ArgumentError, "Index name '#{index_name}' on table '#{table_name}' already exists"
# end
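
The trade-off can be illustrated with a toy model. FakeDatabase and both add_index_* helpers below are hypothetical; the sketch only assumes, as stated above, that the database itself rejects a duplicate name.

```ruby
# Hypothetical stand-in for a database that enforces unique object names itself.
class FakeDatabase
  class NameInUse < StandardError; end

  attr_reader :dictionary_lookups

  def initialize
    @indexes = {}
    @dictionary_lookups = 0
  end

  # Models the pre-check query against the data dictionary.
  def index_name_exists?(name)
    @dictionary_lookups += 1
    @indexes.key?(name)
  end

  def create_index(name)
    raise NameInUse, "ORA-00955: name is already used by an existing object" if @indexes.key?(name)
    @indexes[name] = true
  end
end

# Guarded variant: pays for a lookup before every CREATE INDEX.
def add_index_with_guard(db, name)
  raise ArgumentError, "Index name '#{name}' already exists" if db.index_name_exists?(name)
  db.create_index(name)
end

# Unguarded variant: lets the database reject duplicates.
def add_index_without_guard(db, name)
  db.create_index(name)
end

guarded = FakeDatabase.new
3.times { |i| add_index_with_guard(guarded, "idx_#{i}") }

unguarded = FakeDatabase.new
3.times { |i| add_index_without_guard(unguarded, "idx_#{i}") }

duplicate_error =
  begin
    add_index_without_guard(unguarded, "idx_0")
    nil
  rescue FakeDatabase::NameInUse => e
    e.message
  end

puts guarded.dictionary_lookups   # prints 3
puts unguarded.dictionary_lookups # prints 0
puts duplicate_error
```

Duplicates are still rejected without the guard; what changes is that the caller sees a database error instead of an ArgumentError.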

Patches 4–6: Schema dump batch prefetch

Instead of calling indexes(), columns(), foreign_keys(), primary_keys(), and table_comment() 90 times each, we hook into SchemaDumper#tables to run 5 bulk queries upfront and cache results:

ActiveRecord::ConnectionAdapters::OracleEnhanced::SchemaDumper.prepend(Module.new do
  private
  def tables(stream)
    @connection.prefetch_schema_dump!
    super
  end
end)

prefetch_schema_dump! runs:

prefetch_schema_dump_columns! — all_tab_cols + all_col_comments for all tables, populates @columns_cache
prefetch_schema_dump_indexes! — full indexes query for all tables at once, stores in @prefetched_indexes
prefetch_schema_dump_primary_keys! — all PKs from all_constraints/all_cons_columns, stores in @prefetched_primary_keys
prefetch_schema_dump_table_comments! — all table comments from all_tab_comments, stores in @prefetched_table_comments
prefetch_schema_dump_foreign_keys! — all FKs from all_constraints/all_cons_columns, stores in @prefetched_foreign_keys

The patched indexes(), table_comment(), foreign_keys() check for the prefetched instance variables and return cached data immediately. primary_keys() is patched on OracleEnhancedAdapter (where it's defined, not on SchemaStatements).
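
The prefetch idea, reduced to a self-contained sketch: one bulk query is grouped by table name, and per-table lookups then read from the grouped hash. FakeAdapter, IndexPrefetch, all_index_rows and prefetch_indexes! are hypothetical names, and the sketch uses prepend for brevity where the PR notes mention an alias_method chain for indexes.

```ruby
# Stand-in adapter; rows and helper names are hypothetical.
class FakeAdapter
  attr_reader :queries

  ROWS = [
    { table_name: "accounts", index_name: "idx_accounts_org" },
    { table_name: "accounts", index_name: "idx_accounts_state" },
    { table_name: "users",    index_name: "idx_users_email" }
  ].freeze

  def initialize
    @queries = 0
  end

  # Original per-table lookup: one dictionary query per call.
  def indexes(table_name)
    @queries += 1
    ROWS.select { |r| r[:table_name] == table_name }.map { |r| r[:index_name] }
  end

  # One bulk query covering every table.
  def all_index_rows
    @queries += 1
    ROWS
  end
end

module IndexPrefetch
  # Run the bulk query once and group the rows by table.
  def prefetch_indexes!
    grouped = all_index_rows.group_by { |r| r[:table_name] }
    @prefetched_indexes = grouped.transform_values { |rows| rows.map { |r| r[:index_name] } }
  end

  # Serve from the prefetched hash when present, else fall back to the original.
  def indexes(table_name)
    return super unless @prefetched_indexes
    @prefetched_indexes.fetch(table_name, [])
  end
end

FakeAdapter.prepend(IndexPrefetch)

adapter = FakeAdapter.new
adapter.prefetch_indexes!
result = %w[accounts users plans].to_h { |t| [t, adapter.indexes(t)] }
puts adapter.queries # prints 1
p result
```

Because the patched method falls back to super when nothing has been prefetched, code paths outside the schema dump keep the original behaviour.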

Result:

Method	Before	After
`indexes()` per table	2.93s	~0s (cache hit)
`column_definitions()` per table	0.27s	~0s (cache hit)
`foreign_keys()` per table	0.29s	~0s (cache hit)
Prefetch (one-time)	—	~2.3s total
`db:schema:dump` total	4+ minutes	~9s

The generated db/oracle_schema.rb is byte-for-byte identical to the original (verified by diff).

Verification steps

Prerequisites: Oracle XE running locally (oracle-enhanced://rails:railspass@127.0.0.1:1521/systempdb)

To reproduce the slow schema dump (before this PR):

git checkout master  # without this PR
time DATABASE_URL='oracle-enhanced://...' bundle exec rails db:schema:dump
# Expected: hangs for 4+ minutes or times out

To verify the fix:

git checkout THREESCALE-14244-oracle-ddl-performance
time DATABASE_URL='oracle-enhanced://...' bundle exec rails db:schema:dump
# Expected: completes in ~9s

To verify the schema is correct:

# Run schema dump twice and diff — should be identical (modulo schema version)
DATABASE_URL='oracle-enhanced://...' bundle exec rails db:schema:dump
cp db/oracle_schema.rb /tmp/schema_a.rb
DATABASE_URL='oracle-enhanced://...' bundle exec rails db:schema:dump
diff /tmp/schema_a.rb db/oracle_schema.rb  # should be empty

To verify migrations still work:

DATABASE_URL='oracle-enhanced://...' bundle exec rails db:migrate
# Should complete without ORA-XXXXX errors

Oracle CI pipeline: Please trigger the Oracle pipeline on CircleCI to run the full test suite.

Special notes for your reviewer

All 6 patches use module_eval, prepend, and alias_method — the same patterns used in the ~15 other monkey-patches already in oracle.rb
The describe() cache is never invalidated within a connection. This is intentional and safe: the cache is only used during schema-introspection operations (DDL and dump), not during normal query execution where stale metadata would matter
The add_index_options patch (Patch 3) is a verbatim copy of the upstream method minus the duplicate-check guard. If the adapter is upgraded, this method should be re-checked for changes
The prefetch data (@prefetched_indexes, @prefetched_primary_keys, etc.) is stored on the connection/adapter instance. It's populated once per db:schema:dump invocation and not used during normal app operation
The indexes_with_prefetch / indexes_without_prefetch alias chain means the existing add_index override in this same file still calls through correctly — add_index → add_index_options (patched, no guard) → execute CREATE INDEX

Jira: https://issues.redhat.com/browse/THREESCALE-14244

The oracle-enhanced adapter fires expensive Oracle data-dictionary queries on every DDL operation and for every table during db:schema:dump.

DDL fixes:
- Cache describe() results per connection to avoid repeated 4-way UNION queries across all_tables/all_views/all_synonyms on every add_index
- Simplify data_source_exists? to use table_exists? (single all_tables query) instead of the full describe() UNION
- Skip redundant table_exists?/index_name_exists? validation in add_index_options; Oracle raises ORA-00955 on duplicates anyway

Schema dump prefetch (db:schema:dump):
- Prefetch columns, indexes, primary keys, table comments, and foreign keys in 5 bulk queries before iterating tables, replacing ~450 per-table data-dictionary queries with 5 single queries

Measured on Oracle XE (90 tables):
- DDL operations (create table + index): 0.92s → 0.07s (~12x faster)
- db:schema:dump: 4+ minutes → ~9s (~40x faster)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

akostadinov · 2026-03-27T10:47:22Z

Wow, this is really too complicated. It seems like people have found a couple of simpler solutions: rsim/oracle-enhanced#2467 but didn't follow through with either of them.

codecov · 2026-03-27T11:09:03Z

Codecov Report

❌ Patch coverage is 36.08247% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.21%. Comparing base (066a7c2) to head (e2c5894).

Files with missing lines	Patch %	Lines
config/initializers/oracle.rb	36.08%	62 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #4261      +/-   ##
==========================================
+ Coverage   82.12%   88.21%   +6.09%     
==========================================
  Files         204     1765    +1561     
  Lines        3888    44451   +40563     
  Branches      686      686              
==========================================
+ Hits         3193    39213   +36020     
- Misses        679     5222    +4543     
  Partials       16       16


jlledom · 2026-04-15T14:36:26Z

> Wow, this is really too complicated. It seems like people have found a couple of simpler solutions: rsim/oracle-enhanced#2467 but didn't follow through with either of them.

I tried this and it doesn't work. Regrettably, that PR is wrong.

jlledom · 2026-04-15T14:40:40Z

I don't think it's worth keeping this PR open. I'll apply changes from here every time I need to work with oracle to make the schema dump faster, but I don't want to take the time to really review this in order to merge it. Also, I'm not going to merge it without review.

akostadinov · 2026-04-16T10:12:10Z

Did you try the just-updated rsim/oracle-enhanced#2521?

akostadinov · 2026-04-16T10:14:19Z

Or rsim/oracle-enhanced#2531?

jlledom closed this Apr 15, 2026

jlledom wants to merge 1 commit into master from THREESCALE-14244-oracle-ddl-performance
