Skip to content

Fix oracle for Rails 7.1#4296

Merged
akostadinov merged 7 commits into
masterfrom
oracle_1000
May 14, 2026
Merged

Fix oracle for Rails 7.1#4296
akostadinov merged 7 commits into
masterfrom
oracle_1000

Conversation

@akostadinov
Copy link
Copy Markdown
Contributor

Various fixes and cleanup for 0racle in Rails 7.1

Comment on lines +5 to +11
ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.class_eval do
# https://github.com/rsim/oracle-enhanced/pull/2573
# Perhaps will end-up into adapter version 8.1.4+ but double check
def supports_fetch_first_n_rows_and_offset?
true
end
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the upgrade, there was a regression in the adapter code that hardcoded the old visitor, so Oracle12 visitor that we patch did not come into play at all. Get back the the new visitor so our patches are in effect.

@qltysh
Copy link
Copy Markdown

qltysh Bot commented May 5, 2026

All good ✅

Comment on lines -187 to -192
BabySqueel::Nodes::Attribute.prepend(Module.new do
# those relations are used in subqueries and oracle does not support ORDER in subqueries
private def sanitize_relation(rel)
super rel.unscope(:order)
end
end)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets see if the test suite complains. With the Rails 7.1 upgrade our patch to the visitor removing :order from queries also fell out of effect because of the supports_fetch_first_n_rows_and_offset? issue described above. So I want to try whether removing this will also go without troubles.

Oracle in fact supports order in some subqueries (according to AI, when combined with limits), so ideally we should not silently strip out ORDER clauses. But strip them out whenever they are in fact not needed in the specific places in our code where we have such sub-queries.

Alternatively we can try to remove ORDER when we don't detect a LIMIT or NUMROW. But that might be messy and flaky.

@odbc_dsn ||= "DSN=oracle;Driver={Oracle-Driver};Dbq=#{conf[:host]}:#{conf[:port] || 1521}/#{conf[:database]};Uid=#{conf[:username]};Pwd=#{conf[:password]}"
end
end
end)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't use database delta indices with ThinkingSphinx but only Realtime indices, these patches should not be needed as far as I can tell. We can't even know if they are still working. I prefer to just remove them.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not need the ts-datetime-delta gem then.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! b43d27a

Comment on lines -281 to -292
def add_index(table_name, column_name, **options) #:nodoc:
# All this code is exactly the same as the original except the line of the ALTER TABLE, which adds an additional USING INDEX #{quote_column_name(index_name)}
# The reason of this is otherwise it picks the first index that finds that contains that column name, even if it is shared with other columns and it is not unique.
# upstreamed: https://github.com/rsim/oracle-enhanced/pull/2293
index_name, index_type, quoted_column_names, tablespace, index_options = add_index_options(table_name, column_name, **options)
quoted_table_name = quote_table_name(table_name)
quoted_column_name = quote_column_name(index_name)
execute "CREATE #{index_type} INDEX #{quoted_column_name} ON #{quoted_table_name} (#{quoted_column_names})#{tablespace} #{index_options}"
if index_type == 'UNIQUE' && quoted_column_names !~ /\(.*\)/
execute "ALTER TABLE #{quoted_table_name} ADD CONSTRAINT #{quoted_column_name} #{index_type} (#{quoted_column_names}) USING INDEX #{quoted_column_name}"
end
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I find it merged in oracle_enhanced 7.1 already so letting it go.

Comment on lines -6 to -24
# we need to strip ORDER from subqueries because Oracle does not support it
def strip_order_from_select(o)
case (node = o.right)
when Arel::Nodes::SelectStatement
node.orders = []
end
end

def visit_Arel_Nodes_In(o, collector)
strip_order_from_select(o)

super
end

def visit_Arel_Nodes_NotIn(o, collector)
strip_order_from_select(o)

super
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what essentially stopped being effectinve since our Rails 7.1 upgrade. There was one query that Daria fixed at the time, that was previously covered but this clean-up but there doesn't seem to be anything else the testsuite could find.

So I prefer to keep these removed, stay with upstream and also see explanation about the BabySqueel counterpart https://github.com/3scale/porta/pull/4296/changes#r3190774533

# Fixes ORA-00932: inconsistent datatypes: expected - got CLOB
# remove when https://github.com/rsim/oracle-enhanced/pull/2415 is merged
def visit_Arel_Nodes_Equality(o, collector)
left = o.left
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just had to bring this method in sync with the upstream updates, db:schema:dump was failing because some method call was invalid in our version

Copy link
Copy Markdown
Contributor

@jlledom jlledom May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original method is the same except the if right guard. Is that what was causing the error when calling schema dump? because if not, we could just remove this patch and use the original method.

https://github.com/rsim/oracle-enhanced/blob/v7.1.1/lib/arel/visitors/oracle_common.rb#L10-L20

The purpose of this patch, which according to the comment is to handle the DBMS_LOB.COMPARE issue, is in fact fixed upstream.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the linked pull request in comment, which is not merged and it is needed.

end

# Remove after upgrade to a version with
# https://github.com/rsim/oracle-enhanced/pull/2654
Copy link
Copy Markdown
Contributor Author

@akostadinov akostadinov May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the following two methods fix the issue with Oracle IN (more than 1000 values)

There are a couple of tests down there.

Comment on lines -294 to -332
def distinct_relation_for_primary_key(relation) # :nodoc:
primary_key_columns = Array(relation.primary_key).map do |column|
visitor.compile(relation.table[column])
end

values = columns_for_distinct(
primary_key_columns,
relation.order_values
)

limited = relation.reselect(values).distinct!

# The original code in https://github.com/rails/rails/blob/v7.1.5.1/activerecord/lib/active_record/connection_adapters/abstract/schema_statements.rb#L1404-L1406
# ----
# limited_ids = select_rows(limited.arel, "SQL").map do |results|
# results.last(Array(relation.primary_key).length) # ignores order values for MySQL and PostgreSQL
# end
# ----
# The change is needed because in Oracle, because otherwise the resulting `limited_ids` array would be wrong. For example,
# for a ServiceContract model that has the primary key "id" and a query with ordering and filtering you may get:
# 1. `limited` variable can be something like: #<ActiveRecord::Relation [#<ServiceContract alias_0__: "live", id: 4, raw_rnum_: 1>]>
# 2. `select_rows` would then return [["live",4,1]]
# 3. `limited_ids` calculation will result in [[1]], which is an invalid IDs list (the expected is [[4]])
# The updated code doesn't make assumptions about how many columns are selected or their order, but fetches the values according
# to the primary key column names
limited_ids = select_all(limited.arel, "SQL").to_ary.map do |result|
result.values_at(*Array(relation.primary_key))
end

if limited_ids.empty?
relation.none!
else
relation.where!(**Array(relation.primary_key).zip(limited_ids.transpose).to_h)
end

relation.limit_value = relation.offset_value = nil
relation
end
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added with our Rails 7.1 upgrade. But in the respective oracle_enhanced-adapter version, the Oracle 11 visitor was mistakenly hardcoded.

With that visitor, limits to the query are added with adding a row num to the queries, or maybe it adds something even without limit, not sure. This confuses ActiveRecord distinct_relation_for_primary_key and this is IMO a nice better implementation that should go upstream to Rails.

The point for removing it is that with the Oracle12 visitor we started to use, this doesn't seem to be a problem. So I'm removing it, although it is perfectly fine to also stay. But I prefer to remove it to avoid future Rails upgrade issue.

The way to reproduce the issue is by going back to the old Oracle11 listener by altering supports_fetch_first_n_rows_and_offset? above to false and then run with oracle features/buyers/accounts/service_contracts/index.feature#Ordering and filtering by service

Apparently with that method OR with the Oracle 12 listener the test pass. But let me know if you think it is safer to keep this monkey patch. I definitely like the implementation.

Also @mayorova would you submit this nice change to Rails? Or if you don't have time/desire for this, let me know and I can do.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so, just to make sure I understand this correctly:

The issue with the default implementation of distinct_relation_for_primary_key happens only for SQL queries generated by the Oracle11 visitor, which was accidentally forced in the oracle-enhanced adapter. In the "modern" syntax, which is generated by Oracle12 visitor (and which serves us well, because we don't support any old versions of Oracle anyway - only 19), fixing distinct_relation_for_primary_key is not required.

And even when we upgrade to the version that includes https://github.com/rsim/oracle-enhanced/pull/2573/changes, and remove the patch

def supports_fetch_first_n_rows_and_offset?
  true
end

we'll still stay on Oracle12 visitor, because we don't support older Oracle anyway.

I'm wondering though - for the condition o.limit && o.lock Oracle11 is still forced (I think it's the only use case?). So, could this kind of select potentially call distinct_relation_for_primary_key still ? And hence still be broken, if we remove this modified version?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked Bob about it, see his answer below.

We probably don't have such queries currently in the code base, but who knows - maybe they will appear at some point...

By the way, the suggested test currently throws:

Minitest::UnexpectedError: ActiveRecord::StatementInvalid: OCIError: ORA-00907: missing right parenthesis
    test/unit/oracle_hacks_test.rb:50:in `block in <class:CollectionsTest>'
test/unit/oracle_hacks_test.rb:50:in `block in <class:CollectionsTest>'

which, if I remember correctly is NOT what overriding distinct_relation_for_primary_key was fixing 🙃 (there were no errors - just 0 results when > 0 rows were expected).

🚨 Critical Bug Analysis: distinct_relation_for_primary_key Removal

Summary

The removal of the distinct_relation_for_primary_key override creates a critical bug when combined with the Oracle11 visitor fallback for LIMIT+LOCK queries.

The Bug Scenario

When distinct_relation_for_primary_key is Called

From Rails source (active_record/relation/finder_methods.rb):

def apply_join_dependency(eager_loading: group_values.empty?)
  # ...
  if eager_loading && has_limit_or_offset? && !(using_limitable_reflections?)
    relation = klass.connection.distinct_relation_for_primary_key(relation)
  end
end

This method is called when:

  1. Eager loading associations (.includes() or .eager_load())
  2. Has LIMIT or OFFSET
  3. Associations are not "limitable" (complex joins that can't be optimized)

The Problem Query

A query like this WILL trigger the bug:

Account.includes(:users)
       .where(state: 'approved')
       .order(:created_at)
       .limit(10)
       .lock              # <-- LOCK added!
       .to_a

What Happens (Step by Step)

  1. ✅ Query has limit + lock → Oracle12 visitor detects this condition
  2. ✅ Switches to Oracle11 visitor via visit_Arel_Nodes_SelectStatement
  3. ✅ SQL generated with ROWNUM syntax (not FETCH FIRST)
  4. ❌ Rails calls distinct_relation_for_primary_key (because of eager loading + limit)
  5. The removed override is NOT there to handle the ROWNUM-based SQL
  6. 💥 The default Rails implementation expects FETCH FIRST syntax and will FAIL!

Why This Wasn't Caught

The removed distinct_relation_for_primary_key override was specifically designed to handle Oracle's quirks with result set parsing. When you switch to Oracle11 visitor (ROWNUM), the SQL structure changes fundamentally, but there's no corresponding distinct_relation_for_primary_key implementation for that case.

The Evidence

From the removed code comments:

# The change is needed because in Oracle, because otherwise the resulting 
# `limited_ids` array would be wrong. For example, for a ServiceContract 
# model that has the primary key "id" and a query with ordering and 
# filtering you may get:
# 1. `limited` variable: #<ActiveRecord::Relation [#<ServiceContract alias_0__: "live", id: 4, raw_rnum_: 1>]>
# 2. `select_rows` would return [["live",4,1]]  # <-- Note the raw_rnum_ column!
# 3. `limited_ids` calculation will result in [[1]], which is WRONG (expected [[4]])

Key insight: The ROWNUM syntax adds extra columns (raw_rnum_, alias_0__) that the default Rails code doesn't know how to handle!

SQL Comparison

Oracle12 (FETCH FIRST) - What Rails expects:

SELECT * FROM accounts 
ORDER BY created_at 
FETCH FIRST 10 ROWS ONLY
FOR UPDATE

Oracle11 (ROWNUM) - What we actually generate with LOCK:

SELECT * FROM (
  SELECT accounts.*, ROWNUM as raw_rnum_
  FROM accounts
  ORDER BY created_at
) WHERE raw_rnum_ <= 10
FOR UPDATE

The extra raw_rnum_ column breaks the default distinct_relation_for_primary_key logic!

Recommendations

This code removal is UNSAFE. You need to either:

Option 1: Keep the Override (Recommended)

Restore the distinct_relation_for_primary_key override - it's still needed when Oracle11 visitor is used.

Option 2: Prevent the Scenario

Add logic to prevent .lock on queries with eager loading + limit:

def visit_Arel_Nodes_SelectStatement(o, collector)
  if o.limit && o.lock
    # Check if this might trigger distinct_relation_for_primary_key
    # and raise an error or handle it differently
    @oracle11_visitor ||= Arel::Visitors::Oracle.new(@connection)
    return @oracle11_visitor.accept(o.dup, collector)
  end
  super
end

Option 3: Wait for Upstream Fix

Don't remove the code until oracle-enhanced 8.1.4+ is deployed AND verified to fix both:

Suggested Test Case

test "eager loading with limit and lock should work" do
  skip "Oracle-specific test" unless System::Database.oracle?
  
  account = FactoryBot.create(:simple_account)
  FactoryBot.create_list(:simple_user, 5, account: account)
  
  # This should not raise an error or return wrong results
  result = Account.includes(:users)
                  .where(id: account.id)
                  .order(:created_at)
                  .limit(3)
                  .lock
                  .to_a
  
  assert_equal 1, result.size
  assert_equal account.id, result.first.id
  
  # Verify users were actually eager loaded (no N+1 queries)
  assert result.first.association(:users).loaded?
end

Conclusion

Verdict: The removal of distinct_relation_for_primary_key creates a latent bug when:

  • Eager loading associations
  • With LIMIT/OFFSET
  • With LOCK
  • On complex joins

This is a rare but critical edge case that will cause incorrect query results or errors in production.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I wanted to capture in the comment. It might be possible to have a query that fails but we don't. We can generate many queries that will fail without various upstream fixes or things without existing fixes.

The fact is that we don't have such queries right now. If we introduce such queries, they should fail on oracle so we will know to either fix the queries or the adapter (in this case actually active record).

So I prefer not to have this redundant workaround in the codebase until we actually need it. Rather than focus on fixing upstream for a long-term safety.

I think bob can give a similar answer if you ask it whether we should backport all 8.1.4 upstream fixes to our monkey patch or not.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fine for me 👍

Comment on lines -61 to -66
# remove when addressed: https://github.com/rsim/oracle-enhanced/pull/2247 - included in v7.1.0
def visit_Arel_Nodes_Matches o, collector
if !o.case_sensitive && o.left && o.right
o.left = Arel::Nodes::NamedFunction.new('UPPER', [o.left])
o.right = Arel::Nodes::NamedFunction.new('UPPER', [o.right])
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already live in oracle adapter 7.1 so we can get rid of it 🎉

jlledom
jlledom previously approved these changes May 13, 2026
Copy link
Copy Markdown
Contributor

@jlledom jlledom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always happy to remove patches! Also, the Oracle 11 visitor being forced by the adapter was in fact something that needed addressing. Thanks for working on this!

# Fixes ORA-00932: inconsistent datatypes: expected - got CLOB
# remove when https://github.com/rsim/oracle-enhanced/pull/2415 is merged
def visit_Arel_Nodes_Equality(o, collector)
left = o.left
Copy link
Copy Markdown
Contributor

@jlledom jlledom May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original method is the same except the if right guard. Is that what was causing the error when calling schema dump? because if not, we could just remove this patch and use the original method.

https://github.com/rsim/oracle-enhanced/blob/v7.1.1/lib/arel/visitors/oracle_common.rb#L10-L20

The purpose of this patch, which according to the comment is to handle the DBMS_LOB.COMPARE issue, is in fact fixed upstream.

Comment on lines +27 to +33
def visit_Arel_Nodes_SelectStatement(o, collector)
if o.limit && o.lock
@oracle11_visitor ||= Arel::Visitors::Oracle.new(@connection)
return @oracle11_visitor.accept(o.dup, collector)
end
super
end
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a test for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have failing tests without this, that's why I had to backport it. I don't have a low-level reproducer and think there is no point in spending time on finding one. I'm mostly interested in fixing actual calls that we make in porta, not backport all possible upstream fixes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a simple Model.lock.limit(1).first raise an exception?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual issue I couldn't replicate better than existing tests was

ArgumentError: Combination of limit and lock is not supported. Because generated SQL statements
SELECT FOR UPDATE and FETCH FIRST n ROWS generates ORA-02014.

https://app.circleci.com/pipelines/github/3scale/porta/34308/workflows/a8a759c5-ac8e-4e0f-baf2-28c8ae800d3e/jobs/408119

@odbc_dsn ||= "DSN=oracle;Driver={Oracle-Driver};Dbq=#{conf[:host]}:#{conf[:port] || 1521}/#{conf[:database]};Uid=#{conf[:username]};Pwd=#{conf[:password]}"
end
end
end)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not need the ts-datetime-delta gem then.

Comment thread config/initializers/oracle.rb Outdated
Comment on lines +52 to +57
# uncomment the commented our lines for https://github.com/rsim/oracle-enhanced/pull/2485
# if you uncomment them and things don't fail, then they are taking effect
ActiveRecord::Base.skip_callback(:update, :after, :enhanced_write_lobs)
# ActiveRecord::Base.skip_callback(:create, :after, :enhanced_write_lobs)
ActiveRecord::Base.skip_callback(:update, :before, :record_changed_lobs)
# ActiveRecord::Base.skip_callback(:create, :before, :record_lobs_for_create)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the commented lines, why are they commented? Why can they be uncommented for :update but not for :insert?

Also, I don't understand your comment: "if you uncomment them and things don't fail, then they are taking effect", mmm... if you uncomment them and things don't fail, doesn't that mean the callback was in fact doing nothing? and how do they "take effect"? What the lines do is removing a callback, so preventing it to take effect.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These callbacks are redundant and long version is in the upstream pull request.

Future upstream versions have the callbacks for create as well update so now we cannot uncomment but later we can and should.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I update d the comment with e2bfd15, hope it makes more sense now.

@akostadinov akostadinov merged commit 9861df8 into master May 14, 2026
23 of 27 checks passed
@akostadinov akostadinov deleted the oracle_1000 branch May 14, 2026 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants