Fix oracle for Rails 7.1#4296
Conversation
| ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.class_eval do | ||
| # https://github.com/rsim/oracle-enhanced/pull/2573 | ||
| # Perhaps will end-up into adapter version 8.1.4+ but double check | ||
| def supports_fetch_first_n_rows_and_offset? | ||
| true | ||
| end | ||
| end |
There was a problem hiding this comment.
With the upgrade, there was a regression in the adapter code that hardcoded the old visitor, so Oracle12 visitor that we patch did not come into play at all. Get back the the new visitor so our patches are in effect.
| BabySqueel::Nodes::Attribute.prepend(Module.new do | ||
| # those relations are used in subqueries and oracle does not support ORDER in subqueries | ||
| private def sanitize_relation(rel) | ||
| super rel.unscope(:order) | ||
| end | ||
| end) |
There was a problem hiding this comment.
Lets see if the test suite complains. With the Rails 7.1 upgrade our patch to the visitor removing :order from queries also fell out of effect because of the supports_fetch_first_n_rows_and_offset? issue described above. So I want to try whether removing this will also go without troubles.
Oracle in fact supports order in some subqueries (according to AI, when combined with limits), so ideally we should not silently strip out ORDER clauses. But strip them out whenever they are in fact not needed in the specific places in our code where we have such sub-queries.
Alternatively we can try to remove ORDER when we don't detect a LIMIT or NUMROW. But that might be messy and flaky.
| @odbc_dsn ||= "DSN=oracle;Driver={Oracle-Driver};Dbq=#{conf[:host]}:#{conf[:port] || 1521}/#{conf[:database]};Uid=#{conf[:username]};Pwd=#{conf[:password]}" | ||
| end | ||
| end | ||
| end) |
There was a problem hiding this comment.
Because we don't use database delta indices with ThinkingSphinx but only Realtime indices, these patches should not be needed as far as I can tell. We can't even know if they are still working. I prefer to just remove them.
There was a problem hiding this comment.
We might not need the ts-datetime-delta gem then.
| def add_index(table_name, column_name, **options) #:nodoc: | ||
| # All this code is exactly the same as the original except the line of the ALTER TABLE, which adds an additional USING INDEX #{quote_column_name(index_name)} | ||
| # The reason of this is otherwise it picks the first index that finds that contains that column name, even if it is shared with other columns and it is not unique. | ||
| # upstreamed: https://github.com/rsim/oracle-enhanced/pull/2293 | ||
| index_name, index_type, quoted_column_names, tablespace, index_options = add_index_options(table_name, column_name, **options) | ||
| quoted_table_name = quote_table_name(table_name) | ||
| quoted_column_name = quote_column_name(index_name) | ||
| execute "CREATE #{index_type} INDEX #{quoted_column_name} ON #{quoted_table_name} (#{quoted_column_names})#{tablespace} #{index_options}" | ||
| if index_type == 'UNIQUE' && quoted_column_names !~ /\(.*\)/ | ||
| execute "ALTER TABLE #{quoted_table_name} ADD CONSTRAINT #{quoted_column_name} #{index_type} (#{quoted_column_names}) USING INDEX #{quoted_column_name}" | ||
| end | ||
| end |
There was a problem hiding this comment.
This one I find it merged in oracle_enhanced 7.1 already so letting it go.
| # we need to strip ORDER from subqueries because Oracle does not support it | ||
| def strip_order_from_select(o) | ||
| case (node = o.right) | ||
| when Arel::Nodes::SelectStatement | ||
| node.orders = [] | ||
| end | ||
| end | ||
|
|
||
| def visit_Arel_Nodes_In(o, collector) | ||
| strip_order_from_select(o) | ||
|
|
||
| super | ||
| end | ||
|
|
||
| def visit_Arel_Nodes_NotIn(o, collector) | ||
| strip_order_from_select(o) | ||
|
|
||
| super | ||
| end |
There was a problem hiding this comment.
This is what essentially stopped being effectinve since our Rails 7.1 upgrade. There was one query that Daria fixed at the time, that was previously covered but this clean-up but there doesn't seem to be anything else the testsuite could find.
So I prefer to keep these removed, stay with upstream and also see explanation about the BabySqueel counterpart https://github.com/3scale/porta/pull/4296/changes#r3190774533
| # Fixes ORA-00932: inconsistent datatypes: expected - got CLOB | ||
| # remove when https://github.com/rsim/oracle-enhanced/pull/2415 is merged | ||
| def visit_Arel_Nodes_Equality(o, collector) | ||
| left = o.left |
There was a problem hiding this comment.
just had to bring this method in sync with the upstream updates, db:schema:dump was failing because some method call was invalid in our version
There was a problem hiding this comment.
The original method is the same except the if right guard. Is that what was causing the error when calling schema dump? because if not, we could just remove this patch and use the original method.
https://github.com/rsim/oracle-enhanced/blob/v7.1.1/lib/arel/visitors/oracle_common.rb#L10-L20
The purpose of this patch, which according to the comment is to handle the DBMS_LOB.COMPARE issue, is in fact fixed upstream.
There was a problem hiding this comment.
Look at the linked pull request in comment, which is not merged and it is needed.
| end | ||
|
|
||
| # Remove after upgrade to a version with | ||
| # https://github.com/rsim/oracle-enhanced/pull/2654 |
There was a problem hiding this comment.
the following two methods fix the issue with Oracle IN (more than 1000 values)
There are a couple of tests down there.
We only use realtime indices so this is dead code now.
| def distinct_relation_for_primary_key(relation) # :nodoc: | ||
| primary_key_columns = Array(relation.primary_key).map do |column| | ||
| visitor.compile(relation.table[column]) | ||
| end | ||
|
|
||
| values = columns_for_distinct( | ||
| primary_key_columns, | ||
| relation.order_values | ||
| ) | ||
|
|
||
| limited = relation.reselect(values).distinct! | ||
|
|
||
| # The original code in https://github.com/rails/rails/blob/v7.1.5.1/activerecord/lib/active_record/connection_adapters/abstract/schema_statements.rb#L1404-L1406 | ||
| # ---- | ||
| # limited_ids = select_rows(limited.arel, "SQL").map do |results| | ||
| # results.last(Array(relation.primary_key).length) # ignores order values for MySQL and PostgreSQL | ||
| # end | ||
| # ---- | ||
| # The change is needed because in Oracle, because otherwise the resulting `limited_ids` array would be wrong. For example, | ||
| # for a ServiceContract model that has the primary key "id" and a query with ordering and filtering you may get: | ||
| # 1. `limited` variable can be something like: #<ActiveRecord::Relation [#<ServiceContract alias_0__: "live", id: 4, raw_rnum_: 1>]> | ||
| # 2. `select_rows` would then return [["live",4,1]] | ||
| # 3. `limited_ids` calculation will result in [[1]], which is an invalid IDs list (the expected is [[4]]) | ||
| # The updated code doesn't make assumptions about how many columns are selected or their order, but fetches the values according | ||
| # to the primary key column names | ||
| limited_ids = select_all(limited.arel, "SQL").to_ary.map do |result| | ||
| result.values_at(*Array(relation.primary_key)) | ||
| end | ||
|
|
||
| if limited_ids.empty? | ||
| relation.none! | ||
| else | ||
| relation.where!(**Array(relation.primary_key).zip(limited_ids.transpose).to_h) | ||
| end | ||
|
|
||
| relation.limit_value = relation.offset_value = nil | ||
| relation | ||
| end | ||
| end |
There was a problem hiding this comment.
This was added with our Rails 7.1 upgrade. But in the respective oracle_enhanced-adapter version, the Oracle 11 visitor was mistakenly hardcoded.
With that visitor, limits to the query are added with adding a row num to the queries, or maybe it adds something even without limit, not sure. This confuses ActiveRecord distinct_relation_for_primary_key and this is IMO a nice better implementation that should go upstream to Rails.
The point for removing it is that with the Oracle12 visitor we started to use, this doesn't seem to be a problem. So I'm removing it, although it is perfectly fine to also stay. But I prefer to remove it to avoid future Rails upgrade issue.
The way to reproduce the issue is by going back to the old Oracle11 listener by altering supports_fetch_first_n_rows_and_offset? above to false and then run with oracle features/buyers/accounts/service_contracts/index.feature#Ordering and filtering by service
Apparently with that method OR with the Oracle 12 listener the test pass. But let me know if you think it is safer to keep this monkey patch. I definitely like the implementation.
Also @mayorova would you submit this nice change to Rails? Or if you don't have time/desire for this, let me know and I can do.
There was a problem hiding this comment.
OK, so, just to make sure I understand this correctly:
The issue with the default implementation of distinct_relation_for_primary_key happens only for SQL queries generated by the Oracle11 visitor, which was accidentally forced in the oracle-enhanced adapter. In the "modern" syntax, which is generated by Oracle12 visitor (and which serves us well, because we don't support any old versions of Oracle anyway - only 19), fixing distinct_relation_for_primary_key is not required.
And even when we upgrade to the version that includes https://github.com/rsim/oracle-enhanced/pull/2573/changes, and remove the patch
def supports_fetch_first_n_rows_and_offset?
true
end
we'll still stay on Oracle12 visitor, because we don't support older Oracle anyway.
I'm wondering though - for the condition o.limit && o.lock Oracle11 is still forced (I think it's the only use case?). So, could this kind of select potentially call distinct_relation_for_primary_key still ? And hence still be broken, if we remove this modified version?
There was a problem hiding this comment.
I asked Bob about it, see his answer below.
We probably don't have such queries currently in the code base, but who knows - maybe they will appear at some point...
By the way, the suggested test currently throws:
Minitest::UnexpectedError: ActiveRecord::StatementInvalid: OCIError: ORA-00907: missing right parenthesis
test/unit/oracle_hacks_test.rb:50:in `block in <class:CollectionsTest>'
test/unit/oracle_hacks_test.rb:50:in `block in <class:CollectionsTest>'
which, if I remember correctly is NOT what overriding distinct_relation_for_primary_key was fixing 🙃 (there were no errors - just 0 results when > 0 rows were expected).
🚨 Critical Bug Analysis: distinct_relation_for_primary_key Removal
Summary
The removal of the distinct_relation_for_primary_key override creates a critical bug when combined with the Oracle11 visitor fallback for LIMIT+LOCK queries.
The Bug Scenario
When distinct_relation_for_primary_key is Called
From Rails source (active_record/relation/finder_methods.rb):
def apply_join_dependency(eager_loading: group_values.empty?)
# ...
if eager_loading && has_limit_or_offset? && !(using_limitable_reflections?)
relation = klass.connection.distinct_relation_for_primary_key(relation)
end
endThis method is called when:
- Eager loading associations (
.includes()or.eager_load()) - Has LIMIT or OFFSET
- Associations are not "limitable" (complex joins that can't be optimized)
The Problem Query
A query like this WILL trigger the bug:
Account.includes(:users)
.where(state: 'approved')
.order(:created_at)
.limit(10)
.lock # <-- LOCK added!
.to_aWhat Happens (Step by Step)
- ✅ Query has
limit+lock→ Oracle12 visitor detects this condition - ✅ Switches to Oracle11 visitor via
visit_Arel_Nodes_SelectStatement - ✅ SQL generated with ROWNUM syntax (not FETCH FIRST)
- ❌ Rails calls
distinct_relation_for_primary_key(because of eager loading + limit) - ❌ The removed override is NOT there to handle the ROWNUM-based SQL
- 💥 The default Rails implementation expects FETCH FIRST syntax and will FAIL!
Why This Wasn't Caught
The removed distinct_relation_for_primary_key override was specifically designed to handle Oracle's quirks with result set parsing. When you switch to Oracle11 visitor (ROWNUM), the SQL structure changes fundamentally, but there's no corresponding distinct_relation_for_primary_key implementation for that case.
The Evidence
From the removed code comments:
# The change is needed because in Oracle, because otherwise the resulting
# `limited_ids` array would be wrong. For example, for a ServiceContract
# model that has the primary key "id" and a query with ordering and
# filtering you may get:
# 1. `limited` variable: #<ActiveRecord::Relation [#<ServiceContract alias_0__: "live", id: 4, raw_rnum_: 1>]>
# 2. `select_rows` would return [["live",4,1]] # <-- Note the raw_rnum_ column!
# 3. `limited_ids` calculation will result in [[1]], which is WRONG (expected [[4]])Key insight: The ROWNUM syntax adds extra columns (raw_rnum_, alias_0__) that the default Rails code doesn't know how to handle!
SQL Comparison
Oracle12 (FETCH FIRST) - What Rails expects:
SELECT * FROM accounts
ORDER BY created_at
FETCH FIRST 10 ROWS ONLY
FOR UPDATEOracle11 (ROWNUM) - What we actually generate with LOCK:
SELECT * FROM (
SELECT accounts.*, ROWNUM as raw_rnum_
FROM accounts
ORDER BY created_at
) WHERE raw_rnum_ <= 10
FOR UPDATEThe extra raw_rnum_ column breaks the default distinct_relation_for_primary_key logic!
Recommendations
This code removal is UNSAFE. You need to either:
Option 1: Keep the Override (Recommended)
Restore the distinct_relation_for_primary_key override - it's still needed when Oracle11 visitor is used.
Option 2: Prevent the Scenario
Add logic to prevent .lock on queries with eager loading + limit:
def visit_Arel_Nodes_SelectStatement(o, collector)
if o.limit && o.lock
# Check if this might trigger distinct_relation_for_primary_key
# and raise an error or handle it differently
@oracle11_visitor ||= Arel::Visitors::Oracle.new(@connection)
return @oracle11_visitor.accept(o.dup, collector)
end
super
endOption 3: Wait for Upstream Fix
Don't remove the code until oracle-enhanced 8.1.4+ is deployed AND verified to fix both:
- Restore Oracle12 visitor with limit+lock ROWNUM fallback rsim/oracle-enhanced#2573
- Add composite primary key support rsim/oracle-enhanced#2693
Suggested Test Case
test "eager loading with limit and lock should work" do
skip "Oracle-specific test" unless System::Database.oracle?
account = FactoryBot.create(:simple_account)
FactoryBot.create_list(:simple_user, 5, account: account)
# This should not raise an error or return wrong results
result = Account.includes(:users)
.where(id: account.id)
.order(:created_at)
.limit(3)
.lock
.to_a
assert_equal 1, result.size
assert_equal account.id, result.first.id
# Verify users were actually eager loaded (no N+1 queries)
assert result.first.association(:users).loaded?
endConclusion
Verdict: The removal of distinct_relation_for_primary_key creates a latent bug when:
- Eager loading associations
- With LIMIT/OFFSET
- With LOCK
- On complex joins
This is a rare but critical edge case that will cause incorrect query results or errors in production.
There was a problem hiding this comment.
This is what I wanted to capture in the comment. It might be possible to have a query that fails but we don't. We can generate many queries that will fail without various upstream fixes or things without existing fixes.
The fact is that we don't have such queries right now. If we introduce such queries, they should fail on oracle so we will know to either fix the queries or the adapter (in this case actually active record).
So I prefer not to have this redundant workaround in the codebase until we actually need it. Rather than focus on fixing upstream for a long-term safety.
I think bob can give a similar answer if you ask it whether we should backport all 8.1.4 upstream fixes to our monkey patch or not.
| # remove when addressed: https://github.com/rsim/oracle-enhanced/pull/2247 - included in v7.1.0 | ||
| def visit_Arel_Nodes_Matches o, collector | ||
| if !o.case_sensitive && o.left && o.right | ||
| o.left = Arel::Nodes::NamedFunction.new('UPPER', [o.left]) | ||
| o.right = Arel::Nodes::NamedFunction.new('UPPER', [o.right]) | ||
| end |
There was a problem hiding this comment.
this is already live in oracle adapter 7.1 so we can get rid of it 🎉
jlledom
left a comment
There was a problem hiding this comment.
Always happy to remove patches! Also, the Oracle 11 visitor being forced by the adapter was in fact something that needed addressing. Thanks for working on this!
| # Fixes ORA-00932: inconsistent datatypes: expected - got CLOB | ||
| # remove when https://github.com/rsim/oracle-enhanced/pull/2415 is merged | ||
| def visit_Arel_Nodes_Equality(o, collector) | ||
| left = o.left |
There was a problem hiding this comment.
The original method is the same except the if right guard. Is that what was causing the error when calling schema dump? because if not, we could just remove this patch and use the original method.
https://github.com/rsim/oracle-enhanced/blob/v7.1.1/lib/arel/visitors/oracle_common.rb#L10-L20
The purpose of this patch, which according to the comment is to handle the DBMS_LOB.COMPARE issue, is in fact fixed upstream.
| def visit_Arel_Nodes_SelectStatement(o, collector) | ||
| if o.limit && o.lock | ||
| @oracle11_visitor ||= Arel::Visitors::Oracle.new(@connection) | ||
| return @oracle11_visitor.accept(o.dup, collector) | ||
| end | ||
| super | ||
| end |
There was a problem hiding this comment.
Maybe add a test for this?
There was a problem hiding this comment.
We have failing tests without this, that's why I had to backport it. I don't have a low-level reproducer and think there is no point in spending time on finding one. I'm mostly interested in fixing actual calls that we make in porta, not backport all possible upstream fixes.
There was a problem hiding this comment.
Wouldn't a simple Model.lock.limit(1).first raise an exception?
There was a problem hiding this comment.
The actual issue I couldn't replicate better than existing tests was
ArgumentError: Combination of limit and lock is not supported. Because generated SQL statements
SELECT FOR UPDATE and FETCH FIRST n ROWSgenerates ORA-02014.
| @odbc_dsn ||= "DSN=oracle;Driver={Oracle-Driver};Dbq=#{conf[:host]}:#{conf[:port] || 1521}/#{conf[:database]};Uid=#{conf[:username]};Pwd=#{conf[:password]}" | ||
| end | ||
| end | ||
| end) |
There was a problem hiding this comment.
We might not need the ts-datetime-delta gem then.
| # uncomment the commented our lines for https://github.com/rsim/oracle-enhanced/pull/2485 | ||
| # if you uncomment them and things don't fail, then they are taking effect | ||
| ActiveRecord::Base.skip_callback(:update, :after, :enhanced_write_lobs) | ||
| # ActiveRecord::Base.skip_callback(:create, :after, :enhanced_write_lobs) | ||
| ActiveRecord::Base.skip_callback(:update, :before, :record_changed_lobs) | ||
| # ActiveRecord::Base.skip_callback(:create, :before, :record_lobs_for_create) |
There was a problem hiding this comment.
About the commented lines, why are they commented? Why can they be uncommented for :update but not for :insert?
Also, I don't understand your comment: "if you uncomment them and things don't fail, then they are taking effect", mmm... if you uncomment them and things don't fail, doesn't that mean the callback was in fact doing nothing? and how do they "take effect"? What the lines do is removing a callback, so preventing it to take effect.
There was a problem hiding this comment.
These callbacks are redundant and long version is in the upstream pull request.
Future upstream versions have the callbacks for create as well update so now we cannot uncomment but later we can and should.
There was a problem hiding this comment.
I update d the comment with e2bfd15, hope it makes more sense now.
Various fixes and cleanup for 0racle in Rails 7.1