Skip to content

Commit 9844c4c

Browse files
authored
Fix client-side maxRows enforcement in ResultSet.next() (#1448)
## Summary Add client-side `maxRows` enforcement in `DatabricksResultSet.next()`. When `statement.setMaxRows()` is set, `next()` returns `false` once the row limit is reached, even if the server returns more rows. Applies uniformly to all result types (Thrift, SEA, inline, CloudFetch). ### Problem The server-side `row_limit`/`resultRowLimit` is not always honored — the server may return more rows than requested. Some `IExecutionResult` implementations (`LazyThriftResult`, `StreamingColumnarResult`, `StreamingInlineArrowResult`) enforce maxRows internally, but `InlineJsonResult` and `ArrowStreamResult` do not, leaving a gap. ### Fix Single enforcement point at `DatabricksResultSet.next()`: - `maxRowsLimit` field cached from `parentStatement.getMaxRows()` at construction - `rowsReturned` counter incremented on each successful `next()` - Check `maxRowsLimit > 0 && rowsReturned >= maxRowsLimit` before delegating to `executionResult.next()` - `maxRows == 0` means no limit (JDBC convention) - `getUpdateCount()` internal iteration bypasses the limit via `countingUpdateRows` flag Existing per-implementation maxRows enforcement retained as defense-in-depth. ## Test plan - [x] `testNextRespectsMaxRows` — limit=3, 4th next() returns false - [x] `testNextMaxRowsZeroNoLimit` — no truncation with limit=0 - [x] `testNextMaxRowsNullParentNoLimit` — null parent = no limit - [x] `testNextMaxRowsOneEdge` — limit=1 edge case - [x] `testNextMaxRowsDoesNotCallExecutionResultAfterLimit` — verify no unnecessary delegation - [x] `testNextMaxRowsWithEmptyResultSet` — empty result + maxRows - [x] `testNextMaxRowsGreaterThanActualRows` — natural exhaustion before limit - [x] `testNextMaxRowsIdempotenceAfterLimit` — repeated calls after limit - [x] `testGetUpdateCountBypassesMaxRows` — DML counting unaffected - [x] Independent verification: 7/7 PASS (Agent C, black-box from spec) - [x] Full suite: 62 tests pass This pull request was AI-assisted by Isaac. --------- Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
1 parent 7de2cf4 commit 9844c4c

12 files changed

Lines changed: 269 additions & 316 deletions

NEXT_CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,7 @@ upgrading. These changes do not affect metadata on All-Purpose Clusters.
5555
- Fixed `?` characters inside SQL comments, string literals, and quoted identifiers being incorrectly counted as parameter placeholders when `supportManyParameters=1`. `SQLInterpolator` now uses `SqlCommentParser` to locate only real placeholders. Fixes #1331.
5656
- Fixed `MetadataOperationTimeout` not being applied when metadata operations use SHOW commands. Operations like `getTables`, `getSchemas`, and `getColumns` now respect the `MetadataOperationTimeout` connection property instead of hanging indefinitely with no timeout.
5757
- Reclassify transient server errors to standard SQL states (08S01, 40001) across all Thrift error sites. This ensures UC unavailability and concurrent modification errors surface consistently for better retry handling. Note: Dashboards and branching logic keyed on legacy XXUCC or 42000 must be updated.
58+
- Fixed client-side enforcement of `maxRows` limit. When `statement.setMaxRows()` is set, `ResultSet.next()` now returns false once the row limit is reached, even if the server returns more rows. Applies to all result types (Thrift, SEA, inline, CloudFetch).
5859

5960
---
6061
*Note: When making changes, please add your change under the appropriate section

src/main/java/com/databricks/jdbc/api/impl/DatabricksResultSet.java

Lines changed: 86 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,16 @@ enum ResultSetType {
8282
// for the lifetime of a result set.
8383
private final TelemetryCollector cachedTelemetryCollector;
8484

85+
// Client-side maxRows enforcement. This central check in next() is the single
86+
// enforcement point using the full long precision from getLargeMaxRows().
87+
// No per-impl maxRows enforcement exists — all implementations delegate to this check.
88+
private final long maxRowsLimit;
89+
private long rowsReturned = 0;
90+
private boolean truncatedByMaxRows = false; // tracks client-side truncation for cursor methods
91+
// Flag to bypass maxRows check during getUpdateCount() internal iteration,
92+
// which calls next() to sum affected-row counts for DML statements.
93+
private boolean countingUpdateRows = false;
94+
8595
// Constructor for SEA result set
8696
public DatabricksResultSet(
8797
StatementStatus statementStatus,
@@ -121,6 +131,7 @@ public DatabricksResultSet(
121131
this.updateCount = null;
122132
this.parentStatement = parentStatement;
123133
this.cachedTelemetryCollector = resolveTelemetryCollector(parentStatement);
134+
this.maxRowsLimit = resolveMaxRowsLimit(parentStatement);
124135
this.isClosed = false;
125136
this.wasNull = false;
126137
}
@@ -142,6 +153,7 @@ public DatabricksResultSet(
142153
this.updateCount = null;
143154
this.parentStatement = parentStatement;
144155
this.cachedTelemetryCollector = resolveTelemetryCollector(parentStatement);
156+
this.maxRowsLimit = resolveMaxRowsLimit(parentStatement);
145157
this.isClosed = false;
146158
this.wasNull = false;
147159
this.complexDatatypeSupport = complexDatatypeSupport;
@@ -189,6 +201,7 @@ public DatabricksResultSet(
189201
this.updateCount = null;
190202
this.parentStatement = parentStatement;
191203
this.cachedTelemetryCollector = resolveTelemetryCollector(parentStatement);
204+
this.maxRowsLimit = resolveMaxRowsLimit(parentStatement);
192205
this.isClosed = false;
193206
this.wasNull = false;
194207
}
@@ -220,6 +233,7 @@ public DatabricksResultSet(
220233
this.updateCount = null;
221234
this.parentStatement = null;
222235
this.cachedTelemetryCollector = null;
236+
this.maxRowsLimit = 0;
223237
this.isClosed = false;
224238
this.wasNull = false;
225239
}
@@ -251,6 +265,7 @@ public DatabricksResultSet(
251265
this.updateCount = null;
252266
this.parentStatement = null;
253267
this.cachedTelemetryCollector = null;
268+
this.maxRowsLimit = 0;
254269
this.isClosed = false;
255270
this.wasNull = false;
256271
}
@@ -271,14 +286,52 @@ public DatabricksResultSet(
271286
this.updateCount = null;
272287
this.parentStatement = null;
273288
this.cachedTelemetryCollector = null;
289+
this.maxRowsLimit = 0;
274290
this.isClosed = false;
275291
this.wasNull = false;
276292
}
277293

294+
/**
295+
* Advances the cursor to the next row of the result set. Returns {@code false} when no more rows
296+
* are available or when the client-side {@code maxRows} limit has been reached.
297+
*
298+
* <p>{@code rowsReturned} tracks how many rows have been delivered to the caller through this
299+
* ResultSet instance. It is only incremented during normal (non-DML-counting) iteration and
300+
* assumes single-threaded access, which is the standard JDBC contract.
301+
*
302+
* @return {@code true} if the new current row is valid; {@code false} if there are no more rows
303+
* @throws SQLException if the result set is closed or an error occurs
304+
*/
278305
@Override
279306
public boolean next() throws SQLException {
280307
checkIfClosed();
308+
// Client-side maxRows truncation: stop before delegating to the underlying result
309+
// implementation when the limit has been reached. This is skipped during
310+
// getUpdateCount() internal iteration (countingUpdateRows) to avoid breaking DML
311+
// row counting. This is the single maxRows enforcement point for all implementations.
312+
if (maxRowsLimit > 0 && rowsReturned >= maxRowsLimit && !countingUpdateRows) {
313+
LOGGER.debug(
314+
"maxRows limit ({}) reached for statement {}; returning false after {} rows",
315+
maxRowsLimit,
316+
statementId,
317+
rowsReturned);
318+
if (!truncatedByMaxRows) {
319+
truncatedByMaxRows = true;
320+
// Record telemetry for truncated queries so dashboards reflect the truncation
321+
if (cachedTelemetryCollector != null) {
322+
cachedTelemetryCollector.recordResultSetIteration(
323+
statementId.toSQLExecStatementId(), resultSetMetaData.getChunkCount(), false);
324+
}
325+
}
326+
return false;
327+
}
281328
boolean hasNext = this.executionResult.next();
329+
// Only count rows for customer iteration, not internal DML counting
330+
// (getUpdateCount() sets countingUpdateRows=true to iterate over affected-row counts
331+
// without inflating the user-visible row counter).
332+
if (hasNext && !countingUpdateRows) {
333+
rowsReturned++;
334+
}
282335
if (cachedTelemetryCollector != null) {
283336
cachedTelemetryCollector.recordResultSetIteration(
284337
statementId.toSQLExecStatementId(), resultSetMetaData.getChunkCount(), hasNext);
@@ -321,6 +374,20 @@ private static TelemetryCollector resolveTelemetryCollector(
321374
return null;
322375
}
323376

377+
private static long resolveMaxRowsLimit(IDatabricksStatementInternal parentStatement) {
378+
try {
379+
if (parentStatement != null) {
380+
// Use getLargeMaxRows() to preserve full long precision.
381+
// getMaxRows() returns int and silently truncates values > Integer.MAX_VALUE.
382+
return parentStatement.getLargeMaxRows();
383+
}
384+
} catch (SQLException e) {
385+
// Narrow to SQLException (the only checked exception from getLargeMaxRows).
386+
LOGGER.warn("Error resolving maxRows limit: {}", e.getMessage());
387+
}
388+
return 0;
389+
}
390+
324391
@Override
325392
public boolean wasNull() throws SQLException {
326393
checkIfClosed();
@@ -662,7 +729,9 @@ public boolean isBeforeFirst() throws SQLException {
662729
@Override
663730
public boolean isAfterLast() throws SQLException {
664731
checkIfClosed();
665-
return executionResult.getCurrentRow() >= resultSetMetaData.getTotalRows();
732+
// Account for client-side maxRows truncation
733+
return truncatedByMaxRows
734+
|| executionResult.getCurrentRow() >= resultSetMetaData.getTotalRows();
666735
}
667736

668737
@Override
@@ -691,6 +760,11 @@ public boolean isFirst() throws SQLException {
691760
@Override
692761
public boolean isLast() throws SQLException {
693762
checkIfClosed();
763+
// Account for client-side maxRows truncation: if the next next() call would
764+
// hit the limit, this is the last row the caller will see.
765+
if (maxRowsLimit > 0 && rowsReturned >= maxRowsLimit) {
766+
return true;
767+
}
694768
if (executionResult instanceof LazyThriftResult
695769
|| executionResult instanceof StreamingColumnarResult
696770
|| executionResult instanceof LazyThriftInlineArrowResult
@@ -731,6 +805,10 @@ public boolean last() throws SQLException {
731805
@Override
732806
public int getRow() throws SQLException {
733807
checkIfClosed();
808+
// JDBC spec: getRow() returns 0 when cursor is not on a valid row (after last)
809+
if (truncatedByMaxRows) {
810+
return 0;
811+
}
734812
return (int) executionResult.getCurrentRow() + 1;
735813
}
736814

@@ -1958,8 +2036,13 @@ public long getUpdateCount() throws SQLException {
19582036
updateCount = 0L;
19592037
} else if (hasUpdateCount()) {
19602038
long rowsUpdated = 0;
1961-
while (next()) {
1962-
rowsUpdated += this.getLong(AFFECTED_ROWS_COUNT);
2039+
countingUpdateRows = true;
2040+
try {
2041+
while (next()) {
2042+
rowsUpdated += this.getLong(AFFECTED_ROWS_COUNT);
2043+
}
2044+
} finally {
2045+
countingUpdateRows = false;
19632046
}
19642047
updateCount = rowsUpdated;
19652048
} else {

src/main/java/com/databricks/jdbc/api/impl/LazyThriftResult.java

Lines changed: 0 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11
package com.databricks.jdbc.api.impl;
22

3-
import static com.databricks.jdbc.common.EnvironmentVariables.DEFAULT_RESULT_ROW_LIMIT;
43
import static com.databricks.jdbc.common.util.DatabricksThriftUtil.createColumnarView;
54

65
import com.databricks.jdbc.api.internal.IDatabricksSession;
@@ -21,7 +20,6 @@ public class LazyThriftResult implements IExecutionResult {
2120
private long globalRowIndex;
2221
private final IDatabricksSession session;
2322
private final IDatabricksStatementInternal statement;
24-
private final int maxRows;
2523
private boolean hasReachedEnd;
2624
private boolean isClosed;
2725
private long totalRowsFetched;
@@ -42,7 +40,6 @@ public LazyThriftResult(
4240
this.currentResponse = initialResponse;
4341
this.statement = statement;
4442
this.session = session;
45-
this.maxRows = statement != null ? statement.getMaxRows() : DEFAULT_RESULT_ROW_LIMIT;
4643
this.globalRowIndex = -1;
4744
this.currentBatchIndex = -1;
4845
this.hasReachedEnd = false;
@@ -115,13 +112,6 @@ public boolean next() throws SQLException {
115112
return false;
116113
}
117114

118-
// Check if we've reached the maxRows limit
119-
boolean hasRowLimit = maxRows > 0;
120-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
121-
hasReachedEnd = true;
122-
return false;
123-
}
124-
125115
// Move to next row in current batch
126116
currentBatchIndex++;
127117
globalRowIndex++;
@@ -163,12 +153,6 @@ public boolean hasNext() {
163153
return false;
164154
}
165155

166-
// Check maxRows limit
167-
boolean hasRowLimit = maxRows > 0;
168-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
169-
return false;
170-
}
171-
172156
// Check if there are more rows in current batch
173157
if (currentBatchIndex + 1 < currentBatch.getRowCount()) {
174158
return true;

src/main/java/com/databricks/jdbc/api/impl/arrow/LazyThriftInlineArrowResult.java

Lines changed: 0 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11
package com.databricks.jdbc.api.impl.arrow;
22

3-
import static com.databricks.jdbc.common.EnvironmentVariables.DEFAULT_RESULT_ROW_LIMIT;
43
import static com.databricks.jdbc.common.util.ArrowUtil.createArrowByteStream;
54
import static com.databricks.jdbc.common.util.ArrowUtil.getColumnInfoList;
65
import static com.databricks.jdbc.common.util.ArrowUtil.getSerializedSchema;
@@ -35,7 +34,6 @@ public class LazyThriftInlineArrowResult implements IExecutionResult {
3534
private long globalRowIndex;
3635
private final IDatabricksSession session;
3736
private final IDatabricksStatementInternal statement;
38-
private final int maxRows;
3937
private boolean hasReachedEnd;
4038
private boolean isClosed;
4139
private long totalRowsFetched;
@@ -58,7 +56,6 @@ public LazyThriftInlineArrowResult(
5856
this.currentResponse = initialResponse;
5957
this.statement = statement;
6058
this.session = session;
61-
this.maxRows = statement != null ? statement.getMaxRows() : DEFAULT_RESULT_ROW_LIMIT;
6259
this.globalRowIndex = -1;
6360
this.hasReachedEnd = false;
6461
this.isClosed = false;
@@ -163,13 +160,6 @@ public boolean next() throws SQLException {
163160
return false;
164161
}
165162

166-
// Check if we've reached the maxRows limit
167-
boolean hasRowLimit = maxRows > 0;
168-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
169-
hasReachedEnd = true;
170-
return false;
171-
}
172-
173163
// Try to advance in current chunk
174164
if (currentChunkIterator != null && currentChunkIterator.hasNextRow()) {
175165
boolean advanced = currentChunkIterator.nextRow();
@@ -209,12 +199,6 @@ public boolean hasNext() {
209199
return false;
210200
}
211201

212-
// Check maxRows limit
213-
boolean hasRowLimit = maxRows > 0;
214-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
215-
return false;
216-
}
217-
218202
// Check if there are more rows in current chunk
219203
if (currentChunkIterator != null && currentChunkIterator.hasNextRow()) {
220204
return true;

src/main/java/com/databricks/jdbc/api/impl/arrow/StreamingInlineArrowResult.java

Lines changed: 1 addition & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11
package com.databricks.jdbc.api.impl.arrow;
22

3-
import static com.databricks.jdbc.common.EnvironmentVariables.DEFAULT_RESULT_ROW_LIMIT;
43
import static com.databricks.jdbc.common.EnvironmentVariables.DEFAULT_STREAMING_BATCH_TIMEOUT_SECONDS;
54
import static com.databricks.jdbc.common.util.ArrowUtil.getColumnInfoList;
65

@@ -54,7 +53,6 @@ public class StreamingInlineArrowResult implements IExecutionResult {
5453

5554
// Metadata
5655
private List<ColumnInfo> columnInfos;
57-
private final int maxRows;
5856

5957
// State
6058
private boolean hasReachedEnd;
@@ -78,8 +76,6 @@ public StreamingInlineArrowResult(
7876
throws DatabricksSQLException {
7977

8078
this.session = session;
81-
this.maxRows = statement != null ? statement.getMaxRows() : DEFAULT_RESULT_ROW_LIMIT;
82-
8379
this.globalRowIndex = -1;
8480
this.hasReachedEnd = false;
8581
this.isClosed = false;
@@ -105,9 +101,8 @@ public StreamingInlineArrowResult(
105101
}
106102

107103
LOGGER.debug(
108-
"StreamingInlineArrowResult initialized - firstBatchRows={}, maxRows={}, maxBatchesInMemory={}",
104+
"StreamingInlineArrowResult initialized - firstBatchRows={}, maxBatchesInMemory={}",
109105
currentBatch != null ? currentBatch.getRowCount() : 0,
110-
maxRows,
111106
session.getConnectionContext().getThriftMaxBatchesInMemory());
112107
}
113108

@@ -184,13 +179,6 @@ public boolean next() throws DatabricksSQLException {
184179
return false;
185180
}
186181

187-
// Check maxRows limit
188-
boolean hasRowLimit = maxRows > 0;
189-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
190-
hasReachedEnd = true;
191-
return false;
192-
}
193-
194182
globalRowIndex++;
195183

196184
// Try to move to next row in current chunk
@@ -243,12 +231,6 @@ public boolean hasNext() {
243231
return false;
244232
}
245233

246-
// Check maxRows limit
247-
boolean hasRowLimit = maxRows > 0;
248-
if (hasRowLimit && globalRowIndex + 1 >= maxRows) {
249-
return false;
250-
}
251-
252234
// Check current chunk
253235
if (currentChunkIterator != null && currentChunkIterator.hasNextRow()) {
254236
return true;

0 commit comments

Comments
 (0)