Skip to content

[BUG] Restrictive regex when parsing insert statement for determining batched inserts #1398

@cliffsun-anaplan

Description

@cliffsun-anaplan

Describe the bug
When EnableBatchedInserts=1 is set we've noticed that the inserts are not batched if our INSERT statement references a table name (or schema name if schema-qualified) that technically contains valid characters (in a quoted / delimited identifier) but that don't match the regex in InsertStatementParser.INSERT_DETAILS_PATTERN.

Example:
INSERT INTO `my-table` (`id`, `name`, `value`) VALUES (?, ?, ?) does not pass the regex match due to the hyphen in my-table.

The regex in question is ^\s*INSERT\s+INTO\s+([\w`\.]+)\s*\(([^)]+)\)\s+VALUES\s*\( - in particular the first capture group ([\w`\.]+).
This capture group regex does allow for backticks (and therefore delimited identifiers) but doesn't allow for the set of characters that would be valid within a delimited identifier.

In addition here is the code that calls it:

  private boolean canUseBatchedInsert() {
    // Check if batched inserts are enabled via connection property
    if (!connection.getConnectionContext().isBatchedInsertsEnabled()) {
      return false;
    }

    // Use strict exception-based parsing for better error handling
    try {
      InsertStatementParser.parseInsertStrict(sql);
      return true;
    } catch (Exception e) {
      // Not a valid INSERT statement suitable for batching
      return false;
    }
  }

The exception thrown by parseInsertStrict(sql) is discarded and not logged out at all, which would have helped with diagnosing this issue.

To Reproduce
Try to execute an INSERT statement against a table name that contains a character that is not a letter, digit or underscore (i.e. a quoted / delimited identifier), with EnableBatchedInserts=1 - you will see that you end up with individual inserts and therefore much slower performance.

Expected behavior
It should allow for batched inserts if delimited identifiers are used for schema / table names.

Screenshots
If applicable, add screenshots to help explain your problem.

Client side logs
If available, share redacted client side logs

Client Environment (please complete the following information):

  • OS: any
  • Java version [e.g. Java 21]
  • Java vendor [e.g. OpenJDK]
  • Driver Version: latest
  • BI Tool (if used) [e.g. DBeaver]
  • BI Tool version (if applicable) [e.g. 24.3.5]

Additional context
Add any other context about the problem here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions