-
Notifications
You must be signed in to change notification settings - Fork 706
PostgreSqlDialect accepts large amounts of non-PostgreSQL syntax #2237
Copy link
Copy link
Open
Description
While building a parser correctness benchmark using libpg_query (pg_query.rs) as the PostgreSQL ground truth, we measured how often PostgreSqlDialect accepts SQL that real PostgreSQL rejects. The numbers are surprisingly high.
Against SQL extracted from the sqlparser-rs test suite itself:
- 28.7% of statements rejected by pg_query are silently accepted by
PostgreSqlDialect(37/129, PostgreSQL-specific test file) - 30.0% in the broader common-dialect test file (141/470)
We understand sqlparser-rs is intentionally permissive. The question is: is this level of permissiveness intentional for PostgreSqlDialect, or is it leakage that would be worth tightening?
Examples of what PostgreSqlDialect currently accepts
A selection from the 141 cases found, grouped by the dialect the syntax originates from:
-- Oracle
FETCH NEXT IN my_cursor INTO result_table -- INTO clause on FETCH
-- SQL Server / T-SQL
SELECT TOP 3 * FROM tbl
EXEC my_proc N'param'
MERGE … OUTPUT inserted.* INTO log_target
EXECUTE FUNCTION f -- trigger EXECUTE without ()
-- MySQL / MariaDB
INSERT customer VALUES (1, 2, 3) -- missing INTO
INSERT OR REPLACE INTO t (id) VALUES(1)
DROP FUNCTION IF EXISTS f(a INTEGER, IN b INTEGER = 1) -- defaults in DROP
-- Snowflake / BigQuery
SELECT i FROM qt QUALIFY ROW_NUMBER() OVER (...) = 1
CREATE OR REPLACE TABLE t (a INT)
CREATE OR REPLACE USER IF NOT EXISTS u1 PASSWORD='secret'
-- ClickHouse
ALTER TABLE t ON CLUSTER my_cluster ADD CONSTRAINT bar PRIMARY KEY (baz)
-- HiveQL
ALTER TABLE t SET TBLPROPERTIES('classification' = 'parquet')
-- Unclear origin / possibly over-permissive parsing
ALTER TABLE t ALTER COLUMN id ADD GENERATED AS IDENTITY -- missing ALWAYS/BY DEFAULT
COPY t FROM 'f.csv' BINARY DELIMITER ',' CSV HEADER -- mutually exclusive formats
SHOW search_path search_path -- duplicate trailing tokenHappy to help with PRs if the direction is clear.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels
Type
Fields
Give feedbackNo fields configured for issues without a type.