fix(connectors): handle unrecognized types with explicit coverage and diagnostics#3196
Conversation
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (36.04%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## master #3196 +/- ##
=============================================
- Coverage 74.10% 19.23% -54.87%
Complexity 943 943
=============================================
Files 1159 1157 -2
Lines 102033 90511 -11522
Branches 79083 67579 -11504
=============================================
- Hits 75607 17414 -58193
- Misses 23765 72678 +48913
+ Partials 2661 419 -2242
🚀 New features to boost your workflow:
|
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
/ready |
There was a problem hiding this comment.
The current code matches against wrong literals, which means that everything is coerced into a string.
Current literals (wrong):
- _BOOL
- _INT2
- _INT4
- _INT8
- _FLOAT4
- _FLOAT8
- _TEXT
- _VARCHAR
- _CHAR
- _BPCHAR
- _UUID
- _JSON
- _JSONB
Correct literals:
- BOOL[]
- INT2[]
- INT4[]
- INT8[]
- FLOAT4[]
- FLOAT8[]
- TEXT[]
- VARCHAR[]
- CHAR[]
- UUID[]
- JSON[]
- JSONB[]
There was a problem hiding this comment.
Also, the code treats BPCHAR as a possible option, but sqlx reports the display name CHAR[], so the BPCHAR-style array match can never fire.
| let value: Option<Vec<bool>> = row | ||
| .try_get(column_index) | ||
| .map_err(|_| Error::InvalidRecord)?; | ||
| Ok(value | ||
| .map(|arr| { | ||
| serde_json::Value::Array( | ||
| arr.into_iter().map(serde_json::Value::Bool).collect(), | ||
| ) | ||
| }) | ||
| .unwrap_or(serde_json::Value::Null)) | ||
| } | ||
| "_INT2" => { | ||
| let value: Option<Vec<i16>> = row |
There was a problem hiding this comment.
For arms decoding Option<Vec<T>>, the code assumes every element is non-null, but postgresql can contain null elements. Thus, if a null value is supplied to the current code, sqlx will fail to decode it.
There was a problem hiding this comment.
Something like Option<Vec<Option<T>>> could mitigate this.
slbotbm
left a comment
There was a problem hiding this comment.
Separately, the current code also does not handle the following types:
- TIMETZ
- TIMETZ[]
- DATE[]
- TIME[]
- TIMESTAMP[]
- TIMESTAMPTZ[]
- INTERVAL[]
| let total_secs = interval.microseconds / 1_000_000; | ||
| let remaining_us = (interval.microseconds % 1_000_000).unsigned_abs(); | ||
| let hours = total_secs / 3600; | ||
| let mins = (total_secs % 3600) / 60; | ||
| let secs = total_secs % 60; | ||
| if remaining_us != 0 { | ||
| parts.push(format!( | ||
| "{:02}:{:02}:{:02}.{:06}", | ||
| hours, mins, secs, remaining_us | ||
| )); |
There was a problem hiding this comment.
This part formats negative numbers incorrectly. For example, the above code would convert -1.5 seconds into 00:00:-1.500000
| "INT4" | "OID" => { | ||
| let value: Option<i32> = row | ||
| .try_get(column_index) | ||
| .map_err(|_| Error::InvalidRecord)?; |
There was a problem hiding this comment.
the code current treats OID as i32, but in postgres, that is u32, which can lead batch processing failures. I feel decoding this to an Oid type would be a good approach.
I also noticed you're decoding individual OID's, but not but not Vec<OID>
| } | ||
| "TIMESTAMP" | "TIMESTAMPTZ" => { | ||
| let value: Option<DateTime<Utc>> = row | ||
| .try_get(column_index) |
There was a problem hiding this comment.
though not related to your changes, in sqlx, TIMESTAMP is chrono::NaiveDateTime, while TIMESTAMPZ is chrono::DateTime<Utc>. If the user passes TIMESTAMP, the conversion will fail.
Which issue does this PR close?
Closes #3175
Rationale
Unrecognized Postgres types silently coerced to String, returning opaque
InvalidRecordon failure with no column or type context.What changed?
The default
_branch inextract_column_valueattempted to read any unrecognized Postgres type asOption<String>, failing silently withError::InvalidRecordand no indication of which column or type caused it.Added explicit handlers for DATE, TIME, INTERVAL, OID, NAME, BPCHAR, and 13 array types (_BOOL, _INT2, _INT4, _INT8, _FLOAT4, _FLOAT8, _TEXT, _VARCHAR, _CHAR, _BPCHAR, _UUID, _JSON, _JSONB). The default branch now logs a warning with column name and type before attempting String fallback, and returns
InvalidRecordValuewith a diagnostic message on failure.Local Execution
AI Usage