Skip to content

Commit 29e4267

Browse files
authored
fix: URL-encode special characters in connection string passwords (#597)
* fix: URL-encode special characters in connection string passwords Fixes #589 — stored passwords with special characters (`@`, `#`, `:`, `/`) cause cryptic authentication failures when used in URI-style connection strings because the userinfo section wasn't being percent-encoded. Root cause: - `sanitizeConnectionString()` existed in `normalize.ts` but was never wired into `normalizeConfig()` — connection strings with unencoded passwords went directly to drivers - The userinfo regex split on the *first* `@` instead of the *last*, so passwords containing `@` were silently mis-parsed and the function returned early thinking no encoding was needed Fix: - Wire `sanitizeConnectionString()` into `normalizeConfig()` as a single integration point so every URI-based driver benefits automatically - Fix the regex to use greedy matching for the userinfo portion and split on the last `@` before the host - Leave already-encoded passwords (`%XX` sequences) untouched for idempotency; decodeURIComponent wrapped in try/catch for safety Non-URI connection strings (Oracle TNS, key=value) and drivers using individual config fields (host/user/password) are unaffected — the fix only touches `connection_string`. Tests cover @/#/:/ in passwords, already-encoded passthrough, multiple URI schemes (postgresql://, mongodb://, mongodb+srv://), and the connectionString alias path. * fix: address code review findings on URI sanitizer Multi-model code review flagged three real bugs in sanitizeConnectionString: 1. CRITICAL — greedy regex corrupted URIs with '@' in query/path Example: postgresql://u:p@host/db?email=alice@example.com Rightmost '@' was in the query string, not the userinfo separator. The sanitizer treated host/db?email=alice as part of the password and percent-encoded it, producing a broken URI. 2. MAJOR — short-circuit on %XX skipped passwords with mixed encoding Example: postgresql://user:p%20ss@word@host/db Partially-encoded password contained both a valid %20 and a raw '@'. The 'already-encoded, leave alone' check bailed early and left '@' unencoded. 3. MINOR — username-only userinfo (no colon) was skipped entirely Example: postgresql://alice@example.com@host/db Email-as-username with unencoded '@' broke URI parsing because we returned early on missing colon. Changes: - Scan for the LAST '@' in the afterScheme portion directly instead of first parsing out an 'authority' segment. The URI spec disallows unencoded path/query/fragment delimiters in authority, but real-world passwords do contain them — so we honor user intent over spec. - Replace the %XX short-circuit + needsEncoding pre-check with a single idempotent encoder (decode → encode) applied to both user and password components. This round-trips already-encoded values unchanged and encodes raw values correctly in one pass. - Handle username-only userinfo by encoding it when no colon is present. - Add an ambiguity guard: when the rightmost '@' is followed by no path/query/fragment delimiter but preceded by one, the '@' is in the query/path/fragment — return the URI unchanged and expect the caller to pre-encode. Test coverage added for all three bugs plus IPv6 hosts, URIs with no port, URIs with no path, path-with-@, fragment-with-@, and the ambiguous both-password-and-query-have-@ case.
1 parent 8f52e99 commit 29e4267

3 files changed

Lines changed: 309 additions & 2 deletions

File tree

packages/drivers/src/index.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
export type { Connector, ConnectorResult, SchemaColumn, ConnectionConfig } from "./types"
33

44
// Re-export config normalization
5-
export { normalizeConfig } from "./normalize"
5+
export { normalizeConfig, sanitizeConnectionString } from "./normalize"
66

77
// Re-export driver connect functions
88
export { connect as connectPostgres } from "./postgres"

packages/drivers/src/normalize.ts

Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,95 @@ const DRIVER_ALIASES: Record<string, AliasMap> = {
111111
// duckdb and sqlite have simple configs — no aliases needed
112112
}
113113

114+
// ---------------------------------------------------------------------------
115+
// Connection string password encoding
116+
// ---------------------------------------------------------------------------
117+
118+
/**
119+
* URI-style connection strings (postgres://, mongodb://, etc.) embed
120+
* credentials in the userinfo section: `scheme://user:password@host/db`.
121+
* If the password contains special characters (@, #, :, /, ?, etc.) and
122+
* they are NOT percent-encoded, drivers will mis-parse the URI and fail
123+
* with cryptic auth errors.
124+
*
125+
* This function detects an unencoded password in the userinfo portion and
126+
* re-encodes it so the connection string is valid. Already-encoded
127+
* passwords (containing %XX sequences) are left untouched.
128+
*/
129+
export function sanitizeConnectionString(connectionString: string): string {
130+
// Only touch scheme://... URIs.
131+
const schemeMatch = connectionString.match(/^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//)
132+
if (!schemeMatch) return connectionString
133+
134+
const scheme = schemeMatch[0]
135+
const afterScheme = connectionString.slice(scheme.length)
136+
137+
// Find the userinfo/host separator. A password can legitimately contain
138+
// '@', '#', '/', '?', or ':' characters, so the URI spec is ambiguous
139+
// when those are unencoded. We use the LAST '@' as the separator because
140+
// that correctly handles the common case of passwords with special
141+
// characters (the stated purpose of this function — see issue #589).
142+
//
143+
// The known trade-off: if the query string or path also contains an
144+
// unencoded '@' (e.g. `?email=alice@example.com`), the rightmost '@'
145+
// is NOT the userinfo separator. We detect this ambiguous case below
146+
// with a guard and bail to avoid corrupting the URI.
147+
const lastAt = afterScheme.lastIndexOf("@")
148+
if (lastAt < 0) return connectionString // No userinfo — nothing to fix
149+
150+
const beforeAt = afterScheme.slice(0, lastAt)
151+
const afterAt = afterScheme.slice(lastAt + 1)
152+
153+
// Ambiguity guard: if the content AFTER the '@' has no path/query/
154+
// fragment delimiter ('/', '?', or '#') but the content BEFORE the
155+
// '@' does, then the '@' is almost certainly inside a path, query,
156+
// or fragment — not the userinfo separator. Bail and leave the URI
157+
// untouched so the caller can pre-encode explicitly.
158+
//
159+
// Examples that trigger the guard (correctly left alone):
160+
// postgresql://u:p@host/db?email=alice@example.com ('@' in query)
161+
// postgresql://u:p@host:5432/db@archive ('@' in path)
162+
// postgresql://u:p@host/db#at@frag ('@' in fragment)
163+
//
164+
// Examples that pass the guard (correctly encoded):
165+
// postgresql://u:p@ss@host/db (last '@' has '/' after)
166+
// postgresql://u:f@ke#PLACEHOLDER@host/db (last '@' has '/' after)
167+
const afterAtHasDelim = /[/?#]/.test(afterAt)
168+
const beforeAtHasDelim = /[/?#]/.test(beforeAt)
169+
if (!afterAtHasDelim && beforeAtHasDelim) {
170+
return connectionString
171+
}
172+
173+
// Idempotent re-encoding: decode any existing percent-escapes and
174+
// re-encode. Already-encoded values round-trip unchanged; raw values
175+
// with special characters get encoded. Malformed percent sequences
176+
// fall back to encoding the raw input.
177+
const encodeIdempotent = (v: string): string => {
178+
if (v.length === 0) return v
179+
try {
180+
return encodeURIComponent(decodeURIComponent(v))
181+
} catch {
182+
return encodeURIComponent(v)
183+
}
184+
}
185+
186+
// Split userinfo on the FIRST ':' only (password may contain ':').
187+
const colonIdx = beforeAt.indexOf(":")
188+
let encodedUserinfo: string
189+
if (colonIdx < 0) {
190+
// Username-only userinfo: still encode if it contains special chars
191+
// (e.g. email addresses used as usernames).
192+
encodedUserinfo = encodeIdempotent(beforeAt)
193+
} else {
194+
const user = beforeAt.slice(0, colonIdx)
195+
const password = beforeAt.slice(colonIdx + 1)
196+
encodedUserinfo = `${encodeIdempotent(user)}:${encodeIdempotent(password)}`
197+
}
198+
199+
const rebuilt = `${scheme}${encodedUserinfo}@${afterAt}`
200+
return rebuilt === connectionString ? connectionString : rebuilt
201+
}
202+
114203
// ---------------------------------------------------------------------------
115204
// Core logic
116205
// ---------------------------------------------------------------------------
@@ -178,6 +267,15 @@ export function normalizeConfig(config: ConnectionConfig): ConnectionConfig {
178267
const aliases = DRIVER_ALIASES[type]
179268
let result = aliases ? applyAliases(config, aliases) : { ...config }
180269

270+
// Sanitize connection_string: if the password contains special characters
271+
// (@, #, :, /, etc.) that are not percent-encoded, URI-based drivers will
272+
// mis-parse the string and fail with cryptic auth errors. This is the
273+
// single integration point — every caller of normalizeConfig() gets the
274+
// fix automatically.
275+
if (typeof result.connection_string === "string") {
276+
result.connection_string = sanitizeConnectionString(result.connection_string)
277+
}
278+
181279
// Type-specific post-processing
182280
// Note: MySQL SSL fields (ssl_ca, ssl_cert, ssl_key) are NOT constructed
183281
// into an ssl object here. They stay as top-level fields so the credential

packages/opencode/test/altimate/driver-normalize.test.ts

Lines changed: 210 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
import { describe, expect, test } from "bun:test"
2-
import { normalizeConfig } from "@altimateai/drivers"
2+
import { normalizeConfig, sanitizeConnectionString } from "@altimateai/drivers"
33
import { isSensitiveField } from "../../src/altimate/native/connections/credential-store"
44

55
// ---------------------------------------------------------------------------
@@ -947,3 +947,212 @@ describe("normalizeConfig — ClickHouse", () => {
947947
expect(result.ssl_key).toBeUndefined()
948948
})
949949
})
950+
951+
// ---------------------------------------------------------------------------
952+
// sanitizeConnectionString — special character encoding
953+
// ---------------------------------------------------------------------------
954+
955+
describe("sanitizeConnectionString", () => {
956+
test("encodes @ in password", () => {
957+
const input = "postgresql://testuser:t@st@localhost:5432/testdb"
958+
const result = sanitizeConnectionString(input)
959+
expect(result).toBe("postgresql://testuser:t%40st@localhost:5432/testdb")
960+
})
961+
962+
test("encodes # in password", () => {
963+
const input = "postgresql://testuser:test#val@localhost:5432/testdb"
964+
const result = sanitizeConnectionString(input)
965+
expect(result).toBe("postgresql://testuser:test%23val@localhost:5432/testdb")
966+
})
967+
968+
test("encodes : in password", () => {
969+
const input = "postgresql://testuser:test:val@localhost:5432/testdb"
970+
const result = sanitizeConnectionString(input)
971+
expect(result).toBe("postgresql://testuser:test%3Aval@localhost:5432/testdb")
972+
})
973+
974+
test("encodes multiple special characters", () => {
975+
const input = "postgresql://testuser:t@st#v:al@localhost:5432/testdb"
976+
const result = sanitizeConnectionString(input)
977+
expect(result).toBe("postgresql://testuser:t%40st%23v%3Aal@localhost:5432/testdb")
978+
})
979+
980+
test("encodes / in password", () => {
981+
const input = "postgresql://testuser:test/val@localhost:5432/testdb"
982+
const result = sanitizeConnectionString(input)
983+
expect(result).toBe("postgresql://testuser:test%2Fval@localhost:5432/testdb")
984+
})
985+
986+
test("encodes ? in password", () => {
987+
const input = "postgresql://testuser:test?val@localhost:5432/testdb"
988+
const result = sanitizeConnectionString(input)
989+
expect(result).toBe("postgresql://testuser:test%3Fval@localhost:5432/testdb")
990+
})
991+
992+
test("handles malformed percent sequence in username gracefully", () => {
993+
const input = "postgresql://bad%ZZuser:t@st@localhost:5432/testdb"
994+
const result = sanitizeConnectionString(input)
995+
// Should not throw — falls back to encoding the raw username
996+
expect(result).toContain("@localhost:5432/testdb")
997+
})
998+
999+
test("leaves already-encoded passwords untouched", () => {
1000+
const input = "postgresql://testuser:t%40st%23val@localhost:5432/testdb"
1001+
expect(sanitizeConnectionString(input)).toBe(input)
1002+
})
1003+
1004+
test("leaves passwords without special characters untouched", () => {
1005+
const input = "postgresql://testuser:simpletestval@localhost:5432/testdb"
1006+
expect(sanitizeConnectionString(input)).toBe(input)
1007+
})
1008+
1009+
test("leaves non-URI strings untouched", () => {
1010+
const input = "host=localhost dbname=mydb"
1011+
expect(sanitizeConnectionString(input)).toBe(input)
1012+
})
1013+
1014+
test("handles mongodb scheme", () => {
1015+
const input = "mongodb://testuser:t@st@localhost:27017/testdb"
1016+
const result = sanitizeConnectionString(input)
1017+
expect(result).toBe("mongodb://testuser:t%40st@localhost:27017/testdb")
1018+
})
1019+
1020+
test("handles mongodb+srv scheme", () => {
1021+
const input = "mongodb+srv://testuser:t@st@cluster.example.com/testdb"
1022+
const result = sanitizeConnectionString(input)
1023+
expect(result).toBe("mongodb+srv://testuser:t%40st@cluster.example.com/testdb")
1024+
})
1025+
1026+
test("leaves URIs without password untouched", () => {
1027+
const input = "postgresql://testuser@localhost:5432/testdb"
1028+
expect(sanitizeConnectionString(input)).toBe(input)
1029+
})
1030+
1031+
test("preserves @ in query string (does not misinterpret as userinfo)", () => {
1032+
const input =
1033+
"postgresql://testuser:simpleval@localhost:5432/testdb?contact=alice@example.com"
1034+
expect(sanitizeConnectionString(input)).toBe(input)
1035+
})
1036+
1037+
test("bails on ambiguous URIs where both password and query contain @", () => {
1038+
// When both the password and the query string contain unencoded '@',
1039+
// there's no way to deterministically pick the userinfo separator.
1040+
// We return the URI untouched and expect the caller to pre-encode.
1041+
const input =
1042+
"postgresql://testuser:p@ss@localhost:5432/testdb?contact=alice@example.com"
1043+
expect(sanitizeConnectionString(input)).toBe(input)
1044+
})
1045+
1046+
test("encodes @ in username-only userinfo (no password)", () => {
1047+
// Email-as-username with no password: the '@' in the email must be
1048+
// encoded so the driver doesn't treat the domain as the host.
1049+
const input = "postgresql://alice@example.com@localhost:5432/testdb"
1050+
const result = sanitizeConnectionString(input)
1051+
expect(result).toBe(
1052+
"postgresql://alice%40example.com@localhost:5432/testdb",
1053+
)
1054+
})
1055+
1056+
test("encodes @ in partially-encoded password (not short-circuited by %XX)", () => {
1057+
// Password contains an encoded space (%20) AND a raw '@'. Previous
1058+
// logic short-circuited on seeing %XX and left '@' unencoded,
1059+
// producing a broken URI.
1060+
const input = "postgresql://testuser:p%20ss@word@localhost:5432/testdb"
1061+
const result = sanitizeConnectionString(input)
1062+
expect(result).toBe(
1063+
"postgresql://testuser:p%20ss%40word@localhost:5432/testdb",
1064+
)
1065+
})
1066+
1067+
test("encodes # in partially-encoded password", () => {
1068+
const input = "postgresql://testuser:pa%40ss#word@localhost:5432/testdb"
1069+
const result = sanitizeConnectionString(input)
1070+
// %40 is preserved; raw '#' gets encoded to %23
1071+
expect(result).toBe(
1072+
"postgresql://testuser:pa%40ss%23word@localhost:5432/testdb",
1073+
)
1074+
})
1075+
1076+
test("handles malformed percent sequence in password gracefully", () => {
1077+
// '%ZZ' is not a valid percent-escape. Falls back to encoding raw.
1078+
const input = "postgresql://testuser:bad%ZZpass@localhost:5432/testdb"
1079+
const result = sanitizeConnectionString(input)
1080+
// Raw-encoded password contains %25 (encoded '%') and ZZ literal
1081+
expect(result).toBe(
1082+
"postgresql://testuser:bad%25ZZpass@localhost:5432/testdb",
1083+
)
1084+
})
1085+
1086+
test("preserves @ in path after authority", () => {
1087+
// A path segment with '@' is unusual but valid and must not be
1088+
// treated as userinfo.
1089+
const input = "postgresql://testuser:simpleval@localhost:5432/db@archive"
1090+
expect(sanitizeConnectionString(input)).toBe(input)
1091+
})
1092+
1093+
test("preserves @ in fragment", () => {
1094+
const input = "postgresql://testuser:simpleval@localhost:5432/testdb#at@frag"
1095+
expect(sanitizeConnectionString(input)).toBe(input)
1096+
})
1097+
1098+
test("handles scheme-only URI with no path", () => {
1099+
const input = "postgresql://testuser:p@ss@localhost:5432"
1100+
const result = sanitizeConnectionString(input)
1101+
expect(result).toBe("postgresql://testuser:p%40ss@localhost:5432")
1102+
})
1103+
1104+
test("handles URI with no port", () => {
1105+
const input = "postgresql://testuser:p@ss@localhost/testdb"
1106+
const result = sanitizeConnectionString(input)
1107+
expect(result).toBe("postgresql://testuser:p%40ss@localhost/testdb")
1108+
})
1109+
})
1110+
1111+
// ---------------------------------------------------------------------------
1112+
// normalizeConfig — connection_string sanitization integration
1113+
// ---------------------------------------------------------------------------
1114+
1115+
describe("normalizeConfig — connection_string sanitization", () => {
1116+
test("sanitizes connection_string with special chars in password", () => {
1117+
const result = normalizeConfig({
1118+
type: "postgres",
1119+
connection_string: "postgresql://testuser:f@ke#PLACEHOLDER@localhost:5432/testdb",
1120+
})
1121+
expect(result.connection_string).toBe(
1122+
"postgresql://testuser:f%40ke%23PLACEHOLDER@localhost:5432/testdb",
1123+
)
1124+
})
1125+
1126+
test("sanitizes connectionString alias with special chars", () => {
1127+
const result = normalizeConfig({
1128+
type: "postgres",
1129+
connectionString: "postgresql://testuser:t@st@localhost:5432/testdb",
1130+
})
1131+
// alias resolved to connection_string, then sanitized
1132+
expect(result.connection_string).toBe(
1133+
"postgresql://testuser:t%40st@localhost:5432/testdb",
1134+
)
1135+
expect(result.connectionString).toBeUndefined()
1136+
})
1137+
1138+
test("does not alter connection_string without special chars", () => {
1139+
const result = normalizeConfig({
1140+
type: "redshift",
1141+
connection_string: "postgresql://testuser:testval@localhost:5439/testdb",
1142+
})
1143+
expect(result.connection_string).toBe(
1144+
"postgresql://testuser:testval@localhost:5439/testdb",
1145+
)
1146+
})
1147+
1148+
test("does not alter config without connection_string", () => {
1149+
const result = normalizeConfig({
1150+
type: "postgres",
1151+
host: "localhost",
1152+
password: "f@ke#PLACEHOLDER",
1153+
})
1154+
// Individual fields are NOT URI-encoded — drivers handle them natively
1155+
expect(result.password).toBe("f@ke#PLACEHOLDER")
1156+
expect(result.connection_string).toBeUndefined()
1157+
})
1158+
})

0 commit comments

Comments
 (0)