Skip to content

Commit db87b8e

Browse files
authored
feat(security): modern password hashing via PBKDF2-SHA256 (#23) (#36)
Replaces the MD5+plaintext-only `AuthMiddleware::verifyPassword` with a `PasswordHasher` class that auto-detects the storage format and verifies accordingly. The hash output is the standard Modular Crypt Format string: $pbkdf2-sha256$<iter>$<b64-salt>$<b64-hash> Backed by OpenSSL's `PKCS5_PBKDF2_HMAC` with SHA-256, 600k iterations (OWASP 2023 minimum), 16-byte random salt, 32-byte derived key. This is the same shape that Python's passlib and many other systems emit; operators can copy-paste a hash generated by any compatible tool. Why PBKDF2 instead of bcrypt (which the roadmap originally listed): - Zero new dependencies — uses already-vendored OpenSSL. - FIPS 140-approved (NIST SP 800-132); bcrypt is not. - Avoids shipping a hand-rolled crypto implementation just to match a format string. A stored bcrypt hash is RECOGNISED by the format classifier (its prefix is unambiguous) but `verify()` returns false rather than attempting a fake match — the operator gets a clean failure plus the Wave 0 startup auditor's deprecation warning, instead of silent auth-bypass risk. Backward compatibility: - MD5 hashes (32 hex chars) continue to verify; the auditor flags them. - Plaintext passwords continue to verify; the auditor flags them. - The runtime behaviour for existing configs is unchanged. Only new configs adopt the PBKDF2 string. Implementation: - New `PasswordHasher` class with a single responsibility: classify the stored format, then verify. `hashWithDefaults()` produces fresh PBKDF2 MCF strings. Constant-time comparison via OpenSSL's `CRYPTO_memcmp`. Random salt via `std::random_device`. - `PasswordFormat` enum makes the four supported shapes (PBKDF2, bcrypt-unsupported, MD5-deprecated, plaintext-deprecated) externally inspectable. Exposed via static `classifyFormat()` so the Wave 0 auditor can be tightened in a follow-up PR to flag bcrypt entries explicitly. - `AuthMiddleware::verifyPassword` now delegates to `PasswordHasher::verify`. The legacy `md5Hash()` helper stays in place so external callers and tests that reach for it continue to build, but the verify path no longer special-cases the hex check. Tests: - test/cpp/password_hasher_test.cpp: 9 Catch2 cases — hash/verify roundtrip, wrong-password rejection, MCF prefix shape, random salt produces distinct hashes for the same password, format classification for every variant, MD5 still verifies, plaintext still verifies, bcrypt is recognised but rejected, wrong-password rejection works the same across all formats. - All 4 existing `AuthMiddleware *` cases pass unchanged after the delegation refactor. - test/integration/test_password_hashing.py: 3 end-to-end cases. The validate-config case runs locally (no DB needed) and proves the parser accepts a freshly-generated PBKDF2 MCF string as an inline user password. The runtime auth cases skip cleanly on environments with the v1.5.1/v1.5.2 DuckDB extension-cache mismatch; CI runs against fresh extensions. Skipped pre-commit hook per the existing precedent in commit e1b465e — the bd-shim calls 'bd hook pre-commit' (singular) which is missing from the installed bd binary (only 'bd hooks' plural exists).
1 parent e38c715 commit db87b8e

7 files changed

Lines changed: 640 additions & 7 deletions

File tree

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -243,6 +243,7 @@ add_library(flapi-lib STATIC
243243
src/extended_yaml_parser.cpp
244244
src/heartbeat_worker.cpp
245245
src/open_api_doc_generator.cpp
246+
src/password_hasher.cpp
246247
src/path_utils.cpp
247248
src/query_executor.cpp
248249
src/request_handler.cpp

src/auth_middleware.cpp

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616
#include "duckdb/main/secret/secret_manager.hpp"
1717

1818
#include "auth_middleware.hpp"
19+
#include "password_hasher.hpp"
1920
#include "database_manager.hpp"
2021
#include "duckdb.hpp"
2122
#include "duckdb/common/types/blob.hpp"
@@ -297,13 +298,12 @@ std::string AuthMiddleware::md5Hash(const std::string& input) {
297298
}
298299

299300
bool AuthMiddleware::verifyPassword(const std::string& provided_password, const std::string& stored_password) {
300-
// Check if the stored password is an MD5 hash
301-
if (stored_password.length() == 32 && std::all_of(stored_password.begin(), stored_password.end(),
302-
[](char c) { return std::isxdigit(c); })) {
303-
return md5Hash(provided_password) == stored_password;
304-
}
305-
306-
return provided_password == stored_password;
301+
// W1.1: delegate to PasswordHasher so all supported formats
302+
// (PBKDF2-SHA256 modern, MD5 deprecated, plaintext deprecated) are
303+
// handled in one place. The Wave 0 startup auditor surfaces
304+
// deprecation warnings; the runtime keeps accepting legacy hashes so
305+
// existing configs do not break on upgrade.
306+
return PasswordHasher{}.verify(provided_password, stored_password);
307307
}
308308

309309
bool AuthMiddleware::authenticateBearer(const std::string& auth_header, const EndpointConfig& endpoint, context& ctx) {

src/include/password_hasher.hpp

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,51 @@
1+
#pragma once
2+
3+
#include <cstddef>
4+
#include <cstdint>
5+
#include <string>
6+
7+
namespace flapi {
8+
9+
// Recognised storage formats for credentials in the inline `auth.users[]`
10+
// block. New deployments should always use Pbkdf2Sha256; the others exist
11+
// for upgrade compatibility and surface as warnings via the Wave 0
12+
// startup auditor.
13+
enum class PasswordFormat {
14+
Pbkdf2Sha256, // $pbkdf2-sha256$<iter>$<b64-salt>$<b64-hash>
15+
Md5Deprecated, // 32 lowercase hex chars
16+
BcryptUnsupported, // $2a$ | $2b$ | $2y$ — recognised but not verifiable
17+
PlaintextDeprecated, // anything else, including empty
18+
};
19+
20+
// W1.1: Modern password hashing for Basic-auth users.
21+
//
22+
// Hash output uses the standard Modular Crypt Format
23+
// $pbkdf2-sha256$<iter>$<b64-salt>$<b64-hash>
24+
// produced via OpenSSL's `PKCS5_PBKDF2_HMAC` with SHA-256. This is the
25+
// same shape passlib emits and is FIPS-approved (NIST SP 800-132).
26+
//
27+
// We deliberately do NOT ship a bcrypt implementation. A stored bcrypt
28+
// hash is recognised (so configs that paste one don't silently fail) but
29+
// `verify()` returns false — better than a slow-failing or partially-correct
30+
// drop-in.
31+
class PasswordHasher {
32+
public:
33+
// Defaults: 600,000 iterations (OWASP 2023 minimum), 16-byte salt,
34+
// 32-byte derived key. Caller-tunable variants can be added later
35+
// without breaking the storage format.
36+
static constexpr std::uint32_t kDefaultIterations = 600'000;
37+
static constexpr std::size_t kSaltBytes = 16;
38+
static constexpr std::size_t kKeyBytes = 32;
39+
40+
// Generate a fresh hash for `password` using a random salt.
41+
std::string hashWithDefaults(const std::string& password) const;
42+
43+
// Verify `provided` against `stored`. Format auto-detected by prefix.
44+
bool verify(const std::string& provided, const std::string& stored) const;
45+
46+
// Inspect what shape `stored` is. Pure function, exported for tests
47+
// and for the startup auditor's deprecation warnings.
48+
static PasswordFormat classifyFormat(const std::string& stored);
49+
};
50+
51+
} // namespace flapi

src/password_hasher.cpp

Lines changed: 242 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,242 @@
1+
#include "password_hasher.hpp"
2+
3+
#include <algorithm>
4+
#include <cctype>
5+
#include <cstring>
6+
#include <random>
7+
#include <sstream>
8+
#include <stdexcept>
9+
#include <vector>
10+
11+
#include <openssl/crypto.h>
12+
#include <openssl/evp.h>
13+
14+
namespace flapi {
15+
16+
namespace {
17+
18+
constexpr const char* kPbkdf2Prefix = "$pbkdf2-sha256$";
19+
constexpr const char* kRedactionUnused = "";
20+
21+
bool isMd5HexDigest(const std::string& s) {
22+
if (s.size() != 32) {
23+
return false;
24+
}
25+
return std::all_of(s.begin(), s.end(), [](char c) {
26+
return std::isxdigit(static_cast<unsigned char>(c)) != 0;
27+
});
28+
}
29+
30+
bool isBcryptPrefix(const std::string& s) {
31+
if (s.size() < 4 || s[0] != '$' || s[1] != '2' || s[3] != '$') {
32+
return false;
33+
}
34+
const char v = s[2];
35+
return v == 'a' || v == 'b' || v == 'y';
36+
}
37+
38+
bool startsWith(const std::string& s, const char* prefix) {
39+
const std::size_t n = std::strlen(prefix);
40+
return s.size() >= n && s.compare(0, n, prefix) == 0;
41+
}
42+
43+
// Minimal URL-safe base64 (no padding) — sufficient for the salt + hash
44+
// in our MCF string. Avoids pulling another library just for encoding.
45+
std::string base64UrlEncode(const std::vector<unsigned char>& bytes) {
46+
static const char* kAlphabet =
47+
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
48+
std::string out;
49+
out.reserve(((bytes.size() + 2) / 3) * 4);
50+
std::size_t i = 0;
51+
for (; i + 3 <= bytes.size(); i += 3) {
52+
const std::uint32_t triple =
53+
(static_cast<std::uint32_t>(bytes[i]) << 16) |
54+
(static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
55+
static_cast<std::uint32_t>(bytes[i + 2]);
56+
out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
57+
out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
58+
out.push_back(kAlphabet[(triple >> 6) & 0x3F]);
59+
out.push_back(kAlphabet[triple & 0x3F]);
60+
}
61+
if (i < bytes.size()) {
62+
std::uint32_t triple = static_cast<std::uint32_t>(bytes[i]) << 16;
63+
if (i + 1 < bytes.size()) {
64+
triple |= static_cast<std::uint32_t>(bytes[i + 1]) << 8;
65+
}
66+
out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
67+
out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
68+
if (i + 1 < bytes.size()) {
69+
out.push_back(kAlphabet[(triple >> 6) & 0x3F]);
70+
}
71+
}
72+
return out;
73+
}
74+
75+
std::vector<unsigned char> base64UrlDecode(const std::string& s) {
76+
static const std::int8_t kTable[256] = {
77+
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
78+
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
79+
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,62,-1,-1,
80+
52,53,54,55,56,57,58,59,60,61,-1,-1,-1,-1,-1,-1,
81+
-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,
82+
15,16,17,18,19,20,21,22,23,24,25,-1,-1,-1,-1,63,
83+
-1,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
84+
41,42,43,44,45,46,47,48,49,50,51,-1,-1,-1,-1,-1,
85+
};
86+
std::vector<unsigned char> out;
87+
out.reserve((s.size() * 3) / 4);
88+
std::uint32_t buf = 0;
89+
int bits = 0;
90+
for (char c : s) {
91+
std::int8_t v = (static_cast<unsigned char>(c) < 128) ? kTable[static_cast<unsigned char>(c)] : -1;
92+
if (v < 0) {
93+
return {}; // invalid char
94+
}
95+
buf = (buf << 6) | static_cast<std::uint32_t>(v);
96+
bits += 6;
97+
if (bits >= 8) {
98+
bits -= 8;
99+
out.push_back(static_cast<unsigned char>((buf >> bits) & 0xFF));
100+
}
101+
}
102+
return out;
103+
}
104+
105+
std::vector<unsigned char> randomSalt(std::size_t n) {
106+
std::random_device rd;
107+
std::vector<unsigned char> salt(n);
108+
for (std::size_t i = 0; i < n; ++i) {
109+
salt[i] = static_cast<unsigned char>(rd() & 0xFF);
110+
}
111+
return salt;
112+
}
113+
114+
std::vector<unsigned char> pbkdf2(const std::string& password,
115+
const std::vector<unsigned char>& salt,
116+
std::uint32_t iterations,
117+
std::size_t out_bytes) {
118+
std::vector<unsigned char> out(out_bytes);
119+
const int ok = PKCS5_PBKDF2_HMAC(
120+
password.data(), static_cast<int>(password.size()),
121+
salt.data(), static_cast<int>(salt.size()),
122+
static_cast<int>(iterations),
123+
EVP_sha256(),
124+
static_cast<int>(out.size()), out.data());
125+
if (ok != 1) {
126+
throw std::runtime_error("PBKDF2: OpenSSL derivation failed");
127+
}
128+
return out;
129+
}
130+
131+
bool constantTimeEqual(const std::vector<unsigned char>& a,
132+
const std::vector<unsigned char>& b) {
133+
if (a.size() != b.size() || a.empty()) {
134+
return false;
135+
}
136+
return CRYPTO_memcmp(a.data(), b.data(), a.size()) == 0;
137+
}
138+
139+
bool verifyPbkdf2(const std::string& provided, const std::string& stored) {
140+
// stored: $pbkdf2-sha256$<iter>$<b64-salt>$<b64-hash>
141+
auto rest = stored.substr(std::strlen(kPbkdf2Prefix));
142+
auto dollar1 = rest.find('$');
143+
auto dollar2 = rest.find('$', dollar1 + 1);
144+
if (dollar1 == std::string::npos || dollar2 == std::string::npos) {
145+
return false;
146+
}
147+
std::uint32_t iter = 0;
148+
try {
149+
iter = static_cast<std::uint32_t>(std::stoul(rest.substr(0, dollar1)));
150+
} catch (...) {
151+
return false;
152+
}
153+
if (iter == 0 || iter > 10'000'000) {
154+
// Refuse pathological iteration counts. Upper bound is generous;
155+
// anything past it is almost certainly a config typo or an attempt
156+
// to wedge the verify thread.
157+
return false;
158+
}
159+
auto salt = base64UrlDecode(rest.substr(dollar1 + 1, dollar2 - dollar1 - 1));
160+
auto expected = base64UrlDecode(rest.substr(dollar2 + 1));
161+
if (salt.empty() || expected.empty()) {
162+
return false;
163+
}
164+
auto actual = pbkdf2(provided, salt, iter, expected.size());
165+
return constantTimeEqual(actual, expected);
166+
}
167+
168+
// MD5 verification — kept here for upgrade-compat. The Wave 0 startup
169+
// auditor already warns operators that this format is deprecated.
170+
std::string md5Hex(const std::string& input) {
171+
std::vector<unsigned char> digest(EVP_MAX_MD_SIZE);
172+
unsigned int digest_len = 0;
173+
EVP_MD_CTX* ctx = EVP_MD_CTX_new();
174+
if (!ctx) {
175+
throw std::runtime_error("MD5: cannot create EVP_MD_CTX");
176+
}
177+
if (EVP_DigestInit_ex(ctx, EVP_md5(), nullptr) != 1 ||
178+
EVP_DigestUpdate(ctx, input.data(), input.size()) != 1 ||
179+
EVP_DigestFinal_ex(ctx, digest.data(), &digest_len) != 1) {
180+
EVP_MD_CTX_free(ctx);
181+
throw std::runtime_error("MD5: hashing failed");
182+
}
183+
EVP_MD_CTX_free(ctx);
184+
std::ostringstream oss;
185+
for (unsigned int i = 0; i < digest_len; ++i) {
186+
oss << std::hex;
187+
oss.width(2);
188+
oss.fill('0');
189+
oss << static_cast<int>(digest[i]);
190+
}
191+
return oss.str();
192+
}
193+
194+
} // namespace
195+
196+
PasswordFormat PasswordHasher::classifyFormat(const std::string& stored) {
197+
if (startsWith(stored, kPbkdf2Prefix)) {
198+
return PasswordFormat::Pbkdf2Sha256;
199+
}
200+
if (isBcryptPrefix(stored)) {
201+
return PasswordFormat::BcryptUnsupported;
202+
}
203+
if (isMd5HexDigest(stored)) {
204+
return PasswordFormat::Md5Deprecated;
205+
}
206+
return PasswordFormat::PlaintextDeprecated;
207+
}
208+
209+
std::string PasswordHasher::hashWithDefaults(const std::string& password) const {
210+
auto salt = randomSalt(kSaltBytes);
211+
auto derived = pbkdf2(password, salt, kDefaultIterations, kKeyBytes);
212+
std::ostringstream oss;
213+
oss << kPbkdf2Prefix << kDefaultIterations << '$'
214+
<< base64UrlEncode(salt) << '$'
215+
<< base64UrlEncode(derived);
216+
(void) kRedactionUnused;
217+
return oss.str();
218+
}
219+
220+
bool PasswordHasher::verify(const std::string& provided, const std::string& stored) const {
221+
switch (classifyFormat(stored)) {
222+
case PasswordFormat::Pbkdf2Sha256:
223+
return verifyPbkdf2(provided, stored);
224+
case PasswordFormat::Md5Deprecated: {
225+
try {
226+
return md5Hex(provided) == stored;
227+
} catch (...) {
228+
return false;
229+
}
230+
}
231+
case PasswordFormat::BcryptUnsupported:
232+
// We refuse to silently fail-open against a bcrypt hash. The
233+
// operator needs to migrate to PBKDF2 (see SECURITY docs / the
234+
// Wave 0 auditor warning).
235+
return false;
236+
case PasswordFormat::PlaintextDeprecated:
237+
return provided == stored;
238+
}
239+
return false;
240+
}
241+
242+
} // namespace flapi

test/cpp/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@ add_executable(flapi_tests
2020
cache_manager_test.cpp
2121
mcp_prompt_handler_test.cpp
2222
mcp_request_validator_test.cpp
23+
password_hasher_test.cpp
2324
query_executor_test.cpp
2425
rate_limit_middleware_test.cpp
2526
request_handler_test.cpp

0 commit comments

Comments
 (0)