Commit 92ee3ea

feat(secrets): configurable credential chain and writer

Turning on the Postgres secrets backend used to be all-or-nothing: setting [secrets] dropped vault from the read chain and ran a mandatory import. Now the backend stores and the write target are operator config, so a site can read from postgres, vault, or both -- in whatever order it wants -- and choose where new writes land.

Two new [secrets] fields drive it, both naming a CredentialBackend (postgres or vault):

- stores -- the backend read order, first match wins. The env/file local overrides are always tried first (when their [credentials.*] section is enabled); this list just orders the backends behind them. Defaults to ["vault"], today's behavior. Order is the operator's choice; for example, to roll Postgres in gradually: ["vault"] -> ["postgres", "vault"] -> ["postgres"].
- writer -- vault (default) or postgres. Flip it to send new writes to the journal.

import_from stays fully independent -- importing from vault is orthogonal to where reads and writes flow, and it is now gated visibly at the call site.

An empty or duplicate stores list fails the boot; a writer whose backend is not in stores is allowed but warns about the read-after-write gap (e.g. a deliberate postgres shadow-write).

Tests added!

This supports #2811

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
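
The boot-time rules in the commit message (an empty or duplicate stores list fails, a writer outside stores only warns) could be sketched roughly as below. This is a minimal illustration, not the code from this commit: `check_chain`, `ChainCheck`, and `ChainError` are hypothetical names, and `CredentialBackend` is redeclared locally so the snippet stands alone. The real check is not part of the file shown here.

```rust
use std::collections::HashSet;

/// Local stand-in for the enum added in cfg/file.rs (serde derives omitted).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum CredentialBackend {
    Postgres,
    Vault,
}

/// Hypothetical result of a successful check.
#[derive(Debug, PartialEq, Eq)]
pub enum ChainCheck {
    /// The writer is also read, so a write is visible to later reads.
    Ok,
    /// The writer is not in `stores`: allowed (e.g. a shadow-write), but the
    /// caller should warn about the read-after-write gap.
    WriterNotRead(CredentialBackend),
}

/// Hypothetical boot-failing errors.
#[derive(Debug, PartialEq, Eq)]
pub enum ChainError {
    EmptyStores,
    DuplicateStore(CredentialBackend),
}

/// Validates the configured read order and write target.
pub fn check_chain(
    stores: &[CredentialBackend],
    writer: CredentialBackend,
) -> Result<ChainCheck, ChainError> {
    if stores.is_empty() {
        return Err(ChainError::EmptyStores);
    }
    let mut seen = HashSet::new();
    for &store in stores {
        // `insert` returns false when the value was already present.
        if !seen.insert(store) {
            return Err(ChainError::DuplicateStore(store));
        }
    }
    if seen.contains(&writer) {
        Ok(ChainCheck::Ok)
    } else {
        Ok(ChainCheck::WriterNotRead(writer))
    }
}
```

Under these assumptions, `stores = ["vault"]` with `writer = "postgres"` passes with a `WriterNotRead` result, matching the shadow-write case described above, while an empty list or a repeated backend is rejected outright.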
1 parent f187779 commit 92ee3ea

3 files changed

Lines changed: 338 additions & 95 deletions

crates/api-core/src/cfg/file.rs

Lines changed: 165 additions & 19 deletions
@@ -730,9 +730,9 @@ pub struct CarbideConfig {
     #[serde(default)]
     pub tracing: TracingConfig,

-    /// Secrets backend configuration. When present, credentials live
-    /// encrypted in Postgres and vault leaves the credential chain
-    /// entirely; when absent, vault remains the credential store.
+    /// Secrets backend configuration. When present, the credential reader
+    /// chain and write target are operator-configured (defaulting to the same
+    /// env -> file -> vault behavior as when it is absent); see `SecretsConfig`.
     pub secrets: Option<SecretsConfig>,
 }

@@ -846,24 +846,26 @@ pub enum BgpLeafSessionPassword {
     SiteWide,
 }

-/// Configures the Postgres secrets backend. When this section is present,
-/// credentials live encrypted in Postgres and vault is not in the
-/// credential chain at all -- the one-time import either completes before
-/// the process serves traffic, or the process does not start. Vault keeps
-/// serving PKI certificates either way.
+/// Configures the Postgres secrets backend and how credentials flow. When
+/// this section is present the reader chain and the write target come from
+/// `stores` / `writer` below; their defaults keep today's behavior
+/// (env -> file -> vault, writes to vault), so adding `[secrets]` does not
+/// change credential routing on its own. Operators choose which backend
+/// stores to read, in what order, and which store takes writes, by editing
+/// `stores` and `writer`. Vault keeps serving PKI certificates regardless of
+/// the chain.
 ///
-/// Enabling this on an existing site has two prerequisites that live
-/// outside this process:
+/// Two prerequisites live outside this process and matter once writes move
+/// to Postgres (`writer = "postgres"`) or vault leaves `stores`:
 ///
 /// - Services that read credentials from vault through their own chains
-///   (`bmc-proxy`, `dsx-exchange-consumer`) keep reading vault and will
-///   not see anything carbide-api writes to Postgres afterwards. They must
-///   be migrated or fed another way before credentials here change.
-/// - During a rolling upgrade, replicas still running the vault config
-///   keep writing rotated credentials to vault, where they are stranded
-///   once the import has completed. Keep autonomous credential writers
-///   (site-explorer credential rotation) disabled until the whole fleet
-///   runs this config.
+///   (`bmc-proxy`, `dsx-exchange-consumer`) will not see anything carbide-api
+///   writes to Postgres. They must be pointed at the same store, or fed
+///   another way, before the credentials they read change.
+/// - During a rolling upgrade, replicas still on an older config keep writing
+///   rotated credentials to their own writer. Keep autonomous credential
+///   writers (site-explorer credential rotation) disabled until the whole
+///   fleet runs a consistent config.
 #[derive(Clone, Debug, Deserialize, Serialize)]
 #[serde(deny_unknown_fields)]
 pub struct SecretsConfig {
@@ -884,9 +886,37 @@ pub struct SecretsConfig {
     /// ```
     pub routing: std::collections::HashMap<String, String>,

+    /// The credential *store* read order, highest priority first (first match
+    /// wins). The local-override readers (env, file) are always tried ahead of
+    /// these, when their `[credentials.*]` section is enabled; this list only
+    /// orders the stores behind them. Order is the operator's choice -- list
+    /// the stores you want, in the priority you want. Defaults to `["vault"]`
+    /// -- with the local overrides, that is the env -> file -> vault chain.
+    ///
+    /// For example, to roll Postgres in gradually, walk this list:
+    ///
+    /// 1. `["vault"]` -- Postgres configured but not yet read.
+    /// 2. `["postgres", "vault"]` -- Postgres in front, vault as the safety net
+    ///    for anything Postgres misses.
+    /// 3. `["postgres"]` -- vault no longer read.
+    ///
+    /// An empty list, or a store named twice, fails the boot.
+    #[serde(default = "default_secret_stores")]
+    pub stores: Vec<CredentialBackend>,
+
+    /// Where new credential writes go. Defaults to `vault`; set to `postgres`
+    /// to send new writes to the journal. Independent of `stores`: e.g.
+    /// `writer = "postgres"` while `postgres` is not in `stores` (reads still
+    /// served by vault) is a valid shadow-write -- it confirms writes land
+    /// before reads start trusting Postgres -- and only logs a warning.
+    #[serde(default)]
+    pub writer: CredentialBackend,
+
     /// A source backend to import secrets from at startup. Unset means a
     /// fresh site with nothing to import; unsupported values fail config
-    /// parsing rather than silently skipping the import.
+    /// parsing rather than silently skipping the import. Independent of
+    /// `stores`/`writer` -- importing from vault is orthogonal to where
+    /// reads and writes flow.
     pub import_from: Option<ImportSource>,

     /// How to treat secrets that already exist in Postgres during import.
@@ -902,6 +932,27 @@ pub enum ImportSource {
     Vault,
 }

+/// A credential backend -- postgres or vault. Listed in `[secrets].stores` to
+/// order the backends behind the always-first local overrides (env, file;
+/// first match wins, see `ChainedCredentialReader`), and named by
+/// `[secrets].writer` to choose where new writes go.
+#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash, Deserialize, Serialize)]
+#[serde(rename_all = "lowercase")]
+pub enum CredentialBackend {
+    /// The Postgres secrets journal.
+    Postgres,
+    /// Vault/OpenBao KV. The default write target (today's behavior).
+    #[default]
+    Vault,
+}
+
+/// The default backend-store order (just vault). With the always-first env/file
+/// local overrides, this is the env -> file -> vault chain, so adding
+/// `[secrets]` changes nothing until an operator edits it.
+fn default_secret_stores() -> Vec<CredentialBackend> {
+    vec![CredentialBackend::Vault]
+}
+
 /// Configures the KMS backends that wrap DEKs. Several named providers can
 /// be defined: the active one wraps DEKs for new writes, and every provider
 /// answers unwraps for the kek_ids it has.
@@ -4156,6 +4207,11 @@ firmware_url = "https://firmware.example.com/fw-b.bin"
             secrets.import_approach,
             crate::secrets::ImportApproach::MissingOnly
         );
+
+        // stores/writer were omitted above, so they default to vault-only
+        // (env/file are prepended separately) writing to vault.
+        assert_eq!(secrets.stores, vec![CredentialBackend::Vault]);
+        assert_eq!(secrets.writer, CredentialBackend::Vault);
     }

     // Verifies that a typo'd import source fails config parsing instead of
@@ -4186,6 +4242,96 @@ firmware_url = "https://firmware.example.com/fw-b.bin"
         assert!(toml::from_str::<Wrapper>(toml_str).is_err());
     }

+    // Verifies the stores list and writer parse from their enum values --
+    // one with Postgres in front of vault (writes to Postgres) and a
+    // postgres-only one (vault not read, writes to Postgres).
+    #[test]
+    fn secrets_config_parses_stores_and_writer() {
+        #[derive(Deserialize)]
+        struct Wrapper {
+            secrets: SecretsConfig,
+        }
+
+        let pg_first = r#"
+            [secrets]
+            stores = ["postgres", "vault"]
+            writer = "postgres"
+
+            [secrets.kms]
+            active = "local"
+            [secrets.kms.providers.local]
+            type = "integrated"
+            keys.default-key = { env = "K" }
+
+            [secrets.routing]
+            "/" = "default-key"
+        "#;
+        let secrets = toml::from_str::<Wrapper>(pg_first)
+            .expect("parse pg-first")
+            .secrets;
+        assert_eq!(
+            secrets.stores,
+            vec![CredentialBackend::Postgres, CredentialBackend::Vault]
+        );
+        assert_eq!(secrets.writer, CredentialBackend::Postgres);
+
+        // Postgres-only reads, writes to postgres too. (The
+        // writer-defaults-to-vault case is covered by the deserialize test
+        // above, with vault still in stores -- pairing a postgres-only chain
+        // with a vault writer is the read-after-write gap run.rs warns about.)
+        let postgres_only = r#"
+            [secrets]
+            stores = ["postgres"]
+            writer = "postgres"
+
+            [secrets.kms]
+            active = "local"
+            [secrets.kms.providers.local]
+            type = "integrated"
+            keys.default-key = { env = "K" }
+
+            [secrets.routing]
+            "/" = "default-key"
+        "#;
+        let secrets = toml::from_str::<Wrapper>(postgres_only)
+            .expect("parse postgres-only")
+            .secrets;
+        assert_eq!(secrets.stores, vec![CredentialBackend::Postgres]);
+        assert_eq!(secrets.writer, CredentialBackend::Postgres);
+    }
+
+    // Verifies a typo'd store or writer value fails parsing rather than
+    // silently dropping a backend from the chain.
+    #[test]
+    fn secrets_config_rejects_unknown_backend() {
+        #[derive(Deserialize)]
+        struct Wrapper {
+            #[expect(dead_code)]
+            secrets: SecretsConfig,
+        }
+
+        let base_kms = r#"
+            [secrets.kms]
+            active = "local"
+            [secrets.kms.providers.local]
+            type = "integrated"
+            keys.default-key = { env = "K" }
+            [secrets.routing]
+            "/" = "default-key"
+        "#;
+
+        let bad_store = format!("[secrets]\nstores = [\"postgrez\"]\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&bad_store).is_err());
+
+        // env/file are local overrides, not backend stores -- they belong in
+        // [credentials.*], not [secrets].stores, so they're rejected here.
+        let env_as_store = format!("[secrets]\nstores = [\"env\"]\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&env_as_store).is_err());
+
+        let bad_writer = format!("[secrets]\nwriter = \"valt\"\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&bad_writer).is_err());
+    }
+
     // Verifies that a misspelled optional key in [secrets] -- here
     // `import_fom` for `import_from` -- fails to parse instead of leaving
     // the import silently disabled. Without deny_unknown_fields, the typo'd