Commit bc45948

feat(secrets): configurable credential chain and writer
Turning on the Postgres secrets backend used to be all-or-nothing: setting [secrets] dropped vault from the read chain and ran a mandatory import. Now the backend stores and the write target are operator config, so a site can read from postgres, vault, or both -- in whatever order it wants -- and choose where new writes land.

Two new [secrets] fields drive it, both naming a CredentialBackend (postgres or vault):

- stores -- the backend read order, first match wins. The env/file local overrides are always tried first (when their [credentials.*] section is enabled); this list just orders the backends behind them. Defaults to ["vault"], today's behavior. Order is the operator's choice; for example, to roll Postgres in gradually: ["vault"] -> ["postgres", "vault"] -> ["postgres"].
- writer -- vault (default) or postgres. Flip it to send new writes to the journal.

import_from stays fully independent -- importing from vault is orthogonal to where reads and writes flow, and it is now gated visibly at the call site.

An empty or duplicate stores list fails the boot. Store order is the operator's choice; when the writer's backend isn't the highest-priority store, a write can be shadowed on read for any path a higher-priority store also holds, so that warns -- a deliberate shadow-write stays valid, it is not rejected.

Tests added!

This supports #2811

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
1 parent f187779 commit bc45948
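
The boot-time rules in the commit message (an empty or duplicate stores list is rejected; a writer that is not the highest-priority store only warns) can be sketched as below. This is a minimal illustration, not the code from this commit: the real check lives at a call site not shown in this diff, and `check_stores`, `StoresError`, and the warning text are hypothetical names chosen for the example. The enum is redeclared locally so the snippet stands alone.

```rust
use std::collections::HashSet;

/// Local stand-in for the `CredentialBackend` enum added in this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum CredentialBackend {
    Postgres,
    Vault,
}

/// Hypothetical error type: the two conditions that fail the boot.
#[derive(Debug, PartialEq, Eq)]
pub enum StoresError {
    Empty,
    Duplicate(CredentialBackend),
}

/// Hypothetical helper. Returns `Err` for an empty list or a store named
/// twice, `Ok(Some(warning))` when the writer is not the highest-priority
/// store (a shadow-write, valid but worth flagging), and `Ok(None)` otherwise.
pub fn check_stores(
    stores: &[CredentialBackend],
    writer: CredentialBackend,
) -> Result<Option<String>, StoresError> {
    if stores.is_empty() {
        return Err(StoresError::Empty);
    }
    let mut seen = HashSet::new();
    for &store in stores {
        // `insert` returns false when the value was already present.
        if !seen.insert(store) {
            return Err(StoresError::Duplicate(store));
        }
    }
    if stores[0] != writer {
        return Ok(Some(format!(
            "writer {:?} is not the highest-priority store {:?}; \
             writes may be shadowed on read",
            writer, stores[0]
        )));
    }
    Ok(None)
}
```

Under these assumptions, `stores = ["vault"]` with `writer = "postgres"` passes with a warning, which matches the deliberate shadow-write case the message describes.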

3 files changed

Lines changed: 346 additions & 95 deletions

crates/api-core/src/cfg/file.rs

Lines changed: 165 additions & 19 deletions
@@ -730,9 +730,9 @@ pub struct CarbideConfig {
     #[serde(default)]
     pub tracing: TracingConfig,

-    /// Secrets backend configuration. When present, credentials live
-    /// encrypted in Postgres and vault leaves the credential chain
-    /// entirely; when absent, vault remains the credential store.
+    /// Secrets backend configuration. When present, the credential reader
+    /// chain and write target are operator-configured (defaulting to the same
+    /// env -> file -> vault behavior as when it is absent); see `SecretsConfig`.
     pub secrets: Option<SecretsConfig>,
 }

@@ -846,24 +846,26 @@ pub enum BgpLeafSessionPassword {
     SiteWide,
 }

-/// Configures the Postgres secrets backend. When this section is present,
-/// credentials live encrypted in Postgres and vault is not in the
-/// credential chain at all -- the one-time import either completes before
-/// the process serves traffic, or the process does not start. Vault keeps
-/// serving PKI certificates either way.
+/// Configures the Postgres secrets backend and how credentials flow. When
+/// this section is present the reader chain and the write target come from
+/// `stores` / `writer` below; their defaults keep today's behavior
+/// (env -> file -> vault, writes to vault), so adding `[secrets]` does not
+/// change credential routing on its own. Operators choose which backend
+/// stores to read, in what order, and which store takes writes, by editing
+/// `stores` and `writer`. Vault keeps serving PKI certificates regardless of
+/// the chain.
 ///
-/// Enabling this on an existing site has two prerequisites that live
-/// outside this process:
+/// Two prerequisites live outside this process and matter once writes move
+/// to Postgres (`writer = "postgres"`) or vault leaves `stores`:
 ///
 /// - Services that read credentials from vault through their own chains
-/// (`bmc-proxy`, `dsx-exchange-consumer`) keep reading vault and will
-/// not see anything carbide-api writes to Postgres afterwards. They must
-/// be migrated or fed another way before credentials here change.
-/// - During a rolling upgrade, replicas still running the vault config
-/// keep writing rotated credentials to vault, where they are stranded
-/// once the import has completed. Keep autonomous credential writers
-/// (site-explorer credential rotation) disabled until the whole fleet
-/// runs this config.
+/// (`bmc-proxy`, `dsx-exchange-consumer`) will not see anything carbide-api
+/// writes to Postgres. They must be pointed at the same store, or fed
+/// another way, before the credentials they read change.
+/// - During a rolling upgrade, replicas still on an older config keep writing
+/// rotated credentials to their own writer. Keep autonomous credential
+/// writers (site-explorer credential rotation) disabled until the whole
+/// fleet runs a consistent config.
 #[derive(Clone, Debug, Deserialize, Serialize)]
 #[serde(deny_unknown_fields)]
 pub struct SecretsConfig {
@@ -884,9 +886,37 @@ pub struct SecretsConfig {
     /// ```
     pub routing: std::collections::HashMap<String, String>,

+    /// The credential *store* read order, highest priority first (first match
+    /// wins). The local-override readers (env, file) are always tried ahead of
+    /// these, when their `[credentials.*]` section is enabled; this list only
+    /// orders the stores behind them. Order is the operator's choice -- list
+    /// the stores you want, in the priority you want. Defaults to `["vault"]`
+    /// -- with the local overrides, that is the env -> file -> vault chain.
+    ///
+    /// For example, to roll Postgres in gradually, walk this list:
+    ///
+    /// 1. `["vault"]` -- Postgres configured but not yet read.
+    /// 2. `["postgres", "vault"]` -- Postgres in front, vault as the safety net
+    /// for anything Postgres misses.
+    /// 3. `["postgres"]` -- vault no longer read.
+    ///
+    /// An empty list, or a store named twice, fails the boot.
+    #[serde(default = "default_secret_stores")]
+    pub stores: Vec<CredentialBackend>,
+
+    /// Where new credential writes go. Defaults to `vault`; set to `postgres`
+    /// to send new writes to the journal. Independent of `stores`: e.g.
+    /// `writer = "postgres"` while `postgres` is not in `stores` (reads still
+    /// served by vault) is a valid shadow-write -- it confirms writes land
+    /// before reads start trusting Postgres -- and only logs a warning.
+    #[serde(default)]
+    pub writer: CredentialBackend,
+
     /// A source backend to import secrets from at startup. Unset means a
     /// fresh site with nothing to import; unsupported values fail config
-    /// parsing rather than silently skipping the import.
+    /// parsing rather than silently skipping the import. Independent of
+    /// `stores`/`writer` -- importing from vault is orthogonal to where
+    /// reads and writes flow.
     pub import_from: Option<ImportSource>,

     /// How to treat secrets that already exist in Postgres during import.
@@ -902,6 +932,27 @@ pub enum ImportSource {
     Vault,
 }

+/// A credential backend -- postgres or vault. Listed in `[secrets].stores` to
+/// order the backends behind the always-first local overrides (env, file;
+/// first match wins, see `ChainedCredentialReader`), and named by
+/// `[secrets].writer` to choose where new writes go.
+#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash, Deserialize, Serialize)]
+#[serde(rename_all = "lowercase")]
+pub enum CredentialBackend {
+    /// The Postgres secrets journal.
+    Postgres,
+    /// Vault/OpenBao KV. The default write target (today's behavior).
+    #[default]
+    Vault,
+}
+
+/// The default backend-store order (just vault). With the always-first env/file
+/// local overrides, this is the env -> file -> vault chain, so adding
+/// `[secrets]` changes nothing until an operator edits it.
+fn default_secret_stores() -> Vec<CredentialBackend> {
+    vec![CredentialBackend::Vault]
+}
+
 /// Configures the KMS backends that wrap DEKs. Several named providers can
 /// be defined: the active one wraps DEKs for new writes, and every provider
 /// answers unwraps for the kek_ids it has.
@@ -4156,6 +4207,11 @@ firmware_url = "https://firmware.example.com/fw-b.bin"
             secrets.import_approach,
             crate::secrets::ImportApproach::MissingOnly
         );
+
+        // stores/writer were omitted above, so they default to vault-only
+        // (env/file are prepended separately) writing to vault.
+        assert_eq!(secrets.stores, vec![CredentialBackend::Vault]);
+        assert_eq!(secrets.writer, CredentialBackend::Vault);
     }

     // Verifies that a typo'd import source fails config parsing instead of
@@ -4186,6 +4242,96 @@ firmware_url = "https://firmware.example.com/fw-b.bin"
         assert!(toml::from_str::<Wrapper>(toml_str).is_err());
     }

+    // Verifies the stores list and writer parse from their enum values --
+    // one with Postgres in front of vault (writes to Postgres) and a
+    // postgres-only one (vault not read, writes to Postgres).
+    #[test]
+    fn secrets_config_parses_stores_and_writer() {
+        #[derive(Deserialize)]
+        struct Wrapper {
+            secrets: SecretsConfig,
+        }
+
+        let pg_first = r#"
+[secrets]
+stores = ["postgres", "vault"]
+writer = "postgres"
+
+[secrets.kms]
+active = "local"
+[secrets.kms.providers.local]
+type = "integrated"
+keys.default-key = { env = "K" }
+
+[secrets.routing]
+"/" = "default-key"
+"#;
+        let secrets = toml::from_str::<Wrapper>(pg_first)
+            .expect("parse pg-first")
+            .secrets;
+        assert_eq!(
+            secrets.stores,
+            vec![CredentialBackend::Postgres, CredentialBackend::Vault]
+        );
+        assert_eq!(secrets.writer, CredentialBackend::Postgres);
+
+        // Postgres-only reads, writes to postgres too. (The
+        // writer-defaults-to-vault case is covered by the deserialize test
+        // above, with vault still in stores -- pairing a postgres-only chain
+        // with a vault writer is the read-after-write gap run.rs warns about.)
+        let postgres_only = r#"
+[secrets]
+stores = ["postgres"]
+writer = "postgres"
+
+[secrets.kms]
+active = "local"
+[secrets.kms.providers.local]
+type = "integrated"
+keys.default-key = { env = "K" }
+
+[secrets.routing]
+"/" = "default-key"
+"#;
+        let secrets = toml::from_str::<Wrapper>(postgres_only)
+            .expect("parse postgres-only")
+            .secrets;
+        assert_eq!(secrets.stores, vec![CredentialBackend::Postgres]);
+        assert_eq!(secrets.writer, CredentialBackend::Postgres);
+    }
+
+    // Verifies a typo'd store or writer value fails parsing rather than
+    // silently dropping a backend from the chain.
+    #[test]
+    fn secrets_config_rejects_unknown_backend() {
+        #[derive(Deserialize)]
+        struct Wrapper {
+            #[expect(dead_code)]
+            secrets: SecretsConfig,
+        }
+
+        let base_kms = r#"
+[secrets.kms]
+active = "local"
+[secrets.kms.providers.local]
+type = "integrated"
+keys.default-key = { env = "K" }
+[secrets.routing]
+"/" = "default-key"
+"#;
+
+        let bad_store = format!("[secrets]\nstores = [\"postgrez\"]\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&bad_store).is_err());
+
+        // env/file are local overrides, not backend stores -- they belong in
+        // [credentials.*], not [secrets].stores, so they're rejected here.
+        let env_as_store = format!("[secrets]\nstores = [\"env\"]\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&env_as_store).is_err());
+
+        let bad_writer = format!("[secrets]\nwriter = \"valt\"\n{base_kms}");
+        assert!(toml::from_str::<Wrapper>(&bad_writer).is_err());
+    }
+
     // Verifies that a misspelled optional key in [secrets] -- here
     // `import_fom` for `import_from` -- fails to parse instead of leaving
     // the import silently disabled. Without deny_unknown_fields, the typo'd
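
The "first match wins" read order described in the `stores` doc comment, and the shadowing it can cause, can be illustrated with a small stand-alone sketch. This is not the `ChainedCredentialReader` from the codebase, whose API is not shown in this diff: `MapReader`, `reader`, `read_first_match`, and the credential paths are hypothetical, and each reader is modeled as an in-memory map.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory reader: a name plus a path -> secret map.
pub struct MapReader {
    pub name: &'static str,
    pub secrets: HashMap<String, String>,
}

/// Hypothetical constructor that builds a reader from (path, secret) pairs.
pub fn reader(name: &'static str, entries: &[(&str, &str)]) -> MapReader {
    MapReader {
        name,
        secrets: entries
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

/// Walks the chain in order and returns the first reader that holds `path`,
/// as (reader name, secret). Later readers are never consulted for that path.
pub fn read_first_match<'a>(
    chain: &'a [MapReader],
    path: &str,
) -> Option<(&'static str, &'a str)> {
    chain
        .iter()
        .find_map(|r| r.secrets.get(path).map(|v| (r.name, v.as_str())))
}
```

With a chain ordered env, postgres, vault (the local override first, then `stores = ["postgres", "vault"]`), a path held by both postgres and vault is answered by postgres. That is the shadowing the commit warns about: a write that lands in vault is not visible on read while postgres also holds the same path.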
