Understanding how all the pieces fit together — or, "Why are there so many services?"
OpenVox (like Puppet before it) follows a client-server architecture with a declarative model. Instead of writing scripts that say "do this, then do that", you describe the desired state of your systems and let OpenVox figure out how to get there. Here's the 30,000-foot view:
┌─────────────────────────────────────────────────────┐
│               OpenVox Primary Server                │
│                                                     │
│  ┌──────────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ PuppetServer │  │ PuppetDB │  │ Certificate  │   │
│  │ (Catalog     │  │ (Facts,  │  │ Authority    │   │
│  │  Compiler)   │  │ Reports, │  │ (SSL/TLS)    │   │
│  │              │  │ Resources│  │              │   │
│  └──────┬───────┘  └────┬─────┘  └──────┬───────┘   │
│         │               │               │           │
│         └───────────────┼───────────────┘           │
│                         │                           │
│  ┌──────────────────────┴────────────────────────┐  │
│  │ Code Directory (/etc/puppetlabs/code)         │  │
│  │ environments/ → production/ → manifests/      │  │
│  │                             → modules/        │  │
│  │                             → data/ (Hiera)   │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────┬───────────────────────────────┘
                      │ Port 8140 (HTTPS/mTLS)
          ┌───────────┼──────────┐
          │           │          │
    ┌─────┴─────┐ ┌───┴─────┐ ┌──┴──────┐
    │  Agent 1  │ │ Agent 2 │ │ Agent N │
    │  (node1)  │ │ (node2) │ │ (nodeN) │
    └───────────┘ └─────────┘ └─────────┘
The agent is the software that runs on every managed node (server, workstation, container — anything you want to manage). Its job is simple but important:
- Gather facts about the system (OS, IP address, memory, disk, etc.) using Facter
- Send those facts to the Primary Server
- Receive a compiled catalog (the "blueprint" of desired state)
- Apply the catalog — make the system match the desired state
- Send a report back to the server
The agent runs as a background service (typically via systemd) and checks in every 30 minutes by default. You can also trigger it manually with puppet agent -t.
Key paths:
| Path | Purpose |
|---|---|
| `/opt/puppetlabs/bin/puppet` | The puppet binary |
| `/etc/puppetlabs/puppet/puppet.conf` | Agent configuration |
| `/opt/puppetlabs/puppet/cache/` | Cached catalogs, facts, reports |
| `/etc/puppetlabs/puppet/ssl/` | SSL certificates |
PuppetServer is the brains of the operation. It's a JVM-based (Clojure + JRuby) application that:
- Receives agent requests over HTTPS (port 8140)
- Compiles catalogs — takes your Puppet code + the node's facts and produces a catalog
- Serves configuration elements from modules
- Manages the Certificate Authority (CA)
- Connects to PuppetDB for stored data
PuppetServer runs inside a Jetty web server and uses JRuby to execute Puppet's Ruby-based compiler. Yes, it's Java wrapping Ruby. No, we don't talk about that at parties.
Key paths:
| Path | Purpose |
|---|---|
| `/opt/puppetlabs/bin/puppetserver` | The server binary |
| `/etc/puppetlabs/puppetserver/` | Server configuration |
| `/etc/puppetlabs/puppetserver/conf.d/` | Server config fragments |
| `/var/log/puppetlabs/puppetserver/` | Server logs |
PuppetDB is the data warehouse for your infrastructure. Every time an agent runs, PuppetDB stores:
- Facts — What each node looks like (OS, hardware, network, custom facts)
- Catalogs — What each node should look like
- Reports — What happened during each Puppet run
- Resources — Every resource on every node (exportable/collectible)
Branding note: The OpenVox project is rebranding PuppetDB to OpenVoxDB. The packages are now `openvoxdb` and `openvoxdb-termini`, but the systemd unit, schema, query API, and PQL are unchanged. You'll see both names during the transition.
PuppetDB uses PostgreSQL as its backend and exposes a powerful query API using PQL (Puppet Query Language). Want to find all nodes running CentOS 8 with more than 16GB of RAM? PQL can do that in one line.
inventory[certname] { facts.os.name = "CentOS" and facts.os.release.major = "8" and facts.memory.system.total_bytes > 17179869184 }
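To run a PQL query from a script, one option is PuppetDB's HTTP query endpoint. The sketch below only builds the request URL for a simplified query. The hostname `puppetdb.example.com` is a placeholder, and the `/pdb/query/v4` path with a `query` parameter reflects the PuppetDB v4 API as I understand it, so check it against your installed version. A real request would also have to present the node's certificate and key for mTLS.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical host; port 8081 is the PuppetDB HTTPS port listed in this guide.
PUPPETDB_HOST = "puppetdb.example.com"
PUPPETDB_PORT = 8081


def build_pql_url(pql: str, host: str = PUPPETDB_HOST, port: int = PUPPETDB_PORT) -> str:
    """Return the URL that submits a PQL string to the root query endpoint."""
    # urlencode percent-escapes the quotes, braces, and spaces in the query.
    return f"https://{host}:{port}/pdb/query/v4?{urlencode({'query': pql})}"


pql = 'inventory[certname] { facts.os.name = "CentOS" }'
url = build_pql_url(pql)
print(url)
```

Building the URL with `urlencode` rather than string concatenation keeps the quotes and braces in the query from breaking the request.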
Key paths:
| Path | Purpose |
|---|---|
| `/etc/puppetlabs/puppetdb/` | PuppetDB configuration |
| `/etc/puppetlabs/puppetdb/conf.d/` | Config fragments |
| Port 8081 | PuppetDB API (HTTPS with mTLS) |
OpenVox uses mutual TLS (mTLS) for all communication between agents and the server. This means both sides present certificates — the server proves it's the real server, and the agent proves it's a real agent. The CA (built into PuppetServer) manages this:
- Agent generates a key pair and sends a Certificate Signing Request (CSR)
- An admin (or autosign policy) signs the certificate
- Both sides now trust each other via the CA's root certificate
This is PKI (Public Key Infrastructure) done right. Every node gets its own certificate, and a compromised node's certificate can be revoked individually.
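To make "both sides present certificates" concrete, here is a minimal Python sketch of a client that verifies the server against the Puppet CA and presents its own certificate. The file layout under the SSL directory (`certs/ca.pem`, `certs/<certname>.pem`, `private_keys/<certname>.pem`) is an assumption based on the usual agent layout, and `node1.example.com` is a placeholder certname.

```python
import ssl
from pathlib import Path

# Default agent SSL directory from the agent "Key paths" table.
SSL_DIR = Path("/etc/puppetlabs/puppet/ssl")


def agent_tls_paths(certname: str, ssl_dir: Path = SSL_DIR) -> dict:
    """Return the files an agent-style client would use (assumed layout)."""
    return {
        "ca": ssl_dir / "certs" / "ca.pem",
        "cert": ssl_dir / "certs" / f"{certname}.pem",
        "key": ssl_dir / "private_keys" / f"{certname}.pem",
    }


def build_mtls_context(certname: str, ssl_dir: Path = SSL_DIR) -> ssl.SSLContext:
    """Build a client context that verifies the server and presents our cert."""
    paths = agent_tls_paths(certname, ssl_dir)
    # Trust only the Puppet CA rather than the system trust store.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=str(paths["ca"]))
    # Present this node's certificate so the server can verify the client.
    context.load_cert_chain(certfile=str(paths["cert"]), keyfile=str(paths["key"]))
    return context


paths = agent_tls_paths("node1.example.com")
print(paths["cert"])
```

`build_mtls_context` is defined but not called here because it needs real certificate files; on a managed node it could be passed to an HTTPS client to talk to port 8140 or 8081.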
Facter is a cross-platform system profiling tool. It discovers facts about the node — things like:
- Operating system and version
- IP addresses and MAC addresses
- CPU count, architecture, and model
- Memory and disk information
- Cloud provider metadata (AWS, GCP, Azure)
- Virtualization status
Facts are available in your Puppet code as variables (e.g., $facts['os']['name']), which lets you write conditional logic like "install Apache on RedHat, install apache2 on Debian."
Branding note: Facter has been rebranded to OpenFact in the OpenVox ecosystem (5.6.0+). The `facter` binary, `facter.conf` configuration file, and the `facts.d/` directory all keep their existing names; only the project/product name has changed.
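The fact-driven conditional described above can be sketched outside the Puppet language too. This Python analogy treats facts as a nested dictionary and picks a package name from the OS family; the fact values and the `apache_package_for` helper are illustrative, not part of any OpenVox API.

```python
# Hypothetical subset of the structured facts Facter reports for a node.
facts = {"os": {"name": "Rocky", "family": "RedHat"}}

# Apache is packaged as httpd on the RedHat family and apache2 on Debian.
APACHE_PACKAGE = {"RedHat": "httpd", "Debian": "apache2"}


def apache_package_for(node_facts: dict) -> str:
    """Pick the Apache package name from the node's OS family fact."""
    family = node_facts["os"]["family"]
    try:
        return APACHE_PACKAGE[family]
    except KeyError:
        raise ValueError(f"unsupported OS family: {family}") from None


print(apache_package_for(facts))
```

Keying on the OS family rather than the OS name means one entry covers every distribution in that family.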
Hiera is the hierarchical data lookup system built into Puppet. It lets you separate your data (parameters, configuration values) from your code (classes, modules). Instead of hardcoding values in your manifests, you put them in YAML files organized in a hierarchy:
data/
├── nodes/
│ └── webserver1.example.com.yaml ← Most specific
├── os/
│ ├── RedHat.yaml
│ └── Debian.yaml
├── environment/
│ ├── production.yaml
│ └── staging.yaml
└── common.yaml ← Least specific (fallback)
Hiera searches from most specific to least specific and, by default, returns the first match. This means you can set defaults in common.yaml and override them per-node, per-OS, or per-environment. It's like CSS specificity, but for infrastructure. (We dive deep into Hiera in the Hiera Deep-Dive.)
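The first-match search can be modeled in a few lines of Python. This is a simplified sketch of the lookup order only, not Hiera's implementation: the keys and values are invented, and the YAML files are represented as dictionaries that have already been loaded.

```python
# Hypothetical contents of the YAML files, already loaded into dicts.
DATA = {
    "nodes/webserver1.example.com": {"ntp::servers": ["10.0.0.1"]},
    "os/RedHat": {"ssh::package": "openssh-server"},
    "environment/production": {"ntp::servers": ["ntp.prod.example.com"]},
    "common": {"ntp::servers": ["pool.ntp.org"], "ssh::port": 22},
}

_MISSING = object()


def hierarchy_for(certname, os_family, environment):
    """Return data sources ordered from most specific to least specific."""
    return [
        f"nodes/{certname}",
        f"os/{os_family}",
        f"environment/{environment}",
        "common",
    ]


def lookup(key, certname, os_family, environment, default=_MISSING):
    """Return the value from the first hierarchy level that defines `key`."""
    for source in hierarchy_for(certname, os_family, environment):
        level = DATA.get(source, {})  # a missing file behaves like an empty one
        if key in level:
            return level[key]
    if default is _MISSING:
        raise KeyError(key)
    return default


print(lookup("ntp::servers", "webserver1.example.com", "RedHat", "production"))
print(lookup("ntp::servers", "db1.example.com", "RedHat", "production"))
print(lookup("ssh::port", "db1.example.com", "Debian", "staging"))
```

The first call is answered by the node file, the second falls through to the environment level, and the third reaches `common.yaml` because no more specific level defines the key.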
Here's what happens during a typical agent run, from start to finish:
Agent Node                                   Primary Server
──────────                                   ──────────────
1. Agent wakes up (timer or manual)
       │
2. Facter gathers system facts
       │
3. Agent sends facts ──────────────────────► 4. Server receives facts
                                                    │
                                             5. Server compiles catalog
                                                    │ (code + facts + Hiera data)
                                                    │
7. Agent receives catalog ◄───────────────── 6. Server sends catalog
       │
8. Agent compares catalog to
   current system state
       │
9. Agent applies changes
   (creates files, installs packages,
    starts services, etc.)
       │
10. Agent sends report ────────────────────► 11. Server stores report
                                                 in PuppetDB
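The compare-then-apply part of the run is what makes it idempotent. The toy simulation below sketches that idea with invented stand-ins: `compile_catalog` plays the server, `apply_catalog` plays the agent, and resources are plain strings instead of real Puppet resource types.

```python
def compile_catalog(facts: dict) -> dict:
    """Stand-in for the server: turn facts into desired resource states."""
    package = "httpd" if facts["os_family"] == "RedHat" else "apache2"
    return {
        f"package:{package}": "installed",
        f"service:{package}": "running",
    }


def apply_catalog(catalog: dict, system_state: dict) -> list:
    """Change only resources that differ from the catalog; return what changed."""
    changed = []
    for resource, desired in catalog.items():
        if system_state.get(resource) != desired:
            system_state[resource] = desired
            changed.append(resource)
    return changed


def agent_run(facts: dict, system_state: dict) -> dict:
    """One run: compile from facts, apply, and build a report."""
    catalog = compile_catalog(facts)
    changed = apply_catalog(catalog, system_state)
    return {"status": "changed" if changed else "unchanged", "changed": changed}


state = {"package:httpd": "installed"}  # the service is not running yet
facts = {"os_family": "RedHat"}
first = agent_run(facts, state)
second = agent_run(facts, state)
print(first)
print(second)
```

The first run reports one change (the service), and the second run reports none because the system already matches the catalog.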
When the agent finishes a run with `--detailed-exitcodes` enabled (which `puppet agent -t` turns on), it returns one of these exit codes:
| Exit Code | Meaning |
|---|---|
| 0 | No changes needed: system already matches desired state |
| 1 | Errors occurred during the run |
| 2 | Changes were successfully applied |
| 4 | Failures occurred (some resources failed) |
| 6 | Both changes and failures occurred |
Common gotcha: Exit code 2 means "changes applied successfully." It's NOT an error! Many CI/CD systems treat non-zero exit codes as failures, so you may need to handle this explicitly. Puppet has been confusing automation engineers with this since approximately forever.
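One way to handle this in automation is to wrap the agent call and map the exit codes explicitly. The following is a minimal sketch: `run_succeeded` and `run_agent` are hypothetical helper names, and `run_agent` assumes the `puppet` binary is on `PATH`.

```python
import subprocess

# Exit codes from the table above; 0 and 2 both mean the run went fine.
SUCCESS_CODES = {0, 2}


def run_succeeded(exit_code: int) -> bool:
    """Treat 'no changes' (0) and 'changes applied' (2) as success."""
    return exit_code in SUCCESS_CODES


def run_agent() -> bool:
    """Run the agent once and report success; requires puppet on PATH."""
    # check=False because a non-zero exit code is expected and handled here.
    result = subprocess.run(["puppet", "agent", "-t"], check=False)
    return run_succeeded(result.returncode)


for code in (0, 1, 2, 4, 6):
    print(code, "ok" if run_succeeded(code) else "failed")
```

A CI job could call `run_agent()` and exit non-zero only when it returns `False`.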
Environments let you isolate different versions of your Puppet code. Each environment is simply a directory under the environmentpath (by default /etc/puppetlabs/code/environments/), and every environment is self-contained with its own manifests, modules, and data.
OpenVox ships with a single default environment: production. That's it — everything else is up to you and your workflow. There's no mandated set of environments; you create whatever makes sense for your organization.
When using r10k for code deployment (which most teams do), environments map directly to Git branches in your control repository. Create a branch, deploy with r10k, and a matching environment appears on the server. This means your environments are as dynamic as your Git workflow:
Git Branch                              Puppet Environment
──────────                              ──────────────────
main               ─── r10k deploy ──►  production/
feature/add-nginx  ─── r10k deploy ──►  feature_add_nginx/
hotfix/ssl-cert    ─── r10k deploy ──►  hotfix_ssl_cert/
Note: r10k converts characters that aren't valid in environment names (like `/` and `-`) to underscores. The Git branch `feature/add-nginx` becomes the environment `feature_add_nginx`.
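If you need to predict the environment name for a branch (for example, to pass it to `--environment` in a pipeline), the conversion can be approximated in Python. This sketch assumes every character other than letters, digits, and underscores becomes an underscore; `environment_name` is a hypothetical helper, not an r10k function.

```python
import re


def environment_name(branch: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", branch)


for branch in ("feature/add-nginx", "hotfix/ssl-cert"):
    print(f"{branch} -> {environment_name(branch)}")
```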
Each environment has its own:
- Manifests (`manifests/site.pp`)
- Modules (`modules/`)
- Hiera data (`data/`)
- Hiera config (`hiera.yaml`)
- Environment config (`environment.conf`)
The directory structure looks like this:
/etc/puppetlabs/code/environments/
├── production/ ← The default (and often only permanent) environment
│ ├── manifests/
│ │ └── site.pp
│ ├── modules/ ← Managed by r10k (from Puppetfile)
│ ├── site-modules/ ← Your organization's custom modules
│ ├── data/
│ │ ├── common.yaml
│ │ └── nodes/
│ ├── hiera.yaml
│ ├── environment.conf
│ └── Puppetfile
├── feature_add_nginx/ ← Created automatically by r10k from a Git branch
│ └── ...
└── hotfix_ssl_cert/ ← Temporary — removed when the branch is merged/deleted
└── ...
Agents are assigned to environments in one of three ways:
- `puppet.conf`: set `environment = production` in the `[agent]` section (this is the default)
- External Node Classifier (ENC): a script or service that tells the server which environment a node belongs to
- Command line: `puppet agent -t --environment feature_add_nginx` (great for testing a branch on a single node before merging)
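As an illustration of the ENC option, here is a minimal sketch of a classifier script. It assumes the server calls the script with the node's certname as its first argument and reads a YAML document with `environment` and `classes` keys from standard output. The node data and the `role::webserver` class are invented for the example.

```python
import sys

# Hypothetical classification data; a real ENC would query a CMDB or an API.
NODES = {
    "webserver1.example.com": {
        "environment": "production",
        "classes": ["role::webserver"],
    },
}
DEFAULT = {"environment": "production", "classes": []}


def classify(certname: str) -> str:
    """Return the ENC YAML document for one node."""
    node = NODES.get(certname, DEFAULT)
    lines = ["---", f"environment: {node['environment']}"]
    if node["classes"]:
        lines.append("classes:")
        # Quote class names so the "::" separator is always read as a string.
        lines.extend(f'  - "{name}"' for name in node["classes"])
    else:
        lines.append("classes: []")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The certname is expected as the first command-line argument.
    certname = sys.argv[1] if len(sys.argv) > 1 else "webserver1.example.com"
    sys.stdout.write(classify(certname))
```

On the server, a script like this would typically be wired in through the `node_terminus = exec` and `external_nodes` settings in `puppet.conf`; confirm the exact settings for your version before relying on them.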
Modules are how you organize and share Puppet code. A module is a directory with a specific structure:
mymodule/
├── manifests/
│ ├── init.pp ← Main class (class mymodule)
│ ├── config.pp ← Configuration subclass
│ ├── install.pp ← Package installation
│ └── service.pp ← Service management
├── files/ ← Static files to distribute
├── templates/ ← ERB or EPP templates
├── lib/
│ ├── facter/ ← Custom facts
│ └── puppet/
│ ├── functions/ ← Custom functions
│ └── types/ ← Custom types and providers
├── data/ ← Module-level Hiera data
├── spec/ ← Tests (rspec-puppet)
├── examples/ ← Usage examples
└── metadata.json ← Module metadata (name, version, deps)
The Puppet Forge hosts thousands of community modules. Because OpenVox is a drop-in replacement for Puppet, all Forge modules work with OpenVox without modification.
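When scaffolding a module by hand, a quick sanity check on `metadata.json` can catch omissions early. The sketch below checks only the three fields named in the tree above (name, version, dependencies); the real metadata format has more fields, and `example-mymodule` is a made-up module name.

```python
import json
import tempfile
from pathlib import Path

# Keys named in the module tree above; the full metadata.json format has more.
EXPECTED_KEYS = ("name", "version", "dependencies")


def missing_metadata_keys(module_dir: Path) -> list:
    """Return the expected keys that metadata.json does not define."""
    text = (module_dir / "metadata.json").read_text(encoding="utf-8")
    metadata = json.loads(text)
    return [key for key in EXPECTED_KEYS if key not in metadata]


# Build a throwaway module directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    module_dir = Path(tmp) / "mymodule"
    (module_dir / "manifests").mkdir(parents=True)
    (module_dir / "metadata.json").write_text(
        json.dumps({"name": "example-mymodule", "version": "0.1.0"}),
        encoding="utf-8",
    )
    missing = missing_metadata_keys(module_dir)

print(missing)
```

Here the check reports `dependencies` as missing, because the sample file defines only a name and a version.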
Here's a complete picture of how data flows through the system:
┌─────────────┐          ┌──────────────┐
│  Git Repo   │──r10k───►│  Code Dir    │
│  (control   │  deploy  │  /etc/puppet │
│   repo)     │          │  labs/code/  │
└─────────────┘          └──────┬───────┘
                                │
┌─────────────┐          ┌──────┴───────┐          ┌──────────┐
│   Agent     │──facts──►│ PuppetServer │──store──►│ PuppetDB │
│   (node)    │          │  (compiler)  │          │ (Postgres│
│             │◄─catalog─│              │◄─query───│  backend)│
│             │          └──────────────┘          └──────────┘
│             │
│  ┌────────┐ │
│  │ Facter │ │  (gathers facts)
│  └────────┘ │
│  ┌────────┐ │
│  │ Report │─┼──report──► PuppetDB
│  └────────┘ │
└─────────────┘
| Term | Definition |
|---|---|
| Agent | The software running on managed nodes that applies catalogs |
| Catalog | A compiled document describing all resources and their desired state |
| CA | Certificate Authority — manages SSL certificates for mTLS |
| Certname | A node's unique identifier (usually its FQDN) |
| ENC | External Node Classifier — assigns classes/parameters to nodes |
| Environment | An isolated set of Puppet code (production, staging, etc.) |
| Fact | A piece of information about a node (OS, IP, RAM, etc.) |
| Forge | The Puppet Forge — community module repository |
| Hiera | Hierarchical data lookup system |
| Idempotent | Can be applied repeatedly with the same result |
| Manifest | A .pp file containing Puppet code |
| Module | A self-contained bundle of Puppet code |
| mTLS | Mutual TLS — both client and server verify each other's certificates |
| Node | A managed system (server, VM, container, etc.) |
| OpenBolt | OpenVox's name for Bolt; package openbolt, binary still bolt |
| OpenFact | OpenVox's name for Facter; binary still facter |
| OpenVoxDB | OpenVox's name for PuppetDB; packages openvoxdb, openvoxdb-termini |
| PQL | Puppet Query Language — SQL-like language for querying PuppetDB |
| Primary Server | The central PuppetServer that compiles catalogs |
| Resource | A single unit of configuration (file, package, service, etc.) |
| r10k | Tool for deploying Puppet code from Git |
This document was created with the assistance of AI (Grok, xAI). All technical content has been reviewed and verified by human contributors.