- A real-world startup, ~2014: `web-03` has been serving production for two years. Nobody has SSH'd in for six months. It "just works"
- The disk fails. The team tries to rebuild from the runbook
- The runbook is 3 pages long. The installed software, per `dpkg --list`, comes to 412 packages with no version tracking
- Three days of downtime while engineers reverse-engineer what was on the box
- Lesson: A server that can't be rebuilt from a text file is a liability, not an asset. Configuration Management is the discipline that fixes this

Think: If your laptop died right now, could you rebuild your QuickNotes VM from a text file in 5 minutes? Or would you spend an evening clicking?
| # | Outcome |
|---|---|
| 1 | Explain what Configuration Management is and how it connects to cattle-vs-pets |
| 2 | Recognize the major tools: CFEngine, Puppet, Chef, Ansible, Salt |
| 3 | Define idempotency and why it's the magic property |
| 4 | Write a small Ansible playbook with tasks, handlers, and inventory |
| 5 | Use modules: `apt`, `copy`, `template`, `systemd`, `service` |
| 6 | Deploy QuickNotes to a Vagrant VM with a single `ansible-playbook` run |
graph LR
    A["Cattle-vs-Pets"] --> B["History of CM"]
    B --> C["Idempotency"]
    C --> D["Playbooks"]
    D --> E["Modules"]
    E --> F["Roles & Inventory"]
    F --> G["ansible-pull (GitOps preview)"]
- Slides 1-5: Why CM exists; the four big tools
- Slides 6-10: Ansible's model, playbooks, idempotency
- Slides 11-15: Modules, templates, handlers, roles
- Slides 16-19: ansible-pull, antipatterns, Lab 7
- Slides 20-21: Takeaways + resources
| Year | Tool | What it added |
|---|---|---|
| 1993 | CFEngine (Mark Burgess) | Promise theory; declarative; agent-based |
| 2005 | Puppet (Luke Kanies) | Declarative DSL (implemented in Ruby); large enterprise adoption |
| 2009 | Chef (Adam Jacob) | Pure Ruby "recipes"; ordered execution |
| 2011 | Salt (Thomas Hatch) | Event-driven; ZeroMQ transport |
| 2012 | Ansible (Michael DeHaan) | Agentless via SSH; YAML; flat learning curve |
| 2013 | Ansible Galaxy | Shareable roles community |
| 2015 | Ansible acquired by Red Hat | Becomes corporate-supported |
- Ansible won the mindshare in the 2015-2020 era mainly because it needs no agent: SSH is already everywhere, and YAML is readable
- In 2026 Ansible still dominates the on-prem and VM space; Kubernetes ate the container-orchestration use case
Idempotent (adj.): An operation is idempotent if running it once produces the same end state as running it a hundred times.
| Operation | Idempotent? | Why it matters |
|---|---|---|
| `apt install -y nginx` | Yes | Run twice → still installed |
| `echo "x" >> /etc/hosts` | No | Run twice → two copies |
| Ansible `lineinfile` module | Yes | Ensures exactly the line you want |
| Ansible `command` module | No by default (you must add `creates:` / `changed_when:`) | Pure side effect |
| `kubectl apply -f deployment.yaml` | Yes | The whole declarative model |
- Idempotency is what lets you re-run a playbook safely after a partial failure
- Most configuration-management modules are idempotent by construction; that is their value over shell scripts
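As a minimal sketch of the `echo >>` versus `lineinfile` contrast in the table above, the task below appends a line only if it is missing. The IP/hostname pair is a made-up example value, not something from the course setup.

```yaml
# Sketch: idempotent replacement for `echo "10.0.1.10 qn-prod-1" >> /etc/hosts`.
# The hosts entry is a hypothetical example value.
- name: Ensure the hosts entry is present exactly once
  lineinfile:
    path: /etc/hosts
    line: "10.0.1.10 qn-prod-1"
    state: present
```

On the first run this task should report `changed`; on later runs it should report `ok` because the line already exists, whereas the `echo` version would keep appending duplicates.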
graph LR
    C["Control node<br/>your laptop / CI"] -- "SSH + YAML" --> M1["managed node 1"]
    C -- "SSH + YAML" --> M2["managed node 2"]
    C -- "SSH + YAML" --> M3["managed node 3"]
| Term | What it is |
|---|---|
| Control node | Where Ansible runs (your laptop, a CI job). Needs Python + Ansible installed |
| Managed node | The target. Needs only SSH access and Python (already on most Linuxes). No agent |
| Inventory | The list of managed nodes: INI, YAML, or dynamic (cloud-discovered) |
| Playbook | A YAML file describing the desired state |
| Module | A pre-built atomic action (install package, copy file, restart service) |
| Role | A reusable bundle of tasks/templates/vars |
- No agent on the target: installing Ansible on a fleet is "install on your laptop once"
# playbook.yaml
- name: Install and run QuickNotes
  hosts: quicknotes_vm
  become: true   # run with sudo

  vars:
    quicknotes_version: "0.1.0"
    listen_addr: ":8080"

  tasks:
    - name: Create system user
      user:
        name: quicknotes
        system: true
        shell: /usr/sbin/nologin

    - name: Copy binary
      copy:
        src: "files/quicknotes-{{ quicknotes_version }}"
        dest: /usr/local/bin/quicknotes
        owner: quicknotes
        mode: "0755"
      notify: restart quicknotes

    - name: Install systemd unit
      template:
        src: templates/quicknotes.service.j2
        dest: /etc/systemd/system/quicknotes.service
      notify: restart quicknotes

    - name: Enable + start service
      systemd:
        name: quicknotes
        enabled: true
        state: started
        daemon_reload: true

  handlers:
    - name: restart quicknotes
      systemd:
        name: quicknotes
        state: restarted

# inventory.ini
[quicknotes_vm]
qn-vm-1 ansible_host=127.0.0.1 ansible_port=2222 ansible_user=vagrant

[production]
qn-prod-1 ansible_host=10.0.1.10
qn-prod-2 ansible_host=10.0.1.11

[production:vars]
listen_addr=":80"

# inventory.yaml (the modern style)
all:
  children:
    quicknotes_vm:
      hosts:
        qn-vm-1:
          ansible_host: 127.0.0.1
          ansible_port: 2222
          ansible_user: vagrant

- Groups let you target subsets: `hosts: production`, `hosts: quicknotes_vm`
- Dynamic inventory queries AWS/GCP/Azure/Hetzner at runtime, so there is no manual list to maintain
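To make the dynamic-inventory bullet concrete, here is a rough sketch of an inventory file for the `amazon.aws.aws_ec2` plugin. It assumes the `amazon.aws` collection and `boto3` are installed on the control node; the region and the `app` / `env` tag names are illustrative assumptions, not part of the course setup.

```yaml
# inventory.aws_ec2.yaml  (the plugin expects the name to end in aws_ec2.yml or aws_ec2.yaml)
plugin: amazon.aws.aws_ec2
regions:
  - eu-central-1                 # assumption: example region
filters:
  tag:app: quicknotes            # assumption: instances carry an `app` tag
  instance-state-name: running
keyed_groups:
  - key: tags.env                # assumption: an `env` tag such as "production"
    prefix: env                  # yields groups like env_production
```

Running `ansible-inventory -i inventory.aws_ec2.yaml --graph` should then list whatever hosts the cloud API returns, grouped by tag, with no hand-maintained host list.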
| Module | What it does | Idempotent |
|---|---|---|
| `apt` / `dnf` / `package` | Install/remove packages | Yes |
| `copy` / `template` | Put files on the target | Yes |
| `file` | Manage permissions, symlinks, directories | Yes |
| `service` / `systemd` | Start/stop/enable services | Yes |
| `user` / `group` | Manage users & groups | Yes |
# idempotent: ensures the directory exists with the right owner + mode; only changes what's wrong
- name: Configure quicknotes data dir
  file:
    path: /var/lib/quicknotes
    state: directory
    owner: quicknotes
    group: quicknotes
    mode: "0750"

- `shell:` and `command:` are escape hatches. Use them last; you lose idempotency
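When an escape hatch is unavoidable, `creates:` and `changed_when:` can restore most of the lost safety. The sketch below assumes a hypothetical `quicknotes init` subcommand and `--version` flag, purely for illustration.

```yaml
# Sketch: making `command:` safe to re-run.
# `quicknotes init` and `quicknotes --version` are hypothetical invocations.
- name: Initialise the data store once
  command: /usr/local/bin/quicknotes init --data /var/lib/quicknotes/notes.json
  args:
    creates: /var/lib/quicknotes/notes.json   # task is skipped if this file exists

- name: Read the installed version (changes nothing on the host)
  command: /usr/local/bin/quicknotes --version
  register: qn_version
  changed_when: false                          # report "ok" instead of "changed"
```

`creates:` stops the command from running a second time, while `changed_when: false` only affects reporting: the command still runs, but it no longer shows up as a change or triggers handlers.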
# templates/quicknotes.service.j2
[Unit]
Description=QuickNotes API
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/local/bin/quicknotes
Restart=on-failure
User=quicknotes
Environment=ADDR={{ listen_addr }}
Environment=DATA_PATH=/var/lib/quicknotes/notes.json

[Install]
WantedBy=multi-user.target

- Variables come from play `vars:`, `group_vars`, `host_vars`, and CLI `-e key=value`. Precedence: `host_vars` beat `group_vars`, play `vars:` beat both, and `-e` beats everything
- Same template, different values per environment (`listen_addr: ":8080"` in dev, `":80"` in prod); set per-environment values in inventory or `group_vars`, not in play `vars:`, which would override them
- The `template` module regenerates the file only if the rendered output differs, and triggers handlers if so
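One way to get different values per environment is to keep the default and the override in `group_vars` files. The file names below are hypothetical, and the sketch assumes `listen_addr` is not also set in the play's `vars:`, because play vars outrank `group_vars` in Ansible's precedence order.

```yaml
# group_vars/all.yaml  (hypothetical file: default for every host)
listen_addr: ":8080"
---
# group_vars/production.yaml  (hypothetical file: applies to the [production] group)
listen_addr: ":80"
```

With this layout, hosts in `production` would render the unit with `ADDR=:80` and everything else with `ADDR=:8080`; a one-off `-e listen_addr=:9090` on the command line would still override both.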
tasks:
  - name: Install nginx config
    template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: reload nginx   # notify only if the file actually changed

handlers:
  - name: reload nginx
    service:
      name: nginx
      state: reloaded

- If the config didn't change, the handler does not fire, so there are no unnecessary reloads
- Handlers run at the end of the play (or use `meta: flush_handlers` to force them earlier)
- This is the Ansible pattern for "config changed → restart only this service"
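A small sketch of the `meta: flush_handlers` case: the reload has to happen before a later task checks the service. The target group, the check URL, and the port are assumptions for illustration.

```yaml
# Sketch: force pending handlers to run mid-play, then verify the result.
- name: Update nginx config and verify it
  hosts: production                 # assumption: nginx runs on these hosts
  become: true
  tasks:
    - name: Install nginx config
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: reload nginx

    - name: Run pending handlers now instead of at the end of the play
      meta: flush_handlers

    - name: Check that nginx answers after the reload
      uri:
        url: http://localhost/      # assumption: nginx serves on port 80 locally
        status_code: 200

  handlers:
    - name: reload nginx
      service:
        name: nginx
        state: reloaded
```

Without the flush, the `uri` check would run against the old configuration, because the handler would only fire after the last task.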
roles/quicknotes/
├── tasks/main.yaml          # the task list
├── handlers/main.yaml       # restart, reload
├── templates/
│   └── quicknotes.service.j2
├── files/
│   └── quicknotes-0.1.0     # static binary
├── defaults/main.yaml       # overridable variables
├── vars/main.yaml           # role-pinned variables
└── meta/main.yaml           # dependencies on other roles

# playbook just composes roles
- hosts: quicknotes_vm
  roles:
    - common
    - quicknotes

- Ansible Galaxy (galaxy.ansible.com) hosts community roles; great for ssh-hardening, nginx, postgres

Warning: Like any third-party content, review before you trust it
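To show how the pieces of the role tree fit together, here is a sketch of what three of those files might contain. The values are carried over from the earlier playbook; treating `common` as a dependency is an assumption about how the course roles relate.

```yaml
# roles/quicknotes/defaults/main.yaml  (lowest precedence, meant to be overridden)
quicknotes_version: "0.1.0"
listen_addr: ":8080"
---
# roles/quicknotes/meta/main.yaml
dependencies:
  - role: common          # assumption: common must run before quicknotes
---
# roles/quicknotes/handlers/main.yaml
- name: restart quicknotes
  systemd:
    name: quicknotes
    state: restarted
```

Because role defaults sit at the bottom of the precedence order, inventory, `group_vars`, or `-e` can override them without editing the role.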
# create a new encrypted vars file (prompts for a password)
ansible-vault create group_vars/production/vault.yaml
ansible-vault edit group_vars/production/vault.yaml
ansible-vault view group_vars/production/vault.yaml

# run a playbook, prompting for the vault password
ansible-playbook -i inventory.ini play.yaml --ask-vault-pass

- AES-256 symmetric encryption; password stays out of Git
- In CI, the password lives in a CI secret and is passed via `--vault-password-file`
- Vault is good for "config-time" secrets; for "deploy-time cloud creds", prefer OIDC (Lecture 3)
Instead of pushing from a control node, let the target pull from Git on a schedule:
# on the managed node (run by hand or from a systemd timer)
ansible-pull \
  -U https://github.com/inno-devops-labs/quicknotes.git \
  -i hosts.ini \
  ansible/playbook.yaml

graph LR
    Repo["Git repo (truth)"] -. clone/pull every 5 min .-> Node1["Node 1"]
    Repo -. clone/pull every 5 min .-> Node2["Node 2"]
    Repo -. clone/pull every 5 min .-> Node3["Node 3"]

- This is the same pattern as ArgoCD/Flux: Git is the truth, the agent pulls and converges
- Lab 7 Bonus task wires this up via a systemd timer: your first GitOps experience

"Git → pull → reconcile." The spine of every modern deploy system, just at different abstraction levels
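One possible way to wire the pull loop up is to let a bootstrap playbook install a systemd service and timer on the node. This is a sketch under assumptions: the unit names and the `/usr/bin/ansible-pull` path are made up, and the node needs Ansible and Git installed, since `ansible-pull` runs locally on it.

```yaml
# Sketch: schedule ansible-pull every 5 minutes with a systemd timer.
# Unit names and the ansible-pull path are assumptions.
- name: Schedule ansible-pull
  hosts: quicknotes_vm
  become: true
  tasks:
    - name: Install the one-shot service that runs ansible-pull
      copy:
        dest: /etc/systemd/system/ansible-pull.service
        mode: "0644"
        content: |
          [Unit]
          Description=Converge this host from Git
          Wants=network-online.target
          After=network-online.target

          [Service]
          Type=oneshot
          ExecStart=/usr/bin/ansible-pull -U https://github.com/inno-devops-labs/quicknotes.git -i hosts.ini ansible/playbook.yaml

    - name: Install the timer that triggers the service
      copy:
        dest: /etc/systemd/system/ansible-pull.timer
        mode: "0644"
        content: |
          [Unit]
          Description=Run ansible-pull every 5 minutes

          [Timer]
          OnBootSec=2min
          OnUnitActiveSec=5min

          [Install]
          WantedBy=timers.target

    - name: Enable and start the timer
      systemd:
        name: ansible-pull.timer
        enabled: true
        state: started
        daemon_reload: true
```

After this runs once, the push from the control node is no longer needed: `systemctl list-timers` on the VM should show the timer, and each activation re-applies whatever is in Git.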
| Antipattern | Better |
|---|---|
| Long playbooks (1000+ lines) | Roles + role dependencies |
| `shell:` everywhere instead of modules | Use `apt`, `file`, `systemd` modules first |
| Hard-coded paths in tasks | Variables in `defaults/main.yaml`, overridable |
| Plaintext secrets in `vars.yaml` | Ansible Vault |
| Running playbooks against unknown hosts | Inventory groups; `--limit` flag |
| `gather_facts: true` (default) when you don't need it | `gather_facts: false` can save 5-30 s on every run |
| One playbook for 100 unrelated tasks | Tag tasks (`tags: [config, restart]`) and run subsets |
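The last two rows of the table can be sketched in one short play. The task split and tag names are illustrative, and the play assumes `listen_addr` is defined in inventory or `group_vars`.

```yaml
# Sketch: skip fact gathering and tag tasks so subsets can be run.
- name: Configure QuickNotes
  hosts: quicknotes_vm
  become: true
  gather_facts: false              # no task below uses ansible_facts
  tasks:
    - name: Install systemd unit
      template:
        src: templates/quicknotes.service.j2
        dest: /etc/systemd/system/quicknotes.service
      tags: [config]

    - name: Restart service on demand
      systemd:
        name: quicknotes
        state: restarted
        daemon_reload: true
      tags: [restart, never]       # `never`: runs only when --tags restart is given
```

`ansible-playbook -i inventory.ini playbook.yaml --tags config --limit quicknotes_vm` would then run just the config task against just that group.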
# ansible.cfg
# Comments sit on their own lines: ansible.cfg treats an inline "#" as part of the value
[defaults]
# parallel hosts; default 5
forks = 20
# OK for ephemeral CI hosts
host_key_checking = false
# cache facts where safe
gathering = smart

[ssh_connection]
# ~2x faster on slow links
pipelining = true
control_path = ~/.ansible/cp/%%h-%%p-%%r

- Pipelining needs fewer SSH round-trips per task: big wins on high-latency links
- Mitogen for Ansible swaps in a more efficient Python connection layer; reported speed-ups are 1.25-7× on real workloads
- Lab 7 plays should finish in under 30 seconds on a single VM; measure before/after
Recall Lecture 1's Knight Capital story: a manual deploy missed one server out of eight, for a $440M loss.
How would proper config mgmt have prevented it?
- `ansible-playbook -i prod-inventory deploy.yaml`: all 8 hosts at once, in a single `ansible-playbook` invocation
- Pre-deploy: `--check` (dry-run) confirms what will change
- Post-deploy: a task verifies the binary checksum on each host
- If even one host fails, the play aborts and reports (set `any_errors_fatal: true`; by default Ansible only drops the failed host and carries on with the rest)
- Total wall-clock time: 2 minutes. Manual checklist time: ~45 minutes (and one server missed)

The lesson isn't "Ansible would have saved Knight." It's that making deploys atomic and verified is what saves you; Ansible is one good tool that helps you do it.
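The checklist above could be sketched as a `deploy.yaml` along these lines. It is an illustration, not a reconstruction of Knight's systems: the `app_binary_sha256` variable, its placeholder value, and the file paths are hypothetical.

```yaml
# deploy.yaml: a sketch of "deploy everywhere, verify everywhere, stop on any failure".
# `app_binary_sha256` and the file paths are hypothetical.
- name: Deploy the same binary to every production host
  hosts: production
  become: true
  any_errors_fatal: true            # one failed host aborts the play for all hosts
  vars:
    app_binary_sha256: "REPLACE_WITH_EXPECTED_SHA256"
  tasks:
    - name: Copy binary
      copy:
        src: files/quicknotes-0.1.0
        dest: /usr/local/bin/quicknotes
        mode: "0755"

    - name: Read the checksum of the installed binary
      stat:
        path: /usr/local/bin/quicknotes
        checksum_algorithm: sha256
      register: installed

    - name: Fail unless the installed binary matches the expected checksum
      assert:
        that:
          - installed.stat.exists
          - installed.stat.checksum == app_binary_sha256
```

Note that under `--check` the copy is only simulated, so the checksum assertion reflects the binary that is currently installed; the verification is meaningful on the real run.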
- Task 1 (6 pts): Write `ansible/playbook.yaml` + a Jinja2 systemd unit + an inventory targeting your Lab 5 VirtualBox VM. Run `ansible-playbook` to deploy QuickNotes; `curl :8080/health` from the host
- Task 2 (4 pts): Demonstrate idempotency: run the playbook twice; verify `changed=0` the second time. Then change one variable, re-run, verify only the affected handlers fire
- Bonus (2 pts): Wire `ansible-pull` via a systemd timer on the VM so it auto-converges every 5 minutes from the course repo. Edit something in Git; watch the VM heal
- Deliverable: `submissions/lab7.md` with playbook output, idempotency proof, and reflection
- Config Management is what makes cattle-vs-pets executable: your servers exist because of a text file
- Idempotency is the key property: re-runs are safe; partial failures are recoverable
- Ansible's win: agentless. SSH + Python on the target is enough
- Roles for reuse, templates for variation, vault for secrets: three patterns, used together
- Handlers fire only on change, so no needless restarts
- ansible-pull = Git → target convergence loop, the same pattern that powers ArgoCD, Flux, and other GitOps deploy systems
- Next lecture: SRE & Monitoring (golden signals, Prometheus, dashboards)
- Lab 7: Deploy QuickNotes to your VirtualBox VM via Ansible; demonstrate idempotency; Bonus: ansible-pull GitOps
- Read this week:
  - Ansible: Up and Running, Bas Meijer, Lorin Hochstein & René Moser (3rd ed), Chapters 1-6
  - Ansible for DevOps, Jeff Geerling (free draft + paid full edition)
  - Ansible docs: User Guide
  - The Cathedral and the Bazaar, Eric Raymond, for the broader OSS context behind tools like Ansible
- Tools to install this week: Ansible 10.x (Python 3.11+), `ansible-lint`, optionally Mitogen
graph LR
    P["Week 6<br/>Containers"] --> Y["You Are Here<br/>Ansible CM"]
    Y --> N["Week 8<br/>SRE & Monitoring"]
    N --> M["Week 9<br/>DevSecOps"]
Remember: The discipline is "everything that runs on a server is described in a file in Git"; Ansible is one good way to express that file. The discipline outlives the tool.