📌 Lecture 7 – Configuration Management with Ansible: Idempotent, Declarative, in Git


📍 Slide 1 – 💥 The Server Nobody Could Reproduce

  • 🗓️ A real-world startup, ~2014: web-03 has been serving production for two years. Nobody has SSH'd in for six months. It "just works"
  • 💀 The disk fails. The team tries to rebuild from the runbook
  • 🪦 The runbook is 3 pages long. The software actually installed, per dpkg --list, is 412 packages with no version tracking
  • 🚪 Three days of downtime while engineers reverse-engineer what was on the box
  • 🎓 Lesson: A server that can't be rebuilt from a text file is a liability, not an asset. Configuration Management is the discipline that fixes this

🤔 Think: If your laptop died right now, could you rebuild your QuickNotes VM from a text file in 5 minutes? Or would you spend an evening clicking?


📍 Slide 2 – 🎯 Learning Outcomes

| # | 🎓 Outcome |
|---|------------|
| 1 | ✅ Explain what Configuration Management is and the cattle-vs-pets connection |
| 2 | ✅ Recognize the major tools: CFEngine, Puppet, Chef, Ansible, Salt |
| 3 | ✅ Define idempotency and why it's the magic property |
| 4 | ✅ Write a small Ansible playbook with tasks, handlers, and inventory |
| 5 | ✅ Use modules: apt, copy, template, systemd, service |
| 6 | ✅ Deploy QuickNotes to a Vagrant VM with a single ansible-playbook |

📍 Slide 3 – 🗺️ Lecture Overview

graph LR
    A["🐄 Cattle-vs-Pets"] --> B["📜 History of CM"]
    B --> C["♻️ Idempotency"]
    C --> D["📄 Playbooks"]
    D --> E["🧰 Modules"]
    E --> F["📦 Roles & Inventory"]
    F --> G["⬇️ ansible-pull (GitOps preview)"]

  • 📍 Slides 1-4: Why CM exists; the major tools
  • 📍 Slides 5-8: Idempotency, Ansible's model, playbooks, inventory
  • 📍 Slides 9-13: Modules, templates, handlers, roles, Vault
  • 📍 Slides 14-18: ansible-pull, antipatterns, speed, a real story, Lab 7
  • 📍 Slides 19-20: Takeaways + resources

📍 Slide 4 – 📜 A Short History of Configuration Management

| Year | Tool | What it added |
|------|------|---------------|
| 1993 | CFEngine (Mark Burgess) | Promise theory; declarative; agent-based |
| 2005 | Puppet (Luke Kanies) | Ruby DSL; large enterprise adoption |
| 2009 | Chef (Adam Jacob) | Pure Ruby "recipes"; ordered execution |
| 2011 | Salt (Thomas Hatch) | Event-driven; ZeroMQ transport |
| 2012 | Ansible (Michael DeHaan) | Agentless via SSH; YAML; flat learning curve |
| 2013 | Ansible Galaxy | Shareable roles community |
| 2015 | Ansible acquired by Red Hat | Becomes corporate-supported |
  • 🎯 Ansible won the mindshare in the 2015-2020 era for one reason: no agent. SSH is everywhere; YAML is readable
  • πŸ€– In 2026 Ansible still dominates the on-prem and VM space; Kubernetes ate the container-orchestration use case

📍 Slide 5 – ♻️ The Magic Word: Idempotency

💡 Idempotent (adj.): describes an operation that leaves the system in the same end state whether it runs once or a hundred times.

| Operation | Idempotent? | Why it matters |
|-----------|-------------|----------------|
| apt install -y nginx | ✅ | Run twice → still installed |
| echo "x" >> /etc/hosts | ❌ | Run twice → two copies |
| Ansible lineinfile module | ✅ | Ensures exactly the line you want |
| Ansible command module | ❌ by default (you must add creates: / changed_when:) | Pure side effect |
| kubectl apply -f deployment.yaml | ✅ | The whole declarative model |

  • 🛡️ Idempotency is what lets you re-run a playbook safely after a partial failure
  • 🪤 Most config-mgmt tool modules are idempotent by construction; that's their value over shell scripts
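
The contrast between the appending shell command and the lineinfile module can be sketched as two tasks. This is a minimal illustration, not a recommended play: the hosts entry is an assumed example value, and the first task exists only to show the failure mode.

```yaml
# Sketch only: the hosts entry "10.0.1.10 qn-prod-1" is an illustrative assumption.
- name: Idempotent vs non-idempotent edits to /etc/hosts
  hosts: all
  become: true
  tasks:
    # Not idempotent: every run appends another copy and always reports "changed".
    - name: Append a hosts entry with the shell (shown for contrast, avoid)
      ansible.builtin.shell:
        cmd: 'echo "10.0.1.10 qn-prod-1" >> /etc/hosts'

    # Idempotent: the line is added only when missing, so a second run reports "ok".
    - name: Ensure the hosts entry is present exactly once
      ansible.builtin.lineinfile:
        path: /etc/hosts
        line: "10.0.1.10 qn-prod-1"
        state: present
```

Running the play twice should leave two shell-appended copies but still only one lineinfile-managed line, which is the difference the table describes.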

📍 Slide 6 – 🔌 Ansible's Mental Model

graph LR
    C["💻 Control node<br/>your laptop / CI"] -- "SSH + YAML" --> M1["🖥️ managed node 1"]
    C -- "SSH + YAML" --> M2["🖥️ managed node 2"]
    C -- "SSH + YAML" --> M3["🖥️ managed node 3"]

| Term | What it is |
|------|------------|
| Control node | Where Ansible runs (your laptop, a CI job). Needs Python + Ansible installed |
| Managed node | The target. Needs only Python (already on most Linuxes). No agent |
| Inventory | The list of managed nodes: INI, YAML, or dynamic (cloud-discovered) |
| Playbook | A YAML file describing the desired state |
| Module | A pre-built atomic action (install package, copy file, restart service) |
| Role | A reusable bundle of tasks/templates/vars |

  • 🎯 No agent on the target → installing Ansible on a fleet is "install on your laptop once"

📍 Slide 7 – 📄 The Playbook, Sample

# playbook.yaml
- name: Install and run QuickNotes
  hosts: quicknotes_vm
  become: true                # ✅ run with sudo
  vars:
    quicknotes_version: "0.1.0"
    listen_addr: ":8080"

  tasks:
    - name: Create system user
      user:
        name: quicknotes
        system: true
        shell: /usr/sbin/nologin

    - name: Copy binary
      copy:
        src: "files/quicknotes-{{ quicknotes_version }}"
        dest: /usr/local/bin/quicknotes
        owner: quicknotes
        mode: "0755"
      notify: restart quicknotes

    - name: Install systemd unit
      template:
        src: templates/quicknotes.service.j2
        dest: /etc/systemd/system/quicknotes.service
      notify: restart quicknotes

    - name: Enable + start service
      systemd:
        name: quicknotes
        enabled: true
        state: started
        daemon_reload: true

  handlers:
    - name: restart quicknotes
      systemd:
        name: quicknotes
        state: restarted

📍 Slide 8 – 📒 Inventory: Who Are We Targeting?

# inventory.ini
[quicknotes_vm]
qn-vm-1 ansible_host=127.0.0.1 ansible_port=2222 ansible_user=vagrant

[production]
qn-prod-1 ansible_host=10.0.1.10
qn-prod-2 ansible_host=10.0.1.11

[production:vars]
listen_addr=":80"

# inventory.yaml (the modern style)
all:
  children:
    quicknotes_vm:
      hosts:
        qn-vm-1:
          ansible_host: 127.0.0.1
          ansible_port: 2222
          ansible_user: vagrant
  • 🏷️ Groups let you target subsets: hosts: production, hosts: quicknotes_vm
  • πŸ€– Dynamic inventory queries AWS/GCP/Azure/Hetzner at runtime β€” no manual list to maintain
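
For comparison, the [production] group and its [production:vars] section from the INI example could be written in the YAML inventory style as follows. This is a direct translation sketch using the same host names, addresses, and variable.

```yaml
# inventory.yaml: YAML form of the [production] group from inventory.ini
all:
  children:
    production:
      hosts:
        qn-prod-1:
          ansible_host: 10.0.1.10
        qn-prod-2:
          ansible_host: 10.0.1.11
      vars:
        # Equivalent of the [production:vars] section
        listen_addr: ":80"
```

A play with `hosts: production` would then target both hosts and see `listen_addr` as ":80".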

📍 Slide 9 – 🧰 The Five Modules You'll Use 80% of the Time

| Module | What it does | Idempotent |
|--------|--------------|------------|
| apt / dnf / package | Install/remove packages | ✅ |
| copy / template | Put files on the target | ✅ |
| file | Manage permissions, symlinks, directories | ✅ |
| service / systemd | Start/stop/enable services | ✅ |
| user / group | Manage users & groups | ✅ |

# ✅ idempotent: ensures the file's owner + mode, only changes what's wrong
- name: Configure quicknotes data dir
  file:
    path: /var/lib/quicknotes
    state: directory
    owner: quicknotes
    group: quicknotes
    mode: "0750"
  • 🪤 shell: and command: are escape hatches; use them only as a last resort, because you lose the built-in idempotency
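
When an escape hatch is unavoidable, `creates:` and `changed_when:` restore honest change reporting. The sketch below is illustrative only; the touch command stands in for any one-shot command, and the file module would normally be the better choice for this particular job.

```yaml
# Sketch: making command tasks behave idempotently. Paths reuse the QuickNotes examples.
- name: Make command tasks report changes honestly
  hosts: quicknotes_vm
  become: true
  tasks:
    # creates: skips the command entirely once the named file exists.
    - name: Create the notes file once
      ansible.builtin.command:
        cmd: touch /var/lib/quicknotes/notes.json
        creates: /var/lib/quicknotes/notes.json

    # changed_when: false marks a read-only command as never changing anything.
    - name: Query service state without reporting a change
      ansible.builtin.command:
        cmd: systemctl is-active quicknotes
      register: qn_state
      changed_when: false
      failed_when: false   # a non-zero exit just means "not active" here
```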

📍 Slide 10 – 🪞 Templates with Jinja2

# templates/quicknotes.service.j2
[Unit]
Description=QuickNotes API
After=network-online.target

[Service]
ExecStart=/usr/local/bin/quicknotes
Restart=on-failure
User=quicknotes
Environment=ADDR={{ listen_addr }}
Environment=DATA_PATH=/var/lib/quicknotes/notes.json

[Install]
WantedBy=multi-user.target
  • 🧠 Variables flow in from the playbook β†’ group_vars β†’ host_vars β†’ CLI -e key=value
  • πŸͺ„ Same template, different values per environment (listen_addr: :8080 in dev, :80 in prod)
  • βœ… The template module regenerates the file only if the rendered output differs β€” and triggers handlers if so
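
One way to get per-environment values is to keep `listen_addr` out of the play and define it per group instead. The sketch below assumes two hypothetical group_vars files (named in the comments); because play-level vars outrank group_vars in Ansible's precedence rules, the play deliberately does not set the variable.

```yaml
# Assumed companion files (hypothetical names, one per inventory group):
#   group_vars/quicknotes_vm.yaml  ->  listen_addr: ":8080"
#   group_vars/production.yaml     ->  listen_addr: ":80"
- name: Render the unit file per environment
  hosts: all
  become: true
  tasks:
    - name: Install systemd unit from the shared template
      ansible.builtin.template:
        src: templates/quicknotes.service.j2
        dest: /etc/systemd/system/quicknotes.service
        mode: "0644"

    - name: Show the value this host resolved
      ansible.builtin.debug:
        var: listen_addr
```

Passing `-e listen_addr=:9090` on the command line would override both group files for that run.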

📍 Slide 11 – 🔔 Handlers: Run Only When Something Changed

tasks:
  - name: Install nginx config
    template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: reload nginx       # ✅ notify only if file actually changed

handlers:
  - name: reload nginx
    service:
      name: nginx
      state: reloaded
  • 🪤 If the config didn't change, the handler does not fire, so there are no unnecessary reloads
  • ⏳ Handlers run at the end of the play (or use meta: flush_handlers to force them earlier)
  • 🛡️ This is the Ansible pattern for "config changed → restart only this service"
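
Forcing handlers to run mid-play matters when a later task depends on the reloaded service. The following sketch assumes a `production` group and that nginx answers with HTTP 200 on its local root URL; both are assumptions for illustration.

```yaml
# Sketch: flush handlers before a verification step (target group and URL are assumptions)
- name: Configure nginx and verify it immediately
  hosts: production
  become: true
  tasks:
    - name: Install nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: reload nginx

    # Without this, the reload would wait until the end of the play.
    - name: Run pending handlers now
      ansible.builtin.meta: flush_handlers

    - name: Check that nginx answers after the reload
      ansible.builtin.uri:
        url: http://127.0.0.1/
        status_code: 200

  handlers:
    - name: reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
```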

📍 Slide 12 – 📦 Roles: Reusable Building Blocks

roles/quicknotes/
├── tasks/main.yaml         # the task list
├── handlers/main.yaml      # restart, reload
├── templates/
│   └── quicknotes.service.j2
├── files/
│   └── quicknotes-0.1.0    # static binary
├── defaults/main.yaml      # overridable variables
├── vars/main.yaml          # role-pinned variables
└── meta/main.yaml          # dependencies on other roles

# playbook just composes roles
- hosts: quicknotes_vm
  roles:
    - common
    - quicknotes
  • 🀝 Ansible Galaxy (galaxy.ansible.com) hosts community roles β€” great for ssh-hardening, nginx, postgres
  • ⚠️ Like any third-party content, review before you trust it

📍 Slide 13 – 🔐 Secrets: Ansible Vault

# encrypt a vars file (interactive password)
ansible-vault create group_vars/production/vault.yaml
ansible-vault edit  group_vars/production/vault.yaml
ansible-vault view  group_vars/production/vault.yaml

# run a playbook using the password
ansible-playbook -i inventory.ini play.yaml --ask-vault-pass
  • 🔒 AES-256 symmetric encryption; password stays out of Git
  • 🤖 In CI, password lives in a CI secret and is passed via --vault-password-file
  • 🛡️ Vault is good for "config-time" secrets; for "deploy-time cloud creds", prefer OIDC (Lecture 3)
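
A common convention, sketched below, is to keep a readable vars file that points at `vault_`-prefixed names stored in the encrypted file. All variable names, the token value, and the env-file path here are hypothetical; the listing shows three separate files as one multi-document stream.

```yaml
# group_vars/production/vars.yaml: plain text, references the encrypted value
quicknotes_api_token: "{{ vault_quicknotes_api_token }}"
---
# group_vars/production/vault.yaml: contents as seen via `ansible-vault edit`
vault_quicknotes_api_token: "replace-me"
---
# Task excerpt: use the secret without echoing it in playbook output
- name: Write the token to an env file
  ansible.builtin.copy:
    content: "API_TOKEN={{ quicknotes_api_token }}\n"
    dest: /etc/quicknotes.env   # hypothetical path
    owner: root
    group: quicknotes
    mode: "0640"
  no_log: true
```

With this split, a search for a variable name still finds it in plain text, while the secret itself appears only inside the encrypted file.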

📍 Slide 14 – ⬇️ ansible-pull: The GitOps Preview

Instead of pushing from a control node, let the target pull from Git on a schedule:

# on the managed node (or systemd timer)
ansible-pull \
  -U https://github.com/inno-devops-labs/quicknotes.git \
  -i hosts.ini \
  ansible/playbook.yaml

graph LR
    Repo["📚 Git repo (truth)"] -. clone/pull every 5 min .-> Node1["🖥️ Node 1"]
    Repo -. clone/pull every 5 min .-> Node2["🖥️ Node 2"]
    Repo -. clone/pull every 5 min .-> Node3["🖥️ Node 3"]

  • 🌟 This is the same pattern as ArgoCD/Flux: Git is the truth, the agent pulls and converges
  • 🎁 Lab 7 Bonus task wires this up via a systemd timer, your first GitOps experience

💬 "Git → pull → reconcile": the spine of every modern deploy system, just at different abstraction levels
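
The scheduling half of the loop can itself be expressed as a play. The sketch below swaps in a cron entry instead of a systemd timer purely to keep the example short; it assumes a Debian/Ubuntu target (apt), a hypothetical log path, and that hosts.ini maps the node to itself with a local connection.

```yaml
# Sketch: cron is used here in place of a systemd timer; log path is an assumption.
- name: Schedule ansible-pull on the managed node
  hosts: quicknotes_vm
  become: true
  tasks:
    - name: Install Ansible and git so the node can pull and apply by itself
      ansible.builtin.apt:
        name:
          - ansible
          - git
        state: present
        update_cache: true

    - name: Pull and apply the repo playbook every 5 minutes
      ansible.builtin.cron:
        name: quicknotes ansible-pull
        minute: "*/5"
        user: root
        job: >-
          ansible-pull
          -U https://github.com/inno-devops-labs/quicknotes.git
          -i hosts.ini
          ansible/playbook.yaml
          >> /var/log/ansible-pull.log 2>&1
```

Because the cron module is idempotent on the entry's `name`, re-running this play updates the same crontab line rather than adding duplicates.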


📍 Slide 15 – ❌ Ansible Antipatterns

| 🔥 Antipattern | ✅ Better |
|----------------|-----------|
| Long playbooks (1000+ lines) | Roles + role dependencies |
| shell: everywhere instead of modules | Use apt, file, systemd modules first |
| Hard-coded paths in tasks | Variables in defaults/main.yaml, overridable |
| Plaintext secrets in vars.yaml | Ansible Vault |
| Running playbooks against unknown hosts | Inventory groups; --limit flag |
| gather_facts: true (default) when you don't need it | gather_facts: false saves 5-30 s on every run |
| One playbook for 100 unrelated tasks | Tag tasks (tags: [config, restart]) and run subsets |
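
Two of these fixes, tags and skipping fact gathering, can be sketched on the QuickNotes tasks. The tag names are illustrative, and the play assumes `listen_addr` is supplied by the inventory or group_vars.

```yaml
# Sketch: tagged tasks with fact gathering disabled (tag names are illustrative)
- name: QuickNotes with tagged tasks
  hosts: quicknotes_vm
  become: true
  gather_facts: false   # safe here: nothing below reads ansible_* facts
  tasks:
    - name: Install systemd unit
      ansible.builtin.template:
        src: templates/quicknotes.service.j2
        dest: /etc/systemd/system/quicknotes.service
      tags: [config]
      notify: restart quicknotes

    - name: Enable + start service
      ansible.builtin.systemd:
        name: quicknotes
        enabled: true
        state: started
        daemon_reload: true
      tags: [service]

  handlers:
    - name: restart quicknotes
      ansible.builtin.systemd:
        name: quicknotes
        state: restarted
```

A run such as `ansible-playbook -i inventory.ini playbook.yaml --tags config --limit quicknotes_vm` would then execute only the template task, and only on that group.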

📍 Slide 16 – 🏎️ Speed: Forks, Pipelining, Mitogen

# ansible.cfg
# Comments sit on their own lines: ansible.cfg accepts "#" only at line start
[defaults]
# ✅ parallel hosts; default 5
forks = 20
# OK for ephemeral CI hosts
host_key_checking = false
# ✅ cache facts where safe
gathering = smart

[ssh_connection]
# ✅ ~2x faster on slow links
pipelining = true
control_path = ~/.ansible/cp/%%h-%%p-%%r
  • ⚑ Pipelining sends fewer SSH round-trips per task β€” huge wins on high-latency links
  • 🐍 Mitogen for Ansible β€” drops the right Python connection model in; 1.25-7Γ— speed-ups on real workloads
  • πŸ§ͺ Lab 7 plays will finish in under 30 seconds on a single VM β€” measure before/after

📍 Slide 17 – 📜 Real Story: A Better Knight Capital

Recall Lecture 1's Knight Capital story: a manual deploy missed one server out of eight, leading to a $440M loss.

How would proper config mgmt have prevented it?

  • 🔁 ansible-playbook -i prod-inventory deploy.yaml: all 8 hosts in a single invocation
  • ✅ Pre-deploy: --check (dry-run) confirms what will change
  • 🪪 Post-deploy: a task verifies the binary checksum on each host
  • 🚨 With any_errors_fatal: true, if even one host fails, the play aborts and reports (by default Ansible would drop the failed host and continue on the rest)
  • ⏳ Total wall-clock time: 2 minutes. Manual checklist time: ~45 minutes (and one server missed)

🎓 The lesson isn't "Ansible would have saved Knight." It's that making deploys atomic and verified is what saves you; Ansible is one good tool that helps you do it.
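
The checklist above could be sketched as a single play. This is an illustration under stated assumptions, not Knight's or the course's actual deploy: the `production` group, the file paths, and the `expected_sha256` variable (passed with `-e expected_sha256=<hex digest>`) are all hypothetical.

```yaml
# deploy.yaml: hypothetical sketch; group, paths, and expected_sha256 are assumptions
- name: Deploy and verify on every host
  hosts: production
  become: true
  any_errors_fatal: true   # one failed host aborts the play for all hosts
  tasks:
    - name: Copy binary
      ansible.builtin.copy:
        src: files/quicknotes
        dest: /usr/local/bin/quicknotes
        mode: "0755"

    - name: Read the checksum of the installed binary
      ansible.builtin.stat:
        path: /usr/local/bin/quicknotes
        checksum_algorithm: sha256
      register: installed

    - name: Fail unless the installed binary matches the release
      ansible.builtin.assert:
        that:
          - installed.stat.exists
          - installed.stat.checksum == expected_sha256
        fail_msg: "Binary checksum mismatch on {{ inventory_hostname }}"
      when: not ansible_check_mode   # skip during --check, where nothing was copied
```

Running it first with `--check` previews which hosts would change; the real run then either verifies all hosts or stops loudly.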


📍 Slide 18 – 🧪 Lab 7 Preview: Deploy QuickNotes via Ansible

  • 🔨 Task 1 (6 pts): Write ansible/playbook.yaml + a Jinja2 systemd unit + an inventory targeting your Lab 5 VirtualBox VM. Run ansible-playbook to deploy QuickNotes; curl :8080/health from the host
  • ♻️ Task 2 (4 pts): Demonstrate idempotency: run the playbook twice; verify changed=0 the second time. Then change one variable, re-run, verify only the affected handlers fire
  • 🎁 Bonus (2 pts): Wire ansible-pull via a systemd timer on the VM so it auto-converges every 5 minutes from the course repo. Edit something in Git; watch the VM heal
  • 📜 Deliverable: submissions/lab7.md with playbook output, idempotency proof, and reflection

📍 Slide 19 – 🧠 Key Takeaways

  1. 🐄 Config Management is what makes cattle-vs-pets executable: your servers exist because of a text file
  2. ♻️ Idempotency is the key property: re-runs are safe; partial failures are recoverable
  3. 🤝 Ansible's win: agentless; SSH + Python on the target is enough
  4. 📦 Roles for reuse, templates for variation, vault for secrets: three patterns, used together
  5. 🔁 Handlers fire only on change, so there are no needless restarts
  6. ⬇️ ansible-pull = Git → target convergence loop, the same pattern that powers ArgoCD, Flux, and every modern deploy system

📍 Slide 20 – 🚀 What's Next + 📚 Resources

  • 📍 Next lecture: SRE & Monitoring (golden signals, Prometheus, dashboards)
  • 🧪 Lab 7: Deploy QuickNotes to your VirtualBox VM via Ansible; demonstrate idempotency; Bonus: ansible-pull GitOps
  • 📖 Read this week:
    • 📕 Ansible Up & Running, by Bas Meijer, Lorin Hochstein & René Moser (3rd ed): Chapters 1-6
    • 📗 Ansible for DevOps, by Jeff Geerling: free draft + paid full edition
    • 📘 Ansible docs: User Guide
    • 📝 The Cathedral and the Bazaar, by Eric Raymond: for the broader OSS context behind tools like Ansible
  • 🛠️ Tools to install this week: Ansible 10.x (Python 3.11+), ansible-lint, optionally Mitogen

graph LR
    P["🐳 Week 6<br/>Containers"] --> Y["📍 You Are Here<br/>Ansible CM"]
    Y --> N["📊 Week 8<br/>SRE & Monitoring"]
    N --> M["🛡️ Week 9<br/>DevSecOps"]

🎯 Remember: The discipline is "everything that runs on a server is described in a file in Git"; Ansible is one good way to express that file. The discipline outlives the tool.