Commit ca98109

author
HackTricks News Bot
committed
Add content from: Research Update Enhanced src/linux-hardening/privilege-escal...
1 parent d2b6049 commit ca98109

1 file changed

Lines changed: 63 additions & 0 deletions

src/linux-hardening/privilege-escalation/container-security/assessment-and-hardening.md

@@ -6,6 +6,8 @@
66

77
A good container assessment should answer two parallel questions. First, what can an attacker do from the current workload? Second, which operator choices made that possible? Enumeration tools help with the first question, and hardening guidance helps with the second. Keeping both on one page makes the section more useful as a field reference rather than just a catalog of escape tricks.
88

9+
One practical update for modern environments is that many older container writeups quietly assume a **rootful runtime**, **no user namespace isolation**, and often **cgroup v1**. Those assumptions are not safe anymore. Before spending time on old escape primitives, first confirm whether the workload is rootless or userns-remapped, whether the host is using cgroup v2, and whether Kubernetes or the runtime is now applying default seccomp and AppArmor profiles. These details often decide whether a famous breakout still applies.
10+
911
## Enumeration Tools
1012

1113
A number of tools remain useful for quickly characterizing a container environment:
@@ -15,6 +17,8 @@ A number of tools remain useful for quickly characterizing a container environme
1517
- `amicontained` is lightweight and useful for identifying container restrictions, capabilities, namespace exposure, and likely breakout classes.
1618
- `deepce` is another container-focused enumerator with breakout-oriented checks.
1719
- `grype` is useful when the assessment includes image-package vulnerability review instead of only runtime escape analysis.
20+
- `Tracee` is useful when you need **runtime evidence** rather than static posture alone, especially for suspicious process execution, file access, and container-aware event collection.
21+
- `Inspektor Gadget` is useful in Kubernetes and Linux-host investigations when you need eBPF-backed visibility tied back to pods, containers, namespaces, and other higher-level concepts.
1822

1923
The value of these tools is speed and coverage, not certainty. They help reveal the rough posture quickly, but the interesting findings still need manual interpretation against the actual runtime, namespace, capability, and mount model.
2024

@@ -24,6 +28,42 @@ The most important hardening principles are conceptually simple even though thei
2428

2529
Image and build hygiene matter as much as runtime posture. Use minimal images, rebuild frequently, scan them, require provenance where practical, and keep secrets out of layers. A container running as non-root with a small image and a narrow syscall and capability surface is much easier to defend than a large convenience image running as host-equivalent root with debugging tools preinstalled.
2630

31+
For Kubernetes, current hardening baselines are more opinionated than many operators still assume. The built-in **Pod Security Standards** treat `restricted` as the "current best practice" profile: `allowPrivilegeEscalation` should be `false`, workloads should run as non-root, seccomp should be explicitly set to `RuntimeDefault` or `Localhost`, and capability sets should be dropped aggressively. During assessment, this matters because a cluster that is only using `warn` or `audit` labels may look hardened on paper while still admitting risky pods in practice.
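
As a rough illustration of the enforce versus warn/audit distinction, the sketch below summarizes the Pod Security Admission mode labels from a namespace label dump. The function name `pss_modes` is hypothetical, and the parsing assumes the labels arrive as a single line of JSON, which is what recent `kubectl` versions print for `-o jsonpath='{.metadata.labels}'`. An `unset` result only means the label is absent; the cluster-wide admission default may still apply.

```bash
# Hypothetical helper: read a JSON label map on stdin and print the Pod Security
# Admission level configured for each mode, or "unset" when the label is absent.
pss_modes() {
  labels=$(cat)
  for mode in enforce audit warn; do
    # Match "pod-security.kubernetes.io/<mode>":"<level>" exactly, so that
    # the separate "<mode>-version" labels are not picked up by mistake.
    level=$(printf '%s\n' "$labels" |
      sed -n "s|.*\"pod-security\.kubernetes\.io/$mode\": *\"\([a-z]*\)\".*|\1|p")
    echo "$mode=${level:-unset}"
  done
}

# Possible usage, assuming kubectl and API access are available:
#   kubectl get ns "$NS" -o jsonpath='{.metadata.labels}' | pss_modes
```

A namespace that reports `warn=restricted` but `enforce=unset` is the "hardened on paper" case described above.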
32+
33+
## Modern Triage Questions
34+
35+
Before diving into escape-specific pages, answer these quick questions:
36+
37+
1. Is the workload **rootful**, **rootless**, or **userns-remapped**?
38+
2. Is the node using **cgroup v1** or **cgroup v2**?
39+
3. Are **seccomp** and **AppArmor/SELinux** explicitly configured, or merely inherited when available?
40+
4. In Kubernetes, is the namespace actually **enforcing** `baseline` or `restricted`, or only warning/auditing?
41+
42+
Useful checks:
43+
44+
```bash
# Identity and user-namespace ID mappings
id
cat /proc/self/uid_map 2>/dev/null
cat /proc/self/gid_map 2>/dev/null
# cgroup version (cgroup2fs indicates v2) and enabled controllers
stat -fc %T /sys/fs/cgroup 2>/dev/null
cat /sys/fs/cgroup/cgroup.controllers 2>/dev/null
# seccomp mode, no_new_privs, and the LSM label of PID 1
grep -E 'Seccomp|NoNewPrivs' /proc/self/status
cat /proc/1/attr/current 2>/dev/null
# Mounted service-account material
find /var/run/secrets -maxdepth 3 -type f 2>/dev/null | head
# Kubernetes view (needs kubectl and API access; assumes the pod name equals $HOSTNAME)
NS=$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace 2>/dev/null)
kubectl get ns "$NS" -o jsonpath='{.metadata.labels}' 2>/dev/null
kubectl get pod "$HOSTNAME" -n "$NS" -o jsonpath='{.spec.securityContext.supplementalGroupsPolicy}{"\n"}' 2>/dev/null
kubectl get pod "$HOSTNAME" -n "$NS" -o jsonpath='{.spec.securityContext.seccompProfile.type}{"\n"}{.spec.containers[*].securityContext.allowPrivilegeEscalation}{"\n"}{.spec.containers[*].securityContext.capabilities.drop}{"\n"}' 2>/dev/null
```
58+
59+
What is interesting here:
60+
61+
- If `/proc/self/uid_map` shows container root mapped to a **high host UID range**, many older host-root writeups become less relevant because root in the container is no longer host-root equivalent.
62+
- If `/sys/fs/cgroup` is `cgroup2fs`, old **cgroup v1**-specific writeups such as `release_agent` abuse should no longer be your first guess.
63+
- If seccomp and AppArmor are only inherited implicitly, the effective profile depends on each node's runtime and kubelet configuration, so protection can be weaker and less portable than defenders expect. In Kubernetes, explicitly setting `RuntimeDefault` is often stronger than silently relying on node defaults.
64+
- If `supplementalGroupsPolicy` is set to `Strict`, the pod should avoid silently inheriting extra group memberships from `/etc/group` inside the image, which makes group-based volume and file access behavior more predictable.
65+
- Namespace labels such as `pod-security.kubernetes.io/enforce=restricted` are worth checking directly. `warn` and `audit` are useful, but they do not stop a risky pod from being created.
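
The `uid_map` interpretation above can be sketched as a small helper. The function name `classify_uid_map` and its output labels are hypothetical. It relies only on the documented three-column format of `/proc/<pid>/uid_map` (first ID inside the namespace, first ID outside, range length) and is a rough triage aid, not a definitive rootless detector.

```bash
# Hypothetical helper: classify how UID 0 inside the namespace maps to the outside.
# Takes an optional path so it can be pointed at another process's uid_map.
classify_uid_map() {
  map_file="${1:-/proc/self/uid_map}"
  [ -r "$map_file" ] || { echo "unknown"; return 0; }
  awk '
    # Columns: <first ID inside namespace> <first ID outside> <range length>
    $1 == 0 { seen = 1; outside = $2; len = $3 }
    END {
      if (!seen)                                  print "no-root-mapping"
      else if (outside == 0 && len == 4294967295) print "identity"
      else if (outside == 0)                      print "root-mapped-to-host-root"
      else                                        print "remapped"
    }
  ' "$map_file"
}

classify_uid_map
```

An `identity` result corresponds to the classic rootful case with no user namespace isolation. A `remapped` result means container root is some other ID outside the namespace, which is the situation where older host-root writeups lose relevance.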
66+
2767
## Resource-Exhaustion Examples
2868

2969
Resource controls are not glamorous, but they are part of container security because they limit the blast radius of compromise. Without memory, CPU, or PID limits, a simple shell may be enough to degrade the host or neighboring workloads.
@@ -38,6 +78,15 @@ nc -lvp 4444 >/dev/null & while true; do cat /dev/urandom | nc <target_ip> 4444;
3878

3979
These examples are useful because they show that not every dangerous container outcome is a clean "escape". Weak cgroup limits can still turn code execution into real operational impact.
4080

81+
In Kubernetes-backed environments, also check whether resource controls exist at all before treating DoS as theoretical:
82+
83+
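```bash
# Declared limits per container (NS as set in the triage checks above; needs kubectl and API access)
kubectl get pod "$HOSTNAME" -n "$NS" -o jsonpath='{range .spec.containers[*]}{.name}{" cpu="}{.resources.limits.cpu}{" mem="}{.resources.limits.memory}{"\n"}{end}' 2>/dev/null
# Effective cgroup v2 limits seen from inside the container ("max" means unlimited)
cat /sys/fs/cgroup/pids.max 2>/dev/null
cat /sys/fs/cgroup/memory.max 2>/dev/null
cat /sys/fs/cgroup/cpu.max 2>/dev/null
```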
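
To make the cgroup output easier to read at a glance, a minimal sketch along these lines can flag which limits are effectively unlimited. The function name `report_cgroup_limits` is hypothetical, and the sketch assumes a cgroup v2 layout, where an unlimited setting reads `max` (for `cpu.max`, `max` followed by the period).

```bash
# Hypothetical helper: summarize cgroup v2 limit files under a given cgroup directory.
report_cgroup_limits() {
  cg_root="${1:-/sys/fs/cgroup}"
  for f in pids.max memory.max cpu.max; do
    if [ ! -r "$cg_root/$f" ]; then
      echo "$f: not readable"
      continue
    fi
    # cpu.max holds "<quota> <period>"; the other files hold a single value.
    read -r first rest < "$cg_root/$f"
    if [ "$first" = "max" ]; then
      echo "$f: unlimited"
    else
      echo "$f: limited ($(cat "$cg_root/$f"))"
    fi
  done
}

report_cgroup_limits
```

A `not readable` line can simply mean cgroup v1, the root cgroup, or a controller that is not enabled, so it should be treated as a prompt for manual follow-up rather than a finding.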
89+
4190
## Hardening Tooling
4291

4392
For Docker-centric environments, `docker-bench-security` remains a useful host-side audit baseline because it checks common configuration issues against widely recognized benchmark guidance:
@@ -50,6 +99,11 @@ sudo sh docker-bench-security.sh
5099

51100
The tool is not a substitute for threat modeling, but it is still valuable for finding careless daemon, mount, network, and runtime defaults that accumulate over time.
52101

102+
For Kubernetes and runtime-heavy environments, pair static checks with runtime visibility:
103+
104+
- `Tracee` is useful for container-aware runtime detection and quick forensics when you need to confirm what a compromised workload actually touched.
105+
- `Inspektor Gadget` is useful when the assessment needs kernel-level telemetry mapped back to pods, containers, DNS activity, file execution, or network behavior.
106+
53107
## Checks
54108

55109
Use these as quick first-pass commands during assessment:
@@ -58,13 +112,22 @@ Use these as quick first-pass commands during assessment:
id
capsh --print 2>/dev/null
grep -E 'Seccomp|NoNewPrivs' /proc/self/status
cat /proc/self/uid_map 2>/dev/null
stat -fc %T /sys/fs/cgroup 2>/dev/null
mount
find / -maxdepth 3 \( -name docker.sock -o -name containerd.sock -o -name crio.sock -o -name podman.sock \) 2>/dev/null
```
64120

65121
What is interesting here:
66122

67123
- A root process with broad capabilities and `Seccomp: 0` deserves immediate attention.
124+
- A root process that also has a **1:1 UID map** is far more interesting than "root" inside a properly isolated user namespace.
125+
- `cgroup2fs` usually means many older **cgroup v1** escape chains are not your best starting point, while a `memory.max` or `pids.max` that is missing or reads `max` still points to weak blast-radius controls.
68126
- Suspicious mounts and runtime sockets often provide a faster path to impact than any kernel exploit.
69127
- The combination of weak runtime posture and weak resource limits usually indicates a generally permissive container environment rather than a single isolated mistake.
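
For the `Seccomp` line specifically, the numeric value in `/proc/<pid>/status` maps to a kernel mode: `0` is disabled, `1` is strict, and `2` is filter. A small sketch of that lookup follows; the function name `seccomp_mode` is hypothetical.

```bash
# Hypothetical helper: translate the Seccomp field of a status file into a mode name.
seccomp_mode() {
  status_file="${1:-/proc/self/status}"
  # Anchor on "Seccomp:" so the separate "Seccomp_filters:" line is ignored.
  mode=$(awk '/^Seccomp:/ { print $2 }' "$status_file" 2>/dev/null)
  case "$mode" in
    0) echo "disabled" ;;
    1) echo "strict" ;;
    2) echo "filter" ;;
    *) echo "unknown" ;;
  esac
}

seccomp_mode
```

A `filter` result only says that some filter is attached; it does not show which syscalls the profile actually blocks.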
128+
129+
## References
130+
131+
- [Kubernetes Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
132+
- [Docker Security Advisory: Multiple Vulnerabilities in runc, BuildKit, and Moby](https://docs.docker.com/security/security-announcements/)
70133
{{#include ../../../banners/hacktricks-training.md}}
