
Commit 91274a1

JAORMX and claude committed
Add CIS sysctls, migrate prctl to x/sys/unix, add SetNoNewPrivs
Address remaining security review findings:

- Add four CIS-recommended sysctls: perf_event_paranoid, yama.ptrace_scope, bpf_jit_harden, and sysrq
- Replace raw syscall.Syscall prctl calls with unix.Prctl() from golang.org/x/sys/unix (already an indirect dep)
- Add SetNoNewPrivs() helper for PR_SET_NO_NEW_PRIVS
- Update SECURITY.md with new sysctls, process privilege restriction, and filesystem hardening documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 857a9f0 commit 91274a1

6 files changed

Lines changed: 81 additions & 17 deletions


docs/SECURITY.md

Lines changed: 30 additions & 0 deletions
@@ -229,6 +229,10 @@ aborting boot, because not all kernels support every sysctl.
 | `kernel.kptr_restrict` | `2` | Hide kernel pointers from all users. Prevents information leaks that aid exploit development. |
 | `kernel.dmesg_restrict` | `1` | Restrict `dmesg` to privileged users. Prevents unprivileged processes from reading kernel log messages that may contain sensitive addresses or operations. |
 | `kernel.unprivileged_bpf_disabled` | `1` | Disable unprivileged BPF. Prevents unprivileged users from loading BPF programs, which have historically been a source of kernel privilege escalation vulnerabilities. |
+| `kernel.perf_event_paranoid` | `3` | Disallow all perf events for unprivileged users. Prevents unprivileged access to performance counters, which can be used for side-channel attacks. |
+| `kernel.yama.ptrace_scope` | `2` | Restrict ptrace to `CAP_SYS_PTRACE` holders. Prevents unprivileged processes from attaching to other processes to inspect memory or inject code. |
+| `net.core.bpf_jit_harden` | `2` | Harden BPF JIT against spraying attacks. Forces constant blinding and disables JIT kallsyms exposure. |
+| `kernel.sysrq` | `0` | Disable magic SysRq key. Prevents unprivileged users from triggering kernel debugging and recovery commands. |
 
 ### Capability bounding set
 
@@ -245,6 +249,32 @@ For a typical SSH-based guest, the minimal keep set is:
 | `CAP_SETGID` | 6 | sshd group switching |
 | `CAP_NET_BIND_SERVICE` | 10 | Binding port 22 (privileged port) |
 
+### Process privilege restriction
+
+`SetNoNewPrivs()` sets the `PR_SET_NO_NEW_PRIVS` bit on the calling
+process. Once set, the process and all descendants (via fork/exec)
+cannot gain new privileges through `execve` — setuid binaries run
+without elevation and file capabilities are ignored.
+
+This is intended to be called after all privileged operations are
+complete (mounts, network config, credential setup, capability
+dropping). Consumers that spawn child processes via `os/exec` inherit
+the bit automatically; consumers that need to set it on the calling
+process itself (e.g., an init that doesn't use `os/exec`) can call
+`SetNoNewPrivs()` directly.
+
+Note: `no_new_privs` does not affect `setresuid`/`setresgid` syscalls
+used by Go's `SysProcAttr.Credential` — credential switching for SSH
+sessions continues to work after the bit is set.
+
+### Filesystem hardening
+
+Consumers should lock down `/root/` (mode `0700`) after completing
+initial setup so the sandbox user cannot read root's home directory
+contents (bootstrap config, debug logs, credentials). This is not
+performed by the `harden` package itself but is a recommended consumer
+practice — see apiary's `lockdownRoot()` for an example.
+
 ### Threat model
 
 These hardening measures are defense-in-depth for the guest
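
The filesystem hardening note in the SECURITY.md diff above leaves the lockdown step to consumers. The following is a minimal sketch of what such a step might look like. It is not apiary's `lockdownRoot()`, whose implementation is not part of this commit; the names `lockdownDir` and `demoLockdown` are hypothetical, and the demo deliberately runs against a temporary directory rather than `/root`.

```go
package main

import (
	"fmt"
	"os"
)

// lockdownDir is a hypothetical helper (not part of the harden package).
// It restricts a directory to its owner and returns the permission bits
// that are in effect afterwards.
func lockdownDir(path string) (os.FileMode, error) {
	if err := os.Chmod(path, 0o700); err != nil {
		return 0, fmt.Errorf("chmod %s: %w", path, err)
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, fmt.Errorf("stat %s: %w", path, err)
	}
	return info.Mode().Perm(), nil
}

// demoLockdown exercises lockdownDir on a throwaway directory so the
// sketch can be run without touching /root. It returns the resulting
// mode formatted in octal.
func demoLockdown() (string, error) {
	dir, err := os.MkdirTemp("", "lockdown-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	// Start from a world-readable mode so the lockdown has something to do.
	if err := os.Chmod(dir, 0o755); err != nil {
		return "", err
	}
	perm, err := lockdownDir(dir)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%04o", uint32(perm)), nil
}

func main() {
	mode, err := demoLockdown()
	if err != nil {
		fmt.Fprintln(os.Stderr, "lockdown failed:", err)
		return
	}
	fmt.Println("mode after lockdown:", mode)
}
```

A real consumer would presumably call something like `lockdownDir("/root")` once bootstrap files have been written, since tightening the mode earlier could interfere with setup steps that run as another user.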

go.mod

Lines changed: 2 additions & 2 deletions
@@ -10,10 +10,10 @@ require (
 	github.com/gofrs/flock v0.13.0
 	github.com/google/go-containerregistry v0.20.3
 	github.com/miekg/dns v1.1.72
-	github.com/sirupsen/logrus v1.9.4
 	github.com/stretchr/testify v1.11.1
 	golang.org/x/crypto v0.47.0
 	golang.org/x/sync v0.19.0
+	golang.org/x/sys v0.40.0
 )
 
 require (
@@ -47,6 +47,7 @@ require (
 	github.com/pierrec/lz4/v4 v4.1.14 // indirect
 	github.com/pkg/errors v0.9.1 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
+	github.com/sirupsen/logrus v1.9.4 // indirect
 	github.com/u-root/uio v0.0.0-20240224005618-d2acac8f3701 // indirect
 	github.com/vbatts/tar-split v0.11.6 // indirect
 	go.opentelemetry.io/auto/sdk v1.2.1 // indirect
@@ -59,7 +60,6 @@ require (
 	go.opentelemetry.io/proto/otlp v1.9.0 // indirect
 	golang.org/x/mod v0.32.0 // indirect
 	golang.org/x/net v0.49.0 // indirect
-	golang.org/x/sys v0.40.0 // indirect
 	golang.org/x/time v0.5.0 // indirect
 	golang.org/x/tools v0.41.0 // indirect
 	google.golang.org/genproto/googleapis/api v0.0.0-20260209200024-4cfbd4190f57 // indirect

guest/harden/capability.go

Lines changed: 15 additions & 14 deletions
@@ -10,7 +10,8 @@ import (
 	"os"
 	"strconv"
 	"strings"
-	"syscall"
+
+	"golang.org/x/sys/unix"
 )
 
 // Linux capability constants. Only the subset typically needed by guest
@@ -23,11 +24,6 @@ const (
 	CapNetBindService uintptr = 10
 )
 
-// prctl constants for capability bounding set manipulation.
-const (
-	prCapBSetDrop = 24 // PR_CAPBSET_DROP
-)
-
 // capLastCap reads the highest valid capability number from
 // /proc/sys/kernel/cap_last_cap. Falls back to 41 (CAP_CHECKPOINT_RESTORE,
 // the highest cap on Linux 6.x kernels) if the file is unreadable.
@@ -80,14 +76,19 @@ func DropBoundingCaps(keep ...uintptr) error {
 // capBSetDrop calls prctl(PR_CAPBSET_DROP, cap) to remove a single
 // capability from the bounding set.
 func capBSetDrop(cap uintptr) error {
-	_, _, errno := syscall.Syscall(
-		syscall.SYS_PRCTL,
-		prCapBSetDrop,
-		cap,
-		0,
-	)
-	if errno != 0 {
-		return fmt.Errorf("prctl(PR_CAPBSET_DROP, %d): %w", cap, errno)
+	if err := unix.Prctl(unix.PR_CAPBSET_DROP, cap, 0, 0, 0); err != nil {
+		return fmt.Errorf("prctl(PR_CAPBSET_DROP, %d): %w", cap, err)
+	}
+	return nil
+}
+
+// SetNoNewPrivs sets the PR_SET_NO_NEW_PRIVS bit on the calling
+// process. Once set, the process and its children cannot gain new
+// privileges through execve (setuid, file capabilities, etc.).
+// This is irreversible for the calling process.
+func SetNoNewPrivs() error {
+	if err := unix.Prctl(unix.PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); err != nil {
+		return fmt.Errorf("prctl(PR_SET_NO_NEW_PRIVS): %w", err)
 	}
 	return nil
 }
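
One point worth keeping in mind when reading the capability.go diff above: according to prctl(2), both `PR_SET_NO_NEW_PRIVS` and `PR_CAPBSET_DROP` act on the calling thread, and a plain `prctl` call from Go runs on whichever OS thread the goroutine happens to occupy. Threads created afterwards from that thread inherit the setting, but runtime threads that already exist do not. Whether this matters for the `harden` package depends on when it is called, which this commit does not show. The sketch below is one possible way to cover every runtime thread using the standard library's `syscall.AllThreadsSyscall6` (Linux only, Go 1.16+, returns `ENOTSUP` in binaries linked with cgo). The names `setNoNewPrivsAllThreads` and `noNewPrivs` are hypothetical and are not part of this commit.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// prctl option numbers from <linux/prctl.h>.
const (
	prSetNoNewPrivs = 38 // PR_SET_NO_NEW_PRIVS
	prGetNoNewPrivs = 39 // PR_GET_NO_NEW_PRIVS
)

// setNoNewPrivsAllThreads (hypothetical helper) sets no_new_privs on
// every OS thread owned by the Go runtime. The kernel requires arg2 to
// be 1 and the remaining arguments to be 0.
func setNoNewPrivsAllThreads() error {
	_, _, errno := syscall.AllThreadsSyscall6(
		syscall.SYS_PRCTL, prSetNoNewPrivs, 1, 0, 0, 0, 0,
	)
	if errno != 0 {
		return fmt.Errorf("prctl(PR_SET_NO_NEW_PRIVS) on all threads: %w", errno)
	}
	return nil
}

// noNewPrivs (hypothetical helper) reports whether the bit is set on the
// OS thread that executes this call.
func noNewPrivs() (bool, error) {
	r, _, errno := syscall.Syscall6(
		syscall.SYS_PRCTL, prGetNoNewPrivs, 0, 0, 0, 0, 0,
	)
	if errno != 0 {
		return false, fmt.Errorf("prctl(PR_GET_NO_NEW_PRIVS): %w", errno)
	}
	return r == 1, nil
}

func main() {
	if err := setNoNewPrivsAllThreads(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	on, err := noNewPrivs()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("no_new_privs set on this thread:", on)
}
```

The setting cannot be cleared once applied, so running this sketch permanently restricts the process that executes it, just as `SetNoNewPrivs()` does.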

guest/harden/capability_test.go

Lines changed: 9 additions & 0 deletions
@@ -122,3 +122,12 @@ func TestCapLastCap_ReadsProc(t *testing.T) {
 	// Modern kernels have at least 40 capabilities.
 	assert.GreaterOrEqual(t, got, uintptr(36))
 }
+
+func TestSetNoNewPrivs(t *testing.T) {
+	t.Parallel()
+
+	// SetNoNewPrivs is safe to call in tests — it only affects the
+	// current process and is non-reversible but harmless.
+	err := SetNoNewPrivs()
+	require.NoError(t, err)
+}
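
The test above asserts only that the call returns no error. A stricter check could read the flag back. The sketch below shows one way to do that by parsing the `NoNewPrivs` field that Linux 4.10 and later expose in `/proc/<pid>/status`. The helper name `parseNoNewPrivs` is hypothetical and is not part of this commit; note also that `/proc/self/status` describes the main thread of the process, which is not necessarily the thread that made the `prctl` call.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// parseNoNewPrivs (hypothetical helper) extracts the NoNewPrivs field
// from the text of a /proc/<pid>/status file. It returns an error if the
// field is missing or holds a value other than 0 or 1.
func parseNoNewPrivs(status string) (bool, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		name, value, ok := strings.Cut(sc.Text(), ":")
		if !ok || name != "NoNewPrivs" {
			continue
		}
		switch strings.TrimSpace(value) {
		case "1":
			return true, nil
		case "0":
			return false, nil
		default:
			return false, fmt.Errorf("unexpected NoNewPrivs value %q", value)
		}
	}
	if err := sc.Err(); err != nil {
		return false, err
	}
	return false, fmt.Errorf("NoNewPrivs field not found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read status:", err)
		return
	}
	on, err := parseNoNewPrivs(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("NoNewPrivs:", on)
}
```

Keeping the parser separate from the file read lets it be tested with fixed input, independent of the kernel the test runs on.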

guest/harden/sysctl.go

Lines changed: 20 additions & 0 deletions
@@ -48,6 +48,26 @@ var defaults = []kernelDefault{
 		value: "1",
 		reason: "disable unprivileged BPF",
 	},
+	{
+		key: "kernel.perf_event_paranoid",
+		value: "3",
+		reason: "disallow all perf events for unprivileged users",
+	},
+	{
+		key: "kernel.yama.ptrace_scope",
+		value: "2",
+		reason: "restrict ptrace to CAP_SYS_PTRACE holders",
+	},
+	{
+		key: "net.core.bpf_jit_harden",
+		value: "2",
+		reason: "harden BPF JIT against spraying attacks",
+	},
+	{
+		key: "kernel.sysrq",
+		value: "0",
+		reason: "disable magic SysRq key",
+	},
 }
 
 // KernelDefaults applies recommended kernel sysctl hardening. Each
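
The body of `KernelDefaults` is not part of this diff, so the following is only a sketch of how entries like the ones added above might be applied. It assumes the usual Linux convention that a dotted key maps to a file under `/proc/sys`, and it collects failures instead of stopping at the first one, in line with the SECURITY.md remark that not all kernels support every sysctl. The names `setting`, `sysctlPath`, and `applyBestEffort` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// setting is a hypothetical stand-in for the package's kernelDefault type.
type setting struct {
	key   string
	value string
}

// sysctlPath maps a dotted sysctl key to its file under root. It does not
// handle keys whose components themselves contain dots (for example some
// per-interface network keys); none of the defaults above need that.
func sysctlPath(root, key string) string {
	return filepath.Join(root, strings.ReplaceAll(key, ".", "/"))
}

// applyBestEffort writes every setting and returns one error per failed
// write, so that an unsupported key does not prevent the others from
// being applied.
func applyBestEffort(root string, settings []setting) []error {
	var errs []error
	for _, s := range settings {
		path := sysctlPath(root, s.key)
		if err := os.WriteFile(path, []byte(s.value), 0o644); err != nil {
			errs = append(errs, fmt.Errorf("sysctl %s=%s: %w", s.key, s.value, err))
		}
	}
	return errs
}

func main() {
	settings := []setting{
		{key: "kernel.perf_event_paranoid", value: "3"},
		{key: "kernel.yama.ptrace_scope", value: "2"},
		{key: "net.core.bpf_jit_harden", value: "2"},
		{key: "kernel.sysrq", value: "0"},
	}
	// Dry run: print the target paths only. Calling applyBestEffort with
	// "/proc/sys" as root would change live kernel settings.
	for _, s := range settings {
		fmt.Printf("%s <- %s\n", sysctlPath("/proc/sys", s.key), s.value)
	}
}
```

Passing the root directory as a parameter keeps the write path testable against a scratch directory, and the caller decides whether the returned errors are logged or treated as fatal.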

guest/harden/sysctl_test.go

Lines changed: 5 additions & 1 deletion
@@ -83,5 +83,9 @@ func TestDefaults_AreComplete(t *testing.T) {
 	assert.Equal(t, "2", keys["kernel.kptr_restrict"])
 	assert.Equal(t, "1", keys["kernel.dmesg_restrict"])
 	assert.Equal(t, "1", keys["kernel.unprivileged_bpf_disabled"])
-	assert.Len(t, DefaultsForTest, 3)
+	assert.Equal(t, "3", keys["kernel.perf_event_paranoid"])
+	assert.Equal(t, "2", keys["kernel.yama.ptrace_scope"])
+	assert.Equal(t, "2", keys["net.core.bpf_jit_harden"])
+	assert.Equal(t, "0", keys["kernel.sysrq"])
+	assert.Len(t, DefaultsForTest, 7)
 }
