This document summarizes the disk- and noise-reduction changes that were applied to the current repo so the Loki stack remains usable during evaluation runs.
The current tuning aims to:
- reduce unnecessary host and container disk writes
- reduce alert noise from known desktop false positives
- avoid bursty osquery scheduling
- make Docker bind mounts reliable on this host
- make image selection reproducible for benchmarking
Why:
The original host path `/var/log/osquery` was not reliably visible inside containers in this environment.
What changed:
- osquery logs are now written to `./.data/osquery`
- Alloy reads `./.data/osquery` via a bind mount
- the OTel Collector comparison profile reads the same file
Files:
- `docker-compose.yaml`
- `osqueryd.conf`
- `osqueryd-ssd-optimized.conf`
- `setup-osqueryd.sh`
Impact:
- avoids duplicate troubleshooting restarts caused by an empty bind mount
- keeps the ingest file in a single repo-local location that is easy to inspect and rotate
Why:
The installed osqueryd in this environment treated several JSON options as invalid or CLI-only.
What changed:
- removed invalid JSON options from `osqueryd.conf`
- removed invalid JSON options from `osqueryd-ssd-optimized.conf`
- added `osquery.flags` for CLI-only plugin settings:
  - `--config_plugin=filesystem`
  - `--logger_plugin=filesystem`
- updated `setup-osqueryd.sh` to install both:
  - `/etc/osquery/osquery.conf`
  - `/etc/osquery/osquery.flags`
Impact:
- eliminates invalid-flag warnings during config validation
- keeps the service config cleaner and closer to what the local binary actually supports
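
The split between JSON options and CLI-only flags can be sketched as follows. This is a minimal illustration, not the logic in `setup-osqueryd.sh`: the helper names are hypothetical, and the only flags treated as CLI-only are the two plugin settings named above. It relies on the flagfile convention of one `--name=value` entry per line.

```python
# Hypothetical helpers for illustration; not taken from setup-osqueryd.sh.

def split_options(options: dict, cli_only: set) -> tuple:
    """Separate options that stay in the JSON config from CLI-only ones."""
    json_opts = {k: v for k, v in options.items() if k not in cli_only}
    flag_opts = {k: str(v) for k, v in options.items() if k in cli_only}
    return json_opts, flag_opts


def render_flagfile(flags: dict) -> str:
    """Render one --name=value line per flag."""
    return "".join(f"--{name}={value}\n" for name, value in flags.items())


# The two plugin settings this document moves into osquery.flags.
CLI_ONLY = {"config_plugin", "logger_plugin"}

options = {
    "config_plugin": "filesystem",
    "logger_plugin": "filesystem",
    "schedule_splay_percent": 25,
}

json_opts, flag_opts = split_options(options, CLI_ONLY)
flagfile_text = render_flagfile(flag_opts)
print(flagfile_text, end="")
```

Keeping the split in one place makes it easier to see which settings belong in the JSON config and which must be passed as flags.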
Why: A low splay value causes more scheduled queries to execute in tighter bursts, which can create short I/O spikes.
What changed:
- increased `schedule_splay_percent` from `10` to `25`
Files:
- `osqueryd.conf`
- `osqueryd-ssd-optimized.conf`
Impact:
- spreads query execution more evenly across time
- reduces bursty writes into `osqueryd.results.log`
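
To see why a larger splay percentage spreads execution out, here is a simplified model of percentage-based splay. It is an assumption-laden sketch, not osquery's scheduler code: the function names and the 60-second interval are hypothetical, and the model only assumes that each query's effective interval is drawn from a window of plus or minus the splay percentage around its configured interval.

```python
import random


def splay_window(interval_s: int, splay_percent: int) -> tuple:
    """Return the (low, high) bounds of the splayed interval, in seconds."""
    delta = interval_s * splay_percent // 100
    return max(1, interval_s - delta), interval_s + delta


def splayed_interval(interval_s: int, splay_percent: int, rng: random.Random) -> int:
    """Pick one effective interval inside the splay window."""
    low, high = splay_window(interval_s, splay_percent)
    return rng.randint(low, high)


rng = random.Random(0)  # seeded only so repeated runs behave the same
for percent in (10, 25):
    low, high = splay_window(60, percent)
    samples = [splayed_interval(60, percent, rng) for _ in range(5)]
    print(f"splay={percent}% window=[{low}, {high}] samples={samples}")
```

Under this model, queries that share a nominal interval land in a wider window at `25` than at `10`, so fewer of them fire in the same second.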
Why: Falco can generate substantial log volume on a desktop-style Linux host if left near its defaults or if custom overrides are not actually loaded.
What changed:
- ensured the repo override is mounted into `/etc/falco/config.d/zz-local.yaml`
- enabled `buffered_outputs: true`
- set `priority: warning`
- kept `file_output` disabled
- kept `syslog_output` disabled
- used HTTP output to `falcosidekick`
Files:
- `docker-compose.yaml`
- `falco-config.yaml`
Impact:
- reduces Falco write amplification
- reduces low-value diagnostic noise in the active eval path
Why: This host showed expected desktop/GUI helper behavior that created noisy Falco alerts without adding much security value for this evaluation.
What changed:
Added a local extension to `known_ptrace_binaries` for:
- `chrome_crashpad`
- `chrome_crashpad_handler`

Added a targeted exception to `user_read_sensitive_file_conditions` for:
- `systemd-executor`
- `cinnamon-screensaver-pam-helper`

only when they read:
- `/etc/shadow`
- `/etc/pam.d/*`
Files:
- `falco_rules.local.yaml`
Impact:
- reduces repeated desktop false positives
- preserves a narrow exception scope instead of broadly ignoring sensitive-file activity
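
The scope of the sensitive-file exception can be modeled as a predicate that requires both a listed helper process and a listed path. This is only a Python model of the intent described above, not Falco rule syntax and not the contents of `falco_rules.local.yaml`; the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Helper processes and paths named in this document.
EXEMPT_PROCESSES = {"systemd-executor", "cinnamon-screensaver-pam-helper"}
EXEMPT_PATH_PATTERNS = ("/etc/shadow", "/etc/pam.d/*")


def is_exempt(process_name: str, file_path: str) -> bool:
    """True only when a listed helper reads a listed path."""
    if process_name not in EXEMPT_PROCESSES:
        return False
    return any(fnmatchcase(file_path, pattern) for pattern in EXEMPT_PATH_PATTERNS)


print(is_exempt("systemd-executor", "/etc/shadow"))   # listed helper, listed path
print(is_exempt("cat", "/etc/shadow"))                # unlisted process
print(is_exempt("systemd-executor", "/etc/sudoers"))  # listed helper, other path
```

Because both conditions must hold, any other process reading `/etc/shadow`, or a listed helper reading a different sensitive file, would still be reported.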
Why: Publishing extra host ports increases process churn, collision risk, and operational noise without helping local evaluation.
What changed:
- Loki `read`, `write`, and `backend` use internal exposure instead of published host ports
- `falcosidekick` no longer publishes `2801` to the host
- `gateway` remains the main Loki entrypoint on `3100`
- MinIO is pinned to stable ports `9000` and `9001`
Files:
- `docker-compose.yaml`
Impact:
- fewer port conflicts
- fewer unnecessary host listeners
- simpler compose lifecycle during repeated benchmark runs
Why:
Using `latest` makes repeatable comparisons harder because behavior can drift between runs.
What changed: Pinned these services to exact digests:
- Loki (`read`, `write`, `backend`)
- Grafana
- Alloy
- Falco
- Falcosidekick
- OpenObserve
- OTel Collector Contrib
Files:
- `docker-compose.yaml`
Impact:
- more reproducible evaluation runs
- easier to compare before/after tuning results
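
A quick way to confirm that an image reference is pinned by digest rather than by a mutable tag is to check for an `@sha256:` suffix followed by 64 hex characters. The sketch below is an illustrative check, not part of this repo; the function name is hypothetical and the image references are placeholders, not the digests actually pinned in `docker-compose.yaml`.

```python
import re

# name[:tag]@sha256:<64 lowercase hex chars>
DIGEST_RE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """True when the reference ends in a full sha256 digest."""
    return bool(DIGEST_RE.match(image_ref))


# Placeholder references for illustration only.
placeholder_digest = "0" * 64
for ref in (
    f"grafana/loki@sha256:{placeholder_digest}",
    "grafana/loki:latest",
):
    print(ref, "->", is_digest_pinned(ref))
```

A check like this could be run over every `image:` value in the Compose file before a benchmark run to catch a service that has drifted back to a tag.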
To apply the osquery changes to the installed host service:
- run `sudo ./setup-osqueryd.sh configure`
- run `sudo systemctl restart osqueryd`
The repo changes were validated to the extent possible from this environment:
- Alloy tails `./.data/osquery/osqueryd.results.log`
- Loki ingests osquery events from Alloy
- Falcosidekick successfully posts to Loki
- Falco loads the repo override from `config.d`
- Falco loads the local rules file with the ptrace and desktop auth helper exceptions
- the pinned images resolve correctly in Compose
- Falco still reports LinuxKit / eBPF tracepoint attachment warnings on this host, so syscall coverage is only partially validated
- more aggressive synthetic Falco tests on this host did not produce reliable end-to-end detections, so desktop syscall evaluation should still be treated as environment-limited here
- the Grafana live-tail UI issue is still considered a Grafana-side behavior issue rather than a disk-optimization issue
- the installed `osqueryd` here does not expose `udev_read_buffer_size`, so that suggestion was reviewed but not applied
- Removed unnecessary host port publishing from the internal Loki read, write, and backend services.
- Removed the published `2801` host port from `falcosidekick`; Falco still reaches it over the Docker network.
- Added a consistent `/ready` route on the Loki gateway and hardened websocket proxy settings for tail requests.
- Bound MinIO to fixed host ports:
  - API: `9000`
  - Console: `9001`
- Switched the Alloy osquery bind mount to the repo-local path `./.data/osquery`.
- Updated the osquery setup flow to write results into `./.data/osquery`, which is visible to Docker Desktop-based environments.
- Removed invalid osquery options from the JSON configs, added a repo-managed `osquery.flags`, and increased scheduler splay to reduce bursty execution.
- Tuned the default osquery profile to reduce high-volume low-value events by bounding process snapshots, narrowing `process_envs`, narrowing `process_memory_map`, reducing package and Docker inventory churn, and disabling pack-based duplication by default.
- Corrected the Falco config mounting so the container actually loads the intended overrides.
- Tuned Falco with `priority: warning`, `buffered_outputs: true`, a narrow local ptrace allowlist for crashpad noise, and a targeted desktop auth-helper exception for `/etc/shadow` and `/etc/pam.d/*` reads.
- Pinned Loki, Grafana, Alloy, Falco, Falcosidekick, OpenObserve, and the OTel Collector to exact image digests for more reproducible evaluation runs.
- Added an optional `openobserve` profile to `docker-compose.yaml`.
- Added `otel-collector-config.yaml` so you can compare:
  - Loki + Grafana + Alloy
  - OpenTelemetry Collector Contrib + OpenObserve
- The collector currently supports:
  - file tailing for osquery via `./.data/osquery/osqueryd.results.log`
  - file tailing for Falco via `./.data/falco/events.jsonl`
  - generic OTLP log ingestion on `4317` and `4318`
- start the default stack: `docker compose up -d`
- start the OpenObserve comparison profile: `docker compose --profile openobserve up -d openobserve otel-collector`
- apply the osquery changes on the host: `sudo ./setup-osqueryd.sh configure`, then `sudo systemctl restart osqueryd`

- Loki accepts pushes on `http://localhost:3100/loki/api/v1/push`.
- Loki `query_range` requests return results successfully.
- Alloy now tails and ships logs from `./.data/osquery/osqueryd.results.log` into Loki.
- Synthetic Falco payloads sent through Falcosidekick reach Loki successfully.
- The websocket tail endpoint upgrades successfully through the nginx gateway.
- Grafana reaches Loki through the Docker service name `gateway` instead of `localhost`.
- OpenObserve responds on `http://localhost:5080/healthz`.
- The OTel collector starts cleanly and watches `./.data/osquery/osqueryd.results.log`.
- The OpenObserve profile is isolated behind a Compose profile so it does not interfere with the Loki stack.
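
For manual push validation, the JSON body that the Loki push endpoint accepts can be built as shown below. This is a minimal sketch under stated assumptions: the function name, the `job` label value, and the log line are hypothetical and not taken from this repo's configs. Only the payload shape (a `streams` list with a `stream` label set and `[timestamp_ns, line]` value pairs) follows the Loki push API.

```python
import json
import time


def build_push_payload(labels: dict, lines: list, now_ns: int = None) -> dict:
    """Build a Loki push body; timestamps are nanosecond epoch strings."""
    ts = str(time.time_ns() if now_ns is None else now_ns)
    return {
        "streams": [
            {
                "stream": dict(labels),
                "values": [[ts, line] for line in lines],
            }
        ]
    }


# Hypothetical label and message, for illustration only.
payload = build_push_payload({"job": "manual-test"}, ["hello from a manual push"])
body = json.dumps(payload)
print(body)
```

The resulting body would be sent as a `POST` with `Content-Type: application/json` to `http://localhost:3100/loki/api/v1/push`; the network call is left out here so the sketch runs without the stack.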
This appears to be a Docker Desktop-style host-path sharing issue in the current environment.
Mitigation applied:
- osquery output is redirected to `./.data/osquery`
- Alloy and the OTel collector both read from that repo-local path
- invalid osquery options were removed from the JSON configs
- CLI-only plugin settings are now installed via `osquery.flags`
- scheduler splay was raised from `10` to `25`
What still must happen on the host:
- rerun `setup-osqueryd.sh` so the installed host osquery config and flagfile are refreshed
Suggestion reviewed but not implemented:
- `udev_read_buffer_size` was not added because the installed `osqueryd` does not expose that flag in this environment
Investigation showed:
- Loki push works
- Loki range queries work
- websocket upgrade on `/loki/api/v1/tail` works through the gateway
- Loki rejects instant log queries on `/loki/api/v1/query` with `400` because that API no longer supports log selectors as instant queries
That leaves the remaining symptom looking more like a Grafana Explore / datasource UI behavior issue than a Loki transport issue.
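
Since log selectors are rejected on the instant-query endpoint, manual checks should go through `/loki/api/v1/query_range` instead. The sketch below builds such a URL; the function name and the `{job="osquery"}` selector are assumptions for illustration, and the labels actually attached by Alloy in this repo may differ.

```python
import time
from urllib.parse import urlencode


def build_query_range_url(base_url: str, logql: str, start_ns: int, end_ns: int,
                          limit: int = 100) -> str:
    """Build a query_range URL; start and end are nanosecond epoch values."""
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


# Query the last 15 minutes through the gateway entrypoint.
end_ns = time.time_ns()
start_ns = end_ns - 15 * 60 * 1_000_000_000
# Hypothetical selector; adjust to the labels your pipeline attaches.
url = build_query_range_url("http://localhost:3100", '{job="osquery"}', start_ns, end_ns)
print(url)
```

The printed URL can be fetched with any HTTP client to confirm that range queries return data while the live-tail UI issue is still open.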
Workaround:
- use normal Explore range queries with auto-refresh while benchmarking
Next likely mitigation if you want to keep chasing it:
- pin Grafana to a specific non-`latest` version and test live tail again
- hard refresh the browser or test in a clean browser profile
The previous Compose file published `2801:2801`, which collided with an existing listener in this environment.
Mitigation applied:
- `falcosidekick` is now internal-only on the Docker network
- the Loki output was reconfigured with the supported `LOKI_HOSTPORT` style settings
- synthetic Falco payloads now POST through Falcosidekick to Loki successfully (`204` from the gateway)
Falco now starts, loads the repo override from config.d, and loads the custom local rules file. It still logs several libbpf tracepoint attachment warnings on the LinuxKit kernel. That means the sidekick shipping path is configured and the noise-reduction overrides are active, while actual syscall coverage on this host still needs validation with real Falco detections.
Additional validation attempted:
- active `chrome_crashpad_handler` processes were present on the host during review
- no new crashpad ptrace alerts appeared in sampled Falco logs after the allowlist was loaded
- a synthetic ptrace reproduction using a `chrome_crashpad`-named symlink to `strace` was attempted, but the host denied the attach with `Operation not permitted` before a Falco alert could be compared against an unsuppressed baseline
- a more aggressive synthetic container test using execution from `/dev/shm` failed with `Permission denied` because that path is non-executable in this environment
- a harmless warning-level host test for the `Find AWS Credentials` rule also did not produce an observable Falco alert in this environment
So the override looks directionally correct, but this host still does not provide a fully conclusive end-to-end validation for real crashpad ptrace events, and it should not be treated as a reliable desktop syscall-eval environment for Falco.
Falco -> Falcosidekick -> Loki remains the primary alerting path in this repo.
For the OpenObserve profile, Falco now also writes JSON events to `./.data/falco/events.jsonl`, and the OTel collector tails that file into the `falco` stream in OpenObserve. That keeps the Loki path intact while giving the comparison stack access to the same Falco event feed.
Loki stack:
- already wired to Grafana
- Falco path already exists
- simple manual push/query validation

OpenObserve profile:
- single UI and storage layer for logs
- native OTLP ingestion path for future app instrumentation
- easy side-by-side comparison without replacing the Loki stack yet