Skip to content

Commit d8439d5

Browse files
committed
dx_evidence_graph: lock to dx-agent's attackgraph.Edge schema + load_prototype
Update .pxl + vis.json column bindings to the schema dx-agent posted on PR #62 (mirror of entlein/dx#68): requestor_pod/responder_pod endpoints, weight (sum of CRS severity) on edgeWeight, max_severity (top single-criterion) on edgeColor, confidence / edge_kind / condition / criteria / num_findings as hover info. Add tools/load_prototype: a Go helper that reads a JSON fixture of []attackgraph.Edge records and executes the script against a Pixie PEM via pxapi. Validates the round-trip and the vispb.Graph column bindings before the dx_attack_graph ingest path lands. Add manifest.yaml so the script enters the script_bundle build. //src/pxl_scripts:script_bundle and :script_bundle_test pass; the script appears in bundle-oss.json. Flagged on PR #62 for follow-up: PxL cannot read forensic_db.dx_attack_graph directly (ClickHouse, not Pixie's table-store). v0 uses a script-arg path; v1 needs a real table ingest (Stirling source connector or AE write-back). Pre-commit arc-lint skipped: arcanist renderer crashes on a PHP null in ArcanistConsoleLintRenderer (unrelated to this change). All individual linters (yamllint/flake8/golangci-lint/JSON) ran clean.
1 parent 58cd294 commit d8439d5

5 files changed

Lines changed: 249 additions & 61 deletions

File tree

src/pxl_scripts/px/dx_evidence_graph/dx_evidence_graph.pxl

Lines changed: 43 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -14,67 +14,60 @@
1414
#
1515
# SPDX-License-Identifier: Apache-2.0
1616

17-
''' DX Evidence Graph (STUB)
17+
''' DX Evidence Graph (prototype / v0)
1818

19-
A severity-weighted, all-protocol pod-to-pod graph keyed on dx-agent
20-
evidence. NOT FUNCTIONAL — placeholder until the dx-agent finishes
21-
the evidence data model. See README.md for the schema contract and
22-
the five open decisions that gate v1 implementation.
19+
Renders one dx-agent investigation as a `vispb.Graph` keyed on the
20+
`attackgraph.Edge` schema dx-agent locked in via PR #62 (entlein/dx#68).
2321

24-
Path B (v1): evidence is passed as script arguments — one or more
25-
`pod:severity` items.
26-
Path A (v2): joins a `dx_evidence` Pixie table — TODO once the
27-
ingestion path is settled.
22+
Data path note: PxL only queries Pixie tables (Stirling and other
23+
in-cluster source connectors). `forensic_db.dx_attack_graph` lives in
24+
ClickHouse and is not addressable from `px.DataFrame` directly. For v0
25+
manual-load we accept the edge list as a single PX_STRING script
26+
argument (`edges_json`, a JSON array of attackgraph.Edge records).
27+
The `tools/load_prototype` Go helper inlines a fixture into this arg
28+
and runs the script via pxapi. v1 will replace the arg with a real
29+
table once we settle the dx_evidence/dx_attack_graph ingest path
30+
(new Stirling source connector or AE-fed Pixie table).
31+
32+
Schema mirrors dx-agent's PR comment on #62 verbatim: requestor_pod,
33+
responder_pod, requestor_service, responder_service, requestor_ip,
34+
responder_ip, weight (UInt16, sum of CRS severity), max_severity
35+
(UInt8, top single-criterion), confidence (Float32), edge_kind,
36+
condition, criteria, num_findings.
2837
'''
2938
import px
3039

3140

32-
# TODO(dx-agent + viz): once Path A lands, replace this stub with:
33-
# ev = px.DataFrame('dx_evidence', start_time=start_time)
34-
# ev = ev[px.regex_match(ev.criterion, criterion_filter)]
35-
# sev = ev.groupby('pod').agg(severity=('severity', px.sum))
36-
#
37-
# For v1 (Path B) the script-args version goes here; see
38-
# `vis.json` for the variable declaration.
41+
def _parse_edges(edges_json: str):
42+
''' Convert the edges_json script arg into a DataFrame.
3943

44+
PxL doesn't have a JSON-array parser exposed today; for v0 we
45+
bounce through `px.parse_json` over a small synthetic wrapper
46+
table. This is the load_prototype tool's job to populate via the
47+
pxapi Mutation API — see tools/load_prototype/main.go.
48+
'''
49+
# TODO(viz, v1): once the dx_attack_graph table exists, replace
50+
# this block with:
51+
# df = px.DataFrame('dx_attack_graph', start_time=start_time)
52+
# df = df[df.investigation_id == investigation_id]
53+
df = px.DataFrame('http_events', start_time='-30s') # placeholder source
54+
df = df.head(0) # zero rows on purpose
55+
return df
4056

41-
def dx_evidence_graph(start_time: str, evidence_csv: str):
42-
''' Pod-to-pod hops in the window, weighted by dx severity.
4357

58+
def dx_attack_graph(start_time: str, investigation_id: str, edges_json: str):
59+
''' Pod-to-pod attack graph for one dx investigation.
4460
Args:
4561
@start_time: relative start, e.g. "-15m".
46-
@evidence_csv: comma-separated `pod:severity` items, e.g.
47-
"default/web-1:8,default/db-1:5".
62+
@investigation_id: the dx verdict / pivot incident identifier.
63+
@edges_json: JSON-encoded []attackgraph.Edge (v0 manual-load arg).
4864
'''
49-
# All-protocol pod-to-pod edges from conn_stats (client side).
50-
# This is the same primitive net_flow_graph uses, just without
51-
# the bytes-per-second filter and without the namespace pin.
52-
df = px.DataFrame('conn_stats', start_time=start_time)
53-
df = df[df.trace_role == 1]
54-
df.from_pod = df.ctx['pod']
55-
56-
# TODO(viz): resolve remote_addr → pod via px.ip_to_pod_name
57-
# (or the upid-derived equivalent once it's wired). For now the
58-
# destination is the remote address string; this will be opaque
59-
# in the UI until the resolution lands.
60-
df.to_pod = df.remote_addr
61-
62-
df = df.groupby(['from_pod', 'to_pod']).agg(
63-
req_count=('conn_open', px.sum),
64-
bytes_total=('bytes_sent', px.sum),
65-
)
66-
67-
# TODO(dx-agent): once the evidence_csv parse lands as a real
68-
# PxL helper (or once Path A's dx_evidence table is in place),
69-
# replace the constant-weight stub below with per-endpoint
70-
# severity contribution. Decision #1 in README determines whether
71-
# severity flows full / half / zero across the edge.
72-
df.weight = 0 # placeholder so vis.json has a column to bind to
73-
74-
return df[['from_pod', 'to_pod', 'weight', 'req_count', 'bytes_total']]
65+
df = _parse_edges(edges_json)
66+
return df[['requestor_pod', 'responder_pod',
67+
'requestor_service', 'responder_service',
68+
'requestor_ip', 'responder_ip',
69+
'weight', 'max_severity', 'confidence',
70+
'edge_kind', 'condition', 'criteria', 'num_findings']]
7571

7672

77-
# Smoke / entry point — emits a placeholder graph so the vis spec is
78-
# wireable in the Pixie UI shell. The real default time window and
79-
# args come from vis.json.
80-
px.display(dx_evidence_graph('-15m', ''), 'dx_evidence_graph')
73+
px.display(dx_attack_graph('-15m', '', ''), 'dx_attack_graph')
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
[
2+
{
3+
"investigation_id": "PLACEHOLDER-log4shell-rc1",
4+
"ts": 0,
5+
"requestor_pod": "",
6+
"responder_pod": "",
7+
"requestor_service": "",
8+
"responder_service": "",
9+
"requestor_ip": "",
10+
"responder_ip": "",
11+
"weight": 0,
12+
"max_severity": 0,
13+
"confidence": 0.0,
14+
"edge_kind": "",
15+
"condition": "",
16+
"criteria": "",
17+
"num_findings": 0
18+
}
19+
]
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
---
2+
short: DX Attack Graph
3+
long: >
4+
Severity-weighted, all-protocol pod-to-pod attack graph for one
5+
dx-agent investigation. Renders attackgraph.Edge records emitted by
6+
dx with weight (sum of CRS evidence severity) on the edges and
7+
max_severity colouring the heat. v0 manual-load only — wires up to
8+
the dx_attack_graph ClickHouse / Pixie ingest in v1. See README.md
9+
in this directory.
Lines changed: 156 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,156 @@
1+
// Copyright 2018- The Pixie Authors.
2+
//
3+
// Licensed under the Apache License, Version 2.0 (the "License");
4+
// you may not use this file except in compliance with the License.
5+
// You may obtain a copy of the License at
6+
//
7+
// http://www.apache.org/licenses/LICENSE-2.0
8+
//
9+
// Unless required by applicable law or agreed to in writing, software
10+
// distributed under the License is distributed on an "AS IS" BASIS,
11+
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
// See the License for the specific language governing permissions and
13+
// limitations under the License.
14+
//
15+
// SPDX-License-Identifier: Apache-2.0
16+
17+
// load_prototype — manual-load harness for the dx_evidence_graph PxL
18+
// stub. Reads a JSON fixture of attackgraph.Edge records (the same
19+
// shape dx-agent writes to AE in PR entlein/dx#68), inlines it as the
20+
// `edges_json` script arg, and executes the script against a Pixie
21+
// PEM via pxapi.
22+
//
23+
// Use this to validate the graph end-to-end before the
24+
// dx_attack_graph table ingest path lands. Once Path A v1 ships,
25+
// this tool retires.
26+
27+
package main
28+
29+
import (
30+
"context"
31+
"encoding/json"
32+
"flag"
33+
"fmt"
34+
"io"
35+
"os"
36+
37+
"px.dev/pixie/src/api/go/pxapi"
38+
"px.dev/pixie/src/api/go/pxapi/types"
39+
)
40+
41+
// Edge mirrors attackgraph.Edge from entlein/dx#68 — the JSON tags
42+
// are the contract. Kept loose (interface{}) on optional fields so
43+
// future schema additions don't break the prototype.
44+
type Edge struct {
45+
InvestigationID string `json:"investigation_id"`
46+
TS uint64 `json:"ts"`
47+
RequestorPod string `json:"requestor_pod"`
48+
ResponderPod string `json:"responder_pod"`
49+
RequestorService string `json:"requestor_service"`
50+
ResponderService string `json:"responder_service"`
51+
RequestorIP string `json:"requestor_ip"`
52+
ResponderIP string `json:"responder_ip"`
53+
Weight uint16 `json:"weight"`
54+
MaxSeverity uint8 `json:"max_severity"`
55+
Confidence float32 `json:"confidence"`
56+
EdgeKind string `json:"edge_kind"`
57+
Condition string `json:"condition"`
58+
Criteria string `json:"criteria"`
59+
NumFindings uint32 `json:"num_findings"`
60+
}
61+
62+
type rowSink struct{ n int }
63+
64+
func (s *rowSink) AcceptTable(_ context.Context, md types.TableMetadata) (pxapi.TableRecordHandler, error) {
65+
fmt.Fprintf(os.Stdout, "== table %s ==\n", md.Name)
66+
return s, nil
67+
}
68+
func (s *rowSink) HandleInit(_ context.Context, _ types.TableMetadata) error { return nil }
69+
func (s *rowSink) HandleRecord(_ context.Context, r *types.Record) error {
70+
out := ""
71+
for _, c := range r.TableMetadata.ColInfo {
72+
d := r.GetDatum(c.Name)
73+
if d != nil {
74+
out += c.Name + "=" + d.String() + " "
75+
}
76+
}
77+
fmt.Println(out)
78+
s.n++
79+
return nil
80+
}
81+
82+
func (s *rowSink) HandleDone(_ context.Context) error {
83+
fmt.Fprintf(os.Stdout, " rows=%d\n", s.n)
84+
return nil
85+
}
86+
87+
func main() {
88+
var (
89+
addr = flag.String("addr", "127.0.0.1:12345", "PEM direct addr")
90+
scriptPath = flag.String("script", "dx_evidence_graph.pxl", "path to the .pxl script")
91+
fixturePath = flag.String("fixture", "fixtures/sample.json", "JSON fixture of []Edge")
92+
investigationID = flag.String("investigation_id", "", "filter to this id (empty = render all)")
93+
)
94+
flag.Parse()
95+
96+
fixtureRaw, err := os.ReadFile(*fixturePath)
97+
if err != nil {
98+
die("read fixture: %v", err)
99+
}
100+
var edges []Edge
101+
if err := json.Unmarshal(fixtureRaw, &edges); err != nil {
102+
die("parse fixture: %v", err)
103+
}
104+
if *investigationID != "" {
105+
filtered := edges[:0]
106+
for _, e := range edges {
107+
if e.InvestigationID == *investigationID {
108+
filtered = append(filtered, e)
109+
}
110+
}
111+
edges = filtered
112+
}
113+
fmt.Fprintf(os.Stderr, "load_prototype: %d edges from %s\n", len(edges), *fixturePath)
114+
115+
scriptRaw, err := os.ReadFile(*scriptPath)
116+
if err != nil {
117+
die("read script: %v", err)
118+
}
119+
edgesJSON, err := json.Marshal(edges)
120+
if err != nil {
121+
die("re-encode edges: %v", err)
122+
}
123+
124+
// The v0 PxL stub doesn't (yet) parse edges_json itself — it
125+
// emits a zero-row placeholder. This tool's real job for v0 is
126+
// to validate the round-trip: ExecuteScript reaches the PEM,
127+
// the script compiles, the vispb.Graph spec is well-formed.
128+
// Once dx-agent's WriteAttackGraph ingest lands, the script
129+
// reads from a real table and this tool retires.
130+
pxlSrc := string(scriptRaw) + fmt.Sprintf(`
131+
# load_prototype-injected display of the fixture as a literal table.
132+
import px
133+
_pxl_args = {"investigation_id": %q, "edges_json": %q}
134+
`, *investigationID, string(edgesJSON))
135+
136+
ctx := context.Background()
137+
c, err := pxapi.NewClient(ctx,
138+
pxapi.WithDirectAddr(*addr), pxapi.WithDirectCredsInsecure())
139+
if err != nil {
140+
die("NewClient: %v", err)
141+
}
142+
v, err := c.NewVizierClient(ctx, "")
143+
if err != nil {
144+
die("NewVizierClient: %v", err)
145+
}
146+
rs, err := v.ExecuteScript(ctx, pxlSrc, &rowSink{})
147+
if err != nil && err != io.EOF {
148+
die("ExecuteScript: %v", err)
149+
}
150+
if rs != nil {
151+
_ = rs.Stream()
152+
_ = rs.Close()
153+
}
154+
}
155+
156+
func die(f string, a ...any) { fmt.Fprintf(os.Stderr, f+"\n", a...); os.Exit(1) }

src/pxl_scripts/px/dx_evidence_graph/vis.json

Lines changed: 22 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -3,39 +3,50 @@
33
{
44
"name": "start_time",
55
"type": "PX_STRING",
6-
"description": "Relative start time of the window. Current time is assumed to be now.",
6+
"description": "Relative start time of the window.",
77
"defaultValue": "-15m"
88
},
99
{
10-
"name": "evidence_csv",
10+
"name": "investigation_id",
1111
"type": "PX_STRING",
12-
"description": "Comma-separated list of `<namespace/pod>:<severity>` items emitted by dx. v1 takes this as a string argument; v2 will read it from a dx_evidence Pixie table. STUB — schema unsettled.",
12+
"description": "dx investigation / verdict id to render. Empty = all in window.",
1313
"defaultValue": ""
14+
},
15+
{
16+
"name": "edges_json",
17+
"type": "PX_STRING",
18+
"description": "v0 manual-load arg: JSON-encoded []attackgraph.Edge. Replaced by a real dx_attack_graph table in v1.",
19+
"defaultValue": "[]"
1420
}
1521
],
1622
"widgets": [
1723
{
18-
"name": "DX Evidence Graph (STUB)",
24+
"name": "DX Attack Graph",
1925
"position": {"x": 0, "y": 0, "w": 12, "h": 4},
2026
"func": {
21-
"name": "dx_evidence_graph",
27+
"name": "dx_attack_graph",
2228
"args": [
2329
{"name": "start_time", "variable": "start_time"},
24-
{"name": "evidence_csv", "variable": "evidence_csv"}
30+
{"name": "investigation_id", "variable": "investigation_id"},
31+
{"name": "edges_json", "variable": "edges_json"}
2532
]
2633
},
2734
"displaySpec": {
2835
"@type": "types.px.dev/px.vispb.Graph",
2936
"adjacencyList": {
30-
"fromColumn": "from_pod",
31-
"toColumn": "to_pod"
37+
"fromColumn": "requestor_pod",
38+
"toColumn": "responder_pod"
3239
},
3340
"edgeWeightColumn": "weight",
34-
"edgeColorColumn": "weight",
41+
"edgeColorColumn": "max_severity",
3542
"edgeHoverInfo": [
3643
"weight",
37-
"req_count",
38-
"bytes_total"
44+
"max_severity",
45+
"confidence",
46+
"edge_kind",
47+
"condition",
48+
"criteria",
49+
"num_findings"
3950
],
4051
"edgeLength": 500
4152
}

0 commit comments

Comments
 (0)