Commit f12a916
[zscaler] Add OCSF normalization to all NSS log pipelines (#2972)
* [zscaler] Add OCSF normalization to all NSS log pipelines
Adds OCSF 1.5.0 sub-pipelines for every zscalernss-* log type so logs
flow through Datadog's normalized security detection surface.
Classes mapped:
- zscalernss-web -> HTTP Activity [4002]
- zscalernss-dns -> DNS Activity [4003]
- zscalernss-fw (threat) -> Detection Finding [2004]
- zscalernss-fw (policy) -> Network Activity [4001]
- zscalernss-tunnel -> Tunnel Activity [4014]
- zscalernss-casb -> File Hosting Activity [6006]
- zscalernss-emaildlp -> Data Security Finding [2006]
- zscalernss-endpointdlp -> Data Security Finding [2006]
- zscalernss-audit AUTH -> Authentication [3002]
- zscalernss-audit other -> API Activity [6003]
- zscalernss-alert (ZIA+UEBA) -> Detection Finding [2004]
- fallback -> Base Event [0]
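For readers who want the list above in executable form, here is a minimal Python sketch of the sourcetype-to-class lookup as this first commit describes it. The names `CLASS_BY_SOURCETYPE` and `classify` are hypothetical and not part of the pipeline. The fw and audit sourcetypes are left out because their class depends on fields beyond the sourcetype.

```python
from typing import Tuple

# Hypothetical lookup mirroring the class list in this commit message.
# fw and audit are omitted: they need more than the sourcetype to route.
CLASS_BY_SOURCETYPE = {
    "zscalernss-web": ("HTTP Activity", 4002),
    "zscalernss-dns": ("DNS Activity", 4003),
    "zscalernss-tunnel": ("Tunnel Activity", 4014),
    "zscalernss-casb": ("File Hosting Activity", 6006),
    "zscalernss-emaildlp": ("Data Security Finding", 2006),
    "zscalernss-endpointdlp": ("Data Security Finding", 2006),
    "zscalernss-alert": ("Detection Finding", 2004),
}
BASE_EVENT = ("Base Event", 0)


def classify(sourcetype: str) -> Tuple[str, int]:
    """Return (class name, class UID), falling back to Base Event [0]."""
    return CLASS_BY_SOURCETYPE.get(sourcetype, BASE_EVENT)
```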
Metadata hierarchy:
- ocsf.metadata.product.vendor_name = "Zscaler"
- ocsf.metadata.product.name = "ZIA" (the product)
- ocsf.metadata.product.feature.name = log type (fw/dns/web/tunnel/casb/
emaildlp/endpointdlp/audit/alert), derived from sourcetype via grok
- ocsf.metadata.log_name = raw sourcetype
Firewall logs are split by threat detection: blocks/drops with a
populated `threatname` or `ipsrulelabel` route to Detection Finding
[2004]; all other fw events (Allow, policy Block, policy Drop) route
to Network Activity [4001]. Without this split, every policy block
would surface as a "security finding" and inflate alert noise.
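The split rule in the paragraph above can be sketched as a small predicate. This is an illustration only: the function name `route_firewall` and the flat `action` key are assumptions, and "populated" is read here as "non-empty", which does not account for placeholder strings.

```python
def route_firewall(event: dict) -> int:
    """Return the OCSF class UID for a zscalernss-fw event (sketch)."""
    action = str(event.get("action", "")).lower()
    # Only blocks/drops are candidates for Detection Finding.
    blocked = action.startswith(("block", "drop"))
    # "Populated" is approximated as any non-empty value.
    has_threat = bool(event.get("threatname")) or bool(event.get("ipsrulelabel"))
    return 2004 if (blocked and has_threat) else 4001
```

A policy Block with no threat fields stays in Network Activity [4001], which is the noise reduction the commit is after.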
Audit non-AUTH events route to API Activity [6003] (not Group
Management [3006] or Web Resources Activity [6001]), matching how
CloudTrail/Okta/Entra/Workspace admin audit logs map to OCSF.
Pipeline uses standard Datadog processors (string-builder, grok-parser,
schema-processor, schema-remapper, schema-category-mapper, attribute-
remapper, array-processor). Time handling is centralized: the existing
`datetime -> date` grok is extended with a `createTime -> date` grok
(for alert logs), then `date` is mapped to `ocsf.time` once in the
OCSF pre-transforms sub-pipeline.
5 existing attribute-remappers were flipped preserveSource: false ->
true (clt_sip, srv_dip, dns_req, dns_reqtype, src_ip) so OCSF sub-
pipelines can read source fields. This is sanctioned by the OCSF style
guide section 7.2 and is backwards-compatible.
All 12 test cases validate at 100% required OCSF fields via
logs-backend ocsf-validator-cli. Added one new test case for a
non-threat fw Allow event to exercise Network Activity [4001].
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: use Unknown/0 instead of Other/99 for category catch-alls
OCSF semantics distinguish Unknown (0) = no data / no confident mapping
from Other (99) = source supplied a known literal value that doesn't
match any defined enum. All our catch-all filters fire via generic
wildcards ("@sourcetype:...", "@zscaler.action:*") meaning "we don't
know", not "source literally said Other". Flip 13 catch-alls across
status_id, activity_id, device.type_id, and src_endpoint.os.type_id.
Clears the OCSF validator's attribute_enum_sibling_suspicious_other
warnings. All 12 tests still VALID at 100% required fields.
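A minimal sketch of the Unknown-vs-Other distinction as this commit states it, using the OCSF status_id values (0 Unknown, 1 Success, 2 Failure, 99 Other). The function `map_status` and its case-insensitive matching are assumptions made for illustration, not the pipeline's actual category-mapper behavior.

```python
UNKNOWN = (0, "Unknown")
OTHER = (99, "Other")
STATUS = {"success": (1, "Success"), "failure": (2, "Failure")}


def map_status(raw):
    """Map a source status string to (status_id, status)."""
    if not raw:
        return UNKNOWN          # no data at all
    key = raw.lower()
    if key in STATUS:
        return STATUS[key]
    if key == "other":
        return OTHER            # the source literally said "Other"
    return UNKNOWN              # wildcard catch-all: no confident mapping
```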
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* remove email_addr from endpointdlp sub-pipeline
* zscaler: address validate-logs CI annotations
- Rename 12 self-mapping schema-remappers from "Cast X to integer" to
"Map X to X" — CI expects the standard self-map naming convention
for integer type-casts that share source and target field.
- Rename 3 schema-processors from "Apply OCSF schema for NNNN (label)"
to the canonical "Apply OCSF schema for NNNN" form (2004, 2006, 4001).
- Fix greedy-at-start grok in `activity_count` rule: `.*Activity count`
-> `.*?Activity count` (lazy quantifier) to avoid the performance
warning.
- Regenerate test expectations for 8 tests with CI's canonical output
format (alphabetical key ordering, `{}` for empty maps, 9-space list
indent). All 12 tests still pass OCSF validation at 100% required.
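The greedy-vs-lazy change in the `activity_count` rule can be illustrated with Python's `re` module. This is not the grok engine and the sample line is made up; it only shows how the two quantifiers behave. A leading greedy `.*` first consumes the whole line and then backtracks, which is the cost the CI warning points at, while `.*?` expands from the left.

```python
import re

# Made-up sample with two occurrences, to make the difference visible.
line = "Activity count 3 ... Activity count 7"

greedy = re.search(r".*Activity count (\d+)", line)
lazy = re.search(r".*?Activity count (\d+)", line)

# Greedy backtracks from the end of the line and lands on the last occurrence;
# lazy stops at the first one.
greedy_value = greedy.group(1)
lazy_value = lazy.group(1)
```

When the phrase occurs only once, both forms capture the same value, so the change affects matching cost rather than output.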
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: strip extra keys from facets (description, facetType, type)
The log pipeline facet schema only accepts groups/name/path/source.
description/facetType/type are auto-inferred from the indexed value
and, when explicitly set, are rejected by downstream validators and
ignored by the Datadog app. Rewrite all 94 facets in the canonical
minimal form. All 12 OCSF tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: fix validate-logs CI annotations (facets + date format)
Facet conflicts (rename to canonical names CI suggests):
- ocsf.actor.user.name: "Actor User Name" -> "Name"
- ocsf.user.name: "User Name" -> "Target Name"
- ocsf.http_request.url.url_string: "URL" -> "Request URL String"
- ocsf.finding_info.title: "Finding Title" -> "Finding Info Title"
- ocsf.finding_info.uid: "Finding UID" -> "Finding Info Unique ID"
Facet conflicts (path already defined by another integration — remove):
- ocsf.dst_endpoint.hostname
- ocsf.query.type
- ocsf.dst_endpoint.port
Date format (alert tests #10, #11):
- `createTime` (epoch seconds) -> `date` was stored in scientific
notation (1.755569028E12) because grok `scale(1000)` returns a
double. Rewrote the test expectations to decimal form
(1755569028000). The pipeline is unchanged: the scientific form is
valid floating point, but CI's date-remapper validator requires
integer ms.
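The two spellings in the date-format note denote the same number, which a short Python check makes concrete. The helper `to_epoch_ms` is hypothetical and only illustrates the coercion from a double-valued timestamp to integer milliseconds.

```python
def to_epoch_ms(value) -> int:
    """Coerce a numeric or numeric-string timestamp to integer milliseconds."""
    ms = float(value)
    if not ms.is_integer():
        raise ValueError("timestamp has a fractional millisecond part")
    return int(ms)


create_time_s = 1755569028
scaled = create_time_s * 1000.0  # a double, analogous to the scale(1000) result
```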
All 12 OCSF tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: remove 14 no-op OCSF self-map schema-remappers
Same rationale as the metadata self-map cleanup: a schema-remapper with
source == target and no targetFormat is a no-op. The schema-processor
already walks custom.ocsf.* against its declared schema block — fields
don't need to be listed in mappers to be recognized.
Removed 14 self-maps across several OCSF paths:
- ocsf.finding_info.analytic.type / .type_id (×4)
- ocsf.finding_info.data_sources / .types (×2)
- ocsf.data_security.data_lifecycle_state / .detection_system (×3)
- ocsf.evidence.* (×several)
Kept 21 self-maps that carry targetFormat: integer (real string->int
coercion for fields upstream-set by string-builders or category-mappers).
All 12 OCSF tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Revert "zscaler: remove 14 no-op OCSF self-map schema-remappers"
This reverts commit 7aff951.
* zscaler: fix tunnel email_addr + add product metadata self-maps
- tunnel 4014: remove `zscaler.vpncredentialname -> ocsf.user.email_addr`
mapping. OCSF user.email_addr is optional and credential names are
not email-formatted per Zscaler tunnel log samples.
- All 12 OCSF schema-processors: add self-maps for
ocsf.metadata.product.name and ocsf.metadata.product.vendor_name.
Pre-transforms already sets both values via string-builder ("ZIA"
and "Zscaler"), but the staging API's OCSF validator checks each
schema-processor's mapper list independently, so every class now
declares the self-map.
All 12 OCSF tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: source finding_info.product.* from ocsf.metadata.product.*
The alert 2004 sub-pipeline was sourcing finding_info.product.name and
finding_info.product.vendor_name from `company` (which is "Zscaler" -
the vendor, not the product), and finding_info.product.feature.name
from `alias` (which is "Zscaler-ZIA-UEBA" - a composite log source
label, not a feature name).
Reuse the already-set ocsf.metadata.product.* tree instead:
- finding_info.product.name <- ocsf.metadata.product.name ("ZIA")
- finding_info.product.vendor_name <- ocsf.metadata.product.vendor_name ("Zscaler")
- finding_info.product.feature.name <- ocsf.metadata.product.feature.name ("alert")
Single source of truth, no divergence between metadata.product and
finding_info.product. All 12 tests still VALID.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: add IAM audit sub-pipelines (3001/3002/3005) + misc fixes
Add OCSF normalization for Zscaler admin audit log IAM categories:
LOGIN / AUTH -> 3002 Authentication (expanded filter)
USER_MANAGEMENT -> 3001 Account Change (NEW)
ROLE_MANAGEMENT -> 3005 User Access Management (NEW)
(anything else) -> 6003 API Activity (narrowed filter)
3001 Account Change activity_id (per OCSF 1.5.0 enum):
CREATE -> 1 Create
DELETE -> 6 Delete
UPDATE -> 99 Other (enum has no generic update value)
3005 User Access Management activity_id (per OCSF 1.5.0 enum):
CREATE -> 1 Assign Privileges
DELETE -> 2 Revoke Privileges
UPDATE -> 99 Other (enum only defines Assign/Revoke/Unknown/Other)
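The category routing and the two activity_id tables above can be written out as plain lookups. This is a sketch: `map_audit` and the argument names are hypothetical, and activity mapping for 3002 and 6003 is not covered by the tables above, so the sketch returns `None` there.

```python
CLASS_BY_CATEGORY = {
    "LOGIN": 3002,
    "AUTH": 3002,
    "USER_MANAGEMENT": 3001,
    "ROLE_MANAGEMENT": 3005,
}

ACTIVITY_BY_CLASS = {
    3001: {"CREATE": (1, "Create"), "DELETE": (6, "Delete"), "UPDATE": (99, "Other")},
    3005: {
        "CREATE": (1, "Assign Privileges"),
        "DELETE": (2, "Revoke Privileges"),
        "UPDATE": (99, "Other"),
    },
}


def map_audit(category: str, action: str):
    """Return (class_uid, (activity_id, activity_name) or None)."""
    class_uid = CLASS_BY_CATEGORY.get(category, 6003)  # anything else -> API Activity
    activity = ACTIVITY_BY_CLASS.get(class_uid, {}).get(action)
    return class_uid, activity
```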
Additional changes:
- Added `zscaler.time -> zscaler.datetime` attribute-remapper upstream
so audit logs using `time` (per Zscaler NSS admin audit docs) get
date-parsed identically to logs using `datetime`.
- Added `ocsf.privileges` schema-remapper self-map inside the 3005
schema-processor so the staging API's OCSF validator sees it
declared in the mapper list (required per-schema-processor check).
- Added `ocsf.severity_id` integer-cast self-maps after the
schema-category-mapper in 3001, 3002, and 3005 (category-mapper
emits strings; OCSF requires integer).
- Fixed `ocsf.tunnel_type_id` integer-cast self-map in 4014: source
was `zscaler.tunneltype_id` (nonexistent field) - changed to a
proper self-map so the "99" string-builder value casts to integer.
- Removed `zscaler.resource -> ocsf.user.email_addr` mapping in 3001:
resource is a role/user name in ROLE_MANAGEMENT logs (not email),
which tripped the email_addr regex validator.
3 new test fixtures (LOGIN, USER_MANAGEMENT, ROLE_MANAGEMENT),
bringing total to 15. All 15 VALID at 100% required fields.
Generated by Fleak from synthetic + real production samples, corrected
for OCSF 1.5.0 enum accuracy (Fleak proposed invalid activity_id
values that don't exist in the spec for 3001/3005).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: regenerate test file with CI-aligned format + fix dstip name
- zscaler.yaml: fix schema-remapper name `srcip_country` ->
`dstip_country` to match its sources/target (CI annotation at
zscaler.yaml:3281).
- zscaler_tests.yaml: regenerated via the updated ocsf-validator-cli
(logs-backend#131855) which now alphabetizes map keys, renders empty
maps/lists inline as `{}`/`[]`, and uses Jackson-INDENT_ARRAYS
dash-offset conventions for scalar lists. Matches the YAML format
produced by the CI IntegrationTestsFileGenerator.
5 of 7 previously-failing tests are now byte-identical to CI's
expected output. Remaining 2 minor diffs (dns list-of-maps dash
form, one input-side empty map retention) tracked as follow-ups.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: drop empty preaction from ROLE_MANAGEMENT test output
Regenerated with the updated ocsf-validator-cli (logs-backend PR
#131855) which now parses sample JSON through Datadog's JSONParser
instead of vanilla Jackson. Empty JSON objects in the input (like
`"preaction": {}` in the ROLE_MANAGEMENT sample) are now dropped at
parse time, matching CI's behavior. One-line diff that was the last
remaining validate-logs CI mismatch for the zscaler tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: address codex review feedback
Four items raised by codex-review on PR #2972:
P3: Restore published pipeline name. The top-level pipeline.name was
renamed to "Zscaler - Crash" during staging testing to avoid clobbering
the prod entry. Revert back to "Zscaler" so the customer-facing
pipeline name is correct on merge.
P2: Firewall endpoints mapped from removed source fields. The
pre-OCSF firewall remappers for csip/csport/cdip/cdport moved
zscaler.* attributes to network.* with preserveSource: false. My
4001 (Firewall Policy) sub-pipeline reads from zscaler.* but those
fields were already gone, so ocsf.src_endpoint.ip/port and
ocsf.dst_endpoint.ip/port were never populated for real policy events.
Flip preserveSource: false -> true on the three affected remappers
(csip, csport, cdport; cdip was already true). OCSF sub-pipeline
reads unchanged.
P2: Audit OCSF status filters on removed zscaler.result. The pre-
OCSF attribute-remapper zscaler.result -> evt.outcome had
preserveSource: false, so my audit sub-pipelines' filters
(@zscaler.result:SUCCESS, @zscaler.result:FAILURE) never matched,
making every audit event resolve to status_id: 0 / Unknown and
severity_id: 0 / Unknown. Flip preserveSource: false -> true.
OCSF sub-pipeline filters unchanged.
P2: CASB activity filter uses wrong field name. The 6006 File
Hosting Activity sub-pipeline filtered on @zscaler.act_type_name but
the real emitted NSS CASB field is zscaler.activity_type_name.
Activity filters never matched, so all CASB events resolved to
activity_id: 0. Rename field path in 5 filter queries.
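The preserveSource failure mode behind the two remapper items above is easy to reproduce in miniature. The sketch below models an attribute remapper on a flat dict; `remap` and `status_filter_matches` are hypothetical stand-ins, not Datadog processor code.

```python
def remap(event: dict, source: str, target: str, preserve_source: bool) -> dict:
    """Copy source to target; drop source unless preserve_source is set."""
    out = dict(event)
    if source in out:
        out[target] = out[source]
        if not preserve_source:
            del out[source]
    return out


def status_filter_matches(event: dict) -> bool:
    """Stand-in for @zscaler.result:SUCCESS / @zscaler.result:FAILURE."""
    return event.get("zscaler.result") in ("SUCCESS", "FAILURE")


event = {"zscaler.result": "SUCCESS"}
dropped = remap(event, "zscaler.result", "evt.outcome", preserve_source=False)
kept = remap(event, "zscaler.result", "evt.outcome", preserve_source=True)
```

With `preserve_source=False` the downstream filter never sees `zscaler.result`, so the status falls through to the catch-all; with `True` both the legacy target and the OCSF filter work.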
All 15 OCSF tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* zscaler: fix DNS activity_name to match OCSF enum caption
The DNS sub-pipeline set ocsf.activity_name = "Traffic" via a
string-builder, but activity_id was mapped to 2 (Response) by the
schema-category-mapper. OCSF 1.5.0 flags this as
attribute_enum_sibling_incorrect — expected "Response", got "Traffic".
Change the string-builder template to "Response" so the sibling
matches the enum caption.
Also includes the manually-fixed list-of-map indent in the CASB audit
test (was left corrupted by a YamlGenerator bug now fixed in logs-
backend PR #131855 commit 2b63bc0c5070).
All 15 OCSF tests still VALID at 100% required fields, zero warnings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Update tests file
* zscaler: align OCSF mapper filters with NSS values
Three issues raised by codex-review on PR #2972 — all P2:
- Firewall threat severity filters used PascalCase (`Informational`,
`Low`, ...) but the pre-OCSF firewall pipeline lowercases
`zscaler.threatseverity` before the OCSF mapper runs, so no real
threat event ever matched. Lowercase the 5 filter values; OCSF
output names (`Informational`, `Low`, ...) unchanged.
- Email/Endpoint DLP severity filters tested `Informational`/`Low`/
`Medium`/`High`/`Critical` but NSS DLP feeds emit `"Info Severity"`/
`"Low Severity"`/`"Medium Severity"`/`"High Severity"` (with the
trailing " Severity" — confirmed by the existing status lookup
tables and the existing `@zscaler.severity:"High Severity"` query
in the pre-OCSF pipeline). Update the 5 filters in both DLP mappers
to use the literal NSS labels; output `name`/`id` unchanged.
- Firewall protocol mapper read only `zscaler.ipproto`, but the
documented NSS firewall feed format in zscaler/README.md emits
`"proto":"%s{ipproto}"` — so `zscaler.proto` is what actually
arrives for customers using the recommended feed. Add
`zscaler.proto` as an additional source so both field names work.
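The first two items above come down to exact-match lookups against whatever string reaches the mapper. A sketch with the OCSF severity_id values (1 Informational through 5 Critical) follows. The function names are hypothetical, the DLP labels are the ones quoted in this commit message, and catch-all handling is deliberately left out, so unmatched values return `None`.

```python
# Firewall: threatseverity is already lowercased before the OCSF mapper runs.
FW_SEVERITY = {
    "informational": (1, "Informational"),
    "low": (2, "Low"),
    "medium": (3, "Medium"),
    "high": (4, "High"),
    "critical": (5, "Critical"),
}

# DLP: NSS labels carry a trailing " Severity".
DLP_SEVERITY = {
    "Info Severity": (1, "Informational"),
    "Low Severity": (2, "Low"),
    "Medium Severity": (3, "Medium"),
    "High Severity": (4, "High"),
    "Critical Severity": (5, "Critical"),
}


def fw_severity(raw: str):
    return FW_SEVERITY.get(raw)   # exact match, no case folding


def dlp_severity(raw: str):
    return DLP_SEVERITY.get(raw)  # exact match on the literal NSS label
```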
Test fixtures were already aligned with the rewritten test format from
the previous regenerate commit; the tests.yaml change here is the same
canonical-format conversion for the audit samples that the rest of
the file already used.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* zscaler: fix multi-source schema-remapper name format
CI validate-logs requires comma-separated source names in
schema-remapper `name`, not slash-separated. Change
Map `zscaler.proto`/`zscaler.ipproto` to ...
to
Map `zscaler.proto`, `zscaler.ipproto` to ...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* zscaler: flip wildcard catch-alls from Unknown/0 to Other/99
Datadog's MO is to preserve the vendor's signal when an event code
or status doesn't map to an OCSF enum entry — not collapse it to
Unknown. Apply this to all 26 wildcard catch-alls in OCSF
schema-category-mappers across the zscaler pipeline:
- Firewall threat severity: @zscaler.threatseverity:*
- Email/Endpoint DLP severity: @zscaler.severity:*
- Endpoint DLP activity type, device os/type, file type
- Account Change / Authentication / API Activity:
@zscaler.action:*, @zscaler.result:*
- Network Activity: @sourcetype:zscalernss-fw status, src_endpoint OS
- HTTP Activity: @zscaler.requestmethod:*, @zscaler.action:*
- DNS Activity: @zscaler.reqaction:* OR @zscaler.resaction:*
- Tunnel Activity: @sourcetype:zscalernss-tunnel
- File Hosting (CASB): @sourcetype:zscalernss-casb
- ZIA Alert: @sourcetype:zscalernss-alert (3 mappers)
The one numeric-range wildcard (`@zscaler.threat_score:*` for
ocsf.confidence_id) stays Unknown/0 since there's no categorical
label to preserve in that case.
Test expectations regenerated via the OCSF validator CLI's
--check-all --write; all 15 tests still VALID at 100% required
fields.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* zscaler: address @jbfeldman-dd review feedback
Consolidates all changes made in response to Jonah's review of PR
#2972. (Squashed from 13 incremental commits.)
## Pattern fixes
- Replaced hardcoded `string-builder-processor` pairs (one for OCSF
*_name, one for *_id) with `schema-category-mapper` blocks across
FW Detection Finding, ZIA Alert (later replaced by Alert),
Email DLP, Endpoint DLP, Authentication, and CASB sub-pipelines.
- Replaced two-step "build string then grok-parse to array" patterns
with single `array-processor` for `ocsf.finding_info.data_sources`
and `ocsf.finding_info.types`.
- Converted `string-builder-processor`s that just interpolate a
single source field (template "%{field}") to `attribute-remapper`
or `schema-remapper` as appropriate.
- Multi-source schema-remappers use comma-separated source names in
the title (Map `a`, `b` to `target`), not slash-separated, per
validate-logs CI naming convention.
- Wildcard catch-alls in schema-category-mappers default to `Other`
/`99` (Datadog convention) — preserves the vendor's signal in the
OCSF output rather than collapsing all unmapped values to
`Unknown`. Numeric-range catch-alls (threat_score) keep
`Unknown`/`0`. Tunnel `activity_id`/`status_id` catch-alls use
`Unknown`/`0` since "Tunnel Samples" heartbeats are genuinely
not-a-tunnel-event cases without a meaningful source label.
- Single-category mappers gained an explicit `Other`/`99` catch-all
alongside their primary category as a defensive fallback.
## Removed intermediate fields
- `zscaler.time` → `zscaler.datetime` attribute-remapper removed.
Audit logs now grok directly via a second grok-parser sourcing
`zscaler.time`.
- `date` field eliminated entirely. Pre-OCSF grok-parsers
(zscaler.datetime, zscaler.time, createTime) write directly to
`ocsf.time`. Date-remapper sources `ocsf.time`. The 13
per-sub-pipeline `Map date to ocsf.time` schema-remappers became
`Map ocsf.time to ocsf.time` self-maps. Three adjacent mappers
(Tunnel + Authentication metadata.logged_time, Detection Finding
finding_info.created_time) also retargeted to source `ocsf.time`.
- `activity_count` intermediate eliminated; grok writes directly to
`ocsf.finding_info.related_events_count`.
- Removed redundant per-sub-pipeline `Parse zscaler.datetime to
ocsf.time` grok-parsers in HTTP Activity and DNS Activity (the
pre-OCSF grok already handles this).
## OCSF mapping corrections (specific reviewer comments)
- `zscaler.event_id` retargeted from `ocsf.http_request.uid` to
`ocsf.metadata.uid` (event_id is the unique log record id, not an
HTTP-request-specific uid).
- Dropped `zscaler.protocol → ocsf.http_request.url.scheme` (NSS
emits values like "SSL" which aren't valid URL schemes).
- `alertId → ocsf.metadata.event_code` removed; `ruleName` (later
swapped to `zscaler.rulelabel` family in the new Alert pipeline)
→ `ocsf.metadata.event_code`. Rule name is the canonical
event_code per OCSF guidance.
- `zscaler.resource → ocsf.user.name` removed in Account Change
(resource is the role name for ROLE_MANAGEMENT events).
- `zscaler.resource → ocsf.resource.uid` removed (resource is a
name, not a uid).
- `zscaler.subcategory` (not `category`) → `ocsf.resource.type`.
- Authentication `recordid` retargeted to `ocsf.metadata.uid`
(recordid doesn't persist across a login session). Tunnel kept
as `session.uid` (real tunnel session).
- `zscaler.appname` / `applicationname` retargeted from
`ocsf.actor.app_name` to `ocsf.dst_endpoint.name` (SaaS app is
the destination, not the actor).
- `zscaler.extownername` retargeted from `ocsf.dst_endpoint.name`
to `ocsf.file.owner.name` (it's the file's external owner).
- Endpoint DLP `data_security.detection_system_id` corrected from
`Endpoint`/`1` (EDR) to `Data Loss Prevention`/`2`.
- Endpoint DLP `data_lifecycle_state` filters now match the
documented NSS activitytype enum (Download, Email Sent, File
Copy, File Read, File Write, Print, Upload — lowercase per the
test fixture's `email_sent`).
- DLP severity filters now match the documented NSS labels
(`"Info Severity"`, `"Low Severity"`, `"Medium Severity"`,
`"High Severity"`, `"Critical Severity"`) instead of the
PascalCase variants the previous filters used.
- Firewall threat severity filters lowercased
(`informational`/`low`/`medium`/`high`/`critical`) to match the
pre-OCSF lowercase normalization of `zscaler.threatseverity`.
- Firewall protocol mapper now reads from both `zscaler.proto` (the
field name in the documented NSS feed) and `zscaler.ipproto`.
- Tunnel `tunnel_type` mappings removed entirely — Zscaler tunnel
types (IPSEC IKEV1, GRE, etc.) don't cleanly map to OCSF's
Split/Full enum. `zscaler.tunneltype` preserved as a queryable
vendor-namespaced field.
- Tunnel TUNNEL_DOWN no longer mapped to `status: Failure` (close
isn't a failure — falls through to Other/99).
- Tunnel severity simplified to always-Informational (NSS doesn't
emit a tunnel severity field).
- CASB activity_id categories now in numeric order (Access 3,
Delete 4, Upload 7, Download 8, Share 12, Other 99).
- Removed CASB `Add ocsf.metadata.product.feature.name`
string-builder (already set in pre-transformations).
- Account Change and Authentication `severity_id` now always
Informational/1 (audit events aren't security findings; NSS
doesn't emit severity for audit).
## Alert sub-pipeline (replaces ZIA Alert)
The previous "ZIA Alert" handling assumed alerts arrive as a
synthetic webhook payload with `alertId`, `description`,
`createTime` and a fabricated `sourcetype=zscalernss-alert`. None
of that reflects real Zscaler data flow.
Removed:
- Pre-pipeline that fabricated `sourcetype=zscalernss-alert` from
`-@sourcetype:* @alertid:*`.
- Pre-OCSF "Alert" pipeline that ran four description grok-parsers.
- OCSF "ZIA Alert" sub-pipeline that targeted the fake sourcetype.
Added new "OCSF sub pipeline for class Detection Finding [2004]
(Alert)" placed at the FRONT of the OCSF chain (right after
pre-transformations, before Firewall Threat). Filter:
@sourcetype:(zscalernss-web OR zscalernss-fw)
AND @zscaler.threatseverity:* -@zscaler.threatseverity:(None OR none)
(Restricted to web/fw — DLP and CASB findings have a more specific
OCSF home in Data Security Finding [2006] and shouldn't be
swept into the generic Alert class.)
Added exclusion clauses to HTTP Activity, FW Threat, and FW
Network Activity sub-pipelines so they only fire when severity is
absent or `None`/`none`. Severity-bearing web/fw logs route to
Alert.
## CASB DLP sub-pipeline (new)
Added "OCSF sub pipeline for class Data Security Finding [2006]
(CASB DLP)" between Endpoint DLP and Account Change, filtering on
CASB events with a non-None severity. Routes severity-bearing
CASB events to Data Security Finding 2006 instead of File Hosting
Activity 6006 — DLP-flagged CASB events have OCSF-DLP-specific
fields (data_security.policy, data_security.detection_pattern,
data_security.detection_system_id) that don't exist on File
Hosting Activity. detection_system_id = 8 (Cloud Access Security
Broker per OCSF 1.5.0). CASB File Hosting kept its no-severity
exclusion so it only fires for non-DLP CASB events.
## ocsf.file.hashes array construction
OCSF `file.hashes` is an array of `fingerprint_dict` objects
{algorithm, algorithm_id, value}. Built across all four
file-bearing sub-pipelines (Email DLP, Endpoint DLP, CASB DLP,
File Hosting CASB) using a filtered nested pipeline pattern:
build temp object via string-builders + attribute-remapper, cast
algorithm_id to integer, array-processor `append` to
ocsf.file.hashes, schema-processor self-map. MD5 from
`zscaler.filemd5`, SHA-256 from `zscaler.filesha`.
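The resulting array shape can be sketched in Python. This is not the processor chain itself, only the output it is meant to produce; `build_file_hashes` and the flat dotted keys are assumptions, and the algorithm_id values (1 for MD5, 3 for SHA-256) follow the OCSF fingerprint enum.

```python
def build_file_hashes(event: dict) -> list:
    """Build an OCSF-style file.hashes list from Zscaler hash fields."""
    hashes = []
    md5 = event.get("zscaler.filemd5")
    if md5:
        hashes.append({"algorithm": "MD5", "algorithm_id": 1, "value": md5})
    sha256 = event.get("zscaler.filesha")
    if sha256:
        hashes.append({"algorithm": "SHA-256", "algorithm_id": 3, "value": sha256})
    return hashes
```

Events that carry neither field yield an empty list, which corresponds to the filtered nested pipeline not firing.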
## Final state
- 13 OCSF tests, 13/13 VALID at 100% required fields.
- 4 pre-existing warnings (Other-suspicious siblings on tests 5
and 7 — kept per Datadog Other/99 convention).
- 2 unresolved review threads with reviewer:
- L1417 (`is_alert: "true"` string-builder kept — boolean, not
an enum, schema-category-mapper doesn't fit).
- L2204 (Endpoint DLP `itemname → file.name` — kept with inline
comment per Crash's reply asking Jonah for guidance).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* zscaler: drop createTime grok-parser; standardize audit samples on `datetime`
The `createTime` epoch-seconds field was for the synthetic
`zscalernss-alert` UEBA webhook payload that's been removed. No
real Zscaler NSS feed emits `createTime`. Drops the
`Scale createTime to epoch ms ocsf.time` grok-parser.
The audit test fixtures were authored with `"time": "..."` instead
of `"datetime": "..."` — the documented NSS audit feed format
(zscaler/README.md) uses `datetime` like every other sourcetype.
Converting the three audit fixtures to use `datetime` lets the
existing `Parse zscaler.datetime to ocsf.time` grok-parser handle
them; no separate `zscaler.time` parser needed.
Renamed the surviving grok-parser to be more descriptive about
what it produces: `Parse zscaler.datetime to epoch ms in ocsf.time`.
The Datadog grok `date()` matcher already produces epoch
milliseconds — the test expectations show `time: 1754150969000`
(13-digit ms) — so no behavior change, just clearer naming.
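As a quick sanity check on the value quoted above, a short Python snippet decodes it as milliseconds and, for contrast, shows what happens if a 10-digit seconds value is read as milliseconds. The variable names are illustrative only.

```python
from datetime import datetime, timezone

time_ms = 1754150969000  # value quoted from the test expectations

# Interpreted as milliseconds since the Unix epoch (UTC).
as_ms = datetime.fromtimestamp(time_ms / 1000, tz=timezone.utc)

# A seconds value mistakenly read as milliseconds lands in January 1970.
as_misread_seconds = datetime.fromtimestamp(1754150969 / 1000, tz=timezone.utc)
```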
All 13 tests still VALID at 100% required fields.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Update pipeline name
* zscaler: replace 3 audit test fixtures with validate-logs CI's expected output
The CI `validate-logs` job (run 73445747243) reported the 3 audit
test fixtures (SIGN_IN / UPDATE / CREATE — tests 11, 12, 13) as
not matching the actual pipeline output. Locally the OCSF
validator CLI accepts them as VALID, but the CI runner uses
Jackson's default pretty-printer (with HashMap field iteration
order in JSON sample/message blocks and slightly different YAML
quoting and tag-list indenting), so the two tools disagree on
"canonical format" even though the OCSF data is identical.
Replaced the 3 audit test entries with the CI's "actual pipeline
output" verbatim. This brings the file format byte-identical to
what CI expects.
Local OCSF validator CLI still passes (13/13 VALID at 100%
required fields).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* zscaler: address codex review feedback
- FW Threat (2004) and FW Network Activity (4001) sub-pipeline filters
now exclude `ipsrulelabel`/`threatname` placeholder values "None"/"none"
from the threat side and admit them on the network-activity side, so
normal policy logs that carry placeholder strings (per Zscaler feed
format) are routed correctly instead of falling into Detection
Finding while also being excluded from Network Activity.
- Restore `Total Bytes`, `Zscaler Request Size`, and `Zscaler Response
Size` facet names plus their `type: integer` and `unit: {family: bytes,
name: byte}` metadata. These were inadvertently collapsed to a generic
`name: byte` with no type/unit, breaking range filtering and
byte-formatted display.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* zscaler: drop Detection Finding [2004] sub-pipelines for web/fw traffic
- Remove the severity-based Alert sub-pipeline and the FW Threat
sub-pipeline. zscalernss-web and zscalernss-fw logs now always route
to HTTP Activity [4002] / Network Activity [4001] regardless of
threatseverity or ipsrulelabel/threatname.
- Simplify HTTP and Network Activity sub-pipeline filters back to plain
sourcetype matches (no threatseverity / placeholder exclusions, since
those existed only to avoid double-routing alongside the 2004
pipelines).
- Refresh test expectations via --check-all --write; all 13 tests
validate against their assigned classes.
Detection-grade DLP/CASB events still route to Data Security Finding
[2006] as before.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* zscaler: preserve vendor label via schema-category-mapper fallback
Add `fallback:` blocks to the four catch-alls that previously emitted a
generic `Other` value where the source field carried a more informative
label. The fallback block triggers when the catch-all fires and copies
the source field's literal value into the OCSF *_name target, replacing
the generic "Other" string while keeping the id as 99.
Catch-alls covered:
- HTTP Activity activity_id (source: zscaler.requestmethod)
- File Hosting activity_id (source: zscaler.activity_type_name)
- Endpoint DLP device.type_id (source: zscaler.devicetype)
- Endpoint DLP file.type_id (source: zscaler.itemtype +
zscaler.filetypecategory)
This clears the OCSF validator's `attribute_enum_sibling_suspicious_other`
warnings on the affected logs (HTTP Activity, File Hosting, Endpoint DLP)
without losing the catch-all behavior.
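The fallback behavior described above, shown for the HTTP Activity case as a Python sketch. `map_http_activity` is hypothetical, only two enum entries are listed for brevity, and exact-match comparison is an assumption about how the mapper filters behave.

```python
# Subset of the OCSF HTTP Activity activity_id enum, for illustration.
KNOWN_METHODS = {"GET": (3, "Get"), "POST": (6, "Post")}


def map_http_activity(method):
    """Return (activity_id, activity_name) for zscaler.requestmethod."""
    if not method:
        return None                   # field absent: the catch-all does not fire
    if method in KNOWN_METHODS:
        return KNOWN_METHODS[method]
    # Catch-all with fallback: id stays 99, name carries the vendor's literal value.
    return (99, method)
```

The id still says "not in the enum", while the name keeps the vendor's label instead of the generic "Other" string that triggered the validator warning.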
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* peer review changes
* shift around severity processing
* final fixes
* final fixes
* Add missing fallbacks
* unnest pipelines
* Update tests file
* Fix last pipeline and ocsf flag
* Update Changelog
* Update Changelog
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6fc8e50 commit f12a916
3 files changed
Lines changed: 3729 additions & 323 deletions