Commit f12a916

[zscaler] Add OCSF normalization to all NSS log pipelines (#2972)
* [zscaler] Add OCSF normalization to all NSS log pipelines

Adds OCSF 1.5.0 sub-pipelines for every zscalernss-* log type so logs flow through Datadog's normalized security detection surface.

Classes mapped:
- zscalernss-web -> HTTP Activity [4002]
- zscalernss-dns -> DNS Activity [4003]
- zscalernss-fw (threat) -> Detection Finding [2004]
- zscalernss-fw (policy) -> Network Activity [4001]
- zscalernss-tunnel -> Tunnel Activity [4014]
- zscalernss-casb -> File Hosting Activity [6006]
- zscalernss-emaildlp -> Data Security Finding [2006]
- zscalernss-endpointdlp -> Data Security Finding [2006]
- zscalernss-audit AUTH -> Authentication [3002]
- zscalernss-audit other -> API Activity [6003]
- zscalernss-alert (ZIA+UEBA) -> Detection Finding [2004]
- fallback -> Base Event [0]

Metadata hierarchy:
- ocsf.metadata.product.vendor_name = "Zscaler"
- ocsf.metadata.product.name = "ZIA" (the product)
- ocsf.metadata.product.feature.name = log type (fw/dns/web/tunnel/casb/emaildlp/endpointdlp/audit/alert), derived from sourcetype via grok
- ocsf.metadata.log_name = raw sourcetype

Firewall logs are split by threat detection: blocks/drops with a populated `threatname` or `ipsrulelabel` route to Detection Finding [2004]; all other fw events (Allow, policy Block, policy Drop) route to Network Activity [4001]. Without this split, every policy block would surface as a "security finding" and inflate alert noise.

Audit non-AUTH events route to API Activity [6003] (not Group Management [3006] or Web Resources Activity [6001]), matching how CloudTrail/Okta/Entra/Workspace admin audit logs map to OCSF.

Pipeline uses standard Datadog processors (string-builder, grok-parser, schema-processor, schema-remapper, schema-category-mapper, attribute-remapper, array-processor).

Time handling is centralized: the existing `datetime -> date` grok is extended with a `createTime -> date` grok (for alert logs), then `date` is mapped to `ocsf.time` once in the OCSF pre-transforms sub-pipeline.

5 existing attribute-remappers were flipped preserveSource: false -> true (clt_sip, srv_dip, dns_req, dns_reqtype, src_ip) so OCSF sub-pipelines can read source fields. This is sanctioned by the OCSF style guide section 7.2 and is backwards-compatible.

All 12 test cases validate at 100% required OCSF fields via logs-backend ocsf-validator-cli. Added one new test case for a non-threat fw Allow event to exercise Network Activity [4001].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: use Unknown/0 instead of Other/99 for category catch-alls

OCSF semantics distinguish Unknown (0) = no data / no confident mapping from Other (99) = source supplied a known literal value that doesn't match any defined enum. All our catch-all filters fire via generic wildcards ("@sourcetype:...", "@zscaler.action:*") meaning "we don't know", not "source literally said Other". Flip 13 catch-alls across status_id, activity_id, device.type_id, and src_endpoint.os.type_id. Clears the OCSF validator's attribute_enum_sibling_suspicious_other warnings.

All 12 tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* remove email_addr from endpointdlp sub-pipeline

* zscaler: address validate-logs CI annotations

- Rename 12 self-mapping schema-remappers from "Cast X to integer" to "Map X to X" — CI expects the standard self-map naming convention for integer type-casts that share source and target field.
- Rename 3 schema-processors from "Apply OCSF schema for NNNN (label)" to the canonical "Apply OCSF schema for NNNN" form (2004, 2006, 4001).
- Fix greedy-at-start grok in `activity_count` rule: `.*Activity count` -> `.*?Activity count` (lazy quantifier) to avoid the performance warning.
- Regenerate test expectations for 8 tests with CI's canonical output format (alphabetical key ordering, `{}` for empty maps, 9-space list indent).

All 12 tests still pass OCSF validation at 100% required.

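A small illustration of the greedy versus lazy quantifier difference behind the `activity_count` grok fix. This is only a sketch in Python's `re` module, not Datadog's grok engine, and the sample log line and the `: (\d+)` capture are hypothetical. The commit cites a performance warning; the sketch shows the related behavioural difference, which is that a leading greedy `.*` runs to the end of the line and backtracks, while the lazy form stops at the first occurrence.

```python
import re

# Hypothetical sample line; the real alert description format is not shown in the commit.
LINE = "user=a Activity count: 3 window=5m Activity count: 7"

# Greedy prefix: `.*` consumes the whole line first, then backtracks to the LAST match.
greedy = re.match(r".*Activity count: (\d+)", LINE)

# Lazy prefix: `.*?` expands one character at a time and stops at the FIRST match.
lazy = re.match(r".*?Activity count: (\d+)", LINE)

print(greedy.group(1), lazy.group(1))
```

On lines with a single occurrence both forms capture the same value; the lazy form simply avoids scanning to the end of the line and backtracking.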
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: strip extra keys from facets (description, facetType, type)

The log pipeline facet schema only accepts groups/name/path/source. description/facetType/type are auto-inferred from the indexed value and, when explicitly set, are rejected by downstream validators and ignored by the Datadog app. Rewrite all 94 facets in the canonical minimal form.

All 12 OCSF tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: fix validate-logs CI annotations (facets + date format)

Facet conflicts (rename to canonical names CI suggests):
- ocsf.actor.user.name: "Actor User Name" -> "Name"
- ocsf.user.name: "User Name" -> "Target Name"
- ocsf.http_request.url.url_string: "URL" -> "Request URL String"
- ocsf.finding_info.title: "Finding Title" -> "Finding Info Title"
- ocsf.finding_info.uid: "Finding UID" -> "Finding Info Unique ID"

Facet conflicts (path already defined by another integration — remove):
- ocsf.dst_endpoint.hostname
- ocsf.query.type
- ocsf.dst_endpoint.port

Date format (alert tests #10, #11):
- `createTime` (epoch seconds) -> `date` was stored in scientific notation (1.755569028E12) because grok `scale(1000)` returns a double. Rewrote to decimal form (1755569028000). Pipeline unchanged — the scientific form is valid floating point, but CI's date-remapper validator requires integer ms.

All 12 OCSF tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: remove 14 no-op OCSF self-map schema-remappers

Same rationale as the metadata self-map cleanup: a schema-remapper with source == target and no targetFormat is a no-op. The schema-processor already walks custom.ocsf.* against its declared schema block — fields don't need to be listed in mappers to be recognized.

Removed 14 self-maps across several OCSF paths:
- ocsf.finding_info.analytic.type / .type_id (×4)
- ocsf.finding_info.data_sources / .types (×2)
- ocsf.data_security.data_lifecycle_state / .detection_system (×3)
- ocsf.evidence.* (×several)

Kept 21 self-maps that carry targetFormat: integer (real string->int coercion for fields upstream-set by string-builders or category-mappers).

All 12 OCSF tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert "zscaler: remove 14 no-op OCSF self-map schema-remappers"

This reverts commit 7aff951.

* zscaler: fix tunnel email_addr + add product metadata self-maps

- tunnel 4014: remove `zscaler.vpncredentialname -> ocsf.user.email_addr` mapping. OCSF user.email_addr is optional and credential names are not email-formatted per Zscaler tunnel log samples.
- All 12 OCSF schema-processors: add self-maps for ocsf.metadata.product.name and ocsf.metadata.product.vendor_name. Pre-transforms already sets both values via string-builder ("ZIA" and "Zscaler"), but the staging API's OCSF validator checks each schema-processor's mapper list independently, so every class now declares the self-map.

All 12 OCSF tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: source finding_info.product.* from ocsf.metadata.product.*

The alert 2004 sub-pipeline was sourcing finding_info.product.name and finding_info.product.vendor_name from `company` (which is "Zscaler" - the vendor, not the product), and finding_info.product.feature.name from `alias` (which is "Zscaler-ZIA-UEBA" - a composite log source label, not a feature name).

Reuse the already-set ocsf.metadata.product.* tree instead:
- finding_info.product.name <- ocsf.metadata.product.name ("ZIA")
- finding_info.product.vendor_name <- ocsf.metadata.product.vendor_name ("Zscaler")
- finding_info.product.feature.name <- ocsf.metadata.product.feature.name ("alert")

Single source of truth, no divergence between metadata.product and finding_info.product. All 12 tests still VALID.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: add IAM audit sub-pipelines (3001/3002/3005) + misc fixes

Add OCSF normalization for Zscaler admin audit log IAM categories:

  LOGIN / AUTH     -> 3002 Authentication (expanded filter)
  USER_MANAGEMENT  -> 3001 Account Change (NEW)
  ROLE_MANAGEMENT  -> 3005 User Access Management (NEW)
  (anything else)  -> 6003 API Activity (narrowed filter)

3001 Account Change activity_id (per OCSF 1.5.0 enum):

  CREATE -> 1 Create
  DELETE -> 6 Delete
  UPDATE -> 99 Other (enum has no generic update value)

3005 User Access Management activity_id (per OCSF 1.5.0 enum):

  CREATE -> 1 Assign Privileges
  DELETE -> 2 Revoke Privileges
  UPDATE -> 99 Other (enum only defines Assign/Revoke/Unknown/Other)

Additional changes:
- Added `zscaler.time -> zscaler.datetime` attribute-remapper upstream so audit logs using `time` (per Zscaler NSS admin audit docs) get date-parsed identically to logs using `datetime`.
- Added `ocsf.privileges` schema-remapper self-map inside the 3005 schema-processor so the staging API's OCSF validator sees it declared in the mapper list (required per-schema-processor check).
- Added `ocsf.severity_id` integer-cast self-maps after the schema-category-mapper in 3001, 3002, and 3005 (category-mapper emits strings; OCSF requires integer).
- Fixed `ocsf.tunnel_type_id` integer-cast self-map in 4014: source was `zscaler.tunneltype_id` (nonexistent field) - changed to a proper self-map so the "99" string-builder value casts to integer.
- Removed `zscaler.resource -> ocsf.user.email_addr` mapping in 3001: resource is a role/user name in ROLE_MANAGEMENT logs (not email), which tripped the email_addr regex validator.

3 new test fixtures (LOGIN, USER_MANAGEMENT, ROLE_MANAGEMENT), bringing total to 15. All 15 VALID at 100% required fields.

Generated by Fleak from synthetic + real production samples, corrected for OCSF 1.5.0 enum accuracy (Fleak proposed invalid activity_id values that don't exist in the spec for 3001/3005).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: regenerate test file with CI-aligned format + fix dstip name

- zscaler.yaml: fix schema-remapper name `srcip_country` -> `dstip_country` to match its sources/target (CI annotation at zscaler.yaml:3281).
- zscaler_tests.yaml: regenerated via the updated ocsf-validator-cli (logs-backend#131855) which now alphabetizes map keys, renders empty maps/lists inline as `{}`/`[]`, and uses Jackson-INDENT_ARRAYS dash-offset conventions for scalar lists. Matches the YAML format produced by the CI IntegrationTestsFileGenerator.

5 of 7 previously-failing tests are now byte-identical to CI's expected output. Remaining 2 minor diffs (dns list-of-maps dash form, one input-side empty map retention) tracked as follow-ups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: drop empty preaction from ROLE_MANAGEMENT test output

Regenerated with the updated ocsf-validator-cli (logs-backend PR #131855) which now parses sample JSON through Datadog's JSONParser instead of vanilla Jackson. Empty JSON objects in the input (like `"preaction": {}` in the ROLE_MANAGEMENT sample) are now dropped at parse time, matching CI's behavior.

One-line diff that was the last remaining validate-logs CI mismatch for the zscaler tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: address codex review feedback

Four items raised by codex-review on PR #2972:

P3: Restore published pipeline name.
The top-level pipeline.name was renamed to "Zscaler - Crash" during staging testing to avoid clobbering the prod entry. Revert back to "Zscaler" so the customer-facing pipeline name is correct on merge.

P2: Firewall endpoints mapped from removed source fields. The pre-OCSF firewall remappers for csip/csport/cdip/cdport moved zscaler.* attributes to network.* with preserveSource: false. My 4001 (Firewall Policy) sub-pipeline reads from zscaler.* but those fields were already gone, so ocsf.src_endpoint.ip/port and ocsf.dst_endpoint.ip/port were never populated for real policy events. Flip preserveSource: false -> true on the three affected remappers (csip, csport, cdport; cdip was already true). OCSF sub-pipeline reads unchanged.

P2: Audit OCSF status filters on removed zscaler.result. The pre-OCSF attribute-remapper zscaler.result -> evt.outcome had preserveSource: false, so my audit sub-pipelines' filters (@zscaler.result:SUCCESS, @zscaler.result:FAILURE) never matched, making every audit event resolve to status_id: 0 / Unknown and severity_id: 0 / Unknown. Flip preserveSource: false -> true. OCSF sub-pipeline filters unchanged.

P2: CASB activity filter uses wrong field name. The 6006 File Hosting Activity sub-pipeline filtered on @zscaler.act_type_name but the real emitted NSS CASB field is zscaler.activity_type_name. Activity filters never matched, so all CASB events resolved to activity_id: 0. Rename field path in 5 filter queries.

All 15 OCSF tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* zscaler: fix DNS activity_name to match OCSF enum caption

The DNS sub-pipeline set ocsf.activity_name = "Traffic" via a string-builder, but activity_id was mapped to 2 (Response) by the schema-category-mapper. OCSF 1.5.0 flags this as attribute_enum_sibling_incorrect — expected "Response", got "Traffic". Change the string-builder template to "Response" so the sibling matches the enum caption.

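The sibling rule behind the DNS fix can be sketched as a small check. This is not the logs-backend validator, only an illustration; the caption table is a partial subset limited to the values named in this commit message, and the function name is hypothetical.

```python
from typing import Optional

# Partial caption table for DNS Activity activity_id, limited to values named in
# the commit message (2 = Response, 0 = Unknown, 99 = Other). Not the full enum.
DNS_ACTIVITY_CAPTIONS = {0: "Unknown", 2: "Response", 99: "Other"}


def sibling_mismatch(activity_id: int, activity_name: str) -> Optional[str]:
    """Return the expected caption when *_name disagrees with *_id, else None.

    id 99 is skipped: for Other, the name may carry the vendor's own label.
    """
    if activity_id == 99:
        return None
    expected = DNS_ACTIVITY_CAPTIONS.get(activity_id)
    if expected is not None and expected != activity_name:
        return expected
    return None


print(sibling_mismatch(2, "Traffic"))   # the pre-fix pairing
print(sibling_mismatch(2, "Response"))  # the post-fix pairing
```

Deriving both `activity_id` and `activity_name` from one mapping table, as a schema-category-mapper does, avoids this class of mismatch entirely.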
Also includes the manually-fixed list-of-map indent in the CASB audit test (was left corrupted by a YamlGenerator bug now fixed in logs-backend PR #131855 commit 2b63bc0c5070).

All 15 OCSF tests still VALID at 100% required fields, zero warnings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update tests file

* zscaler: align OCSF mapper filters with NSS values

Three issues raised by codex-review on PR #2972 — all P2:

- Firewall threat severity filters used PascalCase (`Informational`, `Low`, ...) but the pre-OCSF firewall pipeline lowercases `zscaler.threatseverity` before the OCSF mapper runs, so no real threat event ever matched. Lowercase the 5 filter values; OCSF output names (`Informational`, `Low`, ...) unchanged.
- Email/Endpoint DLP severity filters tested `Informational`/`Low`/`Medium`/`High`/`Critical` but NSS DLP feeds emit `"Info Severity"`/`"Low Severity"`/`"Medium Severity"`/`"High Severity"` (with the trailing " Severity" — confirmed by the existing status lookup tables and the existing `@zscaler.severity:"High Severity"` query in the pre-OCSF pipeline). Update the 5 filters in both DLP mappers to use the literal NSS labels; output `name`/`id` unchanged.
- Firewall protocol mapper read only `zscaler.ipproto`, but the documented NSS firewall feed format in zscaler/README.md emits `"proto":"%s{ipproto}"` — so `zscaler.proto` is what actually arrives for customers using the recommended feed. Add `zscaler.proto` as an additional source so both field names work.

Test fixtures were already aligned with the rewritten test format from the previous regenerate commit; the tests.yaml change here is the same canonical-format conversion for the audit samples that the rest of the file already used.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* zscaler: fix multi-source schema-remapper name format

CI validate-logs requires comma-separated source names in schema-remapper `name`, not slash-separated.
Change
  Map `zscaler.proto`/`zscaler.ipproto` to ...
to
  Map `zscaler.proto`, `zscaler.ipproto` to ...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* zscaler: flip wildcard catch-alls from Unknown/0 to Other/99

Datadog's MO is to preserve the vendor's signal when an event code or status doesn't map to an OCSF enum entry — not collapse it to Unknown. Apply this to all 26 wildcard catch-alls in OCSF schema-category-mappers across the zscaler pipeline:
- Firewall threat severity: @zscaler.threatseverity:*
- Email/Endpoint DLP severity: @zscaler.severity:*
- Endpoint DLP activity type, device os/type, file type
- Account Change / Authentication / API Activity: @zscaler.action:*, @zscaler.result:*
- Network Activity: @sourcetype:zscalernss-fw status, src_endpoint OS
- HTTP Activity: @zscaler.requestmethod:*, @zscaler.action:*
- DNS Activity: @zscaler.reqaction:* OR @zscaler.resaction:*
- Tunnel Activity: @sourcetype:zscalernss-tunnel
- File Hosting (CASB): @sourcetype:zscalernss-casb
- ZIA Alert: @sourcetype:zscalernss-alert (3 mappers)

The one numeric-range wildcard (`@zscaler.threat_score:*` for ocsf.confidence_id) stays Unknown/0 since there's no categorical label to preserve in that case.

Test expectations regenerated via the OCSF validator CLI's --check-all --write; all 15 tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* zscaler: address @jbfeldman-dd review feedback

Consolidates all changes made in response to Jonah's review of PR #2972. (Squashed from 13 incremental commits.)

## Pattern fixes

- Replaced hardcoded `string-builder-processor` pairs (one for OCSF *_name, one for *_id) with `schema-category-mapper` blocks across FW Detection Finding, ZIA Alert (later replaced by Alert), Email DLP, Endpoint DLP, Authentication, and CASB sub-pipelines.
- Replaced two-step "build string then grok-parse to array" patterns with single `array-processor` for `ocsf.finding_info.data_sources` and `ocsf.finding_info.types`.
- Converted `string-builder-processor`s that just interpolate a single source field (template "%{field}") to `attribute-remapper` or `schema-remapper` as appropriate.
- Multi-source schema-remappers use comma-separated source names in the title (Map `a`, `b` to `target`), not slash-separated, per validate-logs CI naming convention.
- Wildcard catch-alls in schema-category-mappers default to `Other`/`99` (Datadog convention) — preserves the vendor's signal in the OCSF output rather than collapsing all unmapped values to `Unknown`. Numeric-range catch-alls (threat_score) keep `Unknown`/`0`. Tunnel `activity_id`/`status_id` catch-alls use `Unknown`/`0` since "Tunnel Samples" heartbeats are genuinely not-a-tunnel-event cases without a meaningful source label.
- Single-category mappers gained an explicit `Other`/`99` catch-all alongside their primary category as a defensive fallback.

## Removed intermediate fields

- `zscaler.time` → `zscaler.datetime` attribute-remapper removed. Audit logs now grok directly via a second grok-parser sourcing `zscaler.time`.
- `date` field eliminated entirely. Pre-OCSF grok-parsers (zscaler.datetime, zscaler.time, createTime) write directly to `ocsf.time`. Date-remapper sources `ocsf.time`. The 13 per-sub-pipeline `Map date to ocsf.time` schema-remappers became `Map ocsf.time to ocsf.time` self-maps. Three adjacent mappers (Tunnel + Authentication metadata.logged_time, Detection Finding finding_info.created_time) also retargeted to source `ocsf.time`.
- `activity_count` intermediate eliminated; grok writes directly to `ocsf.finding_info.related_events_count`.
- Removed redundant per-sub-pipeline `Parse zscaler.datetime to ocsf.time` grok-parsers in HTTP Activity and DNS Activity (the pre-OCSF grok already handles this).

## OCSF mapping corrections (specific reviewer comments)

- `zscaler.event_id` retargeted from `ocsf.http_request.uid` to `ocsf.metadata.uid` (event_id is the unique log record id, not an HTTP-request-specific uid).
- Dropped `zscaler.protocol → ocsf.http_request.url.scheme` (NSS emits values like "SSL" which aren't valid URL schemes).
- `alertId → ocsf.metadata.event_code` removed; `ruleName` (later swapped to `zscaler.rulelabel` family in the new Alert pipeline) → `ocsf.metadata.event_code`. Rule name is the canonical event_code per OCSF guidance.
- `zscaler.resource → ocsf.user.name` removed in Account Change (resource is the role name for ROLE_MANAGEMENT events).
- `zscaler.resource → ocsf.resource.uid` removed (resource is a name, not a uid).
- `zscaler.subcategory` (not `category`) → `ocsf.resource.type`.
- Authentication `recordid` retargeted to `ocsf.metadata.uid` (recordid doesn't persist across a login session). Tunnel kept as `session.uid` (real tunnel session).
- `zscaler.appname` / `applicationname` retargeted from `ocsf.actor.app_name` to `ocsf.dst_endpoint.name` (SaaS app is the destination, not the actor).
- `zscaler.extownername` retargeted from `ocsf.dst_endpoint.name` to `ocsf.file.owner.name` (it's the file's external owner).
- Endpoint DLP `data_security.detection_system_id` corrected from `Endpoint`/`1` (EDR) to `Data Loss Prevention`/`2`.
- Endpoint DLP `data_lifecycle_state` filters now match the documented NSS activitytype enum (Download, Email Sent, File Copy, File Read, File Write, Print, Upload — lowercase per the test fixture's `email_sent`).
- DLP severity filters now match the documented NSS labels (`"Info Severity"`, `"Low Severity"`, `"Medium Severity"`, `"High Severity"`, `"Critical Severity"`) instead of the PascalCase variants the previous filters used.
- Firewall threat severity filters lowercased (`informational`/`low`/`medium`/`high`/`critical`) to match the pre-OCSF lowercase normalization of `zscaler.threatseverity`.
- Firewall protocol mapper now reads from both `zscaler.proto` (the field name in the documented NSS feed) and `zscaler.ipproto`.
- Tunnel `tunnel_type` mappings removed entirely — Zscaler tunnel types (IPSEC IKEV1, GRE, etc.) don't cleanly map to OCSF's Split/Full enum. `zscaler.tunneltype` preserved as a queryable vendor-namespaced field.
- Tunnel TUNNEL_DOWN no longer mapped to `status: Failure` (close isn't a failure — falls through to Other/99).
- Tunnel severity simplified to always-Informational (NSS doesn't emit a tunnel severity field).
- CASB activity_id categories now in numeric order (Access 3, Delete 4, Upload 7, Download 8, Share 12, Other 99).
- Removed CASB `Add ocsf.metadata.product.feature.name` string-builder (already set in pre-transformations).
- Account Change and Authentication `severity_id` now always Informational/1 (audit events aren't security findings; NSS doesn't emit severity for audit).

## Alert sub-pipeline (replaces ZIA Alert)

The previous "ZIA Alert" handling assumed alerts arrive as a synthetic webhook payload with `alertId`, `description`, `createTime` and a fabricated `sourcetype=zscalernss-alert`. None of that reflects real Zscaler data flow.

Removed:
- Pre-pipeline that fabricated `sourcetype=zscalernss-alert` from `-@sourcetype:* @alertid:*`.
- Pre-OCSF "Alert" pipeline that ran four description grok-parsers.
- OCSF "ZIA Alert" sub-pipeline that targeted the fake sourcetype.

Added new "OCSF sub pipeline for class Detection Finding [2004] (Alert)" placed at the FRONT of the OCSF chain (right after pre-transformations, before Firewall Threat). Filter:

  @sourcetype:(zscalernss-web OR zscalernss-fw) AND @zscaler.threatseverity:* -@zscaler.threatseverity:(None OR none)

(Restricted to web/fw — DLP and CASB findings have a more specific OCSF home in Data Security Finding [2006] and shouldn't be swept into the generic Alert class.)

Added exclusion clauses to HTTP Activity, FW Threat, and FW Network Activity sub-pipelines so they only fire when severity is absent or `None`/`none`. Severity-bearing web/fw logs route to Alert.

## CASB DLP sub-pipeline (new)

Added "OCSF sub pipeline for class Data Security Finding [2006] (CASB DLP)" between Endpoint DLP and Account Change, filtering on CASB events with a non-None severity. Routes severity-bearing CASB events to Data Security Finding 2006 instead of File Hosting Activity 6006 — DLP-flagged CASB events have OCSF-DLP-specific fields (data_security.policy, data_security.detection_pattern, data_security.detection_system_id) that don't exist on File Hosting Activity. detection_system_id = 8 (Cloud Access Security Broker per OCSF 1.5.0).

CASB File Hosting kept its no-severity exclusion so it only fires for non-DLP CASB events.

## ocsf.file.hashes array construction

OCSF `file.hashes` is an array of `fingerprint_dict` objects {algorithm, algorithm_id, value}. Built across all four file-bearing sub-pipelines (Email DLP, Endpoint DLP, CASB DLP, File Hosting CASB) using a filtered nested pipeline pattern: build temp object via string-builders + attribute-remapper, cast algorithm_id to integer, array-processor `append` to ocsf.file.hashes, schema-processor self-map. MD5 from `zscaler.filemd5`, SHA-256 from `zscaler.filesha`.

## Final state

- 13 OCSF tests, 13/13 VALID at 100% required fields.
- 4 pre-existing warnings (Other-suspicious siblings on tests 5 and 7 — kept per Datadog Other/99 convention).
- 2 unresolved review threads with reviewer:
  - L1417 (`is_alert: "true"` string-builder kept — boolean, not an enum, schema-category-mapper doesn't fit).
  - L2204 (Endpoint DLP `itemname → file.name` — kept with inline comment per Crash's reply asking Jonah for guidance).

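The `ocsf.file.hashes` construction described above can be sketched outside the pipeline as plain Python. This is only an illustration of the resulting data shape, not the Datadog processors themselves; the `algorithm_id` values (MD5 = 1, SHA-256 = 3) are my reading of the OCSF fingerprint enum and should be checked against the 1.5.0 schema, and the function and constant names are hypothetical.

```python
from typing import Any, Dict, List

# (source field under zscaler.*, OCSF algorithm caption, algorithm_id as a string).
# The id starts as a string to mirror a string-builder output that is cast later.
# ASSUMPTION: ids 1 and 3 follow my reading of the OCSF fingerprint enum.
HASH_SOURCES = (
    ("filemd5", "MD5", "1"),
    ("filesha", "SHA-256", "3"),
)


def build_file_hashes(zscaler: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build the ocsf.file.hashes array from whichever hash fields are present."""
    hashes: List[Dict[str, Any]] = []
    for field, algorithm, algorithm_id in HASH_SOURCES:
        value = zscaler.get(field)
        if not value:
            continue  # mirrors the filtered nested pipeline: skip absent hashes
        hashes.append(
            {
                "algorithm": algorithm,
                "algorithm_id": int(algorithm_id),  # string -> integer cast
                "value": value,
            }
        )
    return hashes


sample = {"filemd5": "d41d8cd98f00b204e9800998ecf8427e"}  # hypothetical log fields
print(build_file_hashes(sample))
```

Appending one object per available hash keeps the array free of empty entries when a feed supplies only one of the two fields.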
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* zscaler: drop createTime grok-parser; standardize audit samples on `datetime`

The `createTime` epoch-seconds field was for the synthetic `zscalernss-alert` UEBA webhook payload that's been removed. No real Zscaler NSS feed emits `createTime`. Drops the `Scale createTime to epoch ms ocsf.time` grok-parser.

The audit test fixtures were authored with `"time": "..."` instead of `"datetime": "..."` — the documented NSS audit feed format (zscaler/README.md) uses `datetime` like every other sourcetype. Converting the three audit fixtures to use `datetime` lets the existing `Parse zscaler.datetime to ocsf.time` grok-parser handle them; no separate `zscaler.time` parser needed.

Renamed the surviving grok-parser to be more descriptive about what it produces: `Parse zscaler.datetime to epoch ms in ocsf.time`. The Datadog grok `date()` matcher already produces epoch milliseconds — the test expectations show `time: 1754150969000` (13-digit ms) — so no behavior change, just clearer naming.

All 13 tests still VALID at 100% required fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update pipeline name

* zscaler: replace 3 audit test fixtures with validate-logs CI's expected output

The CI `validate-logs` job (run 73445747243) reported the 3 audit test fixtures (SIGN_IN / UPDATE / CREATE — tests 11, 12, 13) as not matching the actual pipeline output. Locally the OCSF validator CLI accepts them as VALID, but the CI runner uses Jackson's default pretty-printer (with HashMap field iteration order in JSON sample/message blocks and slightly different YAML quoting and tag-list indenting), so the two tools disagree on "canonical format" even though the OCSF data is identical.

Replaced the 3 audit test entries with the CI's "actual pipeline output" verbatim. This brings the file format byte-identical to what CI expects.
Local OCSF validator CLI still passes (13/13 VALID at 100% required fields).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* zscaler: address codex review feedback

- FW Threat (2004) and FW Network Activity (4001) sub-pipeline filters now exclude `ipsrulelabel`/`threatname` placeholder values "None"/"none" from the threat side and admit them on the network-activity side, so normal policy logs that carry placeholder strings (per Zscaler feed format) are routed correctly instead of falling into Detection Finding while also being excluded from Network Activity.
- Restore `Total Bytes`, `Zscaler Request Size`, and `Zscaler Response Size` facet names plus their `type: integer` and `unit: {family: bytes, name: byte}` metadata. These were inadvertently collapsed to a generic `name: byte` with no type/unit, breaking range filtering and byte-formatted display.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* zscaler: drop Detection Finding [2004] sub-pipelines for web/fw traffic

- Remove the severity-based Alert sub-pipeline and the FW Threat sub-pipeline. zscalernss-web and zscalernss-fw logs now always route to HTTP Activity [4002] / Network Activity [4001] regardless of threatseverity or ipsrulelabel/threatname.
- Simplify HTTP and Network Activity sub-pipeline filters back to plain sourcetype matches (no threatseverity / placeholder exclusions, since those existed only to avoid double-routing alongside the 2004 pipelines).
- Refresh test expectations via --check-all --write; all 13 tests validate against their assigned classes.

Detection-grade DLP/CASB events still route to Data Security Finding [2006] as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* zscaler: preserve vendor label via schema-category-mapper fallback

Add `fallback:` blocks to the four catch-alls that previously emitted a generic `Other` value where the source field carried a more informative label.
The fallback block triggers when the catch-all fires and copies the source field's literal value into the OCSF *_name target, replacing the generic "Other" string while keeping the id as 99.

Catch-alls covered:
- HTTP Activity activity_id (source: zscaler.requestmethod)
- File Hosting activity_id (source: zscaler.activity_type_name)
- Endpoint DLP device.type_id (source: zscaler.devicetype)
- Endpoint DLP file.type_id (source: zscaler.itemtype + zscaler.filetypecategory)

This clears the OCSF validator's `attribute_enum_sibling_suspicious_other` warnings on the affected logs (HTTP Activity, File Hosting, Endpoint DLP) without losing the catch-all behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* peer review changes

* shift around severity processing

* final fixes

* final fixes

* Add missing fallbacks

* unnest pipelines

* Update tests file

* Fix last pipeline and ocsf flag

* Update Changelog

* Update Changelog

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
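The catch-all-with-fallback behaviour described in the last substantive commit can be modelled roughly as follows. This is a plain-Python sketch, not the schema-category-mapper configuration; the function name and the two-entry table are hypothetical, the HTTP method ids (Get = 3, Post = 6) are my reading of the OCSF HTTP Activity enum, and treating a missing value as Unknown/0 is an assumption based on the Unknown versus Other distinction drawn earlier in the message.

```python
from typing import Dict, Optional, Tuple

# Hypothetical two-entry subset of an HTTP Activity activity_id table.
# ASSUMPTION: ids 3 (Get) and 6 (Post) follow my reading of OCSF 1.5.0.
HTTP_ACTIVITY: Dict[str, Tuple[int, str]] = {"GET": (3, "Get"), "POST": (6, "Post")}


def map_category(
    value: Optional[str],
    table: Dict[str, Tuple[int, str]],
    use_fallback: bool = True,
) -> Tuple[int, str]:
    """Return the (id, name) sibling pair for a vendor-supplied value."""
    if not value:
        return 0, "Unknown"  # no data at all (assumed handling)
    if value in table:
        return table[value]  # defined enum entry
    # Catch-all: the id stays 99; with the fallback enabled, the name carries
    # the vendor's literal value instead of the generic "Other".
    return 99, (value if use_fallback else "Other")


print(map_category("GET", HTTP_ACTIVITY))
print(map_category("PROPFIND", HTTP_ACTIVITY))
print(map_category("PROPFIND", HTTP_ACTIVITY, use_fallback=False))
```

Keeping the id at 99 while varying only the name preserves enum validity and still leaves the vendor's label queryable in the OCSF output.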
1 parent 6fc8e50 commit f12a916

3 files changed

Lines changed: 3729 additions & 323 deletions

zscaler/CHANGELOG.md

Lines changed: 63 additions & 0 deletions
@@ -1,5 +1,68 @@
 # CHANGELOG - ZScaler
 
+## 2.1.0 / 2026-05-04
+
+**Added**
+
+* Added OCSF 1.5.0 normalization for every `zscalernss-*` log type. Each
+  log routes through exactly one OCSF sub-pipeline based on sourcetype
+  (and, for CASB, severity):
+
+  * `zscalernss-web` → HTTP Activity [4002]
+  * `zscalernss-fw` → Network Activity [4001]
+  * `zscalernss-dns` → DNS Activity [4003]
+  * `zscalernss-tunnel` → Tunnel Activity [4014]
+  * `zscalernss-emaildlp` → Data Security Finding [2006]
+  * `zscalernss-endpointdlp` → Data Security Finding [2006]
+  * `zscalernss-casb` (with severity) → Data Security Finding [2006] (CASB DLP)
+  * `zscalernss-casb` (no severity) → File Hosting Activity [6006]
+  * `zscalernss-audit` (`LOGIN` / `AUTH`) → Authentication [3002]
+  * `zscalernss-audit` (`USER_MANAGEMENT` / `ROLE_MANAGEMENT`) → Account Change [3001]
+  * `zscalernss-audit` (any other category) → API Activity [6003]
+  * everything else → Base Event [0]
+
+**Changed**
+
+* Pre-OCSF firewall protocol mapper now reads from both
+  `zscaler.proto` (the documented NSS feed field) and
+  `zscaler.ipproto`.
+
+* Account Change postaction handling: removed
+  `zscaler.resource → ocsf.user.name` (`resource` is the role name for
+  `ROLE_MANAGEMENT` events, not a user). Replaced with
+  `postaction.name` / `postaction.roleName` and `postaction.email`
+  mappings for both `USER_MANAGEMENT` and `ROLE_MANAGEMENT` branches.
+
+* Authentication: `zscaler.recordid` now maps to `ocsf.metadata.uid`
+  instead of `ocsf.session.uid` (recordid doesn't persist across a
+  login session).
+
+* Endpoint DLP `data_security.detection_system_id` corrected from
+  `Endpoint` / `1` (EDR) to `Data Loss Prevention` / `2`. EDR is a
+  different OCSF detection system; Endpoint DLP is DLP.
+
+* `zscalernss-fw` Network Activity filter admits
+  `ipsrulelabel:None` / `threatname:None` placeholder values — the
+  documented Zscaler firewall feed populates these placeholders on
+  non-threat policy events too.
+
+* Several existing pre-OCSF `attribute-remapper`s flipped from
+  `preserveSource: false` to `preserveSource: true` (e.g. `clt_sip` →
+  `network.client.ip`, `srv_dip` → `network.destination.ip`,
+  `dns_req` → `dns.question.name`) so the OCSF sub-pipelines can still
+  read the original `zscaler.*` fields. No existing attribute paths
+  were deleted.
+
+**Removed**
+
+* Previous synthetic `zscalernss-alert` handling: the pre-pipeline that
+  fabricated a `sourcetype` from `alertId`, the pre-OCSF `Alert`
+  description-grok, and the OCSF "ZIA Alert" sub-pipeline. Real Zscaler
+  "alerts" are NSS DLP / CASB logs identified by severity, which now
+  route to Data Security Finding [2006]; web / fw traffic stays in
+  HTTP Activity [4002] / Network Activity [4001] regardless of
+  severity, with no synthetic Detection Finding [2004] sub-pipeline.
+
 ## 2.0.0 / 2025-08-27
 
 **Changed**:
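The sourcetype-to-class routing listed under 2.1.0 **Added** in the diff above can be summarised as a small decision function. This is a sketch of the documented routing table only, not the pipeline's filter queries; the parameter names `category` and `severity` are hypothetical, and treating `None`/`none` as "no severity" is an assumption drawn from the commit message's description of the CASB DLP filter.

```python
from typing import Optional

# One-to-one sourcetype routes from the changelog.
SIMPLE_ROUTES = {
    "zscalernss-web": "HTTP Activity [4002]",
    "zscalernss-fw": "Network Activity [4001]",
    "zscalernss-dns": "DNS Activity [4003]",
    "zscalernss-tunnel": "Tunnel Activity [4014]",
    "zscalernss-emaildlp": "Data Security Finding [2006]",
    "zscalernss-endpointdlp": "Data Security Finding [2006]",
}


def route(
    sourcetype: str,
    category: Optional[str] = None,
    severity: Optional[str] = None,
) -> str:
    """Return the single OCSF class a log is routed to."""
    if sourcetype in SIMPLE_ROUTES:
        return SIMPLE_ROUTES[sourcetype]
    if sourcetype == "zscalernss-casb":
        # ASSUMPTION: "None"/"none" placeholders count as no severity.
        has_severity = severity not in (None, "", "None", "none")
        return "Data Security Finding [2006]" if has_severity else "File Hosting Activity [6006]"
    if sourcetype == "zscalernss-audit":
        if category in ("LOGIN", "AUTH"):
            return "Authentication [3002]"
        if category in ("USER_MANAGEMENT", "ROLE_MANAGEMENT"):
            return "Account Change [3001]"
        return "API Activity [6003]"
    return "Base Event [0]"  # everything else


print(route("zscalernss-casb", severity="High Severity"))
print(route("zscalernss-audit", category="ROLE_MANAGEMENT"))
```

Each branch returns exactly one class, matching the changelog's statement that every log routes through exactly one OCSF sub-pipeline.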
