Commit 525c79d

Nate Smalley and claude committed
pipelines: backfill ingest_mode and auth_type on transform_ocsf/ entries
Adds the new metadata fields introduced by #59 to all 129 existing transform_ocsf/ pipeline metadata.yaml files. The fields are inserted immediately after the existing ingestion_method line in each file. No serializer logic, no pipeline JSON, no other metadata changed.

Values were derived per entry by combining:

1. Bound parser metadata (parsers/community/<source_name>/metadata.yaml) when the parser declares format=syslog/CEF/RFC/w3c/custom-syslog or ingestion_method containing "Syslog" or "HEC" -- the parser is authoritative when its declaration is unambiguous.
2. Vendor and product knowledge for the ~90 entries where the parser metadata is unclear (gron format with "streaming" or "unknown" ingestion_method, or no parser binding at all). Examples:
   - Cisco network kit (firewalls, ASA, Meraki, ISE, etc.) -> Syslog
   - Microsoft 365 / Entra / Defender management surfaces -> API Call (OAuth)
   - AWS managed services delivering to S3 (CloudTrail, ELB, Route53 Resolver, GuardDuty export, VPC flow) -> Other - {object store with SQS notifications} (IAM Role)
   - Azure Event Hub-delivered streams (signin, defender email) -> Other - {Azure Event Hub stream (AMQP/Kafka protocol)} (OAuth)
   - SaaS REST APIs (Okta, Snyk, Wiz, Tenable, Mimecast, Netskope, Proofpoint, GitHub, Google Workspace, Cloudflare, etc.) -> API Call with the vendor's typical auth (Bearer Token, API Key & Secret, or OAuth)

Confidence per entry is recorded in .reorg-prep/inventory/transform_ocsf_classifications.tsv as one of high (103), medium (17), or low (9). Low-confidence entries are genuinely generic placeholders (json_generic_logs, sample_test_logs, microservice_tracing_logs, etc.) where a more specific value is not derivable; they use Other - {Explain: ...} with the reason inline.

palo_alto_networks_firewall/ is intentionally not modified because it is being removed in PR #60 (open).

Resulting distribution:
  Syslog                                              56
  API Call                                            39
  Other - {object store / Event Hub / agent / etc.}   34

Auth distribution:
  N/A (syslog / file-based / generic)   75
  API Key & Secret                      20
  OAuth                                 18
  IAM Role                               8
  Bearer Token                           7
  Other (Kafka SASL)                     1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
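The first derivation rule in the commit message (the parser is authoritative only when its declaration is unambiguous) can be sketched as a small predicate. This is an illustration, not the script used for the backfill: the function name `parser_is_authoritative`, the flat dict shape of the parser metadata, and the case-insensitive matching are all assumptions. It only decides whether the parser declaration counts as unambiguous; the actual ingest_mode and auth_type values chosen per entry are not derivable from the commit message and are not modelled here.

```python
# Hypothetical sketch of rule 1 from the commit message.
# Assumption: parser metadata is available as a flat dict with optional
# "format" and "ingestion_method" keys; the real file layout may differ.

# Formats the commit message lists as unambiguous parser declarations.
UNAMBIGUOUS_FORMATS = {"syslog", "cef", "rfc", "w3c", "custom-syslog"}

# Substrings of ingestion_method that the commit message treats as decisive.
INGESTION_KEYWORDS = ("syslog", "hec")


def parser_is_authoritative(parser_meta):
    """Return True when the bound parser's declaration is unambiguous."""
    if not parser_meta:
        # No parser binding at all: fall back to vendor/product knowledge.
        return False
    fmt = str(parser_meta.get("format", "")).strip().lower()
    method = str(parser_meta.get("ingestion_method", "")).lower()
    if fmt in UNAMBIGUOUS_FORMATS:
        return True
    # Plain substring match (an assumption); a stricter word-boundary
    # match may be preferable in practice.
    return any(keyword in method for keyword in INGESTION_KEYWORDS)
```

Entries for which this returns False correspond to the ~90 cases the commit message says were classified from vendor and product knowledge instead.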
1 parent 79a947d commit 525c79d

129 files changed

Lines changed: 258 additions & 0 deletions

pipelines/community/transform_ocsf/agent_metrics_logs/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@ metadata_details:
  format: json
  ocsf_version: 1.3.0
  ingestion_method: "Observo OCSFSerializer (Lua-based transform)"
+ ingest_mode: "Other - {Explain: SentinelOne agent self-reported telemetry}"
+ auth_type: "N/A"
  ocsf_mapping:
  class_uid: 5001
  class_name: "Device Inventory Info"
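The same two-line edit repeats in every file: ingest_mode and auth_type are added directly after the existing ingestion_method line, and nothing else moves. A minimal sketch of that text-level insertion follows. The function name `insert_after_ingestion_method` is hypothetical, and the sketch assumes ingestion_method is a single-line scalar and that the new values contain no double quotes (true of every value shown in this diff). It works on raw text rather than a YAML round-trip so that unrelated lines, such as the long folded sample_record strings, stay byte-for-byte unchanged.

```python
def insert_after_ingestion_method(text, ingest_mode, auth_type):
    """Insert ingest_mode/auth_type right after the first ingestion_method line.

    Assumptions (not confirmed by the commit): ingestion_method occupies a
    single line, and the new values need no escaping inside double quotes.
    """
    out = []
    inserted = False
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip()
        if not inserted and stripped.startswith("ingestion_method:"):
            # Reuse the indentation of the anchor line for the new keys.
            indent = line[: len(line) - len(stripped)]
            # Guarantee a newline in case the anchor is the file's last line.
            out.append(line if line.endswith("\n") else line + "\n")
            out.append(f'{indent}ingest_mode: "{ingest_mode}"\n')
            out.append(f'{indent}auth_type: "{auth_type}"\n')
            inserted = True
        else:
            out.append(line)
    if not inserted:
        raise ValueError("no ingestion_method line found")
    return "".join(out)


# Example input modelled on the agent_metrics_logs hunk; the indentation
# here is illustrative only.
sample = (
    "metadata_details:\n"
    "  format: json\n"
    "  ocsf_version: 1.3.0\n"
    '  ingestion_method: "Observo OCSFSerializer (Lua-based transform)"\n'
    "  ocsf_mapping:\n"
    "    class_uid: 5001\n"
)
result = insert_after_ingestion_method(
    sample,
    "Other - {Explain: SentinelOne agent self-reported telemetry}",
    "N/A",
)
```

Each call adds exactly two lines, which is consistent with the commit totals: 129 files and 258 additions.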

pipelines/community/transform_ocsf/akamai_cdn/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Syslog"
+ auth_type: "N/A"
  sample_record: "{\n \"raw\": \"2026-04-20T02:26:52Z AkamaiCDN streamId=\\\"stream-735\\\" cp=\\\"87876\\\
  \" reqId=\\\"tsuzt53unx\\\" statusCode=304 cliIP=\\\"176.105.197.188\\\" reqHost=\\\"img.example.com\\\
  \" reqMethod=\\\"GET\\\" reqPath=\\\"/js/app.js\\\" bytes=525284 cacheStatus=\\\"TCP_MISS\\\" turnAroundTimeMSec=331\

pipelines/community/transform_ocsf/akamai_dns/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Syslog"
+ auth_type: "N/A"
  sample_record: "{\n \"raw\": \"2026-04-19T19:07:52Z AkamaiDNS streamId=\\\"dns-662\\\" cliIP=\\\"4.61.218.110\\\
  \" resolverIP=\\\"8.8.8.8\\\" domain=\\\"app.example.net\\\" recordType=\\\"AAAA\\\" responseCode=\\\
  \"REFUSED\\\" answer=\\\"\\\" edge=\\\"edge-nyc\\\" ttl=0 bytes=64\"\n}"

pipelines/community/transform_ocsf/akamai_general/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Syslog"
+ auth_type: "N/A"
  sample_record: "{\n \"raw\": \"2026-04-19T09:18:52Z AkamaiSecurity clientIP=\\\"144.165.201.238\\\"\
  \ host=\\\"blog.example.com\\\" path=\\\"/login\\\" ruleId=\\\"925798\\\" attackType=\\\"Command_Injection\\\
  \" action=\\\"rate_limited\\\" httpMethod=\\\"HEAD\\\" status=400 userAgent=\\\"Googlebot/2.1\\\"\

pipelines/community/transform_ocsf/akamai_sitedefender/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Syslog"
+ auth_type: "N/A"
  sample_record: "{\n \"type\": \"akamai_siem\",\n \"attackData\": {\n \"clientIP\": \"198.51.100.2\"\
  ,\n \"configId\": \"20933\",\n \"policyId\": \"p_10245\",\n \"rules\": []\n },\n \"httpMessage\"\
  : {\n \"method\": \"DELETE\",\n \"host\": \"api.example.com\",\n \"path\": \"/search\",\n\

pipelines/community/transform_ocsf/apache_http_logs/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Other - {Explain: file-based agent ingestion (Apache access/error log)}"
+ auth_type: "N/A"
  sample_record: "{\n \"raw\": \"10.29.72.231 - - [20/Apr/2026:03:40:52 +0000] \\\"HEAD /settings HTTP/1.1\\\
  \" 200 5305\"\n}"
  dependency_summary: Requires Observo OCSFSerializer template with Lua runtime (lupa). Source events

pipelines/community/transform_ocsf/aws_cloudtrail/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Other - {Explain: object store (S3) with SQS/SNS notifications}"
+ auth_type: "IAM Role"
  sample_record: "{\n \"eventCategory\": \"Management\",\n \"eventName\": \"CreateUser\",\n \"eventSource\"\
  : \"iam.amazonaws.com\",\n \"eventTime\": \"2026-04-20T03:40:52Z\",\n \"eventVersion\": \"1.09\"\
  ,\n \"eventID\": \"4ad68099-cad0-4172-8711-dd15c4d352c9\",\n \"eventType\": \"AwsApiCall\",\n \"\

pipelines/community/transform_ocsf/aws_elasticloadbalancer_logs/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Other - {Explain: object store (S3) for ELB access logs}"
+ auth_type: "IAM Role"
  sample_record: "{\n \"type\": \"https\",\n \"time\": \"2026-04-20T03:40:52.700664Z\",\n \"alb\":\
  \ \"corporate-alb-3\",\n \"client_ip\": \"192.168.10.200\",\n \"client_port\": 41655,\n \"backend_ip\"\
  : \"172.16.1.50\",\n \"backend_port\": 443,\n \"request_processing_time\": 0.029082,\n \"backend_processing_time\"\

pipelines/community/transform_ocsf/aws_guardduty/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Other - {Explain: AWS EventBridge or S3 export of GuardDuty findings}"
+ auth_type: "IAM Role"
  sample_record: "{\n \"schemaVersion\": \"2.0\",\n \"accountId\": \"222708836859\",\n \"region\":\
  \ \"us-east-1\",\n \"partition\": \"aws\",\n \"id\": \"8db850b8-f1b1-4dfd-8676-efd7b4e8ee85\",\n\
  \ \"arn\": \"arn:aws:guardduty:us-east-1::84378c5c8013403891eb51ada1b2a47b:detector/84378c5c8013403891eb51ada1b2a47b/finding/8db850b8-f1b1-4dfd-8676-efd7b4e8ee85\"\

pipelines/community/transform_ocsf/aws_guardduty_logs/metadata.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ metadata_details:
  format: source-specific JSON/KV/syslog
  ocsf_version: 1.3.0
  ingestion_method: Observo OCSFSerializer (Lua-based transform)
+ ingest_mode: "Other - {Explain: AWS EventBridge or S3 export of GuardDuty findings}"
+ auth_type: "IAM Role"
  sample_record: "{\n \"schemaVersion\": \"2.0\",\n \"accountId\": \"200759122295\",\n \"region\":\
  \ \"ap-south-1\",\n \"partition\": \"aws\",\n \"id\": \"eb19bf82-4550-40d2-a0b8-ae97533cc0f2\",\n\
  \ \"arn\": \"arn:aws:guardduty:ap-south-1::e5485011576b45629ceb37d38e001440:detector/e5485011576b45629ceb37d38e001440/finding/eb19bf82-4550-40d2-a0b8-ae97533cc0f2\"\
