diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6abe922..7680f36 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,33 @@ All notable changes to the AI-SIEM repository will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [Unreleased]
+
+### Changed - pipelines/ reorganization
+
+The `pipelines/` directory has been restructured around ingestion mode rather
+than contributor provenance. New layout:
+
+- `pipelines/push/syslog/<vendor>/<product>/`
+- `pipelines/push/hec/<vendor>/<product>/`
+- `pipelines/pull/api/<vendor>/<product>/`
+- `pipelines/pull/object_store/<vendor>/<product>/`
+- `pipelines/community/transform_ocsf/<vendor>/<product>/`
+
+`metadata.yaml` for pipelines now includes `ingest_mode` and `auth_type` fields.
+The new schema applies to new pipelines added after this release; existing
+entries in `transform_ocsf/` will be backfilled in a follow-up. See
+`pipelines/community/README.md` for the full schema and naming conventions.
+
+### Removed - orphan PAN-OS serializer
+
+`pipelines/community/serializers/Palo Alto Networks/serializer.lua` has been
+removed. It is functionally subsumed by
+`pipelines/community/transform_ocsf/paloalto_logs/`, which is signed off with
+100% required-field coverage and produces the same OCSF class (Network
+Activity, `class_uid=4001`) for a broader range of log types. The now-empty
+`pipelines/community/serializers/` umbrella has been removed alongside it.
+
## [1.3.0] - 2025-10-28 ### Added
diff --git a/README.md b/README.md
index fe6c8d1..fe90206 100644
--- a/README.md
+++ b/README.md
@@ -23,8 +23,15 @@ ai-siem/ # AI SIEM core structure (260+ components) ├── detections/ # Detection rules (8 detections with metadata) │ └── community/ # Community-contributed detection rules ├── monitors/ # Python monitoring scripts for Dataset Agent (log_gen, maxmind, powerquery)
- ├── pipelines/ # Observo Pipeline Templates for data transformation (5 pipelines)
- │ └── community/ # AWS S3, Cisco Duo, Netskope, Okta, ProofPoint
+ ├── pipelines/ # Observo pipeline templates
+ │ ├── push/ # Vendor pushes to us (syslog/CEF/LEEF/KV or direct HEC)
+ │ │ ├── syslog/<vendor>/<product>/
+ │ │ └── hec/<vendor>/<product>/
+ │ ├── pull/ # We fetch from the vendor (REST API or object store)
+ │ │ ├── api/<vendor>/<product>/
+ │ │ └── object_store/<vendor>/<product>/
+ │ └── community/
+ │ └── transform_ocsf/<vendor>/<product>/ # OCSF normalization overlays
├── parsers/ # Parsing logic and configurations (165 parsers) │ ├── community/ # 148 community parsers (*.conf + metadata) │ └── sentinelone/ # 17 official marketplace parsers (*.conf + metadata)
@@ -113,50 +120,32 @@ The monitors directory contains Python scripts for use with the Dataset Agent: ---
-## Pipelines Installation Guide
-
-### Observo Pipeline Integration
-The pipelines directory contains pre-configured Observo pipeline templates for ingesting and transforming data from various sources:
-
-#### Available Pipeline Templates
-1. **AWS S3 CloudTrail** (`aws_s3_cloudtrail/`)
- - Ingests CloudTrail logs from S3 buckets via SQS/SNS
- - Transforms to OCSF format with extensive field mapping
- - **Required credentials:**
- - `auth.assume_role`: `arn:aws:iam::<account-id>:role/<role-name>`
- - `auth.external_id`: Your external ID for role assumption
-
-2.
**Cisco Duo Logs** (`cisco_duo_logs/`)
- - Collects authentication, administrator, and telephony logs
- - Supports checkpointing for incremental data collection
- - **Required credentials:**
- - `DUO_API_HOST`: `<api-hostname>.duosecurity.com`
- - `DUO_INTEGRATION_KEY`: Your integration key
- - `DUO_SECRET_KEY`: Your secret key
-
-3. **Netskope Alerts** (`netskope_alerts/`)
- - Ingests Netskope security alerts
- - Transforms to OCSF format
-
-4. **Okta Log Collector** (`okta_log_collector/`)
- - Collects Okta identity and access management logs
- - Supports incremental log collection
-
-5. **ProofPoint Logs** (`proofpoint_log/`)
- - Ingests ProofPoint email security logs
- - OCSF transformation included
-
-### Pipeline Installation Steps
-1. Import the JSON configuration file into your Observo instance
-2. Update authentication credentials with your specific values
-3. Configure the SentinelOne AI SIEM destination endpoint
-4. Deploy and activate the pipeline
-
-### Configuration Requirements
-All pipelines require:
-- **SentinelOne HEC Token**: Replace `********` with your actual token
-- **Endpoint URL**: Verify the correct region endpoint (default: `https://ingest.us1.sentinelone.net`)
-- **Source-specific credentials**: See individual pipeline requirements above
+## Pipelines
+
+The `pipelines/` directory holds Observo pipeline templates for SentinelOne
+AI SIEM, organized by ingest mode:
+
+- `pipelines/push/{syslog,hec}/<vendor>/<product>/` — vendor pushes events to us
+- `pipelines/pull/{api,object_store}/<vendor>/<product>/` — we fetch from the vendor
+- `pipelines/community/transform_ocsf/<vendor>/<product>/` — OCSF normalization
+ overlays that run on top of upstream-ingested data
+
+The full directory taxonomy, required `metadata.yaml` fields, and naming
+conventions are documented in [`pipelines/community/README.md`](pipelines/community/README.md).
+
+### Installing a community pipeline
+
+1. Navigate to the relevant `pipelines/{push,pull}/<subtype>/<vendor>/<product>/`
+ or `pipelines/community/transform_ocsf/<vendor>/<product>/` directory.
+2.
Import the JSON template into your Observo instance, or apply the Lua
+ serializer to the appropriate transform stage.
+3. Update authentication credentials per the `metadata.yaml` `dependencies`
+ block.
+4. Configure the SentinelOne AI SIEM HEC destination:
+ - **HEC token** — replace the placeholder in the import.
+ - **Endpoint URL** — verify regional endpoint
+ (default `https://ingest.us1.sentinelone.net`).
+5. Deploy and activate.

---

@@ -242,4 +231,29 @@ metadata_details: expected_behavior: "Describe the action or alert that should result" tags: "Optional tagging" version: "v1.0"
+
+# Pipelines
+# File: metadata.yaml
+# Schema applies to new pipelines; existing entries will be backfilled in a follow-up.
+# Top-level `grade:` block is produced by the automated grader — do not hand-author.
+metadata_details:
+ vendor: "<vendor>" # lowercase, underscored
+ product: "<product>" # lowercase, underscored
+ ingest_mode: "HEC | Syslog | API Call | Other - {Explain, e.g. websocket, object store}"
+ auth_type: "N/A | HEC Token | OAuth | API Key & Secret | Bearer Token | Basic | mTLS | IAM Role | Other - {Explain}"
+ syslog_format: "CEF | LEEF | RFC5424 | RFC3164 | Vendor KV" # optional, push/syslog/ only
+ purpose: "What the pipeline ingests/transforms and into which OCSF classes"
+ source_template: "Source template name as it appears in the pipeline manager"
+ source_vendor: "Vendor display name"
+ destination_template: "SentinelOne AI SIEM"
+ destination_type: "SPLUNK_HEC_LOGS"
+ transform_templates: "Description of OCSF / Lua serializer logic"
+ input_schema: "Expected input record fields"
+ output_schema: "Resulting OCSF event shape"
+ scheduling: "Polling interval / event-driven / N/A"
+ retry_behavior: "Backoff and failure handling"
+ dependencies: "Auth credentials, IAM, queues, etc."
+ performance_impact: "Throughput and tuning notes"
+ tags: "Optional tagging"
+ version: "v1.0"
```
diff --git a/pipelines/community/README.md b/pipelines/community/README.md
new file mode 100644
index 0000000..9e8a1a5
--- /dev/null
+++ b/pipelines/community/README.md
@@ -0,0 +1,118 @@
+# pipelines/community/
+
+Community-contributed Observo pipeline templates for SentinelOne AI SIEM.
+
+This directory holds parser/transform pipelines that bridge a vendor's log
+format to OCSF and the AI SIEM HEC endpoint.
+
+---
+
+## Layout
+
+```
+pipelines/
+├── push/ # vendor pushes events to us
+│ ├── syslog/<vendor>/<product>/ # vendor-specific syslog/CEF/LEEF/KV
+│ └── hec/<vendor>/<product>/ # vendors that POST direct to HEC
+├── pull/ # we fetch events from the vendor
+│ ├── api/<vendor>/<product>/ # REST/HTTP API polling
+│ └── object_store/<vendor>/<product>/ # S3 / GCS / Azure Blob
+├── community/
+│ └── transform_ocsf/<vendor>/<product>/ # OCSF normalization overlays
+```
+
+Each leaf (`<vendor>/<product>/`) contains a `metadata.yaml` and (for ingestion
+templates) one Observo pipeline export JSON, or (for `transform_ocsf/`
+overlays) the serializer Lua plus metadata.
+
+---
+
+## What this directory accepts
+
+1. **Ingestion templates** — pipelines that get a vendor's events into the
+ AI SIEM. Belongs under `push/` or `pull/` based on which side initiates
+ the connection.
+
+2. **OCSF transform overlays** — Lua serializers that normalize already-
+ ingested data into OCSF. Belongs under `community/transform_ocsf/`.
+
+3. **Vendor-specific HEC shaping** — pipelines for vendors POSTing to HEC
+ that need vendor-specific batch/retry/field-handling logic. Belongs
+ under `push/hec/`.
+
+---
+
+## `metadata.yaml` schema
+
+> **New fields (`ingest_mode`, `auth_type`) apply to new pipelines added
+> after this PR.** Existing entries in `transform_ocsf/` will be backfilled
+> in a follow-up sweep — they should not be considered out of compliance
+> until then.
+
+In addition to the existing top-level `grade:` block (produced by the
+automated grader; do not author by hand), each pipeline declares:
+
+```yaml
+metadata_details:
+ vendor: "<vendor>" # lowercase, underscored
+ product: "<product>" # lowercase, underscored
+
+ ingest_mode: "..." # see enum below
+ auth_type: "..." # see enum below
+
+ # Optional, only when relevant
+ syslog_format: "CEF | LEEF | RFC5424 | RFC3164 | Vendor KV"
+
+ # Plus the standard pipeline narrative fields
+ purpose: ...
+ source_template: ...
+ source_vendor: ...
+ destination_template: "SentinelOne AI SIEM"
+ destination_type: "SPLUNK_HEC_LOGS"
+ transform_templates: ...
+ input_schema: ...
+ output_schema: ...
+ scheduling: ...
+ retry_behavior: ...
+ dependencies: ...
+ performance_impact: ...
+ tags: [...]
+ version: "v1.0"
+```
+
+### `ingest_mode` enum
+
+The directory the pipeline lives in encodes push-vs-pull; `ingest_mode`
+records the protocol/mechanism.
+
+| Value | Meaning |
+|--------------------------------|------------------------------------------------------|
+| `HEC` | HTTP Event Collector |
+| `Syslog` | Vendor syslog (RFC5424/3164, CEF, LEEF, vendor KV) |
+| `API Call` | REST/HTTP API |
+| `Other - {Explain: ...}` | Anything else — e.g. websocket, object store (S3/GCS/Azure Blob), gRPC. Spell out the mechanism in the braces. |
+
+### `auth_type` enum
+
+| Value | Meaning |
+|--------------------------------|------------------------------------------------------|
+| `N/A` | No auth on the wire (raw syslog over UDP, etc.) |
| +| `HEC Token` | Splunk-style HEC bearer | +| `OAuth` | OAuth 2.0 client credentials / authorization code | +| `API Key & Secret` | Two-part credential (key + shared secret) | +| `Bearer Token` | Static bearer token (non-HEC) | +| `Basic` | HTTP Basic auth | +| `mTLS` | Mutual TLS (client cert) | +| `IAM Role` | AWS-style assume-role (typical for object stores) | +| `Other - {Explain: ...}` | Anything else — spell out the mechanism in braces | + +--- + +## Naming conventions + +- Vendor and product directories: lowercase, underscored, no spaces + (`palo_alto/panos/`, never `Palo Alto Networks/PANOS/`) +- File names: snake_case +- One vendor's pipelines may live under multiple subtrees (e.g., + `push/syslog/palo_alto/panos/` for firewall syslog and + `pull/api/palo_alto/cortex_xdr/` for the Cortex XDR API) diff --git a/pipelines/community/serializers/Palo Alto Networks/serializer.lua b/pipelines/community/serializers/Palo Alto Networks/serializer.lua deleted file mode 100644 index 4271a0e..0000000 --- a/pipelines/community/serializers/Palo Alto Networks/serializer.lua +++ /dev/null @@ -1,325 +0,0 @@ --- palo_alto.lua --- Maps pre-parsed palo_alto.* fields to OCSF schema --- CSV parsing happens upstream; this script only does field mapping - ------------------------------------------------------------------------- --- Helper: set nested field by dot-path ------------------------------------------------------------------------- -local function set_field(obj, path, value) - if not obj or not path or path == "" or value == nil or value == "" then - return - end - local current = obj - local segments = {} - for seg in path:gmatch("[^%.]+") do - local name, idx = seg:match("^(.-)%[(%d+)%]$") - if name and name ~= "" then - segments[#segments + 1] = name - segments[#segments + 1] = tonumber(idx) - elseif name then - segments[#segments + 1] = tonumber(idx) - else - segments[#segments + 1] = seg - end - end - for i = 1, #segments - 1 do - local s = segments[i] - if 
current[s] == nil then current[s] = {} end - current = current[s] - end - current[segments[#segments]] = value -end - -local function get_field(obj, path) - if not obj or not path or path == "" then return nil end - local current = obj - for seg in path:gmatch("[^%.]+") do - if type(current) ~= "table" then return nil end - current = current[seg] - if current == nil then return nil end - end - return current -end - -local function to_int(val) - if val == nil or val == "" then return nil end - return tonumber(val) -end - ------------------------------------------------------------------------- --- Mapping tables: palo_alto field -> OCSF field ------------------------------------------------------------------------- - -local THREAT_MAP = { - -- Core fields - {"receive_time", "time"}, - {"serial_number", "device.hw_info.serial_number"}, - {"log_type", "metadata.log_name"}, - {"log_subtype", "unmapped.sub_type"}, - {"generated_time", "metadata.original_time"}, - {"src_ip", "src_endpoint.ip"}, - {"dest_ip", "dst_endpoint.ip"}, - {"src_translated_ip", "src_endpoint.intermediate_ips[0]"}, - {"dest_translated_ip", "dst_endpoint.intermediate_ips[0]"}, - {"rule", "unmapped.rule_matched"}, - {"src_user", "actor.user.name"}, - {"dest_user", "unmapped.dst_user"}, - {"app", "app_name"}, - {"vsys", "unmapped.vsys"}, - {"src_zone", "unmapped.from_zone"}, - {"dest_zone", "unmapped.to_zone"}, - {"src_interface", "unmapped.inbound_if"}, - {"dest_interface", "unmapped.outbound_if"}, - {"log_forwarding_profile", "unmapped.log_action"}, - {"session_id", "actor.session.uid"}, - {"repeat_count", "unmapped.repeat_count"}, - {"src_port", "src_endpoint.port"}, - {"dest_port", "dst_endpoint.port"}, - {"src_translated_port", "unmapped.nat_src_port"}, - {"dest_translated_port", "unmapped.nat_dst_port"}, - {"session_flags", "unmapped.flags"}, - {"transport", "connection_info.protocol_name"}, - {"action", "unmapped.action"}, - {"misc", "unmapped.file"}, - {"threat", "unmapped.threat_id"}, - 
{"raw_category", "unmapped.url_category"}, - {"severity", "unmapped.severity"}, - {"direction", "unmapped.direction_of_attack"}, - {"sequence_number", "metadata.uid"}, - {"action_flags", "unmapped.action_flags"}, - {"src_location", "src_endpoint.location.region"}, - {"dest_location", "dst_endpoint.location.region"}, - {"content_type", "unmapped.contenttype"}, - {"pcap_id", "unmapped.pcap_id"}, - {"file_hash", "unmapped.file_digest"}, - {"cloud_address", "cloud.account_uid"}, - {"url_index", "unmapped.url_idx"}, - {"user_agent", "unmapped.user_agent"}, - {"file_type", "unmapped.file_type"}, - {"xff", "src_endpoint.intermediate_ips[1]"}, - {"referrer", "unmapped.referrer"}, - {"sender", "unmapped.sender_of_email"}, - {"subject", "unmapped.subject_of_email"}, - {"recipient", "unmapped.receipent_of_email"}, - {"report_id", "unmapped.report_id"}, - {"devicegroup_level1", "unmapped.dg_hier_level_1"}, - {"devicegroup_level2", "unmapped.dg_hier_level_2"}, - {"devicegroup_level3", "unmapped.dg_hier_level_3"}, - {"devicegroup_level4", "unmapped.dg_hier_level_4"}, - {"vsys_name", "unmapped.vsys_name"}, - {"dvc_name", "device.hostname"}, - {"http_method", "unmapped.http_method"}, - {"tunnel_id", "unmapped.tunnel_id"}, - {"tunnel_monitor_tag", "unmapped.tunnel_monitor_tag"}, - {"tunnel_session_id", "unmapped.tunnel_session_id"}, - {"tunnel_start_time", "unmapped.tunnel_start_time"}, - {"tunnel_type", "unmapped.tunnel_type"}, - {"threat_category", "unmapped.threat_category"}, - {"content_version", "unmapped.content_version"}, -} - -local TRAFFIC_MAP = { - {"receive_time", "time"}, - {"serial_number", "device.hw_info.serial_number"}, - {"log_type", "metadata.log_name"}, - {"log_subtype", "unmapped.sub_type"}, - {"generated_time", "metadata.original_time"}, - {"src_ip", "src_endpoint.ip"}, - {"dest_ip", "dst_endpoint.ip"}, - {"src_translated_ip", "src_endpoint.intermediate_ips[0]"}, - {"dest_translated_ip", "dst_endpoint.intermediate_ips[0]"}, - {"rule", "unmapped.rule_matched"}, 
- {"src_user", "actor.user.name"}, - {"dest_user", "unmapped.dst_user"}, - {"app", "app_name"}, - {"vsys", "unmapped.vsys"}, - {"src_zone", "unmapped.from_zone"}, - {"dest_zone", "unmapped.to_zone"}, - {"src_interface", "unmapped.inbound_if"}, - {"dest_interface", "unmapped.outbound_if"}, - {"log_forwarding_profile", "unmapped.log_action"}, - {"session_id", "actor.session.uid"}, - {"repeat_count", "unmapped.repeat_count"}, - {"src_port", "src_endpoint.port"}, - {"dest_port", "dst_endpoint.port"}, - {"src_translated_port", "unmapped.nat_src_port"}, - {"dest_translated_port", "unmapped.nat_dst_port"}, - {"session_flags", "unmapped.flags"}, - {"transport", "connection_info.protocol_name"}, - {"action", "unmapped.action"}, - {"dvc_name", "device.hostname"}, - {"src_location", "src_endpoint.location.region"}, - {"dest_location", "dst_endpoint.location.region"}, -} - -local THREAT_CONSTANTS = { - {"activity_name", "THREAT"}, - {"class_uid", 4001}, - {"activity_id", 99}, - {"category_uid", 4}, - {"type_uid", 400199}, - {"type_name", "Network Activity: Other"}, - {"class_name", "Network Activity"}, - {"category_name", "Network Activity"}, - {"metadata.version", "1.0.0-rc.3"}, - {"event.type", "THREAT"}, - {"status_id", 99}, - {"status", "Other"}, - {"connection_info.direction_id", 99}, - {"device.type_id", 99}, - {"dataSource.category", "security"}, - {"dataSource.name", "Palo Alto Networks Firewall"}, - {"dataSource.vendor", "Palo Alto Networks"}, - {"metadata.product.name", "Palo Alto Networks Firewall"}, - {"metadata.product.vendor_name", "Palo Alto Networks"}, -} - -local TRAFFIC_CONSTANTS = { - {"class_uid", 4001}, - {"category_uid", 4}, - {"severity_id", 0}, - {"class_name", "Network Activity"}, - {"category_name", "Network Activity"}, - {"metadata.version", "1.0.0-rc.3"}, - {"metadata.log_name", "TRAFFIC"}, - {"status_id", 99}, - {"status", "Other"}, - {"connection_info.direction_id", 99}, - {"device.type_id", 99}, - {"dataSource.category", "security"}, - 
{"dataSource.name", "Palo Alto Networks Firewall"}, - {"dataSource.vendor", "Palo Alto Networks"}, - {"metadata.product.name", "Palo Alto Networks Firewall"}, - {"metadata.product.vendor_name", "Palo Alto Networks"}, -} - -local THREAT_COND_CONSTANTS = { - {"severity_id", 1, "unmapped.severity", "informational"}, - {"severity_id", 2, "unmapped.severity", "low"}, - {"severity_id", 3, "unmapped.severity", "medium"}, - {"severity_id", 4, "unmapped.severity", "high"}, - {"severity_id", 5, "unmapped.severity", "critical"}, - {"status_id", 1, "unmapped.action", "allow"}, - {"status_id", 2, "unmapped.action", "deny"}, - {"status", "Success", "unmapped.action", "allow"}, - {"status", "Failure", "unmapped.action", "deny"}, -} - -local TRAFFIC_COND_CONSTANTS = { - {"activity_id", 1, "unmapped.sub_type", "start"}, - {"activity_id", 2, "unmapped.sub_type", "end"}, - {"activity_id", 4, "unmapped.sub_type", "drop"}, - {"activity_id", 5, "unmapped.sub_type", "deny"}, - {"activity_name", "Open", "unmapped.sub_type", "start"}, - {"activity_name", "Close", "unmapped.sub_type", "end"}, - {"activity_name", "Fail", "unmapped.sub_type", "drop"}, - {"activity_name", "Refuse", "unmapped.sub_type", "deny"}, - {"status_id", 1, "unmapped.action", "allow"}, - {"status_id", 2, "unmapped.action", "deny"}, - {"status", "Success", "unmapped.action", "allow"}, - {"status", "Failure", "unmapped.action", "deny"}, -} - -local COPIES = { - {"src_endpoint.ip", "observables[0].value"}, - {"dst_endpoint.ip", "observables[1].value"}, - {"device.hostname", "observables[2].value"}, -} - -local OBSERVABLES_CONSTANTS = { - {"observables[0].type_id", 2}, - {"observables[0].type", "IP Address"}, - {"observables[0].name", "src_endpoint.ip"}, - {"observables[1].type_id", 2}, - {"observables[1].type", "IP Address"}, - {"observables[1].name", "dst_endpoint.ip"}, - {"observables[2].type_id", 1}, - {"observables[2].type", "Hostname"}, - {"observables[2].name", "device.hostname"}, -} - 
------------------------------------------------------------------------- --- Apply mapping from palo_alto.* to event ------------------------------------------------------------------------- -local function apply_map(pa, event, map) - for _, m in ipairs(map) do - local src, dst = m[1], m[2] - local val = pa[src] - if val ~= nil and val ~= "" then - set_field(event, dst, val) - end - end -end - -local function apply_constants(event, constants) - for _, c in ipairs(constants) do - set_field(event, c[1], c[2]) - end -end - -local function apply_cond_constants(event, cond_constants) - for _, c in ipairs(cond_constants) do - local actual = get_field(event, c[3]) - if actual == c[4] then - set_field(event, c[1], c[2]) - end - end -end - -local function apply_copies(event, copies) - for _, c in ipairs(copies) do - local val = get_field(event, c[1]) - if val then - set_field(event, c[2], val) - end - end -end - -local function cast_int_fields(event, fields) - for _, path in ipairs(fields) do - local val = get_field(event, path) - if val ~= nil and val ~= "" then - set_field(event, path, to_int(val)) - end - end -end - ------------------------------------------------------------------------- --- Main processEvent ------------------------------------------------------------------------- -function processEvent(event) - local pa = event.palo_alto - if not pa then - return event - end - - local log_type = pa.log_type - if not log_type then - return event - end - - log_type = log_type:upper() - - if log_type == "THREAT" then - apply_map(pa, event, THREAT_MAP) - apply_constants(event, THREAT_CONSTANTS) - apply_constants(event, OBSERVABLES_CONSTANTS) - apply_cond_constants(event, THREAT_COND_CONSTANTS) - apply_copies(event, COPIES) - cast_int_fields(event, {"src_endpoint.port", "dst_endpoint.port"}) - - elseif log_type == "TRAFFIC" then - apply_map(pa, event, TRAFFIC_MAP) - apply_constants(event, TRAFFIC_CONSTANTS) - apply_constants(event, OBSERVABLES_CONSTANTS) - 
apply_cond_constants(event, TRAFFIC_COND_CONSTANTS)
- apply_copies(event, COPIES)
- cast_int_fields(event, {"src_endpoint.port", "dst_endpoint.port"})
-
- else
- set_field(event, "metadata.log_name", log_type)
- set_field(event, "unmapped.log_type", log_type)
- end
-
- return event
-end
diff --git a/pipelines/pull/api/.README.md b/pipelines/pull/api/.README.md
new file mode 100644
index 0000000..46f52b0
--- /dev/null
+++ b/pipelines/pull/api/.README.md
@@ -0,0 +1,28 @@
+# pipelines/pull/api/
+
+Pipelines that **poll a vendor REST/HTTP API** on a schedule.
+
+## Layout
+
+```
+pull/api/<vendor>/<product>/
+ ├── metadata.yaml
+ └── observo_export_pipeline_*.json
+```
+
+## Required metadata
+
+```yaml
+metadata_details:
+ vendor: "<vendor>"
+ product: "<product>"
+ ingest_mode: "API Call"
+ auth_type: "OAuth | Bearer Token | API Key & Secret | Basic | mTLS"
+ # ... plus the standard pipeline fields including scheduling.interval_secs
+```
+
+## See also
+
+- `../object_store/` — for pulls from S3 / GCS / Azure Blob
+- `../../push/syslog/` — for push patterns
+- `../../community/transform_ocsf/` — for OCSF normalization overlays
diff --git a/pipelines/pull/object_store/.README.md b/pipelines/pull/object_store/.README.md
new file mode 100644
index 0000000..d0c4b84
--- /dev/null
+++ b/pipelines/pull/object_store/.README.md
@@ -0,0 +1,23 @@
+# pipelines/pull/object_store/
+
+Pipelines that **read events from an object store** (S3, GCS, Azure Blob),
+typically driven by SQS/SNS/Pub-Sub notifications.
+
+## Layout
+
+```
+pull/object_store/<vendor>/<product>/
+ ├── metadata.yaml
+ └── observo_export_pipeline_*.json
+```
+
+## Required metadata
+
+```yaml
+metadata_details:
+ vendor: "<vendor>"
+ product: "<product>"
+ ingest_mode: "Other - {Explain: object store, e.g. S3 / GCS / Azure Blob}"
+ auth_type: "IAM Role | API Key & Secret | OAuth"
+ # ...
plus the standard pipeline fields
+```
diff --git a/pipelines/push/hec/.README.md b/pipelines/push/hec/.README.md
new file mode 100644
index 0000000..3aa26d4
--- /dev/null
+++ b/pipelines/push/hec/.README.md
@@ -0,0 +1,24 @@
+# pipelines/push/hec/
+
+Pipelines for vendors that POST events **directly to an HEC endpoint** with
+vendor-specific shaping (token handling, field unwrapping, batching, retry
+semantics).
+
+## Layout
+
+```
+push/hec/<vendor>/<product>/
+ ├── metadata.yaml
+ └── observo_export_pipeline_*.json
+```
+
+## Required metadata
+
+```yaml
+metadata_details:
+ vendor: "<vendor>"
+ product: "<product>"
+ ingest_mode: "HEC"
+ auth_type: "HEC Token | Bearer Token | API Key & Secret | Basic"
+ # ... plus the standard pipeline fields
+```
diff --git a/pipelines/push/syslog/.README.md b/pipelines/push/syslog/.README.md
new file mode 100644
index 0000000..9a9c221
--- /dev/null
+++ b/pipelines/push/syslog/.README.md
@@ -0,0 +1,34 @@
+# pipelines/push/syslog/
+
+Pipelines for vendors that **push syslog** (RFC5424, RFC3164, CEF, LEEF, or
+vendor key/value format) to a collector or HEC endpoint, where vendor-specific
+parsing or transformation is required.
+
+## Layout
+
+```
+push/syslog/<vendor>/<product>/
+ ├── metadata.yaml
+ └── observo_export_pipeline_*.json
+```
+
+## Required metadata
+
+```yaml
+metadata_details:
+ vendor: "<vendor>" # lowercase, underscored, e.g. palo_alto
+ product: "<product>" # lowercase, underscored, e.g. panos
+ ingest_mode: "Syslog"
+ auth_type: "N/A" # most syslog senders are authless on the wire
+ # (use "mTLS" if certs are required)
+ syslog_format: "CEF | LEEF | RFC5424 | RFC3164 | Vendor KV"
+ # ... plus the standard pipeline fields (purpose, source_template,
+ # destination_template, transform_templates, scheduling, retry,
+ # dependencies, performance_impact, tags, version)
+```
+
+## See also
+
+- `../hec/` — for vendors that POST directly to HEC
+- `../../pull/api/` — for API polling
+- `../../community/transform_ocsf/` — for OCSF normalization overlays
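To make the schema above concrete, here is what a filled-in `metadata.yaml` might look like for a hypothetical `push/syslog/palo_alto/panos/` pipeline. All field values are illustrative for this sketch (the template names and narrative text are invented, not taken from a shipped pipeline, and the machine-generated top-level `grade:` block is omitted); only `class_uid=4001` and the enum values come from the schema in this PR:

```yaml
# Hypothetical example: not a shipped pipeline; values are illustrative.
metadata_details:
  vendor: "palo_alto"                # lowercase, underscored
  product: "panos"
  ingest_mode: "Syslog"
  auth_type: "N/A"                   # authless syslog on the wire
  syslog_format: "CEF"
  purpose: "Normalize PAN-OS TRAFFIC/THREAT syslog into OCSF Network Activity (class_uid=4001)"
  source_template: "PAN-OS Syslog"   # invented name for this sketch
  source_vendor: "Palo Alto Networks"
  destination_template: "SentinelOne AI SIEM"
  destination_type: "SPLUNK_HEC_LOGS"
  transform_templates: "Lua serializer mapping pre-parsed palo_alto.* fields to OCSF"
  input_schema: "Pre-parsed palo_alto.* key/value fields"
  output_schema: "OCSF Network Activity (class_uid=4001) events"
  scheduling: "N/A (event-driven syslog push)"
  retry_behavior: "Destination-side retry with backoff"
  dependencies: "Reachable syslog listener; SentinelOne HEC token on the destination"
  performance_impact: "Scales with firewall event rate; no polling overhead"
  tags: ["firewall", "network"]
  version: "v1.0"
```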