Commit 3fe3c27

docs(connector-linter): add README with usage and check reference
1 parent 192ed42 commit 3fe3c27

1 file changed

Lines changed: 390 additions & 0 deletions

shared/connector_linter/README.md

@@ -0,0 +1,390 @@
# OpenCTI Connector Verified Linter

A flake8-style linter that validates whether an OpenCTI connector meets the **Verified** status criteria. Each check has a unique error code (e.g. `VC101`) and provides actionable suggestions for fixing violations.

## Installation

The linter is managed with [uv](https://docs.astral.sh/uv/) and lives in `shared/connector_linter/`.

```bash
# From the connector_linter directory
cd shared/connector_linter

# Install with uv (recommended)
uv sync

# Or install with pip
pip install -e .
```

## Usage

### Check a connector

```bash
# Basic check (all rules)
connector-linter check ../../external-import/mandiant

# Or run via uv
uv run connector-linter check ../../external-import/mandiant
```

### Output formats

```bash
# Colored terminal output (default — paths relative to CLI argument)
connector-linter check ./connector --format text

# JSON output (for CI pipelines — always uses absolute paths)
connector-linter check ./connector --format json

# GitHub Actions annotations (paths relative to repo root)
connector-linter check ./connector --format github
```

### Filtering checks

```bash
# Run only specific checks
connector-linter check ./connector --select VC101 --select VC102

# Run an entire category (prefix matching)
connector-linter check ./connector --select VC1xx   # all configuration checks
connector-linter check ./connector --select VC3xx   # all code checks
connector-linter check ./connector --select VC5xx   # all deprecation checks

# Ignore specific checks
connector-linter check ./connector --ignore VC306 --ignore VC307

# Only show failures (hide passed checks; warnings/info still shown)
connector-linter check ./connector --quiet

# Filter by severity level
connector-linter check ./connector --severity warning   # show warnings and errors
connector-linter check ./connector --severity error     # errors only

# Use absolute paths in text output (JSON always uses absolute paths)
connector-linter check ./connector --abspath
```
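
To make the prefix matching concrete, here is a minimal sketch of how `--select` and `--ignore` patterns such as `VC1xx` could be resolved against check codes. The function names and the rule that `--ignore` wins over `--select` are assumptions for illustration, not the linter's actual implementation.

```python
def matches(code: str, pattern: str) -> bool:
    """Return True if `code` matches `pattern`; trailing x's act as wildcards."""
    prefix = pattern.rstrip("xX")
    if prefix == pattern:
        # No wildcard suffix: require an exact code match
        return code.upper() == pattern.upper()
    return code.upper().startswith(prefix.upper())


def filter_codes(codes, select=(), ignore=()):
    """Keep codes matching any select pattern (all codes if none given)
    and matching no ignore pattern. Assumption: ignore takes precedence."""
    kept = []
    for code in codes:
        selected = not select or any(matches(code, p) for p in select)
        ignored = any(matches(code, p) for p in ignore)
        if selected and not ignored:
            kept.append(code)
    return kept


all_codes = ["VC101", "VC102", "VC306", "VC307", "VC501"]
print(filter_codes(all_codes, select=["VC1xx"]))
print(filter_codes(all_codes, ignore=["VC306", "VC307"]))
```

With these inputs, the first call keeps only the two `VC1` codes and the second drops `VC306` and `VC307`.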

### Inline suppression (`# noqa`)

Suppress specific checks on individual lines using `# noqa` comments — same syntax as flake8:

```python
# Suppress all checks on this line
self.helper.log_info(msg)  # noqa

# Suppress a specific check
self.helper.log_info(msg)  # noqa: VC503

# Suppress multiple checks
confidence=80,  # noqa: VC504, VC302
```

Works in any file that uses `#` for comments (Python, YAML, Dockerfile, `.env`).

To ignore all `# noqa` directives (useful for CI audits):

```bash
connector-linter check ./connector --disable-noqa
```
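
As an illustration of the suppression syntax, the following sketch parses `# noqa` comments with a regular expression. The regex and helper names are hypothetical; the linter's real parser may differ, for example in how it treats malformed code lists.

```python
import re

# Matches "# noqa" optionally followed by ": CODE[, CODE...]"
NOQA_RE = re.compile(
    r"#\s*noqa(?::\s*(?P<codes>[A-Z]+[0-9]+(?:[,\s]+[A-Z]+[0-9]+)*))?",
    re.IGNORECASE,
)


def suppressed_codes(line: str):
    """Return None if the line has no noqa comment, an empty set for a
    blanket `# noqa`, or the set of listed codes otherwise."""
    match = NOQA_RE.search(line)
    if match is None:
        return None
    codes = match.group("codes")
    if not codes:
        return set()
    return {c.upper() for c in re.split(r"[,\s]+", codes) if c}


def is_suppressed(line: str, code: str) -> bool:
    """True if a finding with `code` on this line should be hidden."""
    codes = suppressed_codes(line)
    if codes is None:
        return False
    return not codes or code in codes


print(is_suppressed("self.helper.log_info(msg)  # noqa: VC503", "VC503"))
print(suppressed_codes("confidence=80,  # noqa: VC504, VC302"))
```

A `--disable-noqa` mode would then simply skip the `is_suppressed` test when collecting findings.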

### List all checks

```bash
connector-linter list
```

## Exit codes

| Exit code | Meaning |
|-----------|---------|
| `0` | All checks passed |
| `1` | One or more checks failed (ERROR severity) |

WARNING-severity checks never cause a non-zero exit code.

---

## Check Reference

### VC1xx — Configuration

Validates config files (`docker-compose.yml`, `.env.sample`, `config.yml.sample`).

| Code | Severity | Name | Description |
|------|----------|------|-------------|
| VC101 | ERROR | `config-token-default` | `OPENCTI_TOKEN` must default to `ChangeMe` (exact case) |
| VC102 | ERROR | `config-url-default` | `OPENCTI_URL` must default to `http://localhost` (no port) |
| VC103 | ERROR | `config-variable-prefix` | Env vars must use `OPENCTI_`, `CONNECTOR_`, or `<CONNECTOR_NAME>_` prefix |
| VC104 | ERROR | `config-file-samples` | Must have `config.yml.sample` + `docker-compose.yml` or `.env.sample`; `ChangeMe` for values without defaults; defaults must be commented |
| VC105 | ERROR | `no-absolute-import-date` | Import start dates must use ISO 8601 duration (`P30D`), not absolute dates (`2020-01-01`) |
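
The distinction VC105 draws between a relative duration and an absolute date can be sketched as below. This is a simplified illustration with hypothetical names and a reduced ISO 8601 duration grammar, not the linter's actual check.

```python
import re

# Simplified ISO 8601 duration: P[nY][nM][nW][nD][T[nH][nM][nS]]
DURATION_RE = re.compile(
    r"^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+S)?)?$"
)
# Anything starting with YYYY-MM-DD is treated as an absolute date
ABSOLUTE_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}")


def classify_start_date(value: str) -> str:
    """Classify a configured start date as 'duration', 'absolute' or 'unknown'."""
    value = value.strip().strip("'\"")
    if DURATION_RE.match(value):
        return "duration"
    if ABSOLUTE_DATE_RE.match(value):
        return "absolute"
    return "unknown"


for candidate in ("P30D", "PT12H", "2020-01-01", "P"):
    print(candidate, "->", classify_start_date(candidate))
```

Under this sketch, only values classified as `absolute` would be reported as violations.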

### VC2xx — Metadata

Validates connector metadata (manifest, identity).

| Code | Severity | Name | Description |
|------|----------|------|-------------|
| VC201 | ERROR | `manifest-verified-date` | `connector_manifest.json` must have `"verified": true` and a valid `"last_verified_date"` in YYYY-MM-DD format |
| VC202 | ERROR | `manifest-container-image` | `container_version` must be `"rolling"`, `container_image` must match `opencti/connector-<dirname>` |

### VC3xx — Code

Validates Python source code patterns. Uses AST analysis for structural checks.

| Code | Severity | Scope | Name | Description |
|------|----------|-------|------|-------------|
| VC301 | ERROR | Common | `author-defined` | Connector must define an author Identity (Organization) |
| VC302 | ERROR | Common | `author-referenced-on-entities` | Author must be referenced on STIX entities via `created_by_ref` |
| VC303 | ERROR | Common | `connector-type-hardcoded` | `CONNECTOR_TYPE` must be hardcoded in code, not read from env |
| VC304 | ERROR | Enrichment | `markings-checked` | TLP markings must be checked via `check_max_tlp` before processing |
| VC305 | ERROR | Common | `sdk-base-settings` | Connector must use `BaseConnectorSettings` from connectors-sdk |
| VC306 | WARNING | Common | `log-level-default-error` | Log level should default to `error` |
| VC307 | WARNING | Common | `except-logging-level` | Except blocks should use `error`/`warning` logging, not `debug`/`info` |
| VC308 | ERROR | Common | `main-traceback` | `main.py` must use `traceback` for error handling |
| VC309 | ERROR | Common | `absolute-imports-only` | No relative imports — use absolute imports only |
| VC310 | ERROR | Common | `external-references-not-default` | External references must not be added to every entity by default; attach them only to the Identity |
| VC311 | WARNING | Common | `tlp-markings-on-entities` | STIX entities should include TLP markings |
| VC312 | ERROR | Common | `cleanup-inconsistent-bundle` | `send_stix2_bundle()` must use `cleanup_inconsistent_bundle=True` |
| VC313 | ERROR | Common | `pycti-generate-id` | STIX SDO/SRO objects must use `pycti.XXX.generate_id()` for deterministic IDs |
| VC314 | ERROR | External Import | `auto-backpressure` | Must use `schedule_process()` or `schedule_iso()` for scheduling |
| VC315 | ERROR | External Import | `work-initiated` | Must call `initiate_work()` before processing |
| VC316 | ERROR | External Import | `work-closed` | Must close work with `to_processed()` after processing |
| VC317 | WARNING | External Import | `initiate-work-conditional` | `initiate_work` should only be called when data is available |
| VC318 | ERROR | Enrichment | `helper-listen` | Must use `self.helper.listen()` for message callback |
| VC319 | WARNING | Enrichment | `scope-fallback-bundle` | Should return original bundle when entity is not in scope |
| VC320 | ERROR | Enrichment | `tlp-access-control` | Must enforce TLP access control (extract → check → reject) |
| VC321 | ERROR | Enrichment | `playbook-compatible` | Must set `playbook_compatible=True` in helper constructor |
| VC322 | ERROR | Enrichment | `former-bundle-read` | Must read `data['stix_objects']` for playbook compatibility |
| VC323 | ERROR | Stream | `helper-listen-stream` | Must use `self.helper.listen_stream()` |
| VC324 | WARNING | Common | `relationship-start-stop-time` | Relationship should not set both `start_time` and `stop_time` (overloads Redis with time-bucketed duplicates) |
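
As one example of the AST analysis mentioned above, a rule like VC309 can be approximated with the standard `ast` module: relative imports are `ImportFrom` nodes whose `level` is greater than zero. The function name and sample source are illustrative assumptions, not the linter's own code.

```python
import ast


def find_relative_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, dotted target) pairs for every relative import."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # level counts the leading dots: 0 means an absolute import
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            target = "." * node.level + (node.module or "")
            hits.append((node.lineno, target))
    return sorted(hits)


sample = '''\
import os
from . import utils
from ..lib.client import Client
from connector.settings import ConnectorSettings
'''

print(find_relative_imports(sample))
```

Only the second and third lines of the sample are reported; the absolute imports are left alone.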

### VC4xx — Docker

Validates Dockerfile and docker-compose configuration.

| Code | Severity | Name | Description |
|------|----------|------|-------------|
| VC401 | ERROR | `docker-compose-image` | Image must use `:latest` tag and name must match directory (`opencti/connector-<dirname>:latest`) |
| VC402 | ERROR | `no-entrypoint-sh` | Dockerfile must not use `entrypoint.sh` — use direct `ENTRYPOINT ["python", ...]` |

### VC5xx — Deprecation

Detects deprecated patterns that must be removed for Verified status.

| Code | Severity | Name | Description |
|------|----------|------|-------------|
| VC501 | ERROR | `no-legacy-interval` | Must use `CONNECTOR_DURATION_PERIOD` (ISO 8601), not `*_INTERVAL` variables or `schedule_unit()` |
| VC502 | ERROR | `no-deprecated-report-status` | Must not use `x_opencti_report_status` (deprecated, non-functional). `x_opencti_workflow_id` emits a WARNING |
| VC503 | ERROR | `no-deprecated-helper-logger` | Must use `helper.connector_logger.{level}()` instead of `helper.log_{level}()` |
| VC504 | ERROR | `no-deprecated-confidence` | Must not use `confidence` level (deprecated since OpenCTI 6.0 — managed by platform policies) |
| VC505 | WARNING | `no-direct-api-calls` | Should not use `helper.api.*` for direct GraphQL calls (except `api.work`, `api.vocabulary`, `api.label`, etc.) |
| VC506 | ERROR | `no-update-existing-data` | Must not use `UPDATE_EXISTING_DATA` (no longer in helper). Exception: `opencti` datasets connector |

---

## Architecture

```
connector_linter/
├── __init__.py              # Version
├── __main__.py              # Click CLI (check, list commands)
├── models.py                # ConnectorContext, CheckFinding, CheckResult, Severity
├── registry.py              # CheckRegistry — decorator-based registration
├── runner.py                # Auto-discovers and executes checks
├── formatters.py            # Output: text (ANSI), JSON, GitHub Actions
└── checks/                  # All check modules, auto-discovered
    ├── vc1xx_config/        # Configuration checks
    │   ├── _helpers.py      # Env var parsing, config file utilities
    │   ├── vc101_*.py
    │   └── ...
    ├── vc2xx_metadata/      # Metadata checks
    ├── vc3xx_code/          # Code structure checks
    │   ├── _helpers.py      # AST + regex helpers
    │   ├── vc301_*.py
    │   └── ...
    ├── vc4xx_docker/        # Docker checks
    └── vc5xx_deprecation/   # Deprecation checks
```
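
The auto-discovery of check modules noted in the tree could work roughly as follows. This sketch uses the real `pkgutil.walk_packages()` API but demonstrates it on the standard-library `email` package, since the linter package itself is not importable here; `discover_modules` is a hypothetical name, and a real runner would presumably also import each module so that its registration decorators run.

```python
import importlib
import pkgutil


def discover_modules(package_name: str) -> list[str]:
    """List every module under a package, skipping `_`-prefixed ones."""
    package = importlib.import_module(package_name)
    names = []
    for info in pkgutil.walk_packages(
        package.__path__, prefix=package.__name__ + "."
    ):
        short_name = info.name.rsplit(".", 1)[-1]
        if short_name.startswith("_"):
            continue  # helper modules such as _helpers.py are not checks
        names.append(info.name)
    return sorted(names)


# Demo on a stdlib package that contains private modules (e.g. email._parseaddr)
for name in discover_modules("email"):
    print(name)
```

The same function pointed at a `checks` package would return the `vcNNN_*` modules and omit every `_helpers` module.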

### Key concepts

- **`ConnectorContext`** — Loaded once per connector. Contains the path, connector type (auto-detected from parent dir), manifest, file lists, and structural flags.
- **`CheckRegistry`** — Singleton registry. Checks register themselves via the `@CheckRegistry.register()` decorator.
- **`CheckFinding`** — Lightweight dataclass returned by check functions. Contains only check-specific data: `message`, `passed`, `file_path`, `line`, `suggestion`, and an optional `severity` override.
- **`CheckResult`** — Full result produced by the runner. The runner hydrates each `CheckFinding` with `code`, `name`, and `severity` from the `CheckDescriptor`, so checks never repeat those fields.
- **Auto-discovery**: `runner.py` uses `pkgutil.walk_packages()` to find all modules under `checks/`. Modules prefixed with `_` (like `_helpers.py`) are skipped.
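
The relationship between these concepts can be sketched in a few lines. The classes below are simplified stand-ins written for illustration (checks take no `ConnectorContext` here, and the field sets are reduced); they are assumptions about the shape of the design, not the real `models.py` or `registry.py`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"


@dataclass
class CheckFinding:
    message: str
    passed: bool
    severity: Optional[Severity] = None  # optional per-finding override


@dataclass
class CheckDescriptor:
    code: str
    name: str
    severity: Severity
    func: Callable[[], list]


@dataclass
class CheckResult:
    code: str
    name: str
    severity: Severity
    message: str
    passed: bool


class CheckRegistry:
    _checks: dict = {}  # code -> CheckDescriptor, shared by all users

    @classmethod
    def register(cls, code: str, name: str, severity: Severity):
        def decorator(func):
            cls._checks[code] = CheckDescriptor(code, name, severity, func)
            return func
        return decorator


def run_checks() -> list:
    """Hydrate each finding with code/name/severity from its descriptor."""
    results = []
    for desc in CheckRegistry._checks.values():
        for finding in desc.func():
            results.append(CheckResult(
                code=desc.code,
                name=desc.name,
                severity=finding.severity or desc.severity,
                message=finding.message,
                passed=finding.passed,
            ))
    return results


@CheckRegistry.register(code="VC999", name="demo-check", severity=Severity.ERROR)
def check_demo():
    return [
        CheckFinding(message="all good", passed=True),
        CheckFinding(message="soft issue", passed=False, severity=Severity.WARNING),
    ]


for result in run_checks():
    print(result.code, result.severity.name, result.passed, result.message)
```

The check function never mentions its own code or name; those come from the descriptor, which is the point of separating `CheckFinding` from `CheckResult`.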

### Severity semantics

| Severity | `passed=True` | `passed=False` |
|----------|--------------|----------------|
| ERROR | Check passes ✓ | Blocking failure ✗ (causes exit code 1) |
| WARNING | Non-blocking info | Non-blocking warning (never causes exit code 1) |
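
The table reduces to a single rule: the process exits with `1` only if at least one ERROR-severity result failed. A minimal sketch, using a hypothetical `Result` record rather than the linter's own `CheckResult`:

```python
from dataclasses import dataclass


@dataclass
class Result:
    code: str
    severity: str  # "error" or "warning"
    passed: bool


def exit_code(results) -> int:
    """Return 1 if any ERROR-severity result failed, otherwise 0."""
    blocking = any(r.severity == "error" and not r.passed for r in results)
    return 1 if blocking else 0


only_warnings = [Result("VC306", "warning", False), Result("VC101", "error", True)]
with_error = only_warnings + [Result("VC503", "error", False)]

print(exit_code(only_warnings))
print(exit_code(with_error))
```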

---

## Adding a new check

### 1. Choose the right category and code

| Category | Prefix | For |
|----------|--------|-----|
| Configuration | `VC1xx` | Config files (env vars, YAML, settings) |
| Metadata | `VC2xx` | Manifest, connector identity |
| Code | `VC3xx` | Python source patterns (AST/regex) |
| Docker | `VC4xx` | Dockerfile, docker-compose |
| Deprecation | `VC5xx` | Deprecated patterns to remove |

Pick the next number in the category (e.g., if the last is `VC324`, use `VC325`).

### 2. Create the check file

Create a new file in the appropriate package:

```bash
# Example: new deprecation check
touch connector_linter/checks/vc5xx_deprecation/vc507_my_new_check.py
```

### 3. Write the check

```python
"""VC507 — Short description of what this check validates.

Longer explanation of why this matters, what the correct pattern is,
and any references to PRs/issues.

Scope: Common | EXTERNAL_IMPORT | INTERNAL_ENRICHMENT | STREAM
"""

from connector_linter.models import CheckFinding, ConnectorContext, Severity
from connector_linter.registry import CheckRegistry


@CheckRegistry.register(
    code="VC507",
    name="my-check-name",                # kebab-case short name
    description="One-line description",  # shown in `list` output
    severity=Severity.ERROR,             # ERROR or WARNING
)
def check_my_new_check(ctx: ConnectorContext) -> list[CheckFinding]:
    """Implement the check logic here."""

    # Use ctx.path, ctx.connector_type, ctx.manifest, ctx.src_files, etc.

    # Scope to specific connector types if needed:
    if ctx.connector_type and ctx.connector_type != "EXTERNAL_IMPORT":
        return []  # skip — not applicable

    # Placeholder: replace with the real detection logic for this check
    violation_found = False

    if not violation_found:
        # Return PASS result
        return [
            CheckFinding(
                message="Everything looks good ✓",
                passed=True,
            )
        ]

    # Otherwise return FAIL with suggestion
    return [
        CheckFinding(
            message="Problem description",
            passed=False,
            file_path=ctx.path / "src/connector.py",  # Path object (relative joined with ctx.path)
            line=42,
            suggestion="How to fix this issue.",
        )
    ]
```

### 4. Update the package `__init__.py`

Add the new check to the docstring in the category's `__init__.py`:

```python
"""VC5xx — Deprecation checks.
...
VC506  no-update-existing-data  Must not use deprecated UPDATE_EXISTING_DATA
VC507  my-check-name            One-line description
"""
```

### 5. Available helpers

#### Configuration helpers (`vc1xx_config/_helpers.py`)

```python
from connector_linter.checks.vc1xx_config._helpers import (
    extract_all_env_vars,                  # Returns list[EnvVar] from docker-compose + .env.sample
    extract_env_vars_from_docker_compose,
    extract_env_vars_from_env_sample,
    derive_connector_prefixes,             # dirname → ["ABUSE_SSL", "ABUSESSL"]
    find_config_yml_sample,                # Returns path to config.yml.sample
    find_bad_changeme_values,              # Finds wrong-case ChangeMe
)
```
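
To show how a helper like `derive_connector_prefixes` might feed a prefix rule such as VC103, here is a standalone sketch. It is a hypothetical re-implementation: the input directory name `abuse-ssl` is an assumption inferred from the example output in the comment above, and the real helper's signature and behavior may differ.

```python
import re


def derive_prefixes(dirname: str) -> list[str]:
    """Derive candidate env var prefixes from a connector directory name."""
    words = [w.upper() for w in re.split(r"[^A-Za-z0-9]+", dirname) if w]
    candidates = ["_".join(words), "".join(words)]
    # Remove duplicates (single-word names) while preserving order
    return list(dict.fromkeys(candidates))


def has_allowed_prefix(var_name: str, dirname: str) -> bool:
    """True if the variable starts with OPENCTI_, CONNECTOR_ or a connector prefix."""
    prefixes = ["OPENCTI", "CONNECTOR", *derive_prefixes(dirname)]
    return any(var_name.startswith(prefix + "_") for prefix in prefixes)


print(derive_prefixes("abuse-ssl"))
print(has_allowed_prefix("ABUSE_SSL_INTERVAL", "abuse-ssl"))
print(has_allowed_prefix("MY_CUSTOM_VAR", "abuse-ssl"))
```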

#### Code helpers (`vc3xx_code/_helpers.py`)

```python
from connector_linter.checks.vc3xx_code._helpers import (
    read_all_python_sources,  # Returns dict[Path, source_code]
    parse_sources,            # Returns dict[Path, ast.Module]
    find_pattern_locations,   # Regex search across all sources → list[tuple[Path, int, str]]
    find_imports,             # Find import statements by module/name pattern → list[ImportInfo]
    find_classes,             # Find class definitions by base class → list[ClassInfo]
    find_calls_in_stmts,      # Find function/method calls (with receiver) → list[CallInfo]
    find_field_defaults,      # Find default values in class fields → list[FieldDefaultInfo]
    find_except_blocks,       # Find except blocks with their logging → list[ExceptBlockInfo]
)
```

### 6. Test your check

```bash
# Run against a known connector
uv run connector-linter check ../../external-import/mandiant --select VC507

# Run against a connector that should fail
uv run connector-linter check ../../external-import/alienvault --select VC507

# Verify it shows up in the list
uv run connector-linter list | grep VC507
```

### Tips

- **Use AST over regex** when checking Python code structure (function calls, keyword arguments, class definitions). It's more reliable and won't match inside strings or comments.
- **Use regex** for simple text patterns in non-Python files (config, Dockerfile, YAML).
- **Scope checks** to specific connector types by checking `ctx.connector_type` early and returning `[]` if not applicable.
- **Multiple results** — a check can return multiple `CheckFinding` items (e.g., one per file where a violation was found). The runner enriches each with `code`/`name`/`severity` from the descriptor.
- **Cross-package imports** are fine — deprecation checks commonly use helpers from `vc1xx_config/_helpers.py` and `vc3xx_code/_helpers.py`.
- **`_`-prefixed modules** are ignored by auto-discovery — use this for helper modules.
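
The first tip can be illustrated with a VC312-style rule. In the sketch below, a regex flags every line that mentions `send_stix2_bundle(`, including a comment, while an AST walk inspects only real calls and their keyword arguments. The function name and sample source are assumptions made for the example; the linter's own helpers (such as `find_calls_in_stmts`) are not used.

```python
import ast
import re

SOURCE = '''\
# send_stix2_bundle(bundle) was the old call
helper.send_stix2_bundle(bundle, work_id=work_id)
helper.send_stix2_bundle(bundle, cleanup_inconsistent_bundle=True)
'''


def calls_missing_cleanup(source: str) -> list[int]:
    """Lines of send_stix2_bundle() calls lacking cleanup_inconsistent_bundle=True."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both helper.send_stix2_bundle(...) and a bare send_stix2_bundle(...)
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "send_stix2_bundle":
            continue
        has_flag = any(
            kw.arg == "cleanup_inconsistent_bundle"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_flag:
            lines.append(node.lineno)
    return sorted(lines)


# Naive regex: matches the comment on line 1 as well as both real calls
regex_hits = [
    number
    for number, line in enumerate(SOURCE.splitlines(), 1)
    if re.search(r"send_stix2_bundle\(", line)
]

print("regex:", regex_hits)
print("ast:  ", calls_missing_cleanup(SOURCE))
```

The regex cannot tell the compliant call from the non-compliant one without further parsing, whereas the AST version reports only line 2.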

---

## Running the linter in CI

### GitHub Actions

```yaml
- name: Lint connector
  run: |
    cd shared/connector_linter
    uv run connector-linter check ../../external-import/${{ matrix.connector }} --format github
```

The `--format github` output produces `::error` / `::warning` annotations that appear inline in PR diffs.
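
For reference, GitHub Actions annotations are plain workflow-command lines printed to stdout. The sketch below builds such lines following GitHub's documented `::level file=...,line=...::message` syntax and escaping rules; how the linter's own formatter fills these fields (for instance, whether it uses `title`) is an assumption.

```python
def _escape_data(text: str) -> str:
    """Escape a workflow-command message."""
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(text: str) -> str:
    """Escape a workflow-command property value (also escapes ':' and ',')."""
    return _escape_data(text).replace(":", "%3A").replace(",", "%2C")


def github_annotation(level, message, file=None, line=None, title=None) -> str:
    """Build one annotation line such as '::error file=a.py,line=3::message'."""
    if level not in ("error", "warning", "notice"):
        raise ValueError(f"unsupported level: {level}")
    props = []
    if file is not None:
        props.append(f"file={_escape_property(str(file))}")
    if line is not None:
        props.append(f"line={line}")
    if title is not None:
        props.append(f"title={_escape_property(title)}")
    head = f"::{level} {','.join(props)}" if props else f"::{level}"
    return f"{head}::{_escape_data(message)}"


print(github_annotation(
    "error",
    "Must use helper.connector_logger",
    file="src/connector.py",
    line=42,
    title="VC503",
))
```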

### JSON output for scripting

```bash
connector-linter check ./connector --format json | jq '.score_pct'
```

The JSON output includes:
- `results`: array of check results with `code`, `passed`, `message`, `severity`, `suggestion`
- `summary`: `total`, `passed`, `failed`, `errors`, `warnings`
- `score_pct`: percentage score (0–100)
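
A script consuming this output might look like the following sketch. Only the key names come from the list above; the sample payload, the value formats (for example lowercase severity strings) and the helper name are assumptions, so check them against real output before relying on them.

```python
import json

# Hypothetical payload shaped after the documented keys
sample_output = """
{
  "results": [
    {"code": "VC101", "passed": true, "message": "ok",
     "severity": "error", "suggestion": null},
    {"code": "VC503", "passed": false, "message": "deprecated logger",
     "severity": "error", "suggestion": "Use helper.connector_logger"}
  ],
  "summary": {"total": 2, "passed": 1, "failed": 1, "errors": 1, "warnings": 0},
  "score_pct": 50
}
"""


def failed_codes(report: dict) -> list[str]:
    """Return the sorted, de-duplicated codes of all failed results."""
    return sorted({r["code"] for r in report["results"] if not r["passed"]})


report = json.loads(sample_output)
print("score:", report["score_pct"])
print("failed:", failed_codes(report))
```

In a pipeline, `sample_output` would be replaced by the captured stdout of `connector-linter check ... --format json`.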
