| 1 | +--- |
| 2 | +id: api_validation |
| 3 | +title: OpenAPI Schema Validation |
| 4 | +sidebar_position: 5 |
| 5 | +--- |
| 6 | + |
| 7 | +The Application Security Component can validate incoming HTTP requests against an [OpenAPI 3](https://swagger.io/specification/) schema you provide. Requests that do not conform to the schema (unknown route, unexpected method, missing or malformed parameters, invalid request body, missing/invalid authentication credentials, …) can be rejected before they ever reach the protected application. |
| 8 | + |
| 9 | +This is a positive-security model layered on top of the negative-security model implemented by the WAF rules: instead of describing what an attacker looks like, you describe what a valid client looks like and reject everything else. |
| 10 | + |
| 11 | +## How it works |
| 12 | + |
| 13 | +Schema validation is exposed through the [hooks](hooks.md) system: |
| 14 | + |
| 15 | +- An `on_load` hook loads one or more OpenAPI schemas at startup, each under a short string `ref`. |
| 16 | +- A `pre_eval` hook calls `ValidateRequestWithSchema(ref)` to validate the current request. The function returns `true` when the request is valid, `false` otherwise. |
| 17 | +- When validation fails, structured details about the failure are published to `hook_vars` so the same hook (or a later one) can build a meaningful drop reason, enrich an event, etc. |
| 18 | + |
| 19 | +## Storing schemas |
| 20 | + |
| 21 | +Schemas are loaded from the `schemas/` subdirectory of the CrowdSec [`data_dir`](/configuration/crowdsec_configuration.md#data_dir) (typically `/var/lib/crowdsec/data/schemas/`). |
| 22 | + |
| 23 | +Filenames passed to the loader **must be relative** to that directory. |
| 24 | + |
| 25 | +``` |
| 26 | +/var/lib/crowdsec/data/schemas/ |
| 27 | +├── users-api.yaml |
| 28 | +└── billing-api.yaml |
| 29 | +``` |
| 30 | + |
| 31 | +OpenAPI 3.0 and Swagger schemas are accepted, in either YAML or JSON. |
| 32 | + |
| 33 | +## Loading schemas (`on_load`) |
| 34 | + |
| 35 | +Loading is done from an `on_load` hook using one of two helpers: |
| 36 | + |
| 37 | +| Helper | Description | |
| 38 | +| ------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | |
| 39 | +| `LoadAPISchemaWithName(ref str, filename str)` | Load `<data_dir>/schemas/<filename>` and register it under `ref`, with default policies. | |
| 40 | +| `LoadAPISchemaWithOptions(ref str, filename str, opts map)` | Same as above, but lets you override per-schema policies (see below). | |
| 41 | +| `RegisterAPISchemaBodyDecoder(content_type str, decoder str)` | Enable a non-default body decoder for a given Content-Type (see below). | |
| 42 | + |
| 43 | +`ref` is an arbitrary string you choose; you will use it later in `pre_eval` to refer to this schema. Each `ref` can be registered only once. |
| 44 | + |
| 45 | +```yaml |
| 46 | +name: custom/my-appsec-config |
| 47 | +inband_rules: |
| 48 | +  - crowdsecurity/base-config |
| 49 | +on_load: |
| 50 | +  - apply: |
| 51 | +      - LoadAPISchemaWithName("users_api", "users-api.yaml") |
| 52 | +      - LoadAPISchemaWithName("billing_api", "billing-api.yaml") |
| 53 | +``` |
| 54 | + |
| 55 | +If the schema file is missing, malformed, or not a valid OpenAPI 3 document, the datasource will fail to start and log the underlying error. |
| 56 | + |
| 57 | +### Schema options |
| 58 | + |
| 59 | +`LoadAPISchemaWithOptions` accepts the following keys, all strings: |
| 60 | + |
| 61 | +| Key | Values | Default | Effect | |
| 62 | +| -------------------------------- | ----------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | |
| 63 | +| `on_route_not_found` | `drop` / `ignore` | `drop` | What to do when no path in the schema matches the request URL. | |
| 64 | +| `on_method_not_allowed` | `drop` / `ignore` | `drop` | What to do when a path matches but the method does not (e.g. schema only declares `GET`, request is `POST`). | |
| 65 | +| `on_unsupported_security_scheme` | `drop` / `ignore` | `drop` | What to do when an unsupported security scheme is encountered (`oauth2`, `openIdConnect`). If `ignore`, that security scheme is not validated when checking a request. | |
| 66 | + |
| 67 | +`drop` (the default) treats the condition as a validation failure: `ValidateRequestWithSchema` returns `false` and the validation error is surfaced via `hook_vars`. `ignore` lets the request through the validator without inspecting that aspect (the function returns `true`), which is useful when your schema only covers a subset of your API. |
| 68 | + |
| 69 | +```yaml |
| 70 | +on_load: |
| 71 | +  - apply: |
| 72 | +      - > |
| 73 | +        LoadAPISchemaWithOptions("public_api", "public-api.yaml", { |
| 74 | +          "on_route_not_found": "ignore", |
| 75 | +          "on_method_not_allowed": "drop", |
| 76 | +        }) |
| 77 | +``` |
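
As a further sketch, a schema that declares `oauth2` or `openIdConnect` schemes could be loaded with `on_unsupported_security_scheme` set to `ignore`, so that those schemes are skipped while the route and method policies keep their defaults. The `partner_api` ref and the `partner-api.yaml` filename are hypothetical.

```yaml
on_load:
  - apply:
      # Hypothetical ref and filename; only the security-scheme policy is overridden.
      - >
        LoadAPISchemaWithOptions("partner_api", "partner-api.yaml", {
          "on_unsupported_security_scheme": "ignore"
        })
```

With this setting the validator does not check credentials for operations guarded by an unsupported scheme, so that check has to happen elsewhere.
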
| 78 | + |
| 79 | +### Body decoders |
| 80 | + |
| 81 | +The validator uses the request `Content-Type` to pick a decoder for the body. By default, only the following Content-Types are decoded: |
| 82 | + |
| 83 | +- `application/json` and the JSON variants `application/json-patch+json`, `application/merge-patch+json`, `application/ld+json`, `application/hal+json`, `application/vnd.api+json`, `application/problem+json` |
| 84 | +- `application/x-www-form-urlencoded` |
| 85 | +- `multipart/form-data` |
| 86 | + |
| 87 | +A request whose Content-Type is not in this list will fail validation if the matching operation in the schema declares a request body. |
| 88 | + |
| 89 | +To enable validation of additional Content-Types, register a decoder from `on_load`: |
| 90 | + |
| 91 | +```yaml |
| 92 | +on_load: |
| 93 | +  - apply: |
| 94 | +      - RegisterAPISchemaBodyDecoder("application/yaml", "yaml") |
| 95 | +      - RegisterAPISchemaBodyDecoder("text/csv", "csv") |
| 96 | +``` |
| 97 | + |
| 98 | +Available decoder names: |
| 99 | + |
| 100 | +| Decoder | Use for | |
| 101 | +| ------------ | ----------------------------------------------------- | |
| 102 | +| `json` | JSON payloads | |
| 103 | +| `urlencoded` | `application/x-www-form-urlencoded` | |
| 104 | +| `multipart` | `multipart/form-data` | |
| 105 | +| `yaml` | YAML payloads | |
| 106 | +| `csv` | CSV payloads | |
| 107 | +| `plain` | `text/plain` | |
| 108 | +| `file` | Raw binary uploads (`application/octet-stream`, etc.) | |
| 109 | + |
| 110 | +:::warning |
| 111 | +Body decoders are registered process-wide. If you run several AppSec datasources in the same CrowdSec process, they share the same set of registered decoders. |
| 112 | +::: |
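
For an API that accepts plain-text or raw binary bodies, the same helper could be used with the `plain` and `file` decoders listed in the table above. This is only a sketch: the Content-Type pairings follow the "Use for" column, and whether your schema needs them depends on the request bodies it declares.

```yaml
on_load:
  - apply:
      # Decode text/plain request bodies with the `plain` decoder.
      - RegisterAPISchemaBodyDecoder("text/plain", "plain")
      # Handle raw binary uploads with the `file` decoder.
      - RegisterAPISchemaBodyDecoder("application/octet-stream", "file")
```
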
| 113 | + |
| 114 | +## Validating requests (`pre_eval`) |
| 115 | + |
| 116 | +In a `pre_eval` hook, call `ValidateRequestWithSchema(ref)` with the `ref` you used at load time. It returns `true` if the request matches the schema, `false` otherwise. |
| 117 | + |
| 118 | +| Helper | Type | Description | |
| 119 | +| --------------------------- | -------------------- | -------------------------------------------------------------------------------------------------- | |
| 120 | +| `ValidateRequestWithSchema` | `func(ref str) bool` | Validate the current request against the schema registered under `ref`. Returns `true` on success. | |
| 121 | + |
| 122 | +A typical pattern is to fail closed — on validation failure, drop the request and use the failure details to build a human-readable reason: |
| 123 | + |
| 124 | +```yaml |
| 125 | +name: custom/my-appsec-config |
| 126 | +on_load: |
| 127 | +  - apply: |
| 128 | +      - LoadAPISchemaWithName("users_api", "users-api.yaml") |
| 129 | +inband: |
| 130 | +  pre_eval: |
| 131 | +    - filter: req.URL.Path startsWith "/users" && !ValidateRequestWithSchema("users_api") |
| 132 | +      apply: |
| 133 | +        - | |
| 134 | +          DropRequest("schema validation failed: " + hook_vars.validation_error_message) |
| 135 | +``` |
| 136 | + |
| 137 | +You can also use the result to pick a softer remediation, send a custom event, etc. |
| 138 | + |
| 139 | +### Validation result variables |
| 140 | + |
| 141 | +When `ValidateRequestWithSchema` returns `false`, the following keys are set on `hook_vars`. They are available to the `apply` block of the same hook, to later hooks in the same request, and to `on_match` / `post_eval` hooks. The same keys are also propagated to the resulting CrowdSec event. |
| 142 | + |
| 143 | +| `hook_vars` key | Description | |
| 144 | +| --------------------------- | ---------------------------------------------------------------------------------------------------------------- | |
| 145 | +| `validation_error` | Full human-readable error string (combination of reason, field and message). | |
| 146 | +| `validation_error_reason` | Failure category — `parameter`, `request_body`, `security`, `route_not_found`, `method_not_allowed`, `internal`. | |
| 147 | +| `validation_error_field` | Name of the offending field (e.g. query parameter, header, body property) when applicable. | |
| 148 | +| `validation_error_message` | The underlying error message from the validator. | |
| 149 | +| `validation_error_value` | The offending value, truncated to 100 characters. | |
| 150 | +| `validation_error_expected` | Short description of what the schema expected (e.g. `type: integer, min: 18`). | |
| 151 | + |
| 152 | +On success these keys are absent. |
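
As an illustration of how these keys might be combined, the sketch below puts the failure category and the full error string into the drop reason. It assumes that `validation_error_reason` and `validation_error` are both set as strings on every failure; the wording of the reason is arbitrary.

```yaml
inband:
  pre_eval:
    - filter: req.URL.Path startsWith "/users" && !ValidateRequestWithSchema("users_api")
      apply:
        # Assumption: both keys are non-empty strings whenever validation fails.
        - |
          DropRequest("schema validation failed (" + hook_vars.validation_error_reason + "): " + hook_vars.validation_error)
```

Keys documented as set only "when applicable", such as `validation_error_field`, may need extra care before being concatenated into a string.
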
| 153 | + |
| 154 | +## Authentication |
| 155 | + |
| 156 | +If your OpenAPI schema declares a `security` requirement on an operation, the validator enforces it as part of validation. Failure to satisfy the security requirement is reported as a `security` reason in `hook_vars`. |
| 157 | + |
| 158 | +| Security scheme | Supported | Notes | |
| 159 | +| ------------------------- | --------- | ---------------------------------------------------------------------------------------------- | |
| 160 | +| `http` `basic` | Yes | Checks that an `Authorization: Basic …` header is present and non-empty. | |
| 161 | +| `http` `bearer` | Yes | Checks that an `Authorization: Bearer …` header is present and non-empty. | |
| 162 | +| `apiKey` (`header`) | Yes | Checks that the named header is present and non-empty. | |
| 163 | +| `apiKey` (`query`) | Yes | Checks that the named query parameter is present and non-empty. | |
| 164 | +| `apiKey` (`cookie`) | Yes | Checks that the named cookie is present and non-empty. | |
| 165 | +| `oauth2`, `openIdConnect` | No | A warning is logged at schema load. With the default `on_unsupported_security_scheme` policy (`drop`), any request guarded by such a scheme fails validation. | |
| 166 | + |
| 167 | +The validator only verifies that the credential **is present and well-formed** — it does not verify the credential against any backing store. |
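
For reference, a supported scheme is declared in the schema itself using standard OpenAPI 3 syntax. The fragment below is a minimal sketch of an `apiKey` scheme carried in a header; the scheme name `ApiKeyAuth` and the header name `X-API-Key` are hypothetical.

```yaml
openapi: 3.0.0
info:
  title: Users API
  version: "1.0.0"
components:
  securitySchemes:
    ApiKeyAuth:            # hypothetical scheme name
      type: apiKey
      in: header
      name: X-API-Key      # hypothetical header name
paths:
  /users:
    get:
      security:
        - ApiKeyAuth: []   # this operation requires the header above
      responses:
        "200":
          description: ok
```

Under the rules in the table, a `GET /users` request without a non-empty `X-API-Key` header would fail validation with reason `security`.
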
| 168 | + |
| 169 | +## End-to-end example |
| 170 | + |
| 171 | +`/var/lib/crowdsec/data/schemas/users-api.yaml`: |
| 172 | + |
| 173 | +```yaml |
| 174 | +openapi: 3.0.0 |
| 175 | +info: |
| 176 | +  title: Users API |
| 177 | +  version: "1.0.0" |
| 178 | +paths: |
| 179 | +  /users: |
| 180 | +    post: |
| 181 | +      requestBody: |
| 182 | +        required: true |
| 183 | +        content: |
| 184 | +          application/json: |
| 185 | +            schema: |
| 186 | +              type: object |
| 187 | +              required: [username, email] |
| 188 | +              additionalProperties: false |
| 189 | +              properties: |
| 190 | +                username: |
| 191 | +                  type: string |
| 192 | +                  minLength: 3 |
| 193 | +                  maxLength: 20 |
| 194 | +                email: |
| 195 | +                  type: string |
| 196 | +                  format: email |
| 197 | +      responses: |
| 198 | +        "201": |
| 199 | +          description: created |
| 200 | +``` |
| 201 | + |
| 202 | +AppSec configuration: |
| 203 | + |
| 204 | +```yaml |
| 205 | +name: custom/my-appsec-config |
| 206 | +on_load: |
| 207 | +  - apply: |
| 208 | +      - LoadAPISchemaWithName("users_api", "users-api.yaml") |
| 209 | +inband: |
| 210 | +  pre_eval: |
| 211 | +    - filter: req.URL.Path startsWith "/users" && !ValidateRequestWithSchema("users_api") |
| 212 | +      apply: |
| 213 | +        - | |
| 214 | +          DropRequest("API schema violation on '" + hook_vars.validation_error_field + "': " + hook_vars.validation_error_message) |
| 215 | +``` |
| 216 | + |
| 217 | +With this configuration: |
| 218 | + |
| 219 | +- `POST /users` with `{"username": "ab", "email": "x"}` is dropped (`username` too short, `email` malformed). |
| 220 | +- `POST /users` with a valid body passes validation and is then evaluated by the WAF rules as usual. |
| 221 | +- `GET /users` is dropped with reason `method_not_allowed` (default policy). |
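| 222 | +- `POST /users/42` is dropped with reason `route_not_found` (default policy), because no path in the schema matches. A request to `/admin` is not validated at all: the hook filter only calls the validator for paths starting with `/users`. |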
| 223 | + |
| 224 | +## Metrics |
| 225 | + |
| 226 | +Two Prometheus counters are exposed: |
| 227 | + |
| 228 | +| Metric | Labels | Description | |
| 229 | +| ----------------------------------- | ------------------------------------------------- | -------------------------------------------------------------- | |
| 230 | +| `cs_appsec_validation_ok_total` | `source`, `appsec_engine`, `schema_ref` | Requests that passed schema validation. | |
| 231 | +| `cs_appsec_validation_failed_total` | `source`, `appsec_engine`, `schema_ref`, `reason` | Requests that failed schema validation, broken down by reason. | |
| 232 | + |
| 233 | +`reason` values match `validation_error_reason`: `parameter`, `request_body`, `security`, `route_not_found`, `method_not_allowed`, `internal`. |
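
If Prometheus already scrapes CrowdSec's metrics, the failure counter can drive an alert. The rule file below is a sketch in standard Prometheus alerting-rule syntax; the group name, alert name, threshold, and time windows are hypothetical and should be tuned to your traffic.

```yaml
groups:
  - name: appsec-schema-validation          # hypothetical group name
    rules:
      - alert: AppsecSchemaValidationFailures
        # More than 1 failed validation per second, per schema and reason,
        # averaged over 5 minutes (hypothetical threshold).
        expr: sum by (schema_ref, reason) (rate(cs_appsec_validation_failed_total[5m])) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Schema validation failures on {{ $labels.schema_ref }} ({{ $labels.reason }})"
```

Grouping by `reason` separates a spike in `route_not_found` (often scanners or an incomplete schema) from a spike in `request_body` (often a client or schema mismatch).
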