diff --git a/.fleetControl/configurationDefinitions.yml b/.fleetControl/configurationDefinitions.yml index 2c4ea16e34..b82112290e 100644 --- a/.fleetControl/configurationDefinitions.yml +++ b/.fleetControl/configurationDefinitions.yml @@ -17,4 +17,5 @@ configurationDefinitions: description: Python agent configuration type: agent-config version: 1.0.0 -# will add schema information here later + schema: ./schemas/config.json + format: ini diff --git a/.fleetControl/schemaGeneration/README.md b/.fleetControl/schemaGeneration/README.md new file mode 100644 index 0000000000..0e54c9a6ac --- /dev/null +++ b/.fleetControl/schemaGeneration/README.md @@ -0,0 +1,291 @@ +# Agent Config Schema Generator + +This directory contains the Python scripts that walk +`newrelic.core.config.global_settings()` to produce a JSON Schema +(`../schemas/config.json`) and to manage version bumps in +`../configurationDefinitions.yml` for Fleet Control. + +## Files + +| File | Description | +| --- | --- | +| `generate-schema.py` | Per-push regenerator. Reads the live agent settings tree, writes `config.json`. Never touches `configurationDefinitions.yml`. | +| `bump-schema-version.py` | Release-time version bumper. Compares the schema at a prior git ref to the current schema and writes a new version into `configurationDefinitions.yml`. | +| `schema_diff.py` | Shared library (no `main`). Holds the diff classification (`classify_changes`), bump arithmetic (`recommend_bump`, `apply_bump`, `bump_version`), and schema loading (`load_existing`). Imported by both top-level scripts. | +| `dump-settings.py` | Dev helper. Lists every leaf in `global_settings()` and how it appears in the generated schema (or why it was excluded). Not part of the workflow. | +| `tests/test_generate_schema.py` | Tests for the generator (`infer_type`, `make_property`, `build_properties`, `generate_schema`, anyOf helpers). 
| +| `tests/test_schema_diff.py` | Tests for the shared library (`classify_changes`, `recommend_bump`, `apply_bump`, `bump_version`, `load_existing`). | +| `tests/test_bump_schema_version.py` | Tests for the bump script (parsing helpers + main bootstrap/happy paths with mocked `git_show`). | +| `../schemas/config.json` | Generated JSON Schema (Draft 2020-12). | +| `../configurationDefinitions.yml` | Fleet Control metadata, including the schema's semver version. Bumped only at release time. | + +## How the generator works + +The agent's live settings tree (`newrelic.core.config.global_settings()`) +is the source of truth for which keys exist, their types, and their +defaults. `newrelic/newrelic.ini` is consulted only for descriptions +(comments adjacent to `key = value` lines in the `[newrelic]` section). + +`generate-schema.py`: + +1. Imports `newrelic.core.config` and walks every leaf whose containing + class name ends in `Settings`. +2. Loads descriptions from `newrelic/newrelic.ini`. +3. For every leaf, applies (in order): `TYPE_OVERRIDES`, `ENUM_OVERRIDES`, + set-typed auto-anyOf, type inference from the live value. +4. Skips leaves whose path matches `EXCLUDE_KEYS` (exact or `prefix.*`). +5. Validates the result against the JSON Schema Draft 2020-12 meta-schema. +6. Deep-merges the freshly generated schema into the existing on-disk + `config.json` so the published schema only ever grows. +7. Writes `config.json` and prints a classified diff summary. + +The generator does **not** touch `configurationDefinitions.yml` -- +version bumps live in the next section. + +## How versioning works + +Schema regeneration runs **per push** on feature branches via +`.github/workflows/fleet-control-schema.yml`. It writes `config.json` +and nothing else. Reviewers see schema diffs in PRs. + +Version bumps run **manually before each release** via +`.github/workflows/fleet-control-schema-bump.yml`, which is +`workflow_dispatch`-only. The bump workflow: + +1. 
Finds the latest `v*` tag on `main` (overridable via the + `since_ref` workflow input). +2. Reads the historical `configurationDefinitions.yml` from that tag -- + the version stored there is the **starter version** for the bump. +3. Reads the historical schema using the path declared in that file's + `schema:` field. +4. Compares the historical schema to the current `config.json` on `main`, + classifies the cumulative diff, and applies the recommended bump kind + (major/minor/patch). +5. Opens a PR titled `chore: bump agent config schema version` for team + review. + +If the latest release tag predates the schema (the `.fleetControl/` +directory or the `schema:` field in `configurationDefinitions.yml`), +`bump-schema-version.py` exits 0 with a bootstrap message and no PR +is opened. The first release that includes the schema ships at whatever +version is currently in `configurationDefinitions.yml`. + +### Release ordering -- run the bump workflow before cutting the tag + +The bump PR is a separate review/merge step from the agent's `vX.Y.Z` +release tag. Run the workflows in this order: + +1. Trigger `Fleet Control Config Schema Bump` (manual `workflow_dispatch`). +2. Wait for the PR to open (or the workflow to report that no bump is needed). +3. Review and merge the bump PR if one was opened. +4. Cut the GitHub Release from the post-merge `main`. + +If the release tag is cut before the bump PR merges, the tag's +`configurationDefinitions.yml` will still say the pre-bump version, +even though the schema itself (`config.json`) at that tag reflects the +new keys. Consumers see a mismatch. The next release will compute its +bump correctly from this tag's metadata, but the tag itself ships +mismatched. 
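To make step 4 of the bump workflow concrete, here is a minimal sketch of how a cumulative set of classified changes might be reduced to a single bump kind and applied to a semver string. The names `SKETCH_SEVERITY_ORDER`, `sketch_recommend_bump`, and `sketch_apply_bump` are hypothetical stand-ins for illustration only; the real `recommend_bump` and `apply_bump` live in `schema_diff.py` and may behave differently.

```python
# Hypothetical sketch, not the code in schema_diff.py.
# Bump kinds ordered from least to most severe.
SKETCH_SEVERITY_ORDER = ["none", "patch", "minor", "major"]


def sketch_recommend_bump(change_kinds):
    """Return the most severe bump kind found in `change_kinds`.

    `change_kinds` is an iterable of "patch" / "minor" / "major" strings,
    one per classified change. An empty iterable yields "none".
    """
    highest = "none"
    for kind in change_kinds:
        if SKETCH_SEVERITY_ORDER.index(kind) > SKETCH_SEVERITY_ORDER.index(highest):
            highest = kind
    return highest


def sketch_apply_bump(version, bump):
    """Apply a semver bump to a 'MAJOR.MINOR.PATCH' version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    # "none": leave the version untouched.
    return version
```

Under these assumptions, a diff containing one cosmetic change and one additive change resolves to a minor bump, so `sketch_apply_bump("1.0.0", sketch_recommend_bump(["patch", "minor"]))` gives `"1.1.0"`.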
+ +## Quick start + +Regenerate the schema, run tests, and surface excluded settings in one +command: + +```bash +tox -e fleet-schema +``` + +This runs the unit tests, regenerates `.fleetControl/schemas/config.json` +(deep-merged into the existing schema), prints the classified diff, and +dumps any settings that didn't make it into the schema. Exit code matches +`generate-schema.py`: `0` if no schema changes, `1` if the schema changed +(commit before pushing), `2` on a hard failure. + +> **Run with a clean shell.** `import newrelic.core.config` reads +> `NEW_RELIC_*` env vars at import time and bakes them into the live +> defaults. The tox env unsets them defensively; if you invoke the +> generator directly, do the same (`env -i PATH="$PATH" HOME="$HOME" +> python3 ...`). + +### Lower-level commands + +If you want to invoke a single step directly without going through tox: + +```bash +# Regenerate schema only (from repo root) +python3 .fleetControl/schemaGeneration/generate-schema.py + +# Force-regenerate without comparing to existing on-disk schema +python3 .fleetControl/schemaGeneration/generate-schema.py --force + +# Dry-run a release-time bump against a tag +python3 .fleetControl/schemaGeneration/bump-schema-version.py --since=v10.21.0 + +# Apply a release-time bump (writes configurationDefinitions.yml) +python3 .fleetControl/schemaGeneration/bump-schema-version.py --since=v10.21.0 --ci + +# Dump every live setting alongside how it appears in the schema +python3 .fleetControl/schemaGeneration/dump-settings.py + +# Filter the dump to settings missing from the schema +python3 .fleetControl/schemaGeneration/dump-settings.py --missing +``` + +## Adding new configuration keys + +When new settings land in `newrelic/core/config.py`, the generator +picks them up automatically on the next push -- no manual schema edit +needed for the common case. + +**Special handling is required for certain key types**, configured via +override maps in `generate-schema.py`. 
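As a rough illustration of how the override maps could interact with type inference, the sketch below follows the precedence listed under "How the generator works" (type override, then enum override, then set-typed auto-anyOf, then inference from the live value). Everything prefixed `sketch_` or `SKETCH_` is hypothetical, including the sorted-list default for sets; the generator's real `make_property` and `infer_type` may differ in detail.

```python
# Hypothetical sketch of override precedence, not the generator's actual code.
SKETCH_TYPE_OVERRIDES = {"proxy_user": {"type": "string"}}
SKETCH_ENUM_OVERRIDES = {"transaction_tracer.record_sql": ["off", "raw", "obfuscated"]}


def sketch_infer_type(value):
    """Map a live Python value to a JSON Schema type name, or None."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, (list, tuple)):
        return "array"
    return None


def sketch_make_property(path, value):
    """Build a schema property for one leaf, or return None to skip it."""
    # 1. An explicit type override wins outright.
    if path in SKETCH_TYPE_OVERRIDES:
        return dict(SKETCH_TYPE_OVERRIDES[path])
    # 2. An enum override presents the setting as a constrained string.
    if path in SKETCH_ENUM_OVERRIDES:
        prop = {"type": "string", "enum": list(SKETCH_ENUM_OVERRIDES[path])}
        if value in prop["enum"]:
            prop["default"] = value
        return prop
    # 3. Set-typed live values accept an array or a delimited string.
    if isinstance(value, (set, frozenset)):
        return {
            "anyOf": [{"type": "array", "items": {"type": "string"}}, {"type": "string"}],
            "default": sorted(value),  # assumed: sets serialized as sorted lists
        }
    # 4. Otherwise infer from the live value; unknown types are skipped.
    inferred = sketch_infer_type(value)
    if inferred is None:
        return None
    return {"type": inferred, "default": value}
```

In this sketch a `None` default with no override returns `None` (the leaf is skipped), which mirrors why such settings need a `TYPE_OVERRIDES` entry to appear in the schema.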
+ +### Array-or-string keys (`_environ_as_set`-backed) + +Many agent config keys accept either a structured array OR a delimited +string (the INI form): + +```ini +attributes.include = request.parameters.* response.headers.content-type +``` + +These keys are parsed via `_environ_as_set` / +`_environ_as_comma_separated_set` in `newrelic/core/config.py`. For the +JSON Schema to correctly represent both forms, set-typed live values +auto-detect into the right shape -- you don't need a per-key entry +unless the live default is empty (in which case `set` vs. `list` cannot +be distinguished from the value alone). Empty defaults need an explicit +override: + +```python +'new_feature.include': string_array_or_delimited(default=[]), +'new_feature.exclude': string_array_or_delimited(default=[]), +``` + +The auto-detection covers the long tail (the seven `*.attributes.*` +subtrees, `opentelemetry.traces.*`, etc.) because their live values +arrive as Python `set` objects. + +### Status code keys + +Keys that accept integers, arrays of integers, or range strings (e.g., +`"100-102 200-208 226 300-308 404"`) should use: + +```python +'error_collector.new_status_codes': status_code_array_or_range(), +'error_collector.new_status_codes_with_default': status_code_array_or_range(default=[404]), +``` + +### Enum keys + +Keys with a fixed set of allowed values should be added to `ENUM_OVERRIDES`: + +```python +ENUM_OVERRIDES = { + 'new_feature.mode': ['option1', 'option2', 'option3'], +} +``` + +### None-defaulted leaves + +Settings whose live default is `None` cannot have their type inferred, +so they're skipped from the schema with a warning. 
To surface them, add +an explicit type in `TYPE_OVERRIDES`: + +```python +'proxy_user': {'type': 'string'}, +``` + +## Excluding keys + +Add keys to `EXCLUDE_KEYS` to drop them from the schema: + +```python +EXCLUDE_KEYS = { + 'agent_run_id', # exact match + 'cross_application_tracer.*', # subtree exclusion +} +``` + +The `.*` suffix matches both the prefix itself and any descendant. + +## Checklist for new config keys + +1. Add the setting to `newrelic/core/config.py` as you would any other. +2. **Run the generator locally** (`python3 .fleetControl/schemaGeneration/generate-schema.py`) to pick up the new key. +3. **Check the inferred type** in the generated schema. +4. **If the live default is `None`** → add an entry to `TYPE_OVERRIDES`. +5. **If the key uses `_environ_as_set` and the default is empty** → add to `TYPE_OVERRIDES` with `string_array_or_delimited()`. +6. **If the key has enum values** → add to `ENUM_OVERRIDES`. +7. **If the key should be hidden** → add to `EXCLUDE_KEYS`. +8. **Run the generator again**; verify the schema entry looks correct. +9. **Run the tests** (`python3 -m unittest discover .fleetControl/schemaGeneration/tests`). +10. The next release will pick up the bump when the maintainer runs the + bump workflow as part of release prep. + +## CLI options + +### Generator CLI (`generate-schema.py`) + +| Option | Description | +| --- | --- | +| `--force` | Overwrite the schema without comparing to the existing one. Always exits 0. | + +### Bumper CLI (`bump-schema-version.py`) + +| Option | Description | +| --- | --- | +| `--since=<ref>` | Required. Compare the current schema to the schema at `<ref>` and recommend a bump. | +| `--ci` | Write the bumped version to `configurationDefinitions.yml`. Without this, the script just prints the recommendation. | + +## Exit codes + +### Generator exit codes (`generate-schema.py`) + +| Code | Meaning | +| --- | --- | +| 0 | No schema changes (or first run, or `--force` mode). 
| +| 1 | Schema regenerated and on-disk differed (CI should commit). | +| 2 | Generator failure (invalid schema, malformed inputs). | + +### Bumper exit codes (`bump-schema-version.py`) + +| Code | Meaning | +| --- | --- | +| 0 | No bump needed (no schema diff, or bootstrap case where `<ref>` predates the schema). | +| 1 | Bump applied (`--ci`) or recommended (without `--ci`). | +| 2 | Bump failure (uncaught exception, missing args, malformed historical inputs). | + +## Version bumping rules + +`bump-schema-version.py` classifies each schema change; the bump +kind is the highest severity across all changes: + +| Change type | Severity | Bump | +| --- | --- | --- | +| Property removed | Breaking | Major | +| Type changed | Breaking | Major | +| Enum value removed | Breaking | Major | +| Enum newly introduced | Breaking | Major | +| Required field added | Breaking | Major | +| `additionalProperties` tightened (true → false) | Breaking | Major | +| Property added | Additive | Minor | +| Enum value added | Additive | Minor | +| Enum removed entirely | Additive | Minor | +| Required field removed | Additive | Minor | +| Default changed | Additive | Minor | +| `additionalProperties` loosened (false → true) | Additive | Minor | +| Description changed | Cosmetic | Patch | + +## Running the tests + +```bash +# All schema-generation tests in one shot +python3 -m unittest discover .fleetControl/schemaGeneration/tests + +# Individual files +python3 -m unittest .fleetControl.schemaGeneration.tests.test_generate_schema +python3 -m unittest .fleetControl.schemaGeneration.tests.test_schema_diff +python3 -m unittest .fleetControl.schemaGeneration.tests.test_bump_schema_version +``` diff --git a/.fleetControl/schemaGeneration/bump-schema-version.py b/.fleetControl/schemaGeneration/bump-schema-version.py new file mode 100644 index 0000000000..cfae39b465 --- /dev/null +++ b/.fleetControl/schemaGeneration/bump-schema-version.py @@ -0,0 +1,194 @@ +# Copyright 2010 New Relic, Inc. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Fleet Control Config Schema Version Bumper -- Python Agent + +Reads the schema and metadata at a prior git ref (typically the latest +release tag), diffs against the current schema, classifies the cumulative +changes, and bumps the version in .fleetControl/configurationDefinitions.yml. + +Splits the bump responsibility out of generate-schema.py so the +per-push regen workflow doesn't churn the version on every PR. + +Run standalone: + python3 bump-schema-version.py --since=v10.21.0 # dry-run + python3 bump-schema-version.py --since=v10.21.0 --ci # apply + +Bootstrap case: + If the given ref predates .fleetControl/ entirely, OR predates the + `schema:` field in configurationDefinitions.yml, the script exits 0 + with a "bootstrap" message and writes nothing. The first release that + ships the schema does so at whatever version is currently in + configurationDefinitions.yml. 
+ +Exit codes: + 0 -- no bump needed (no schema diff, or bootstrap case) + 1 -- bump applied (--ci) or recommended (without --ci) + 2 -- hard failure (missing args, malformed inputs, git invocation failed) +""" + +import argparse +import json +import re +import subprocess +import sys +from pathlib import Path + +SCRIPT_DIR = Path(__file__).resolve().parent +FLEET_CONTROL_DIR = SCRIPT_DIR.parent +REPO_ROOT = FLEET_CONTROL_DIR.parent +SCHEMA_PATH = FLEET_CONTROL_DIR / "schemas" / "config.json" +CONFIG_DEF_PATH = FLEET_CONTROL_DIR / "configurationDefinitions.yml" + +# Reuse the diff helpers and version-bump rewriter from the shared module. +sys.path.insert(0, str(SCRIPT_DIR)) +from schema_diff import apply_bump, bump_version, classify_changes, print_changes, recommend_bump # noqa: E402 + +# Path inside the repo (POSIX, since git uses forward slashes regardless +# of platform). Used when invoking `git show <ref>:<path>`. +CONFIG_DEF_GIT_PATH = ".fleetControl/configurationDefinitions.yml" + + +def git_show(ref, path): + """Return file contents at the given ref, or None if the path is + absent at that ref. Raises on any other git error. + """ + try: + # S603/S607: invoking git with a partial path is intentional for a + # CI tool that runs in environments where git is always on PATH. + # The `ref` and `path` arguments are validated by git itself and + # are not used as shell input (no shell=True). + result = subprocess.run( # noqa: S603 + ["git", "show", f"{ref}:{path}"], # noqa: S607 + cwd=REPO_ROOT, + capture_output=True, + text=True, + check=False, + ) + except FileNotFoundError as exc: + raise RuntimeError("git executable not found on PATH") from exc + + if result.returncode == 0: + return result.stdout + + # Distinguish "path didn't exist at this ref" (which is the bootstrap + # signal we want to handle gracefully) from any other git failure. 
+ stderr = result.stderr or "" + if "exists on disk, but not in" in stderr or "does not exist" in stderr or "fatal: path" in stderr: + return None + raise RuntimeError(f"git show {ref}:{path} failed: {stderr.strip()}") + + +# Used to find the schema-relative path inside an unparsed YAML blob. This +# avoids pulling in a YAML dependency for what is a single-line read. +_SCHEMA_LINE_RE = re.compile(r"(?m)^\s*schema:\s*(\S+)\s*$") + + +def parse_schema_path(yaml_text): + """Return the path string from the `schema:` line in + configurationDefinitions.yml, or None if no such line exists. + """ + m = _SCHEMA_LINE_RE.search(yaml_text) + return m.group(1) if m else None + + +def historical_schema_path_in_repo(schema_field): + """Translate a schema: value (relative to .fleetControl/) into a + repo-root-relative POSIX path suitable for `git show`. + """ + schema_field = schema_field[2:] if schema_field.startswith("./") else schema_field + return f".fleetControl/{schema_field}" + + +def main(argv=None): + parser = argparse.ArgumentParser(description="Compute and optionally apply a schema version bump.") + parser.add_argument( + "--since", required=True, metavar="REF", help="Git ref (tag or commit) to compare the current schema against." + ) + parser.add_argument( + "--ci", + action="store_true", + help="Write the bumped version into configurationDefinitions.yml. " + "Without this flag the script just prints the recommendation.", + ) + args = parser.parse_args(argv) + + # Step 1: read configurationDefinitions.yml at the historical ref. + historical_def = git_show(args.since, CONFIG_DEF_GIT_PATH) + if historical_def is None: + print(f"Bootstrap: {CONFIG_DEF_GIT_PATH} did not exist at {args.since}. No bump computed.") + return 0 + + # Step 2: extract the historical schema path. If the schema: field + # was added later, that is also a bootstrap case. + schema_field = parse_schema_path(historical_def) + if schema_field is None: + print(f"Bootstrap: configurationDefinitions.yml at {args.since} has no `schema:` field. 
No bump computed.") + return 0 + + historical_schema_path = historical_schema_path_in_repo(schema_field) + + # Step 3: read the historical schema. Same bootstrap treatment if the + # path is absent (e.g. schema: was set but the file hadn't landed + # yet at the reference). + historical_schema_text = git_show(args.since, historical_schema_path) + if historical_schema_text is None: + print(f"Bootstrap: {historical_schema_path} did not exist at {args.since}. No bump computed.") + return 0 + + try: + old_schema = json.loads(historical_schema_text) + except json.JSONDecodeError as e: + print(f"error: historical schema at {args.since} is not valid JSON: {e}", file=sys.stderr) + return 2 + + # Step 4: read the current schema from disk. + if not SCHEMA_PATH.exists(): + print(f"error: current schema not found at {SCHEMA_PATH}", file=sys.stderr) + return 2 + new_schema = json.loads(SCHEMA_PATH.read_text(encoding="utf-8")) + + # Step 5: classify changes and recommend a bump. + changes = classify_changes(old_schema, new_schema) + print_changes(changes, header=f"Schema changes since {args.since}") + + bump = recommend_bump(changes) + + # Step 6: apply (or dry-run print) the bump against the *current* + # configurationDefinitions.yml -- not the historical one. The + # historical doc is only the source of the diff baseline. + old_v, new_v = bump_version(CONFIG_DEF_PATH, bump, args.ci) + + if bump == "none": + print(f"\nNo bump needed (current version: {old_v}).") + return 0 + + if args.ci: + if new_v == old_v: + # apply_bump returns the input version when bump == 'none'; + # this branch covers the rare case where bump != 'none' but + # the version was already at the bumped value (manual edit). + print(f"\nRecommended bump: {bump}, but {old_v} already reflects it. No write.") + return 0 + print(f"\nApplied bump: {bump} ({old_v} -> {new_v})") + print(f"Wrote: {CONFIG_DEF_PATH}") + else: + print(f"\nRecommended bump: {bump} ({old_v} -> {apply_bump(old_v, bump)}). 
Re-run with --ci to apply.") + + return 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.fleetControl/schemaGeneration/dump-settings.py b/.fleetControl/schemaGeneration/dump-settings.py new file mode 100644 index 0000000000..a4699b28da --- /dev/null +++ b/.fleetControl/schemaGeneration/dump-settings.py @@ -0,0 +1,213 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Dev helper: dump every leaf in the agent's live settings tree alongside +how it shows up in .fleetControl/schemas/config.json. + +This isn't part of the schema-generation workflow -- it's a debugging +tool so you can spot-check what made it into the schema and why +something didn't. + +Run from the repo root: + + python3 .fleetControl/schemaGeneration/dump-settings.py + +Each row is one setting in the agent's `global_settings()` tree: + + PATH LIVE SCHEMA + ---- ---- ------ + account_id NoneType -- EXCLUDED (server-set) + app_name str='Python App...' string, default='Python Application' REQ + license_key NoneType string, minLength=1 [hardcoded] REQ + log_level int=20 string, enum, default='info' [enum override] + +Filters: + --missing Only show settings that did NOT make it into the schema. + --grep PATTERN Only show paths matching the substring (case-insensitive). 
+""" + +import argparse +import importlib.util +import json +import sys +from pathlib import Path + +SCRIPT_DIR = Path(__file__).resolve().parent +FLEET_CONTROL_DIR = SCRIPT_DIR.parent +REPO_ROOT = FLEET_CONTROL_DIR.parent +SCHEMA_PATH = FLEET_CONTROL_DIR / "schemas" / "config.json" + +# Make the agent importable. +sys.path.insert(0, str(REPO_ROOT)) + +# Load generate-schema.py so we can reuse its walk/exclude logic and the +# override tables. The hyphen in the filename means we can't just import +# it as a module -- importlib.util is the standard workaround. +_GEN_PATH = SCRIPT_DIR / "generate-schema.py" +_spec = importlib.util.spec_from_file_location("gen", _GEN_PATH) +gen = importlib.util.module_from_spec(_spec) +_spec.loader.exec_module(gen) + + +def _truncate(s, n=24): + s = str(s) + return s if len(s) <= n else s[: n - 3] + "..." + + +def _live_summary(value): + """Compact "type=value" string for the live default column.""" + t = type(value).__name__ + if value is None: + return t + if isinstance(value, str): + return f"{t}={_truncate(repr(value))}" + if isinstance(value, (list, set, tuple, dict)): + return f"{t}={_truncate(repr(value))}" + return f"{t}={value!r}" + + +def _schema_summary(prop): + """Compact one-line summary of a schema property.""" + parts = [] + t = prop.get("type") + if t: + parts.append(t) + if "anyOf" in prop: + # Render anyOf as the union of its branch types so the column + # reads like "anyOf=array(string)|string" rather than collapsing + # to just the default. 
+ branches = [] + for branch in prop["anyOf"]: + bt = branch.get("type", "?") + if bt == "array": + inner = branch.get("items", {}).get("type", "?") + branches.append(f"array({inner})") + else: + branches.append(bt) + parts.append("anyOf=" + "|".join(branches)) + if "enum" in prop: + parts.append(f"enum={prop['enum']}") + if "default" in prop: + parts.append(f"default={prop['default']!r}") + if "minLength" in prop: + parts.append(f"minLength={prop['minLength']}") + if "items" in prop: + parts.append(f"items={prop['items'].get('type', '?')}") + return ", ".join(parts) or "(empty)" + + +def _why_missing(path): + """Explain why a path didn't end up in the schema.""" + if gen.is_excluded(path, gen.EXCLUDE_KEYS): + # Find which exclude entry matched, to surface the rationale. + for entry in gen.EXCLUDE_KEYS: + if entry == path: + return f"-- EXCLUDED (matched: {entry})" + if entry.endswith(".*"): + prefix = entry[:-2] + if path == prefix or path.startswith(prefix + "."): + return f"-- EXCLUDED (matched: {entry})" + return "-- EXCLUDED" + return "-- SKIPPED (None default + no TYPE_OVERRIDE; add one to surface it)" + + +def _override_note(path, value): + """Tag schema entries with the override that produced them, if any.""" + if path == "license_key": + return "[hardcoded]" + if path in gen.TYPE_OVERRIDES: + return "[type override]" + if path in gen.ENUM_OVERRIDES: + return "[enum override]" + if path == "log_level": + return "[log_level int->str]" + return "" + + +def main(argv=None): + parser = argparse.ArgumentParser(description="Dump live settings and how they appear in the generated schema.") + parser.add_argument( + "--missing", action="store_true", help="Only show settings absent from the schema (excluded or skipped)." + ) + parser.add_argument( + "--grep", metavar="PATTERN", help="Filter to paths containing PATTERN (case-insensitive substring)." + ) + args = parser.parse_args(argv) + + if not SCHEMA_PATH.exists(): + print(f"error: {SCHEMA_PATH} not found. 
Run generate-schema.py first.", file=sys.stderr) + return 2 + schema = json.loads(SCHEMA_PATH.read_text(encoding="utf-8")) + schema_props = schema.get("properties", {}) + required = set(schema.get("required", [])) + + settings = gen.load_settings() + + rows = [] + for path, value in gen.walk_settings(settings): + in_schema = path in schema_props + if args.missing and in_schema: + continue + if args.grep and args.grep.lower() not in path.lower(): + continue + + live = _live_summary(value) + if in_schema: + schema_col = _schema_summary(schema_props[path]) + note = _override_note(path, value) + if note: + schema_col = f"{schema_col} {note}" + else: + schema_col = _why_missing(path) + + if path in required: + schema_col += " REQ" + + rows.append((path, live, schema_col)) + + # license_key gets a hardcoded entry that walk_settings won't yield + # if its live value is None (it does have a property in the schema + # though). Surface it explicitly so the dump shows the override row. + if not args.missing and (not args.grep or args.grep.lower() in "license_key"): + if "license_key" in schema_props and not any(r[0] == "license_key" for r in rows): + rows.append( + ( + "license_key", + "NoneType", + f"{_schema_summary(schema_props['license_key'])} [hardcoded]" + + (" REQ" if "license_key" in required else ""), + ) + ) + rows.sort() + + if not rows: + print("(no rows match)") + return 0 + + # Compute column widths from the data, capped to keep the output usable. 
+ path_w = min(max(len(r[0]) for r in rows), 60) + live_w = min(max(len(r[1]) for r in rows), 30) + + print(f"{'PATH'.ljust(path_w)} {'LIVE'.ljust(live_w)} SCHEMA") + print(f"{'-' * path_w} {'-' * live_w} ------") + for path, live, schema_col in rows: + print(f"{path.ljust(path_w)} {live.ljust(live_w)} {schema_col}") + + print(f"\n{len(rows)} rows", file=sys.stderr) + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.fleetControl/schemaGeneration/generate-schema.py b/.fleetControl/schemaGeneration/generate-schema.py new file mode 100644 index 0000000000..4162db1074 --- /dev/null +++ b/.fleetControl/schemaGeneration/generate-schema.py @@ -0,0 +1,769 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Fleet Control Config Schema Generator -- Python Agent + +Walks `newrelic.core.config.global_settings()` and writes JSON Schema +Draft 2020-12 to .fleetControl/schemas/config.json. + +Source of truth: + The agent's live settings tree (`global_settings()`) is the source of + truth for *which* keys exist, their *types*, and their *default values*. + Descriptions still come from `newrelic/newrelic.ini` because that is + the only place the agent ships human-readable explanations of the + settings; settings not documented there are emitted without a + description. 
+ +Environment requirement: + `import newrelic.core.config` reads NEW_RELIC_* environment variables + to populate certain defaults at import time (NEW_RELIC_LICENSE_KEY, + NEW_RELIC_LOG, NEW_RELIC_ENABLED, etc.). RUN THIS GENERATOR WITH NO + NEW_RELIC_* VARIABLES SET, otherwise their values leak into the + generated schema's defaults. The CI workflow unsets them explicitly + before invocation; locally, ensure your shell does not export any. + +Merge behavior: + The generator never starts fresh -- each run deep-merges the freshly + generated schema into whatever already exists at config.json. Properties + are union'd (keys present only in the old schema are preserved); leaf + nodes and the top-level `required` list take the new run's values. This + guarantees the published schema only ever grows, so a config that + validated against an older agent's schema continues to validate against + the current one. + +Version bumping is NOT this script's concern. See bump-schema-version.py +for the release-time bump path. This script only writes config.json; it +never touches configurationDefinitions.yml. + +Diff helpers (classify_changes, recommend_bump, etc.) live in schema_diff.py +and are imported when needed for the per-run change report. + +Exit codes: + 0 -- no schema changes (or first run, or --force mode) + 1 -- schema changed and on-disk differed (CI should commit) + 2 -- hard failure (invalid schema, malformed inputs) + +Run standalone: + python3 generate-schema.py + +Force-regenerate without comparing: + python3 generate-schema.py --force + +--------------------------------------------------------------------------- +Why the schema emits `additionalProperties: true` +--------------------------------------------------------------------------- +The generator sets `additionalProperties: true` at the root. This is +intentional and serves two purposes: + + 1. Forward compatibility. The agent ships new config keys in every + release. 
A Fleet Control deployment may be validating against a + schema generated from an older agent -- strict validation would + reject any newer key, breaking users who upgrade the agent before + the schema is republished. + + 2. Coverage gaps. Some keys are deliberately excluded (see EXCLUDE_KEYS) + and some shapes the generator can't represent faithfully (settings + with None defaults and no TYPE_OVERRIDE entry). Permitting unknown + properties means a config that uses those still validates instead + of being flagged as malformed. + +If a future requirement calls for strict validation (catch typos, reject +unknown keys), flip this to `false` -- but doing so should be paired with +a release process that republishes the schema in lockstep with the agent. + +--------------------------------------------------------------------------- +Why list-typed settings emit `anyOf [array, string]` +--------------------------------------------------------------------------- +The Python agent's INI format documents many list values as space- or +comma-separated *strings* (e.g. `attributes.include = foo bar baz`). +The agent parses these via `_environ_as_set` / `_environ_as_comma_separated_set` +in newrelic/core/config.py, then exposes them as Python `set` objects in +`global_settings()`. + +A schema that emits `{"type": "array", ...}` for these would reject every +legitimate INI configuration -- the user can't write a JSON array in an +INI file. Emitting `anyOf [array, string]` instead means both the +structured form (used by configuration formats that *do* support it, +like Fleet Control's structured backend) and the INI string form validate. + +The generator handles this in two layers: + + 1. Explicit overrides via `string_array_or_delimited()` in TYPE_OVERRIDES. + Used for keys whose live default is an empty list (so set vs. list + can't be inferred from the value alone) or that need a special + description. + + 2. 
Auto-detection in `make_property`: any leaf whose live value is a + Python `set` is emitted as `anyOf [array, string]` even when it has + no TYPE_OVERRIDES entry (an explicit override still wins). This catches the long tail of + `_environ_as_set`-backed settings (the seven `*.attributes.*` + subtrees, opentelemetry.traces.*, heroku.dyno_name_prefixes_to_shorten, + etc.) without requiring a per-key allowlist. +""" + +import argparse +import re +import sys +from pathlib import Path + +# --------------------------------------------------------------------------- +# Paths -- all resolved relative to this script. Script lives at +# /.fleetControl/schemaGeneration/ so the repo root is two +# levels up. +# --------------------------------------------------------------------------- +SCRIPT_DIR = Path(__file__).resolve().parent +FLEET_CONTROL_DIR = SCRIPT_DIR.parent +REPO_ROOT = FLEET_CONTROL_DIR.parent +SCHEMA_DIR = FLEET_CONTROL_DIR / "schemas" +SCHEMA_PATH = SCHEMA_DIR / "config.json" +CONFIG_DEF_PATH = FLEET_CONTROL_DIR / "configurationDefinitions.yml" +DEFAULT_INI_PATH = REPO_ROOT / "newrelic" / "newrelic.ini" + +# Make `import newrelic` resolve to this repo's source rather than any +# globally-installed agent. +sys.path.insert(0, str(REPO_ROOT)) + +# Schema-diff helpers live in their own module so bump-schema-version.py +# can reuse them. Put this directory on sys.path explicitly because this +# script is loaded as __main__ (not a package), so no relative import is available. +sys.path.insert(0, str(SCRIPT_DIR)) +from schema_diff import classify_changes, load_existing, print_changes # noqa: E402 + +# --------------------------------------------------------------------------- +# Type override helpers -- factory functions for common shape patterns. +# --------------------------------------------------------------------------- + + +def string_array_or_delimited(default=None, item_type="string"): + """Schema for keys that accept either a YAML/JSON array OR a delimited + string (space- or comma-separated).
+ + The Python agent parses these via `_environ_as_set` / + `_environ_as_comma_separated_set`, so the INI string form is + documented as the user-facing format. The structured array form is + accepted for Fleet Control consumers that emit structured config. + """ + schema = {"anyOf": [{"type": "array", "items": {"type": item_type}}, {"type": "string"}]} + if default is not None: + schema["default"] = default + return schema + + +def status_code_array_or_range(default=None): + """Schema for status code keys that accept an integer, an array of + integers, or a delimited string with optional range syntax + (e.g. "100-102 200-208 226 300-308 404"). + + Parsed by `_parse_status_codes` in newrelic/core/config.py. The + range string form is what newrelic.ini documents. + """ + schema = { + "anyOf": [ + {"type": "integer"}, + {"type": "array", "items": {"type": "integer"}}, + { + "type": "string", + "description": ( + 'Comma- or space-separated integers and ranges (e.g. "100-102 200-208 226 300-308 404")' + ), + }, + ] + } + if default is not None: + schema["default"] = default + return schema + + +# --------------------------------------------------------------------------- +# Enum overrides. The customer-facing form of a setting may differ from +# its in-memory representation -- log_level for example is stored as a +# Python logging int (20 == INFO) but customers configure it as a string. +# When an enum override matches, the schema type is always `string`. The +# live default is kept only if it is (or, for log_level, maps to) one of +# the enum strings; otherwise the property is emitted without a default. +# --------------------------------------------------------------------------- +ENUM_OVERRIDES = { + "log_level": ["critical", "error", "warning", "info", "debug"], + "transaction_tracer.record_sql": ["off", "raw", "obfuscated"], +} + +# Settings whose live default is an int log-level but the schema should +# present a string. Used by make_property to pick the right enum default.
+LOG_LEVEL_INT_TO_STRING = {50: "critical", 40: "error", 30: "warning", 20: "info", 10: "debug"} + +# --------------------------------------------------------------------------- +# Type overrides -- when the live default doesn't tell the full story. +# Three reasons to use this: +# (a) The leaf default is None, so we cannot infer the JSON Schema type +# (proxy_*, ca_bundle_path, audit_log_file, transaction_threshold). +# (b) The leaf is a list/set whose default is empty, so the items type +# cannot be inferred from contents -- and we want the explicit +# anyOf [array, string] shape used by INI list values. +# (c) The leaf needs a multi-form anyOf shape (status codes: +# int | int[] | range string). +# --------------------------------------------------------------------------- +TYPE_OVERRIDES = { + # --- INI string-or-array list values (parsed via _environ_as_set, + # _environ_as_comma_separated_set, or documented as space-separated + # in newrelic.ini). Empty defaults can't be inferred as set-typed + # from the live value, so they need the override here. The auto- + # detection in make_property covers the non-empty cases (sets in + # global_settings() get anyOf'd automatically). + "error_collector.ignore_classes": string_array_or_delimited(default=[]), + "error_collector.expected_classes": string_array_or_delimited(default=[]), + "transaction_tracer.function_trace": string_array_or_delimited(default=[]), + "transaction_tracer.generator_trace": string_array_or_delimited(default=[]), + "attributes.include": string_array_or_delimited(default=[]), + "attributes.exclude": string_array_or_delimited(default=[]), + # --- Status codes (integer | array of integers | range string) --- + "error_collector.ignore_status_codes": status_code_array_or_range(), + "error_collector.expected_status_codes": status_code_array_or_range(), + # None-defaulted leaves -- declare the type so the setting still appears. 
+ "transaction_tracer.transaction_threshold": {"type": "string"}, # 'apdex_f' or float-as-string + "proxy_host": {"type": "string"}, + "proxy_port": {"type": "integer"}, + "proxy_user": {"type": "string"}, + "proxy_pass": {"type": "string"}, + "proxy_scheme": {"type": "string"}, + "ca_bundle_path": {"type": "string"}, + "audit_log_file": {"type": "string"}, + "log_file": {"type": "string"}, + "labels": {"type": "string"}, # documented as "name1:value1;name2:value2" + "api_key": {"type": "string"}, # deprecated but still documented in INI + "cloud.aws.account_id": {"type": "integer"}, # 12-digit number + "transaction_name.limit": {"type": "integer"}, + "transaction_name.naming_scheme": {"type": "string"}, + "browser_monitoring.loader_version": {"type": "string"}, + "browser_monitoring.ssl_for_http": {"type": "boolean"}, + "console.listener_socket": {"type": "string"}, + "debug.otlp_content_encoding": {"type": "string"}, + "agent_limits.data_compression_level": {"type": "integer"}, + "capture_params": {"type": "boolean"}, + "utilization.billing_hostname": {"type": "string"}, + # Distributed tracing sampler knobs -- all None-defaulted; types come + # from the leaf names (sampling_target -> int, ratio -> float). Listed + # explicitly so future additions are visible. 
+ "distributed_tracing.sampler.root.adaptive.sampling_target": {"type": "integer"}, + "distributed_tracing.sampler.root.trace_id_ratio_based.ratio": {"type": "number"}, + "distributed_tracing.sampler.remote_parent_sampled.adaptive.sampling_target": {"type": "integer"}, + "distributed_tracing.sampler.remote_parent_sampled.trace_id_ratio_based.ratio": {"type": "number"}, + "distributed_tracing.sampler.remote_parent_not_sampled.adaptive.sampling_target": {"type": "integer"}, + "distributed_tracing.sampler.remote_parent_not_sampled.trace_id_ratio_based.ratio": {"type": "number"}, + "distributed_tracing.sampler.partial_granularity.root.adaptive.sampling_target": {"type": "integer"}, + "distributed_tracing.sampler.partial_granularity.root.trace_id_ratio_based.ratio": {"type": "number"}, + "distributed_tracing.sampler.partial_granularity.remote_parent_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_not_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_not_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, +} + +# --------------------------------------------------------------------------- +# Keys to exclude from the customer-facing schema. +# +# Matching: an entry without a trailing `.*` matches a leaf path exactly; +# an entry ending in `.*` matches any descendant of that prefix. Easy to +# surface a setting later -- just delete its line. +# --------------------------------------------------------------------------- +EXCLUDE_KEYS = { + # Server- or runtime-assigned identity (assigned by the collector at + # connect time; not customer-tunable). 
+ "agent_run_id", + "account_id", + "application_id", + "primary_application_id", + "trusted_account_ids", + "trusted_account_key", + "encoding_key", + "request_headers_map", + "feature_flag", # internal toggles, not part of the public surface + # Browser / RUM internals -- server-pushed, not customer-tunable. + "beacon", + "browser_key", + "error_beacon", + "episodes_url", + "js_agent_file", + "js_agent_loader", + # Other server-set / non-Settings-class objects on the tree. + "entity_guid", # server-assigned at connect + "attribute_filter", # AttributeFilter instance, not a config value + # Internal-only QA / debug toggles. Surfacing these as customer-tunable + # would let an end user disable TLS validation, suppress harvests, swap + # logging payloads, etc. -- all of which are support / development + # affordances, not configuration. + "developer_mode", + # Subtree exclusions (use `.*` suffix). + "cross_application_tracer.*", # legacy, replaced by distributed tracing + "process_host.*", # platform-derived (ip_address, display_name, etc.) + "debug.*", # internal QA toggles (cert-validation off-switch, verbose log dumps, etc.) +} + + +# --------------------------------------------------------------------------- +# Settings tree walk +# --------------------------------------------------------------------------- + + +def walk_settings(obj, prefix=""): + """Yield (dotted_path, value) for every leaf in a Settings tree. + + Recurses only into objects whose class name ends with 'Settings' -- + this is the convention in newrelic.core.config and avoids descending + into incidental objects (AttributeFilter, etc.) that may live on the + tree as instance attributes. 
+ """ + for attr in sorted(vars(obj)): + if attr.startswith("_"): + continue + v = getattr(obj, attr) + full = f"{prefix}.{attr}" if prefix else attr + if hasattr(v, "__dict__") and type(v).__name__.endswith("Settings"): + yield from walk_settings(v, full) + else: + yield full, v + + +def is_excluded(path, exclude_keys): + """True if `path` matches an exact entry in `exclude_keys`, OR matches + any `prefix.*` entry by being equal to `prefix` or starting with + `prefix.`. + """ + if path in exclude_keys: + return True + for entry in exclude_keys: + if entry.endswith(".*"): + prefix = entry[:-2] + if path == prefix or path.startswith(prefix + "."): + return True + return False + + +# --------------------------------------------------------------------------- +# Type inference -- map a live Python value to a JSON Schema type. +# --------------------------------------------------------------------------- + + +def infer_type(value): + """Map a live Python value to a JSON Schema type string. + + Returns None for `None` (caller must consult TYPE_OVERRIDES). + """ + # bool MUST be checked before int -- bool is a subclass of int in Python. + if isinstance(value, bool): + return "boolean" + if isinstance(value, int): + return "integer" + if isinstance(value, float): + return "number" + if isinstance(value, str): + return "string" + if isinstance(value, (list, set, tuple)): + return "array" + if isinstance(value, dict): + return "object" + return None # None / unknown + + +def default_for(value, json_type): + """Convert a live Python value into a JSON-serializable default. + + Sets become sorted lists (sets unordered would produce nondeterministic + diffs). Tuples become lists. String-only lists also get sorted -- some + list defaults in newrelic.core.config are derived from `_environ_as_set` + (e.g. heroku.dyno_name_prefixes_to_shorten = list(set(...))), which + yields a list whose order varies across Python processes due to hash + randomization. 
Sorting makes the generated schema reproducible. + """ + if isinstance(value, set): + return sorted(value) + if isinstance(value, tuple): + return list(value) + if isinstance(value, list) and value and all(isinstance(v, str) for v in value): + return sorted(value) + return value + + +def make_property(path, value, description, enum_overrides, type_overrides): + """Build a JSON Schema property node for a single setting leaf. + + Resolution order: + 1. type_overrides[path] -- explicit override wins (used for None + defaults, anyOf shapes, and arrays whose items type cannot be + inferred). + 2. enum_overrides[path] -- enum forces type=string and translates + the live default if it is in the enum (or, for log_level, maps + the int default to its string form). + 3. Auto-anyOf for set values -- any non-explicitly-overridden leaf + whose live default is a Python `set` gets the + `anyOf [array, string]` shape used by INI list values. + 4. infer_type(value) -- otherwise. + """ + if path in type_overrides: + prop = dict(type_overrides[path]) + elif path in enum_overrides: + enum_vals = enum_overrides[path] + prop = {"type": "string", "enum": list(enum_vals)} + # Map int log levels to their string form; other settings just + # pass the value through if it's already in the enum. + default = LOG_LEVEL_INT_TO_STRING.get(value, value) if path == "log_level" else value + if isinstance(default, str) and default in enum_vals: + prop["default"] = default + elif isinstance(value, set): + # Auto-detect: sets in global_settings() come from + # _environ_as_set / _environ_as_comma_separated_set, which means + # the INI form is a delimited string. Emit anyOf so both forms + # validate. Item type comes from the first element if the set is + # non-empty; empty sets hit string_array_or_delimited's default + # of "string". 
+ item_type = "string" + if value: + first = next(iter(value)) + inferred = infer_type(first) + if inferred: + item_type = inferred + prop = string_array_or_delimited(default=sorted(value), item_type=item_type) + else: + json_type = infer_type(value) + if json_type is None: + return None # caller will skip; no override and no inferable type + prop = {"type": json_type} + + if json_type == "array": + # List-typed leaves that aren't sets -- treat as plain arrays. + # Sets have already been routed to anyOf above; the only + # list-typed defaults that remain here are tuples-converted- + # to-lists or pre-sorted lists in the source. + if not value: + prop["items"] = {"type": "string"} + else: + first = next(iter(value)) + first_type = infer_type(first) + prop["items"] = {"type": first_type or "string"} + prop["default"] = default_for(value, json_type) + elif json_type == "object": + prop["additionalProperties"] = True + if value: + prop["default"] = value + else: + prop["default"] = default_for(value, json_type) + + if description: + prop["description"] = description.strip() + return prop + + +# --------------------------------------------------------------------------- +# INI parsing -- now ONLY for descriptions. We keep the same line-by-line +# scanner because it correctly handles commented-out config blocks (e.g. +# the proxy_* example block in newrelic.ini): the live key=value pair's +# preceding contiguous comment block becomes its description, and a blank +# line clears any pending block so commented-out examples don't bleed. +# --------------------------------------------------------------------------- + +# Section header: [newrelic] or [newrelic:production], etc. +_SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]\s*$") +# Live key=value line. Keys may contain dots and underscores. 
+_KEY_RE = re.compile(r"^([a-zA-Z_][\w.\-]*)\s*=\s*(.*)$") + + +def parse_ini_descriptions(text, section="newrelic"): + """Return a {dotted_key: description} map for the named section.""" + comments = {} + pending = [] + current_section = None + + for raw_line in text.splitlines(): + line = raw_line.rstrip("\r") + stripped = line.strip() + + if stripped == "": + pending = [] + continue + + m = _SECTION_RE.match(line) + if m: + current_section = m.group(1).strip() + pending = [] + continue + + if line.lstrip().startswith("#"): + content = line.lstrip()[1:].removeprefix(" ") + pending.append(content.rstrip()) + continue + + if current_section != section: + pending = [] + continue + + km = _KEY_RE.match(line) + if km: + key = km.group(1) + if pending: + comments[key] = " ".join(p.strip() for p in pending if p.strip()) + pending = [] + else: + pending = [] + + return comments + + +def load_descriptions(): + """Load the {dotted_key: description} map from newrelic.ini, if present. + Returns an empty dict if the file is missing. + """ + if not DEFAULT_INI_PATH.exists(): + print(f" warning: {DEFAULT_INI_PATH} not found; schema will have no descriptions", file=sys.stderr) + return {} + return parse_ini_descriptions(DEFAULT_INI_PATH.read_text(encoding="utf-8")) + + +# --------------------------------------------------------------------------- +# Schema generation +# --------------------------------------------------------------------------- + +LICENSE_KEY_OVERRIDE = { + "type": "string", + "description": ( + "New Relic license key associated with your account. " + "Binds the agent's data to your account in the New Relic UI." + ), + "minLength": 1, +} + + +def build_properties(settings, descriptions, exclude_keys, enum_overrides, type_overrides): + """Walk the settings tree and build the JSON Schema `properties` map. + + Settings paths that match `exclude_keys` are skipped. 
Settings whose + live default is None and have no TYPE_OVERRIDE entry are skipped with + a warning -- their type cannot be inferred. + """ + properties = {} + skipped_none = [] + for path, value in walk_settings(settings): + if is_excluded(path, exclude_keys): + continue + prop = make_property(path, value, descriptions.get(path, ""), enum_overrides, type_overrides) + if prop is None: + # license_key gets a hardcoded override applied by the caller + # after this loop, so its None default isn't a gap -- suppress + # the warning to avoid implying it's missing from the schema. + if path != "license_key": + skipped_none.append(path) + continue + properties[path] = prop + + if skipped_none: + print( + f" warning: {len(skipped_none)} settings have None defaults and no " + f"TYPE_OVERRIDE; they were skipped from the schema:", + file=sys.stderr, + ) + for path in skipped_none: + print(f" - {path}", file=sys.stderr) + + return properties + + +def generate_schema(settings, descriptions, exclude_keys=None, enum_overrides=None, type_overrides=None): + """Generate a JSON Schema dict from a live Settings object.""" + if exclude_keys is None: + exclude_keys = EXCLUDE_KEYS + if enum_overrides is None: + enum_overrides = ENUM_OVERRIDES + if type_overrides is None: + type_overrides = TYPE_OVERRIDES + + properties = build_properties(settings, descriptions, exclude_keys, enum_overrides, type_overrides) + + # Hardcoded license_key: the live default is None and we always want + # this surfaced as a required, non-empty string. + properties["license_key"] = dict(LICENSE_KEY_OVERRIDE) + + return { + "$schema": "https://json-schema.org/draft/2020-12/schema", + "title": "New Relic Python Agent Configuration", + "description": ( + "Fleet Control configuration schema for the New Relic Python agent. " + "Generated from newrelic.core.config.global_settings()." 
+ ), + "type": "object", + "properties": properties, + "required": ["license_key", "app_name"], + "additionalProperties": True, + } + + +# --------------------------------------------------------------------------- +# Schema merge -- deep-merges a freshly generated schema into the existing +# one so the published schema only ever grows for forward-compatibility +# purposes. +# +# Caveat: the "only ever grows" promise does NOT extend to keys that the +# current generator deliberately excludes via EXCLUDE_KEYS. Filtering those +# out of the old schema before merge guarantees that newly-added exclusions +# actually take effect on the next regeneration, instead of being silently +# resurrected from the prior on-disk schema. +# --------------------------------------------------------------------------- + + +def filter_excluded(schema, exclude_keys): + """Return a copy of `schema` with any properties whose dotted path + matches `exclude_keys` removed. Operates on flat top-level property + paths only -- mirrors how is_excluded is used in build_properties. 
+ """ + if not schema or "properties" not in schema: + return schema + filtered = dict(schema) + filtered["properties"] = {k: v for k, v in schema["properties"].items() if not is_excluded(k, exclude_keys)} + return filtered + + +def merge_schemas(old_s, new_s): + if not old_s: + return new_s + + merged = dict(new_s) + + old_props = old_s.get("properties") or {} + new_props = new_s.get("properties") or {} + if old_props or new_props: + merged["properties"] = merge_properties(old_props, new_props) + + return merged + + +def merge_properties(old_props, new_props): + result = {} + for key, new_val in new_props.items(): + if ( + key in old_props + and isinstance(new_val, dict) + and isinstance(old_props[key], dict) + and new_val.get("type") == "object" + and old_props[key].get("type") == "object" + ): + result[key] = merge_schemas(old_props[key], new_val) + else: + result[key] = new_val + for key, old_val in old_props.items(): + if key not in result: + result[key] = old_val + return result + + +# --------------------------------------------------------------------------- +# I/O +# --------------------------------------------------------------------------- + + +def load_settings(): + """Import the agent and return a fresh `global_settings()` snapshot. + + The import is deferred to function-call time so test code can stub + or override what `global_settings` returns by patching the module. + """ + from newrelic.core.config import global_settings + + return global_settings() + + +def write_schema(schema, path): + import json + + Path(path).parent.mkdir(parents=True, exist_ok=True) + Path(path).write_text(json.dumps(schema, indent=2) + "\n", encoding="utf-8") + + +def validate_meta_schema(schema): + """Validate against JSON Schema 2020-12. Soft-skip if `jsonschema` is + not installed; hard-fail (exit 2) only on actual schema invalidity. 
+ """ + try: + import jsonschema + except ImportError: + print(" meta-schema check skipped: jsonschema not installed", file=sys.stderr) + return + try: + jsonschema.Draft202012Validator.check_schema(schema) + print("Meta-schema validation passed (Draft 2020-12)") + except jsonschema.exceptions.SchemaError as e: + print("Meta-schema validation FAILED:", file=sys.stderr) + print(f" {e.message}", file=sys.stderr) + sys.exit(2) + except Exception as e: + print(f" meta-schema check skipped: {type(e).__name__}: {e}", file=sys.stderr) + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- + + +def main(argv=None): + parser = argparse.ArgumentParser( + description="Generate Fleet Control config schema. Writes config.json only; " + "version bumps live in bump-schema-version.py." + ) + parser.add_argument( + "--force", + action="store_true", + help="Overwrite the schema without comparing to the existing one. Always exits 0.", + ) + args = parser.parse_args(argv) + + print("Reading: newrelic.core.config.global_settings()") + settings = load_settings() + descriptions = load_descriptions() + print(f" {len(descriptions)} descriptions loaded from {DEFAULT_INI_PATH.name}") + + generated = generate_schema(settings, descriptions) + + old_schema = {} if args.force else load_existing(SCHEMA_PATH) + # Drop excluded paths from the prior schema before merging so newly-added + # entries in EXCLUDE_KEYS take effect instead of being preserved by the + # "schema only ever grows" merge. Keep the original around so the diff + # classifier can still surface those removals to reviewers. 
+ filtered_old_schema = filter_excluded(old_schema, EXCLUDE_KEYS) + new_schema = merge_schemas(filtered_old_schema, generated) + + validate_meta_schema(new_schema) + + write_schema(new_schema, SCHEMA_PATH) + print(f"Wrote: {SCHEMA_PATH}") + + if args.force: + print("\n--force: schema written without diff comparison.") + return 0 + + if not old_schema: + print("\nFirst run -- schema created.") + return 0 + + changes = classify_changes(old_schema, new_schema) + print_changes(changes) + + return 1 if changes else 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.fleetControl/schemaGeneration/schema_diff.py b/.fleetControl/schemaGeneration/schema_diff.py new file mode 100644 index 0000000000..a6bb553bda --- /dev/null +++ b/.fleetControl/schemaGeneration/schema_diff.py @@ -0,0 +1,278 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Schema diff + version bump helpers shared between generate-schema.py and +bump-schema-version.py. + +Pure functions for classifying schema changes, recommending a semver +bump, and rewriting the version line in configurationDefinitions.yml. + +No side effects beyond bump_version (which writes a YAML file when its +write= flag is true). 
+""" + +import json +import re +from pathlib import Path + +# --------------------------------------------------------------------------- +# Schema I/O +# --------------------------------------------------------------------------- + + +def load_existing(path): + """Load a JSON Schema from disk, returning {} if absent or unreadable. + + Used to diff a freshly generated schema against the on-disk one. A + malformed file is treated as "no prior schema" rather than a hard + failure -- the caller will overwrite it. + """ + if not Path(path).exists(): + return {} + try: + return json.loads(Path(path).read_text(encoding="utf-8")) + except json.JSONDecodeError: + return {} + + +# --------------------------------------------------------------------------- +# Schema diff classification +# --------------------------------------------------------------------------- + + +def render_change(c): + """Format a single change record as a one-line `+/-/~ path: detail` string.""" + kind = c["kind"] + sym = "+" if kind == "added" else ("-" if kind == "removed" else "~") + detail = c.get("detail") or "" + path = c["path"] + return f"{sym} {path}: {detail}" if detail else f"{sym} {path}" + + +def classify_changes(old_s, new_s, path=""): + """Walk two schemas in parallel, returning a list of change records. + + Each change record has keys: path, kind, severity (breaking/additive/ + cosmetic), detail. The caller turns severities into bump kinds via + recommend_bump. 
+ """ + changes = [] + + old_req = set(old_s.get("required") or []) + new_req = set(new_s.get("required") or []) + changes.extend( + { + "path": f"{path}.{k}" if path else k, + "kind": "required_added", + "severity": "breaking", + "detail": "now required", + } + for k in sorted(new_req - old_req) + ) + changes.extend( + { + "path": f"{path}.{k}" if path else k, + "kind": "required_removed", + "severity": "additive", + "detail": "no longer required", + } + for k in sorted(old_req - new_req) + ) + + old_ap = old_s.get("additionalProperties", True) + new_ap = new_s.get("additionalProperties", True) + if old_ap is True and new_ap is False: + changes.append( + { + "path": path or "", + "kind": "additional_properties_tightened", + "severity": "breaking", + "detail": "additionalProperties: true -> false", + } + ) + elif old_ap is False and new_ap is True: + changes.append( + { + "path": path or "", + "kind": "additional_properties_loosened", + "severity": "additive", + "detail": "additionalProperties: false -> true", + } + ) + + old_props = old_s.get("properties") or {} + new_props = new_s.get("properties") or {} + for key in sorted(set(old_props.keys()) | set(new_props.keys())): + child_path = f"{path}.{key}" if path else key + if key not in old_props: + changes.append({"path": child_path, "kind": "added", "severity": "additive", "detail": "new property"}) + elif key not in new_props: + changes.append( + {"path": child_path, "kind": "removed", "severity": "breaking", "detail": "property removed"} + ) + else: + op = old_props[key] + np = new_props[key] + if op.get("type") == "object" and np.get("type") == "object": + changes.extend(classify_changes(op, np, child_path)) + else: + changes.extend(classify_leaf(op, np, child_path)) + return changes + + +def classify_leaf(op, np, path): + """Compare two leaf property nodes and return a list of change records.""" + changes = [] + + if op.get("type") != np.get("type"): + changes.append( + { + "path": path, + "kind": 
"type_changed", + "severity": "breaking", + "detail": f"type {op.get('type')} -> {np.get('type')}", + } + ) + + oe = op.get("enum") + ne = np.get("enum") + if oe is None and ne is not None: + changes.append( + { + "path": path, + "kind": "enum_introduced", + "severity": "breaking", + "detail": f"newly constrained to enum {ne}", + } + ) + elif oe is not None and ne is None: + changes.append( + {"path": path, "kind": "enum_removed_entirely", "severity": "additive", "detail": "enum constraint removed"} + ) + elif oe and ne and set(oe) != set(ne): + changes.extend( + {"path": path, "kind": "enum_value_removed", "severity": "breaking", "detail": f"enum value '{v}' removed"} + for v in sorted(set(oe) - set(ne)) + ) + changes.extend( + {"path": path, "kind": "enum_value_added", "severity": "additive", "detail": f"enum value '{v}' added"} + for v in sorted(set(ne) - set(oe)) + ) + + if op.get("default") != np.get("default"): + changes.append( + { + "path": path, + "kind": "default_changed", + "severity": "additive", + "detail": f"default {op.get('default')} -> {np.get('default')}", + } + ) + + if op.get("description") != np.get("description"): + changes.append( + {"path": path, "kind": "description_changed", "severity": "cosmetic", "detail": "description updated"} + ) + + return changes + + +# --------------------------------------------------------------------------- +# Semver bump +# --------------------------------------------------------------------------- + + +def recommend_bump(changes): + """Reduce a list of change records to a single bump kind. + + Returns the highest-severity bump implied by any change: a single + breaking change forces 'major'; otherwise any additive change forces + 'minor'; otherwise any cosmetic change forces 'patch'; otherwise 'none'. 
+ """ + if any(c.get("severity") == "breaking" for c in changes): + return "major" + if any(c.get("severity") == "additive" for c in changes): + return "minor" + if any(c.get("severity") == "cosmetic" for c in changes): + return "patch" + return "none" + + +def apply_bump(version, bump): + """Return a new MAJOR.MINOR.PATCH string after applying the given bump kind.""" + if bump == "none": + return version + parts = version.split(".") + if len(parts) != 3 or not all(p.isdigit() for p in parts): + raise ValueError(f"version '{version}' is not semver MAJOR.MINOR.PATCH") + major, minor, patch = (int(p) for p in parts) + if bump == "major": + return f"{major + 1}.0.0" + if bump == "minor": + return f"{major}.{minor + 1}.0" + if bump == "patch": + return f"{major}.{minor}.{patch + 1}" + raise ValueError(f"unknown bump kind '{bump}'") + + +_VERSION_LINE_RE = re.compile(r"(?m)^(\s*version:\s*)(\S+)(\s*)$") + + +def bump_version(yaml_path, bump, write): + """Read the single `version:` line from yaml_path, apply the bump, and + optionally write the result back. Returns (old_version, new_version). + + Raises if the file does not contain exactly one `version:` line -- + catches ambiguity caused by future template additions to + configurationDefinitions.yml. + """ + text = Path(yaml_path).read_text(encoding="utf-8") + matches = list(_VERSION_LINE_RE.finditer(text)) + if len(matches) != 1: + raise RuntimeError(f"{yaml_path}: expected exactly 1 'version:' line, found {len(matches)}") + old_version = matches[0].group(2) + new_version = apply_bump(old_version, bump) + if write and new_version != old_version: + new_text = _VERSION_LINE_RE.sub(lambda m: f"{m.group(1)}{new_version}{m.group(3)}", text) + Path(yaml_path).write_text(new_text, encoding="utf-8") + return old_version, new_version + + +def print_changes(changes, *, header="Schema changes"): + """Pretty-print a classified change list grouped by severity. 
+ + Lifted out of generate-schema.py's main so bump-schema-version.py can + reuse the same formatting. + """ + if not changes: + print("\nNo schema changes.") + return + + breaking = [c for c in changes if c["severity"] == "breaking"] + additive = [c for c in changes if c["severity"] == "additive"] + cosmetic = [c for c in changes if c["severity"] == "cosmetic"] + print(f"\n{header} ({len(changes)}):") + if breaking: + print(f" BREAKING ({len(breaking)}):") + for c in breaking: + print(f" {render_change(c)}") + if additive: + print(f" ADDITIVE ({len(additive)}):") + for c in additive: + print(f" {render_change(c)}") + if cosmetic: + print(f" COSMETIC ({len(cosmetic)}):") + for c in cosmetic: + print(f" {render_change(c)}") diff --git a/.fleetControl/schemaGeneration/tests/test_bump_schema_version.py b/.fleetControl/schemaGeneration/tests/test_bump_schema_version.py new file mode 100644 index 0000000000..cb03ea9198 --- /dev/null +++ b/.fleetControl/schemaGeneration/tests/test_bump_schema_version.py @@ -0,0 +1,201 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Unit tests for bump-schema-version.py. + +Covers the historical-ref reading helpers (parse_schema_path, +historical_schema_path_in_repo) and the main flow's bootstrap branches +(via mocked git_show). End-to-end git invocation is left untested because +mocking subprocess at that level provides no additional confidence and +the real workflow exercises it. 
+""" + +import importlib.util +import io +import json +import sys +import textwrap +import unittest +from pathlib import Path +from unittest import mock + +# bump-schema-version.py uses a hyphenated filename so we load it via importlib. +_SCRIPT = Path(__file__).resolve().parent.parent / "bump-schema-version.py" +_spec = importlib.util.spec_from_file_location("bump_schema_version", _SCRIPT) +bump_mod = importlib.util.module_from_spec(_spec) +_spec.loader.exec_module(bump_mod) + + +# --------------------------------------------------------------------------- +# parse_schema_path +# --------------------------------------------------------------------------- + + +class ParseSchemaPathTests(unittest.TestCase): + def test_finds_schema_line(self): + text = textwrap.dedent("""\ + configurationDefinitions: + - platform: KUBERNETESCLUSTER + schema: ./schemas/config.json + format: ini + """) + self.assertEqual(bump_mod.parse_schema_path(text), "./schemas/config.json") + + def test_no_schema_line_returns_none(self): + text = "configurationDefinitions:\n - platform: foo\n" + self.assertIsNone(bump_mod.parse_schema_path(text)) + + def test_handles_indentation(self): + # The regex must be tolerant of varying leading whitespace. 
+ text = " schema: my/schema.json\n" + self.assertEqual(bump_mod.parse_schema_path(text), "my/schema.json") + + +# --------------------------------------------------------------------------- +# historical_schema_path_in_repo +# --------------------------------------------------------------------------- + + +class HistoricalSchemaPathInRepoTests(unittest.TestCase): + def test_strips_leading_dot_slash(self): + self.assertEqual( + bump_mod.historical_schema_path_in_repo("./schemas/config.json"), ".fleetControl/schemas/config.json" + ) + + def test_no_dot_slash(self): + self.assertEqual( + bump_mod.historical_schema_path_in_repo("schemas/config.json"), ".fleetControl/schemas/config.json" + ) + + +# --------------------------------------------------------------------------- +# main() bootstrap and happy-path branches. +# +# We mock git_show rather than running real git so the test is hermetic. +# --------------------------------------------------------------------------- + + +@mock.patch.object(bump_mod, "git_show") +class MainBootstrapTests(unittest.TestCase): + def _capture_stdout(self): + buf = io.StringIO() + self.addCleanup(setattr, sys, "stdout", sys.stdout) + sys.stdout = buf + return buf + + def test_bootstrap_when_config_def_absent(self, git_show): + git_show.return_value = None # configurationDefinitions.yml not at ref + buf = self._capture_stdout() + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 0) + self.assertIn("Bootstrap", buf.getvalue()) + + def test_bootstrap_when_schema_field_missing(self, git_show): + git_show.return_value = "configurationDefinitions:\n - platform: foo\n" + buf = self._capture_stdout() + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 0) + self.assertIn("`schema:` field", buf.getvalue()) + + def test_bootstrap_when_historical_schema_absent(self, git_show): + # First call returns the configurationDefinitions text; second + # call (for the schema file) returns None. 
+ git_show.side_effect = ["schema: ./schemas/config.json\n", None] + buf = self._capture_stdout() + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 0) + self.assertIn("Bootstrap", buf.getvalue()) + + def test_invalid_historical_json_exits_2(self, git_show): + git_show.side_effect = ["schema: ./schemas/config.json\n", "this is not valid json"] + # Stub stderr to silence; capture stdout for any messages. + self._capture_stdout() + with mock.patch.object(sys, "stderr", io.StringIO()): + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 2) + + +# --------------------------------------------------------------------------- +# Happy-path: historical schema exists and differs from current; --ci writes. +# --------------------------------------------------------------------------- + + +class MainHappyPathTests(unittest.TestCase): + def setUp(self): + # Point the script at temp paths so we don't write into the real repo. + import tempfile + + self.tmp = tempfile.TemporaryDirectory() + self.addCleanup(self.tmp.cleanup) + tmp_path = Path(self.tmp.name) + + self.schema_path = tmp_path / "config.json" + self.config_def_path = tmp_path / "configurationDefinitions.yml" + + # Write a current schema with one extra property -- adds make the + # diff additive, so a 'minor' bump. + current = {"type": "object", "properties": {"old_key": {"type": "string"}, "new_key": {"type": "string"}}} + self.schema_path.write_text(json.dumps(current), encoding="utf-8") + + self.config_def_path.write_text( + textwrap.dedent("""\ + configurationDefinitions: + - platform: KUBERNETESCLUSTER + schema: ./schemas/config.json + version: 1.2.3 + format: ini + """), + encoding="utf-8", + ) + + # Re-point the module's path constants at our temp files. 
+ self._patch_paths = mock.patch.multiple( + bump_mod, SCHEMA_PATH=self.schema_path, CONFIG_DEF_PATH=self.config_def_path + ) + self._patch_paths.start() + self.addCleanup(self._patch_paths.stop) + + @mock.patch.object(bump_mod, "git_show") + def test_dry_run_recommends_bump_does_not_write(self, git_show): + git_show.side_effect = [ + "schema: ./schemas/config.json\nversion: 1.2.3\n", + json.dumps({"type": "object", "properties": {"old_key": {"type": "string"}}}), + ] + before = self.config_def_path.read_text() + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 1) + self.assertEqual(self.config_def_path.read_text(), before) + + @mock.patch.object(bump_mod, "git_show") + def test_ci_applies_bump(self, git_show): + git_show.side_effect = [ + "schema: ./schemas/config.json\nversion: 1.2.3\n", + json.dumps({"type": "object", "properties": {"old_key": {"type": "string"}}}), + ] + rc = bump_mod.main(["--since=v0.0.0", "--ci"]) + self.assertEqual(rc, 1) + # New key is additive -> minor bump 1.2.3 -> 1.3.0. + self.assertIn("version: 1.3.0", self.config_def_path.read_text()) + + @mock.patch.object(bump_mod, "git_show") + def test_no_diff_returns_0(self, git_show): + # Historical schema matches current -> no bump. + current = json.loads(self.schema_path.read_text()) + git_show.side_effect = ["schema: ./schemas/config.json\nversion: 1.2.3\n", json.dumps(current)] + rc = bump_mod.main(["--since=v0.0.0"]) + self.assertEqual(rc, 0) + + +if __name__ == "__main__": + unittest.main() diff --git a/.fleetControl/schemaGeneration/tests/test_generate_schema.py b/.fleetControl/schemaGeneration/tests/test_generate_schema.py new file mode 100644 index 0000000000..9d98b2eb07 --- /dev/null +++ b/.fleetControl/schemaGeneration/tests/test_generate_schema.py @@ -0,0 +1,555 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Unit tests for generate-schema.py. + +Run from the repo root: + + python3 -m unittest discover .fleetControl/schemaGeneration/tests + +The generator script lives one level up; we load it via importlib.util +because the filename has a hyphen and is not importable as a module. + +Diff/bump tests (classify_changes, recommend_bump, apply_bump, +bump_version) live in test_schema_diff.py since those helpers were +extracted to schema_diff.py. +""" + +import importlib.util +import textwrap +import unittest +from pathlib import Path + +# --------------------------------------------------------------------------- +# Load generate-schema.py as a module under the alias `gen` +# --------------------------------------------------------------------------- +_SCRIPT = Path(__file__).resolve().parent.parent / "generate-schema.py" +_spec = importlib.util.spec_from_file_location("gen", _SCRIPT) +gen = importlib.util.module_from_spec(_spec) +_spec.loader.exec_module(gen) + + +# --------------------------------------------------------------------------- +# Fake Settings classes -- mimic the real `class FooSettings` convention +# from newrelic.core.config so walk_settings recognizes them. +# --------------------------------------------------------------------------- + + +class FakeRootSettings: + """Stands in for newrelic.core.config.TopLevelSettings.""" + + +class FakeChildSettings: + """Stands in for any nested settings object (e.g. TransactionTracerSettings).""" + + +class NotASettingsObject: + """Plain object, intentionally NOT ending in 'Settings'. 
walk_settings + must NOT recurse into instances of this class -- it should treat them + as opaque leaves so AttributeFilter etc. don't get walked. + """ + + +def make_fake_settings(): + """Build a small Settings tree exercising every supported leaf type.""" + s = FakeRootSettings() + s.license_key = None + s.app_name = "Python Application" + s.monitor_mode = True + s.log_level = 20 # INFO -- must be translated to 'info' string in schema + s.log_file = None + s.proxy_port = None + s.transaction_tracer = FakeChildSettings() + s.transaction_tracer.enabled = True + s.transaction_tracer.transaction_threshold = None + s.transaction_tracer.record_sql = "obfuscated" + s.transaction_tracer.stack_trace_threshold = 0.5 + s.transaction_tracer.function_trace = [] + s.attributes = FakeChildSettings() + s.attributes.enabled = True + s.attributes.include = set() + s.attributes.exclude = set() + # Server-set / runtime -- should be excluded. + s.agent_run_id = None + s.beacon = None + # Subtree exclusion target. + s.cross_application_tracer = FakeChildSettings() + s.cross_application_tracer.enabled = False + # Non-Settings attribute (mimics AttributeFilter on the real settings). + s.attribute_filter = NotASettingsObject() + # Private/internal attribute -- should be skipped. + s._internal = "do not walk me" + return s + + +TEST_ENUMS = { + "log_level": ["critical", "error", "warning", "info", "debug"], + "transaction_tracer.record_sql": ["off", "raw", "obfuscated"], +} +# Test fixture mirrors the real TYPE_OVERRIDES -- list-typed leaves use the +# new anyOf helper, everything else stays as before. 
+TEST_TYPES = { + "transaction_tracer.transaction_threshold": {"type": "string"}, + "transaction_tracer.function_trace": gen.string_array_or_delimited(default=[]), + "attributes.include": gen.string_array_or_delimited(default=[]), + "attributes.exclude": gen.string_array_or_delimited(default=[]), + "log_file": {"type": "string"}, + "proxy_port": {"type": "integer"}, +} +TEST_EXCLUDES = {"agent_run_id", "beacon", "cross_application_tracer.*"} + + +# --------------------------------------------------------------------------- +# infer_type +# --------------------------------------------------------------------------- + + +class InferTypeTests(unittest.TestCase): + def test_bool_before_int(self): + # CRITICAL: bool is a subclass of int. infer_type MUST check bool + # before int so True/False don't end up as 'integer'. + self.assertEqual(gen.infer_type(True), "boolean") + self.assertEqual(gen.infer_type(False), "boolean") + + def test_integer(self): + self.assertEqual(gen.infer_type(0), "integer") + self.assertEqual(gen.infer_type(42), "integer") + self.assertEqual(gen.infer_type(-1), "integer") + + def test_number(self): + self.assertEqual(gen.infer_type(0.5), "number") + self.assertEqual(gen.infer_type(-1.25), "number") + + def test_string(self): + self.assertEqual(gen.infer_type("hello"), "string") + self.assertEqual(gen.infer_type(""), "string") + + def test_array_types(self): + self.assertEqual(gen.infer_type([]), "array") + self.assertEqual(gen.infer_type(set()), "array") + self.assertEqual(gen.infer_type(()), "array") + + def test_dict_is_object(self): + self.assertEqual(gen.infer_type({}), "object") + + def test_none_returns_none(self): + self.assertIsNone(gen.infer_type(None)) + + +# --------------------------------------------------------------------------- +# default_for +# --------------------------------------------------------------------------- + + +class DefaultForTests(unittest.TestCase): + def test_set_becomes_sorted_list(self): + 
self.assertEqual(gen.default_for({"b", "a", "c"}, "array"), ["a", "b", "c"]) + + def test_tuple_becomes_list(self): + self.assertEqual(gen.default_for((1, 2, 3), "array"), [1, 2, 3]) + + def test_other_passthrough(self): + self.assertEqual(gen.default_for(42, "integer"), 42) + self.assertEqual(gen.default_for("x", "string"), "x") + self.assertIs(gen.default_for(True, "boolean"), True) + + +# --------------------------------------------------------------------------- +# walk_settings +# --------------------------------------------------------------------------- + + +class WalkSettingsTests(unittest.TestCase): + def test_yields_top_level_leaves(self): + s = make_fake_settings() + leaves = dict(gen.walk_settings(s)) + self.assertIn("license_key", leaves) + self.assertIn("app_name", leaves) + self.assertEqual(leaves["app_name"], "Python Application") + + def test_recurses_into_settings_classes(self): + s = make_fake_settings() + leaves = dict(gen.walk_settings(s)) + self.assertIn("transaction_tracer.enabled", leaves) + self.assertIs(leaves["transaction_tracer.enabled"], True) + self.assertIn("attributes.include", leaves) + + def test_does_not_recurse_into_non_settings_objects(self): + # NotASettingsObject does not end in 'Settings'. walk must yield + # it as an opaque leaf rather than descending into it. 
+ s = make_fake_settings() + leaves = dict(gen.walk_settings(s)) + self.assertIn("attribute_filter", leaves) + self.assertIsInstance(leaves["attribute_filter"], NotASettingsObject) + + def test_skips_private_attrs(self): + s = make_fake_settings() + leaves = dict(gen.walk_settings(s)) + self.assertNotIn("_internal", leaves) + + +# --------------------------------------------------------------------------- +# is_excluded +# --------------------------------------------------------------------------- + + +class IsExcludedTests(unittest.TestCase): + def test_exact_match(self): + self.assertTrue(gen.is_excluded("agent_run_id", {"agent_run_id"})) + + def test_no_match(self): + self.assertFalse(gen.is_excluded("app_name", {"agent_run_id"})) + + def test_wildcard_matches_descendant(self): + excludes = {"cross_application_tracer.*"} + self.assertTrue(gen.is_excluded("cross_application_tracer.enabled", excludes)) + self.assertTrue(gen.is_excluded("cross_application_tracer.deep.nested.key", excludes)) + + def test_wildcard_matches_root(self): + # The 'foo.*' entry should also match the bare 'foo' path so a + # subtree exclude can drop the top-level node too. 
+ self.assertTrue(gen.is_excluded("cross_application_tracer", {"cross_application_tracer.*"})) + + def test_wildcard_does_not_match_unrelated_key(self): + excludes = {"cross_application_tracer.*"} + self.assertFalse(gen.is_excluded("cross_app", excludes)) + self.assertFalse(gen.is_excluded("transaction_tracer.enabled", excludes)) + + +# --------------------------------------------------------------------------- +# anyOf helpers +# --------------------------------------------------------------------------- + + +class StringArrayOrDelimitedTests(unittest.TestCase): + def test_shape_no_default(self): + s = gen.string_array_or_delimited() + self.assertEqual(s, {"anyOf": [{"type": "array", "items": {"type": "string"}}, {"type": "string"}]}) + + def test_shape_with_empty_default(self): + s = gen.string_array_or_delimited(default=[]) + self.assertEqual(s["default"], []) + self.assertIn("anyOf", s) + + def test_shape_with_populated_default(self): + s = gen.string_array_or_delimited(default=["a", "b"]) + self.assertEqual(s["default"], ["a", "b"]) + + def test_custom_item_type(self): + s = gen.string_array_or_delimited(item_type="integer") + self.assertEqual(s["anyOf"][0]["items"], {"type": "integer"}) + + +class StatusCodeArrayOrRangeTests(unittest.TestCase): + def test_shape_three_options(self): + s = gen.status_code_array_or_range() + types = [opt.get("type") for opt in s["anyOf"]] + self.assertEqual(types, ["integer", "array", "string"]) + # Range string carries a description so consumers know the format. 
+ self.assertIn("range", s["anyOf"][2]["description"].lower()) + self.assertEqual(s["anyOf"][1]["items"], {"type": "integer"}) + + def test_shape_with_default(self): + s = gen.status_code_array_or_range(default=[404]) + self.assertEqual(s["default"], [404]) + + +# --------------------------------------------------------------------------- +# make_property +# --------------------------------------------------------------------------- + + +class MakePropertyTests(unittest.TestCase): + def test_boolean_with_default(self): + p = gen.make_property("enabled", True, "Enable the thing", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["type"], "boolean") + self.assertIs(p["default"], True) + self.assertEqual(p["description"], "Enable the thing") + + def test_integer(self): + p = gen.make_property("count", 42, "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["type"], "integer") + self.assertEqual(p["default"], 42) + self.assertNotIn("description", p) + + def test_float_is_number(self): + p = gen.make_property("threshold", 0.5, "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["type"], "number") + self.assertEqual(p["default"], 0.5) + + def test_empty_set_auto_anyof(self): + # Set-typed live values (regardless of population) get auto-anyOf + # because the underlying agent setting is INI-string-parseable. + p = gen.make_property("some.set", set(), "", {}, {}) + self.assertNotIn("type", p) + self.assertIn("anyOf", p) + self.assertEqual(p["anyOf"][0], {"type": "array", "items": {"type": "string"}}) + self.assertEqual(p["anyOf"][1], {"type": "string"}) + self.assertEqual(p["default"], []) + + def test_set_with_values_anyof_sorted_default(self): + p = gen.make_property("some.set", {"b", "a"}, "", {}, {}) + self.assertIn("anyOf", p) + self.assertEqual(p["default"], ["a", "b"]) + + def test_set_of_ints_auto_anyof_int_items(self): + # Auto-anyOf should pick up the inner item type from the first + # element of a non-empty set. 
+ p = gen.make_property("status_codes", {404, 500}, "", {}, {}) + self.assertEqual(p["anyOf"][0]["items"], {"type": "integer"}) + self.assertEqual(p["default"], [404, 500]) + + def test_empty_list_pins_items_to_string(self): + # Plain lists (not sets) still emit a regular array. Only set-typed + # live values trigger the auto-anyOf path. + p = gen.make_property("some.list", [], "", {}, {}) + self.assertEqual(p["type"], "array") + self.assertEqual(p["items"], {"type": "string"}) + self.assertEqual(p["default"], []) + + def test_dict_is_object_with_additional_properties_true(self): + p = gen.make_property("some.dict", {}, "", {}, {}) + self.assertEqual(p["type"], "object") + self.assertTrue(p["additionalProperties"]) + + def test_log_level_int_translated_to_string(self): + p = gen.make_property("log_level", 20, "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["type"], "string") + self.assertEqual(p["enum"], TEST_ENUMS["log_level"]) + self.assertEqual(p["default"], "info") # 20 -> 'info', not 20 + + def test_log_level_unknown_int_no_default(self): + p = gen.make_property("log_level", 99, "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["enum"], TEST_ENUMS["log_level"]) + self.assertNotIn("default", p) + + def test_enum_with_matching_string_default(self): + p = gen.make_property("transaction_tracer.record_sql", "obfuscated", "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["enum"], TEST_ENUMS["transaction_tracer.record_sql"]) + self.assertEqual(p["default"], "obfuscated") + + def test_enum_with_non_matching_default_no_default(self): + p = gen.make_property("transaction_tracer.record_sql", "weird", "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["enum"], TEST_ENUMS["transaction_tracer.record_sql"]) + self.assertNotIn("default", p) + + def test_type_override_takes_precedence(self): + p = gen.make_property("transaction_tracer.transaction_threshold", None, "", TEST_ENUMS, TEST_TYPES) + self.assertEqual(p["type"], "string") + self.assertNotIn("default", p) + + def 
test_type_override_anyof_for_array(self): + # The TEST_TYPES override for attributes.include uses the + # string_array_or_delimited helper; the override should win over + # auto-anyOf and just be applied verbatim. + p = gen.make_property("attributes.include", set(), "doc", TEST_ENUMS, TEST_TYPES) + self.assertIn("anyOf", p) + self.assertEqual(p["anyOf"][0], {"type": "array", "items": {"type": "string"}}) + self.assertEqual(p["anyOf"][1], {"type": "string"}) + self.assertEqual(p["default"], []) + self.assertEqual(p["description"], "doc") + + def test_none_with_no_override_returns_none(self): + # No overrides are passed here, so a None value has no type to infer; + # make_property should signal "skip me" to the caller. + result = gen.make_property("license_key", None, "", {}, {}) + self.assertIsNone(result) + + +# --------------------------------------------------------------------------- +# build_properties +# --------------------------------------------------------------------------- + + +class BuildPropertiesTests(unittest.TestCase): + def test_excludes_applied(self): + s = make_fake_settings() + props = gen.build_properties(s, {}, TEST_EXCLUDES, TEST_ENUMS, TEST_TYPES) + self.assertNotIn("agent_run_id", props) + self.assertNotIn("beacon", props) + # Subtree exclude drops the descendant. + self.assertNotIn("cross_application_tracer.enabled", props) + + def test_descriptions_attached_when_present(self): + s = make_fake_settings() + descs = {"app_name": "the application name"} + props = gen.build_properties(s, descs, set(), TEST_ENUMS, TEST_TYPES) + self.assertEqual(props["app_name"]["description"], "the application name") + + def test_skipped_none_settings_do_not_appear(self): + s = make_fake_settings() + # log_file is None in the fake tree; with its override removed from + # a copy of TEST_TYPES (below) no type can be inferred -> should be skipped. 
+ types = dict(TEST_TYPES) + types.pop("log_file") + props = gen.build_properties(s, {}, set(), TEST_ENUMS, types) + self.assertNotIn("log_file", props) + + +# --------------------------------------------------------------------------- +# generate_schema -- end-to-end integration against the fake tree +# --------------------------------------------------------------------------- + + +class GenerateSchemaIntegrationTests(unittest.TestCase): + def setUp(self): + s = make_fake_settings() + descriptions = { + "app_name": "The application name.", + "monitor_mode": "Enable monitoring.", + "transaction_tracer.enabled": "Capture slow transactions.", + } + self.schema = gen.generate_schema( + s, descriptions, exclude_keys=TEST_EXCLUDES, enum_overrides=TEST_ENUMS, type_overrides=TEST_TYPES + ) + self.props = self.schema["properties"] + + def test_top_level_required(self): + self.assertEqual(self.schema["required"], ["license_key", "app_name"]) + + def test_additional_properties_true(self): + self.assertTrue(self.schema["additionalProperties"]) + + def test_license_key_overridden(self): + lk = self.props["license_key"] + self.assertEqual(lk["type"], "string") + self.assertEqual(lk["minLength"], 1) + self.assertNotIn("default", lk) + self.assertIn("license key", lk["description"].lower()) + + def test_app_name_string_with_default(self): + an = self.props["app_name"] + self.assertEqual(an["type"], "string") + self.assertEqual(an["default"], "Python Application") + self.assertEqual(an["description"], "The application name.") + + def test_log_level_uses_enum_with_string_default(self): + ll = self.props["log_level"] + self.assertEqual(ll["type"], "string") + self.assertEqual(ll["enum"], TEST_ENUMS["log_level"]) + self.assertEqual(ll["default"], "info") + + def test_monitor_mode_boolean_default_true(self): + mm = self.props["monitor_mode"] + self.assertEqual(mm["type"], "boolean") + self.assertIs(mm["default"], True) + + def test_transaction_tracer_enabled_boolean(self): + tt = 
self.props["transaction_tracer.enabled"] + self.assertEqual(tt["type"], "boolean") + self.assertIs(tt["default"], True) + + def test_transaction_threshold_string_via_override(self): + tt = self.props["transaction_tracer.transaction_threshold"] + self.assertEqual(tt["type"], "string") + self.assertNotIn("default", tt) + + def test_attributes_include_anyof_via_override(self): + ai = self.props["attributes.include"] + self.assertIn("anyOf", ai) + self.assertEqual(ai["anyOf"][0], {"type": "array", "items": {"type": "string"}}) + self.assertEqual(ai["anyOf"][1], {"type": "string"}) + self.assertEqual(ai["default"], []) + + def test_excluded_keys_absent(self): + self.assertNotIn("agent_run_id", self.props) + self.assertNotIn("beacon", self.props) + self.assertNotIn("cross_application_tracer.enabled", self.props) + + +# --------------------------------------------------------------------------- +# parse_ini_descriptions -- INI is now description-only +# --------------------------------------------------------------------------- + + +class ParseIniDescriptionsTests(unittest.TestCase): + def test_single_comment_attached(self): + text = "[newrelic]\n# my comment\nfoo = 1\n" + self.assertEqual(gen.parse_ini_descriptions(text)["foo"], "my comment") + + def test_multi_line_comment_joined(self): + text = "[newrelic]\n# line one\n# line two\nfoo = 1\n" + self.assertEqual(gen.parse_ini_descriptions(text)["foo"], "line one line two") + + def test_blank_line_resets_pending(self): + text = "[newrelic]\n# stale\n\nfoo = 1\n" + self.assertNotIn("foo", gen.parse_ini_descriptions(text)) + + def test_commented_out_example_does_not_bleed(self): + text = textwrap.dedent("""\ + [newrelic] + # proxy_host = hostname + + # real description + transaction_tracer.enabled = true + """) + comments = gen.parse_ini_descriptions(text) + self.assertEqual(comments["transaction_tracer.enabled"], "real description") + + def test_other_section_ignored(self): + text = textwrap.dedent("""\ + [newrelic] + # in 
newrelic + foo = 1 + [newrelic:production] + # in production + bar = 2 + """) + comments = gen.parse_ini_descriptions(text) + self.assertIn("foo", comments) + self.assertNotIn("bar", comments) + + +# --------------------------------------------------------------------------- +# merge_schemas -- still lives in generate-schema.py +# --------------------------------------------------------------------------- + + +class MergeSchemasTests(unittest.TestCase): + def test_empty_old_returns_new(self): + new = {"type": "object", "properties": {"foo": {"type": "string"}}} + self.assertEqual(gen.merge_schemas({}, new), new) + + def test_keys_only_in_old_preserved(self): + old = {"type": "object", "properties": {"legacy": {"type": "string", "default": "x"}}} + new = {"type": "object", "properties": {"fresh": {"type": "integer"}}} + merged = gen.merge_schemas(old, new) + self.assertIn("legacy", merged["properties"]) + self.assertIn("fresh", merged["properties"]) + self.assertEqual(merged["properties"]["legacy"]["default"], "x") + + def test_keys_in_both_new_wins(self): + old = {"type": "object", "properties": {"foo": {"type": "string", "default": "old"}}} + new = {"type": "object", "properties": {"foo": {"type": "string", "default": "new"}}} + merged = gen.merge_schemas(old, new) + self.assertEqual(merged["properties"]["foo"]["default"], "new") + + def test_top_level_required_uses_new(self): + old = {"type": "object", "properties": {"foo": {"type": "string"}}, "required": ["foo"]} + new = {"type": "object", "properties": {"foo": {"type": "string"}}, "required": []} + merged = gen.merge_schemas(old, new) + self.assertEqual(merged["required"], []) + + def test_type_change_clears_stale_constraints(self): + old = {"type": "object", "properties": {"x": {"type": "string", "enum": ["a", "b"]}}} + new = {"type": "object", "properties": {"x": {"type": "integer", "default": 5}}} + merged = gen.merge_schemas(old, new) + x = merged["properties"]["x"] + self.assertEqual(x["type"], "integer") 
+ self.assertEqual(x["default"], 5) + self.assertNotIn("enum", x) + + +if __name__ == "__main__": + unittest.main() diff --git a/.fleetControl/schemaGeneration/tests/test_schema_diff.py b/.fleetControl/schemaGeneration/tests/test_schema_diff.py new file mode 100644 index 0000000000..a7d66a9b34 --- /dev/null +++ b/.fleetControl/schemaGeneration/tests/test_schema_diff.py @@ -0,0 +1,231 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Unit tests for schema_diff.py. + +The classify_changes/recommend_bump/apply_bump/bump_version helpers were lifted +out of generate-schema.py so bump-schema-version.py can reuse them. This +file exercises that shared module directly. + +Run from the repo root: + + python3 -m unittest discover .fleetControl/schemaGeneration/tests +""" + +import os +import sys +import tempfile +import textwrap +import unittest +from pathlib import Path + +# schema_diff.py is a regular Python module living next to generate-schema.py. +# Add the parent directory to sys.path so it imports cleanly without the +# importlib.util dance the hyphenated scripts need. 
+_SCRIPT_DIR = Path(__file__).resolve().parent.parent +sys.path.insert(0, str(_SCRIPT_DIR)) +import schema_diff # noqa: E402 + + +def _obj(props, required=None, additional=True): + node = {"type": "object", "properties": props, "additionalProperties": additional} + if required is not None: + node["required"] = required + return node + + +def _by_kind(changes): + return {c["kind"]: c for c in changes} + + +class ClassifyChangesTests(unittest.TestCase): + def test_no_changes(self): + s = _obj({"foo": {"type": "string", "default": "x"}}) + self.assertEqual(schema_diff.classify_changes(s, s), []) + + def test_added_is_additive(self): + ch = schema_diff.classify_changes(_obj({}), _obj({"foo": {"type": "string"}})) + self.assertEqual(ch[0]["severity"], "additive") + + def test_removed_is_breaking(self): + ch = schema_diff.classify_changes(_obj({"foo": {"type": "string"}}), _obj({})) + self.assertEqual(ch[0]["severity"], "breaking") + + def test_type_change_is_breaking(self): + ch = _by_kind( + schema_diff.classify_changes(_obj({"foo": {"type": "string"}}), _obj({"foo": {"type": "integer"}})) + ) + self.assertEqual(ch["type_changed"]["severity"], "breaking") + + def test_required_added_is_breaking(self): + ch = _by_kind( + schema_diff.classify_changes( + _obj({"foo": {"type": "string"}}, []), _obj({"foo": {"type": "string"}}, ["foo"]) + ) + ) + self.assertEqual(ch["required_added"]["severity"], "breaking") + + def test_required_removed_is_additive(self): + ch = _by_kind( + schema_diff.classify_changes( + _obj({"foo": {"type": "string"}}, ["foo"]), _obj({"foo": {"type": "string"}}, []) + ) + ) + self.assertEqual(ch["required_removed"]["severity"], "additive") + + def test_additional_properties_tightened_is_breaking(self): + ch = _by_kind(schema_diff.classify_changes(_obj({}, None, True), _obj({}, None, False))) + self.assertEqual(ch["additional_properties_tightened"]["severity"], "breaking") + + def test_additional_properties_loosened_is_additive(self): + ch = 
_by_kind(schema_diff.classify_changes(_obj({}, None, False), _obj({}, None, True))) + self.assertEqual(ch["additional_properties_loosened"]["severity"], "additive") + + def test_enum_value_removed_is_breaking(self): + ch = schema_diff.classify_changes( + _obj({"x": {"type": "string", "enum": ["a", "b"]}}), _obj({"x": {"type": "string", "enum": ["a"]}}) + ) + self.assertEqual(next(c for c in ch if c["kind"] == "enum_value_removed")["severity"], "breaking") + + def test_enum_value_added_is_additive(self): + ch = schema_diff.classify_changes( + _obj({"x": {"type": "string", "enum": ["a"]}}), _obj({"x": {"type": "string", "enum": ["a", "b"]}}) + ) + self.assertEqual(next(c for c in ch if c["kind"] == "enum_value_added")["severity"], "additive") + + def test_enum_introduced_is_breaking(self): + ch = _by_kind( + schema_diff.classify_changes( + _obj({"x": {"type": "string"}}), _obj({"x": {"type": "string", "enum": ["a"]}}) + ) + ) + self.assertEqual(ch["enum_introduced"]["severity"], "breaking") + + def test_default_changed_is_additive(self): + ch = _by_kind( + schema_diff.classify_changes( + _obj({"x": {"type": "string", "default": "a"}}), _obj({"x": {"type": "string", "default": "b"}}) + ) + ) + self.assertEqual(ch["default_changed"]["severity"], "additive") + + def test_description_changed_is_cosmetic(self): + ch = _by_kind( + schema_diff.classify_changes( + _obj({"x": {"type": "string", "description": "old"}}), + _obj({"x": {"type": "string", "description": "new"}}), + ) + ) + self.assertEqual(ch["description_changed"]["severity"], "cosmetic") + + +class RecommendBumpTests(unittest.TestCase): + def test_breaking_is_major(self): + self.assertEqual(schema_diff.recommend_bump([{"severity": "breaking"}]), "major") + + def test_additive_is_minor(self): + self.assertEqual(schema_diff.recommend_bump([{"severity": "additive"}]), "minor") + + def test_cosmetic_is_patch(self): + self.assertEqual(schema_diff.recommend_bump([{"severity": "cosmetic"}]), "patch") + + def 
test_empty_is_none(self): + self.assertEqual(schema_diff.recommend_bump([]), "none") + + def test_breaking_wins_over_additive(self): + self.assertEqual(schema_diff.recommend_bump([{"severity": "additive"}, {"severity": "breaking"}]), "major") + + +class ApplyBumpTests(unittest.TestCase): + def test_apply_bumps(self): + self.assertEqual(schema_diff.apply_bump("1.2.3", "major"), "2.0.0") + self.assertEqual(schema_diff.apply_bump("1.2.3", "minor"), "1.3.0") + self.assertEqual(schema_diff.apply_bump("1.2.3", "patch"), "1.2.4") + self.assertEqual(schema_diff.apply_bump("1.2.3", "none"), "1.2.3") + + def test_apply_bump_invalid_semver(self): + with self.assertRaises(ValueError): + schema_diff.apply_bump("not-semver", "major") + + def test_apply_bump_unknown_kind(self): + with self.assertRaises(ValueError): + schema_diff.apply_bump("1.2.3", "weird") + + +FIXTURE_YAML = textwrap.dedent("""\ + configurationDefinitions: + - platform: KUBERNETESCLUSTER + description: Test agent configuration + type: agent-config + version: 1.2.3 + schema: ./schemas/config.json + format: ini + """) + + +class BumpVersionTests(unittest.TestCase): + def _temp_yaml(self, content=FIXTURE_YAML): + f = tempfile.NamedTemporaryFile(mode="w", suffix=".yml", delete=False, encoding="utf-8") + f.write(content) + f.close() + self.addCleanup(os.unlink, f.name) + return Path(f.name) + + def test_read_returns_old_new(self): + path = self._temp_yaml() + old_v, new_v = schema_diff.bump_version(path, "minor", False) + self.assertEqual(old_v, "1.2.3") + self.assertEqual(new_v, "1.3.0") + + def test_write_false_does_not_touch_file(self): + path = self._temp_yaml() + before = path.read_text() + schema_diff.bump_version(path, "major", False) + self.assertEqual(path.read_text(), before) + + def test_write_true_mutates(self): + path = self._temp_yaml() + schema_diff.bump_version(path, "major", True) + self.assertIn("version: 2.0.0", path.read_text()) + + def test_missing_version_raises(self): + path = 
self._temp_yaml("configurationDefinitions:\n - platform: foo\n") + with self.assertRaises(RuntimeError): + schema_diff.bump_version(path, "major", False) + + +class LoadExistingTests(unittest.TestCase): + def test_missing_returns_empty(self): + self.assertEqual(schema_diff.load_existing("/nonexistent/path/to/schema.json"), {}) + + def test_malformed_json_returns_empty(self): + f = tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False, encoding="utf-8") + f.write("{ this is not valid json") + f.close() + self.addCleanup(os.unlink, f.name) + self.assertEqual(schema_diff.load_existing(f.name), {}) + + def test_valid_json_round_trips(self): + import json + + payload = {"type": "object", "properties": {"foo": {"type": "string"}}} + f = tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False, encoding="utf-8") + json.dump(payload, f) + f.close() + self.addCleanup(os.unlink, f.name) + self.assertEqual(schema_diff.load_existing(f.name), payload) + + +if __name__ == "__main__": + unittest.main() diff --git a/.fleetControl/schemas/config.json b/.fleetControl/schemas/config.json new file mode 100644 index 0000000000..9ac4e732c9 --- /dev/null +++ b/.fleetControl/schemas/config.json @@ -0,0 +1,1110 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "title": "New Relic Python Agent Configuration", + "description": "Fleet Control configuration schema for the New Relic Python agent. 
Generated from newrelic.core.config.global_settings().", + "type": "object", + "properties": { + "agent_limits.data_collector_timeout": { + "type": "number", + "default": 30.0 + }, + "agent_limits.data_compression_level": { + "type": "integer" + }, + "agent_limits.data_compression_threshold": { + "type": "integer", + "default": 65536 + }, + "agent_limits.errors_per_harvest": { + "type": "integer", + "default": 20 + }, + "agent_limits.errors_per_transaction": { + "type": "integer", + "default": 5 + }, + "agent_limits.max_sql_connections": { + "type": "integer", + "default": 4 + }, + "agent_limits.slow_sql_data": { + "type": "integer", + "default": 10 + }, + "agent_limits.slow_sql_stack_trace": { + "type": "integer", + "default": 30 + }, + "agent_limits.slow_transaction_dry_harvests": { + "type": "integer", + "default": 5 + }, + "agent_limits.sql_explain_plans": { + "type": "integer", + "default": 30 + }, + "agent_limits.sql_explain_plans_per_harvest": { + "type": "integer", + "default": 60 + }, + "agent_limits.sql_query_length_maximum": { + "type": "integer", + "default": 16384 + }, + "agent_limits.synthetics_events": { + "type": "integer", + "default": 200 + }, + "agent_limits.synthetics_transactions": { + "type": "integer", + "default": 20 + }, + "agent_limits.thread_profiler_nodes": { + "type": "integer", + "default": 20000 + }, + "agent_limits.transaction_traces_nodes": { + "type": "integer", + "default": 2000 + }, + "ai_monitoring.enabled": { + "type": "boolean", + "default": false + }, + "ai_monitoring.record_content.enabled": { + "type": "boolean", + "default": true + }, + "ai_monitoring.streaming.enabled": { + "type": "boolean", + "default": true + }, + "apdex_t": { + "type": "number", + "default": 0.5 + }, + "api_key": { + "type": "string" + }, + "app_name": { + "type": "string", + "default": "Python Application", + "description": "The application name. Set this to be the name of your application as you would like it to show up in New Relic UI. 
You may also set this using the NEW_RELIC_APP_NAME environment variable. The UI will then auto-map instances of your application into an entry on your home dashboard page. You can also specify multiple app names to group your aggregated data. For further details, please see: https://docs.newrelic.com/docs/apm/agents/manage-apm-agents/app-naming/use-multiple-names-app/" + }, + "application_logging.enabled": { + "type": "boolean", + "default": true + }, + "application_logging.forwarding.context_data.enabled": { + "type": "boolean", + "default": false + }, + "application_logging.forwarding.context_data.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "application_logging.forwarding.context_data.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "application_logging.forwarding.custom_attributes": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "application_logging.forwarding.enabled": { + "type": "boolean", + "default": true + }, + "application_logging.forwarding.labels.enabled": { + "type": "boolean", + "default": false + }, + "application_logging.forwarding.labels.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "application_logging.local_decorating.enabled": { + "type": "boolean", + "default": false + }, + "application_logging.metrics.enabled": { + "type": "boolean", + "default": true + }, + "attributes.enabled": { + "type": "boolean", + "default": true + }, + "attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": 
[] + }, + "audit_log_file": { + "type": "string" + }, + "aws_lambda_metadata": { + "type": "object", + "additionalProperties": true + }, + "azure_operator.enabled": { + "type": "boolean", + "default": false + }, + "browser_monitoring.attributes.enabled": { + "type": "boolean", + "default": false + }, + "browser_monitoring.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "browser_monitoring.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "browser_monitoring.auto_instrument": { + "type": "boolean", + "default": true, + "description": "Browser monitoring is the Real User Monitoring feature of the UI. For those Python web frameworks that are supported, this setting enables the auto-insertion of the browser monitoring JavaScript fragments." + }, + "browser_monitoring.content_type": { + "type": "array", + "items": { + "type": "string" + }, + "default": [ + "text/html" + ] + }, + "browser_monitoring.debug": { + "type": "boolean", + "default": false + }, + "browser_monitoring.enabled": { + "type": "boolean", + "default": true + }, + "browser_monitoring.loader": { + "type": "string", + "default": "rum" + }, + "browser_monitoring.loader_version": { + "type": "string" + }, + "browser_monitoring.ssl_for_http": { + "type": "boolean" + }, + "ca_bundle_path": { + "type": "string" + }, + "capture_environ": { + "type": "boolean", + "default": true + }, + "capture_params": { + "type": "boolean" + }, + "cloud.aws.account_id": { + "type": "integer" + }, + "code_level_metrics.enabled": { + "type": "boolean", + "default": true + }, + "collect_analytics_events": { + "type": "boolean", + "default": true + }, + "collect_custom_events": { + "type": "boolean", + "default": true + }, + "collect_error_events": { + "type": "boolean", + "default": true + }, + "collect_errors": { + 
"type": "boolean", + "default": true + }, + "collect_span_events": { + "type": "boolean", + "default": true + }, + "collect_traces": { + "type": "boolean", + "default": true + }, + "compressed_content_encoding": { + "type": "string", + "default": "gzip" + }, + "console.allow_interpreter_cmd": { + "type": "boolean", + "default": false + }, + "console.listener_socket": { + "type": "string" + }, + "custom_insights_events.enabled": { + "type": "boolean", + "default": true + }, + "custom_insights_events.max_attribute_value": { + "type": "integer", + "default": 255 + }, + "datastore_tracer.database_name_reporting.enabled": { + "type": "boolean", + "default": true + }, + "datastore_tracer.instance_reporting.enabled": { + "type": "boolean", + "default": true + }, + "distributed_tracing.enabled": { + "type": "boolean", + "default": true, + "description": "Distributed tracing lets you see the path that a request takes through your distributed system. For more information, please consult our distributed tracing planning guide. 
https://docs.newrelic.com/docs/transition-guide-distributed-tracing" + }, + "distributed_tracing.exclude_newrelic_header": { + "type": "boolean", + "default": true + }, + "distributed_tracing.sampler.adaptive_sampling_target": { + "type": "integer", + "default": 10 + }, + "distributed_tracing.sampler.full_granularity.enabled": { + "type": "boolean", + "default": true + }, + "distributed_tracing.sampler.partial_granularity.enabled": { + "type": "boolean", + "default": false + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_not_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_not_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.partial_granularity.remote_parent_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.partial_granularity.root.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.partial_granularity.root.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.partial_granularity.type": { + "type": "string", + "default": "essential" + }, + "distributed_tracing.sampler.remote_parent_not_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.remote_parent_not_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.remote_parent_sampled.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.remote_parent_sampled.trace_id_ratio_based.ratio": { + "type": "number" + }, + "distributed_tracing.sampler.root.adaptive.sampling_target": { + "type": "integer" + }, + "distributed_tracing.sampler.root.trace_id_ratio_based.ratio": { + "type": "number" + }, + "enabled": { + "type": "boolean", + "default": 
false + }, + "error_collector.attributes.enabled": { + "type": "boolean", + "default": true + }, + "error_collector.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "error_collector.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "error_collector.capture_events": { + "type": "boolean", + "default": true + }, + "error_collector.capture_source": { + "type": "boolean", + "default": false + }, + "error_collector.enabled": { + "type": "boolean", + "default": true, + "description": "The error collector captures information about uncaught exceptions or logged exceptions and sends them to the UI for viewing. The error collector is enabled by default. Set this to \"false\" to turn it off. For more details on errors, see https://docs.newrelic.com/docs/apm/agents/manage-apm-agents/agent-data/manage-errors-apm-collect-ignore-or-mark-expected/" + }, + "error_collector.expected_classes": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [], + "description": "Expected errors are reported to the UI but will not affect the Apdex or error rate. To mark specific errors as expected, set this to a space separated list of the Python exception type names to expect. The exception name should be of the form 'module:class'." + }, + "error_collector.expected_status_codes": { + "anyOf": [ + { + "type": "integer" + }, + { + "type": "array", + "items": { + "type": "integer" + } + }, + { + "type": "string", + "description": "Comma- or space-separated integers and ranges (e.g. 
\"100-102 200-208 226 300-308 404\")" + } + ] + }, + "error_collector.ignore_classes": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [], + "description": "To stop specific errors from reporting to the UI, set this to a space separated list of the Python exception type names to ignore. The exception name should be of the form 'module:class'." + }, + "error_collector.ignore_status_codes": { + "anyOf": [ + { + "type": "integer" + }, + { + "type": "array", + "items": { + "type": "integer" + } + }, + { + "type": "string", + "description": "Comma- or space-separated integers and ranges (e.g. \"100-102 200-208 226 300-308 404\")" + } + ] + }, + "event_harvest_config.harvest_limits.analytic_event_data": { + "type": "integer", + "default": 1200 + }, + "event_harvest_config.harvest_limits.custom_event_data": { + "type": "integer", + "default": 3600 + }, + "event_harvest_config.harvest_limits.error_event_data": { + "type": "integer", + "default": 100 + }, + "event_harvest_config.harvest_limits.log_event_data": { + "type": "integer", + "default": 10000 + }, + "event_harvest_config.harvest_limits.ml_event_data": { + "type": "integer", + "default": 100000 + }, + "event_harvest_config.harvest_limits.span_event_data": { + "type": "integer", + "default": 2000 + }, + "event_loop_visibility.blocking_threshold": { + "type": "number", + "default": 0.1 + }, + "event_loop_visibility.enabled": { + "type": "boolean", + "default": true + }, + "gc_runtime_metrics.enabled": { + "type": "boolean", + "default": false + }, + "gc_runtime_metrics.top_object_count_limit": { + "type": "integer", + "default": 5 + }, + "heroku.dyno_name_prefixes_to_shorten": { + "type": "array", + "items": { + "type": "string" + }, + "default": [ + "run", + "scheduler" + ] + }, + "heroku.use_dyno_names": { + "type": "boolean", + "default": true + }, + "high_security": { + "type": "boolean", + "default": false, + "description": "High Security 
Mode enforces certain security settings, and prevents them from being overridden, so that no sensitive data is sent to New Relic. Enabling High Security Mode means that request parameters are not collected and SQL can not be sent to New Relic in its raw form. To activate High Security Mode, it must be set to 'true' in this local .ini configuration file AND be set to 'true' in the server-side configuration in the New Relic user interface. It can also be set using the NEW_RELIC_HIGH_SECURITY environment variable. For details, see https://docs.newrelic.com/docs/subscriptions/high-security" + }, + "include_environ": { + "type": "array", + "items": { + "type": "string" + }, + "default": [ + "CONTENT_LENGTH", + "CONTENT_TYPE", + "HTTP_ACCEPT", + "HTTP_HOST", + "HTTP_REFERER", + "HTTP_USER_AGENT", + "REQUEST_METHOD" + ] + }, + "infinite_tracing.batching": { + "type": "boolean", + "default": true + }, + "infinite_tracing.compression": { + "type": "boolean", + "default": true + }, + "infinite_tracing.span_queue_size": { + "type": "integer", + "default": 10000 + }, + "infinite_tracing.ssl": { + "type": "boolean", + "default": true + }, + "infinite_tracing.trace_observer_port": { + "type": "integer", + "default": 443 + }, + "instrumentation.graphql.capture_introspection_queries": { + "type": "boolean", + "default": false + }, + "instrumentation.kombu.consumer.enabled": { + "type": "boolean", + "default": false + }, + "instrumentation.kombu.ignored_exchanges": { + "type": "array", + "items": { + "type": "string" + }, + "default": [ + "celeryev" + ] + }, + "instrumentation.middleware.django.enabled": { + "type": "boolean", + "default": true + }, + "instrumentation.middleware.django.exclude": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "instrumentation.middleware.django.include": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "k8s_operator.enabled": { + "type": "boolean", + "default": false + }, + "labels": { 
+ "type": "string" + }, + "linked_applications": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "log_file": { + "type": "string", + "description": "Sets the name of a file to log agent messages to. Whatever you set this to, you must ensure that the permissions for the containing directory and the file itself are correct, and that the user that your web application runs as can write out to the file. If not able to write out a log file, it is also possible to say \"stderr\" and output to standard error output. This would normally result in output appearing in your web server log. It can also be set using the NEW_RELIC_LOG environment variable." + }, + "log_level": { + "type": "string", + "enum": [ + "critical", + "error", + "warning", + "info", + "debug" + ], + "default": "info", + "description": "Sets the level of detail of messages sent to the log file, if a log file location has been provided. Possible values, in increasing order of detail, are: \"critical\", \"error\", \"warning\", \"info\" and \"debug\". When reporting any agent issues to New Relic technical support, the most useful setting for the support engineers is \"debug\". However, this can generate a lot of information very quickly, so it is best not to keep the agent at this level for longer than it takes to reproduce the problem you are experiencing. This may also be set using the NEW_RELIC_LOG_LEVEL environment variable." 
+ }, + "machine_learning.enabled": { + "type": "boolean", + "default": false + }, + "machine_learning.inference_events_value.enabled": { + "type": "boolean", + "default": false + }, + "max_payload_size_in_bytes": { + "type": "integer", + "default": 1000000 + }, + "max_stack_trace_lines": { + "type": "integer", + "default": 50 + }, + "memory_runtime_pid_metrics.enabled": { + "type": "boolean", + "default": true + }, + "message_tracer.segment_parameters_enabled": { + "type": "boolean", + "default": true + }, + "metric_name_rules": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "ml_insights_events.enabled": { + "type": "boolean", + "default": false + }, + "monitor_mode": { + "type": "boolean", + "default": true, + "description": "When \"true\", the agent collects performance data about your application and reports this data to the New Relic UI at newrelic.com. This global switch is normally overridden for each environment below. It may also be set using the NEW_RELIC_MONITOR_MODE environment variable." 
+ }, + "opentelemetry.enabled": { + "type": "boolean", + "default": false + }, + "opentelemetry.traces.enabled": { + "type": "boolean", + "default": true + }, + "opentelemetry.traces.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "opentelemetry.traces.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "otlp_port": { + "type": "integer", + "default": 0 + }, + "package_reporting.enabled": { + "type": "boolean", + "default": true + }, + "port": { + "type": "integer", + "default": 0 + }, + "proxy_host": { + "type": "string" + }, + "proxy_pass": { + "type": "string" + }, + "proxy_port": { + "type": "integer" + }, + "proxy_scheme": { + "type": "string" + }, + "proxy_user": { + "type": "string" + }, + "sampling_rate": { + "type": "integer", + "default": 0 + }, + "sampling_target": { + "type": "integer", + "default": 10 + }, + "sampling_target_period_in_seconds": { + "type": "integer", + "default": 60 + }, + "serverless_mode.enabled": { + "type": "boolean", + "default": false + }, + "shutdown_timeout": { + "type": "number", + "default": 2.5 + }, + "slow_sql.enabled": { + "type": "boolean", + "default": true + }, + "span_events.attributes.enabled": { + "type": "boolean", + "default": true + }, + "span_events.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "span_events.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "span_events.enabled": { + "type": "boolean", + "default": true + }, + "startup_timeout": { + "type": "number", + "default": 0.0 + }, + "strip_exception_messages.allowlist": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + 
"strip_exception_messages.enabled": { + "type": "boolean", + "default": false + }, + "synthetics.enabled": { + "type": "boolean", + "default": true + }, + "thread_profiler.enabled": { + "type": "boolean", + "default": true, + "description": "A thread profiling session can be scheduled via the UI when this option is enabled. The thread profiler will periodically capture a snapshot of the call stack for each active thread in the application to construct a statistically representative call tree. For more details on the thread profiler tool, see https://docs.newrelic.com/docs/apm/apm-ui-pages/events/thread-profiler-tool/" + }, + "transaction_events.attributes.enabled": { + "type": "boolean", + "default": true + }, + "transaction_events.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_events.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_events.enabled": { + "type": "boolean", + "default": true + }, + "transaction_name.limit": { + "type": "integer" + }, + "transaction_name.naming_scheme": { + "type": "string" + }, + "transaction_name_rules": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "transaction_segment_terms": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "transaction_segments.attributes.enabled": { + "type": "boolean", + "default": true + }, + "transaction_segments.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_segments.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_tracer.attributes.enabled": { + "type": "boolean", + 
"default": true + }, + "transaction_tracer.attributes.exclude": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_tracer.attributes.include": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_tracer.enabled": { + "type": "boolean", + "default": true, + "description": "The transaction tracer captures deep information about slow transactions and sends this to the UI on a periodic basis. The transaction tracer is enabled by default. Set this to \"false\" to turn it off." + }, + "transaction_tracer.explain_enabled": { + "type": "boolean", + "default": true, + "description": "Determines whether the agent will capture query plans for slow SQL queries. Only supported in MySQL and PostgreSQL. Set this to \"false\" to turn it off." + }, + "transaction_tracer.explain_threshold": { + "type": "number", + "default": 0.5, + "description": "Threshold for query execution time below which query plans will not not be captured. Relevant only when \"explain_enabled\" is true." + }, + "transaction_tracer.function_trace": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [], + "description": "Space separated list of function or method names in form 'module:function' or 'module:class.function' for which additional function timing instrumentation will be added." + }, + "transaction_tracer.generator_trace": { + "anyOf": [ + { + "type": "array", + "items": { + "type": "string" + } + }, + { + "type": "string" + } + ], + "default": [] + }, + "transaction_tracer.record_sql": { + "type": "string", + "enum": [ + "off", + "raw", + "obfuscated" + ], + "default": "obfuscated", + "description": "When the transaction tracer is on, SQL statements can optionally be recorded. 
The recorder has three modes, \"off\" which sends no SQL, \"raw\" which sends the SQL statement in its original form, and \"obfuscated\", which strips out numeric and string literals." + }, + "transaction_tracer.stack_trace_threshold": { + "type": "number", + "default": 0.5, + "description": "Threshold in seconds for when to collect stack trace for a SQL call. In other words, when SQL statements exceed this threshold, then capture and send to the UI the current stack trace. This is helpful for pinpointing where long SQL calls originate from in an application." + }, + "transaction_tracer.top_n": { + "type": "integer", + "default": 20 + }, + "transaction_tracer.transaction_threshold": { + "type": "string", + "description": "Threshold in seconds for when to collect a transaction trace. When the response time of a controller action exceeds this threshold, a transaction trace will be recorded and sent to the UI. Valid values are any positive float value, or (default) \"apdex_f\", which will use the threshold for a dissatisfying Apdex controller action - four times the Apdex T value." 
+ }, + "url_rules": { + "type": "array", + "items": { + "type": "string" + }, + "default": [] + }, + "utilization.billing_hostname": { + "type": "string" + }, + "utilization.detect_aws": { + "type": "boolean", + "default": true + }, + "utilization.detect_azure": { + "type": "boolean", + "default": true + }, + "utilization.detect_azurefunction": { + "type": "boolean", + "default": true + }, + "utilization.detect_docker": { + "type": "boolean", + "default": true + }, + "utilization.detect_gcp": { + "type": "boolean", + "default": true + }, + "utilization.detect_kubernetes": { + "type": "boolean", + "default": true + }, + "utilization.detect_pcf": { + "type": "boolean", + "default": true + }, + "utilization.logical_processors": { + "type": "integer", + "default": 0 + }, + "utilization.total_ram_mib": { + "type": "integer", + "default": 0 + }, + "web_transactions_apdex": { + "type": "object", + "additionalProperties": true + }, + "license_key": { + "type": "string", + "description": "New Relic license key associated with your account. Binds the agent's data to your account in the New Relic UI.", + "minLength": 1 + } + }, + "required": [ + "license_key", + "app_name" + ], + "additionalProperties": true +} diff --git a/.github/workflows/fleet-control-schema-bump.yml b/.github/workflows/fleet-control-schema-bump.yml new file mode 100644 index 0000000000..b28f48ad1f --- /dev/null +++ b/.github/workflows/fleet-control-schema-bump.yml @@ -0,0 +1,128 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +name: Fleet Control Config Schema Bump + +# Release-prep workflow: opens a PR that bumps the `version:` field in +# .fleetControl/configurationDefinitions.yml based on the cumulative +# schema diff since the previous release tag. +# +# Triggered manually before a release tag is cut, so the bumped version +# can be merged to main and included in the GitHub Release. +# +# Release ordering: +# 1. Run this workflow (workflow_dispatch). +# 2. Wait for the PR to open (or for the workflow to report no bump needed). +# 3. Review and merge the bump PR if one was opened. +# 4. Cut the GitHub Release from the post-merge main. +# +# Bootstrap: if the chosen ref predates the .fleetControl/ directory or +# the `schema:` field, the script exits 0 with a bootstrap message and +# no PR is opened. + +permissions: {} + +on: + workflow_dispatch: + inputs: + since_ref: + description: >- + Git ref (tag or commit) to compare against. Defaults to the + latest v* tag on main. + required: false + default: "" + +jobs: + bump: + name: Open agent config schema bump PR + runs-on: ubuntu-24.04 + permissions: + contents: write + pull-requests: write + + steps: + - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # pin@v4 + with: + fetch-depth: 0 + ref: main + + - name: Resolve --since ref + id: ref + env: + INPUT_REF: ${{ github.event.inputs.since_ref }} + run: | + if [ -n "$INPUT_REF" ]; then + ref="$INPUT_REF" + echo "Using user-supplied ref: $ref" + else + ref=$(git describe --tags --abbrev=0 --match='v*' main 2>/dev/null || true) + if [ -z "$ref" ]; then + echo "::error::No release tag (v*) found on main and no since_ref supplied." 
+ exit 1 + fi + echo "Auto-discovered latest tag: $ref" + fi + echo "ref=$ref" >> "$GITHUB_OUTPUT" + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: "3.11" + + - name: Compute and apply bump + id: bump + run: | + set +e + python3 .fleetControl/schemaGeneration/bump-schema-version.py \ + --since="${{ steps.ref.outputs.ref }}" --ci + code=$? + set -e + case "$code" in + 0) echo "bumped=false" >> "$GITHUB_OUTPUT" ;; + 1) echo "bumped=true" >> "$GITHUB_OUTPUT" ;; + *) echo "Bump failed (exit $code)"; exit "$code" ;; + esac + + - name: Open bump PR + id: create-pr + if: steps.bump.outputs.bumped == 'true' + uses: peter-evans/create-pull-request@6d6857d36972b65feb161a90e484f2984215f83e # pin@v6 + with: + token: ${{ secrets.GITHUB_TOKEN }} + add-paths: .fleetControl/configurationDefinitions.yml + branch: bump-agent-config-schema-version-${{ github.run_id }} + delete-branch: true + base: main + title: "chore: bump agent config schema version" + commit-message: "chore: bump agent config schema version" + body: | + Auto-generated bump PR. + + Bump computed from the schema diff since `${{ steps.ref.outputs.ref }}`. + + ## ⚠️ Release ordering + + **Merge this PR BEFORE cutting the release tag.** + + If the release tag is cut before this PR merges, the tag's + `configurationDefinitions.yml` will still reflect the + pre-bump version even though the schema itself reflects the + new keys. Consumers see a mismatch. + + - name: Summary + if: steps.create-pr.outputs.pull-request-number + run: | + echo "Submitted PR #${{ steps.create-pr.outputs.pull-request-number }} to bump the agent config schema version." + echo "Review and merge before cutting the release tag." 
+ echo "${{ steps.create-pr.outputs.pull-request-url }}" diff --git a/.github/workflows/fleet-control-schema.yml b/.github/workflows/fleet-control-schema.yml new file mode 100644 index 0000000000..fb1eb0eb77 --- /dev/null +++ b/.github/workflows/fleet-control-schema.yml @@ -0,0 +1,106 @@ +# Copyright 2010 New Relic, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: Fleet Control Config Schema + +# Per-push regenerator: walks `newrelic.core.config.global_settings()` and +# rewrites .fleetControl/schemas/config.json on every push to a non-main +# branch. Auto-commits the regenerated schema back to the pushed branch +# so reviewers see schema diffs in the PR. +# +# Path filters are intentionally NOT used here. The agent's settings +# tree is dynamic enough (defaults can shift via imported constants in +# files outside core/config.py) that a path-based skip risks staleness. +# The generator exits 0 quickly when nothing changed, so the cost of +# always running is minimal. +# +# Version bumps to .fleetControl/configurationDefinitions.yml are NOT +# done here -- they happen at release-prep time via the +# fleet-control-schema-bump.yml workflow (workflow_dispatch only). +# +# Skips main because branch protection blocks the bot from pushing +# there. Skips fork branches because GITHUB_TOKEN cannot push to forks +# anyway -- reviewers can run the generator locally and ask the +# contributor to pull. 
+ +permissions: {} + +on: + push: + branches-ignore: + - main + workflow_dispatch: + +concurrency: + group: fleet-control-schema-${{ github.ref }} + cancel-in-progress: true + +jobs: + regenerate: + name: Regenerate config schema + runs-on: ubuntu-24.04 + permissions: + contents: write + + steps: + - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # pin@v4 + with: + ref: ${{ github.ref }} + token: ${{ secrets.GITHUB_TOKEN }} + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: "3.11" + + - name: Run schema generator + id: generate + # `import newrelic.core.config` reads NEW_RELIC_* env vars at import + # time, which would leak into the schema's defaults. Unset all of + # them before invoking the generator (defense-in-depth even though + # CI runners shouldn't have any set). + run: | + while IFS= read -r var; do + unset "$var" + done < <(printenv | grep -oE '^NEW_RELIC_[^=]+' || true) + set +e + python3 .fleetControl/schemaGeneration/generate-schema.py + code=$? + set -e + case "$code" in + 0) echo "changed=false" >> "$GITHUB_OUTPUT" ;; + 1) echo "changed=true" >> "$GITHUB_OUTPUT" ;; + *) echo "Schema generator failed (exit $code)"; exit "$code" ;; + esac + + - name: Run schema generator tests + run: python3 -m unittest discover .fleetControl/schemaGeneration/tests + + - name: Commit and push regenerated schema + if: steps.generate.outputs.changed == 'true' + env: + # Pass the (potentially attacker-controlled) branch name through + # an env var rather than interpolating it directly into the shell + # script -- prevents script injection via crafted branch names. + HEAD_REF: ${{ github.ref_name }} + run: | + if [ -z "$(git status --porcelain .fleetControl/schemas/config.json)" ]; then + echo "Generator reported changes but config.json is clean -- nothing to commit." 
+ exit 0 + fi + git config user.name 'github-actions[bot]' + git config user.email '41898282+github-actions[bot]@users.noreply.github.com' + git add .fleetControl/schemas/config.json + git commit -m "chore: regenerate Fleet Control config schema" + git push origin "HEAD:$HEAD_REF" diff --git a/pyproject.toml b/pyproject.toml index 70700d287e..728d3c4952 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -252,6 +252,15 @@ ignore = [ # Disabled rules in admin scripts "S108", # flake8-bandit (hardcoded log files are never used as input) ] +".fleetControl/schemaGeneration/*" = [ + # Standalone build-time generator + its tests; not part of the agent package. + "INP", # flake8-no-pep420 (no __init__.py; this is a script, not a package) +] +".fleetControl/schemaGeneration/tests/*" = [ + # Schema generator tests use stdlib unittest (no pytest dependency). + "PT009", # use-of-assertEqual (we deliberately use unittest assertions) + "PT027", # use-of-assertRaises (we deliberately use unittest assertions) +] # ========================= # Other Tools Configuration diff --git a/tox.ini b/tox.ini index 30334daa1a..38ef13d7b4 100644 --- a/tox.ini +++ b/tox.ini @@ -636,6 +636,51 @@ changedir = template_mako: tests/template_mako +[testenv:fleet-schema] +; Local "build" for the Fleet Control config schema. +; +; Regenerate and validate the schema before pushing a branch that +; touches newrelic/core/config.py, newrelic/newrelic.ini, or anything +; under .fleetControl/. +; +; Usage (from repo root): +; tox -e fleet-schema +; +; What it does: +; 1. Strips NEW_RELIC_* env vars (their values would leak into the +; generated schema's defaults via newrelic.core.config import). +; 2. Runs all schemaGeneration unit tests. +; 3. Runs the schema generator -- writes .fleetControl/schemas/config.json +; and prints a classified diff against the existing schema. +; 4. Runs dump-settings.py --missing to surface any settings that +; didn't make it into the schema (excluded or skipped).
+; +; The generator exits 1 when the schema changed (so you commit the +; result); tox treats that as a failure by design, which is the signal +; you have something to commit. Re-run after committing to see exit 0. +skip_install = true +deps = jsonschema +passenv = + HOME + PATH +setenv = + ; Defensive: clear any NEW_RELIC_* vars inherited from the dev's + ; shell so they don't leak into the schema's defaults. + NEW_RELIC_LICENSE_KEY= + NEW_RELIC_APP_NAME= + NEW_RELIC_LOG= + NEW_RELIC_ENABLED= + NEW_RELIC_CONFIG_FILE= + NEW_RELIC_ENVIRONMENT= +allowlist_externals = env +commands = + ; env -i runs each command in an empty environment, removing every NEW_RELIC_* + ; variable (setenv cannot enumerate them all); only PATH and HOME are re-added. + env -i PATH={env:PATH} HOME={env:HOME} python3 -m unittest discover .fleetControl/schemaGeneration/tests + env -i PATH={env:PATH} HOME={env:HOME} python3 .fleetControl/schemaGeneration/generate-schema.py + env -i PATH={env:PATH} HOME={env:HOME} python3 .fleetControl/schemaGeneration/dump-settings.py --missing + + [pytest] usefixtures = collector_available_fixture