| 1 | +# Plan: Add `--domain` option to analyze.sh |
| 2 | + |
| 3 | +Add an optional `--domain <name>` CLI option to `analyze.sh` that selects a single domain (subdirectory of `domains/`) for vertical-slice analysis. When set, only that domain's report scripts run; core reports from `scripts/reports/` and other domains are skipped. Composes naturally with `--report` (horizontal slice). When omitted, behavior is unchanged. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +**Steps** |
| 8 | + |
| 9 | +### Phase 1: `analyze.sh` CLI parsing and validation |
| 10 | + |
| 11 | +1. **Add `--domain` to argument parsing** in [analyze.sh](scripts/analysis/analyze.sh) — add default `analysisDomain=""`, add `--domain)` case in the `while` loop, update `usage()` |
| 12 | +2. **Validate the domain name** — POSIX `case` glob pattern `*[!A-Za-z0-9-]*` to reject invalid characters (only if non-empty), resolve `DOMAINS_DIR="${SCRIPTS_DIR}/../domains"`, check `domains/<name>/` subdirectory exists with clear error message, then set `ANALYSIS_DOMAIN` (plain variable, no `export`) |
| 13 | +3. **Log the domain** in the "Start Analysis" group alongside `analysisReportCompilation`, `settingsProfile`, `exploreMode` |
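
The parsing and validation steps of Phase 1 could look roughly like the following sketch. All function names here are hypothetical and not taken from `analyze.sh`; the real script would place this logic inline in its existing `while` loop and validation section. The sketch avoids `local` and `[[ ]]` to stay close to POSIX, so its variables are global.

```shell
# Sketch only. Function names are hypothetical, not taken from analyze.sh.

# Succeeds when the name is empty or contains only letters, digits, and hyphens.
isValidAnalysisDomainName() {
    case "${1}" in
        *[!A-Za-z0-9-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints the names of all subdirectories of the given domains directory.
listAvailableAnalysisDomains() {
    for domainDirectory in "${1}"/*/; do
        [ -d "${domainDirectory}" ] && basename "${domainDirectory}"
    done
    return 0
}

# Validates the selected domain (first argument) against the domains directory (second argument).
validateAnalysisDomain() {
    [ -z "${1}" ] && return 0 # option omitted: nothing to validate
    if ! isValidAnalysisDomainName "${1}"; then
        echo "Error: --domain may only contain letters, digits, and hyphens: ${1}" >&2
        return 1
    fi
    if [ ! -d "${2}/${1}" ]; then
        echo "Error: domain '${1}' not found. Available domains:" >&2
        listAvailableAnalysisDomains "${2}" >&2
        return 1
    fi
    return 0
}

# Prints the value given to --domain, or an empty line when the option is omitted.
parseAnalysisDomainOption() {
    analysisDomain=""
    while [ "$#" -gt 0 ]; do
        case "${1}" in
            --domain)
                if [ "$#" -lt 2 ]; then
                    echo "Error: --domain requires a value" >&2
                    return 1
                fi
                analysisDomain="${2}"
                shift
                ;;
        esac
        shift
    done
    echo "${analysisDomain}"
}
```

A name such as `../../etc` is rejected by the character check before any file system access happens, because `.` and `/` are outside the allowed set.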
| 14 | + |
| 15 | +### Phase 2: Report compilation scripts — respect `ANALYSIS_DOMAIN` (*all steps parallel*) |
| 16 | + |
| 17 | +4. **Modify [CsvReports.sh](scripts/reports/compilations/CsvReports.sh)** — when `ANALYSIS_DOMAIN` is set, replace `for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"` with just `"${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}"` |
| 18 | +5. **Modify [PythonReports.sh](scripts/reports/compilations/PythonReports.sh)** — same pattern (Python env activation still runs) |
| 19 | +6. **Modify [VisualizationReports.sh](scripts/reports/compilations/VisualizationReports.sh)** — same pattern |
| 20 | +7. **Modify [MarkdownReports.sh](scripts/reports/compilations/MarkdownReports.sh)** — same pattern |
| 21 | +8. **Modify [JupyterReports.sh](scripts/reports/compilations/JupyterReports.sh)** — add early return with log message when `ANALYSIS_DOMAIN` is set (domains don't include Jupyter notebooks in the compilation path) |
| 22 | +9. **No changes to `AllReports.sh`** (chains the above scripts, filtering cascades) or **`DatabaseCsvExportReports.sh`** (special case, invoked explicitly only) |
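
The directory filtering shared by steps 4 to 7 might be sketched as below. The helper name `listReportScriptDirectories` is an assumption made for illustration; the actual compilation scripts may simply branch around their existing `for directory in ...` loop instead. The sketch also assumes that directory paths contain no newlines.

```shell
# Sketch only. listReportScriptDirectories is a hypothetical helper.
# REPORTS_SCRIPT_DIR, DOMAINS_DIRECTORY and ANALYSIS_DOMAIN are the
# variables named in the plan and are expected to be set by the caller.

# Prints one directory per line that should be searched for report scripts.
listReportScriptDirectories() {
    if [ -n "${ANALYSIS_DOMAIN:-}" ]; then
        # Vertical slice: only the selected domain, no core reports.
        echo "${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}"
    else
        # Unchanged behavior: core reports plus all domains.
        echo "${REPORTS_SCRIPT_DIR}"
        echo "${DOMAINS_DIRECTORY}"
    fi
}
```

Using `${ANALYSIS_DOMAIN:-}` keeps the check safe when the variable was never set, for example when a compilation script is run on its own.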
| 23 | + |
| 24 | +### Phase 3: GitHub Actions workflow (*depends on Phase 1*) |
| 25 | + |
| 26 | +10. **Add `domain` input** to [public-analyze-code-graph.yml](.github/workflows/public-analyze-code-graph.yml) — optional string, default `''`. In the "Analyze" step, prepend `--domain <value>` to `analysis-arguments` when non-empty |
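
Inside the workflow's "Analyze" step, the prepending could be done with a few lines of shell such as the sketch below. The function name and its two parameters are hypothetical; how the `domain` input and `analysis-arguments` reach the step is left to the workflow and is not shown here.

```shell
# Sketch only. prependDomainArgument is a hypothetical helper.
# First argument: value of the "domain" input (may be empty).
# Second argument: the existing analysis arguments (may be empty).
prependDomainArgument() {
    if [ -n "${1}" ]; then
        # The ":+" expansion adds the separating space only when arguments exist.
        echo "--domain ${1}${2:+ ${2}}"
    else
        echo "${2}"
    fi
}
```

With an empty `domain` input, the argument string is passed through untouched, which keeps existing workflow callers unaffected.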
| 27 | + |
| 28 | +### Phase 4: Documentation (*depends on Phase 1*) |
| 29 | + |
| 30 | +11. **Update [analyze.sh](scripts/analysis/analyze.sh) header comments** — add `# Note:` block for `--domain` matching existing style |
| 31 | +12. **Update [COMMANDS.md](COMMANDS.md)** — add `--domain` under "Command Line Options" and document the `ANALYSIS_DOMAIN` environment variable alongside other overrideable variables |
| 32 | +13. **Update [GETTING_STARTED.md](GETTING_STARTED.md)** — add example: `./../../scripts/analysis/analyze.sh --domain anomaly-detection` |
| 33 | + |
| 34 | +### Phase 5: Test scripts (*depends on Phases 1–2*) |
| 35 | + |
| 36 | +14. **Create [testAnalyzeDomainOption.sh](scripts/testAnalyzeDomainOption.sh)** — follow existing conventions (`testCloneGitRepository.sh` pattern: `tearDown`, `successful`, `fail`, `info` helpers; temp directory with fake `domains/` structure; auto-discovered by `runTests.sh` via `find … -name 'test*.sh'`). Test cases: |
| 37 | + - Reject `--domain` with invalid characters (e.g. `../../etc`) → fails at the `case` glob pattern check |
| 38 | + - Reject `--domain` with nonexistent domain name → fails with error listing available domains |
| 39 | + - Accept `--domain` with valid name matching a temp subdirectory → passes validation (script then fails at "no artifacts" check, which confirms domain validation succeeded) |
| 40 | + - No `--domain` given → passes validation unchanged (same late failure) |
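
Because the valid-domain and no-domain cases also end in a failure (the later "no artifacts" check), the exit code alone cannot tell the test cases apart; the tests need to inspect the error output. A minimal sketch of two helpers for this, assuming hypothetical names that are not taken from `testCloneGitRepository.sh`:

```shell
# Sketch only. Both helper names are hypothetical.

# Creates a temporary directory with a fake domains/ structure and prints its path.
createFakeDomainsDirectory() {
    temporaryDirectory="$(mktemp -d)"
    mkdir -p "${temporaryDirectory}/domains/anomaly-detection"
    echo "${temporaryDirectory}"
}

# Succeeds when the combined output (stdout and stderr) of the given command
# contains the expected text (first argument), regardless of its exit code.
commandOutputContains() {
    expectedText="${1}"
    shift
    commandOutput="$("$@" 2>&1)"
    case "${commandOutput}" in
        *"${expectedText}"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A test case could then assert that the output of a run with `--domain nonexistent` mentions the domain error, while a run with a valid domain only shows the later failure. The exact message texts depend on the final implementation.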
| 41 | + |
| 42 | +--- |
| 43 | + |
| 44 | +**Relevant files** |
| 45 | +- `scripts/analysis/analyze.sh` — add `--domain` parsing, validation (match pattern of `settingsProfile`), set `ANALYSIS_DOMAIN` (no `export`) |
| 46 | +- `scripts/reports/compilations/CsvReports.sh` — conditionally filter `for directory in ...` loop |
| 47 | +- `scripts/reports/compilations/PythonReports.sh` — same conditional filtering |
| 48 | +- `scripts/reports/compilations/VisualizationReports.sh` — same conditional filtering |
| 49 | +- `scripts/reports/compilations/MarkdownReports.sh` — same conditional filtering |
| 50 | +- `scripts/reports/compilations/JupyterReports.sh` — early return when `ANALYSIS_DOMAIN` is set |
| 51 | +- `.github/workflows/public-analyze-code-graph.yml` — add `domain` input, pass through |
| 52 | +- `COMMANDS.md` — document `--domain` option and `ANALYSIS_DOMAIN` environment variable |
| 53 | +- `GETTING_STARTED.md` — add usage examples |
| 54 | +- `scripts/testAnalyzeDomainOption.sh` — new test script for `--domain` validation (auto-discovered by `runTests.sh`) |
| 55 | + |
| 56 | +**Verification** |
| 57 | +1. Run `analyze.sh --domain nonexistent` → clear error listing available domains |
| 58 | +2. Run `--domain anomaly-detection --report Csv` → only `anomalyDetectionCsv.sh` runs (no core CSV, no `externalDependenciesCsv.sh`) |
| 59 | +3. Run `--domain anomaly-detection` (default `--report All`) → only anomaly-detection scripts for Csv/Python/Visualization/Markdown; Jupyter skipped |
| 60 | +4. Run without `--domain` → all reports + all domains execute unchanged (backward compat) |
| 61 | +5. Run `--domain "../../etc"` → the `case` glob pattern check rejects it |
| 62 | +6. Run example script with `--domain anomaly-detection` → argument passes through via `"${@}"` |
| 63 | + |
| 64 | +**Decisions** |
| 65 | +- `--domain` and `--report` compose: report selects type (horizontal), domain selects scope (vertical) |
| 66 | +- When `--domain` is set, core reports from `scripts/reports/` are **skipped** — only the domain's scripts run |
| 67 | +- JupyterReports.sh skipped when a domain is selected (no domain-scoped notebooks) |
| 68 | +- Only a single domain selectable (not comma-separated) |
| 69 | +- Propagated via `ANALYSIS_DOMAIN` shell variable (no `export`) from `analyze.sh` to compilation scripts. A shell variable is used rather than script arguments because compilation scripts are `source`d (not subprocesses), positional parameters would conflict in nested sourcing, and it follows the established convention (`DOMAINS_DIRECTORY`, `REPORTS_SCRIPT_DIR`, etc.) |
| 70 | +- **Not exported** — `export` would leak the variable into all child processes (Python, Java/jQAssistant, Neo4j, npm/node) where it could collide with unrelated programs outside this project's control. Since all compilation scripts are `source`d (same shell), `export` is unnecessary |
| 71 | +- **POSIX-compliant where practical** — prefer `case` glob patterns over `[[ =~ ]]` for validation (e.g. `case "${var}" in *[!A-Za-z0-9-]*) …`), `[ ]` over `[[ ]]` for simple tests, standard parameter expansion, and portable constructs. No new external dependencies. Must run on macOS, Linux, and Windows (Git Bash, WSL). Exception: `${BASH_SOURCE[0]}` (already used throughout the codebase). Follow existing script conventions over strict POSIX when they conflict |
| 72 | +- **Readability over brevity** — no abbreviations in variable names, function names, or messages, even if names feel long (e.g. `selectedAnalysisDomain` over `domain`, `analysisDomainsDirectory` over `domainsDir`). Follow the existing codebase style (`analysisReportCompilation`, `settingsProfile`, `REPORT_COMPILATIONS_SCRIPT_DIR`, etc.) |
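
The difference between a plain and an exported variable, which motivates the "not exported" decision above, can be observed with a small self-contained experiment. The probe script is a throwaway file created only for this illustration.

```shell
# Creates a throwaway script that reports whether it can see ANALYSIS_DOMAIN.
probeScript="$(mktemp)"
echo 'echo "${ANALYSIS_DOMAIN:-<unset>}"' > "${probeScript}"

unset ANALYSIS_DOMAIN                  # make sure no exported value is inherited
ANALYSIS_DOMAIN="anomaly-detection"    # plain assignment, no export

sourcedResult="$(. "${probeScript}")"  # sourced: shares the shell's variables
childResult="$(sh "${probeScript}")"   # child process: sees exported variables only

echo "sourced: ${sourcedResult}"       # prints "sourced: anomaly-detection"
echo "child:   ${childResult}"         # prints "child:   <unset>"
rm -f "${probeScript}"
```

The sourced script sees the value, while the child process does not, which is the behavior the compilation scripts rely on and the reason external tools never receive the variable.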