diff --git a/.github/prompts/plan-addDomainOption.prompt.md b/.github/prompts/plan-addDomainOption.prompt.md new file mode 100644 index 000000000..552652d78 --- /dev/null +++ b/.github/prompts/plan-addDomainOption.prompt.md @@ -0,0 +1,72 @@ +# Plan: Add `--domain` option to analyze.sh + +Add an optional `--domain <domain-name>` CLI option to `analyze.sh` that selects a single domain (subdirectory of `domains/`) for vertical-slice analysis. When set, only that domain's report scripts run; core reports from `scripts/reports/` and other domains are skipped. Composes naturally with `--report` (horizontal slice). When omitted, behavior is unchanged. + +--- + +**Steps** + +### Phase 1: `analyze.sh` CLI parsing and validation + +1. **Add `--domain` to argument parsing** in [analyze.sh](scripts/analysis/analyze.sh) — add default `selectedAnalysisDomain=""`, add `--domain)` case in the `while` loop, update `usage()` +2. **Validate the domain name** — POSIX `case` glob pattern `*[!A-Za-z0-9-]*` to reject invalid characters (only if non-empty), resolve `DOMAINS_DIRECTORY="${SCRIPTS_DIR}/../domains"`, check that the `domains/<domain-name>/` subdirectory exists with a clear error message, then set `ANALYSIS_DOMAIN` (plain variable, no `export`) +3. **Log the domain** in the "Start Analysis" group alongside `analysisReportCompilation`, `settingsProfile`, `exploreMode` + +### Phase 2: Report compilation scripts — respect `ANALYSIS_DOMAIN` (*all steps parallel*) + +4. **Modify [CsvReports.sh](scripts/reports/compilations/CsvReports.sh)** — when `ANALYSIS_DOMAIN` is set, replace `for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"` with just `"${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}"` +5. **Modify [PythonReports.sh](scripts/reports/compilations/PythonReports.sh)** — same pattern (Python env activation still runs) +6. **Modify [VisualizationReports.sh](scripts/reports/compilations/VisualizationReports.sh)** — same pattern +7. 
**Modify [MarkdownReports.sh](scripts/reports/compilations/MarkdownReports.sh)** — same pattern +8. **Modify [JupyterReports.sh](scripts/reports/compilations/JupyterReports.sh)** — add early return with log message when `ANALYSIS_DOMAIN` is set (domains don't include Jupyter notebooks in the compilation path) +9. **No changes to `AllReports.sh`** (chains the above scripts, filtering cascades) or **`DatabaseCsvExportReports.sh`** (special case, invoked explicitly only) + +### Phase 3: GitHub Actions workflow (*depends on Phase 1*) + +10. **Add `domain` input** to [public-analyze-code-graph.yml](.github/workflows/public-analyze-code-graph.yml) — optional string, default `''`. In the "Analyze" step, prepend `--domain ` to `analysis-arguments` when non-empty + +### Phase 4: Documentation (*depends on Phase 1*) + +11. **Update [analyze.sh](scripts/analysis/analyze.sh) header comments** — add `# Note:` block for `--domain` matching existing style +12. **Update [COMMANDS.md](COMMANDS.md)** — add `--domain` under "Command Line Options" and document the `ANALYSIS_DOMAIN` environment variable alongside other overrideable variables +13. **Update [GETTING_STARTED.md](GETTING_STARTED.md)** — add example: `./../../scripts/analysis/analyze.sh --domain anomaly-detection` + +### Phase 5: Test scripts (*depends on Phases 1–2*) + +14. **Create [testAnalyzeDomainOption.sh](scripts/testAnalyzeDomainOption.sh)** — follow existing conventions (`testCloneGitRepository.sh` pattern: `tearDown`, `successful`, `fail`, `info` helpers; temp directory with fake `domains/` structure; auto-discovered by `runTests.sh` via `find … -name 'test*.sh'`). Test cases: + - Reject `--domain` with invalid characters (e.g. 
`../../etc`) → fails at regex + - Reject `--domain` with nonexistent domain name → fails with error listing available domains + - Accept `--domain` with valid name matching a temp subdirectory → passes validation (script then fails at "no artifacts" check, which confirms domain validation succeeded) + - No `--domain` given → passes validation unchanged (same late failure) + +--- + +**Relevant files** +- `scripts/analysis/analyze.sh` — add `--domain` parsing, validation (match pattern of `settingsProfile`), set `ANALYSIS_DOMAIN` (no `export`) +- `scripts/reports/compilations/CsvReports.sh` — conditionally filter `for directory in ...` loop +- `scripts/reports/compilations/PythonReports.sh` — same conditional filtering +- `scripts/reports/compilations/VisualizationReports.sh` — same conditional filtering +- `scripts/reports/compilations/MarkdownReports.sh` — same conditional filtering +- `scripts/reports/compilations/JupyterReports.sh` — early exit when `ANALYSIS_DOMAIN` is set +- `.github/workflows/public-analyze-code-graph.yml` — add `domain` input, pass through +- `COMMANDS.md` — document `--domain` option and `ANALYSIS_DOMAIN` environment variable +- `GETTING_STARTED.md` — add usage examples +- `scripts/testAnalyzeDomainOption.sh` — new test script for `--domain` validation (auto-discovered by `runTests.sh`) + +**Verification** +1. Run `analyze.sh --domain nonexistent` → clear error listing available domains +2. Run `--domain anomaly-detection --report Csv` → only `anomalyDetectionCsv.sh` runs (no core CSV, no `externalDependenciesCsv.sh`) +3. Run `--domain anomaly-detection` (default `--report All`) → only anomaly-detection scripts for Csv/Python/Visualization/Markdown; Jupyter skipped +4. Run without `--domain` → all reports + all domains execute unchanged (backward compat) +5. Run `--domain "../../etc"` → regex rejects it +6. 
Run example script with `--domain anomaly-detection` → argument passes through via `"${@}"` + +**Decisions** +- `--domain` and `--report` compose: report selects type (horizontal), domain selects scope (vertical) +- When `--domain` is set, core reports from `scripts/reports/` are **skipped** — only the domain's scripts run +- JupyterReports.sh skipped when a domain is selected (no domain-scoped notebooks) +- Only a single domain selectable (not comma-separated) +- Propagated via `ANALYSIS_DOMAIN` shell variable (no `export`) from `analyze.sh` to compilation scripts — a variable rather than script arguments because compilation scripts are `source`d (not subprocesses), positional parameters would conflict in nested sourcing, and it follows the established convention (`DOMAINS_DIRECTORY`, `REPORTS_SCRIPT_DIR`, etc.) +- **Not exported** — `export` would leak the variable into all child processes (Python, Java/jQAssistant, Neo4j, npm/node) where it could collide with unrelated programs outside this project's control. Since all compilation scripts are `source`d (same shell), `export` is unnecessary +- **POSIX-compliant where practical** — prefer `case` glob patterns over `[[ =~ ]]` for validation (e.g. `case "${var}" in *[!A-Za-z0-9-]*) …`), `[ ]` over `[[ ]]` for simple tests, standard parameter expansion, and portable constructs. No new external dependencies. Must run on macOS, Linux, and Windows (Git Bash, WSL). Exception: `${BASH_SOURCE[0]}` (already used throughout the codebase). Follow existing script conventions over strict POSIX when they conflict +- **Readability over brevity** — no abbreviations in variable names, function names, or messages, even if names feel long (e.g. selectedAnalysisDomain over domain, analysisDomainsDirectory over domainsDir). Follow the existing codebase style (analysisReportCompilation, settingsProfile, REPORT_COMPILATIONS_SCRIPT_DIR, etc.) 
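The validation described in Phase 1 (steps 1–2) can be captured as a small standalone function. This is a minimal sketch under the plan's constraints, not the final implementation; the function name `validate_selected_analysis_domain` and its two parameters are illustrative:

```shell
#!/usr/bin/env bash
# Minimal sketch of the planned domain validation (function name and parameters are illustrative).

validate_selected_analysis_domain() {
    selectedAnalysisDomain="${1}"
    analysisDomainsDirectory="${2}"

    # Empty means "no domain selected": all domains and reports run unchanged.
    if [ -z "${selectedAnalysisDomain}" ]; then
        return 0
    fi

    # POSIX case glob instead of [[ =~ ]]: reject anything besides letters, numbers, and hyphens.
    # This also blocks path traversal attempts like "../../etc" because "." and "/" are rejected.
    case "${selectedAnalysisDomain}" in
        *[!A-Za-z0-9-]*)
            echo "analyze: Error: Domain '${selectedAnalysisDomain}' can only contain letters, numbers, and hyphens." >&2
            return 1
            ;;
    esac

    # The domain name has to match an existing subdirectory of the domains directory.
    if [ ! -d "${analysisDomainsDirectory}/${selectedAnalysisDomain}" ]; then
        echo "analyze: Error: Selected domain '${selectedAnalysisDomain}' does not match any subdirectory in ${analysisDomainsDirectory}." >&2
        return 1
    fi
}
```

Running the character check before the directory check means a traversal attempt never reaches the filesystem.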
diff --git a/.github/workflows/internal-typescript-upload-code-example.yml b/.github/workflows/internal-typescript-upload-code-example.yml index c6b92309b..9d4d00ed0 100644 --- a/.github/workflows/internal-typescript-upload-code-example.yml +++ b/.github/workflows/internal-typescript-upload-code-example.yml @@ -121,4 +121,5 @@ jobs: analysis-name: ${{ needs.prepare-code-to-analyze.outputs.analysis-name }} sources-upload-name: ${{ needs.prepare-code-to-analyze.outputs.sources-upload-name }} jupyter-pdf: "false" - analysis-arguments: "--explore" # Only setup the Graph, do not generate any reports \ No newline at end of file + domain: "external-dependencies" # For testing purposes: only run the external-dependencies domain (vertical slice) + analysis-arguments: "--report Csv" # For testing purposes: only generate CSV reports \ No newline at end of file diff --git a/.github/workflows/public-analyze-code-graph.yml b/.github/workflows/public-analyze-code-graph.yml index 959a922e8..adf71faba 100644 --- a/.github/workflows/public-analyze-code-graph.yml +++ b/.github/workflows/public-analyze-code-graph.yml @@ -73,6 +73,18 @@ on: required: false type: string default: '--profile Neo4j-latest-low-memory' + domain: + description: > + The name of an analysis domain to run. + Must match a subdirectory name in the 'domains/' directory + (e.g. 'anomaly-detection', 'external-dependencies'). + When set, only that domain's report scripts run; + core reports from 'scripts/reports/' and other domains are skipped. + Can be combined with 'analysis-arguments' to further narrow the reports. + Default: '' (all domains and reports run unchanged) + required: false + type: string + default: '' typescript-scan-heap-memory: description: > The heap memory size in MB to use for the TypeScript code scans (default=4096). 
@@ -252,7 +264,8 @@ jobs: working-directory: temp run: | ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/' - + - name: Assemble DOMAIN_ARGUMENT + run: echo "domainAnalysisArgument=${{ inputs.domain != '' && format('--domain {0} ', inputs.domain) || '' }}" >> $GITHUB_ENV - name: (Code Analysis) Analyze ${{ inputs.analysis-name }} working-directory: temp/${{ inputs.analysis-name }} # Shell type can be skipped if jupyter notebook analysis-results (and therefore conda) aren't needed @@ -264,7 +277,7 @@ jobs: PREPARE_CONDA_ENVIRONMENT: "false" # Had already been done in step with id "prepare-conda-environment". USE_VIRTUAL_PYTHON_ENVIRONMENT_VENV: ${{ inputs.use-venv_virtual_python_environment }} run: | - TYPESCRIPT_SCAN_HEAP_MEMORY=${{ inputs.typescript-scan-heap-memory }} ./../../scripts/analysis/analyze.sh ${{ inputs.analysis-arguments }} + TYPESCRIPT_SCAN_HEAP_MEMORY=${{ inputs.typescript-scan-heap-memory }} ./../../scripts/analysis/analyze.sh ${{ env.domainAnalysisArgument }}${{ inputs.analysis-arguments }} - name: Set artifact name for uploaded analysis results id: set-analysis-results-artifact-name diff --git a/COMMANDS.md b/COMMANDS.md index c9f4d4043..f5c793153 100644 --- a/COMMANDS.md +++ b/COMMANDS.md @@ -86,6 +86,8 @@ The [analyze.sh](./scripts/analysis/analyze.sh) command comes with these command - `--explore` activates the "explore" mode where no reports are generated. Furthermore, Neo4j won't be stopped at the end of the script and will therefore continue running. This makes it easy to just set everything up but then use the running Neo4j server to explore the data manually. +- `--domain anomaly-detection` selects a single analysis domain (a subdirectory of [domains/](./domains/)) to run reports for, following a vertical-slice approach. When set, only that domain's report scripts run; core reports from `scripts/reports/` and other domains are skipped. 
The domain option composes with `--report` to further narrow down which reports are generated, e.g. `--domain anomaly-detection --report Csv`. When not specified, all domains and reports run unchanged. The selected domain name is passed to report compilation scripts via the environment variable `ANALYSIS_DOMAIN`. Available domains can be found in the [domains/](./domains/) directory. + ### Notes - Be sure to use Java 21 for Neo4j v2025, Java 17 for v5 and Java 11 for v4. Details see [Neo4j System Requirements / Java](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java). @@ -144,6 +146,22 @@ without report generation use this command: ./../../scripts/analysis/analyze.sh --explore ``` +#### Only run the reports of one specific domain + +To only run the reports of a single analysis domain (vertical slice, no additional Python or Node.js dependencies for core reports): + +```shell +./../../scripts/analysis/analyze.sh --domain anomaly-detection +``` + +#### Only run the CSV reports of one specific domain + +To further narrow down to only one report type within a specific domain: + +```shell +./../../scripts/analysis/analyze.sh --domain anomaly-detection --report Csv +``` + ## Generate Markdown References ### Generate Cypher Reference diff --git a/GETTING_STARTED.md b/GETTING_STARTED.md index 97789e852..b95e0831a 100644 --- a/GETTING_STARTED.md +++ b/GETTING_STARTED.md @@ -118,6 +118,18 @@ Use these optional command line options as needed: ./../../scripts/analysis/analyze.sh --explore ``` +- Only run the reports of one specific domain (vertical slice): + + ```shell + ./../../scripts/analysis/analyze.sh --domain anomaly-detection + ``` + +- Only run the CSV reports of one specific domain: + + ```shell + ./../../scripts/analysis/analyze.sh --domain anomaly-detection --report Csv + ``` + 👉 Open your browser and login to your local Neo4j Web UI (`http://localhost:7474/browser`) with "neo4j" as user and the initial password 
you've chosen. ## GitHub Actions diff --git a/scripts/analysis/analyze.sh b/scripts/analysis/analyze.sh index 31dff6765..36d1e5955 100755 --- a/scripts/analysis/analyze.sh +++ b/scripts/analysis/analyze.sh @@ -24,6 +24,12 @@ # It activates "explore" mode where no reports are executed and Neo4j keeps running (skip stop step). # This makes it easy to just set everything up but then use the running Neo4j server to explore the data manually. +# Note: The argument "--domain" is optional. The default value is "" (empty = all domains run unchanged). +# It selects a single analysis domain (a subdirectory of "domains/") to run reports for, following a vertical-slice approach. +# When set, only that domain's report scripts run; core reports from "scripts/reports/" and other domains are skipped. +# The domain option can be combined with "--report" e.g. "--domain anomaly-detection --report Csv". +# Only a single domain can be selected. The domain name must match a subdirectory of the "domains" directory. + # Note: The script and its sub scripts are designed to be as efficient as possible # when it comes to subsequent executions. # Existing downloads, installations, scans and processes will be detected. @@ -44,24 +50,41 @@ LOG_GROUP_END=${LOG_GROUP_END:-"::endgroup::"} # Prefix to end a log group. 
Default: "::endgroup::". # Function to display script usage usage() { - echo "Usage: $0 [--report <reportName>] [--profile <profileName>] [--explore]" + echo "Usage: $0 [--report <reportName>] [--profile <profileName>] [--domain <domainName>] [--explore]" exit 1 } # Default values analysisReportCompilation="All" settingsProfile="Default" +selectedAnalysisDomain="" exploreMode=false +# Function to check if a parameter value is missing (either empty or another option starting with --) +is_missing_value_parameter() { + case "${2:-}" in + ''|--*) return 0 ;; # missing value + *) return 1 ;; # value is present + esac +} + # Parse command line arguments while [[ $# -gt 0 ]]; do key="$1" case $key in --report) + if is_missing_value_parameter "$1" "$2"; then + echo "analyze: Error: --report requires a value." + usage + fi analysisReportCompilation="$2" shift ;; --profile) + if is_missing_value_parameter "$1" "$2"; then + echo "analyze: Error: --profile requires a value." + usage + fi settingsProfile="$2" shift ;; @@ -69,6 +92,14 @@ while [[ $# -gt 0 ]]; do exploreMode=true shift ;; + --domain) + if is_missing_value_parameter "$1" "$2"; then + echo "analyze: Error: --domain requires a value." + usage + fi + selectedAnalysisDomain="$2" + shift + ;; *) echo "analyze: Error: Unknown option: ${key}" usage @@ -89,6 +120,16 @@ if ! [[ ${settingsProfile} =~ ^[-[:alnum:]]+$ ]]; then exit 1 fi +# Assure that the selected analysis domain only consists of letters, numbers, and hyphens (if specified). +if [ -n "${selectedAnalysisDomain}" ]; then + case "${selectedAnalysisDomain}" in + *[!A-Za-z0-9-]*) + echo "analyze: Error: Domain '${selectedAnalysisDomain}' can only contain letters, numbers, and hyphens." + exit 1 + ;; + esac +fi + # Check if there is something to scan and analyze if [ ! -d "${ARTIFACTS_DIRECTORY}" ] && [ ! -d "${SOURCE_DIRECTORY}" ] ; then echo "analyze: Neither ${ARTIFACTS_DIRECTORY} nor the ${SOURCE_DIRECTORY} directory exist. Please download artifacts/sources first." 
@@ -98,6 +139,7 @@ fi echo "${LOG_GROUP_START}Start Analysis" echo "analyze: analysisReportCompilation=${analysisReportCompilation}" echo "analyze: settingsProfile=${settingsProfile}" +echo "analyze: selectedAnalysisDomain=${selectedAnalysisDomain}" echo "analyze: exploreMode=${exploreMode}" ## Get this "scripts/analysis" directory if not already set @@ -111,6 +153,24 @@ echo "analyze: ANALYSIS_SCRIPT_DIR=${ANALYSIS_SCRIPT_DIR}" SCRIPTS_DIR=${SCRIPTS_DIR:-$(dirname -- "${ANALYSIS_SCRIPT_DIR}")} # Repository directory containing the shell scripts echo "analyze: SCRIPTS_DIR=${SCRIPTS_DIR}" +# Resolve the analysis domains directory. Can be overridden by the environment variable DOMAINS_DIRECTORY. +DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY:-"${SCRIPTS_DIR}/../domains"} +echo "analyze: DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY}" + +# When a specific analysis domain is selected, validate that it matches an existing subdirectory of the domains directory. +# ANALYSIS_DOMAIN is empty when no domain is selected, causing all domains to run unchanged. +ANALYSIS_DOMAIN="" +if [ -n "${selectedAnalysisDomain}" ]; then + if [ ! -d "${DOMAINS_DIRECTORY}/${selectedAnalysisDomain}" ]; then + availableAnalysisDomains=$(find "${DOMAINS_DIRECTORY}" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; 2>/dev/null | sort | tr '\n' ' ') + echo "analyze: Error: Selected domain '${selectedAnalysisDomain}' does not match any subdirectory in ${DOMAINS_DIRECTORY}." + echo "analyze: Available domains: ${availableAnalysisDomains}" + exit 1 + fi + ANALYSIS_DOMAIN="${selectedAnalysisDomain}" + echo "analyze: ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN}" +fi + # Assure that there is a report compilation script for the given report argument. REPORT_COMPILATION_SCRIPT="${SCRIPTS_DIR}/${REPORTS_SCRIPTS_DIRECTORY}/${REPORT_COMPILATIONS_SCRIPTS_DIRECTORY}/${analysisReportCompilation}Reports.sh" if [ ! 
-f "${REPORT_COMPILATION_SCRIPT}" ] ; then diff --git a/scripts/reports/compilations/CsvReports.sh b/scripts/reports/compilations/CsvReports.sh index cf49e43e7..ed933b41c 100755 --- a/scripts/reports/compilations/CsvReports.sh +++ b/scripts/reports/compilations/CsvReports.sh @@ -29,12 +29,21 @@ echo "${LOG_GROUP_START}$(date +'%Y-%m-%dT%H:%M:%S') Initialize CSV Reports"; echo "${SCRIPT_NAME}: REPORT_COMPILATIONS_SCRIPT_DIR=${REPORT_COMPILATIONS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: REPORTS_SCRIPT_DIR=${REPORTS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY}" +echo "${SCRIPT_NAME}: ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN}" echo "${LOG_GROUP_END}"; # Run all CSV report scripts (filename ending with Csv.sh) in the REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY directories. -for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"; do +# When a specific analysis domain is selected, only run reports for that domain's directory. +# Otherwise, run reports from both the general reports directory and all domains. +if [ -n "${ANALYSIS_DOMAIN}" ]; then + analysisReportScriptDirectories=( "${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}" ) +else + analysisReportScriptDirectories=( "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}" ) +fi + +for directory in "${analysisReportScriptDirectories[@]}"; do if [ ! -d "${directory}" ]; then - echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAIN_DIRECTORY settings." + echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY settings." 
exit 1 fi diff --git a/scripts/reports/compilations/JupyterReports.sh b/scripts/reports/compilations/JupyterReports.sh index 1fb8cd965..4db8e5530 100755 --- a/scripts/reports/compilations/JupyterReports.sh +++ b/scripts/reports/compilations/JupyterReports.sh @@ -36,6 +36,13 @@ echo "${SCRIPT_NAME}: SCRIPTS_DIR=${SCRIPTS_DIR}" echo "${SCRIPT_NAME}: JUPYTER_NOTEBOOK_DIRECTORY=${JUPYTER_NOTEBOOK_DIRECTORY}" echo "${LOG_GROUP_END}"; +# Jupyter Notebook reports are not domain-scoped. Skip them when a specific analysis domain is selected. +if [ -n "${ANALYSIS_DOMAIN}" ]; then + echo "${SCRIPT_NAME}: Skipping Jupyter Notebook reports because a specific analysis domain is selected (ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN})." + echo "${SCRIPT_NAME}: Jupyter Notebook reports are not domain-scoped and cannot be run for a specific domain." + return 0 2>/dev/null || exit 0 +fi + # Run all jupiter notebooks for jupyter_notebook_file in "${JUPYTER_NOTEBOOK_DIRECTORY}"/*.ipynb; do jupyter_notebook_filename=$(basename -- "${jupyter_notebook_file}") diff --git a/scripts/reports/compilations/MarkdownReports.sh b/scripts/reports/compilations/MarkdownReports.sh index 812d6d1f6..8d78c3342 100755 --- a/scripts/reports/compilations/MarkdownReports.sh +++ b/scripts/reports/compilations/MarkdownReports.sh @@ -29,12 +29,21 @@ echo "${LOG_GROUP_START}$(date +'%Y-%m-%dT%H:%M:%S') Initialize Markdown Reports echo "${SCRIPT_NAME}: REPORT_COMPILATIONS_SCRIPT_DIR=${REPORT_COMPILATIONS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: REPORTS_SCRIPT_DIR=${REPORTS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY}" +echo "${SCRIPT_NAME}: ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN}" echo "${LOG_GROUP_END}"; # Run all Markdown report scripts (filename ending with Markdown.sh or Summary.sh) in the REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY directories. 
-for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"; do +# When a specific analysis domain is selected, only run reports for that domain's directory. +# Otherwise, run reports from both the general reports directory and all domains. +if [ -n "${ANALYSIS_DOMAIN}" ]; then + analysisReportScriptDirectories=( "${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}" ) +else + analysisReportScriptDirectories=( "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}" ) +fi + +for directory in "${analysisReportScriptDirectories[@]}"; do if [ ! -d "${directory}" ]; then - echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAIN_DIRECTORY settings." + echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY settings." exit 1 fi diff --git a/scripts/reports/compilations/PythonReports.sh b/scripts/reports/compilations/PythonReports.sh index d2e02425a..c04fd543d 100755 --- a/scripts/reports/compilations/PythonReports.sh +++ b/scripts/reports/compilations/PythonReports.sh @@ -31,6 +31,7 @@ echo "${SCRIPT_NAME}: REPORT_COMPILATIONS_SCRIPT_DIR=${REPORT_COMPILATIONS_SCRIP echo "${SCRIPT_NAME}: REPORTS_SCRIPT_DIR=${REPORTS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: SCRIPTS_DIR=${SCRIPTS_DIR}" echo "${SCRIPT_NAME}: DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY}" +echo "${SCRIPT_NAME}: ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN}" # Create and activate (if necessary) a virtual environment (Conda or venv). # For Conda, the environment name is taken from the environment variable CODEGRAPH_CONDA_ENVIRONMENT (default "codegraph") @@ -42,7 +43,15 @@ time source "${SCRIPTS_DIR}/activatePythonEnvironment.sh" echo "${LOG_GROUP_END}"; # Run all Python report scripts (filename ending with Csv.sh) in the REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY directories. 
-for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"; do +# When a specific analysis domain is selected, only run reports for that domain's directory. +# Otherwise, run reports from both the general reports directory and all domains. +if [ -n "${ANALYSIS_DOMAIN}" ]; then + analysisReportScriptDirectories=( "${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}" ) +else + analysisReportScriptDirectories=( "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}" ) +fi + +for directory in "${analysisReportScriptDirectories[@]}"; do if [ ! -d "${directory}" ]; then echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAIN_DIRECTORY settings." exit 1 diff --git a/scripts/reports/compilations/VisualizationReports.sh b/scripts/reports/compilations/VisualizationReports.sh index dcb5ada27..4a4367341 100755 --- a/scripts/reports/compilations/VisualizationReports.sh +++ b/scripts/reports/compilations/VisualizationReports.sh @@ -30,12 +30,21 @@ echo "${LOG_GROUP_START}$(date +'%Y-%m-%dT%H:%M:%S') Initialize Visualization Re echo "${SCRIPT_NAME}: REPORT_COMPILATIONS_SCRIPT_DIR=${REPORT_COMPILATIONS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: REPORTS_SCRIPT_DIR=${REPORTS_SCRIPT_DIR}" echo "${SCRIPT_NAME}: DOMAINS_DIRECTORY=${DOMAINS_DIRECTORY}" +echo "${SCRIPT_NAME}: ANALYSIS_DOMAIN=${ANALYSIS_DOMAIN}" echo "${LOG_GROUP_END}"; # Run all visualization scripts (filename ending with Visualization.sh) in the REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY directories. -for directory in "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}"; do +# When a specific analysis domain is selected, only run reports for that domain's directory. +# Otherwise, run reports from both the general reports directory and all domains. 
+if [ -n "${ANALYSIS_DOMAIN}" ]; then + analysisReportScriptDirectories=( "${DOMAINS_DIRECTORY}/${ANALYSIS_DOMAIN}" ) +else + analysisReportScriptDirectories=( "${REPORTS_SCRIPT_DIR}" "${DOMAINS_DIRECTORY}" ) +fi + +for directory in "${analysisReportScriptDirectories[@]}"; do if [ ! -d "${directory}" ]; then - echo "PythonReports: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAIN_DIRECTORY settings." + echo "${SCRIPT_NAME}: Error: Directory ${directory} does not exist. Please check your REPORTS_SCRIPT_DIR and DOMAINS_DIRECTORY settings." exit 1 fi diff --git a/scripts/testAnalyzeDomainOption.sh b/scripts/testAnalyzeDomainOption.sh new file mode 100755 index 000000000..4d35d3858 --- /dev/null +++ b/scripts/testAnalyzeDomainOption.sh @@ -0,0 +1,210 @@ +#!/usr/bin/env bash + +# Tests "--domain" command line option of "analyze.sh". + +# Fail on any error ("-e" = exit on first error, "-o pipefail" exit on errors within piped commands) +set -o errexit -o pipefail + +# Local constants +SCRIPT_NAME=$(basename "${0}") +COLOR_ERROR='\033[0;31m' # red +COLOR_DE_EMPHASIZED='\033[0;90m' # dark gray +COLOR_SUCCESSFUL="\033[0;32m" # green +COLOR_DEFAULT='\033[0m' + +## Get this "scripts" directory if not already set +# Even if $BASH_SOURCE is made for Bourne-like shells it is also supported by others and therefore here the preferred solution. +# CDPATH reduces the scope of the cd command to potentially prevent unintended directory changes. +# This way non-standard tools like readlink aren't needed. +SCRIPTS_DIR=${SCRIPTS_DIR:-$( CDPATH=. cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P )} # Repository directory containing the shell scripts +# Derive the "scripts/analysis" directory always from this test's own path to avoid environment pollution. +ANALYSIS_SCRIPTS_DIR=$( CDPATH=. cd -- "${SCRIPTS_DIR}/analysis" && pwd -P ) + +tearDown() { + # echo "${SCRIPT_NAME}: Tear down tests...." 
+ rm -rf "${temporaryTestDirectory}" +} + +successful() { + echo "" + echo -e "${COLOR_DE_EMPHASIZED}${SCRIPT_NAME}:${COLOR_DEFAULT} ${COLOR_SUCCESSFUL}✅ Tests finished successfully.${COLOR_DEFAULT}" + tearDown + + # If sourced, return to caller; if executed directly, exit. + if [ "${BASH_SOURCE[0]}" != "$0" ]; then + return 0 + else + exit 0 + fi +} + +info() { + local infoMessage="${1}" + echo -e "${COLOR_DE_EMPHASIZED}${SCRIPT_NAME}:${COLOR_DEFAULT} ${infoMessage}" +} + +fail() { + local errorMessage="${1}" + echo -e "${COLOR_DE_EMPHASIZED}${SCRIPT_NAME}: ${COLOR_ERROR}${errorMessage}${COLOR_DEFAULT}" + tearDown + return 1 +} + +printTestLogFileContent() { + local logFileName="${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" + if [ -f "${logFileName}" ]; then + local logFileContent + logFileContent=$( cat "${logFileName}" ) + # Remove color codes from the output for better readability in test logs. + # Use "sed -E" with Bash ANSI-C quoting for the escape character: the GNU-only "sed -r" with "\x1B" would fail on BSD/macOS sed. + logFileContent=$(echo -e "${logFileContent}" | sed -E $'s/\x1B\\[[0-9;]*[mK]//g') + echo -e "${COLOR_DE_EMPHASIZED}${logFileContent}${COLOR_DEFAULT}" + else + echo -e "${COLOR_ERROR}No log file found at expected location: ${logFileName}${COLOR_DEFAULT}" + fi +} + +analyzeExpectingFailureUnderTest() { + set +o errexit + ( + cd "${temporaryTestDirectory}" + ARTIFACTS_DIRECTORY="${temporaryArtifactsDirectory}" + SCRIPTS_DIR="${temporaryMinimalScriptsDirectory}" + DOMAINS_DIRECTORY="${temporaryDomainsDirectory}" + source "${ANALYSIS_SCRIPTS_DIR}/analyze.sh" "$@" > "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>&1 + ) + exitCode=$? + set -o errexit + if [ ${exitCode} -eq 0 ]; then + fail "❌ Test failed: Script exited with zero exit code but was expected to fail." 
+ fi + printTestLogFileContent +} + +analyzeExpectingSuccessUnderTest() { + # Disable errexit around the subshell so that a non-zero exit code is captured and reported by fail() instead of silently aborting the whole test script. + set +o errexit + ( + cd "${temporaryTestDirectory}" + ARTIFACTS_DIRECTORY="${temporaryArtifactsDirectory}" + SCRIPTS_DIR="${temporaryMinimalScriptsDirectory}" + DOMAINS_DIRECTORY="${temporaryDomainsDirectory}" + source "${ANALYSIS_SCRIPTS_DIR}/analyze.sh" "$@" > "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>&1 + ) + exitCode=$? + set -o errexit + if [ ${exitCode} -ne 0 ]; then + fail "❌ Test failed: Script exited with non-zero exit code ${exitCode}." + fi + printTestLogFileContent +} + +info "Starting tests...." + +# Create test resources +temporaryTestDirectory=$(mktemp -d 2>/dev/null || mktemp -d -t 'temporaryTestDirectory') +mkdir -p "${temporaryTestDirectory}" + +# Create minimal artifacts directory so the "no artifacts" check in analyze.sh passes +temporaryArtifactsDirectory="${temporaryTestDirectory}/artifacts" +mkdir -p "${temporaryArtifactsDirectory}" + +# Create domains directory with one valid test domain subdirectory +temporaryDomainsDirectory="${temporaryTestDirectory}/domains" +mkdir -p "${temporaryDomainsDirectory}/valid-test-domain" + +# Create a minimal scripts directory structure to satisfy all file-existence checks in analyze.sh. +# The placeholder scripts are no-op bash scripts (a shebang line and a comment only) since analyze.sh sources them. 
+temporaryMinimalScriptsDirectory="${temporaryTestDirectory}/scripts" +mkdir -p "${temporaryMinimalScriptsDirectory}/reports/compilations" +mkdir -p "${temporaryMinimalScriptsDirectory}/profiles" +for placeholderScriptFile in \ + "reports/compilations/AllReports.sh" \ + "profiles/Default.sh" \ + "setupNeo4j.sh" \ + "setupJQAssistant.sh" \ + "startNeo4j.sh" \ + "resetAndScanChanged.sh" \ + "prepareAnalysis.sh" \ + "stopNeo4j.sh"; do + printf '#!/usr/bin/env bash\n# Minimal placeholder script for testing - does nothing\n' \ + > "${temporaryMinimalScriptsDirectory}/${placeholderScriptFile}" +done + +# -------- Test case 1 -------- +test_case_number=1 +echo "" +info "${test_case_number}.) Should fail when --domain contains characters that are not letters, numbers, or hyphens." + +analyzeExpectingFailureUnderTest --domain "../../etc" +if ! grep -q "can only contain letters" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about invalid domain characters but got different output." +fi + +# -------- Test case 2 -------- +test_case_number=2 +echo "" +info "${test_case_number}.) Should fail when --domain is a valid name but does not match any subdirectory in the domains directory." + +analyzeExpectingFailureUnderTest --domain "nonexistent-domain" +if ! grep -q "does not match" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about a non-existing domain but got different output." +fi + +# -------- Test case 3 -------- +test_case_number=3 +echo "" +info "${test_case_number}.) Should pass domain validation when --domain matches an existing subdirectory and set ANALYSIS_DOMAIN accordingly." + +analyzeExpectingSuccessUnderTest --domain "valid-test-domain" --explore +if ! 
grep -q "ANALYSIS_DOMAIN=valid-test-domain" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected ANALYSIS_DOMAIN to be set to 'valid-test-domain' in the log output." +fi + +# -------- Test case 4 -------- +test_case_number=4 +echo "" +info "${test_case_number}.) Should run unchanged (without domain filtering) when no --domain option is given." + +analyzeExpectingSuccessUnderTest --explore +if grep -q "ANALYSIS_DOMAIN=[^[:space:]]" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected ANALYSIS_DOMAIN to be empty when no domain option is given, but found a non-empty value in the log." +fi + +# -------- Test case 5 -------- +test_case_number=5 +echo "" +info "${test_case_number}.) Should fail when --domain is given as the last argument without a value." + +analyzeExpectingFailureUnderTest --domain +if ! grep -q "requires a value" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about --domain requiring a value but got different output." +fi + +# -------- Test case 6 -------- +test_case_number=6 +echo "" +info "${test_case_number}.) Should fail when --domain is followed by another option instead of a value." + +analyzeExpectingFailureUnderTest --domain --explore +if ! grep -q "requires a value" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about --domain requiring a value but got different output." +fi + +# -------- Test case 7 -------- +test_case_number=7 +echo "" +info "${test_case_number}.) Should fail when --report is given as the last argument without a value." + +analyzeExpectingFailureUnderTest --report +if ! 
grep -q "requires a value" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about --report requiring a value but got different output." +fi + +# -------- Test case 8 -------- +test_case_number=8 +echo "" +info "${test_case_number}.) Should fail when --profile is given as the last argument without a value." + +analyzeExpectingFailureUnderTest --profile +if ! grep -q "requires a value" "${temporaryTestDirectory}/${SCRIPT_NAME}-${test_case_number}.log" 2>/dev/null; then + fail "${test_case_number}.) Test failed: Expected an error message about --profile requiring a value but got different output." +fi + +successful
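Closing note on the option parsing exercised by test cases 5 through 8 above: the missing-value detection reduces to a single `case` pattern. The snippet below restates the `is_missing_value_parameter` helper from the analyze.sh hunk so it can be tried standalone:

```shell
#!/usr/bin/env bash
# The is_missing_value_parameter helper from analyze.sh, shown standalone.
# An option value counts as missing when it is empty or looks like the next option (starts with --).
is_missing_value_parameter() {
    case "${2:-}" in
        ''|--*) return 0 ;; # missing value
        *) return 1 ;; # value is present
    esac
}
```

One consequence of this pattern is that an option value can never itself start with `--`; for report, profile, and domain names that is a safe trade-off.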