Commit 157eaf5

Add external dependencies analysis modularized as domain
1 parent 42ef1ec commit 157eaf5

46 files changed

Lines changed: 6097 additions & 1 deletion

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# External Dependencies Domain

This directory contains the implementation and resources for analysing external dependencies within the Code Graph Analysis Pipeline. It follows the vertical-slice domain pattern: all Cypher queries, Python chart scripts, and report templates needed for this analysis live here.

## Entry Points

The following scripts are discovered by filename pattern and invoked automatically by the central compilation scripts in [scripts/reports/compilations/](../../scripts/reports/compilations/).

- [externalDependenciesCsv.sh](./externalDependenciesCsv.sh): Entry point for CSV reports based on Cypher queries. Discovered by `CsvReports.sh` (`*Csv.sh` pattern).
- [externalDependenciesPython.sh](./externalDependenciesPython.sh): Entry point for Python-based chart generation. Discovered by `PythonReports.sh` (`*Python.sh` pattern).
- [externalDependenciesMarkdown.sh](./externalDependenciesMarkdown.sh): Entry point for the Markdown summary report. Discovered by `MarkdownReports.sh` (`*Markdown.sh` pattern).
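
The pattern-based discovery described above can be illustrated with a minimal Python sketch. The real discovery is performed by the shell compilation scripts; the function name `discover_entry_points`, the `domains_root` parameter, and the recursive search are assumptions made here for illustration only.

```python
from pathlib import Path


def discover_entry_points(domains_root: Path, suffix: str) -> list[Path]:
    """Return every file below domains_root whose name ends with suffix.

    Hypothetical helper: it mimics a `*Csv.sh`-style filename match and is
    not the pipeline's actual implementation.
    """
    # rglob walks all sub-directories; sorting keeps the order deterministic.
    return sorted(p for p in domains_root.rglob(f"*{suffix}") if p.is_file())


if __name__ == "__main__":
    # Example: list all CSV entry points below the current directory.
    for script in discover_entry_points(Path("."), "Csv.sh"):
        print(script)
```

Because discovery depends only on the filename suffix, adding a new report type to a domain amounts to adding a script whose name matches one of the three patterns.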

## Folder Structure

- [explore](./explore/): Original Jupyter notebooks for interactive, exploratory analysis. Marked with `code_graph_analysis_pipeline_data_validation: ValidateAlwaysFalse` so they are not automatically executed by the pipeline.
- [queries](./queries/): All Cypher queries for identifying and quantifying external dependencies. These are self-contained copies from [cypher/External_Dependencies/](../../cypher/External_Dependencies/).
- [summary](./summary/): Markdown template and assembly script for the summary report.

## Prerequisites

This domain requires the following to be in place before running. These are provided by the central pipeline ([scripts/prepareAnalysis.sh](../../scripts/prepareAnalysis.sh)) and are **not** set up by this domain.

### Graph database

- Neo4j must be running and accessible at `bolt://localhost:7687`.
- `NEO4J_INITIAL_PASSWORD` environment variable must be set.
- Artifacts must already have been scanned and loaded by jQAssistant, creating `DEPENDS_ON` relationships between `Type` nodes.
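
The first two prerequisites can be checked before a run with a small standard-library sketch. The function names `neo4j_port_is_open` and `missing_prerequisites` are hypothetical, and the check is deliberately shallow: it only tests that the TCP port accepts connections, not that Bolt authentication succeeds or that jQAssistant data has been loaded.

```python
import os
import socket


def neo4j_port_is_open(host: str = "localhost", port: int = 7687, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the Bolt port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_prerequisites(env=os.environ, port_check=neo4j_port_is_open) -> list[str]:
    """Collect human-readable descriptions of unmet prerequisites.

    Hypothetical helper; env and port_check are injectable so the logic
    can be exercised without a running database.
    """
    problems = []
    if not env.get("NEO4J_INITIAL_PASSWORD"):
        problems.append("NEO4J_INITIAL_PASSWORD is not set")
    if not port_check():
        problems.append("Neo4j is not reachable at bolt://localhost:7687")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```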

### Node labels (Java)

The following labels must exist on `Type` nodes before external dependency analysis can run. They are created by Cypher queries in [cypher/Types/](../../cypher/Types/):

- `PrimitiveType`: primitive types like `int`, `boolean`
- `Void`: void return type
- `JavaType`: built-in Java standard library types (e.g. `java.lang.*`, `java.util.*`)
- `ResolvedDuplicateType`: deduplicated types that appear in multiple jars

Without these labels, `Label_external_types_and_annotations.cypher` cannot correctly distinguish external types from internal and built-in ones.
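
One way to spot a missing labelling step is to count the `Type` nodes carrying each label. The sketch below only builds the Cypher text and interprets counts obtained elsewhere; the helper names and the `nodeCount` alias are assumptions, and running the query against Neo4j is left out. A zero count is only a hint: a code base without duplicate types, for example, may legitimately have no `ResolvedDuplicateType` nodes.

```python
REQUIRED_TYPE_LABELS = ("PrimitiveType", "Void", "JavaType", "ResolvedDuplicateType")


def label_count_query(label: str) -> str:
    """Build a Cypher query counting Type nodes that also carry the given label."""
    # Labels cannot be passed as query parameters, so only known names are accepted.
    if label not in REQUIRED_TYPE_LABELS:
        raise ValueError(f"unexpected label: {label}")
    return f"MATCH (t:Type:{label}) RETURN count(t) AS nodeCount"


def missing_labels(counts: dict[str, int]) -> list[str]:
    """Return the required labels whose node count is zero or absent."""
    return [label for label in REQUIRED_TYPE_LABELS if counts.get(label, 0) == 0]


if __name__ == "__main__":
    for label in REQUIRED_TYPE_LABELS:
        print(label_count_query(label))
```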

### Relationship weight properties (Java)

The following properties must exist on `DEPENDS_ON` relationships between `Package` nodes. They are set by Cypher queries in [cypher/DependsOn_Relationship_Weights/](../../cypher/DependsOn_Relationship_Weights/):

- `weight`: sum of type-level dependency weights between two packages
- `weightInterfaces`: subset of `weight` attributable to interface dependencies
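
To make the two properties concrete, the following sketch aggregates type-level dependencies into package-level totals. It is an illustration in Python, not the Cypher used by the pipeline; the `TypeDependency` record and the reading of "interface dependency" as "the target type is an interface" are assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class TypeDependency(NamedTuple):
    """Hypothetical record of one type-level DEPENDS_ON relationship."""
    source_package: str
    target_package: str
    weight: int
    target_is_interface: bool  # assumption: marks an interface dependency


def package_weights(dependencies):
    """Sum type-level weights per (source package, target package) pair."""
    totals = defaultdict(lambda: {"weight": 0, "weightInterfaces": 0})
    for dep in dependencies:
        entry = totals[(dep.source_package, dep.target_package)]
        entry["weight"] += dep.weight
        if dep.target_is_interface:
            # weightInterfaces is the part of weight that targets interfaces.
            entry["weightInterfaces"] += dep.weight
    return dict(totals)
```

Under these assumptions `weightInterfaces` can never exceed `weight`, which matches its description as a subset.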

### TypeScript enrichment

For TypeScript projects, the following must be completed by [cypher/Typescript_Enrichment/](../../cypher/Typescript_Enrichment/):

- Module properties set on `Module` nodes: `namespace`, `moduleName`, `isNodeModule`, `isExternalImport`
- `IS_IMPLEMENTED_IN` relationships linking `ExternalModule` nodes to their resolved internal `Module` nodes
- `DEPENDS_ON` relationships propagated to resolved modules
- NPM packages linked to their corresponding `ExternalModule` nodes via `PROVIDED_BY_NPM_DEPENDENCY`

### General enrichment

- `name` and `extension` properties on `File` nodes, set by [cypher/General_Enrichment/](../../cypher/General_Enrichment/).
60+
## What This Domain Produces
61+
62+
### CSV reports (`reports/external-dependencies/`)
63+
64+
One CSV file per Cypher query covering:
65+
66+
- **Java packages**: overall usage, spread, per-artifact, per-type, distribution, aggregated, Maven POM declared dependencies
67+
- **TypeScript modules and namespaces**: overall usage, spread, per-internal-module, aggregated
68+
- **Package management**: `package.json` dependency occurrence and combinations
69+
70+
### SVG charts (`reports/external-dependencies/`)
71+
72+
Python-generated charts from [externalDependencyCharts.py](./externalDependencyCharts.py):
73+
74+
- **Java**: pie charts for most-used and most-spread packages (by types and by packages, with drill-down into "others"), stacked bar charts for per-artifact breakdown, scatter plots for aggregated usage patterns
75+
- **TypeScript**: pie charts for modules and namespaces (usage and spread, with drill-down)
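
The drill-down into "others" can be pictured as splitting the data into a main pie and a second pie for the small slices. The sketch below shows one plausible way to do that split; the function name `split_for_pie` and the 5 % default threshold are assumptions and are not taken from `externalDependencyCharts.py`.

```python
def split_for_pie(usage: dict[str, int], threshold: float = 0.05):
    """Split usage counts into main slices plus an 'others' drill-down.

    Hypothetical helper: entries below the given share of the total are
    moved to the drill-down and summarised as a single 'others' slice.
    """
    total = sum(usage.values())
    main, others = {}, {}
    for name, count in sorted(usage.items(), key=lambda item: item[1], reverse=True):
        target = main if total and count / total >= threshold else others
        target[name] = count
    if others:
        main["others"] = sum(others.values())
    return main, others


if __name__ == "__main__":
    main, others = split_for_pie({"org.a": 60, "org.b": 30, "org.c": 6, "org.d": 4}, threshold=0.10)
    print(main)    # slices for the overview pie, including "others"
    print(others)  # slices for the drill-down pie
```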

### Markdown summary (`reports/external-dependencies/external_dependencies_report.md`)

A structured report designed to be readable by both humans and AI agents. Includes key findings, tables, chart references, and architectural recommendations such as Hexagonal Architecture patterns and Anti-Corruption Layer candidates.
