Add project name to code unit nodes during enrichment #622

Merged: JohT merged 4 commits into main from feature/enrich-project-name on Jul 1, 2026
JohT (Owner) commented on Jun 29, 2026

JohT changed the title from "Feature/enrich project name" to "Add project name to code unit nodes during enrichment" on Jun 29, 2026
JohT self-assigned this on Jun 29, 2026
JohT marked this pull request as ready for review on June 29, 2026 14:43
JohT requested a review from Copilot on June 29, 2026 14:46

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a normalized projectName property to Java/TypeScript “code unit” nodes during analysis preparation, then refactors multiple downstream Cypher queries (node-embeddings, anomaly-detection, archetypes) to read codeUnit.projectName directly instead of repeating OPTIONAL MATCH traversal logic.

Changes:

  • Add two new General Enrichment Cypher queries to set projectName on Java and TypeScript code units, and wire them into scripts/prepareAnalysis.sh.
  • Refactor many reporting/embedding/anomaly Cypher queries to use coalesce(codeUnit.projectName, '') in result projections.
  • Update an exploratory anomaly-detection notebook (currently producing a very large diff).
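
The first two changes above can be sketched in Cypher roughly as follows. This is an illustrative sketch under stated assumptions, not the queries from this PR: the labels `Artifact` and `Type`, the `CONTAINS` relationship, and the `name` property are hypothetical placeholders for whatever the actual graph schema uses. Only the `projectName` property and the `coalesce(codeUnit.projectName, '')` projection are taken from the overview.

```cypher
// Hypothetical enrichment step: persist the owning project's name on each code unit.
// The labels, the relationship type and the "name" property are assumptions.
MATCH (project:Artifact)-[:CONTAINS]->(codeUnit:Type)
  SET codeUnit.projectName = project.name
RETURN count(codeUnit) AS enrichedCodeUnits;

// Downstream query after the refactor: read the stored property directly
// instead of repeating an OPTIONAL MATCH traversal to the project node.
MATCH (codeUnit:Type)
RETURN codeUnit.name                      AS codeUnitName
      ,coalesce(codeUnit.projectName, '') AS projectName;
```

Because `coalesce` returns its first non-null argument, code units that were not reached by the enrichment step would yield an empty string rather than `null` in the result column.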

Reviewed changes

Copilot reviewed 33 out of 35 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scripts/prepareAnalysis.sh | Runs the new projectName enrichment queries during preparation. |
| cypher/General_Enrichment/Set_projectName_for_Java_code_units.cypher | New enrichment query to persist projectName onto Java code units. |
| cypher/General_Enrichment/Set_projectName_for_Typescript_modules.cypher | New enrichment query to persist projectName onto TS modules from TS projects. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_4d_GraphSAGE_Stream.cypher | Streams embeddings with projectName read from codeUnit.projectName. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_3d_Node2Vec_Tuneable_Stream.cypher | Same refactor for Node2Vec tuneable streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_3d_Node2Vec_Stream.cypher | Same refactor for Node2Vec streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_2d_Hash_GNN_Tuneable_Stream.cypher | Same refactor for HashGNN tuneable streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_2d_Hash_GNN_Stream.cypher | Same refactor for HashGNN streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_1d_Fast_Random_Projection_Tuneable_Stream.cypher | Same refactor for FastRP tuneable streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_1d_Fast_Random_Projection_Stream.cypher | Same refactor for FastRP streaming. |
| domains/node-embeddings/queries/node-embeddings/Node_Embeddings_0a_Query_Calculated.cypher | Same refactor for calculated embeddings queries. |
| domains/archetypes/queries/AnomalyDetectionUnexpectedCentralNodes.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionSilentCoordinators.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionPotentialOverEngineerOrIsolated.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionPotentialImbalancedRoles.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionPopularBottlenecks.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionOverReferencesUtilities.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionHiddenBridgeNodes.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionFragileStructuralBridges.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/queries/AnomalyDetectionDependencyHungryOrchestrators.cypher | Uses codeUnit.projectName instead of traversal-derived values. |
| domains/archetypes/labels/ArchetypeHub.cypher | Uses codeUnit.projectName for label result context. |
| domains/archetypes/labels/ArchetypeBottleneck.cypher | Uses codeUnit.projectName for label result context. |
| domains/archetypes/labels/ArchetypeAuthority.cypher | Uses codeUnit.projectName for label result context. |
| domains/anomaly-detection/summary/AnomalyDeepDiveTopAnomalies.cypher | Uses codeUnit.projectName for summary output. |
| domains/anomaly-detection/queries/node-embeddings/Node_Embeddings_3d_Node2Vec_Tuneable_Stream.cypher | Uses codeUnit.projectName in anomaly-detection embedding stream output. |
| domains/anomaly-detection/queries/node-embeddings/Node_Embeddings_2d_Hash_GNN_Tuneable_Stream.cypher | Uses codeUnit.projectName in anomaly-detection embedding stream output. |
| domains/anomaly-detection/queries/node-embeddings/Node_Embeddings_1d_Fast_Random_Projection_Tuneable_Stream.cypher | Uses codeUnit.projectName in anomaly-detection embedding stream output. |
| domains/anomaly-detection/queries/node-embeddings/Node_Embeddings_0a_Query_Calculated.cypher | Uses codeUnit.projectName in anomaly-detection calculated embedding output. |
| domains/anomaly-detection/labels/AnomalyDetectionTopAnomalies.cypher | Uses codeUnit.projectName in label output. |
| domains/anomaly-detection/labels/AnomalyDetectionArchetypeOutlier.cypher | Uses codeUnit.projectName in label output. |
| domains/anomaly-detection/labels/AnomalyDetectionArchetypeBridge.cypher | Uses codeUnit.projectName in label output. |
| domains/anomaly-detection/features/AnomalyDetectionFeatures.cypher | Uses codeUnit.projectName in feature export output. |
| domains/anomaly-detection/explore/AnomalyDetectionIsolationForestExploration.ipynb | Updates notebook content related to project name selection (currently malformed). |
| .github/prompts/plan-addProjectNameEnrichment.prompt.md | Adds a planning document describing the enrichment/refactor approach and rationale. |

Comment thread on scripts/prepareAnalysis.sh (outdated)
JohT force-pushed the feature/enrich-project-name branch from 7703709 to 7ac22a8 on June 30, 2026 05:59
JohT merged commit fceb06e into main on Jul 1, 2026 (10 checks passed)
JohT deleted the feature/enrich-project-name branch on July 1, 2026 05:32