Introduce git-history domain #559

Merged: JohT merged 2 commits into main from feature/introduce-git-history-domain on Apr 20, 2026

@JohT (Owner) commented on Apr 16, 2026

@JohT self-assigned this on Apr 16, 2026
@JohT force-pushed the feature/introduce-git-history-domain branch from befb3f8 to ca4db88 on April 16, 2026 19:59
@JohT requested a review from Copilot on April 18, 2026 13:10
Copilot AI (Contributor) left a comment

Pull request overview

Introduces a new git-history vertical-slice domain that encapsulates git-history import, enrichment/validation/statistics Cypher queries, CSV + Markdown report generation, and Python-based chart generation, with supporting documentation and exploration notebooks.

Changes:

  • Added domains/git-history/ with entrypoint scripts for CSV, Markdown summary, and Python chart reports.
  • Added git-history Cypher query sets (enrichment/statistics/validation) plus report templates and assembly logic.
  • Added domain-local git import scripts and noted current core→domain dependency tension in scripts/resetAndScan.sh.

Reviewed changes

Copilot reviewed 59 out of 59 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| `scripts/resetAndScan.sh` | Adds TODO note about current importGit.sh sourcing and ownership direction. |
| `domains/git-history/summary/report_no_git_data.template.md` | Fallback template when git history data isn’t available. |
| `domains/git-history/summary/report.template.md` | Markdown report template with include placeholders for tables/charts. |
| `domains/git-history/summary/gitHistorySummary.sh` | Assembles Markdown report by generating include files from Cypher + SVGs. |
| `domains/git-history/queries/validation/Verify_git_to_code_file_unambiguous.cypher` | Validation query for ambiguous git→code resolution. |
| `domains/git-history/queries/validation/Verify_git_missing_create_date.cypher` | Validation query for missing createdAtEpoch on git files. |
| `domains/git-history/queries/validation/Verify_git_missing_CHANGED_TOGETHER_WITH_properties.cypher` | Validation query for missing relationship properties. |
| `domains/git-history/queries/validation/Verify_code_to_git_file_unambiguous.cypher` | Validation query for ambiguous code→git resolution. |
| `domains/git-history/queries/validation/ValidateGitHistory.cypher` | Minimal presence check for git-history graph data. |
| `domains/git-history/queries/statistics/Words_for_git_author_Wordcloud_with_frequency.cypher` | Statistics query for author wordcloud input. |
| `domains/git-history/queries/statistics/List_unresolved_git_files.cypher` | Statistics query for unresolved code files vs git data. |
| `domains/git-history/queries/statistics/List_pairwise_changed_files_with_dependencies.cypher` | Statistics query correlating co-change pairs with dependencies. |
| `domains/git-history/queries/statistics/List_pairwise_changed_files_top_selected_metric.cypher` | Statistics query for top co-change pairs by selected metric. |
| `domains/git-history/queries/statistics/List_pairwise_changed_files.cypher` | Statistics query listing co-changed file pairs and metrics. |
| `domains/git-history/queries/statistics/List_git_files_with_commit_statistics_by_author.cypher` | Statistics query for per-file commit stats by author. |
| `domains/git-history/queries/statistics/List_git_files_that_were_changed_together_with_another_file_all_in_one.cypher` | Statistics query for “all-in-one commit” co-change aggregation. |
| `domains/git-history/queries/statistics/List_git_files_that_were_changed_together_with_another_file.cypher` | Statistics query for file-level co-change partner rates. |
| `domains/git-history/queries/statistics/List_git_files_that_were_changed_together_all_in_one.cypher` | Statistics query for co-change pairs from large commits. |
| `domains/git-history/queries/statistics/List_git_files_that_were_changed_together.cypher` | Statistics query for basic co-changed file pairs. |
| `domains/git-history/queries/statistics/List_git_files_per_commit_distribution.cypher` | Statistics query for files-changed-per-commit distribution. |
| `domains/git-history/queries/statistics/List_git_files_by_resolved_label_and_extension.cypher` | Statistics query for resolved/unresolved git files by extension. |
| `domains/git-history/queries/statistics/List_git_file_directories_with_commit_statistics.cypher` | Statistics query for directory-level commit/author stats. |
| `domains/git-history/queries/statistics/List_ambiguous_git_files.cypher` | Statistics query for ambiguous file resolution cases. |
| `domains/git-history/queries/enrichment/Set_number_of_git_plugin_update_commits.cypher` | Enrichment to set updateCommitCount on git/code files (plugin mode). |
| `domains/git-history/queries/enrichment/Set_number_of_git_plugin_commits.cypher` | Enrichment to set numberOfGitCommits from plugin-provided commits. |
| `domains/git-history/queries/enrichment/Set_number_of_git_log_commits.cypher` | Enrichment to set numberOfGitCommits from imported git log commits. |
| `domains/git-history/queries/enrichment/Set_number_of_aggregated_git_commits.cypher` | Enrichment to set numberOfGitCommits from aggregated change spans. |
| `domains/git-history/queries/enrichment/Set_commit_classification_properties.cypher` | Enrichment to classify commits (merge/bot/maven/automated/manual). |
| `domains/git-history/queries/enrichment/Index_file_relative_path.cypher` | Adds index intended for File.relativePath. |
| `domains/git-history/queries/enrichment/Index_file_name.cypher` | Adds index for File.fileName. |
| `domains/git-history/queries/enrichment/Index_commit_sha.cypher` | Adds index intended for Commit.sha. |
| `domains/git-history/queries/enrichment/Index_commit_parent.cypher` | Adds index for Commit.parent. |
| `domains/git-history/queries/enrichment/Index_commit_hash.cypher` | Adds index for Commit.hash. |
| `domains/git-history/queries/enrichment/Index_change_span_year.cypher` | Adds index for ChangeSpan.year. |
| `domains/git-history/queries/enrichment/Index_author_name.cypher` | Adds index for Author.name. |
| `domains/git-history/queries/enrichment/Index_absolute_file_name.cypher` | Adds index for File.absoluteFileName. |
| `domains/git-history/queries/enrichment/Import_git_log_csv_data.cypher` | Imports full git log CSV to graph schema. |
| `domains/git-history/queries/enrichment/Import_aggregated_git_log_csv_data.cypher` | Imports aggregated git log CSV to graph schema. |
| `domains/git-history/queries/enrichment/Delete_plain_git_directory_file_nodes.cypher` | Deletes scanned /.git directory file nodes to reduce noise. |
| `domains/git-history/queries/enrichment/Delete_git_log_data.cypher` | Deletes existing git data from the graph. |
| `domains/git-history/queries/enrichment/Create_git_repository_node.cypher` | Creates/merges Git:Repository node for a repo import. |
| `domains/git-history/queries/enrichment/Add_RESOLVES_TO_relationships_to_git_files_for_Typescript.cypher` | Attempts git↔TypeScript file resolution via path suffix matching. |
| `domains/git-history/queries/enrichment/Add_RESOLVES_TO_relationships_to_git_files_for_Java.cypher` | Attempts git↔Java file resolution via simplified suffix matching. |
| `domains/git-history/queries/enrichment/Add_HAS_PARENT_relationships_to_commits.cypher` | Adds parent relationships between commits. |
| `domains/git-history/queries/enrichment/Add_CHANGED_TOGETHER_WITH_relationships_to_git_files.cypher` | Creates CHANGED_TOGETHER_WITH between git files + metrics. |
| `domains/git-history/queries/enrichment/Add_CHANGED_TOGETHER_WITH_relationships_to_code_files.cypher` | Propagates co-change relationships to resolved code files. |
| `domains/git-history/import/importGit.sh` | Domain-local orchestrator for git import/enrichment (plugin/full/aggregated). |
| `domains/git-history/import/createGitLogCsv.sh` | Generates full git log CSV from a repository. |
| `domains/git-history/import/createAggregatedGitLogCsv.sh` | Generates aggregated git log CSV from a repository. |
| `domains/git-history/gitHistoryPython.sh` | Entrypoint to generate SVG charts via Python; can bootstrap CSVs. |
| `domains/git-history/gitHistoryMarkdown.sh` | Entrypoint delegating to Markdown summary assembly script. |
| `domains/git-history/gitHistoryCsv.sh` | Entrypoint to generate git-history CSV outputs from statistics queries. |
| `domains/git-history/gitHistoryCharts.py` | Python chart generator (treemaps, histograms, bar chart, wordcloud). |
| `domains/git-history/explore/GitHistoryGeneralExploration.ipynb` | Exploration notebook (general git-history visuals). |
| `domains/git-history/explore/GitHistoryCorrelationExploration.ipynb` | Exploration notebook (correlation analysis). |
| `domains/git-history/README.md` | Domain overview, entrypoints, structure, outputs. |
| `domains/git-history/PREREQUISITES.md` | Domain prerequisites and required upstream pipeline steps. |
| `domains/git-history/COPIED_FILES.md` | Mapping of original→copied files for future deprecation cleanup. |
| `.github/prompts/plan-git-history-domain.prompt.md` | Planning prompt documenting intended implementation steps/decisions. |
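
Several of the statistics and enrichment queries listed above deal with files that were changed together in the same commit. As a rough illustration of that idea only, and not the Cypher used in this pull request, the following Python sketch counts co-changed file pairs. The function name `count_co_changes`, the `max_files_per_commit` cutoff, and the sample data are all hypothetical assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def count_co_changes(commits, max_files_per_commit=50):
    """Count how often each unordered pair of files appears in the same commit.

    `commits` maps a commit identifier to the list of file paths it touched.
    `max_files_per_commit` is an assumed cutoff for this sketch: larger
    commits are skipped so that bulk changes do not dominate the counts.
    """
    pair_counts = Counter()
    for files in commits.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        unique_files = sorted(set(files))
        if len(unique_files) > max_files_per_commit:
            continue
        pair_counts.update(combinations(unique_files, 2))
    return pair_counts


# Hypothetical sample data, not taken from the repository.
example_commits = {
    "a1": ["src/App.java", "src/Util.java"],
    "b2": ["src/App.java", "src/Util.java", "README.md"],
    "c3": ["README.md"],
}
co_changes = count_co_changes(example_commits)
```

In the domain itself this kind of aggregation is expressed as graph relationships (`CHANGED_TOGETHER_WITH`) rather than in-memory counters; the sketch is only meant to make the underlying metric concrete.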

Comment thread domains/git-history/queries/enrichment/Index_file_relative_path.cypher Outdated
Comment thread domains/git-history/queries/enrichment/Index_commit_sha.cypher Outdated
Comment thread domains/git-history/queries/validation/Verify_git_missing_create_date.cypher Outdated
Comment thread domains/git-history/README.md Outdated
Comment thread domains/git-history/PREREQUISITES.md Outdated
Comment thread domains/git-history/queries/enrichment/Import_aggregated_git_log_csv_data.cypher Outdated
Comment thread domains/git-history/import/importGit.sh Outdated
Comment thread domains/git-history/queries/statistics/List_ambiguous_git_files.cypher Outdated
Comment thread domains/git-history/summary/gitHistorySummary.sh
Comment thread scripts/resetAndScan.sh Outdated
@JohT force-pushed the feature/introduce-git-history-domain branch from 431dcf2 to a752514 on April 19, 2026 07:14
@JohT marked this pull request as ready for review on April 19, 2026 07:37
@JohT force-pushed the feature/introduce-git-history-domain branch from a752514 to 2b9ddf0 on April 19, 2026 08:06
@JohT force-pushed the feature/introduce-git-history-domain branch from 2b9ddf0 to f4e074f on April 19, 2026 08:40
@JohT merged commit 25a46e5 into main on Apr 20, 2026
11 checks passed
@JohT deleted the feature/introduce-git-history-domain branch on April 20, 2026 08:31