| 1 | +# Plan: Internal Dependencies Domain — Additional Reports |
| 2 | + |
| 3 | +## Domain Fit Assessment |
| 4 | + |
| 5 | +| Addition | Fits? | Rationale | |
| 6 | +|---|---|---| |
| 7 | +| DependenciesGraphExploration (Java/TS) | ✅ YES | Visualizes internal dependency hierarchy — core domain content | |
| 8 | +| OOP Design Metrics | ✅ YES | Instability + Abstractness measure coupling quality — dependency design metrics defined in terms of dependency ratios | |
| 9 | +| Visibility Metrics | ✅ YES | Public API surface encapsulation shapes how modules/packages can depend on each other | |
| 10 | +| Wordcloud (code names) | ⚠️ BORDERLINE | General vocabulary analysis, not dependency data; no better domain exists currently | |
| 11 | + |
| 12 | +--- |
| 13 | + |
| 14 | +## Extends: plan-internal_dependencies_domain.prompt.md |
| 15 | + |
| 16 | +All steps below are ADDITIONS to that plan. Phase and step numbering continues from the original plan.
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## New Files (delta from original plan) |
| 21 | + |
| 22 | +**New Python scripts** (in `domains/internal-dependencies/`): |
| 23 | +- `objectOrientedDesignMetricsCharts.py` — scatter + bar charts for instability/abstractness/main sequence |
| 24 | +- `visibilityMetricsCharts.py` — scatter subplots for visibility percentiles |
| 25 | +- `wordcloudChart.py` — code unit names wordcloud as SVG |
| 26 | + |
| 27 | +**New query directories**: |
| 28 | +- `queries/ood-metrics/` — 29 files from `cypher/Metrics/` |
| 29 | +- `queries/visibility/` — 4 files from `cypher/Visibility/` |
| 30 | +- Add to `queries/exploration/` — `Words_for_universal_Wordcloud.cypher` |
| 31 | + |
| 32 | +**New explore notebooks** (7 additional, copied from `jupyter/`):
| 33 | +- `explore/DependenciesGraphExplorationJava.ipynb` |
| 34 | +- `explore/DependenciesGraphExplorationTypescript.ipynb` |
| 35 | +- `explore/ObjectOrientedDesignMetricsJava.ipynb` |
| 36 | +- `explore/ObjectOrientedDesignMetricsTypescript.ipynb` |
| 37 | +- `explore/VisibilityMetricsJava.ipynb` |
| 38 | +- `explore/VisibilityMetricsTypescript.ipynb` |
| 39 | +- `explore/Wordcloud.ipynb` |
| 40 | + |
| 41 | +**Updated files**: |
| 42 | +- `internalDependenciesCsv.sh` — add OOD metrics + visibility metrics sections |
| 43 | +- `internalDependenciesPython.sh` — call 3 new chart scripts |
| 44 | +- `summary/report.template.md` — add OOD metrics, visibility, wordcloud sections |
| 45 | +- `COPIED_FILES.md` — add all new original→copy mappings |
| 46 | + |
| 47 | +--- |
| 48 | + |
| 49 | +## Steps |
| 50 | + |
| 51 | +### Phase 1 Extension: Cypher Queries |
| 52 | + |
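| 53 | +**1.10** Copy 29 files from `cypher/Metrics/` into `queries/ood-metrics/` (the 28 listed below plus 1 still to be determined from `scripts/reports/ObjectOrientedDesignMetricsCsv.sh`):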
| 54 | +- Java (without subpackages): `Get_Incoming_Java_Package_Dependencies.cypher`, `Set_Incoming_Java_Package_Dependencies.cypher`, `Get_Outgoing_Java_Package_Dependencies.cypher`, `Set_Outgoing_Java_Package_Dependencies.cypher`, `Get_Instability_for_Java.cypher`, `Calculate_and_set_Instability_for_Java.cypher`, `Get_Abstractness_for_Java.cypher`, `Calculate_and_set_Abstractness_for_Java.cypher`, `Calculate_distance_between_abstractness_and_instability_for_Java.cypher` |
| 55 | +- Java (including subpackages): same 9 files with `_Including_Subpackages` / `_including_subpackages` suffix variants |
| 56 | +- TypeScript: `Get_Incoming_Typescript_Module_Dependencies.cypher`, `Set_Incoming_Typescript_Module_Dependencies.cypher`, `Get_Outgoing_Typescript_Module_Dependencies.cypher`, `Set_Outgoing_Typescript_Module_Dependencies.cypher`, `Get_Instability_for_Typescript.cypher`, `Calculate_and_set_Instability_for_Typescript.cypher`, `Get_Abstractness_for_Typescript.cypher`, `Calculate_and_set_Abstractness_for_Typescript.cypher`, `Calculate_distance_between_abstractness_and_instability_for_Typescript.cypher` |
| 57 | +- Shared prerequisite: `Count_and_set_abstract_types.cypher` (required before abstractness calculation) |
| 58 | + |
| 59 | +Verify the exact file count with `ls cypher/Metrics/ | wc -l` so that no files are missed. The CSV script uses a specific subset and the notebooks use more, so copy every file needed by either.
| 60 | + |
| 61 | +**1.11** Copy all 4 files from `cypher/Visibility/` into `queries/visibility/`: |
| 62 | +- `Global_relative_visibility_statistics_for_types.cypher` |
| 63 | +- `Relative_visibility_public_types_to_all_types_per_package.cypher` |
| 64 | +- `Global_relative_visibility_statistics_for_elements_for_Typescript.cypher` |
| 65 | +- `Relative_visibility_exported_elements_to_all_elements_per_module_for_Typescript.cypher` |
| 66 | + |
| 67 | +**1.12** Copy `cypher/Overview/Words_for_universal_Wordcloud.cypher` → `queries/exploration/` (explore notebook reference + wordcloud chart). |
| 68 | + |
| 69 | +### Phase 2 Extension: CSV Entry Point |
| 70 | + |
| 71 | +**2.x** Extend `internalDependenciesCsv.sh` — add after existing topological sort section: |
| 72 | + |
| 73 | +**OOP Metrics block** (follow `scripts/reports/ObjectOrientedDesignMetricsCsv.sh` ordering): |
| 74 | +- Java without subpackages (5 queries): incoming, outgoing, instability, abstractness, main-sequence distance → `Java_Package/` |
| 75 | +- Java with subpackages (5 queries): same set with `_Including_Subpackages` suffix → `Java_Package/` |
| 76 | +- TypeScript (5 queries): TypeScript equivalents → `Typescript_Module/` |
| 77 | +- Note: `Count_and_set_abstract_types.cypher` must run before abstractness queries; check if `ObjectOrientedDesignMetricsCsv.sh` runs it explicitly and replicate that order. |
| 78 | + |
| 79 | +**Visibility Metrics block** (follow `scripts/reports/VisibilityMetricsCsv.sh` ordering): |
| 80 | +- Java: global visibility stats per artifact → `Java_Artifact/`, per-package visibility → `Java_Package/` |
| 81 | +- TypeScript: global stats per project → `Typescript_Module/`, per-module visibility → `Typescript_Module/` |
| 82 | + |
| 83 | +### Phase 3 Extension: Python Chart Scripts (*parallel with Phase 4 extension*) |
| 84 | + |
| 85 | +**3.6** Create `objectOrientedDesignMetricsCharts.py`: |
| 86 | +- `Parameters` class: `--report_directory`, `--queries_directory` (default: `queries/ood-metrics/`), `--verbose` |
| 87 | +- Run `Count_and_set_abstract_types.cypher` first (prerequisite for abstractness) |
| 88 | +- Run `Calculate_and_set_*` queries (idempotent write-back to graph), then read results |
| 89 | +- Chart functions (all save SVG): |
| 90 | + - `plot_top_dependencies_bar(data, title, x_col, y_col, file_path)` — horizontal bar, top 30 packages by incoming/outgoing deps |
| 91 | + - `plot_main_sequence_scatter(data, title, file_path)` — scatter: X=abstractness, Y=instability; point size=type count; color=distance from main sequence (green=near, red=far); green dashed diagonal reference line |
| 92 | +- Sections: Java packages (without subpackages), Java packages (including subpackages), TypeScript modules |
| 93 | +- Output to: `Java_Package/` and `Typescript_Module/` subdirs within report directory |
| 94 | +- Handle "no data" gracefully with warning + skip |
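
The `Parameters` class described in 3.6 could look roughly like the sketch below. It only assumes the three flags named above and the stated default for `--queries_directory`. The dataclass layout, the `from_command_line` name, and making `--report_directory` required are illustrative assumptions, not the pattern actually used in `externalDependencyCharts.py`.

```python
import argparse
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Parameters:
    """Command line parameters of the chart script (hypothetical layout)."""

    report_directory: str
    queries_directory: str
    verbose: bool

    @classmethod
    def from_command_line(cls, argv: Optional[Sequence[str]] = None) -> "Parameters":
        # argv=None lets argparse read sys.argv; passing a list eases testing.
        parser = argparse.ArgumentParser(
            description="Creates charts for object oriented design metrics."
        )
        # Assumption: the shell entry point always passes the report directory.
        parser.add_argument("--report_directory", required=True,
                            help="Directory the SVG charts are written to")
        parser.add_argument("--queries_directory", default="queries/ood-metrics/",
                            help="Directory containing the Cypher query files")
        parser.add_argument("--verbose", action="store_true",
                            help="Print additional progress information")
        arguments = parser.parse_args(argv)
        return cls(
            report_directory=arguments.report_directory,
            queries_directory=arguments.queries_directory,
            verbose=arguments.verbose,
        )
```

With this shape, the other two chart scripts would differ only in the default value of `--queries_directory`.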
| 95 | + |
| 96 | +**3.7** Create `visibilityMetricsCharts.py`: |
| 97 | +- `Parameters` class: `--report_directory`, `--queries_directory` (default: `queries/visibility/`), `--verbose` |
| 98 | +- Chart functions: |
| 99 | + - `plot_visibility_scatter(data, title, file_path, percentile_col, y_col)` — scatter: X=visibility percentile (25/50/75), Y=package/module count (log scale); custom Y ticks: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1K, 2K, 5K, 10K |
| 100 | + - `plot_visibility_subplots(java_data, ts_data, report_dir)` — 3-subplot layout matching notebook style |
| 101 | +- Output to: `Java_Package/` for Java, `Typescript_Module/` for TypeScript |
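
The custom Y tick list in 3.7 follows a 1-2-5 progression per decade. A minimal sketch of how the positions and their labels could be generated instead of hard-coded is shown below. The function names are hypothetical, and the notebook may simply list the values literally.

```python
from typing import List


def one_two_five_ticks(maximum: int = 10_000) -> List[int]:
    """Returns the tick positions 1, 2, 5, 10, 20, 50, ... up to maximum."""
    ticks = []
    magnitude = 1
    while magnitude <= maximum:
        for factor in (1, 2, 5):
            value = factor * magnitude
            if value <= maximum:
                ticks.append(value)
        magnitude *= 10
    return ticks


def tick_label(value: int) -> str:
    """Formats thousands with a K suffix, for example 2000 becomes '2K'."""
    return f"{value // 1000}K" if value >= 1000 else str(value)


ticks = one_two_five_ticks()
labels = [tick_label(tick) for tick in ticks]
```

On a matplotlib axis these values would typically be applied with `set_yscale("log")`, `set_yticks(ticks)` and `set_yticklabels(labels)`.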
| 102 | + |
| 103 | +**3.8** Create `wordcloudChart.py`: |
| 104 | +- `Parameters` class: `--report_directory`, `--queries_directory` (default: `queries/exploration/`), `--verbose` |
| 105 | +- Run `Words_for_universal_Wordcloud.cypher` to get word list |
| 106 | +- Apply same stopwords list as Wordcloud.ipynb (builder, exception, abstract, helper, util, callback, factory, result, handler, test, impl, plugin, etc.) |
| 107 | +- Generate wordcloud using `WordCloud.to_svg()` for pure-vector SVG output (800x800, max 600 words, viridis colormap) |
| 108 | +- Output: `reports/internal-dependencies/CodeNamesWordcloud.svg` |
| 109 | +- Handle "no data" (no nodes with names) with warning + skip |
| 110 | + |
| 111 | +### Phase 4 Extension: Python Entry Point |
| 112 | + |
| 113 | +**4.1 Update** `internalDependenciesPython.sh` — after existing `pathFindingCharts.py` call, add: |
| 114 | +``` |
| 115 | +python objectOrientedDesignMetricsCharts.py --report_directory "${FULL_REPORT_DIRECTORY}" ${verboseMode} |
| 116 | +python visibilityMetricsCharts.py --report_directory "${FULL_REPORT_DIRECTORY}" ${verboseMode} |
| 117 | +python wordcloudChart.py --report_directory "${FULL_REPORT_DIRECTORY}" ${verboseMode} |
| 118 | +``` |
| 119 | +Follow the same pattern as the existing call in `externalDependenciesPython.sh`.
| 120 | + |
| 121 | +### Phase 6 Extension: Markdown Summary |
| 122 | + |
| 123 | +**6.1 Update** `summary/report.template.md` — add new sections after existing content: |
| 124 | + |
| 125 | +**New Section: OOP Design Metrics** |
| 126 | +- Introductory paragraph: Condense from notebook — *"Based on Robert C. Martin's stable dependencies principle. **Instability** = outgoing/(incoming+outgoing): 0 = fully stable (many dependents, hard to change), 1 = fully unstable (no dependents, easy to change). **Abstractness** = abstract types / total types: 0 = fully concrete, 1 = fully abstract. The **Main Sequence** diagonal (A + I = 1) defines the ideal balance. Distance from main sequence measures how far a package deviates from this ideal: near 0 = well-balanced, near 1 = problematic ('Zone of Pain' = concrete+stable; 'Zone of Uselessness' = abstract+unstable)."* |
| 127 | +- Java packages (without subpackages): table links + scatter chart references with 1-3 sentence descriptions |
| 128 | +- Java packages (including subpackages): same |
| 129 | +- TypeScript modules: same |
| 130 | +- Glossary additions: `Instability`, `Abstractness`, `Distance from Main Sequence`, `Zone of Pain`, `Zone of Uselessness` |
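
The three metric definitions from the introductory paragraph can be written down as a small worked example. This is only a sketch of the formulas. The `PackageCounts` record and its field names are hypothetical, and returning 0.0 for packages without dependencies or types is an assumption; the Cypher queries may treat those cases differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageCounts:
    """Hypothetical input record; the field names are illustrative only."""

    incoming: int        # number of dependents (others depend on this package)
    outgoing: int        # number of dependencies (this package depends on others)
    abstract_types: int  # abstract classes and interfaces
    total_types: int


def instability(package: PackageCounts) -> float:
    """Instability = outgoing / (incoming + outgoing)."""
    total = package.incoming + package.outgoing
    return package.outgoing / total if total else 0.0  # assumption for isolated packages


def abstractness(package: PackageCounts) -> float:
    """Abstractness = abstract types / total types."""
    return package.abstract_types / package.total_types if package.total_types else 0.0


def distance_from_main_sequence(package: PackageCounts) -> float:
    """Normalized distance |A + I - 1| from the main sequence A + I = 1."""
    return abs(abstractness(package) + instability(package) - 1.0)


# Concrete and stable: no abstract types and only incoming dependencies.
zone_of_pain = PackageCounts(incoming=5, outgoing=0, abstract_types=0, total_types=10)
# Abstract and unstable: only abstract types and only outgoing dependencies.
zone_of_uselessness = PackageCounts(incoming=0, outgoing=5, abstract_types=10, total_types=10)
# On the main sequence: abstractness and instability add up to 1.
balanced = PackageCounts(incoming=1, outgoing=3, abstract_types=1, total_types=4)
```

Both zone examples yield a distance of 1.0, while the balanced package yields 0.0.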
| 131 | + |
| 132 | +**New Section: Visibility Metrics** |
| 133 | +- Introductory paragraph: *"Measures the ratio of publicly visible types/elements to all types/elements per package or module. High visibility means most internals are exposed (low encapsulation). The percentile25/50/75 metrics per artifact show whether low-visibility packages are the norm or outliers within each artifact."* |
| 134 | +- Java scatter subplot references with chart description |
| 135 | +- TypeScript scatter subplot references |
| 136 | +- Linked tables: top 40 packages/modules with lowest encapsulation |
| 137 | + |
| 138 | +**New Section: Code Vocabulary (Wordcloud)** |
| 139 | +- Introductory paragraph: *"Words derived from code element names across all artifacts/modules (types, methods, variables). Constructed by splitting camelCase/snake_case identifiers, filtering common stopwords (util, helper, factory, etc.), and weighting by frequency. Larger words appear more often in the codebase — revealing dominant concerns and naming patterns."* |
| 140 | +- Wordcloud SVG reference (conditional include if file exists) |
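
The construction described in the introductory paragraph (split identifiers, filter stopwords, weight by frequency) can be sketched as follows. This is an illustration only: in the actual pipeline part of the splitting may already happen inside `Words_for_universal_Wordcloud.cypher`, the regular expression is an assumption, and the stopword set contains just the examples named in step 3.8 rather than the full list from `Wordcloud.ipynb`.

```python
import re
from collections import Counter
from typing import Iterable, List

# Only the stopwords named in the plan; the notebook's list is longer.
STOPWORDS = {
    "builder", "exception", "abstract", "helper", "util", "callback",
    "factory", "result", "handler", "test", "impl", "plugin",
}

# Matches acronyms ("HTTP") or single words ("Order", "repository").
# Underscores and digits are not matched, so they act as separators.
_WORD_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_identifier(name: str) -> List[str]:
    """Splits camelCase, PascalCase and snake_case names into lowercase words."""
    return [word.lower() for word in _WORD_PATTERN.findall(name)]


def word_frequencies(names: Iterable[str]) -> Counter:
    """Counts how often each non-stopword occurs across all given names."""
    return Counter(
        word
        for name in names
        for word in split_identifier(name)
        if word not in STOPWORDS
    )


frequencies = word_frequencies(
    ["OrderServiceImpl", "order_repository", "parseHTTPResponse", "OrderFactory"]
)
```

A frequency mapping like this is the kind of input a word cloud generator can size words by.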
| 141 | + |
| 142 | +**6.2 Update** `summary/internalDependenciesSummary.sh` — add: |
| 143 | +- Execute OOD metrics read queries → Markdown table includes (instability/abstractness/distance tables for Java + TypeScript) |
| 144 | +- Execute visibility read queries → Markdown table includes (low-visibility packages/modules tables) |
| 145 | +- Conditional SVG chart references (OOD scatter plots, visibility subplots, wordcloud) |
| 146 | + |
| 147 | +### Phase 7 Extension: Exploration Notebooks (7 additional) |
| 148 | + |
| 149 | +**7.5** Copy `jupyter/DependenciesGraphExplorationJava.ipynb` → `explore/DependenciesGraphExplorationJava.ipynb` |
| 150 | +- Already has `ValidateAlwaysFalse` — no metadata change needed |
| 151 | + |
| 152 | +**7.6** Copy `jupyter/DependenciesGraphExplorationTypescript.ipynb` → `explore/DependenciesGraphExplorationTypescript.ipynb` |
| 153 | +- Already has `ValidateAlwaysFalse` — no metadata change needed |
| 154 | + |
| 155 | +**7.7** Copy `jupyter/ObjectOrientedDesignMetricsJava.ipynb` → `explore/ObjectOrientedDesignMetricsJava.ipynb` |
| 156 | +- Change `"code_graph_analysis_pipeline_data_validation"` from `"ValidateJavaPackageDependencies"` → `"ValidateAlwaysFalse"` |
| 157 | + |
| 158 | +**7.8** Copy `jupyter/ObjectOrientedDesignMetricsTypescript.ipynb` → `explore/ObjectOrientedDesignMetricsTypescript.ipynb` |
| 159 | +- Change `"ValidateTypescriptModuleDependencies"` → `"ValidateAlwaysFalse"` |
| 160 | + |
| 161 | +**7.9** Copy `jupyter/VisibilityMetricsJava.ipynb` → `explore/VisibilityMetricsJava.ipynb` |
| 162 | +- Change `"ValidateJavaTypes"` → `"ValidateAlwaysFalse"` |
| 163 | + |
| 164 | +**7.10** Copy `jupyter/VisibilityMetricsTypescript.ipynb` → `explore/VisibilityMetricsTypescript.ipynb` |
| 165 | +- Change `"ValidateTypescriptModuleDependencies"` → `"ValidateAlwaysFalse"` |
| 166 | + |
| 167 | +**7.11** Copy `jupyter/Wordcloud.ipynb` → `explore/Wordcloud.ipynb` |
| 168 | +- Change data validation to `"ValidateAlwaysFalse"` |
| 169 | +- Note in COPIED_FILES.md: only "Wordcloud of names in code" section (`Words_for_universal_Wordcloud.cypher`) is replicated in `wordcloudChart.py`; the git authors wordcloud section is explore-only |
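
The metadata changes in steps 7.7 to 7.11 could be applied with a small helper instead of manual edits. The sketch below assumes that `code_graph_analysis_pipeline_data_validation` is stored in the notebook's top-level `metadata` object; that location and the function name `set_data_validation` are assumptions to be checked against the actual notebooks.

```python
import json
from pathlib import Path

VALIDATION_KEY = "code_graph_analysis_pipeline_data_validation"


def set_data_validation(notebook_path: Path, value: str = "ValidateAlwaysFalse") -> bool:
    """Sets the validation entry of a notebook and reports whether the file changed."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    # Assumption: the key is part of the top-level notebook metadata.
    metadata = notebook.setdefault("metadata", {})
    if metadata.get(VALIDATION_KEY) == value:
        return False  # already up to date, leave the file untouched
    metadata[VALIDATION_KEY] = value
    notebook_path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

Because the function returns `False` when nothing changed, running it over all notebooks in `explore/` also serves as a check for verification item 4.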
| 170 | + |
| 171 | +--- |
| 172 | + |
| 173 | +## Relevant Files (delta) |
| 174 | + |
| 175 | +**To create**: |
| 176 | +- `domains/internal-dependencies/objectOrientedDesignMetricsCharts.py` |
| 177 | +- `domains/internal-dependencies/visibilityMetricsCharts.py` |
| 178 | +- `domains/internal-dependencies/wordcloudChart.py` |
| 179 | + |
| 180 | +**To copy** (new Cypher, ~34 files): |
| 181 | +- `cypher/Metrics/` → `queries/ood-metrics/` (29 files: 9 Java + 9 Java-subpackages + 9 TypeScript + `Count_and_set_abstract_types.cypher` + 1 TBD from `ObjectOrientedDesignMetricsCsv.sh` reference) |
| 182 | +- `cypher/Visibility/` (all 4 files) → `queries/visibility/` |
| 183 | +- `cypher/Overview/Words_for_universal_Wordcloud.cypher` → `queries/exploration/` |
| 184 | + |
| 185 | +**To copy** (new notebooks, 7 files): |
| 186 | +- `jupyter/DependenciesGraphExplorationJava.ipynb` → `explore/` |
| 187 | +- `jupyter/DependenciesGraphExplorationTypescript.ipynb` → `explore/` |
| 188 | +- `jupyter/ObjectOrientedDesignMetricsJava.ipynb` → `explore/` |
| 189 | +- `jupyter/ObjectOrientedDesignMetricsTypescript.ipynb` → `explore/` |
| 190 | +- `jupyter/VisibilityMetricsJava.ipynb` → `explore/` |
| 191 | +- `jupyter/VisibilityMetricsTypescript.ipynb` → `explore/` |
| 192 | +- `jupyter/Wordcloud.ipynb` → `explore/` |
| 193 | + |
| 194 | +**To modify**: |
| 195 | +- `domains/internal-dependencies/internalDependenciesCsv.sh` — add OOD metrics + visibility sections |
| 196 | +- `domains/internal-dependencies/internalDependenciesPython.sh` — add 3 new chart script calls |
| 197 | +- `domains/internal-dependencies/summary/report.template.md` — add 3 new sections |
| 198 | +- `domains/internal-dependencies/summary/internalDependenciesSummary.sh` — add table/chart includes |
| 199 | +- `domains/internal-dependencies/COPIED_FILES.md` — add new mappings |
| 200 | + |
| 201 | +**Reference** (read-only, for Python chart implementation): |
| 202 | +- `scripts/reports/ObjectOrientedDesignMetricsCsv.sh` — query ordering + output file names |
| 203 | +- `scripts/reports/VisibilityMetricsCsv.sh` — query ordering + output file names |
| 204 | +- `jupyter/ObjectOrientedDesignMetricsJava.ipynb` — scatter plot implementation details |
| 205 | +- `jupyter/VisibilityMetricsJava.ipynb` — 3-subplot scatter implementation + Y-axis tick list |
| 206 | +- `jupyter/Wordcloud.ipynb` — stopwords list + wordcloud parameters (800x800, max 600 words, viridis) |
| 207 | +- `domains/external-dependencies/externalDependencyCharts.py` — Parameters class pattern |
| 208 | + |
| 209 | +--- |
| 210 | + |
| 211 | +## Verification (delta) |
| 212 | + |
| 213 | +1. **Cypher count**: `find domains/internal-dependencies/queries/ -name "*.cypher" | wc -l` = 43 (original) + ~34 (new) ≈ 77 |
| 214 | +2. **Python compile**: `python -m py_compile domains/internal-dependencies/objectOrientedDesignMetricsCharts.py domains/internal-dependencies/visibilityMetricsCharts.py domains/internal-dependencies/wordcloudChart.py`
| 215 | +3. **Shell lint**: `shellcheck` on updated `internalDependenciesCsv.sh` and `internalDependenciesPython.sh` |
| 216 | +4. **Notebook metadata**: All 11 explore/ notebooks contain `"code_graph_analysis_pipeline_data_validation": "ValidateAlwaysFalse"` |
| 217 | +5. **No external changes**: No modifications outside `domains/internal-dependencies/` |
| 218 | + |
| 219 | +--- |
| 220 | + |
| 221 | +## Further Considerations |
| 222 | + |
| 223 | +1. **Wordcloud domain fit**: Code vocabulary analysis is not directly a dependency metric. If an Overview domain is planned later, `wordcloudChart.py` and `explore/Wordcloud.ipynb` could move there. For now, including it is the pragmatic choice. |
| 224 | +2. **Count_and_set_abstract_types prerequisite**: Verify whether `ObjectOrientedDesignMetricsCsv.sh` runs `Count_and_set_abstract_types.cypher` explicitly before abstractness queries — replicate that order in the domain CSV script. If it doesn't (i.e., it's expected to be a prior pipeline step), document it as a prerequisite in PREREQUISITES.md instead of running it inline. |
| 225 | +3. **Wordcloud SVG method**: `WordCloud.to_svg()` produces pure-vector SVG. If the installed `wordcloud` library version doesn't support it, fall back to rendering the wordcloud in a matplotlib figure and saving that figure as SVG. The file is still valid SVG, but the wordcloud inside it is an embedded raster image.