Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/copilot-instructions.md
30 changes: 30 additions & 0 deletions .github/instructions/agent-files.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
---
applyTo: ".github/skills/**/*.md,.github/instructions/**/*.md,AGENTS.md"
---
# Agent Files — Caveman Style

Files in `.github/skills/`, `.github/instructions/`, and `AGENTS.md` must use [caveman skill](./../skills/caveman/SKILL.md) to keep context short and tokens efficient.

## Applies To

- `AGENTS.md`
- `.github/skills/**/*.md` (skill files and support docs)
- `.github/instructions/**/*.md` (instruction files)

## Caveman Rule

All text except code blocks and security warnings:
- Drop articles (a/an/the)
- No filler (just, really, basically, actually, simply)
- No pleasantries (sure, certainly, of course, happy to)
- Fragments OK. Short synonyms (big not extensive)
- Technical terms exact
- Pattern: `[thing] [action] [reason]. [next step].`

**Default intensity:** full

See [caveman skill](./../skills/caveman/SKILL.md) for lite/ultra and examples.

## Exception

Security warnings + irreversible action confirmations → normal English. Resume caveman after warning complete.
51 changes: 51 additions & 0 deletions .github/instructions/cypher-queries.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
---
applyTo: "**/*.cypher"
---
# Cypher Query Conventions

## Style

- **First line: description comment** — format: `// Description sentence. Requires "Other_Query.cypher".`
Used in generated [CYPHER.md](../../CYPHER.md) reference
- Blank line between comment and statement
- One statement per file — no combining without explicit approval
- Standard Cypher where possible. Use APOC sparingly — confirm with user first
- Parameterize with `$parameterName`
- Keywords uppercase: `MATCH`, `WHERE`, `RETURN`, `WITH`, `SET`, `ORDER BY`, `LIMIT`, etc.
- Literals lowercase: `null`, `true`, `false`
- Each clause own line — no inline chaining
- `ORDER BY` and `LIMIT` on own lines
- Single quotes for strings: `'value'` not `"value"`
- No semicolons at end

## Naming

- Files: `Snake_Case_Description.cypher`
- Node labels: `PascalCase`
- Relationship types: `UPPER_SNAKE_CASE`
- Properties: `camelCase`
- No backtick-quoted identifiers — use clean label/property names

## Spacing

- One space between label predicate and property predicate: `(n:Label {prop: 1})`
- Space after each comma: `a, b, c` not `a,b,c`
- Wrapping space around operators: `a <> b`, `p = (s)-->(e)`
- No padding space inside function calls: `count(n)` not `count( n )`

## Patterns

- Anonymous nodes/rels when variable unused: `[:LIKES]` not `[r:LIKES]`

## Verification

Before finalizing query, verify against live Neo4j to catch syntax errors and warnings.
Requires Neo4j running — use [neo4j-management skill](./../skills/neo4j-management/SKILL.md) first.

```shell
# From repo root — NEO4J_INITIAL_PASSWORD must be exported
./scripts/executeQuery.sh <path/to/query.cypher>
```

- Empty result ≠ broken query — data may be absent
- `executeQueryFunctions.sh` for script composition only — not for ad-hoc verification
28 changes: 28 additions & 0 deletions .github/instructions/generated-reference-docs.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
---
applyTo: "CYPHER.md,SCRIPTS.md,ENVIRONMENT_VARIABLES.md"
---
# Generated Reference Files

[CYPHER.md](../../CYPHER.md), [SCRIPTS.md](../../SCRIPTS.md), [ENVIRONMENT_VARIABLES.md](../../ENVIRONMENT_VARIABLES.md) — **auto-generated**. Do not edit directly. Changes get overwritten.

## Do Not Run Generators Manually

Generators run automatically in pipeline — output **auto-committed**. Running locally, including result in commit causes **git merge conflicts** in pipeline. Only run generator if user explicitly insists.

## Influence Indirectly

| File | Edit instead |
|------|--------------|
| `CYPHER.md` | First line `// Description.` of each `.cypher` file |
| `SCRIPTS.md` | First line `# Description.` of each `.sh` file |
| `ENVIRONMENT_VARIABLES.md` | Inline `# comment` after env var declaration in `.sh` files |

Moved/renamed files detected and reflected automatically on next pipeline run.

## Generators

See [scripts/documentation/](../../scripts/documentation/):

- `generateCypherReference.sh` → `CYPHER.md`
- `generateScriptReference.sh` → `SCRIPTS.md`
- `generateEnvironmentVariableReference.sh` + `appendEnvironmentVariables.sh` → `ENVIRONMENT_VARIABLES.md`
49 changes: 49 additions & 0 deletions .github/instructions/python-dependencies.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
---
applyTo: "requirements.txt,conda-environment.yml"
---
# Python Dependency Conventions

Applies to [requirements.txt](../../requirements.txt) and [conda-environment.yml](../../conda-environment.yml).

## Sync Rule

Both files must stay **in sync**: same packages, same pinned versions. Package in one must be in other.

Known exceptions (document inline with comment if they differ):
- `conda-environment.yml` may split packages (e.g. `nbconvert` + `nbconvert-webpdf` vs. `nbconvert[webpdf]` in pip)
- `conda-environment.yml` may use `pip:` block for packages not on conda-forge

## Versioning

- Pin all versions — no ranges, no `>=`, no unpinned entries
- `requirements.txt`: `==` (e.g. `pandas==2.2.3`)
- `conda-environment.yml`: `=` (e.g. `pandas=2.2.3`)

## Adding or Updating

- **Ask user before introducing new dependencies**
- Update both files when changing version

## Verification

**`requirements.txt`** — venv dry-run from [.github/workflows/internal-check-python-venv-support.yml](../../.github/workflows/internal-check-python-venv-support.yml):

```shell
python -m venv .venv
source ./scripts/activatePythonEnvironment.sh
.venv/bin/pip install --dry-run --quiet --requirement requirements.txt
! .venv/bin/pip install --dry-run --requirement requirements.txt 2>/dev/null | grep -q "Would install"
```

**`conda-environment.yml`** — via [scripts/activateCondaEnvironment.sh](../../scripts/activateCondaEnvironment.sh):

```shell
source ./scripts/activateCondaEnvironment.sh
```

Fallback (conda directly):

```shell
conda env update --file conda-environment.yml --prune
conda activate codegraph
```
56 changes: 56 additions & 0 deletions .github/instructions/python.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
---
applyTo: "**/*.py,**/*.ipynb"
---
# Python Conventions

Applies to Jupyter notebooks and standalone `.py` scripts.

## Style

- Type annotations everywhere — parameters, return types, non-obvious variables
- Verify with Pylance (strict mode preferred)
- No abbreviations. Write out names. `figure` not `fig`. `axis` not `ax`. `i`, `j`, `k` for indices OK
- Prefer named functions over code block comments. Comments explain *why*, not *what*
- Split into meaningful functions — ideally pure, easy to test
- Apply basic functional programming where it improves readability (map, filter, list comprehensions)
- f-strings preferred over `%` or `.format()` for string interpolation
- No `+` concatenation in loops — use `"".join()` or list accumulation
- `is`/`is not` for `None` comparisons, never `==`/`!=`
- Truthiness: `if seq:` not `if len(seq) > 0:`; `if x is not None:` not `if x != None:`
- Use `pathlib.Path` over `os.path` for file operations

## Type Annotations

- `X | None` over `Optional[X]`; `X | Y` over `Union[X, Y]` (PEP 604)
- Abstract container types in signatures: `Sequence` over `list`, `Mapping` over `dict`

## Imports

- Order: stdlib → third-party → local; one blank line between groups
- No wildcard imports (`from x import *`)
- One import statement per module line; multiple names from same module OK: `from x import A, B`

## Error Handling

- No bare `except:` — always name specific exceptions (`except ValueError:`)
- No mutable default arguments: `def f(x: list = [])` — use `None` + guard
- Use `with` for all file/resource management; never manual open/close

## Naming

- No generic names — no "util", "helper", "common", "shared"
- Module-level constants: `UPPER_CASE_WITH_UNDERSCORES`
- DRY only for domain-related concerns
- No magic numbers — extract named constants for literal comparisons

## Docstrings

- Public functions require `"""..."""` docstring (triple double quotes)
- First line imperative mood, ends with period: `"""Compute distances from center."""`
- Multi-line: blank line after summary, then `Args:`, `Returns:`, `Raises:` sections as needed

## Notebook to Script Conversion

- Extract charts and tables into standalone `.py` script with `main()` entry point
- Original notebook → `explore/` with metadata `"code_graph_analysis_pipeline_data_validation": "ValidateAlwaysFalse"` and "Exploration" added to title
- Charts → SVG, tables → CSV, descriptions → Markdown summary report
78 changes: 78 additions & 0 deletions .github/instructions/shell-scripts.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
---
applyTo: "**/*.sh"
---
# Shell Script Conventions

Target: **bash**. Must work on macOS, Linux, Windows (WSL/Git Bash).

## Header

- **First line: description comment** — used by [SCRIPTS.md](../../SCRIPTS.md). Format: `# Description sentence. Requires "other_script.sh".`
- Include required env vars, required tools, usage examples in description or `usage()` function
- Blank line after initial comment before code starts

## Shebang and Strict Mode

- First executable line: `#!/usr/bin/env bash`
- Enable strict mode immediately after: `set -euo pipefail`
- Set safe word-splitting: `IFS=$'\n\t'`
- `pipefail` propagates pipe failures — `set -e` bypassed in subshells and command groups; document where this applies

## Style

- POSIX-compliant as reasonable
- Always `"${variable}"` — curly braces + double quotes on every expansion
- Use `$(...)` for command substitution — not backticks
- No abbreviations. Write out variable and function names. `i`, `j`, `k` for indices OK
- Prefer named functions over code block comments. Comments explain *why*, not *what*
- Split into meaningful functions — ideally pure, easy to reason about
- Apply basic functional programming where it improves readability (piping, composition)

## Naming

- No generic names — no "util", "helper", "common", "shared"
- DRY only for domain-related concerns. Don't extract one-off operations

## Variables and Scope

- `local` for function-scoped variables
- `readonly` for constants
- `${VAR:-default}` for defaults
- `[[ ]]` over `[ ]`
- Quote all paths: `"${directory}/${filename}"`

## Argument Parsing

- Provide `usage()` — list parameters, flags, env vars, copy-pastable examples with expected output
- Support at minimum `--help` (print `usage()`, exit 0)
- Support `--dry-run` for destructive or irreversible operations

## Dependency Checks

- Check required tools early: `command -v <tool> || { echo "Install <tool>: <url>"; exit 1; }`

## Portability

- Prefer POSIX idioms; guard GNU vs BSD differences explicitly
- `sed -i` requires empty suffix on macOS: `sed -i ''` vs Linux `sed -i`
- `date -d` GNU only — use `date -v` on macOS or detect OS and branch

## Error Handling and Exit Codes

- Trap `ERR`/`EXIT` for cleanup and meaningful error messages
- Return meaningful non-zero exit codes — document semantics in `usage()`

## Security

- No `eval` — use arrays, named variables, or functions
- Sanitize external inputs before use in commands or filenames
- Never log or write secrets to temp files
- Use `mktemp` for temp files; restrict permissions with `chmod 600` or `umask 077`
- Clean up temp files in `trap ... EXIT`

## Linting, Formatting, and CI

- Verify all scripts with `shellcheck`; run in CI
- Format with `shfmt`; run in CI
- Document `shellcheck` exceptions inline with justification: `# shellcheck disable=SC2xxx — reason`
- Test scripts: name with prefix `test` (e.g. `testMyFeature.sh`) — `runTests.sh` auto-discovers in `scripts/` and `domains/` dirs
21 changes: 21 additions & 0 deletions .github/skills/caveman/LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Julius Brussee

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
58 changes: 58 additions & 0 deletions .github/skills/caveman/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
---
name: caveman
description: >
Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman
while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra.
Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens",
"be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested.
---

Respond terse like smart caveman. All technical substance stay. Only fluff die.

## Persistence

ACTIVE EVERY RESPONSE. No revert after many turns. No filler drift. Still active if unsure. Off only: "stop caveman" / "normal mode".

Default: **full**. Switch: `/caveman lite|full|ultra`.

## Rules

Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Technical terms exact. Code blocks unchanged. Errors quoted exact.

Pattern: `[thing] [action] [reason]. [next step].`

Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
Yes: "Bug in auth middleware. Token expiry check use `<` not `<=`. Fix:"

## Intensity

| Level | What change |
|-------|------------|
| **lite** | No filler/hedging. Keep articles + full sentences. Professional but tight |
| **full** | Drop articles, fragments OK, short synonyms. Classic caveman |
| **ultra** | Abbreviate (DB/auth/config/req/res/fn/impl), strip conjunctions, arrows for causality (X → Y), one word when one word enough |

Example — "Why React component re-render?"
- lite: "Your component re-renders because you create a new object reference each render. Wrap it in `useMemo`."
- full: "New object ref each render. Inline object prop = new ref = re-render. Wrap in `useMemo`."
- ultra: "Inline obj prop → new ref → re-render. `useMemo`."

Example — "Explain database connection pooling."
- lite: "Connection pooling reuses open connections instead of creating new ones per request. Avoids repeated handshake overhead."
- full: "Pool reuse open DB connections. No new connection per request. Skip handshake overhead."
- ultra: "Pool = reuse DB conn. Skip handshake → fast under load."

## Auto-Clarity

Drop caveman for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after clear part done.

Example — destructive op:
> **Warning:** This will permanently delete all rows in the `users` table and cannot be undone.
> ```sql
> DROP TABLE users;
> ```
> Caveman resume. Verify backup exist first.

## Boundaries

Code/commits/PRs: write normal. "stop caveman" or "normal mode": revert. Level persist until changed or session end.
Loading
Loading