ghcp(crawl): ex0 bootstrap scaffolding

RitaChen609 · RitaChen609 · commit 79a579774343 · 2026-03-17T16:32:59.000-04:00
diff --git a/.copilot-track/crawl/README.md b/.copilot-track/crawl/README.md
@@ -0,0 +1,95 @@
+# Copilot Crawl Track — README
+
+This directory (`.copilot-track/crawl/`) holds AI-assisted crawl-and-modernisation artefacts for the **MarkLogic Java Client API** repository. It is not part of the production library; contents are for developer guidance and AI context only.
+
+---
+
+## What Is the Crawl Track?
+
+The crawl track is an incremental, evidence-driven workflow for making large-scale changes to this codebase using AI assistance (GitHub Copilot / agent mode). Changes are broken into small, reviewable PRs that form a **chain** — each PR builds on the previous one.
+
+---
+
+## Chain-PR Pattern
+
+A chain-PR is a sequence of pull requests where each PR:
+
+1. Targets the **previous PR's branch** (not `main`) as its base, creating a linear dependency chain.
+2. Carries a **single, focused concern** (e.g., "migrate HTTP client from HttpClient to OkHttp", "update Jackson version", "replace deprecated API calls in DocMgr").
+3. Is reviewed and merged in order — do **not** merge PR N+1 before PR N is merged and its branch updated.
+
+```
+main ← PR-1 (foundation) ← PR-2 (layer A) ← PR-3 (layer B) ← ...
+```
+
+After a base PR merges, rebase each later PR in the chain, in order, onto its updated base so that conflicts surface early:
+
+```bash
+git fetch origin
+git checkout feature/crawl-layer-B
+git rebase origin/feature/crawl-layer-A
+git push --force-with-lease
+```
+
+---
+
+## Evidence in PRs
+
+Every crawl PR must include evidence that the change is safe. Accepted evidence types:
+
+| Evidence                                         | Where to add it                                                            |
+| ------------------------------------------------ | -------------------------------------------------------------------------- |
+| Passing CI green-check (unit + functional tests) | Shown automatically on the PR by Jenkins                                   |
+| Before/after compile output                      | Paste in PR description under `## Build evidence`                          |
+| Test-coverage delta                              | Add `## Test delta` section; attach Gradle test report if coverage dropped |
+| Copilot prompt used                              | Add `## Prompt used` section (see below)                                   |
+| Manual verification steps                        | Add `## Manual verification` with exact commands run                       |
+
+PRs that lack evidence will be marked **needs-evidence** and not merged.
+
+---
+
+## Prompt Usage
+
+AI prompts that drove a crawl change belong in the PR description under `## Prompt used`. This creates an audit trail and lets reviewers reproduce or adjust the change.
+
+**Template:**
+
+```markdown
+## Prompt used
+
+> Agent mode, model: claude-sonnet-4-6
+>
+> "Migrate all usages of `com.marklogic.client.impl.OkHttpServices` constructor that
+> pass a plain `String` password to instead use `char[]` and call `Arrays.fill` after use.
+> Do not modify test files. Only change files under marklogic-client-api/src/main/."
+
+Files changed by prompt: <!-- list them -->
+Files reviewed manually: <!-- list them -->
+```
+
+Storing prompts in PRs helps future crawl passes understand _why_ a change was made, not just _what_ changed.
+
+---
+
+## Adding New Crawl Artefacts
+
+Place any generated files, diff summaries, or migration notes inside this directory as flat Markdown or JSON files. Suggested naming:
+
+```
+.copilot-track/crawl/
+├── README.md             ← this file
+├── 001-<topic>.md        ← plan / notes for crawl step 1
+├── 002-<topic>.md        ← plan / notes for crawl step 2
+└── ...
+```
+
+Keep each step file small (< 200 lines). Reference `ai-track-docs/` for system-level context.
+
+---
+
+## Related Docs
+
+- [ai-track-docs/SYSTEM-OVERVIEW.md](../../ai-track-docs/SYSTEM-OVERVIEW.md) — what this project does and how it is structured
+- [ai-track-docs/build-test.md](../../ai-track-docs/build-test.md) — how to build and run tests locally
+- [ai-track-docs/architecture.mmd](../../ai-track-docs/architecture.mmd) — Mermaid architecture diagram
diff --git a/ai-track-docs/SYSTEM-OVERVIEW.md b/ai-track-docs/SYSTEM-OVERVIEW.md
@@ -0,0 +1,41 @@
+# System Overview — MarkLogic Java Client API
+
+## Purpose
+
+The **MarkLogic Java Client API** (`com.marklogic:marklogic-client-api`) is a Java library that exposes MarkLogic Server's REST API as a type-safe, fluent Java interface. It supports reading, writing, deleting, and querying JSON, XML, binary, and text documents, as well as ACID multi-statement transactions, semantics (SPARQL/RDF), full-text search, alerting, Data Services, and Row Manager (Optic API).
+
+## Repository Root
+
+`marklogic-client-api-parent` (Gradle multi-project).
+
+## Modules
+
+| Module                                 | Description                                                                  |
+| -------------------------------------- | ---------------------------------------------------------------------------- |
+| `marklogic-client-api`                 | Core library — all production source code                                    |
+| `marklogic-client-api-functionaltests` | Functional / integration tests requiring a live MarkLogic instance           |
+| `ml-development-tools`                 | Kotlin-based developer tooling (code generation helpers)                     |
+| `test-app`                             | MarkLogic application deployed to the test server (modules, schemas, config) |
+| `examples`                             | Standalone usage examples                                                    |
+
+## Runtime Requirements
+
+- **Java 17** (minimum; Java 21 also supported and tested in CI)
+- **MarkLogic Server** (for integration/functional tests) — started via `docker-compose.yaml`
+
+## Technology Stack
+
+- Build: **Gradle** (wrapper at `./gradlew`)
+- Test framework: **JUnit 5** (unit) + MarkLogic functional test harness
+- CI: **Jenkins** (`Jenkinsfile`) — Docker-based MarkLogic, parallel Java 17/21 builds
+- Primary language: **Java**; developer tooling in **Kotlin**
+
+## Key External Dependencies
+
+- OkHttp (HTTP client transport)
+- Jackson (JSON serialization)
+- SLF4J / Logback (logging)
+
+## Relationship to MarkLogic Server
+
+All network communication travels over the **MarkLogic REST Management and Client APIs** (typically ports 8000 and 8002). The library never connects directly to MarkLogic's internal ports; authentication is via HTTP Digest or certificate.
diff --git a/ai-track-docs/architecture.mmd b/ai-track-docs/architecture.mmd
@@ -0,0 +1,50 @@
+%%{init: {"theme": "neutral"}}%%
+graph TD
+    subgraph callers["Calling Code"]
+        APP["Java Application / Examples"]
+    end
+
+    subgraph core["marklogic-client-api (core)"]
+        DatabaseClient["DatabaseClient\n(entry-point factory)"]
+        DocMgr["Document Managers\n(JSON / XML / Text / Binary / Generic)"]
+        QueryMgr["QueryManager\n(search, cts, SPARQL)"]
+        TxMgr["TransactionManager\n(ACID multi-statement)"]
+        RowMgr["RowManager\n(Optic / SQL)"]
+        DataSvc["Data Services\n(generated proxies)"]
+        RESTServices["RESTServices\n(OkHttp transport layer)"]
+    end
+
+    subgraph devtools["ml-development-tools"]
+        CodeGen["Proxy Code Generator\n(Kotlin)"]
+    end
+
+    subgraph testapp["test-app"]
+        MLConfig["ml-config\n(DB, forests, REST server)"]
+        MLModules["ml-modules\n(XQuery / SJS modules)"]
+        MLSchemas["ml-schemas\n(TDE templates)"]
+    end
+
+    subgraph server["MarkLogic Server (Docker / remote)"]
+        REST["REST Client API\n(port 8000 / 8002)"]
+        XDBC["e-node internal"]
+    end
+
+    APP --> DatabaseClient
+    DatabaseClient --> DocMgr
+    DatabaseClient --> QueryMgr
+    DatabaseClient --> TxMgr
+    DatabaseClient --> RowMgr
+    DatabaseClient --> DataSvc
+    DocMgr --> RESTServices
+    QueryMgr --> RESTServices
+    TxMgr --> RESTServices
+    RowMgr --> RESTServices
+    DataSvc --> RESTServices
+    RESTServices -->|"HTTP Digest / Cert auth"| REST
+    REST --> XDBC
+
+    CodeGen -->|"generates Java proxy classes"| DataSvc
+
+    MLConfig -->|"mlDeploy"| REST
+    MLModules -->|"mlDeploy"| REST
+    MLSchemas -->|"mlReloadSchemas"| REST
diff --git a/ai-track-docs/build-test.md b/ai-track-docs/build-test.md
@@ -0,0 +1,95 @@
+# Build & Test Guide
+
+## Prerequisites
+
+| Requirement    | Notes                                                                                        |
+| -------------- | -------------------------------------------------------------------------------------------- |
+| JDK 17+        | JDK 21 also works; set `JAVA_HOME` or rely on Gradle toolchain auto-provisioning             |
+| Docker         | Required for functional tests (MarkLogic container)                                          |
+| Gradle wrapper | Use `./gradlew` (Linux/macOS) or `gradlew.bat` (Windows); do **not** install Gradle globally |
+
+---
+
+## Quick Build (no tests)
+
+```bash
+./gradlew clean build -x test
+```
+
+---
+
+## Unit Tests — core library only
+
+```bash
+./gradlew :marklogic-client-api:test
+```
+
+Unit tests have **no external dependencies**; they run without MarkLogic.
+
+---
+
+## Developer-Tools Tests
+
+```bash
+./gradlew :ml-development-tools:test
+```
+
+---
+
+## Functional / Integration Tests
+
+Functional tests require a running MarkLogic instance. Start it with Docker Compose first:
+
+```bash
+docker compose up -d
+```
+
+Then deploy the test application and run the functional tests:
+
+```bash
+./gradlew :test-app:mlDeploy :test-app:mlReloadSchemas
+./gradlew :marklogic-client-api-functionaltests:test
+```
+
+> **Tip:** If the local steps diverge from CI, treat the test sequence in the `Jenkinsfile` as authoritative.
+
+---
+
+## Running a Specific Test Class
+
+```bash
+./gradlew :marklogic-client-api:test --tests "com.marklogic.client.test.SomeTest"
+```
+
+---
+
+## Gradle Properties
+
+Key properties live in `gradle.properties` and `marklogic-client-api/gradle.properties`. Override any of them on the command line with `-P<key>=<value>`:
+
+```bash
+./gradlew :marklogic-client-api:test -PmlHost=localhost -PmlPort=8000
+```
+
+---
+
+## Build Artifacts
+
+After a successful build the JAR is at:
+
+```
+marklogic-client-api/build/libs/marklogic-client-api-<version>.jar
+```
+
+---
+
+## Linting / Static Analysis
+
+No dedicated lint step is configured in the current Gradle build.
+
+---
+
+## Common Pitfalls
+
+- Functional tests **will hang or fail** if Docker is not running or MarkLogic has not finished starting. Wait ~30 s after `docker compose up -d` before deploying.
+- Java toolchain auto-provisioning requires internet access on first run. On air-gapped machines set `org.gradle.java.installations.paths` in `~/.gradle/gradle.properties`.