Commit c887add

Add AGENTS.md
1 parent 2babe3a commit c887add

1 file changed

Lines changed: 114 additions & 0 deletions

AGENTS.md

@@ -0,0 +1,114 @@
# AGENTS.md

Guidance for AI coding agents working in this repository.

## What this repo is

`FSharp.Stats` is an F# library implementing statistical and machine learning methods (descriptive statistics, distributions, hypothesis tests, regression, clustering, ML algorithms, etc.).

This repo focuses on **statistical/ML methods**. The underlying numerical primitives — matrix math, linear algebra, vector operations, BLAS/LAPACK bindings — live in the reference library [**FsMath**](https://github.com/fslaborg/FsMath). When you need low-level numeric routines, prefer pulling them from FsMath rather than re-implementing them here. If something fundamental is missing in FsMath, raise it there instead of duplicating math primitives in this repo.

Source layout:

- [src/FSharp.Stats/](src/FSharp.Stats/) — main library
- [src/FSharp.Stats.Interactive/](src/FSharp.Stats.Interactive/) — `dotnet interactive` integration
- [tests/FSharp.Stats.Tests/](tests/FSharp.Stats.Tests/) — Expecto test suite
- [docs/](docs/) — fsdocs tutorials and examples that should stay in sync with public API changes
- [benchmarks/](benchmarks/) — BenchmarkDotNet benchmark projects and checked-in benchmark outputs

## Building

This repo uses a **FAKE** build project ([build/build.fsproj](build/build.fsproj), entrypoint [build/Build.fs](build/Build.fs)). Treat the FAKE targets as the build/test contract for final verification, CI parity, docs, packaging, and release work.

For inner-loop iteration, narrowly scoped raw `dotnet build` / `dotnet test --no-build --filter ...` commands are acceptable as a local optimization when they help you move faster. Do not stop there: before considering the work done or ready for PR, run the repository entrypoint and finish with `./build.sh RunTests` (or `build.cmd RunTests` on Windows).
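
As a rough sketch of that workflow, the two helpers below wrap the inner-loop commands and the final check. The function names and the example filter expression are illustrative assumptions, not part of the repository; adjust the filter to whatever your test runner accepts.

```sh
# Inner loop (hypothetical helper): build the test project once, then
# re-run only the tests matching the filter passed as $1.
inner_loop() {
  dotnet build tests/FSharp.Stats.Tests/FSharp.Stats.Tests.fsproj &&
    dotnet test tests/FSharp.Stats.Tests/FSharp.Stats.Tests.fsproj \
      --no-build --filter "$1"
}

# Final verification (hypothetical helper): always finish with the FAKE target.
final_check() {
  ./build.sh RunTests
}

# Example (the filter value is an assumption):
#   inner_loop 'FullyQualifiedName~Correlation'
#   final_check
```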

Entry points:

- Windows: [build.cmd](build.cmd)
- Unix: [build.sh](build.sh)

Both forward arguments to `dotnet run --project ./build/build.fsproj <target>`.

### Targets

Defined across [build/Build.fs](build/Build.fs), [build/BasicTasks.fs](build/BasicTasks.fs), [build/TestTasks.fs](build/TestTasks.fs), [build/PackageTasks.fs](build/PackageTasks.fs), [build/DocumentationTasks.fs](build/DocumentationTasks.fs), [build/ReleaseTasks.fs](build/ReleaseTasks.fs), [build/ReleaseNotesTasks.fs](build/ReleaseNotesTasks.fs):

| Target | Purpose |
|---|---|
| `Clean` | Remove `src/**/bin`, `src/**/obj`, `tests/**/bin`, `tests/**/obj`, `pkg`. |
| `Build` *(default)* | `dotnet build` the solution (Release). Depends on `Clean`. |
| `RunTests` | `dotnet test` the test project with detailed console logger. Depends on `Clean`, `Build`. |
| `RunTestsWithCodeCov` | Same as `RunTests` plus AltCover Cobertura output to `codeCov.xml`. |
| `Pack` / `PackPrerelease` | Produce NuGet packages into `pkg/`. Prompts interactively for confirmation. |
| `BuildDocs` / `BuildDocsPrerelease` | `fsdocs build --eval --clean` against the project. |
| `WatchDocs` / `WatchDocsPrerelease` | `fsdocs watch` for local doc preview. |
| `SetPrereleaseTag` | Reads a prerelease suffix from stdin and sets package version metadata. |
| `ReleaseDocs` / `PrereleaseDocs` | Push built docs. |
| `CreateTag` / `CreatePrereleaseTag`, `PublishNuget` / `PublishNugetPrerelease` | Tag git, push package to NuGet. |
| `UpdateReleaseNotes` | Regenerate `RELEASE_NOTES.md` from commits since the last release. |
| `Release` | Aggregate: `Clean → Build → RunTests → Pack → BuildDocs → CreateTag → PublishNuget → ReleaseDocs`. |
| `PreRelease` | Aggregate prerelease variant of `Release`. |
| `ReleaseNoDocs` / `PreReleaseNoDocs` | Release aggregates without doc steps. |

Common usage:

```sh
./build.sh # default: Build
./build.sh RunTests
./build.sh BuildDocs
./build.sh WatchDocs
```

`Pack` and the `Release*` targets are interactive (prompt for confirmation, prerelease suffix, etc.) — do not run them in non-interactive automation.
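
One way an automation wrapper could enforce this is to refuse the interactive targets up front. The sketch below is an illustration under stated assumptions, not something that ships in the repo: `is_interactive_target` and `run_target` are hypothetical helpers, the `CI` environment variable is a common convention rather than a documented contract here, and the target list only covers the targets this document describes as prompting (`Pack`, `PackPrerelease`, `SetPrereleaseTag`, and the release aggregates). Extend the list if other targets turn out to prompt.

```sh
# Return 0 if the named FAKE target is described in this document as
# prompting on stdin, 1 otherwise.
is_interactive_target() {
  case "$1" in
    Pack|PackPrerelease|SetPrereleaseTag) return 0 ;;
    Release|PreRelease|ReleaseNoDocs|PreReleaseNoDocs) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: refuse interactive targets when CI is set or when
# stdin is not a terminal; otherwise defer to the normal entry point.
run_target() {
  target=${1:-Build}
  if is_interactive_target "$target" && { [ -n "${CI:-}" ] || [ ! -t 0 ]; }; then
    echo "refusing to run interactive target '$target' non-interactively" >&2
    return 1
  fi
  ./build.sh "$@"
}
```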

## F# project files and compile order

F# file order is load-bearing in this repo. If you add, remove, rename, or move a `.fs` file, you must update the corresponding project file and place it in the correct compile order:

- [src/FSharp.Stats/FSharp.Stats.fsproj](src/FSharp.Stats/FSharp.Stats.fsproj) — main library compile order
- [tests/FSharp.Stats.Tests/FSharp.Stats.Tests.fsproj](tests/FSharp.Stats.Tests/FSharp.Stats.Tests.fsproj) — test compile order

An otherwise correct code change can fail to compile if the new file is missing from the project file or inserted in the wrong slot.
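
The "missing from the project file" half of this can be checked mechanically. The helper below is a minimal sketch, assuming source files are listed as `<Compile Include="..."/>` items with paths relative to the project directory; `missing_from_fsproj` is a hypothetical name, not an existing repo script. It only detects files that are absent from the project file. It cannot tell you whether a listed file sits in the right slot, which still needs a build.

```sh
# Print every .fs file under the project directory that the given .fsproj
# does not list in a <Compile Include="..."/> item.
# Usage: missing_from_fsproj path/to/Project.fsproj
missing_from_fsproj() {
  fsproj=$1
  dir=$(dirname "$fsproj")
  # Paths named in Compile items, with Windows-style backslashes normalized.
  listed=$(sed -n 's/.*<Compile Include="\([^"]*\)".*/\1/p' "$fsproj" | tr '\\' '/')
  # Every .fs file on disk, relative to the project directory, skipping
  # build output folders.
  (cd "$dir" && find . -name '*.fs' ! -path './bin/*' ! -path './obj/*' |
    sed 's#^\./##' | sort) |
  while IFS= read -r f; do
    printf '%s\n' "$listed" | grep -qxF "$f" ||
      printf 'not listed in %s: %s\n' "$fsproj" "$f"
  done
}

# Example:
#   missing_from_fsproj src/FSharp.Stats/FSharp.Stats.fsproj
```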

## Adding a new statistical / ML method

When a PR or commit introduces a new statistical method (test, estimator, distribution, ML algorithm, etc.), it is expected to cite a **reference implementation** so reviewers can validate numerics. The current repo is not fully uniform about this yet, but new method work should follow this rule — undocumented numeric code is effectively unreviewable.

What to include:

1. **Link to a canonical reference implementation** in the PR description and/or in a comment above the function. Acceptable references (in rough order of preference):
   - R (`stats`, `MASS`, CRAN packages) — link to source or function docs.
   - Python (`numpy`, `scipy.stats`, `scikit-learn`, `statsmodels`) — link to source on GitHub or stable docs.
   - A peer-reviewed paper (DOI) when no canonical implementation exists.

2. **A small reproducible script** in the reference language that produces the expected numbers. Put it either:
   - Inline in the PR description (preferred for review), and/or
   - As a comment block above the corresponding test in [tests/FSharp.Stats.Tests/](tests/FSharp.Stats.Tests/), so the expected values in the test are traceable.

3. **Tests that pin the numbers from that script.** The test should assert the same values the reference script produces (within an explicit tolerance), and the comment should make the provenance obvious.

Example comment style for a test:

```fsharp
// Reference: scipy.stats.shapiro
// https://github.com/scipy/scipy/blob/v1.13.0/scipy/stats/_morestats.py
//
// >>> from scipy import stats
// >>> stats.shapiro([1.0, 2.0, 3.0, 4.0, 5.0])
// ShapiroResult(statistic=0.9868..., pvalue=0.9672...)
let ``shapiro matches scipy on [1..5]`` () = ...
```

If you cannot find a reference implementation, say so explicitly in the PR and propose how the numbers were validated (hand derivation, paper, cross-check against another method). Do not silently ship unverified numerics.

## Conventions

- Match the surrounding F# style; prefer adding to existing modules over creating new top-level ones.
- The code styling in this repo changed over time. Follow the style of the area you are editing, not necessarily the style of the oldest code in the repo.
  - For older functional and nested-module style, see [src/FSharp.Stats/Correlation.fs](src/FSharp.Stats/Correlation.fs) and [src/FSharp.Stats/Quantile.fs](src/FSharp.Stats/Quantile.fs).
  - For newer ergonomic APIs with static members, overloads, and optional parameters, see [src/FSharp.Stats/Integration/Integration.fs](src/FSharp.Stats/Integration/Integration.fs), [src/FSharp.Stats/Signal/QQPlot.fs](src/FSharp.Stats/Signal/QQPlot.fs), [src/FSharp.Stats/Testing/ConfusionMatrix.fs](src/FSharp.Stats/Testing/ConfusionMatrix.fs), and [src/FSharp.Stats/Fitting/LinearRegression.fs](src/FSharp.Stats/Fitting/LinearRegression.fs).
- When adding new ergonomic APIs, prefer a two-tier shape: a core implementation that takes all parameters explicitly, plus overloads or convenience entrypoints for common defaults.
- If you change public API or user-facing behavior, update the relevant docs script in [docs/](docs/) and keep XML documentation comments in sync. Good public API examples to mirror include [src/FSharp.Stats/Integration/Integration.fs](src/FSharp.Stats/Integration/Integration.fs) and [src/FSharp.Stats/Fitting/LinearRegression.fs](src/FSharp.Stats/Fitting/LinearRegression.fs).
- Run `./build.sh RunTests` before opening a PR.
- Target the `developer` branch with PRs.
- Avoid churning checked-in benchmark output under `benchmarks/**/BenchmarkDotNet.Artifacts` unless you are intentionally refreshing benchmark results.
- Keep PRs focused — one method (or one tightly-related family) per PR makes the reference-implementation review tractable.
- Absolutely no changes to code should come without (regression) tests, even if no reference implementation is available. If you add code, you must add tests that validate it.
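
The benchmark-output convention above is easy to check before opening a PR. The filter below is a small sketch: `benchmark_churn` is a hypothetical helper name, and the usage line assumes the upstream remote is called `origin`.

```sh
# Read changed paths on stdin and print those that fall under checked-in
# benchmark output (benchmarks/**/BenchmarkDotNet.Artifacts).
benchmark_churn() {
  grep -E '^benchmarks/(.*/)?BenchmarkDotNet\.Artifacts(/|$)' || true
}

# Example, assuming the remote is named "origin":
#   git diff --name-only origin/developer...HEAD | benchmark_churn
```

Empty output means the branch does not touch benchmark artifacts; any printed path is worth a second look unless the refresh is intentional.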
