Commit 6e4e0ce

Add Copilot instructions files
- generic instruction file: .github/copilot-instructions.md
- instruction file specific to the CI workflows: .github/instructions/ci-workflows.instructions.md
- instruction file specific to the documentation generation: .github/instructions/doc-changes.instructions.md
- instruction file specific to the development Docker image maintenance: .github/instructions/docker-changes.instructions.md
- instruction file specific to the maintenance of the Python code itself: .github/instructions/python-changes.instructions.md
1 parent b3216b2 commit 6e4e0ce

5 files changed

Lines changed: 665 additions & 0 deletions

.github/copilot-instructions.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
1+
# Copilot Instructions for khiops-python
2+
3+
Use this file as the shared repository guide. When you work in a path covered by a
4+
scoped instruction file, apply both this document and the matching file in
5+
`.github/instructions/`.
6+
7+
## Scoped Instruction Files
8+
9+
- `.github/instructions/python-changes.instructions.md` — Python source and test
10+
changes (`**/*.py`)
11+
- `.github/instructions/docker-changes.instructions.md` — development Docker image
12+
changes (`packaging/docker/khiopspydev/**`)
13+
- `.github/instructions/doc-changes.instructions.md` — documentation source changes
14+
(`doc/**`)
15+
- `.github/instructions/ci-workflows.instructions.md` — GitHub Actions workflow
16+
changes (`.github/workflows/**`)
17+
18+
## Architecture
19+
20+
Khiops Python is a Python interface to the **Khiops AutoML suite** for building
21+
supervised models (classifiers, regressors, encoders) and unsupervised models
22+
(coclusterings). It provides two ways to use Khiops from Python:
23+
24+
- **`khiops.core`** — The low-level API that drives the Khiops binaries via
25+
dictionary files (`.kdic`, `.kdicj`) and tabular data files. The code that implements this API must depend only on Python standard-library modules.
26+
- `core.api` — public functions such as `train_predictor` and
27+
`train_recoder`
28+
- `core.dictionary` — data classes for Khiops dictionary files (in the
29+
`.kdic` and JSON `.kdicj` formats)
30+
- `core.coclustering_results` — data classes for Khiops report files
31+
(`.khj`, `.khcj`)
32+
- `core.internals.runner` — backend abstraction for local, Docker, and other
33+
execution modes, configurable with `get_runner()` and `set_runner()`
34+
- `core.internals.filesystems` — filesystem abstraction for local, S3, GCS and
35+
Azure access
36+
- `core.internals.task`, `core.internals.tasks/` — task definitions for
37+
Khiops operations
38+
- **`khiops.sklearn`** — scikit-learn compatible estimators built on top of
39+
`khiops.core`. The code that implements these estimators may depend only on Pandas and Scikit-learn.
40+
```
41+
KhiopsEstimator(ABC, BaseEstimator)
42+
├── KhiopsCoclustering(ClusterMixin)
43+
└── KhiopsSupervisedEstimator
44+
├── KhiopsPredictor
45+
│ ├── KhiopsClassifier(ClassifierMixin)
46+
│ └── KhiopsRegressor(RegressorMixin)
47+
└── KhiopsEncoder(TransformerMixin)
48+
```
49+
- `sklearn.dataset` — normalizes DataFrames, file paths, and multi-table
50+
dictionaries into Khiops-compatible datasets
51+
- **`khiops.extras`** — Optional integrations such as the Docker runner
52+
- **`khiops.samples`** — Sample scripts, also used to generate parts of the
53+
documentation via `doc/convert-samples-hook`
54+
55+
Keep changes inside these layer boundaries.
56+
57+
## Shared Conventions
58+
59+
### Python Style
60+
61+
- Use **paragraph-oriented programming**: group code into short paragraphs with
62+
a comment header describing the intent, separated by blank lines. Avoid
63+
commenting every line.
64+
- Format Python code with **Black** (88-character line length) and sort imports
65+
with **isort** using the Black profile. Configuration is in `pyproject.toml`.
66+
- Black does not wrap long literal strings. Wrap those manually and use
67+
`pylint --disable=all --enable=line-too-long khiops/` to find violations.
68+
- Address all pylint **errors** (code E). Other pylint warnings are lower
69+
priority. Use judgment rather than following the linter blindly.
70+
- Keep code and comments in English.
71+
- `pylint: disable=invalid-name` is used in `khiops/sklearn/estimators.py` to
72+
allow scikit-learn's `X` and `y` naming convention. Do not add that
73+
suppression elsewhere.
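
As a minimal sketch of the paragraph-oriented style described in the first bullet above: each block of related statements gets one intent comment and is separated from the next by a blank line. The function `summarize_scores` and its `name,score` input format are hypothetical and not part of khiops-python.

```python
def summarize_scores(lines):
    """Return (count, mean) for the numeric scores in 'name,score' lines."""
    # Parse the score column, skipping blank lines
    scores = []
    for line in lines:
        if not line.strip():
            continue
        _, raw_score = line.split(",", 1)
        scores.append(float(raw_score))

    # Fail early on empty input so that the mean below is well defined
    if not scores:
        raise ValueError("no scores found")

    # Compute the summary statistics
    count = len(scores)
    mean = sum(scores) / count

    return count, mean
```

Each comment describes what the paragraph is for, not what each individual line does.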
74+
75+
### Dependency Rules
76+
77+
- `khiops.core` must import only Python standard-library modules.
78+
- `khiops.sklearn` may directly depend on Pandas and Scikit-learn only.
79+
- Do not add new external dependencies without discussion. Minimize external
80+
package dependencies to reduce installation problems.
81+
- Development and documentation generation dependencies (e.g., `black`,
82+
`isort`, `sphinx`, `wrapt`, `furo`) can be more permissive, but still avoid
83+
unnecessary additions.
84+
- Test dependencies are listed in `test-requirements.txt` (`coverage`, `wrapt`).
85+
Package dependencies are extracted from `pyproject.toml` at CI time via
86+
`scripts/extract_dependencies_from_pyproject_toml.py`.
87+
88+
### Python Support Policy
89+
90+
- CI tests run against Python 3.10–3.14.
91+
92+
### Versioning
93+
94+
The project uses `MAJOR.MINOR.PATCH.INCREMENT[-PRE_RELEASE]`, where
95+
`MAJOR.MINOR.PATCH` tracks the compatible Khiops native version and `INCREMENT`
96+
tracks the Python package's own evolution.
97+
98+
For Pip and Conda packages, the dash before the pre-release atom is removed to
99+
comply with
100+
[Python version specifiers](https://packaging.python.org/en/latest/specifications/version-specifiers/#version-specifiers)
101+
(e.g., `11.0.0.2a1` instead of `11.0.0.2-a.1`).
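
The conversion described above can be sketched as follows. The function name `to_python_version` is hypothetical, and the sketch assumes that the pre-release atom always has the `a.1` shape shown in the example (a label, a dot, a number); other shapes are not covered by the source.

```python
def to_python_version(version):
    """Convert MAJOR.MINOR.PATCH.INCREMENT[-PRE_RELEASE] to its Pip/Conda form."""
    # Split the release part from the optional pre-release atom
    base, separator, pre_release = version.partition("-")

    # Versions without a pre-release atom are already compliant
    if not separator:
        return version

    # Drop the dash and the dot inside the pre-release atom
    return base + pre_release.replace(".", "")
```

With this sketch, `to_python_version("11.0.0.2-a.1")` gives `"11.0.0.2a1"`, and a final version such as `"11.0.0.2"` is returned unchanged.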
102+
103+
## License
104+
105+
BSD 3-Clause Clear. See `LICENSE.md`.
.github/instructions/ci-workflows.instructions.md

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
---
2+
applyTo: ".github/workflows/**"
3+
---
4+
5+
# CI Workflow Changes
6+
7+
Use these rules for files under `.github/workflows/`. Apply the shared guidance
8+
from `.github/copilot-instructions.md` first, then this workflow-specific
9+
guidance.
10+
11+
## Workflow Overview
12+
13+
This repository has seven GitHub Actions workflows in `.github/workflows/`. Most
14+
workflows use concurrency groups to cancel in-progress runs when superseded,
15+
except `release.yml` (no concurrency group) and `api-docs.yml` (which uses a
16+
`pages` concurrency group that does not cancel in-progress runs).
17+
18+
### `quick-checks.yml`
19+
20+
Runs pre-commit hooks on every pull request and on `workflow_dispatch`. The
21+
hooks (configured in
22+
`.pre-commit-config.yaml`) are: Black, pylint, isort (with special no-sections
23+
config for sample files), yamlfix, shellcheck, GitHub workflow/action schema
24+
validation (`check-github-workflows`, `check-github-actions`), and a local
25+
`samples-generation` hook that regenerates reST samples when
26+
`khiops/samples/samples.py` or `khiops/samples/samples_sklearn.py` change.
27+
28+
### `tests.yml`
29+
30+
The main test suite. Triggers on PRs that touch `khiops/**/*.py`,
31+
`tests/**/*.py`, `tests/resources/**` (excluding `tests/resources/**/*.md`), or
32+
the workflow file itself. Also supports `workflow_dispatch`.
33+
34+
Three job groups:
35+
36+
- **`run`** (Linux matrix): Runs across Python 3.10–3.14 in custom Docker
37+
containers (`ghcr.io/khiopsml/khiops-python/khiopspydev-ubuntu22.04`). Each
38+
Python version uses a dedicated Conda environment with native Khiops.
39+
Coverage is collected with `coverage` and reported as XML. Test results use
40+
JUnit XML via `unittest-xml-reporting`.
41+
- **`check-khiops-integration-on-linux`**: Runs integration tests on multiple
42+
Linux containers (ubuntu22.04, rocky8, rocky9, debian13). Validates Khiops
43+
status, runs samples, tests major-version mismatch detection with a
44+
`py3_khiops10_conda` environment, and runs the integration test suite.
45+
- **`check-khiops-integration-on-windows`**: Installs Khiops Desktop via NSIS
46+
installer on Windows 2022 with Python 3.12. Runs integration tests and
47+
samples outside a Python virtual environment, then installs khiops-python
48+
inside a venv and validates the installation status.
49+
50+
**Expensive tests** (remote file access with S3/GCS/Azure): Skipped by default
51+
on feature branches. Enabled on `main`/`main-v10` branches or via the
52+
`run-expensive-tests` workflow dispatch input. These require GCP Workload
53+
Identity Federation, a local fake S3 server, and Azure storage credentials.
54+
55+
**Environment variables**: `KHIOPS_SAMPLES_DIR` points to a checkout of
56+
`khiopsml/khiops-samples`. `KHIOPS_PROC_NUMBER=4` forces MPI multi-process
57+
execution. MPI oversubscribe flags are set for Open MPI 4.x and 5+.
58+
59+
### `pip.yml`
60+
61+
Builds an **sdist** package (no wheel) and tests it in Docker containers
62+
(ubuntu22.04, rocky9, debian13). Triggers on:
63+
64+
- Tag pushes (any tag) — automatically publishes to GitHub Releases
65+
- PRs touching `pyproject.toml`, `LICENSE.md`, or the workflow file
66+
- `workflow_dispatch` with optional `pypi-target` choice (`None`, `testpypi`,
67+
`pypi`)
68+
69+
Publishing to TestPyPI/PyPI uses OIDC Trusted Publishing and requires the
70+
corresponding GitHub environment (`testpypi` or `pypi`). Only runs for the
71+
`KhiopsML` org on tag pushes.
72+
73+
### `release.yml`
74+
75+
Manual workflow that merges `dev` into `main`, tags the merge commit with the
76+
provided version, and resets `dev` to `main`. Only triggered via
77+
`workflow_dispatch` with a `version` input.
78+
79+
### `api-docs.yml`
80+
81+
Builds Sphinx documentation inside a dev Docker container. Triggers on:
82+
83+
- Tag pushes — builds docs and uploads a zip archive to GitHub Releases
84+
- PRs touching `doc/**/*.rst`, `doc/create-doc`, `doc/clean-doc`, `doc/*.py`,
85+
`khiops/**/*.py`, or the workflow file
86+
- `workflow_dispatch` with optional tutorial and samples revision inputs
87+
88+
Uses the `khiopspydev-ubuntu22.04` Docker image and runs
89+
`./create-doc -t -d -g <revision>`. Uses a `pages` concurrency group that does
90+
**not** cancel in-progress runs (to avoid interrupting production deployments).
91+
92+
### `dev-docker.yml`
93+
94+
Builds development Docker images for multiple OS targets (ubuntu22.04, rocky8,
95+
rocky9, debian13) with configurable Khiops revision, server revision, Python
96+
versions (3.10–3.14), and remote file driver versions (GCS, S3, Azure).
97+
Triggers on PRs touching `packaging/docker/khiopspydev/Dockerfile.*` or the
98+
workflow file, and on `workflow_dispatch`. Images are pushed to
99+
`ghcr.io/khiopsml/khiops-python/khiopspydev-*` only when manually requested via
100+
`push: true`. The `set-latest` flag only works on the `main` or `main-v10`
101+
branches.
102+
103+
### `test-conda-forge-package.yml`
104+
105+
Manual-only workflow that tests the released `khiops` Conda package on the
106+
`conda-forge` channel across a broad matrix: Python 3.10–3.14 × multiple OS
107+
environments (Ubuntu 20.04/22.04/24.04, Rocky 8/9, Windows 2022/2025, macOS
108+
14/15/15-Intel). Tests both normal Conda environments and "Conda-based
109+
environments" (where `CONDA_PREFIX` is unset to simulate non-Conda invocation).
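
To illustrate the `CONDA_PREFIX` technique mentioned above, here is a minimal Python sketch. It is an assumption about how such a simulation could be done, not the workflow's actual implementation (which may simply unset the variable in a shell step). The helper name `env_without_conda_prefix` is hypothetical.

```python
import os


def env_without_conda_prefix(environ=None):
    """Return a copy of the environment with CONDA_PREFIX removed."""
    # Default to the current process environment
    source = os.environ if environ is None else environ

    # Copy every variable except CONDA_PREFIX, leaving the original untouched
    return {
        name: value for name, value in source.items() if name != "CONDA_PREFIX"
    }
```

The returned mapping can be passed as the `env` argument of `subprocess.run` so that the child process sees the Conda-installed files but no active Conda environment.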
110+
111+
## Editing Rules
112+
113+
- Workflow YAML files are validated by pre-commit hooks
114+
(`check-github-workflows`, `check-github-actions`) and formatted by `yamlfix`.
115+
- The dev Docker images are the test environment for both `tests.yml` and
116+
`pip.yml`. If you need new system dependencies in CI, they go into the
117+
Dockerfiles under `packaging/docker/khiopspydev/`.
118+
- Test dependencies are in `test-requirements.txt` (`coverage`, `wrapt`).
119+
Package dependencies are extracted from `pyproject.toml` at CI time via
120+
`scripts/extract_dependencies_from_pyproject_toml.py`.
