Commit 1fb18d4

Revamp metadata writing and add kernel id (#471)
* kernel-builder: revamp metadata writing

  * Strongly type the kernel identifier within kernel-builder.
  * Write the kernel identifier to the kernel metadata.
  * Make noarch kernels include the backend in the identifier as well.
  * Fix tvm-ffi build-time metadata update to include the arches.
  * Rename build-time metadata update script to be more general.
  * Run the build-time metadata update script for non-CUDA/ROCm backends. This is not strictly necessary now, but allows us to add other metadata bits in the future.
  * Add a build hook that validates the metadata after build using the Rust-side data structures. This ensures that build-time changes are valid and schema-conforming.

* kernels: support kernel id metadata and use as kernel name

  This change adds support for kernel id metadata and uses the kernel id (when present) as the identifier in the Python module table in place of the path hash.

* Clippy fix
* Update CLI docs
* Add additional metadata fields to the docs
* Add neuron to noarch _ops
* metadata warning: tell user how to fix it
* Make id optional

  kernels-data is going to be used by `kernels`, so we need to make sure that we can also read kernels that do not have an id yet.
1 parent c5462f1 commit 1fb18d4

31 files changed

Lines changed: 562 additions & 296 deletions

docs/source/builder-cli.md

Lines changed: 29 additions & 15 deletions
@@ -11,12 +11,13 @@ This document contains the help content for the `kernel-builder` command-line pr
 * [`kernel-builder build-and-copy`](#kernel-builder-build-and-copy)
 * [`kernel-builder build-and-upload`](#kernel-builder-build-and-upload)
 * [`kernel-builder upload`](#kernel-builder-upload)
+* [`kernel-builder check-config`](#kernel-builder-check-config)
+* [`kernel-builder check-builds`](#kernel-builder-check-builds)
 * [`kernel-builder create-pyproject`](#kernel-builder-create-pyproject)
 * [`kernel-builder devshell`](#kernel-builder-devshell)
 * [`kernel-builder list-variants`](#kernel-builder-list-variants)
 * [`kernel-builder testshell`](#kernel-builder-testshell)
 * [`kernel-builder update-build`](#kernel-builder-update-build)
-* [`kernel-builder validate`](#kernel-builder-validate)
 * [`kernel-builder skills`](#kernel-builder-skills)
 * [`kernel-builder skills add`](#kernel-builder-skills-add)
 * [`kernel-builder clean-pyproject`](#kernel-builder-clean-pyproject)
@@ -35,12 +36,13 @@ Build Hugging Face Hub kernels
 * `build-and-copy` — Build the kernel and copy artifacts locally
 * `build-and-upload` — Build the kernel and upload to Hugging Face Hub
 * `upload` — Upload kernel build artifacts to the Hugging Face Hub
+* `check-config` — Validate the build.toml file
+* `check-builds` — Validate kernel builds
 * `create-pyproject` — Generate CMake files for a kernel extension build
 * `devshell` — Spawn a kernel development shell
 * `list-variants` — List build variants
 * `testshell` — Spawn a kernel test shell
 * `update-build` — Update a `build.toml` to the current format
-* `validate` — Validate the build.toml file
 * `skills` — Install skills for AI coding assistants (Claude, Codex, OpenCode)
 * `clean-pyproject` — Clean generated artifacts
 
@@ -169,6 +171,30 @@ Upload kernel build artifacts to the Hugging Face Hub
 
 
 
+## `kernel-builder check-config`
+
+Validate the build.toml file
+
+**Usage:** `kernel-builder check-config [KERNEL_DIR]`
+
+###### **Arguments:**
+
+* `<KERNEL_DIR>`
+
+
+
+## `kernel-builder check-builds`
+
+Validate kernel builds
+
+**Usage:** `kernel-builder check-builds [KERNEL_DIR]`
+
+###### **Arguments:**
+
+* `<KERNEL_DIR>`
+
+
+
 ## `kernel-builder create-pyproject`
 
 Generate CMake files for a kernel extension build
@@ -183,7 +209,7 @@ Generate CMake files for a kernel extension build
 ###### **Options:**
 
 * `-f`, `--force` — Force-overwrite existing files
-* `--ops-id <OPS_ID>` — This is an optional unique identifier that is suffixed to the kernel name to avoid name collisions. (e.g. Git SHA)
+* `--unique-id <UNIQUE_ID>` — This is an optional unique identifier that is suffixed to the kernel name to avoid name collisions. (e.g. Git SHA)
 
 
 
@@ -253,18 +279,6 @@ Update a `build.toml` to the current format
 
 
 
-## `kernel-builder validate`
-
-Validate the build.toml file
-
-**Usage:** `kernel-builder validate [KERNEL_DIR]`
-
-###### **Arguments:**
-
-* `<KERNEL_DIR>`
-
-
-
 ## `kernel-builder skills`
 
 Install skills for AI coding assistants (Claude, Codex, OpenCode)

docs/source/builder/writing-kernels.md

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ The following sections enumerate all supported options for `build.toml`.
 
 - `name` (required): the name of the kernel. The Python code for a Torch
   extension must be stored in `torch-ext/<name>`.
-- `version` (int, **experimental**): the major version of the kernel.
+- `version` (int): the major version of the kernel.
   The version is written to the kernel's `metadata.json` and is used
   by the `kernels upload` command to upload the kernel to a version
   branch named `v<version>`.

docs/source/kernel-requirements.md

Lines changed: 40 additions & 4 deletions
@@ -36,16 +36,52 @@ must be available for that combination.
 
 ## Kernel metadata
 
-The build variant directory can optionally contain a `metadata.json` file.
-Currently the metadata specifies the kernel's version and Python dependencies,
-for example:
+The build variant directory must contain a `metadata.json` file with kernel
+metadata. Currently the following top-level keys are supported:
+
+- `id` (`str`, required): a unique identifier for the kernel. This
+  identifier must also be a valid Python module name. If the kernel
+  registers Torch ops, they must be registered as `torch.ops.<id>`
+- `version` (`int`, required): the kernel version number.
+- `backend` (`dict`, required): information about the compute backend that
+  this build variant supports.
+- `python-depends` (`list[str]`, optional): list of Python dependencies
+  from a curated set of Python dependencies.
+
+Example `metadata.json`:
 
 ```json
 {
+  "id": "_mykernel_cuda_be238e4",
   "python-depends": ["einops"],
-  "version": 1
+  "version": 1,
+  "backend": {
+    "type": "cuda",
+    "archs": ["7.0", "7.2", "7.5", "8.0", "8.6", "8.7", "8.9", "9.0+PTX"]
+  }
+}
+```
+
+The `metadata.json` file is generated automatically by `kernel-builder`.
+
+## Backend
+
+The `backend` specifies a dictionary of the following form:
+
+```json
+{
+  # ...
+  "backend": {
+    "type": "cuda",
+    "archs": ["7.0", "7.2", "7.5", "8.0", "8.6", "8.7", "8.9", "9.0+PTX"]
+  }
+}
 ```
 
+The backend `type` must be one of `cann`, `cpu`, `cuda`, `metal`, `neuron`,
+`rocm`, or `xpu`. For CUDA and ROCm, the supported architectures must
+be specified in the `archs` field.
+
 ### Python dependencies
 
 You can specify Python dependencies that your kernel requires. Dependencies can be either general (required for all backends) or backend-specific (required only for certain compute backends like CUDA, ROCm, XPU, Metal, or CPU).

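The metadata rules documented in `kernel-requirements.md` (an `id` that is a valid Python module name, a `type` from the listed backends, and `archs` for CUDA and ROCm) can be sketched as a small check. This is not the validation code from `kernels-data`, which the diff does not show. The names `BackendType`, `is_valid_kernel_id`, and `validate_metadata` are hypothetical, the identifier check is ASCII-only, and reading "must be specified" as "non-empty" is an assumption. The `id` is treated as optional here because the commit message says kernels without an id must still be readable.

```rust
use std::str::FromStr;

/// Backend types listed in the documentation (hypothetical enum name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendType {
    Cann,
    Cpu,
    Cuda,
    Metal,
    Neuron,
    Rocm,
    Xpu,
}

impl FromStr for BackendType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "cann" => Ok(Self::Cann),
            "cpu" => Ok(Self::Cpu),
            "cuda" => Ok(Self::Cuda),
            "metal" => Ok(Self::Metal),
            "neuron" => Ok(Self::Neuron),
            "rocm" => Ok(Self::Rocm),
            "xpu" => Ok(Self::Xpu),
            other => Err(format!("unknown backend type: {other}")),
        }
    }
}

/// Returns true if `id` is an ASCII Python identifier.
///
/// Assumption: Python also accepts some Unicode identifiers and reserves
/// keywords such as `class`; neither case is handled in this sketch.
pub fn is_valid_kernel_id(id: &str) -> bool {
    let mut chars = id.chars();
    match chars.next() {
        // The first character must be a letter or an underscore.
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    // The remaining characters may also be digits.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Checks the documented constraints on `id`, `backend.type`, and
/// `backend.archs`. A missing `id` is accepted.
pub fn validate_metadata(
    id: Option<&str>,
    backend_type: &str,
    archs: &[&str],
) -> Result<BackendType, String> {
    if let Some(id) = id {
        if !is_valid_kernel_id(id) {
            return Err(format!("`{id}` is not a valid Python module name"));
        }
    }
    let ty: BackendType = backend_type.parse()?;
    // Assumption: "must be specified" is read as "must be non-empty".
    if matches!(ty, BackendType::Cuda | BackendType::Rocm) && archs.is_empty() {
        return Err(format!("backend `{backend_type}` requires a non-empty `archs` list"));
    }
    Ok(ty)
}
```

With these definitions, the documented example (`_mykernel_cuda_be238e4` with `cuda` and a non-empty `archs` list) passes, while an id such as `my-kernel` or a `rocm` backend without `archs` is rejected.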
kernel-builder/src/main.rs

Lines changed: 25 additions & 12 deletions
@@ -40,6 +40,9 @@ mod skills;
 mod util;
 use util::{check_or_infer_kernel_dir, parse_and_validate};
 
+mod validate_builds;
+use validate_builds::check_builds;
+
 #[derive(Args, Debug)]
 struct NixArgs {
     /// Maximum number of parallel Nix build jobs.
@@ -127,6 +130,18 @@ enum Commands {
     /// Upload kernel build artifacts to the Hugging Face Hub.
     Upload(UploadArgs),
 
+    /// Validate the build.toml file.
+    CheckConfig {
+        #[arg(name = "KERNEL_DIR")]
+        kernel_dir: Option<PathBuf>,
+    },
+
+    /// Validate kernel builds.
+    CheckBuilds {
+        #[arg(name = "KERNEL_DIR")]
+        kernel_dir: Option<PathBuf>,
+    },
+
     /// Generate CMake files for a kernel extension build.
     CreatePyproject {
         #[arg(name = "KERNEL_DIR")]
@@ -144,7 +159,7 @@ enum Commands {
         /// This is an optional unique identifier that is suffixed to the
         /// kernel name to avoid name collisions. (e.g. Git SHA)
         #[arg(long)]
-        ops_id: Option<String>,
+        unique_id: Option<String>,
     },
 
     /// Spawn a kernel development shell.
@@ -201,12 +216,6 @@ enum Commands {
         kernel_dir: Option<PathBuf>,
     },
 
-    /// Validate the build.toml file.
-    Validate {
-        #[arg(name = "KERNEL_DIR")]
-        kernel_dir: Option<PathBuf>,
-    },
-
     /// Install skills for AI coding assistants (Claude, Codex, OpenCode).
     Skills {
         #[command(subcommand)]
@@ -333,8 +342,8 @@ fn main() -> Result<()> {
             kernel_dir,
             force,
             target_dir,
-            ops_id,
-        } => create_pyproject(kernel_dir, target_dir, force, ops_id),
+            unique_id,
+        } => create_pyproject(kernel_dir, target_dir, force, unique_id),
         Commands::Devshell {
             kernel_dir,
             variant,
@@ -359,8 +368,12 @@ fn main() -> Result<()> {
             variant,
         ),
         Commands::UpdateBuild { kernel_dir } => update_build(kernel_dir),
-        Commands::Validate { kernel_dir } => {
-            validate(kernel_dir)?;
+        Commands::CheckConfig { kernel_dir } => {
+            check_config(kernel_dir)?;
+            Ok(())
+        }
+        Commands::CheckBuilds { kernel_dir } => {
+            check_builds(kernel_dir)?;
             Ok(())
         }
         Commands::GenerateDocs => {
@@ -390,7 +403,7 @@ fn main() -> Result<()> {
     }
 }
 
-fn validate(kernel_dir: Option<PathBuf>) -> Result<()> {
+fn check_config(kernel_dir: Option<PathBuf>) -> Result<()> {
     let kernel_dir = check_or_infer_kernel_dir(kernel_dir)?;
     parse_and_validate(kernel_dir)?;
     Ok(())

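Renaming the enum variant `Validate` to `CheckConfig` in `main.rs` is what changes the CLI name from `validate` to `check-config`: clap's derive macros turn subcommand variant names into kebab-case by default. A simplified sketch of that mapping follows. The function `kebab_case` is hypothetical and handles only plain CamelCase names; clap's real conversion also covers cases such as acronyms and digits.

```rust
/// Converts a CamelCase variant name to kebab-case.
///
/// Simplified illustration only: a hyphen is inserted before every
/// uppercase ASCII letter except the first, and letters are lowercased.
pub fn kebab_case(variant: &str) -> String {
    let mut out = String::with_capacity(variant.len() + 4);
    for (i, c) in variant.chars().enumerate() {
        if c.is_ascii_uppercase() {
            if i > 0 {
                out.push('-');
            }
            out.push(c.to_ascii_lowercase());
        } else {
            out.push(c);
        }
    }
    out
}
```

Applied to the variants in this commit, `CheckConfig` and `CheckBuilds` map to the `check-config` and `check-builds` subcommands listed in `builder-cli.md`. The field `unique_id` becomes the `--unique-id` flag through a similar default conversion for long options.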
kernel-builder/src/pyproject/common.rs

Lines changed: 16 additions & 1 deletion
@@ -6,9 +6,11 @@ use itertools::Itertools;
 use kernels_data::config::{Backend, General};
 use kernels_data::metadata::{BackendInfo, Metadata};
 
+use crate::pyproject::ops_identifier::KernelIdentifier;
 use crate::pyproject::FileSet;
 
 static COMPAT_PY: &str = include_str!("templates/compat.py");
+static ADD_BUILD_METADATA_PY: &str = include_str!("templates/torch/add_build_metadata.py");
 
 pub fn write_compat_py(file_set: &mut FileSet) -> Result<()> {
     let mut path = PathBuf::new();
@@ -18,7 +20,11 @@ pub fn write_compat_py(file_set: &mut FileSet) -> Result<()> {
     Ok(())
 }
 
-pub fn write_metadata(general: &General, file_set: &mut FileSet) -> Result<()> {
+pub fn write_metadata(
+    general: &General,
+    kernel_id: &KernelIdentifier,
+    file_set: &mut FileSet,
+) -> Result<()> {
     for backend in &Backend::all() {
         let writer = file_set.entry(format!("metadata-{backend}.json"));
 
@@ -33,6 +39,7 @@ pub fn write_metadata(general: &General, file_set: &mut FileSet) -> Result<()> {
             .collect::<Result<Vec<_>>>()?;
 
         let metadata = Metadata {
+            id: Some(kernel_id.to_string_for_backend(*backend)),
             version: general.version,
             license: general.license.clone(),
             upstream: general.upstream.clone(),
@@ -61,6 +68,14 @@ where
         .join(";")
 }
 
+pub fn write_add_build_metadata_py(file_set: &mut FileSet) {
+    write_cmake_file(
+        file_set,
+        "add_build_metadata.py",
+        ADD_BUILD_METADATA_PY.as_bytes(),
+    );
+}
+
 /// Helper function to write a file to the cmake subdirectory
 pub fn write_cmake_file(file_set: &mut FileSet, filename: &str, content: &[u8]) {
     let mut path = PathBuf::new();

kernel-builder/src/pyproject/mod.rs

Lines changed: 15 additions & 15 deletions
@@ -8,7 +8,10 @@ use eyre::{bail, Result};
 use kernels_data::config::{Build, Framework};
 use minijinja::Environment;
 
-use crate::util::{check_or_infer_kernel_dir, check_or_infer_target_dir, parse_build};
+use crate::{
+    pyproject::ops_identifier::KernelIdentifier,
+    util::{check_or_infer_kernel_dir, check_or_infer_target_dir, parse_build},
+};
 
 pub(crate) mod common;
 pub mod deps;
@@ -21,21 +24,17 @@ mod tvm_ffi;
 pub use fileset::FileSet;
 pub use kernels_data::metadata::parse_metadata;
 
-pub fn create_pyproject_file_set(
-    build: Build,
-    target_dir: impl AsRef<Path>,
-    ops_id: Option<String>,
-) -> Result<FileSet> {
+pub fn create_pyproject_file_set(build: Build, kernel_id: &KernelIdentifier) -> Result<FileSet> {
     let mut env = Environment::new();
     env.set_trim_blocks(true);
     minijinja_embed::load_templates!(&mut env);
 
     let file_set = if matches!(build.framework, Framework::TvmFfi(_)) {
-        tvm_ffi::write_tvm_ffi_ext(&env, &build, target_dir, ops_id)?
+        tvm_ffi::write_tvm_ffi_ext(&env, &build, kernel_id)?
     } else if build.is_noarch() {
-        torch::write_torch_ext_noarch(&env, &build, target_dir, ops_id)?
+        torch::write_torch_ext_noarch(&env, &build, kernel_id)?
     } else {
-        torch::write_torch_ext(&env, &build, target_dir, ops_id)?
+        torch::write_torch_ext(&env, &build, kernel_id)?
     };
 
     Ok(file_set)
@@ -45,12 +44,13 @@ pub fn create_pyproject(
     kernel_dir: Option<PathBuf>,
     target_dir: Option<PathBuf>,
     force: bool,
-    ops_id: Option<String>,
+    unique_id: Option<String>,
 ) -> Result<()> {
     let kernel_dir = check_or_infer_kernel_dir(kernel_dir)?;
     let target_dir = check_or_infer_target_dir(&kernel_dir, target_dir)?;
     let build = parse_build(&kernel_dir)?;
-    let file_set = create_pyproject_file_set(build, &target_dir, ops_id)?;
+    let kernel_id = KernelIdentifier::new(&kernel_dir, build.general.name.python_name(), unique_id);
+    let file_set = create_pyproject_file_set(build, &kernel_id)?;
     file_set.write(&target_dir, force)?;
 
     Ok(())
@@ -61,14 +61,14 @@ pub fn clean_pyproject(
     target_dir: Option<PathBuf>,
     dry_run: bool,
     force: bool,
-    ops_id: Option<String>,
+    unique_id: Option<String>,
 ) -> Result<()> {
     let kernel_dir = check_or_infer_kernel_dir(kernel_dir)?;
     let target_dir = check_or_infer_target_dir(&kernel_dir, target_dir)?;
-
     let build = parse_build(&kernel_dir)?;
-    let generated_files =
-        create_pyproject_file_set(build, target_dir.clone(), ops_id)?.into_names();
+    let kernel_id = KernelIdentifier::new(&kernel_dir, build.general.name.python_name(), unique_id);
+
+    let generated_files = create_pyproject_file_set(build, &kernel_id)?.into_names();
 
     if generated_files.is_empty() {
         eprintln!("No generated artifacts found to clean.");

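The diff shows `KernelIdentifier::new(...)` and `to_string_for_backend(...)` being called, but not the contents of `ops_identifier.rs`. The sketch below is one way such a type could look, not the actual implementation. The `_<name>_<backend>_<unique id>` layout is inferred from the documented example id `_mykernel_cuda_be238e4`. The `Backend` enum is a reduced stand-in for `kernels_data::config::Backend`, and falling back to a hash of the kernel directory when no unique id is given is purely an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

/// Reduced stand-in for the real backend enum (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cpu,
    Cuda,
    Rocm,
}

impl Backend {
    fn as_str(self) -> &'static str {
        match self {
            Backend::Cpu => "cpu",
            Backend::Cuda => "cuda",
            Backend::Rocm => "rocm",
        }
    }
}

/// Hypothetical strongly typed kernel identifier.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KernelIdentifier {
    name: String,
    unique_id: String,
}

impl KernelIdentifier {
    /// Builds an identifier from the kernel's Python name and an optional
    /// unique id such as a Git SHA.
    pub fn new(kernel_dir: &Path, name: &str, unique_id: Option<String>) -> Self {
        // Assumption: without an explicit id, derive one from the kernel
        // directory. DefaultHasher output is not guaranteed to be stable
        // across Rust releases, so real code would want a fixed hash.
        let unique_id = unique_id.unwrap_or_else(|| {
            let mut hasher = DefaultHasher::new();
            kernel_dir.hash(&mut hasher);
            format!("{:x}", hasher.finish())
        });
        Self {
            name: name.to_string(),
            unique_id,
        }
    }

    /// Renders the identifier for one backend, so each build variant's
    /// metadata carries a backend-specific id.
    pub fn to_string_for_backend(&self, backend: Backend) -> String {
        format!("_{}_{}_{}", self.name, backend.as_str(), self.unique_id)
    }
}
```

Passing the identifier as a single typed value, rather than threading `target_dir` and an optional `ops_id` string through every writer as before this commit, keeps the naming rule in one place.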