Commit 351aff6

Added build/drop index and renamed low-level R/C++ API
Fixes #67 #98
1 parent 37c8525 commit 351aff6

31 files changed

Lines changed: 1553 additions & 1131 deletions

.github/workflows/document.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,7 @@ jobs:
3838
shell: Rscript {0}
3939

4040
- name: Commit and push changes
41+
working-directory: RcppTskit
4142
run: |
4243
git config --local user.name "$GITHUB_ACTOR"
4344
git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com"

.pre-commit-config.yaml

Lines changed: 11 additions & 4 deletions
@@ -28,14 +28,14 @@ repos:
2828
hooks:
2929
- id: air-format
3030
name: air format
31-
entry: RcppTskit/tools/run-local-tool.sh air format .
31+
entry: RcppTskit/tools/run_local_tool.sh air format .
3232
language: system
3333
pass_filenames: false
3434
files: '\.(R|Rmd|rmd|qmd|Qmd)$'
3535

3636
- id: jarl-lint
3737
name: jarl lint
38-
entry: RcppTskit/tools/run-local-tool.sh jarl check .
38+
entry: RcppTskit/tools/run_local_tool.sh jarl check .
3939
language: system
4040
pass_filenames: false
4141
files: '\.(R|Rmd|rmd|qmd|Qmd)$'
@@ -48,6 +48,13 @@ repos:
4848

4949
- id: clang-tidy
5050
name: clang-tidy for RcppTskit
51-
entry: python RcppTskit/tools/clang-tidy.py
52-
language: python
51+
entry: RcppTskit/tools/clang_tidy.py
52+
language: system
5353
files: '\.(c|cc|cpp|cxx|h|hh|hpp|hxx)$'
54+
55+
- id: check-sync-between-cpp-and-hpp
56+
name: check sync between cpp and hpp options and defaults
57+
entry: RcppTskit/tools/check_sync_between_cpp_and_hpp.R
58+
language: system
59+
pass_filenames: false
60+
files: '^(RcppTskit/src/RcppTskit\.cpp|RcppTskit/inst/include/RcppTskit_public\.hpp)$'
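The `files` key of the new `check-sync-between-cpp-and-hpp` hook is a Python regular expression that `pre-commit` matches against changed paths. The sketch below shows which paths would trigger the hook. The helper name `hook_is_triggered` and the example change sets are hypothetical and used only for illustration; only the pattern itself is taken from the config.

```python
import re

# Pattern copied from the `files:` key of the check-sync-between-cpp-and-hpp hook.
FILES_PATTERN = re.compile(
    r"^(RcppTskit/src/RcppTskit\.cpp|RcppTskit/inst/include/RcppTskit_public\.hpp)$"
)


def hook_is_triggered(changed_paths):
    """Return True if any changed path matches the hook's `files` pattern.

    Hypothetical helper: it mimics the search-style matching that pre-commit
    applies to `files`, it is not part of pre-commit itself.
    """
    return any(FILES_PATTERN.search(path) for path in changed_paths)


# Hypothetical change sets, for illustration only.
print(hook_is_triggered(["RcppTskit/src/RcppTskit.cpp"]))  # True
print(hook_is_triggered(["RcppTskit/inst/include/RcppTskit.hpp"]))  # False
print(hook_is_triggered(["README.md", "RcppTskit/inst/include/RcppTskit_public.hpp"]))  # True
```

Because the hook sets `pass_filenames: false`, a match only decides whether the script runs; the matched paths are not passed to it as arguments.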

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -167,7 +167,7 @@ Hook responsibilities:
167167
* `air format .`: format R, Rmd, and qmd files.
168168
* `jarl check .`: lint R, Rmd, and qmd files.
169169
* `clang-format -i --style=file`: format C/C++ sources and headers.
170-
* `python RcppTskit/tools/clang-tidy.py`: run clang-tidy checks for C/C++.
170+
* `python RcppTskit/tools/clang_tidy.py`: run clang-tidy checks for C/C++.
171171
* Standard pre-commit hygiene hooks:
172172
whitespace, line endings, YAML checks,
173173
merge-conflict markers, and large-file checks.
@@ -202,7 +202,7 @@ export CLANG_TIDY="$(brew --prefix llvm)/bin/clang-tidy"
202202
Then you can run the wrapper script directly:
203203

204204
```sh
205-
python RcppTskit/tools/clang-tidy.py RcppTskit/src/RcppTskit.cpp
205+
python RcppTskit/tools/clang_tidy.py RcppTskit/src/RcppTskit.cpp
206206
```
207207

208208
### Coverage with covr

README.md

Lines changed: 21 additions & 9 deletions
@@ -17,10 +17,12 @@ The `Python` API can be called from `R` via the `reticulate` `R` package to
1717
seamlessly load and analyse a tree sequence, as described at
1818
https://tskit.dev/tutorials/RcppTskit.html.
1919
`RcppTskit` provides `R` access to the `tskit C` API for use cases where the
20-
`reticulate` option is not optimal. For example, for high-performance and
21-
low-level work with tree sequences. Currently, `RcppTskit` provides a limited
22-
number of `R` functions due to the availability of extensive `Python` API and
23-
the `reticulate` option.
20+
`reticulate` option is not optimal.
21+
For example, for high-performance and low-level work with tree sequences.
22+
Currently, `RcppTskit` provides a limited number of functions
23+
due to the availability of the extensive `Python` API and the `reticulate` option.
24+
The provided `RcppTskit R` API mirrors the `tskit Python` API,
25+
while the `RcppTskit C++` API mirrors the `tskit C` API.
2426

2527
See more details on the state of the tree sequence ecosystem and aims of
2628
`RcppTskit` in [the introduction vignette](https://highlanderlab.r-universe.dev/articles/RcppTskit/RcppTskit_intro.html) ([source](RcppTskit/vignettes/RcppTskit_intro.qmd)).
@@ -153,9 +155,13 @@ Specifically, we use:
153155
To install the hooks, run:
154156

155157
```
156-
pre-commit install
158+
pre-commit install --install-hooks
159+
pre-commit install --hook-type pre-push
157160
```
158161

162+
Run these once per clone.
163+
This enables automatic checks on `commit` and `push`.
164+
159165
### tskit
160166

161167
If you plan to update `tskit`, follow instructions in `extern/README.md`.
@@ -203,12 +209,18 @@ On Windows, replace `tar.gz` with `zip`.
203209

204210
### Pre-commit run
205211

206-
Before committing your changes, run the `pre-commit` hooks to ensure code quality:
212+
When committing your changes,
213+
`pre-commit` hooks should kick in automatically
214+
to ensure code quality.
215+
Manually, you can run them using:
207216

208217
```
209-
# pre-commit autoupdate # to update the hooks
210-
pre-commit run --all-files
211-
# pre-commit run <hook_id>
218+
pre-commit autoupdate # to update the hooks
219+
pre-commit run # on changed files
220+
pre-commit run --all-files # on all files
221+
pre-commit run <hook_id> # just a specific hook
222+
pre-commit run <hook_id> --all-files # ... on all files
223+
# see also --hook-stage option
212224
```
213225

214226
### Continuous integration

RcppTskit/DESCRIPTION

Lines changed: 7 additions & 5 deletions
@@ -2,7 +2,7 @@ Type: Package
22
Package: RcppTskit
33
Title: 'R' Access to the 'tskit C' API
44
Version: 0.3.0
5-
Date: 2026-01-27
5+
Date: 2026-03-01
66
Authors@R: c(
77
person("Gregor", "Gorjanc", , "gregor.gorjanc@gmail.com", role = c("aut", "cre", "cph"),
88
comment = c(ORCID = "0000-0001-8008-2787")),
@@ -16,14 +16,16 @@ Description: 'Tskit' enables efficient storage, manipulation, and analysis
1616
described in Jeffrey et al. (2026) <doi:10.48550/arXiv.2602.09649>.
1717
See also <https://tskit.dev> for project news, documentation, and
1818
tutorials. 'Tskit' provides 'Python', 'C', and 'Rust' application
19-
programming interfaces (APIs). The 'Python' API can be called from 'R' via
20-
the 'reticulate' package to load and analyse tree sequences as
19+
programming interfaces (APIs). The 'Python' API can be called from 'R'
20+
via the 'reticulate' package to load and analyse tree sequences as
2121
described at <https://tskit.dev/tutorials/tskitr.html>. 'RcppTskit'
2222
provides 'R' access to the 'tskit C' API for cases where the
2323
'reticulate' option is not optimal; for example, high-performance or
2424
low-level work with tree sequences. Currently, 'RcppTskit' provides a
25-
limited set of 'R' functions because the 'Python' API and 'reticulate'
26-
already covers most needs.
25+
limited set of functions because the 'Python' API and 'reticulate'
26+
already cover most needs. The provided 'RcppTskit R' API mirrors the
27+
'tskit Python' API, while the 'RcppTskit C++' API mirrors the
28+
'tskit C' API.
2729
License: MIT + file LICENSE
2830
URL: https://github.com/HighlanderLab/RcppTskit
2931
BugReports: https://github.com/HighlanderLab/RcppTskit/issues

RcppTskit/NEWS.md

Lines changed: 34 additions & 19 deletions
@@ -4,11 +4,11 @@ All notable changes to `RcppTskit` are documented in this file.
44
The file format is based on [Keep a Changelog](https://keepachangelog.com),
55
and releases adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
66

7-
## [0.3.0] 2026-MM-DD
7+
## [0.3.0] 2026-03-02
88

99
### Added (new features)
1010

11-
- Added the following scalar getters to match tskit C/Python API
11+
- Added the following scalar getters to match `tskit C/Python` API
1212
- `TreeSequence$discrete_genome()` to query whether genome coordinates
1313
are discrete integer values.
1414
- `TreeSequence$has_reference_sequence()` to query whether a tree sequence
@@ -25,41 +25,56 @@ and releases adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html
2525
- `TableCollection$has_index()` to query whether edge indexes are present.
2626
- Added a public header and defaults for downstream use of `C++` functions in
2727
`inst/include/RcppTskit_public.hpp`, included by `inst/include/RcppTskit.hpp`.
28+
- Added `TableCollection$build_index()` to build indexes and
29+
`TableCollection$drop_index()` to drop indexes.
2830
- TODO
2931

3032
### Changed
3133

32-
- Renamed low-level external-pointer API names from `*_ptr_*` to `*_xptr_*`
33-
(for example, `ts_ptr_load()` to `ts_xptr_load()`) to make external-pointer
34-
vs standard/raw-pointer semantics explicit.
34+
- Renamed low-level C++ and R API names so that they map onto the `tskit C` API,
35+
for example, `ts_ptr_load()` to `rtsk_treeseq_load()`.
36+
This is an internal breaking change, but a worthwhile one while the
37+
package is still young and experimental.
38+
- Renamed `TreeSequence` and `TableCollection` external-pointer field and
39+
constructor argument from `pointer` to `xptr`.
40+
- Ensured `TableCollection$tree_sequence()` matches `tskit Python` API:
41+
it now builds indexes on the `TableCollection`, if indexes are not present.
42+
- TODO
43+
44+
### Maintenance
45+
46+
- Turned vignette URLs into hyperlinks and made similar cosmetic changes.
47+
- State mirroring of the `R/Python` APIs and `C++/C` APIs across the package.
48+
- TODO
3549

3650
## [0.2.0] - 2026-02-22
3751

3852
### Added (new features)
3953

40-
- Added TableCollection R6 class alongside `tc_load()` or `TableCollection$new()`,
41-
as well as `dump()`, `tree_sequence()`, and `print()` methods.
54+
- Added `TableCollection` `R6` class alongside `tc_load()` or
55+
`TableCollection$new()`, as well as `dump()`, `tree_sequence()`, and
56+
`print()` methods.
4257

43-
- Added `TreeSequence$dump_tables()` to copy tables into a TableCollection.
58+
- Added `TreeSequence$dump_tables()` to copy tables into a `TableCollection`.
4459

45-
- Added TableCollection and reticulate Python round-trip helpers:
60+
- Added `TableCollection` and reticulate `Python` round-trip helpers:
4661
`TableCollection$r_to_py()` and `tc_py_to_r()`.
4762

48-
- Changed the R API to follow tskit Python API for loading:
63+
- Changed the `R` API to follow `tskit Python` API for loading:
4964
`ts_load()`, `tc_load()`, `TreeSequence$new()`, and `TableCollection$new()`
5065
now use `skip_tables` and `skip_reference_sequence` logical arguments instead
5166
of an integer `options` bitmask.
5267

5368
- Removed user-facing `options` from `TreeSequence$dump()`,
5469
`TreeSequence$dump_tables()`, `TableCollection$dump()`, and
55-
`TableCollection$tree_sequence()` to match R API with the tskit Python API,
56-
while C++ API has the bitwise `options` like the tskit C API.
70+
`TableCollection$tree_sequence()` to match `R` API with the `tskit Python` API,
71+
while `C++` API has the bitwise `options` like the `tskit C` API.
5772

58-
- The bitwise options passed to C++ are now validated.
73+
- The bitwise options passed to `C++` are now validated.
5974

6075
### Changed
6176

62-
- We now specify C++20 standard to go around the CRAN Windows issue,
77+
- We now specify `C++20` standard to go around the CRAN Windows issue,
6378
see #63 for further details.
6479

6580
### Maintenance
@@ -81,21 +96,21 @@ This is the first release.
8196

8297
### Added (new features)
8398

84-
- Initial version of RcppTskit using the tskit C API (1.3.0).
99+
- Initial version of `RcppTskit` using the `tskit C` API (1.3.0).
85100

86-
- TreeSequence R6 class so R code looks Pythonic.
101+
- `TreeSequence R6` class so `R` code looks Pythonic.
87102

88-
- `ts_load()` or `TreeSequence$new()` to load a tree sequence from file into R.
103+
- `ts_load()` or `TreeSequence$new()` to load a tree sequence from file into `R`.
89104

90105
- Methods to summarise a tree sequence and its contents `ts$print()`,
91106
`ts$num_nodes()`, etc.
92107

93108
- Method to save a tree sequence to a file `ts$dump()`.
94109

95-
- Method to push tree sequence between R and reticulate Python
110+
- Method to push tree sequence between `R` and reticulate `Python`
96111
`ts$r_to_py()` and `ts_py_to_r()`.
97112

98-
- Most methods have an underlying (unexported) C++ function that works with
113+
- Most methods have an underlying (unexported) `C++` function that works with
99114
a pointer to tree sequence object, for example, `RcppTskit:::ts_ptr_load()`.
100115

101116
- All implemented functionality is documented and demonstrated with a vignette.
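The 0.2.0 entries in the `NEWS.md` diff above describe replacing an integer `options` bitmask with logical arguments in the `R` API, while the `C++` API keeps bitwise `options` and now validates them. A minimal sketch of that idea follows, written in Python for brevity. All names (`load_options`, `validate_options`, the constants) and the bit values are assumptions made for illustration; they are not taken from `RcppTskit` or from the `tskit C` headers.

```python
# Hypothetical flag values for illustration only; the real constants live in
# the tskit C headers and should be checked there.
SKIP_TABLES = 1 << 0
SKIP_REFERENCE_SEQUENCE = 1 << 1
SUPPORTED_LOAD_OPTIONS = SKIP_TABLES | SKIP_REFERENCE_SEQUENCE


def load_options(skip_tables=False, skip_reference_sequence=False):
    """Translate user-facing logical arguments into a bitmask."""
    options = 0
    if skip_tables:
        options |= SKIP_TABLES
    if skip_reference_sequence:
        options |= SKIP_REFERENCE_SEQUENCE
    return options


def validate_options(options, supported=SUPPORTED_LOAD_OPTIONS):
    """Reject negative values and any bit outside the supported set."""
    if options < 0 or options & ~supported:
        raise ValueError(f"unsupported option bits: {options}")
    return options


print(load_options(skip_tables=True))  # 1
print(load_options(skip_tables=True, skip_reference_sequence=True))  # 3
```

Keeping the logical arguments at the user-facing layer and the bitmask at the low-level layer means callers never need to know the bit values, while the low-level layer can still reject unknown bits before they reach the underlying library.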
