TypeScript implementation of the Train Benchmark for incremental SPARQL query engines.
The benchmark runs query/transform cycles on a railway RDF model and records query latency, memory use, and incremental result changes (additions/deletions).
- Node.js + npm
- A local checkout of `incremunica` next to this repository (this project uses `file:../incremunica/...` dependencies in `package.json`)

    npm install
    npm run build

- Create a local benchmark config JSON (see example below).
- Run: `node bin/index.js -f data/benchmarkConfigs/config-local.json`

The `-f` argument is required.
The repository contains data/benchmarkConfigs/config-complete.json, but it uses machine-specific absolute paths. Create your own config file with local paths.

    {
      "commonConfig": {
        "randomSeed": "incremunica",
        "baseResultPath": "results/resultsData/",
        "baseConfigPath": "results/benchmarkConfigs/",
        "baseIncremunicaConfigPath": "data/configs/"
      },
      "benchmarkConfigs": [
        {
          "matchTransformPercentage": 30,
          "joinAlgorithm": ["full-hash-join"],
          "dataPath": ["data/models/railway-batch-1-inferred.ttl"],
          "operationStrings": [
            ["BatchConnectedSegments", "InjectConnectedSegments", "RepairConnectedSegments"]
          ],
          "numberOfTransforms": 2,
          "numberOfRuns": 1
        }
      ]
    }

- `randomSeed`: random seed used for selecting transformation matches.
- `baseResultPath`: directory where CSV result files are written.
- `baseConfigPath`: directory where generated expanded benchmark configs are written.
- `baseIncremunicaConfigPath`: directory containing join algorithm config folders (`<name>/engine.js`).
- `joinAlgorithm`: one or more algorithm names. Supported out of the box:
  - `computational-bind-join`
  - `delta-query`
  - `full-hash-join`
  - `memory-bind-join`
  - `nestedloop-join`
  - `partial-match-hash-join`
  - `partial-delete-hash-join`
- `dataPath`: one or more RDF model files.
- `operationStrings`: one operation chain or a list of operation chains.
- `numberOfTransforms`: number of transform/recheck rounds per run.
- `numberOfRuns`: number of measured runs (a warm-up run happens before these).
- `matchTransformPercentage` or `matchTransformAmount`: how many matches are transformed per round.
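
Because `operationStrings` accepts either a single chain or a list of chains, code that reads the config has to handle both shapes. The sketch below shows one way to type a `benchmarkConfigs` entry and normalize that field. The interface and both helper names are illustrative and not taken from the repository, and the rule that at least one match-selection field must be present is an assumption.

```typescript
// Shape of one "benchmarkConfigs" entry, inferred from the example config.
// Hypothetical type: the repository may define this differently.
interface BenchmarkConfigEntry {
  joinAlgorithm: string[];
  dataPath: string[];
  // Either one chain (flat array) or a list of chains (nested array).
  operationStrings: string[] | string[][];
  numberOfTransforms: number;
  numberOfRuns: number;
  matchTransformPercentage?: number;
  matchTransformAmount?: number;
}

// Always return a list of chains, wrapping a single chain if needed.
function normalizeOperationChains(ops: string[] | string[][]): string[][] {
  if (ops.length === 0) {
    return [];
  }
  // A flat array of operation names is a single chain.
  if (typeof ops[0] === "string") {
    return [ops as string[]];
  }
  return ops as string[][];
}

// Assumption: a usable entry sets at least one of the two match-selection fields.
function hasMatchSelection(entry: BenchmarkConfigEntry): boolean {
  return entry.matchTransformPercentage !== undefined
    || entry.matchTransformAmount !== undefined;
}
```
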
- CSV files are created in `commonConfig.baseResultPath`.
- Expanded per-run configs are written to `commonConfig.baseConfigPath`.
- Cache files are written to `data/cachedResults/` (derived from `baseIncremunicaConfigPath`).
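
The README does not spell out how a config entry is expanded into per-run configs. If the expansion is a plain cross product of the list-valued fields (an assumption), it could look like the sketch below. `ExpandedRun` and `expandRuns` are hypothetical names, not the repository's code.

```typescript
// One fully specified run. Hypothetical type for illustration only.
interface ExpandedRun {
  joinAlgorithm: string;
  dataPath: string;
  operationChain: string[];
}

// Assumed rule: every join algorithm is combined with every data file
// and every operation chain.
function expandRuns(
  joinAlgorithms: string[],
  dataPaths: string[],
  operationChains: string[][],
): ExpandedRun[] {
  const runs: ExpandedRun[] = [];
  for (const joinAlgorithm of joinAlgorithms) {
    for (const dataPath of dataPaths) {
      for (const operationChain of operationChains) {
        runs.push({ joinAlgorithm, dataPath, operationChain });
      }
    }
  }
  return runs;
}
```

Under this assumption, two join algorithms, one model file, and two operation chains would yield four expanded runs.
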
This benchmark runs in Node.js workers, so a browser-based engine needs a Node-facing adapter.
Replace the current engine construction:

    const queryEngine = new QueryEngineBase(require(config.queryEngineConfig));

with your own adapter instance (or factory) that exposes:

- `queryBindings(queryString, { sources })`

You will also need to decide what to use for transformationQueryEngine:
- same engine adapter as `queryEngine`, or
- a separate engine dedicated to transformation operations.
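
A minimal sketch of such an adapter follows. It assumes a hypothetical engine whose entry point is `run(sparql, sources)` and that `queryBindings` returns a promise of the bindings stream, as Comunica-style engines do. `MyEngine`, `QueryEngineAdapter`, and `createAdapter` are illustrative names only.

```typescript
// Hypothetical third-party engine with its own entry point.
// S stands for whatever bindings stream type that engine produces.
interface MyEngine<S> {
  run(sparql: string, sources: unknown[]): Promise<S>;
}

// The single method the driver is described as calling.
interface QueryEngineAdapter<S> {
  queryBindings(queryString: string, context: { sources: unknown[] }): Promise<S>;
}

// Wrap the engine so it can stand in for the QueryEngineBase instance.
function createAdapter<S>(engine: MyEngine<S>): QueryEngineAdapter<S> {
  return {
    queryBindings(queryString, context) {
      // Forward the query text and sources unchanged.
      return engine.run(queryString, context.sources);
    },
  };
}
```

With a factory like this, `queryEngine` and `transformationQueryEngine` can share one adapter instance, or each can wrap its own engine.
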
Driver currently creates a StreamingStore from a Turtle file. If your engine requires a different source/store type, change this creation step accordingly and keep driver.streamingStore compatible with the operations listed below.
There are two important integration points:
- Store cloning in the constructor:
  - currently hardcoded to `new StreamingStore<Quad>()`.
  - replace this with your store/adapter type if needed.
- Incremental result consumption in `query()`:
  - assumes the bindings stream supports Node readable semantics (`readable` event + `.read()`).
  - assumes each binding has `binding.diff` where `true` means deletion and `false`/missing means addition.
If your engine emits a different change format, normalize it here before the changeBindingsMap logic.
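
As a sketch of that normalization step, suppose an engine emits tagged `insert`/`delete` events instead of a `diff` flag. The event shape, `normalizeChange`, and `drainReadable` below are all hypothetical. The `diff` convention follows the description above (`true` means deletion), and `ReadableLike` models only the `.read()` half of Node readable semantics.

```typescript
// Normalized form, using the convention described above:
// diff === true is a deletion, false is an addition.
interface DiffBinding<B> {
  binding: B;
  diff: boolean;
}

// Hypothetical change format emitted by some other engine.
type EngineChange<B> =
  | { kind: "insert"; row: B }
  | { kind: "delete"; row: B };

function normalizeChange<B>(change: EngineChange<B>): DiffBinding<B> {
  return { binding: change.row, diff: change.kind === "delete" };
}

// Minimal view of a Node-style readable: read() returns null once drained.
interface ReadableLike<T> {
  read(): T | null;
}

// Drain everything currently buffered, as a 'readable' handler would,
// and split the normalized items into additions and deletions.
function drainReadable<T, B>(
  stream: ReadableLike<T>,
  normalize: (item: T) => DiffBinding<B>,
): { additions: B[]; deletions: B[] } {
  const additions: B[] = [];
  const deletions: B[] = [];
  let item = stream.read();
  while (item !== null) {
    const { binding, diff } = normalize(item);
    (diff ? deletions : additions).push(binding);
    item = stream.read();
  }
  return { additions, deletions };
}
```
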
getResults() currently uses @comunica/query-sparql-rdfjs as a non-incremental reference engine for cached baseline results. If that does not work with your store/source type, replace this part with your own baseline evaluator.
The benchmark logic expects these methods/properties on the store used by Driver and Operation:
- `addQuad(quad)`
- `removeQuad(quad)`
- `match(...)` returning a readable quad stream (`data`/`end`)
- `import(quadStream)`
- `halt()`
- `resume()`
- `flush()`
- `isHalted()`
- `getStore()` returning an object that:
  - is iterable (`for (const quad of getStore())`)
  - supports `match(...)`
  - supports `getQuads(subject, predicate, object, graph)`

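
The list above can be written down as a local interface, which is also a starting point for relaxing Incremunica-specific types. The sketch below uses deliberately loose parameter and return types because the README does not state them. `BenchmarkStoreLike`, `InnerStoreLike`, and `missingStoreMethods` are hypothetical names.

```typescript
// Loose quad shape; a real adapter would use its RDF library's quad type.
interface QuadLike {
  subject: unknown;
  predicate: unknown;
  object: unknown;
  graph: unknown;
}

// Object returned by getStore(): iterable, with match and getQuads.
interface InnerStoreLike extends Iterable<QuadLike> {
  match(...args: unknown[]): unknown;
  getQuads(subject: unknown, predicate: unknown, object: unknown, graph: unknown): unknown;
}

// Methods the benchmark logic is described as calling on the store.
interface BenchmarkStoreLike {
  addQuad(quad: QuadLike): unknown;
  removeQuad(quad: QuadLike): unknown;
  match(...args: unknown[]): unknown; // readable quad stream (data/end)
  import(quadStream: unknown): unknown;
  halt(): void;
  resume(): void;
  flush(): void;
  isHalted(): boolean;
  getStore(): InnerStoreLike;
}

const REQUIRED_STORE_METHODS = [
  "addQuad", "removeQuad", "match", "import",
  "halt", "resume", "flush", "isHalted", "getStore",
] as const;

// Runtime check: names of required methods a candidate store lacks.
function missingStoreMethods(candidate: unknown): string[] {
  if (typeof candidate !== "object" || candidate === null) {
    return REQUIRED_STORE_METHODS.slice();
  }
  const record = candidate as Record<string, unknown>;
  return REQUIRED_STORE_METHODS.filter((name) => typeof record[name] !== "function");
}
```

Running a check like `missingStoreMethods(myStore)` before starting a benchmark gives an early, readable error instead of a failure in the middle of a transform round.
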
- `bin/index.ts`: add your engine name in `getJoinConfigPath()` if you still want to configure engines via `joinAlgorithm`.
- `lib/Driver.ts` and `lib/operations/Operation.ts`: relax Incremunica-specific TypeScript types to local interfaces if your adapter uses different runtime types.
- `lib/operations/inject/*` and `lib/operations/repair/*`: update if your store needs quads in a different concrete representation.