@@ -213,8 +213,35 @@ Runnable examples:
 
 - Built-in loaders: MNIST, Fashion-MNIST, CIFAR-10
 - URI-backed data sources: `file://`, `https://`, `hf+https://`, and `hf://...`
+- Dataset operations: deterministic shuffle/split, stratified split, filter/map/transform views, batch flows, and epoch flows
+- Raw dataset parsers: CSV, TSV, JSON arrays/objects, JSON Lines (`.jsonl`, `.ndjson`)
+- Type-safe transform DSLs: image/tensor transforms plus suspendable raw data pipelines
 - Formats: GGUF, ONNX, SafeTensors, JSON, Image (JPEG, PNG)
-- Type-safe transform DSL: resize, crop, normalize, toTensor
+
+```kotlin
+val raw = JvmDataSourceResolver().rawDataset {
+    from("hf://datasets/org/repo@main/train.jsonl")
+    format(DataFormat.JSON_LINES)
+    cachePolicy(CachePolicy.Use)
+}
+
+val withoutLabel = dataPipeline<RawDataset>()
+    .stage(
+        dataTransformer(
+            name = "drop-label",
+            outputSchema = { schema -> DataSchema(schema.columns - "label") }
+        ) { dataset ->
+            val columns = dataset.schema.columns - "label"
+            dataset.copy(
+                schema = DataSchema(columns),
+                rows = dataset.rows.map { row ->
+                    RawDataRow(row.values.filterKeys { key -> key in columns })
+                }
+            )
+        }
+    )
+    .execute(raw)
+```
 
 
 ### Edge AI: Arduino / C99 Export