For the etn R package, it would be useful to have a function that takes a dataframe + schema, and casts the columns of that dataframe to the types defined in the schema, thereby guaranteeing that the types are the expected ones. This function likely has wider use, which is why it is suggested here.
It is similar to:
readr::type_convert(). By default it guesses appropriate types for character columns, but col_types can also be provided.
frictionless::read_resource(), which takes a CSV + schema. We can reuse the internal cols() function.
library(tibble)
library(readr)
library(frictionless)
# df where the last column holds boolean values stored as character
(df <- tibble::tribble(
~txt,~txt_bool,
"a","1",
"b","0"
))
#> # A tibble: 2 × 2
#> txt txt_bool
#> <chr> <chr>
#> 1 a 1
#> 2 b 0
schema <- create_schema(df)
schema$fields[[2]]$type <- "boolean"
schema
#> $fields
#> $fields[[1]]
#> $fields[[1]]$name
#> [1] "txt"
#>
#> $fields[[1]]$type
#> [1] "string"
#>
#>
#> $fields[[2]]
#> $fields[[2]]$name
#> [1] "txt_bool"
#>
#> $fields[[2]]$type
#> [1] "boolean"
# readr::type_convert() guesses dbl for last column
readr::type_convert(df)
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> txt = col_character(),
#> txt_bool = col_double()
#> )
#> # A tibble: 2 × 2
#> txt txt_bool
#> <chr> <dbl>
#> 1 a 1
#> 2 b 0
# readr::type_convert() with provided col_types uses correct lgl for last column
readr::type_convert(df, col_types = frictionless:::cols(schema))
#> # A tibble: 2 × 2
#> txt txt_bool
#> <chr> <lgl>
#> 1 a TRUE
#> 2 b FALSE
Created on 2025-10-07 with reprex v2.1.1
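Building on the reprex above, the proposed function could be sketched roughly as follows. The name `cast_types()` and its signature are hypothetical and not part of frictionless. The sketch assumes that the internal `frictionless:::cols()` keeps accepting a schema as in the reprex, and that coercing every column to character before re-parsing is acceptable (`readr::type_convert()` only converts character columns).

```r
library(readr)
library(frictionless)

# Hypothetical helper: name and signature are assumptions, not frictionless API.
cast_types <- function(df, schema) {
  # type_convert() only re-parses character columns, so coerce every column
  # to character first. Assumption: a round trip through text is acceptable.
  df[] <- lapply(df, as.character)
  # frictionless:::cols() is internal and may change without notice.
  readr::type_convert(df, col_types = frictionless:::cols(schema))
}

# Same data and schema as in the reprex
df <- tibble::tribble(
  ~txt, ~txt_bool,
  "a",  "1",
  "b",  "0"
)
schema <- create_schema(df)
schema$fields[[2]]$type <- "boolean"

casted <- cast_types(df, schema)
```

Whether the character round trip is appropriate for all field types (for example dates or high-precision numbers) would need to be decided as part of the actual implementation.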
Todo
add_resource())readr::type_convert()add_resource()