Add DuckDB functionality: Changes (99739ac) from SAE environment#35
Add DuckDB functionality: Changes (99739ac) from SAE environment#35stephhazlitt wants to merge 1 commit into
Conversation
| person("Sam", "Albers", , "sam.albers@gov.bc.ca", role = c("aut", "cre")), | ||
| person("Andy", "Teucher", , "andy.teucher@gov.bc.ca", role = "aut", | ||
| comment = c(ORCID = "0000-0002-7840-692X")), | ||
| person("Bonnie", "Robert", , "bonnie.robbie@gov.bc.ca", role = "aut"), |
There was a problem hiding this comment.
| person("Bonnie", "Robert", , "bonnie.robbie@gov.bc.ca", role = "aut"), | |
| person("Bonnie", "Ashcroft", , "bonnie.ashcroft@gov.bc.ca", role = "aut"), |
| #' `list(thousands = ",", comment = "#", force_not_null = c("a", "b"), null_padding = TRUE, buffer_size = 5)` | ||
| #' | ||
| #' @return a character string of the query, to be called with [DBI::dbExecute()] | ||
| #' @export |
There was a problem hiding this comment.
an example here would be helpful - using one of the built-in datasets, perhaps?
There was a problem hiding this comment.
| #' @export | |
| #' @export | |
| #' @examples | |
| #' \dontrun{ | |
| #' con <- DBI::dbConnect(duckdb::duckdb()) | |
| #' p <- "./inst/extdata/starwars-dict.csv" | |
| #' spec <- dput(sniff_csv_duckdb(p)$schema) | |
| #' csv_to_duckdb_query(con, p, "starwars", spec) | |
| #' } |
| nullstr = NULL, | ||
| opts = list() | ||
| ) { | ||
|
|
There was a problem hiding this comment.
Maybe not necessary given the users and use cases of dipr, but I wonder about guarding against SQL injection. In this case via the tbl_name parameter:
if (grepl(";", tbl_name) stop(glue::glue("possible SQL injection introduced via tablename: {tbl_name}"))
| old_column_names <- names(schema) | ||
|
|
||
| names(new_column_names) <- old_column_names | ||
| names(old_column_names) <- old_column_names |
There was a problem hiding this comment.
This logic is confusing; code comments would be helpful.
|
|
||
| check_params(con, opts) | ||
|
|
||
| varchar_scalar_opts <- Filter(\(x) is.character(x) & length(x) == 1, opts) |
There was a problem hiding this comment.
I'm tempted to extract the code that constructs the various *_opts_parsed variables into individual functions. Consider refactoring into smaller, focused functions like:
format_varchar_scalar_opts <- function(opts) { ... }
format_varchar_array_opts <- function(opts) { ... } etc.
| ) | ||
| } | ||
|
|
||
| make_select_statement <- function(schema, col_rename = NULL, dt_override = NULL) { |
There was a problem hiding this comment.
When the final SQL CREATE or REPLACE TABLE statement is executed via DBI::dbExecute(), column names with spaces (like First Name AS first_name) will cause SQL syntax errors. The dipr functions are behaving as expected, but it could be a "nice to have" to warn against this. Here's an example:
> print(strSQL)
CREATE OR REPLACE TABLE starwars AS SELECT Name Abbr AS name_abbr, Name AS name FROM read_csv( './inst/extdata/starwars-dict.csv', types = { 'Name Abbr': 'VARCHAR', 'Name': 'VARCHAR' })
> DBI::dbExecute(con,strSQL)
This will cause dbSendQuery() to error.
|
I focused my review on the code in the |
Changes from @ateucher from the DIP SRE.