Use CHAR/VARCHAR types in TPCDSTables #198

@maropu

The TPC-DS schemas differ between spark-sql-perf's TPCDSTables and TPCDSBase in Spark master/branch-3.1: the former declares text columns as string, the latter as char/varchar. For example:

// spark
    "reason" ->
      """
        |`r_reason_sk` INT,
        |`r_reason_id` CHAR(16),
        |`r_reason_desc` CHAR(100)
      """.stripMargin,

// spark-sql-perf
    Table("reason",
      partitionColumns = Nil,
      'r_reason_sk               .int,
      'r_reason_id               .string,
      'r_reason_desc             .string),

To generate TPC-DS table data for Spark (master/branch-3.1), it would be nice to use CHAR/VARCHAR types in TPCDSTables.
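As a rough illustration of the requested change, here is a minimal, dependency-free sketch of the `reason` table with fixed-length character columns. Everything in it (`ColType`, `CharCol`, `VarcharCol`, `Col`, `ReasonSchema`, `toDDL`) is a hypothetical stand-in, not the spark-sql-perf `Table` DSL or any Spark API. The column lengths are copied from the Spark-side definition quoted above.

```scala
// Hypothetical model of column types for illustration only;
// this is NOT the spark-sql-perf DSL or a Spark API.
sealed trait ColType { def ddl: String }
case object IntCol extends ColType { val ddl = "INT" }
case object StringCol extends ColType { val ddl = "STRING" }
final case class CharCol(length: Int) extends ColType { def ddl = s"CHAR($length)" }
final case class VarcharCol(length: Int) extends ColType { def ddl = s"VARCHAR($length)" }

// A named column with its declared type.
final case class Col(name: String, tpe: ColType)

object ReasonSchema {
  // The `reason` table, using CHAR(n) where spark-sql-perf currently uses string.
  val reason: Seq[Col] = Seq(
    Col("r_reason_sk", IntCol),
    Col("r_reason_id", CharCol(16)),
    Col("r_reason_desc", CharCol(100)))

  // Render columns in the backtick-quoted DDL style of the Spark snippet.
  def toDDL(cols: Seq[Col]): String =
    cols.map(c => s"`${c.name}` ${c.tpe.ddl}").mkString(",\n")
}
```

Rendering `ReasonSchema.toDDL(ReasonSchema.reason)` gives column definitions in the same shape as the Spark-side snippet, which makes it easy to compare the two schemas table by table.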

NOTE: This ticket comes from apache/spark#31886
