New tool: tskit2zarr #232

@jeromekelleher

It would be useful for simulation and other applications to be able to efficiently convert tskit tree sequences to VCF Zarr. While this can currently be done by running tskit vcf and then vcf2zarr, that route is not very efficient, and some information is lost in the VCF conversion.

The CLI would look something like

tskit2zarr convert input.trees output.vcz

We would include the standard options for multiple workers, etc. I don't think there's any need for distributed commands (but they could be added later, I guess).
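
To make the proposed interface concrete, here is a minimal sketch of the command line using only the standard library's argparse. The argument names (ts_path, zarr_path) and the -p/--worker-processes flag are assumptions for illustration, not a settled design, and the real tool may well use a different CLI framework.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for `tskit2zarr convert IN OUT`."""
    parser = argparse.ArgumentParser(prog="tskit2zarr")
    subparsers = parser.add_subparsers(dest="command", required=True)

    convert = subparsers.add_parser(
        "convert", help="Convert a tskit tree sequence to VCF Zarr"
    )
    # Positional arguments mirror the example invocation in this issue.
    convert.add_argument("ts_path", help="Input tree sequence file")
    convert.add_argument("zarr_path", help="Output VCF Zarr path")
    # Assumed flag name for the "multiple workers" option.
    convert.add_argument(
        "-p", "--worker-processes", type=int, default=1,
        help="Number of worker processes to use",
    )
    return parser


# Parse the example invocation from this issue, plus a worker count.
args = build_parser().parse_args(
    ["convert", "input.trees", "output.vcz", "-p", "4"]
)
```

Using a subcommand from the start leaves room for distributed commands to be added later without changing the existing invocation.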

Operationally, we could depend on tszip and use the load function added in 0.2.3. Since tszip is basically tskit + zarr, there's no harm in adding support.

We should probably make this an optional dependency, though, since it is a relatively niche use-case?
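
One way to keep tszip optional is to import it lazily, only when a tree sequence is actually loaded. The sketch below assumes that approach; require_optional and load_tree_sequence are hypothetical helper names, and tszip.load is the function mentioned above.

```python
import importlib


def require_optional(module_name):
    """Import an optional dependency, or raise an error saying how to get it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install it with: pip install {module_name}"
        ) from exc


def load_tree_sequence(path):
    """Load a tree sequence, importing tszip only at call time."""
    tszip = require_optional("tszip")
    return tszip.load(path)
```

With this pattern, users who never convert tree sequences do not need tszip installed, and those who do get a clear message rather than a bare import failure.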
