It would be useful for simulation and other applications to be able to efficiently convert tskit tree sequences to VCF Zarr. While this can currently be done with tskit vcf and vcf2zarr, it's not very efficient, and some information is lost in the VCF conversion.
The CLI would look something like:
tskit2zarr convert input.trees output.vcz
We would include the standard options for multiple workers, etc. I don't think there's any need for distributed commands (but they could be added later, I guess).
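To make the proposed command line concrete, here is a rough argparse sketch of the `convert` subcommand. The `-p`/`--worker-processes` option name and its default are assumptions made for illustration; the actual option names would follow whatever the existing tools use.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `tskit2zarr convert` command."""
    parser = argparse.ArgumentParser(prog="tskit2zarr")
    subparsers = parser.add_subparsers(dest="command", required=True)

    convert = subparsers.add_parser(
        "convert", help="Convert a tskit tree sequence to VCF Zarr"
    )
    convert.add_argument("input", help="Path to the input tree sequence")
    convert.add_argument("output", help="Path to the output VCF Zarr store")
    # Hypothetical option name: stands in for the "standard options for
    # multiple workers" mentioned above.
    convert.add_argument(
        "-p",
        "--worker-processes",
        type=int,
        default=1,
        help="Number of worker processes to use",
    )
    return parser


# Parse the example invocation from this issue, plus the assumed worker option.
args = build_parser().parse_args(
    ["convert", "input.trees", "output.vcz", "-p", "4"]
)
```

Keeping `convert` as a subcommand leaves room for distributed commands to be added later without changing the existing interface.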
Operationally, we could depend on tszip and read the input with the load function added in 0.2.3. Since tszip is basically tskit + zarr, there's no harm in adding support.
We should probably make this an optional dependency, though, since it is a relatively niche use-case?
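One way the optional dependency could work is a lazy import that only fails when the tskit conversion is actually requested. This is a minimal sketch: `require_optional`, `load_tree_sequence`, and the extra name `tskit` are hypothetical names introduced here, and the only library call relied on is `tszip.load`.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # `extra` is a hypothetical name for the optional install group.
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install the optional '{extra}' extra to enable it"
        ) from exc


def load_tree_sequence(path):
    """Load a tree sequence via tszip, importing it only when needed."""
    tszip = require_optional("tszip", extra="tskit")
    # tszip.load is the function referred to above (added in 0.2.3).
    return tszip.load(path)
```

With this pattern, users who never convert tree sequences do not need tszip or tskit installed, and those who do get a clear message rather than a bare import failure.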