Hello,
While filtering a VCF file by MAF, I noticed that some variants with MAF below the specified threshold were kept in the output:
bcftools view -i 'MAF > 0.01' -O z -o file_MAF0.01.vcf.gz file.vcf.gz
bcftools +fill-tags file_MAF0.01.vcf.gz -O z -o file_MAF0.01_TAGS.vcf.gz -- -t MAF
Minimum MAF observed: 0.00055
When I invert the order of the steps, the MAF filter works as expected:
bcftools +fill-tags file.vcf.gz -O z -o file_TAGS.vcf.gz -- -t MAF
bcftools view -i 'MAF > 0.01' -O z -o file_TAGS_MAF0.01.vcf.gz file_TAGS.vcf.gz
Minimum MAF observed: 0.01001
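For reference, a minimum like the ones reported above could be extracted along the following lines. This is only a sketch: `min_maf` is a hypothetical helper name, and it assumes the MAF values are available as INFO/MAF (as written by +fill-tags) and that missing values appear as ".".

```shell
# Hypothetical helper: print the smallest numeric value read from stdin.
# Empty lines and missing values written as "." are skipped.
min_maf() {
    awk 'NF && $1 != "." && (!seen || $1 + 0 < min + 0) { min = $1; seen = 1 }
         END { if (seen) print min }'
}

# Intended usage (assumes INFO/MAF was added by +fill-tags):
#   bcftools query -f '%INFO/MAF\n' file_MAF0.01_TAGS.vcf.gz | min_maf
#   bcftools query -f '%INFO/MAF\n' file_TAGS_MAF0.01.vcf.gz | min_maf
```

Running the same check on both outputs keeps the comparison between the two orderings consistent.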
The starting VCF file is the same in both cases and contains only biallelic SNPs. It seems that the MAF written by +fill-tags and the MAF calculated on the fly by the filtering expression are not the same. How do the two calculations differ?
Thanks,
Annette