Hello,
I would like to use the GFF3toolkit to remove some gene models (all with one isoform, taken from an external list) from a gff3 file. I first ran
gff3_QC -g assembly_MAKER1.gff -f assembly.fa -o QC_report1 -s QC_stats1
and got this report:
==> QC_report <==
Line_num Error_code Error_level Error_tag
['Line 1'] Esf0014 Error ["##gff-version" missing from the first line]
['Line 15079'] Esf0012 Info [Found 5 Ns in CDS feature of length 296 using the external FASTA, consists of 1 segment (start, length): (210940, 5)]
==> QC_stats <==
Error_code Number_of_problematic_models Error_level Error_tag
Esf0014 1 Error ##gff-version" missing from the first line
Esf0012 1 Info Found Ns in a feature using the external FASTA
(I can fix the header myself)
I wonder how I can use gff3_fix to remove ~1500 genes (gene, mRNA, exon, and CDS lines): is it possible to create a 4-column file to submit to -qc_r? Can I use any of the error codes that have a "delete_model" function? Is there a way to specify the gene ID instead of the line number?
Also, is there a feature to remove gene models whose protein sequence does not start with M?
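If the toolkit has no option for this, the fallback I have in mind for the ID-based removal is a small standalone script along these lines. This is only a sketch and not part of GFF3toolkit: the function name `remove_gene_models` is my own, and it assumes standard nine-column GFF3 where features are linked through `ID` and `Parent` attributes.

```python
def remove_gene_models(gff_lines, gene_ids):
    """Return gff_lines without the listed genes and all features below them.

    Hypothetical helper, not a GFF3toolkit function. Assumes tab-separated
    nine-column GFF3 with ID/Parent attributes in column 9.
    """
    remove = set(gene_ids)
    parsed = []
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            # Directives, comments, and malformed lines are kept untouched.
            parsed.append((line, None, []))
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        parents = attrs["Parent"].split(",") if "Parent" in attrs else []
        parsed.append((line, attrs.get("ID"), parents))

    # Propagate removal from gene to mRNA to exon/CDS. Repeating until nothing
    # changes also covers files where children appear before their parents.
    changed = True
    while changed:
        changed = False
        for _, feature_id, parents in parsed:
            if (
                feature_id is not None
                and feature_id not in remove
                and any(p in remove for p in parents)
            ):
                remove.add(feature_id)
                changed = True

    return [
        line
        for line, feature_id, parents in parsed
        if feature_id not in remove and not any(p in remove for p in parents)
    ]
```

I would read the gene IDs from my list (one per line) into a set, pass the lines of the gff3 file through the function, and write the result to a new file. A feature is dropped as soon as any one of its parents is removed, which should be fine here because every gene has a single isoform.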
Thanks,
Dario