All structured lines require an ID which must be unique within their type, i.e., within all the meta-information lines with the same ``\verb|##|\emph{key}\verb|=|'' prefix.
-For all of the structured lines (\verb|##INFO|, \verb|##FORMAT|, \verb|##FILTER|, etc.) described in this specification, extra fields can be included after the default fields.
+For all of the structured lines (\verb|##INFO|, \verb|##FORMAT|, \verb|##FILTER|, etc.) described in this specification, optional fields can be included.
-In the above example, the extra fields of ``Source'' and ``Version'' are provided.
+In the above example, the optional fields of ``Source'' and ``Version'' are provided.
The values of optional fields must be written as quoted strings, even for numeric values.
-Other structured lines not defined by this specification may also be used; the only default field for such lines is the required \verb|ID| field.
+Other structured lines not defined by this specification may also be used; the only required field for such lines is the \verb|ID| field.

It is recommended in VCF and required in BCF that the header includes tags describing the reference and contigs backing the data contained in the file.
These tags are based on the SQ field from the SAM spec; all tags are optional (see the VCF example above).

+To aid human readability, the order of fields should be ID, Number, Type, Description, then any optional fields.
+Implementations must not rely on the order of the fields within structured lines and are not required to preserve field ordering.
+
Meta-information lines are optional, but if they are present then they must be completely well-formed.
Other than \verb|##fileformat|, they may appear in any order.
Note that BCF, the binary counterpart of VCF, requires that all entries are present.
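
As an illustration of the ordering and quoting rules above, either of the following forms of a hypothetical \verb|##INFO| line would be acceptable (only one would appear in a given file, since IDs must be unique within their type).
The ID \verb|DP|, the description, and the Source and Version values are invented for this example and are not taken from the specification.
The first form follows the recommended field order; the second does not, but a reader should still accept it.
In both, the optional Version value is quoted even though it is numeric.

\begin{verbatim}
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth",Source="exampletool",Version="3">
##INFO=<Description="Total depth",Type=Integer,Number=1,ID=DP,Source="exampletool",Version="3">
\end{verbatim}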
@@ -150,7 +153,7 @@ \subsubsection{File format}
\subsubsection{Information field format}
-INFO fields are described as follows (first four keys are required, source and version are recommended):
+INFO meta-information lines are structured lines with required fields of ID, Number, Type, and Description, and recommended optional fields of Source and Version:
@@ -177,29 +180,31 @@ \subsubsection{Information field format}
Source and Version values likewise must be surrounded by double-quotes and specify the annotation source (case-insensitive, e.g.\ \verb|"dbsnp"|) and exact version (e.g.\ \verb|"138"|), respectively for computational use.

\subsubsection{Filter field format}
-FILTERs that have been applied to the data are described as follows:
+FILTER meta-information lines are structured lines with required fields of ID and Description that define the possible content of the FILTER column in the VCF records:

\begin{verbatim}
##FILTER=<ID=ID,Description="description">
\end{verbatim}

\subsubsection{Individual format field format}
-Genotype fields specified in the FORMAT field are described as follows:
+FORMAT meta-information lines are structured lines with required fields of ID, Number, Type, and Description that define the possible content of the per-sample/genotype columns in the VCF records:
Possible Types for FORMAT fields are: Integer, Float, Character, and String (this field is otherwise defined precisely as the INFO field).
+The Number field is defined as per the INFO Number field.

\subsubsection{Alternative allele field format} \label{altfield}
-Symbolic alternate alleles are described as follows:
+ALT meta-information lines are structured lines with required fields of ID and Description that describe the possible symbolic alternate alleles in the ALT column of the VCF records:
+
\begin{verbatim}
##ALT=<ID=type,Description="description">
\end{verbatim}

\noindent\textbf{Structural Variants} \newline
-In symbolic alternate alleles for imprecise structural variants, the ID field indicates the type of structural variant, and can be a colon-separated list of types and subtypes.
+In symbolic alternate alleles for structural variants, the ID field indicates the type of structural variant, and can be a colon-separated list of types and subtypes.
ID values are case sensitive strings and must not contain whitespace, commas or angle brackets.
The first level type must be one of the following:
\begin{itemize}
@@ -232,7 +237,6 @@ \subsubsection{Alternative allele field format} \label{altfield}
##ALT=<ID=M,Description="IUPAC code M = A/C">
\end{verbatim}

-
\subsubsection{Assembly field format}
Breakpoint assemblies for structural variations may use an external file: