Commit 3692643

Committee has decided not to elide excess brackets in character classes
This affects the existing [[:rname:^*=]]... and the new [[:print:]].
1 parent 4560a37 commit 3692643

1 file changed

Lines changed: 4 additions & 5 deletions

SAMv1.tex

@@ -33,7 +33,8 @@
 \newcommand*{\firstbytebox}[2]{\byteboxAux{#1}{#2}{\put(0,0){\line(0,1){\bytetotalheight}}}}
 \newcommand*{\bytebox}[2]{\byteboxAux{#1}{#2}{}}

-\newcommand*{\cclass}[1]{{\rm\sf :#1:}}
+\newcommand*{\cclass}[1]{[{\rm\sf :#1:}]}
+\newcommand*{\cclassexcept}[2]{[{\rm\sf :#1:}\caret #2]}
 \newcommand*{\caret}{\textsuperscript{$\wedge$}}

 \newcommand*{\memlimited}{\textcolor{gray}{\footnotesize\it limited}}
@@ -81,7 +82,6 @@ \section{The SAM Format Specification}
 For example, floating-point values in SAM always use `{\tt .}' for the decimal-point character.

 The regular expressions in this specification are written using the POSIX\,/\,IEEE Std 1003.1 extended syntax.
-For brevity, named character classes are written as~{\tt [\cclass{class}]} without an additional pair of brackets.

 \subsection{An example}\label{sec:example}
 Suppose we have the following alignment with bases in lowercase
@@ -213,9 +213,7 @@ \subsubsection{Character set restrictions}\label{sec:charset}
 {\tt [\verb"0-9A-Za-z!#$%&+./:;?@^_|~-"][\verb"0-9A-Za-z!#$%&*+./:;=?@^_|~-"]*}
 \end{center}

-% Pedantically this should be [[:rname:]^*=][[:rname:]]*, but we take advantage
-% of POSIX (Issue 7) section 9.3.5/8 to elide the excess brackets for clarity.
-\newcommand*{\rnameRegexp}{[\cclass{rname}\caret*=][\cclass{rname}]*}
+\newcommand*{\rnameRegexp}{[\cclassexcept{rname}{*=}][\cclass{rname}]*}

 \noindent
 For clarity, elsewhere in this specification we write this set of allowed characters as a character class~{\tt [\cclass{rname}]} and extend the POSIX regular expression notation to use {\tt\caret *=} to indicate the omission of `{\tt *}' and `{\tt =}' from the character class.
@@ -305,6 +303,7 @@ \subsection{The header section}
 These alternative names are not used elsewhere within the SAM file;
 in particular, they must not appear in alignment records' {\sf RNAME}
 or~{\sf RNEXT} fields.
+\newline
 \emph{Regular expression}: \emph{name}{\tt (,}\emph{name}{\tt )*}
 where \emph{name} is {\tt\rnameRegexp}\\\cline{2-3}
 & {\tt AS} & Genome assembly identifier. \\\cline{2-3}
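
The character-set rule that the \rnameRegexp macro abbreviates can be checked mechanically. The sketch below is illustrative only and is not part of the commit: it copies the two explicit bracket expressions from the "Character set restrictions" hunk and the name(,name)* pattern from the header-section hunk. The function names is_valid_rname and is_valid_alt_names are hypothetical. Python's re module is not a POSIX ERE engine, but these bracket expressions contain no backslashes or named classes, so they are assumed to behave the same way here.

```python
import re

# Bracket expressions copied from the specification text quoted in the diff.
# The first character may not be '*' or '='; later characters may be.
RNAME_FIRST = r"[0-9A-Za-z!#$%&+./:;?@^_|~-]"
RNAME_REST = r"[0-9A-Za-z!#$%&*+./:;=?@^_|~-]"
RNAME = RNAME_FIRST + RNAME_REST + "*"

RNAME_RE = re.compile(RNAME)
# Comma-separated list of names, i.e. name(,name)* as in the header-section hunk.
ALT_NAMES_RE = re.compile(RNAME + "(," + RNAME + ")*")


def is_valid_rname(name: str) -> bool:
    """Return True if the whole string matches the reference-name pattern."""
    return RNAME_RE.fullmatch(name) is not None


def is_valid_alt_names(value: str) -> bool:
    """Return True if the whole string is a comma-separated list of valid names."""
    return ALT_NAMES_RE.fullmatch(value) is not None
```

Under these assumptions, is_valid_rname("chr*1") holds because '*' is allowed after the first character, while is_valid_rname("*chr1") does not. A comma is not in the allowed set, which is what lets it act as the separator in the alternative-names list.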
