Allow for UTF-8 field values in header regular expression #726
jmarshall wants to merge 2 commits into samtools:master from
I'm unsure on the extra brackets also. My inclination though is it's probably not worth the hassle of inventing our own syntax and just going with the official double bracket style. I'm guessing the extra brackets however were to permit things like
Which specific
Ah, yes, you're right. I had a set operation wrong when I tested with an example.
I see I was assigned this in the last meeting. Personally my preference is
Can I get clarity on who is progressing this please? I was assigned, but gave my feedback over a year ago and it's unchanged. As far as I'm concerned, the ball is back with @jmarshall, but if you wish me to just make an editorial decision then I will amend the
The ball was indeed with me. See the new preview for how this uglifies
Use `[:print:]` in the header regex and note that for ASCII it is equivalent to `[ -~]` and that the aim is to forbid control characters. Fixes samtools#719.
This affects the existing [[:rname:^*=]]... and the new [[:print:]].
Use `[:print:]` in the header regex and note that for ASCII it is equivalent to `[ -~]` and that the aim is to forbid control characters. Fixes #719.
To be honest, I'm tempted to add the extra `[]` to the `\cclass` definition and waste a bit of space each time this appears rather than add the “For brevity” sentence.
An alternative to this PR might be to just leave the regex as `[ -~]` and add a footnote explaining that this is an oversimplification for fields that allow Unicode values.
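For anyone wanting to see the difference between the two readings concretely, here is a rough Python sketch. It is only an illustration, not the specification's definition: Python's `re` module has no POSIX `[:print:]` class, so the Unicode-tolerant check is approximated by rejecting characters in Unicode general category `Cc` (control characters), which is an assumption about the intent. The function names `is_ascii_printable` and `has_no_control_chars` are hypothetical.

```python
import re
import unicodedata

# ASCII-only form of the class: space (0x20) through tilde (0x7E).
ASCII_PRINT = re.compile(r"[ -~]+")


def is_ascii_printable(value: str) -> bool:
    """True if the value is non-empty and every character is in ' '..'~'."""
    return ASCII_PRINT.fullmatch(value) is not None


def has_no_control_chars(value: str) -> bool:
    """Hypothetical Unicode-tolerant check (an assumption, not the spec's wording).

    Accepts any non-empty value as long as no character has general
    category Cc, i.e. the C0/C1 control characters and DEL.
    """
    return value != "" and not any(
        unicodedata.category(ch) == "Cc" for ch in value
    )


if __name__ == "__main__":
    for sample in ["Homo sapiens", "Zoë", "a\tb"]:
        print(repr(sample), is_ascii_printable(sample), has_no_control_chars(sample))
```

Under these assumptions, a value such as `Zoë` fails the ASCII-only range but passes the control-character check, while a value containing a tab fails both, which is the behaviour the "forbid control characters" wording seems to be aiming for.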