Add language tag format to registry #79
RFC5646 [1] defines a standardized format for language tags. This RFC is included in the more commonly known BCP47 [2].
[1]: https://www.rfc-editor.org/info/rfc5646/
[2]: https://www.rfc-editor.org/info/bcp47/
|
@handrews Would you mind reviewing this PR, or letting me know who else might be able to take a look? Thanks in advance for your time! |
|
It's unclear to me how this could possibly be validated. And for a pure annotation, an extension keyword specifically about language contents would seem better? |
|
Can you clarify what you mean by "an extension keyword"? I am not following what your proposed alternative would be for denoting that a particular value indicates a language in the format of this RFC. |
|
[EDIT: Never mind, see below]
|
|
@TimvdLippe wait... I may have this completely wrong 😅 Are you just validating that the text is a language tag? Er... yeah that's probably fine... lemme creep off and be embarrassed for a while and then respond properly 🤦 |
|
No worries at all, happy to elaborate more. Yes, this is indeed to specify that a particular value is in the language tag format. As an example, given an API response:
{
  "header": {
    "lang": "nl-NL",
    "value": "Titel"
  }
}
Then we want to say that the |
|
edit: my mistake! I misunderstood what this patch was doing. |
|
@TimvdLippe OK great. To clarify, is this format intended to restrict values only to full tags, excluding subtags and codes? Or is it intended to allow any of those three ways of indicating a language? If it applies only to full tags, we might want to add several formats instead (e.g. |
|
It was intended to allow any format allowed by the RFC. I am also okay with splitting it up into more granular formats that reference specific grammar constructs in the RFC. |
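For readers wondering what "any format allowed by the RFC" could look like as a check, here is a minimal sketch in Python. It is not part of this PR. It assumes the `langtag` and `privateuse` productions from the ABNF in RFC 5646, section 2.1, and deliberately leaves out the `grandfathered` list. The function name `is_well_formed_language_tag` is hypothetical. The check covers syntax only (what the RFC calls "well-formed"); it does not look subtags up in the IANA registry or reject duplicate variants.

```python
import re

# Assumption: this pattern mirrors the `langtag` and `privateuse`
# productions of RFC 5646, section 2.1. Grandfathered tags are omitted.
_LANGTAG = re.compile(
    r"""
    (?:
        (?:
            [a-z]{2,3}(?:-[a-z]{3}){0,3}     # language, optional extlang subtags
          | [a-z]{4,8}                       # reserved / registered language
        )
        (?:-[a-z]{4})?                       # script
        (?:-(?:[a-z]{2}|[0-9]{3}))?          # region
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*   # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*       # extensions (singleton is not "x")
        (?:-x(?:-[a-z0-9]{1,8})+)?           # trailing private use
      | x(?:-[a-z0-9]{1,8})+                 # tag consisting only of private use
    )
    """,
    re.VERBOSE | re.IGNORECASE | re.ASCII,
)


def is_well_formed_language_tag(value: object) -> bool:
    """Return True if value is a string matching the simplified grammar above."""
    return isinstance(value, str) and _LANGTAG.fullmatch(value) is not None


# The API response from the example earlier in this thread.
response = {"header": {"lang": "nl-NL", "value": "Titel"}}
print(is_well_formed_language_tag(response["header"]["lang"]))  # True
```

A single regular expression keeps the sketch short, but a real implementation might prefer a small parser so that it can report which subtag failed.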
|
@TimvdLippe I suspect "any of the formats" is pretty common, so I think the best thing to do would be to clarify this PR to explicitly say what ABNF productions or IANA registries define the allowed values (I have not read the RFC in detail, please use whatever terminology makes sense). If someone wants more precise formats, they can add them. I just wanted to avoid tying |
|
I wanted to incorporate your feedback, but then I realised I actually already did that. Or at least that was the intent. |
|
@TimvdLippe wow I've really not been on the ball in this PR... my apologies. Yes, this looks good to me. |