At 9:14 AM -0800 1/4/05, [email protected] wrote:
This whole question of what 'matches' is subtle. Consider the case
when I have a document that has variant content by language (e.g.
different sound tracks), and the user indicates a set of preferred
languages. If the content has "de-CH" and "fr-CH" (Swiss German and
French), and a default "en" (English) and the user says he speaks
"de-DE" and "fr-FR", on the face of it nothing matches, and I fall
back to the catch-all default, which is almost certainly not the best
David, this isn't the half of it. The case you describe is actually one of the
easy ones, in that it can be handled by doing a "preferred" match on
tag, with a "generic" match on the primary tag only having lesser precedence
but higher precedence than a fallback to a default.
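
The three-tier precedence described above (full-tag match, then primary-tag
match, then the default) can be sketched roughly as follows. This is only an
illustration, not code from the draft or from any deployed implementation:
the function names choose_variant and primary_subtag are made up for the
example, and the sketch assumes tags compare case-insensitively and that
the user's list is in preference order.

```python
from typing import Sequence


def primary_subtag(tag: str) -> str:
    """Return the primary subtag, e.g. 'de' for 'de-CH' (hypothetical helper)."""
    return tag.split("-", 1)[0].lower()


def choose_variant(preferred: Sequence[str],
                   available: Sequence[str],
                   default: str) -> str:
    """Pick a content variant using three tiers of decreasing precedence."""
    # Tier 1: "preferred" match on the whole tag, in the user's order.
    by_full_tag = {tag.lower(): tag for tag in available}
    for want in preferred:
        hit = by_full_tag.get(want.lower())
        if hit is not None:
            return hit

    # Tier 2: "generic" match on the primary subtag only.
    for want in preferred:
        for tag in available:
            if primary_subtag(tag) == primary_subtag(want):
                return tag

    # Tier 3: nothing matched, so fall back to the catch-all default.
    return default


if __name__ == "__main__":
    content = ["de-CH", "fr-CH", "en"]
    # The user speaks de-DE and fr-FR; a primary-tag match selects de-CH
    # instead of falling back to the English default.
    print(choose_variant(["de-DE", "fr-FR"], content, "en"))
```

Running the generic pass only after every full-tag match has failed is what
gives it lesser precedence than an exact match but higher precedence than
the default. It does nothing for the harder cases discussed below.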
Yes, I picked off an easy example for which the 'matching' section of
the draft didn't seem adequate. This really is a tar-pit, of course.
Serbo-Croatian used to be a language; now it's Serbian and Croatian.
I assume that they are mutually intelligible. Serbian is probably a
better substitute for Croatian than some general default (or
silence), though saying this in some parts of the world might start
The whole question of what is a language, a variant or dialect of a
language, or a suitable substitute for a language, would benefit from some
thought in any tagging scheme, though I agree the problem is not
I know of two other wrinkles in the RFC 1766 world:
(1) Matching may want to take into account the distinguished nature
of country subtags in some way.
(2) SGN- requires special handling, in that SGN-FR and SGN-EN are in fact
sufficiently different languages that a primary tag match should not be
taken to be a generic match. (Of course this only matters if sign
languages are relevant to your situation - in many cases they aren't.
In retrospect I think it was a mistake to register sign languages this
This proposed revision, however, opens Pandora's box with regard to matching.
(a) Extension tags appear as the first subtags, and as such have to
be taken into account when looking for country subtags.
(b) Script tags change the complexion of the matching problem significantly,
in that they can interact with external factors like charset information
in odd ways.
(c) UN country numbers have been added (IMO for no good reason), requiring
handling similar to country codes.
The bottom line is that while I know how to write reasonable code to do RFC
1766 matching (and have in fact done so for widely deployed software), I
haven't a clue how to handle this new draft competently in regards
And the immediate consequence of this is that I, and I suspect many other
implementors, are going to adopt a "wait and see" attitude with regard to
implementing any of this.