Re: XML encoding guessing

Subject: Re: XML encoding guessing
From: Milan Holzäpfel
Date: Tue, 13 Jun 2006 22:40:25 +0200
Newsgroups: opera.beta
On Thu, 08 Jun 2006 07:23:49 +0100
"Peter Karlsson" <peter@xxxxxxxxx> wrote:

> Christoph Schneegans:
> 
> > Freudian slip? I'm sure Opera uses its XML parser to process  
> > <http://schneegans.de/sv/test-cases/?case=meta-only-encoding>.
> 
> Yes, and the HTML parser to understand. At least that is what I
> would assume, without being an expert on the parsing engine.
> 
> > Don't you see that this is evil? Opera now accepts XML documents
> > that all other XML software would reject or at least decode as
> > UTF-8. You're about to destroy XHTML and repeat the sad story which
> > happened to HTML.
> 
> We are making use of the information available to make the best of
> the situation. 

A conforming XML agent MUST NOT make use of this piece of information.
The XML specification explicitly forbids it: when no external encoding
information is provided and the document carries no encoding
declaration (as part of the XML declaration), it is a fatal error for
the document to use an encoding other than UTF-8 (or UTF-16, if a BOM
is present). See <URL:http://www.w3.org/TR/REC-xml/#charencoding>, as
already mentioned in the other part of the thread.

The XML spec forbids making "the best of the situation".
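
To make the rule concrete, here is a rough Python sketch of how I read
section 4.3.3 for the case where no external encoding information is
available. The function name decode_xml_entity is made up for
illustration, the declaration check is a simplified regex that only
handles ASCII-compatible encodings, and this is of course not Opera's
code. The point is that nothing outside the BOM and the XML declaration
(a meta element, for instance) is ever consulted, and that bytes which
do not decode cause an error instead of a guess.

```python
import codecs
import re

# Simplified match for an encoding declaration inside the XML declaration.
# Assumption: the entity is in an ASCII-compatible encoding when no BOM is present.
_ENCODING_DECL = re.compile(
    rb'''^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def decode_xml_entity(data: bytes) -> str:
    """Decode an XML entity for which no external encoding info exists.

    Raises UnicodeDecodeError when the bytes do not match the encoding
    determined from the BOM or the encoding declaration; this stands in
    for the fatal error that the XML spec requires.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Python's "utf-16" codec reads the BOM to pick the byte order.
        return data.decode("utf-16")
    match = _ENCODING_DECL.match(data)
    # Neither BOM nor encoding declaration: the entity must be UTF-8.
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    return data.decode(encoding)  # strict decoding, no guessing
```

With this, a document that only carries a meta element naming
ISO-8859-1 and contains a raw 0xE4 byte is rejected, while the same
bytes with encoding="ISO-8859-1" in the XML declaration are accepted.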

This is an XML point of view.  I don't think that applying any XHTML
rules is appropriate for a document served as application/xml, since
RFC 3023 says about application/xml:

<URL:http://www.rfc-editor.org/rfc/rfc3023.txt>,
section "3.2 Application/xml Registration"
|       If an application/xml entity is received where the charset
|       parameter is omitted, no information is being provided about the
|       charset by the MIME Content-Type header.  Conforming XML
|       processors MUST follow the requirements in section 4.3.3 of [XML]
|       that directly address this contingency.

(section 4.3.3 is what I am referring to above...)
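
The "external information" in question is just the charset parameter
of the Content-Type header. A small Python sketch, using the standard
email.message module to parse the header (the function name
external_charset is my own invention), shows the two cases: either the
parameter is there, or it is omitted and section 4.3.3 of the XML spec
applies.

```python
from email.message import Message
from typing import Optional


def external_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None.

    None means the header provides no encoding information, so for
    application/xml the processor has to follow XML section 4.3.3.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("charset")  # None when the parameter is omitted


print(external_charset("application/xml; charset=ISO-8859-1"))
print(external_charset("application/xml"))
```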

(btw, I don't think that users even benefit from Opera's guessing game
at the moment: I doubt that there is a significant number of such
invalid documents out there, or even a significant number of documents
served as application/xml, and people who use XHTML now are usually
eager to fix things...)

Regards,
Milan

-- 
Milan Holzaepfel <mail(a)mjh(d)name>             <URL:http://mjh.name/>
pub  4096R/C790FC23  EB8E 5E81 81E3 53A9 9B74  B895 5179 54C0 C790 FC23 

