Hi Michael,
On Sun, Mar 30, 2008 at 5:35 PM, Michael Glavassevich
<mrglavas@xxxxxxxxxx> wrote:
> Hi Daniel,
>
> "Daniel Yokomizo" <daniel.yokomizo@xxxxxxxxx> wrote on 03/29/2008 04:45:24
> PM:
>
>
> > Hi,
> >
> > I'm parsing (disabling validation) a document that declared a DTD
> > but I would like to get the raw attribute values instead of the
> > normalized values. In particular I need to keep entity references as
> > they were written. I came across this FAQ
> > (http://xerces.apache.org/xerces-j/faq-write.html#faq-7) that seems to
> > declare that it is impossible (i.e. attribute normalization happens if
> > there's a DTD present) and I found the XMLScanner class that, via the
> > method scanAttributeValue, does the attribute normalization. I noticed
> > that we have a getNonNormalizedValue() method but the SAX parser layer
> > uses AttributesProxy which hides the getNonNormalizedValue() method.
>
> That method is part of XNI [1]. If you really need the non-normalized text
> you'd need to change your application so that it uses XNI directly (rather
> than SAX).
Thanks for your help (again). I was hoping to use the SAX interface
and not depend explicitly on Xerces, because I'm developing a library
which will be (hopefully) independent of the SAX implementation.
There's a hack I can do to "trick" Xerces, which will work with any
parser too, and I'll probably do it: essentially I'll decorate the
reader I'm giving to the parser so that it transforms every & into
&amp;. Once the parser resolves that reference it becomes & again, so
&amp; becomes &amp;amp;, which the parser transforms back into &amp;.
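
A minimal sketch of the decorator idea, using only JDK classes. The names
AmpersandEscapingReader and RawAttributeDemo are hypothetical, made up for
this illustration. The sketch escapes every ampersand blindly, so it also
rewrites ampersands inside CDATA sections and the internal DTD subset, where
the parser will not turn them back; a real implementation would probably
need to be more selective.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical decorator: rewrites every '&' in the stream as "&amp;". */
class AmpersandEscapingReader extends FilterReader {
    private static final char[] TAIL = {'a', 'm', 'p', ';'};
    // Index of the next TAIL char to emit; TAIL.length means nothing is pending.
    private int pending = TAIL.length;

    AmpersandEscapingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pending < TAIL.length) {
            return TAIL[pending++];
        }
        int c = super.read();
        if (c == '&') {
            pending = 0; // emit "amp;" on the following reads
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            // After the first char, stop instead of blocking on the source.
            if (n > 0 && pending == TAIL.length && !in.ready()) {
                break;
            }
            int c = read();
            if (c < 0) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would lose track of the pending state
    }
}

/** Hypothetical helper: parses through the decorator with any JAXP SAX parser. */
class RawAttributeDemo {
    static String firstAttribute(String xml) throws Exception {
        final String[] seen = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                if (seen[0] == null && atts.getLength() > 0) {
                    seen[0] = atts.getValue(0);
                }
            }
        };
        Reader escaped = new AmpersandEscapingReader(new StringReader(xml));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(escaped), handler);
        return seen[0];
    }
}
```

With this in place, RawAttributeDemo.firstAttribute("<a b=\"x &lt; y\"/>")
should hand back the reference as written (x &lt; y) rather than the resolved
character. Note that skip() still passes straight through to the underlying
reader in this sketch.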
> > Is there any way to configure Xerces to not normalize attribute
> > values even when the DTD is declared?
>
> Whether your document has a DTD or not is irrelevant. The FAQ (on the
> Xerces 1.x site) you read is wrong. Normalization [2] is required for every
> attribute value. You cannot disable this behaviour.
>
> > Best regards,
> > Daniel Yokomizo
> >
>
> Thanks.
>
> [1] http://xerces.apache.org/xerces2-j/javadocs/xni/index.html
> [2] http://www.w3.org/TR/2006/REC-xml-20060816/#AVNormalize
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@xxxxxxxxxx
Best regards,
Daniel Yokomizo.
---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: j-users-help@xxxxxxxxxxxxxxxxx