Stefan Ram wrote:
> gaijinco <gaijinco@xxxxxxxxx> writes:
>> <name first="Carlos" second="" />
>> <name>
>> <first>Carlos</first>
>> <second></second>
>> </name>
>
> When a new document type is to be defined, when should one
> choose child elements and when attributes?
>
> The criterion that makes sense regarding the meaning cannot
> be used in XML due to syntactic restrictions.
That is too broad. Often it can.
> An element is describing something. A description is an
> assertion. An assertion might contain unary predicates or
> binary relations.
>
> Comparing this structure of assertions with the structure
> of XML, it seems to be natural to represent unary predicates
> with types and binary relations with attributes.
>
> Say, "x" is a rose and belongs to Jack. This assertion can
> be written in a more formal way to show the relations used:
>
> rose( x ) ^ owner( x, Jack )
>
> This is written in XML as:
>
> <rose owner="Jack" />
This is not true. It demonstrates very well a misunderstanding of text
markup that is unfortunately far too prevalent. Naming element types
after concrete objects is rare and almost always wrong. Possibly a DTD
for a horticulturalist might do this, but in normal text applications
you would write something like
<plant type="rose" owner="Jack">x</plant>
That is, "x" is an instance of a type of plant called a rose and this
one belongs to Jack.
> Thus, my answer would be: use element types for unary
> predicates and attributes for binary relations.
>
> Unfortunately, in XML, this is not always possible, because
> in XML:
>
> - there might be at most one type per element,
>
> - there might be at most one attribute value per attribute
> name, and
>
> - attribute values are not allowed to be structured in
> XML.
>
> Therefore, the designers of XML document types are forced to
> abuse element /types/ in order to describe the /relation/
> of an element to its parent element.
>
> This /is/ an abuse, because the designation "element type"
> obviously is supposed to give the /type of an element/,
> i.e., a property which is intrinsic to the element alone
> and has nothing to do with its relation to other elements.
Nearly. But you are trying to force XML into a very narrow,
computer-science style mould of logic, which it was never intended for.
> The document type designers, however, are being forced to
> commit this abuse, to reinvent poorly the missing structured
> attribute values using the means of XML. If a rose has two
> owners, the following element is not allowed in XML:
>
> <rose owner="Jack" owner="Jill" />
Again, not true. <rose owner="Jack Jill Stefan"/> is the normal solution
to multiple parallel values, where owner is declared as IDREFS or ENTITIES.
> One is made to use representations such as the following:
>
> <rose>
> <owner>Jack</owner>
> <owner>Jill</owner>
> </rose>
This would be suboptimal for this case, where the owners are presumed to
be uniquely occurring individuals. But it would be possible.
> Here the notion "element type" suggests that it is marked
> that Jack is "an owner", in the sense that "owner" is
> supposed to be the type (the kind) of Jack. Not an
> "owner of ..." (which would make sense), but just "an owner".
The normal solution would be something like
...
<owners>
<owner id="Jack">Jack the Lad</owner>
<owner id="Jill">Jill the Lass</owner>
...
</owners>
...
<plant type="rose" owners="Jack Jill">x</plant>
(with id as ID and owners as IDREFS). Certainly you could choose to
expand the declaration of <owner> to allow subelements to provide finer
detail (see many of the TEI declarations for examples).
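To make the "id as ID and owners as IDREFS" part concrete, here is a minimal sketch of the declarations. The root element name "garden" and the content models are assumptions made up for illustration; only the owner/plant markup comes from the example above.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch: "garden" and the content models are hypothetical -->
<!DOCTYPE garden [
  <!ELEMENT garden (owners, plant+)>
  <!ELEMENT owners (owner+)>
  <!ELEMENT owner  (#PCDATA)>
  <!-- each owner carries a document-unique identifier -->
  <!ATTLIST owner  id     ID     #REQUIRED>
  <!ELEMENT plant  (#PCDATA)>
  <!-- owners is a whitespace-separated list of references to IDs -->
  <!ATTLIST plant  type   CDATA  #REQUIRED
                   owners IDREFS #IMPLIED>
]>
<garden>
  <owners>
    <owner id="Jack">Jack the Lad</owner>
    <owner id="Jill">Jill the Lass</owner>
  </owners>
  <plant type="rose" owners="Jack Jill">x</plant>
</garden>
```

With these declarations a validating parser checks every token in owners against the declared IDs, so owners="Jack Stefan" would make the document invalid unless some element carries id="Stefan".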
> The intention of the author, however, is that "owner" is
> supposed to give the /relation/ to the containing element
> "rose". This is the natural field of application for
> attributes, as the meaning of the word "attribute" outside
> of XML clearly indicates, but it is not possible to
> always use attributes for this purpose in XML.
>
> An alternative solution might be the following notation.
>
> <rose owner="Alexander Marie" />
>
> Here a /new/ mini language (not XML anymore) is used within
> an attribute value, which, of course, cannot be checked
> by XML validators. This is actually done, for
> example, in XHTML, where classes are written this way.
I suggest you re-read the XML Spec for IDREFS and ENTITIES.
> So in its most prominent XML application XHTML, the W3C
> has to abandon XML even to write class attributes. This
> is not such a good accomplishment given that the W3C
> was able to use the experience made with SGML and HTML
> when designing XML.
That was done for exogenous political reasons, as I understand it, not
for technical ones.
> The needless restrictions of XML inhibit the meaningful
> use of syntax. This makes many document type designers
> wonder when attributes and when elements
> should be used, which is actually evidence of a
> failing in the design of XML: XML does not have many
> more notations than these two: attributes and elements.
> And now the W3C has failed to give even these two
> notations a clear and meaningful purpose!
No-one is pretending that XML is perfect, but you must understand that
it was designed for text documents, not for database engineering.
> Without the restrictions described, XML alone would have
> nearly the expressive power of RDF/XML, which has to repair
> painfully some of the errors made in the XML-design.
>
> Now, some "experts" recommend /always/ using subelements,
> because one can never know whether an attribute value
> that seems to be unstructured today might need to become
> structured tomorrow. Other "experts" recommend using
> attributes only when one is quite confident that they
> will never need to be structured. This recommendation
> does not even try to make sense of attributes,
> but just explains how to circumvent the obstacles
> the W3C has built into XML.
Please re-read the FAQ warning on this subject.
[snip]
///Peter