comp.text.xml

Re: Tags v.s. Attributes

Subject: Re: Tags v.s. Attributes
From: Peter Flynn
Date: Sun, 13 Jul 2008 22:35:58 +0100
Newsgroups: comp.text.xml

Stefan Ram wrote:
> gaijinco <gaijinco@xxxxxxxxx> writes:
>> <name first="Carlos" second="" />
>> <name>
>>  <first>Carlos</first>
>>  <second></second>
>> </name>
> 
>   When a new document type is to be defined, when should one
>   choose child elements and when attributes?
> 
>   The criterion that makes sense regarding the meaning can not
>   be used in XML due to syntactic restrictions.

That is too broad. Often it can be used.

>   An element is describing something. A description is an
>   assertion. An assertion might contain unary predicates or
>   binary relations.
>
>   Comparing this structure of assertions with the structure
>   of XML, it seems to be natural to represent unary predicates
>   with types and binary relations with attributes.
> 
>   Say, "x" is a rose and belongs to Jack. This assertion can
>   be written in a more formal way to show the relations used:
> 
> rose( x ) ^ owner( x, Jack )
> 
>   This is written in XML as:
> 
> <rose owner="Jack" />

This is not true. It demonstrates very well a misunderstanding of text
markup that is unfortunately far too prevalent. Naming element types
after concrete objects is rare and almost always wrong. Possibly a DTD
for a horticulturalist might do this, but in normal text applications
you would write something like

<plant type="rose" owner="Jack">x</plant>

That is, "x" is an instance of a type of plant called a rose and this
one belongs to Jack.

>   Thus, my answer would be: use element types for unary
>   predicates and attributes for binary relations.
> 
>   Unfortunately, in XML, this is not always possible, because 
>   in XML:
> 
>     - there might be at most one type per element,
> 
>     - there might be at most one attribute value per attribute
>       name, and
> 
>     - attribute values are not allowed to be structured in
>       XML.
> 
>   Therefore, the designers of XML document types are forced to
>   abuse element /types/ in order to describe the /relation/ 
>   of an element to its parent element.
> 
>   This /is/ an abuse, because the designation "element type"
>   obviously is supposed to give the /type of an element/,
>   i.e., a property which is intrinsic to the element alone
>   and has nothing to do with its relation to other elements.

Nearly. But you are trying to force XML into a very narrow,
computer-science style mould of logic, which it was never intended for.

>   The document type designers, however, are being forced to
>   commit this abuse, to reinvent poorly the missing structured
>   attribute values using the means of XML. If a rose has two
>   owners, the following element is not allowed in XML:
>   
> <rose owner="Jack" owner="Jill" />

Again, not true. A start-tag cannot repeat an attribute name, but
<rose owner="Jack Jill Stefan"/> is the normal solution to multiple
parallel values, where owner is declared as IDREFS or ENTITIES.
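
As a minimal sketch of what such a declaration could look like (the
<garden> root and the <person> element are assumptions made up for this
illustration, not part of the original example), here is a complete
document with an internal DTD subset:

```xml
<?xml version="1.0"?>
<!DOCTYPE garden [
  <!-- Hypothetical wrapper element so the document has a single root -->
  <!ELEMENT garden (person+, rose+)>
  <!-- Hypothetical element carrying the ID that each owner token refers to -->
  <!ELEMENT person (#PCDATA)>
  <!ATTLIST person id ID #REQUIRED>
  <!-- owner holds a whitespace-separated list of references to IDs -->
  <!ELEMENT rose EMPTY>
  <!ATTLIST rose owner IDREFS #IMPLIED>
]>
<garden>
  <person id="Jack">Jack</person>
  <person id="Jill">Jill</person>
  <person id="Stefan">Stefan</person>
  <rose owner="Jack Jill Stefan"/>
</garden>
```

With owner declared this way, a validating parser should report an
error for any token in the list that does not match the id of some
element in the document.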

>   One is made to use representations such as the following:
> 
> <rose>
>   <owner>Jack</owner>
>   <owner>Jill</owner>
> </rose>

This would be suboptimal for this case, where the owners are presumed to
be uniquely occurring individuals. But it would be possible.

>   Here the notion "element type" suggests that it is marked 
>   that Jack is "an owner", in the sense that "owner" is 
>   supposed to be the type (the kind) of Jack. Not an
>   "owner of ..." (which would make sense), but just "an owner".

The normal solution would be something like

...
<owners>
  <owner id="Jack">Jack the Lad</owner>
  <owner id="Jill">Jill the Lass</owner>
  ...
</owners>
...
<plant type="rose" owners="Jack Jill">x</plant>

(with id as ID and owners as IDREFS). Certainly you could choose to
expand the declaration of <owner> to allow subelements to provide finer
detail (see many of the TEI declarations for examples).
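
To illustrate that kind of expansion, here is a sketch under stated
assumptions: the <garden> root and the <name> and <address> subelements
are hypothetical names, chosen only to make the example complete enough
to validate, and the address text is invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE garden [
  <!-- Hypothetical root element holding the owner list and the plants -->
  <!ELEMENT garden (owners, plant+)>
  <!ELEMENT owners (owner+)>
  <!-- owner expanded from plain text to subelements (names are assumptions) -->
  <!ELEMENT owner (name, address?)>
  <!ATTLIST owner id ID #REQUIRED>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
  <!-- owners is a whitespace-separated list of references to owner ids -->
  <!ELEMENT plant (#PCDATA)>
  <!ATTLIST plant type   CDATA  #REQUIRED
                  owners IDREFS #IMPLIED>
]>
<garden>
  <owners>
    <owner id="Jack"><name>Jack the Lad</name></owner>
    <owner id="Jill"><name>Jill the Lass</name><address>Up the hill</address></owner>
  </owners>
  <plant type="rose" owners="Jack Jill">x</plant>
</garden>
```

The reference from plant to owner is unchanged by the expansion: only
the content model of <owner> grows, while the owners attribute still
points at the same IDs.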

>   The intention of the author, however, is that "owner" is
>   supposed to give the /relation/ to the containing element
>   "rose". This is the natural field of application for
>   attributes, as the meaning of the word "attribute" outside 
>   of XML clearly indicates, but it is not possible to 
>   always use attributes for this purpose in XML.
> 
>   An alternative solution might be the following notation.
> 
> <rose owner="Alexander Marie" />
> 
>   Here a /new/ mini language (not XML anymore) is used within 
>   an attribute value, which, of course, can not be checked 
>   anymore by XML validators. This is really done so, for 
>   example, in XHTML, where classes are written this way.

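I suggest you re-read the XML Spec for IDREFS and ENTITIES. A
whitespace-separated list of tokens is part of XML, and a validating
parser does check it when the attribute is declared with such a type.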

>   So in its most prominent XML application XHTML, the W3C 
>   has to abandon XML even to write class attributes. This 
>   is not such a good accomplishment given that the W3C 
>   was able to use the experience made with SGML and HTML 
>   when designing XML.

That was done for exogenous political reasons, as I understand it, not
for technical ones.

>   The needless restrictions of XML inhibit the meaningful 
>   use of syntax. This makes many document type designers 
>   wonder, when attributes and when elements  
>   should be used, which actually is an evidence of 
>   incapacity for the design of XML: XML does not have many 
>   more notations than these two: attributes and elements. 
>   And now the W3C failed to give even these two
>   notations a clear and meaningful dedication!

No-one is pretending that XML is perfect, but you must understand that
it was designed for text documents, not for database engineering.

>   Without the restrictions described, XML alone would have
>   nearly the expressive power of RDF/XML, which has to repair
>   painfully some of the errors made in the XML-design.
> 
>   Now, some "experts" recommend to /always/ use subelements, 
>   because one can never know, whether an attribute value 
>   that seems to be unstructured today might need to become 
>   structured tomorrow. Other "experts" recommend to use 
>   attributes only when one is quite confident that they 
>   never will need to be structured. This recommendation 
>   does not even try to make a sense out of attributes, 
>   but just explains how to circumvent the obstacles
>   the W3C has built into XML.

Please re-read the FAQ warning on this subject.

[snip]

///Peter
