SGML vs XML

Kimbro Staken plug-discuss@lists.PLUG.phoenix.az.us
Thu, 31 May 2001 15:17:05 -0700


Trent Shipley wrote:
> 
> No.  This is essentially a content analysis problem.  The idea is to mark up
> *meaning* not structure or format.  In a sense the idea is to press XML |
> SGML into service as a semantic content markup language.  The problem with
> your break idea is that it:
> 
> 1: Makes it look like there are two sets of text that need to be tagged with
> content tag #1 when really there is only one element.
> 
> 2: It makes it look like the second section of #1 content is contained in a
> passage of type #2.  In reality passage #1 and passage #2 share text and
> content (sort of like using the same DLL).  If the underlying
> meaning-structure is misrepresented one can imagine that it could reduce the
> utility for linguistic analysis.  Methinks Larry Wall would *not* approve.

Well then you can't use XML; the spec requires elements to nest properly
and does not allow overlapping tags. If you don't use XML then you can't
use XSL. SGML can probably do what you want. You'll need to look
elsewhere for styling then, maybe DSSSL.
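
To see the restriction in practice, here is a minimal sketch using
Python's standard xml.etree.ElementTree. The element names <a> and <b>
are made up for illustration and are not from your markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; <a> and <b> overlap instead of nesting.
overlapping = "<doc><a>one <b>two</a> three</b></doc>"
# Same text with <b> split where <a> ends, so everything nests.
nested = "<doc><a>one <b>two</b></a><b> three</b></doc>"

def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(overlapping))  # False
print(is_well_formed(nested))       # True
```

The second string is the split approach: it parses, but the single <b>
passage now shows up as two elements, which is the drawback you describe.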

> 
> As for the desirability of non-nested elements, an inelegant hack would be
> to put them in comments.  The HRAF codes are used for statistics and
> extracting data.  The markup parser doesn't really need to see them, but if
> it did it would save having to develop a content markup language and parser.
> 
> Still the idea of a Semantic Content Markup Language is intriguing.  It
> would even have some immediate commercial application in market research
> and data retrieval services.  It could even be extended to help automate
> esoteric tasks like compiling HRAF files and indexes.
> 
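
If you do go the comment route, something along these lines might work.
This is only a rough sketch with Python's standard xml.dom.minidom, which
keeps comment nodes in the tree. The "code:NNN:start" / "code:NNN:end"
comment convention and the code numbers are assumptions invented for this
example, not real HRAF codes, and the sketch only looks at the direct
children of one element.

```python
from xml.dom import minidom

# Hypothetical convention: <!--code:NNN:start--> and <!--code:NNN:end-->
# bracket a coded passage. The code numbers are placeholders.
doc = minidom.parseString(
    "<text>"
    "<!--code:101:start-->alpha "
    "<!--code:202:start-->beta"
    "<!--code:101:end--> gamma"
    "<!--code:202:end-->"
    "</text>"
)

def coded_passages(root):
    """Collect the text covered by each code, allowing overlaps."""
    open_codes = set()   # codes whose start marker has been seen
    passages = {}        # code -> text accumulated so far
    for node in root.childNodes:
        if node.nodeType == node.COMMENT_NODE:
            _, code, edge = node.data.split(":")
            if edge == "start":
                open_codes.add(code)
                passages.setdefault(code, "")
            else:
                open_codes.discard(code)
        elif node.nodeType == node.TEXT_NODE:
            # Text belongs to every code that is currently open.
            for code in open_codes:
                passages[code] += node.data
    return passages

result = coded_passages(doc.documentElement)
print(result)
```

The document stays well-formed because the comments carry the overlap,
and the shared text ("beta" here) is reported under both codes.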

-- 
Kimbro Staken
The dbXML Project
http://www.dbxml.org/