SGML vs XML
Trent Shipley
plug-discuss@lists.PLUG.phoenix.az.us
Thu, 31 May 2001 14:17:57 -0700
No. This is essentially a content analysis problem. The idea is to mark up
*meaning*, not structure or format. In a sense the idea is to press XML |
SGML into service as a semantic content markup language. The problem with
your break idea is twofold:
1: It makes it look like there are two passages that need to be tagged with
content tag #1 when really there is only one element.
2: It makes it look like the second section of #1 content is contained in a
passage of type #2. In reality passage #1 and passage #2 share text and
content (sort of like using the same DLL). If the underlying
meaning-structure is misrepresented, one can imagine that it could reduce the
utility for linguistic analysis. Methinks Larry Wall would *not* approve.
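
To make both objections concrete, here is a rough Python sketch using only
the standard library's xml.etree.ElementTree. The <doc> wrapper element and
the helper names (is_well_formed, summarize) are my own inventions for
illustration, not part of any HRAF tooling. It checks that the overlapping
markup does not parse, then counts how many code="1" elements the split
version produces and what each one is nested in.

```python
import xml.etree.ElementTree as ET

# Overlapping markup from the original question, wrapped in a
# hypothetical <doc> root element.
OVERLAP = (
    '<doc><HRAF code="1">The Raboof blah. <HRAF code="2">'
    'Their women blah.</HRAF code="1"> Meanwhile the children blah.'
    '</HRAF code="2"></doc>'
)

# The "break" workaround: close code 1, reopen it inside code 2.
SPLIT = (
    '<doc>'
    '<HRAF code="1">The Raboof blah blah blah blah yadda yadda. </HRAF>'
    '<HRAF code="2">'
    '<HRAF code="1">Their women blah yadda blah blah.</HRAF>'
    ' Meanwhile the children blah.'
    '</HRAF>'
    '</doc>'
)

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def summarize(xml_text):
    """Return (count of code-1 elements, code of each one's parent)."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    ones = [e for e in root.iter("HRAF") if e.get("code") == "1"]
    return len(ones), [parent_of[e].get("code") for e in ones]

print(is_well_formed(OVERLAP))  # False
count, parents = summarize(SPLIT)
print(count)    # 2
print(parents)  # [None, '2']
```

The split version parses, but it reports two code="1" elements where there
is really one passage, and the second of them sits inside code="2".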
As for the desirability of non-nested elements, an inelegant hack would be
to put the HRAF codes in comments. The HRAF codes are used for statistics and
extracting data. The markup parser doesn't really need to see them, but if
it did it would save having to develop a content markup language and parser.
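
A minimal sketch of what the comment hack might look like, again in Python.
The marker syntax (<!--HRAF start N--> and <!--HRAF end N-->), the <doc>
wrapper, and the extract_spans helper are all assumptions made up for this
example. The XML parser skips the comments, so the document stays well
formed, while a separate regex pass recovers the text each code covers.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical marker convention: HRAF codes live in comments only.
DOC = (
    "<doc>"
    "<!--HRAF start 1-->The Raboof blah blah blah blah yadda yadda. "
    "<!--HRAF start 2-->Their women blah yadda blah blah."
    "<!--HRAF end 1--> Meanwhile the children blah."
    "<!--HRAF end 2-->"
    "</doc>"
)

MARKER = re.compile(r"<!--HRAF (start|end) (\d+)-->")
TAG = re.compile(r"<[^>]*>")  # crude tag stripper, fine for this toy input

def extract_spans(xml_text):
    """Map each HRAF code to the plain text between its start and end markers."""
    ET.fromstring(xml_text)  # raises ParseError if the document is not well formed
    plain = ""    # text seen so far, with tags and markers removed
    starts = {}   # code -> offset in `plain` where its span began
    spans = {}
    pos = 0
    for m in MARKER.finditer(xml_text):
        plain += TAG.sub("", xml_text[pos:m.start()])
        pos = m.end()
        kind, code = m.group(1), m.group(2)
        if kind == "start":
            starts[code] = len(plain)
        else:
            spans[code] = plain[starts.pop(code):]
    return spans

spans = extract_spans(DOC)
print(spans["1"])  # The Raboof blah blah blah blah yadda yadda. Their women blah yadda blah blah.
print(spans["2"])  # Their women blah yadda blah blah. Meanwhile the children blah.
```

The two spans share the "Their women" sentence without either one being
split or nested. The cost is exactly the one noted above: the codes are
invisible to the markup parser, so a second extraction pass is needed.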
Still the idea of a Semantic Content Markup Language is intriguing.
It would even have some immediate commercial application in market
research and data retrieval services. It could even be extended to help
automate esoteric tasks like compiling HRAF files and indexes.
> -----Original Message-----
> From: plug-discuss-admin@lists.PLUG.phoenix.az.us
> [mailto:plug-discuss-admin@lists.PLUG.phoenix.az.us]On Behalf Of Kimbro
> Staken
> Sent: Thursday, May 31, 2001 1:25 PM
> To: plug-discuss@lists.PLUG.phoenix.az.us
> Subject: Re: SGML vs XML
>
>
> Trent Shipley wrote:
> >
> > I have decided that my dissertation has outgrown WordPerfect and EndNote,
> > but especially WordPerfect. It looks like XML + XSL will work pretty well.
> > However, for future compatibility it would be nice to mark up the document
> > with Human Relations Area File codes (HRAF codes). The problem is that text
> > referenced by HRAF codes might not nest.
> >
> > Note that a single code should never overlap itself. It is as if there were
> > N codes and the document was scanned for the applicability of each code,
> > effectively resulting in N documents. However, some of those N codes will
> > not apply anywhere in the document. In effect some subset m of the N
> > documents consists of NULL documents. Then all the interesting N-m
> > documents are projected into a single document.
> >
> > The markup could look something like this:
> >
> > <HRAF code="1">The Raboof blah blah blah blah yadda yadda. <HRAF code="2">
> > Their women blah yadda blah blah.</HRAF code="1"> Meanwhile the children
> > blah.</HRAF code="2">
> >
>
> If you want to use XSL your documents have to be well-formed XML, which
> that definitely isn't. Can you do something like this?
>
> <HRAF code="1">The Raboof blah blah blah blah yadda yadda. </HRAF>
> <HRAF code="2">
> <HRAF code="1">Their women blah yadda blah blah.</HRAF> Meanwhile the
> children blah.</HRAF>
>
> > Can SGML handle non-nested markup tags?
> >
> > ----------------------------------------------
> >
> > Trent Shipley
> >
> > Work:
> > (602) 522-7502
> > mailto:tshipley@symbio-tech.com
> > http://www.symbio-tech.com
> >
> > ________________________________________________
> > See http://PLUG.phoenix.az.us/navigator-mail.shtml if your mail
> doesn't post to the list quickly and you use Netscape to write mail.
> >
> > PLUG-discuss mailing list - PLUG-discuss@lists.PLUG.phoenix.az.us
> > http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>
> --
> Kimbro Staken
> The dbXML Project
> http://www.dbxml.org/