compiler theory question
Trent Shipley
plug-discuss@lists.PLUG.phoenix.az.us
Wed, 2 May 2001 15:17:47 -0700
This question derives from a discussion about XML a couple weeks ago where
someone said that DTD were basically superfluous.
I think of a DTD as instructions to a flexible parser engine.
The parser engine reads the DTD.
It reads the XML document.
It produces a parse tree based on the DTD and XML inputs.
((
As a tangent this means that a parser is itself a sub-species of translator
or compiler. Its source code is the source document and its output object
code is the parse tree. (The parse tree then is the immediate source code
for what we usually think of as *the* compiler proper.)
Incidentally, the assertion that a DTD language is unnecessary amounts to
the assumption that there exists a default set of instructions for the
parser engine such that the resulting parse tree for any XML document will
be in some sense a superset of the parse tree produced for any valid DTD
that might be applied to the same document. In other words, there exists a
universal DTD such that the parse tree for any XML document parsed according
to the rules of any DTD that is not the universal DTD either produces the
same structure as that produced by the u-DTD or a graph that is a
foreshortened version of the u-DTD parse tree.
))
The parse tree is turned into a final object using XSLT.
The compiler (formatting) engine reads the XSLT document.
It reads the parse tree from the parsing engine.
It produces the formatted document.
----------------------------------------------
This raises the question of whether a universal translator for all
interesting languages can be built.
--
First Problem:
A universal compiler is defined as a machine that takes any well formed,
valid source code and produces semantically correct object code for a
specified machine.
Hypothesis: A universal compiler for all interesting languages can be
built.
pre_Hypotheses: A universal compiler will require a universal parser
instruction language.
A universal compiler will require a universal translator
instruction language.
-------------------------------------------
XML is a simpler and more formal subset of SGML. The XML folks insist that
XML is semantically equivalent to SGML.
Analogy leads to the second question.
--------------------------------------------
Second problem:
An universal language is defined as a finite language such that any language
whose well formed, valid utterances can be translated by the universal
compiler can be put in a one to one (isomorphic) correspondence with the
universal language with the result that any object code produced is
semantically indistinguishable from that produced using the source language
directly. In addition the universal language does not include the object
language or any orthogonal subset of the object language.
Hypothesis: If there exists a universal compiler then there also exists a
universal language.
further hypothesis: Both the universal parser instruction language and the
universal compiler instruction language can be expressed using the universal
language.
------------------------------------------
Third problem:
Hypothesis: There exists a finite body code for the universal parser
instruction language such that any object code produced will be semantically
indistinguishable from object code produced using any other set of parser
instructions.
--------------------------------------
Trent Shipley
Work:
(602) 522-7502
mailto:tshipley@symbio-tech.com
http://www.symbio-tech.com