Trent Shipley wrote:
> The parser engine reads the DTD.
> It reads the XML document.
> It produces a parse tree based on the DTD and XML inputs.

The actual order is:

- The parser reads in the XML document.
- The parser sees a reference to a DTD and tries to resolve it from wherever it may be.
- As the parser parses into the Document Element, if there is a DTD, it uses the DTD rules to determine whether the nestings and attributes it's parsing are valid.
- It produces SAX events or a DOM tree based on the validly parsed XML.

In the case of a programming language, the language syntax drives the parsing of the stream. In the case of XML, the parsing of the document drives the validation process. So you can't really compare the two.

The idea behind XML is that it should be very simple to write a parser to read an XML document. An XML document can be well-formed but still not be valid (if it even requires validation).

The problem with DTDs is that they can change the canonical value of a Document, which means that they can potentially break systems to a great degree. For example, Netscape yanking the RSS 0.9 DTD broke content syndication for a lot of people. But this breaks things beyond simply validating the Documents. Even if I decided that the original DTD was gone, and that I would just not validate anymore, the parser might still have to resolve the DTD if my document had any entity references in it that the DTD defined. Worse, if I had been creating elements but leaving out attributes that had defaulted values in the DTD, I'd completely lose that data if the DTD were lost.

Another problem area is someone changing the default value of an attribute. Default column values in the relational database world can be changed without much fear of breaking a system, because the values are filled in as they are inserted into the system and are retained from then on.
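
To make the defaulted-attribute problem concrete, here is a minimal Python sketch using the standard library's expat bindings. The <item> element, its price and currency attributes, and the parse_attributes helper are all made up for illustration. Expat is a non-validating parser, but it still applies attribute defaults declared in the internal DTD subset, which is enough to show the effect.

```python
from xml.parsers import expat


def parse_attributes(xml_text, specified_only=False):
    """Return {element_name: attributes} as the parser reports them.

    Hypothetical helper for this illustration only.
    """
    seen = {}
    parser = expat.ParserCreate()
    # When set to 1, expat reports only attributes that are literally
    # present in the document and skips DTD-supplied defaults.
    parser.specified_attributes = 1 if specified_only else 0

    def start_element(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return seen


# The same <item/> element under three different DTD situations.
# The document never writes the currency attribute itself.
BODY = '<item price="10"/>'
WITH_USD = '<!DOCTYPE item [<!ATTLIST item currency CDATA "USD">]>' + BODY
WITH_EUR = '<!DOCTYPE item [<!ATTLIST item currency CDATA "EUR">]>' + BODY
NO_DTD = BODY

if __name__ == "__main__":
    print(parse_attributes(WITH_USD))  # currency reported as "USD"
    print(parse_attributes(WITH_EUR))  # same element, now "EUR"
    print(parse_attributes(NO_DTD))    # currency is gone entirely
    # What the document literally contains, ignoring DTD defaults:
    print(parse_attributes(WITH_USD, specified_only=True))
```

The element is byte-for-byte identical in all three cases; only the DTD differs, yet the application sees three different sets of data.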

In XML, to use a default value, you simply don't assign the attribute, and the parser will report its default value to you as if it had actually been in your document. So if you approached creating a document the way you would approach creating a relational database record, you'd be SOL when the value assumptions you made at creation time come back to you completely different the next time you retrieve the document.

Issues like this are why projects like Minimal XML have been started, and why XML Infosets and Canonical XML are so debated.

Schemas are definitely a better approach than DTDs, but they are far more complex and still have some of the same failings as DTDs (like defaulted values).

--
Tom Bradford --- The dbXML Project --- http://www.dbxml.org/
We store your XML data a hell of a lot better than /dev/null