GROVE groove database
Trent Shipley
plug-discuss@lists.PLUG.phoenix.az.us
Mon, 1 Oct 2001 00:44:50 -0700
============
============
Prologue:
=============
As I work on my dissertation I have:
1) citations in the text
2) a list of sources
3) a master bibliography
It would be ideal to drop the text and master bibliography into an engine
that would produce a properly formatted list of sources. This would require
treating the master bibliography as a queryable knowledge base.
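
A minimal sketch of that idea in Python, under assumptions that are mine and
not part of any real tool: citations appear in the text as bracketed keys
such as [Doe1999], and the master bibliography is a plain mapping from key to
an already formatted entry. The entries below are placeholders, not real
references.

```python
import re

# Hypothetical master bibliography: citation key -> formatted entry.
# Keys and entries are placeholders, not real references.
MASTER_BIBLIOGRAPHY = {
    "Doe1999": "Doe, J. (1999). A placeholder title.",
    "Roe2000": "Roe, R. (2000). Another placeholder title.",
    "Poe2001": "Poe, P. (2001). A third placeholder title.",
}

# Assumed citation syntax: a key in square brackets, e.g. [Doe1999].
CITATION = re.compile(r"\[([A-Za-z]+\d{4})\]")


def list_of_sources(text, bibliography):
    """Return the bibliography entries cited in text, sorted by key."""
    cited = set(CITATION.findall(text))
    missing = cited - set(bibliography)
    if missing:
        raise KeyError(f"cited but not in bibliography: {sorted(missing)}")
    return [bibliography[key] for key in sorted(cited)]


text = "Trees suit documents [Roe2000]. Tables do not [Doe1999], [Roe2000]."
sources = list_of_sources(text, MASTER_BIBLIOGRAPHY)
for entry in sources:
    print(entry)
```

The query here is just a set lookup; a real engine would also have to handle
citation styles and formatting rules, which this sketch ignores.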
=========
The rest of this has nothing to do with that.
========
It seems evident that automatically building a traditional relational schema
for a DTD is not very easy to do. Instances of *most* document types are
better represented as trees.
It seems to follow that if you wanted to implement a *general* database that
would store XML data automatically, its first default behavior should be to
treat a document type as the rough equivalent of a schema. Its next
default behavior would be to accession (store) a document of a given type as an
item (row) in the document-default-heap (the equivalent of a table).
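
To make the two default behaviors concrete, here is a rough Python sketch.
The DocumentBase class and its methods are hypothetical names of my own. It
also assumes that the root element name stands in for the document type,
since the standard-library parser used here (xml.etree.ElementTree) does not
validate against a DTD.

```python
import xml.etree.ElementTree as ET
from itertools import count


class DocumentBase:
    """Toy store: one heap (list of rows) per registered document type."""

    def __init__(self):
        self.heaps = {}          # document type name -> list of (doc_id, tree)
        self._ids = count(1)     # source of synthetic document ids

    def register_doctype(self, name):
        """Treat a document type as the rough equivalent of a schema."""
        self.heaps.setdefault(name, [])

    def insert(self, xml_text):
        """Parse a document and store it as one row in its type's heap."""
        root = ET.fromstring(xml_text)
        if root.tag not in self.heaps:
            raise ValueError(f"unregistered document type: {root.tag}")
        doc_id = next(self._ids)
        self.heaps[root.tag].append((doc_id, root))
        return doc_id


base = DocumentBase()
base.register_doctype("memo")
first = base.insert("<memo><docAuthor>John Doe</docAuthor></memo>")
second = base.insert("<memo><docAuthor>Jane Roe</docAuthor></memo>")
```

Each heap plays the part of a table, and each parsed tree plays the part of a
row, with a synthetic id as its only key.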
Queries might often take forms like this:
Return all documents where John Doe is <docAuthor>
or
Return all $synthetic-document-id, $handle(parent <data-element>) from
%default-document-heap such-that the document (contains $leaf-content "xml")
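
Both queries can be sketched over parsed trees in Python. This is one
possible reading, not a defined query language: the heap is assumed to be a
mapping from synthetic document id to tree, "leaf" is taken to mean an
element with no child elements, and the "handle" of the parent is reduced to
its tag name. The sample documents are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical heap: synthetic document id -> parsed document tree.
heap = {
    1: ET.fromstring(
        "<doc><docAuthor>John Doe</docAuthor>"
        "<para>Notes on xml storage</para></doc>"
    ),
    2: ET.fromstring(
        "<doc><docAuthor>Jane Roe</docAuthor>"
        "<section><title>Groves and xml</title></section></doc>"
    ),
    3: ET.fromstring(
        "<doc><docAuthor>John Doe</docAuthor>"
        "<para>Relational schemas</para></doc>"
    ),
}


def docs_by_author(heap, author):
    """Ids of all documents where `author` is the text of a <docAuthor>."""
    return [
        doc_id
        for doc_id, root in heap.items()
        if any((el.text or "").strip() == author for el in root.iter("docAuthor"))
    ]


def parents_of_leaves_containing(heap, needle):
    """(doc id, parent tag) for every leaf element whose text contains needle."""
    hits = []
    for doc_id, root in heap.items():
        for parent in root.iter():
            for child in parent:
                is_leaf = len(child) == 0
                if is_leaf and needle in (child.text or ""):
                    hits.append((doc_id, parent.tag))
    return hits


print(docs_by_author(heap, "John Doe"))
print(parents_of_leaves_containing(heap, "xml"))
```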
=================
Of course, if you are going to have to transform serial XML strings into
trees to search them then you might as well just store the document in tree
form straight away.
And if you are going to search document trees, you might as well go ahead and
build HyTime GROVEs. Having built the document GROVE, you might as well go
ahead and store that as the tree-form document representation.
================
================
Questions
=================
=================
Are there any generic XML databases like this?
That is, I give it a DTD.
Then I try to insert an instance of that document type and it automatically
stores it.
Later I can query the document base using set logic over node and string
regular expressions.
(Based on those queries I can probably also delete from and update the
document base.)
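
What I mean by set logic over node and string regular expressions could look
roughly like the Python sketch below. The select function is a hypothetical
helper of my own: it matches one regular expression against element names and
another against element text, and returns a set of document ids that can then
be combined with ordinary set operators. The sample documents are made up.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document base: synthetic document id -> parsed tree.
docs = {
    1: ET.fromstring("<memo><author>John Doe</author><body>XML databases</body></memo>"),
    2: ET.fromstring("<memo><author>Jane Roe</author><body>HyTime groves</body></memo>"),
    3: ET.fromstring("<memo><author>John Doe</author><body>Relational schemas</body></memo>"),
}


def select(docs, tag_pattern, text_pattern):
    """Ids of documents with an element whose tag and text match the patterns."""
    tag_re = re.compile(tag_pattern)
    text_re = re.compile(text_pattern)
    return {
        doc_id
        for doc_id, root in docs.items()
        if any(
            tag_re.fullmatch(el.tag) and text_re.search(el.text or "")
            for el in root.iter()
        )
    }


by_doe = select(docs, r"author", r"^John Doe$")
about_xml = select(docs, r"body|title", r"(?i)xml")

both = by_doe & about_xml         # intersection
either = by_doe | about_xml       # union
doe_not_xml = by_doe - about_xml  # difference

# A delete driven by a query result: drop Doe's documents not about XML.
for doc_id in doe_not_xml:
    del docs[doc_id]
```

Updates would follow the same pattern: select a set of ids, then modify the
matching trees in place.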
If there are such document bases, how (if at all) do they use GROVEs?
==================
What, exactly, is dbXML supposed to do?