GROVE groove database

Trent Shipley plug-discuss@lists.PLUG.phoenix.az.us
Mon, 1 Oct 2001 00:44:50 -0700


============
============
Prolog:
=============

As I work on my dissertation I have:
1) citations in the text
2) a list of sources
3) a master bibliography

It would be ideal to drop the text and master bibliography into an engine 
that would produce a properly formatted list of sources.  This would require 
treating the master bibliography as a queryable knowledge base.

=========

The rest of this has nothing to do with that.

========

It seems evident that automatically building a traditional relational schema 
for a DTD is not very easy to do.  Instances of *most* document types are 
better represented as trees.  

It seems to follow that if you wanted to implement a *general* database 
that stores XML data automatically, its default behavior should be to treat 
a document type as the rough equivalent of a schema.  Its next default 
behavior would be to accession (store) each document of a given type as an 
item (row) in the default document heap (the equivalent of a table).

Queries might often take forms like this:

Return all documents where John Doe is <docAuthor>

or

Return all $synthetic-document-id, $handle(parent <data-element>) from  
%default-document-heap such-that the document (contains $leaf-content "xml")
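
To make those two query forms concrete, here is a rough sketch in Python 
using only the standard library.  Everything in it is my own assumption, not 
an existing product: the DocumentHeap class, its method names, and the 
shortcut of using the root element name as the "document type" (the standard 
library parser does not validate against a DTD).  The sample <report> markup 
is made up too.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class DocumentHeap:
    """Hypothetical store: one heap of parsed trees per document type."""

    def __init__(self):
        # document type (root element name) -> {synthetic doc id: root element}
        self.heaps = defaultdict(dict)
        self.next_id = 1

    def insert(self, xml_text):
        """Parse a serial XML string and file the tree under its root tag."""
        root = ET.fromstring(xml_text)
        doc_id = self.next_id
        self.next_id += 1
        self.heaps[root.tag][doc_id] = root
        return doc_id

    def by_author(self, doc_type, name):
        """Return ids of all documents where `name` is a <docAuthor>."""
        return [
            doc_id
            for doc_id, root in self.heaps[doc_type].items()
            if any((el.text or "").strip() == name
                   for el in root.iter("docAuthor"))
        ]

    def leaves_containing(self, doc_type, needle):
        """Return (doc id, parent tag) for each leaf whose text contains needle."""
        hits = []
        for doc_id, root in self.heaps[doc_type].items():
            # ElementTree keeps no parent pointers, so build a child -> parent map.
            parent_of = {child: parent
                         for parent in root.iter() for child in parent}
            for el in root.iter():
                if len(el) == 0 and needle in (el.text or ""):
                    parent = parent_of.get(el)
                    hits.append((doc_id,
                                 parent.tag if parent is not None else None))
        return hits


heap = DocumentHeap()
heap.insert("<report><docAuthor>John Doe</docAuthor>"
            "<data><item>notes on xml stores</item></data></report>")
heap.insert("<report><docAuthor>Jane Roe</docAuthor>"
            "<data><item>relational tables</item></data></report>")

print(heap.by_author("report", "John Doe"))
print(heap.leaves_containing("report", "xml"))
```

Note that the trees are kept in parsed form; nothing is re-parsed at query 
time.  A real engine would presumably index the trees rather than walk every 
document on every query.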

=================

Of course, if you are going to have to transform serial XML strings into 
trees to search them, then you might as well just store the document in tree 
form straight away.

And if you are going to search document trees you might as well go ahead and 
build HyTime GROVEs.  Having built the document GROVE you might as well go 
ahead and store that as the tree-form document representation.

================
================

Questions

=================
=================

Are there any generic XML databases like this?
    That is, I give it a DTD.
    Then I try to insert an instance of that document type and it
    automatically stores it.
    Later I can query the document base using set logic over node and
    string regular expressions.
    (Based on those queries I can probably also delete and update the
    document base.)
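
To show what I mean by "set logic over node and string regular expressions", 
here is a minimal Python sketch, again standard library only.  The 
DocumentBase class, its match() and delete() methods, and the <memo> sample 
documents are all hypothetical, and the sketch skips the DTD step entirely 
(no validation on insert).  Each match() returns a set of document ids, so 
the set logic is just Python's ordinary set operators.

```python
import re
import xml.etree.ElementTree as ET


class DocumentBase:
    """Hypothetical document base: parsed trees keyed by a synthetic id."""

    def __init__(self):
        self.docs = {}
        self.next_id = 1

    def insert(self, xml_text):
        """Parse and store one document; no DTD validation is attempted."""
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = ET.fromstring(xml_text)
        return doc_id

    def match(self, tag_pattern, text_pattern=".*"):
        """Ids of documents having a node whose tag fully matches
        tag_pattern and whose text contains a match for text_pattern."""
        tag_re = re.compile(tag_pattern)
        text_re = re.compile(text_pattern, re.DOTALL)
        return {
            doc_id
            for doc_id, root in self.docs.items()
            if any(tag_re.fullmatch(el.tag) and text_re.search(el.text or "")
                   for el in root.iter())
        }

    def delete(self, doc_ids):
        """Remove every document whose id is in doc_ids."""
        for doc_id in doc_ids:
            self.docs.pop(doc_id, None)


base = DocumentBase()
a = base.insert("<memo><author>John Doe</author><body>XML and groves</body></memo>")
b = base.insert("<memo><author>John Doe</author><body>relational tables</body></memo>")
c = base.insert("<memo><author>Jane Roe</author><body>XML storage</body></memo>")

by_doe = base.match("author", r"John\s+Doe")
about_xml = base.match("body", r"(?i)xml")

print(by_doe & about_xml)   # by Doe AND about XML
print(by_doe | about_xml)   # by Doe OR about XML
print(by_doe - about_xml)   # by Doe but NOT about XML

base.delete(by_doe - about_xml)   # delete driven by a query result
```

Update would follow the same pattern as delete: run a query, then rewrite the 
nodes in the matching trees.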

If there are such document bases, how (if at all) do they use GROVEs?

==================

What, exactly, is dbXML supposed to do?