Previous [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7] [ 8] [ 9] [ 10] [ 11] [ 12]


Journal of Information Science and Engineering, Vol. 28 No. 2, pp. 263-293 (March 2012)

Subtree Reconstruction, Query Node Intervals and Tree Pattern Query Evaluation*

Department of Applied Computer Science
University of Winnipeg
Winnipeg, Manitoba, Canada R3b 2E9

Since the extensible markup language XML emerged as a new standard for information representation and exchange on the Internet, the problem of storing, indexing, and querying XML documents has been among the major issues of database research. In this paper, we study the tree pattern matching and discuss a new algorithm for processing ordered tree pattern queries, by which not only ancestor/descendant relationships, but also left-to-right ordering of query nodes are considered. Such kind of tree matching has many applications in practice, such as the linguistic analysis, the video content-based retrieval, as well as the computational biology and the data mining. The time complexities of the new algorithm is bounded by O(|D|.|Q| + |T|. leafQ) and its space overhead is by O(leafT . leafQ), where T stands for a document tree, Q for a tree pattern and D is the largest data stream among all the data streams associated with the nodes in Q. Each data stream contains the database nodes that match the predicate at a node q. leafT (leafQ) represents the number of the leaf nodes of T (resp. Q). In addition, the algorithm can be adapted to an indexing environment with XB-trees being used. Experiments have been conducted, which shows that our algorithm is promising.

Keywords: XML documents, tree pattern queries, tree matching, tree encoding, XB-trees

Full Text () Retrieve PDF document (201203_03.pdf)

Received June 5, 2010; revised April 5, 2011; accepted August 1, 2011.
Communicated by Vincent S. Tseng.
* Results of this paper were partially presented at the 20th International Conference on Database and Expert Systems Applications, 2009.