Journal of Information Science and Engineering, Vol. 25 No. 2, pp. 603-617 (March 2009)

TwigX-Guide: An Efficient Twig Pattern Matching System Extending DataGuide Indexing and Region Encoding Labeling

Su-Cheng Haw and Chien-Sing Lee
Department of Electrical and Computer Engineering
Faculty of Information Technology
Multimedia University
63100 Cyberjaya, Malaysia
E-mail: {schaw, cslee}

With the rapid emergence of XML as an enabler for data exchange and data transfer over the Web, querying XML data has become a major concern. In this paper, we present a hybrid system, TwigX-Guide, which extends the well-known DataGuide index and region encoding labeling to support twig query processing. With TwigX-Guide, a complex query is decomposed into a set of path queries. Each path query is evaluated individually by retrieving the path or node matches from the DataGuide index table, and the results are subsequently joined using the holistic twig join algorithm TwigStack. TwigX-Guide improves the performance of TwigStack for queries with parent-child relationships and mixed relationships by reducing the number of joins needed to evaluate a query. Experimental results indicate that, in terms of execution time, TwigX-Guide processes twig queries on average 38% better than the TwigStack algorithm, 30% better than TwigINLAB, 10% better than TwigStackList, and about 5% better than TwigStackXB.
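
The decomposition outlined above can be illustrated with a minimal sketch. It is not the TwigX-Guide implementation: the names `Label`, `build_path_index`, `is_parent`, and `match_twig` are hypothetical, the path index is a plain dictionary standing in for a DataGuide index table, and the final join is a naive nested-loop check rather than TwigStack. It assumes region labels of the form (start, end, level), where a node is an ancestor of another if its interval strictly contains the other's, and a parent if the levels also differ by one.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, namedtuple

# Region encoding label (assumed form): start/end positions from a
# depth-first traversal, plus the depth of the node in the tree.
Label = namedtuple("Label", "start end level")


def build_path_index(root):
    """Map each distinct root-to-node label path to the region labels
    of the nodes reached by that path (a simplified DataGuide-like table)."""
    index = defaultdict(list)
    counter = 0

    def visit(elem, path, level):
        nonlocal counter
        counter += 1
        start = counter
        path = path + "/" + elem.tag
        for child in elem:
            visit(child, path, level + 1)
        counter += 1
        index[path].append(Label(start, counter, level))

    visit(root, "", 1)
    return index


def is_parent(p, c):
    """Parent-child test on region labels: containment plus adjacent levels."""
    return p.start < c.start and c.end < p.end and c.level == p.level + 1


def match_twig(index, branch_path, child_tags):
    """Evaluate a twig branch_path[tag1][tag2]... by looking up each path
    query in the index and joining on the branching node.
    Naive nested-loop join, used here only for illustration."""
    matches = []
    for b in index.get(branch_path, []):
        if all(
            any(is_parent(b, c) for c in index.get(branch_path + "/" + tag, []))
            for tag in child_tags
        ):
            matches.append(b)
    return matches


doc = ET.fromstring(
    "<lib><book><title/><author/></book><book><title/></book></lib>"
)
index = build_path_index(doc)

# Twig query /lib/book[title][author] decomposes into the path queries
# /lib/book/title and /lib/book/author, joined on /lib/book.
result = match_twig(index, "/lib/book", ["title", "author"])
print(result)  # [Label(start=2, end=7, level=2)]
```

In this sketch each path query is answered by a single dictionary lookup, so the parent-child steps along a path require no structural joins; only the branching node needs a join. This is the intuition behind the reduction in join count described above, under the stated simplifications.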

Keywords: XML query, indexing, labeling, DataGuide, region encoding, query optimization

Received May 31, 2007; revised December 3; accepted April 24, 2008.
Communicated by Jonathan Lee.