Journal of Information Science and Engineering, Vol. 22 No. 4, pp. 819-841 (July 2006)

D-Tree: A Multi-Dimensional Indexing Structure for Constructing Document Warehouses*

Frank S. C. Tseng and Wen-Ping Lin
Department of Information Management
National Kaohsiung First University of Science and Technology
Kaohsiung, 811 Taiwan

Document warehouses, unlike traditional document management systems, contain extensive semantic information about documents, cross-document feature relations, and document grouping or clustering, thus providing accurate and efficient access to business intelligence information. Since documents are multi-dimensional in nature, we claim that traditional indexing methods are not well suited to document warehousing. In this paper, we propose an indexing structure, called the D-tree, which facilitates the construction of document cubes. We formally present the related definitions, the design of the D-tree's storage structure, and the associated algorithms. These components are essential for establishing an infrastructure that combines text processing methods with numeric OLAP processing technologies. We hope that the proposed combination of data warehousing and document warehousing will become an important kernel for knowledge management and customer relationship management applications.

Keywords: data warehousing, document warehousing, knowledge management, multidimensional access method, OLAP

Received March 30, 2004; revised September 22 & December 22, 2004; accepted December 30, 2004.
Communicated by Suh-Ying Lee.
* A shorter version of this paper was presented at the 20th Workshop on Combinatorial Mathematics and Computation Theory, TAIWAN, 2003. ([42] in Chinese)
* This research was partially supported by the National Science Council, Taiwan, R.O.C., under Contract No. NSC-91-2416-H-327-005.