Frank S. C. Tseng and Wen-Ping Lin
Department of Information Management
National Kaohsiung First University of Science and Technology
Kaohsiung, 811 Taiwan
E-mail: imfrank@ccms.nkfust.edu.tw
Document warehouses, unlike traditional document management systems, contain
extensive semantic information about documents, cross-document feature relations, and
document grouping or clustering, thus providing accurate and efficient access to
business intelligence information. Since documents are multi-dimensional in nature, we
argue that traditional indexing methods are not well suited to document warehousing.
In this paper, we propose an indexing structure, called the D-tree, which can facilitate
the construction of document cubes. We formally present the related definitions, the design
of the D-tree storage structure, and the associated algorithms. These are essential
for establishing an infrastructure that combines text processing methods with numeric
OLAP processing technologies. We hope that the proposed combination of data warehousing
and document warehousing will serve as an important kernel for knowledge management
and customer relationship management applications.
Received March 30, 2004; revised September 22 & December 22, 2004; accepted December 30, 2004.
Communicated by Suh-Ying Lee.
* A shorter version of this paper was presented at the 20th Workshop on Combinatorial Mathematics and Computation
Theory, Taiwan, 2003 ([42], in Chinese).
* This research was partially supported by the National Science Council, Taiwan, R.O.C., under Contract No.
NSC-91-2416-H-327-005.