From the time of the Oracle Bone Inscriptions in the Shang Dynasty, Chinese characters have been in use for about 3,400 years, but computers and the internet have come to play an important role only over the past two decades. Today, the Chinese characters available on computer systems are only the basic ones, and the knowledge about Chinese characters that these systems hold is seriously lacking, so urgent effort from computer scientists is still needed. Our research therefore aims to build a database of Chinese character knowledge that is compatible with computer systems and, further, to improve Chinese character processing techniques through this knowledge.
Generally speaking, a Chinese character is composed of three parts: glyph, sound, and meaning. For the glyph, we constructed the Chinese Glyph Structure Database and released the latest version, 2.65, in February 2011. This version collects 145,308 ancient and modern Chinese characters, including 89,641 Standard Script characters, 11,100 Small Seal Script characters, 22,729 Bronze Inscription characters, 19,138 Chu Bamboo Slip characters, and 2,700 Oracle Bone Inscription characters, together with their variants. In addition, 12,208 sets of variant character tables from The Chinese Dictionary《漢語大字典》are included. The four main features of this database are as follows: 1. Connecting ancient and modern characters to show how they evolved over time. 2. Collecting the variants of each epoch to show how Chinese characters are connected across different historical levels. 3. Recording the character structures of different epochs to demonstrate the shape-by-definition trait of Chinese characters. 4. Using glyph structures and style codes to solve the problem of unencoded Chinese characters. Moreover, on top of the Chinese Glyph Structure Database, we developed a system that searches for characters by their components, to help resolve the missing character problem.
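The idea of searching by components can be illustrated with a minimal sketch. It assumes a toy table that maps each character to its immediate components; the table, its contents, and the function names are hypothetical and do not reflect the database's actual schema or structure notation.

```python
# Hypothetical toy table: each character maps to its immediate components.
# This is an illustration only, not the Chinese Glyph Structure Database format.
TOY_STRUCTURES = {
    "明": ["日", "月"],
    "盟": ["明", "皿"],
    "萌": ["艹", "明"],
    "林": ["木", "木"],
    "森": ["木", "林"],
}


def all_components(char, table):
    """Return every component of char, expanding nested components recursively."""
    found = set()
    for part in table.get(char, []):
        found.add(part)
        found |= all_components(part, table)
    return found


def search_by_components(required, table):
    """Return the characters whose expanded components include all of `required`."""
    required = set(required)
    return sorted(c for c in table if required <= all_components(c, table))


print(search_by_components(["日", "月"], TOY_STRUCTURES))
print(search_by_components(["木"], TOY_STRUCTURES))
```

Because components are expanded recursively, a query for 日 and 月 also finds characters that contain 明 as a sub-component. A user who cannot type a character could in this way locate it from the parts they can type.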
For character sound, we are currently cooperating with the Department of Chinese Literature, National Taiwan University, to develop the Ancient and Modern Chinese Character Readings Database. More than 1,000,000 phonetic items have been collected in the database. For character meaning, we have started building a portal site for searching Chinese characters. Its major features are as follows: 1. Users can search for Chinese characters by radical, stroke count, sound, or component. 2. Hyperlinks to related websites provide more information about each character. 3. Techniques are offered for solving the missing character problem. 4. The portal helps promote online character book services for the public.
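The portal's multi-criteria search can be pictured with a small sketch. Everything here is an assumption made for illustration: the record fields, the four sample entries, and the use of modern Mandarin pinyin as the "sound" key are not taken from the portal itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CharRecord:
    """Hypothetical record layout; the portal's real fields are not specified."""
    char: str
    radical: str
    strokes: int
    reading: str  # modern Mandarin pinyin, used here as a stand-in for "sound"
    components: Tuple[str, ...]


TOY_RECORDS = [
    CharRecord("明", "日", 8, "míng", ("日", "月")),
    CharRecord("盟", "皿", 13, "méng", ("明", "皿")),
    CharRecord("林", "木", 8, "lín", ("木", "木")),
    CharRecord("森", "木", 12, "sēn", ("木", "林")),
]


def search(records, radical=None, strokes=None, reading=None, component=None):
    """Return characters matching every criterion that was supplied."""
    hits = []
    for rec in records:
        if radical is not None and rec.radical != radical:
            continue
        if strokes is not None and rec.strokes != strokes:
            continue
        if reading is not None and rec.reading != reading:
            continue
        if component is not None and component not in rec.components:
            continue
        hits.append(rec.char)
    return hits


print(search(TOY_RECORDS, radical="木"))
print(search(TOY_RECORDS, strokes=8))
print(search(TOY_RECORDS, component="明", reading="méng"))
```

Leaving a criterion as `None` means it is ignored, so the same function serves a single-field lookup and a combined query.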