Juang, Der-Ming

Research Engineers

Title: Assistant Research Engineer (2004/7--)
Highest degree: M.S., The Institute of Computer Management, National Tsing Hua University (1986)
Tel: +886-2-2788-3799 ext. 2407
Fax: +886-2-2782-4814
Email: derming@iis.sinica.edu.tw
Web: http://www.iis.sinica.edu.tw/pages/derming

• Research Assistant, Institute of Information Science, Academia Sinica (1992/10--2004/6)
• Technician, Chung-shan Institute of Science and Technology, Armaments Bureau, M.N.D. (1986--1992)
• M.S., The Institute of Computer Management, National Tsing Hua University (1986)
• B.S., Information Science, National Chiao Tung University (1984)

Research Description

The existing Hanzi interchange codes are not good enough for daily applications and never seem to satisfy users' needs. The missing character problem is the clearest evidence of this insufficiency for people who use Chinese. To work around it, people usually pick a code point within the existing interchange code, in its user-defined area, and create the missing glyph there. This method displays the character properly on that computer, but it takes a lot of laborious work and cannot solve the problem completely.

To resolve the missing character problem, we think the most important thing is to give the computer the ability to handle the Hanzi set properly. This means representing the necessary philological knowledge in the computer, putting it to use, and building a processing system for characters, glyphs, fonts, typefaces, etc. For the past few years, therefore, one of our main works has been building the Chinese Glyph Structure Database. Some of its characteristics are explained below:

1. Connecting modern and ancient characters: Besides the Standard Script, the Chinese Glyph Structure Database also contains characters from Oracle Bone Inscriptions, Bronze Inscriptions, Chu Bamboo Slips and Small Seal Inscriptions. Because glyphs have changed so much between ancient and modern Hanzi, the relationship between the glyph and the meaning of a modern character has become opaque, or has been lost altogether. If you want to understand the structure and the meaning of a character, you must therefore go back to its ancient glyph. By showing the relationship between modern and ancient characters, the database lets readers approach ancient characters through modern ones, and also deepen their knowledge of modern characters by reading ancient ones.

2. Collecting variant forms: The Chinese Glyph Structure Database collects the variant tables of reference dictionaries, currently including 11,900 variant sets from Hanyu-da-zidian and 1,163 Chungwen (repeated variants) from Shuowenjiezi. In addition, 19,357 Chungwen from the Bronze Inscriptions Assembly and 19,250 glyphs from the Chu Bamboo Slips Character Assembly are to be added in the near future. Variants have long hampered the searching of Chinese documents, and we expect that building these variant tables will resolve the problem efficiently.

3. Using any component to search characters: A Hanzi consists of components. The Chinese Glyph Structure Database records Hanzi structures from different historical periods, and any component in those structures can be used to search for characters. Generally speaking, components are more convenient than radicals for searching. A dictionary files each character under exactly one radical, so you have to determine which radical that is before you can look the character up. A character may have several components, however, and every one of them can be used for searching.

4. Completely resolving the encoding problem of Hanzi: The missing character problem persists because the existing Hanzi interchange codes implicitly assume that Hanzi form a closed, finite set, just like an alphabet. This assumption ignores the fact that Hanzi are ideographic characters composed of a limited number of basic components. So far no one knows how many Hanzi there are; the set is open, it can hardly be collected completely, and missing characters keep turning up. An interchange code is mainly used to distinguish glyphs, and glyphs differ from one another in their structures. We therefore use the structure of a glyph directly as its encoding, which eliminates the missing character problem completely.
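The component search, variant tables and structure-based encoding described in items 2 to 4 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six sample characters, the names STRUCTURES, VARIANT_SETS and the helper functions are hypothetical, and the Unicode Ideographic Description Characters "⿰" (left-right) and "⿱" (top-bottom) merely stand in for whatever structure notation the Chinese Glyph Structure Database actually uses, which the text does not show.

```python
from collections import defaultdict

# Hypothetical toy data, not taken from the database.
# Each entry is (layout operator, part, part); "⿰" means left-right and
# "⿱" means top-bottom. The operators are a stand-in notation only.
STRUCTURES = {
    "明": ("⿰", "日", "月"),
    "盟": ("⿱", "明", "皿"),
    "林": ("⿰", "木", "木"),
    "森": ("⿱", "木", "林"),
    "峰": ("⿰", "山", "夆"),
    "峯": ("⿱", "山", "夆"),
}

# Hypothetical variant table: characters in one set are variants of each other.
VARIANT_SETS = [{"峰", "峯"}]


def all_components(char, structures):
    """Return every component of char, including nested ones."""
    found = set()
    if char in structures:
        for part in structures[char][1:]:
            found.add(part)
            found |= all_components(part, structures)
    return found


def build_component_index(structures):
    """Map each component to the set of characters that contain it."""
    index = defaultdict(set)
    for char in structures:
        for component in all_components(char, structures):
            index[component].add(char)
    return index


def search_by_components(index, *components):
    """Return the characters that contain all of the given components."""
    if not components:
        return set()
    matches = set(index.get(components[0], ()))
    for component in components[1:]:
        matches &= index.get(component, set())
    return matches


def expand_variants(char, variant_sets):
    """Return char together with its recorded variants, if any."""
    for group in variant_sets:
        if char in group:
            return set(group)
    return {char}


def structure_expression(char, structures):
    """Spell out a glyph's structure down to parts that have no entry."""
    if char not in structures:
        return char
    operator, *parts = structures[char]
    return operator + "".join(structure_expression(p, structures) for p in parts)


INDEX = build_component_index(STRUCTURES)
```

In this sketch, search_by_components(INDEX, "日") finds both 明 and 盟, because any component at any depth can serve as a search key, with no need to know the dictionary radical first. A query for 峯 can be widened with expand_variants to cover 峰 as well. The two variants share the same components but receive different structure expressions, so a glyph is identified by how it is built rather than by a slot in a fixed code table.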
Selected Publications

1. 謝清俊、莊德明、張翠玲、許婉蓉, 中文字形資料庫的設計與應用 [Design and Application of a Chinese Glyph Database], 6th National Conference on Chinese Philology, Taichung, April 1995.
2. 莊德明、謝清俊、林晰, 中央研究院古籍全文資料庫解決缺字問題的方法 [Methods for Solving the Missing Character Problem in the Academia Sinica Full-Text Database of Ancient Books], 2nd Cross-Strait Conference on the Collation and Study of Ancient Books, Beijing, May 1998.
3. 莊德明、許永成、謝清俊, 如何使用電腦處理古今文字的銜接─以小篆為例 [How to Use Computers to Connect Ancient and Modern Characters: The Case of Small Seal Script], 14th National Conference on Chinese Philology, Kaohsiung, March 2003.
4. 莊德明、謝清俊, 漢字構形資料庫的建置與應用 [Construction and Application of the Chinese Glyph Structure Database], International Conference on Hanzi and Globalization, Taipei, January 2005.
5. Der-Ming Juang, Jenq-Haur Wang, Chen-Yu Lai, Ching-Chun Hsieh, Lee-Feng Chien, and Jan-Ming Ho, Resolving the Unencoded Character Problem for Chinese Digital Libraries, Joint Conference on Digital Libraries 2005, Denver, Colorado, USA, June 7-11, 2005.