Internet Draft                                   Authors: Xiao-Dong Lee
                                                          Li Ming Tseng
Jan 1, 2002                                               Ho Jan-Ming
Expires in six months                                     Xiang Deng
                                                          Kenny Huang
                                                          Erin Chen
                                                          GuoNian Sun

                Requirements of Chinese Domain Name

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

1. Premise to be emphasized

   All requirements in this memo concern Traditional and Simplified
   Chinese Domain Name equivalence matching and delimiter folding.
   What matters in this memo is therefore not the definition of a
   Chinese Domain Name, but that an Internationalized Domain Name
   SHOULD satisfy these requirements.

   Any Internationalized Domain Name that includes any character
   defined by [1] or Appendix A SHOULD satisfy these requirements,
   regardless of which other characters appear in its labels. That is,
   an IDN-aware application with IDNA support that is also CDN-aware
   should check whether the input domain name contains a character
   defined by [1] or Appendix A and, if it does, SHOULD satisfy the
   requirements defined in this memo.

2. Terminology

   The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
   and "MAY" in this document are to be interpreted as described in
   RFC 2119 [6].

   "TC" is an abbreviation for Traditional Chinese.

   "SC" is an abbreviation for Simplified Chinese.

   "CDN" is an acronym for Chinese Domain Name: an internationalized
   domain name that contains at least one Chinese character. For the
   scope of Chinese characters, refer to ISO/IEC 10646-1:2000(E)
   [second edition 2000-09-15] [7]: a character marked "C and
   G-Hanzi-T" MUST be considered a Chinese character. This definition
   does not imply that the character is not also a character of other
   countries that use Han ideographs.

   "Equivalent CDN" refers to CDNs that have at least one character
   from the SC-TC tables [1].

   "TC-only CDN" is a CDN in which all characters of all labels are TC
   characters.

   "SC-only CDN" is a CDN in which all characters of all labels are SC
   characters.

   "Mixed-use TC and SC CDN" is a CDN in which, taken across all labels
   of the domain name, at least one Traditional and at least one
   Simplified Chinese character appear.

3. Problems

   The existence of Traditional Chinese and Simplified Chinese is not
   itself a problem; it is a fact of life. If IDN does not deal with
   this fact, it is not a complete solution. There are four main
   problems associated with CDN:

   [1] TC and SC CDN equivalence matching

   SC is derived from TC, and Chinese people use both SC and TC.
   Chinese users therefore consider a TC CDN to be equivalent to its
   corresponding SC forms.

   [2] Mixed-use TC and SC CDN causes an exponential problem

   To ensure that a CDN is resolved correctly in both TC and SC forms,
   one could register all combinations of mixed equivalent TC and SC
   characters. However, the number of combinations grows exponentially
   with the length of a label. For example, a label of ten characters,
   each of which has both a TC and an SC form, has 2^10 = 1024
   combinations. An ordinary Chinese domain name may have dozens,
   hundreds, or even thousands of TC/SC records. It is unreasonable to
   expect users to register them all, and they are difficult for
   administrators to manage.

   [3] Registration and delegation of multiple equivalent CDN

   Without the support of a proper delegation and resolution
   architecture, a user who registers a Chinese domain name may have to
   obtain many forms of it and operate many domains.
   Lower-level delegated domain name servers may adopt an
   administrative policy that differs from the one adopted at the
   upper level, in which case consistency of TC/SC domain names cannot
   be ensured.

   [4] Multiple possible periods (e.g. U+3002, U+2022, U+FF0E)

   In Mainland China, a period other than the ASCII dot is in use. When
   a user enters a Chinese domain name and types the domain name
   delimiter, he or she will get such a period (for example, U+3002).
   With a Taiwan Chinese IME, the user might type, or copy and paste,
   U+2022 or U+FF0E as the delimiter.

4. Requirements

4.1 Requirements of Traditional and Simplified Chinese Domain Name

   [1] The Traditional/Simplified CDN solution MUST be consistent for
       all CDN users, including but not limited to end users and
       administrators.

   [2] The need to do multiple registrations and delegations for an
       equivalent CDN MUST be minimized. There MUST be only one
       registration for an equivalent CDN. The delegation(s) for an
       equivalent CDN MUST be consistent.

   [3] Equivalent CDN SHOULD be treated as equivalent in IDN
       comparison.

   [4] Applications that support CDN MAY display the equivalent CDN to
       users according to the following priority order: user
       preference first, then the default original form, and lastly
       ACE fallback.

   [5] An implementation of IDN that supports CDN MUST preserve the
       original form of the CDN.

   [6] IDN requirements MUST accommodate CDN user requirements.

4.2 Requirement of Delimiter Folding

   [1] U+3002, U+2022, and U+FF0E MUST be treated as domain name
       delimiters.

5. Wish List

   [1] We wish that every implementation would support CDN if and when
       there is an IDN standard.

   [2] We wish to see a quick conclusion to the CDN/IDN
       standardization process.

   [3] We wish software to have the capability to support both
       Traditional and Simplified CDN.

6. Authors

   Xiao-Dong Lee, lee@cnnic.net.cn, CNNIC
   Li Ming Tseng, tsenglm@cc.ncu.edu.tw, TWNIC
   Ho Jan-Ming, hoho@iis.sinica.edu.tw, TWNIC
   Xiang Deng, deng@cnnic.net.cn, CNNIC
   Kenny Huang, huangk@sinica.edu.tw, TWNIC
   Erin Chen, erin@twnic.net.tw, TWNIC
   GuoNian Sun, sun@cnnic.net.cn, CNNIC

7. Acknowledgement

   The original lists of problems, requirements, and wishes are
   derived from the consensus of the 7th JET meeting, held on
   Nov 19th, 2001 in Beijing. Thanks to all participants of the
   meeting. In addition, the following persons are highly appreciated:

   Shian-Shyong Tseng
   Wen-Sung Chen
   Wenhui Zhang
   Wei Mao
   Hualin Qian

8. References

   [1] A Complete Set of Simplified Chinese Characters, published in
       1986 by the Committee of National Language and Chinese
       Character of China.

   [2] Dictionary of Chinese Character Variants, compiled by the
       Mandarin Promotion Council of Taiwan. Version 2 was published
       in Aug 2001 on the Web site http://140.111.1.40/

   [3] Paul Hoffman, Marc Blanchet, "Stringprep Profile for
       Internationalized Host Names", September 27, 2001,
       draft-hoffman-stringprep-00.txt

   [4] Patrik Faltstrom, Paul Hoffman, "Internationalizing Host Names
       In Applications (IDNA)", July 20, 2001,
       draft-ietf-idn-idna-06.txt

   [5] The Unicode Consortium, "The Unicode Standard",
       http://www.unicode.org/unicode/standard/standard.html

   [6] Scott Bradner, "Key words for use in RFCs to Indicate
       Requirement Levels", March 1997, RFC 2119.

   [7] ISO/IEC 10646-1:2000(E). International Standard - Information
       technology -- Universal Multiple-Octet Coded Character Set
       (UCS)

Appendix A. Delimiter Set

   U+3002
   U+2022
   U+FF0E
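
   As an illustration of the delimiter folding requirement in Section
   4.2, the following Python sketch maps each delimiter in the set
   above to the ASCII full stop (U+002E) before a name is split into
   labels. It is a minimal sketch and not part of the requirements:
   the names CDN_DELIMITERS, fold_delimiters, and split_labels are
   hypothetical, and folding before any other IDN processing is an
   assumption made for illustration, not something this memo
   specifies.

```python
# Delimiter set taken from Appendix A of this memo.
# All names below are hypothetical and used only for illustration.
CDN_DELIMITERS = "\u3002\u2022\uff0e"

# Translation table mapping every delimiter to the ASCII full stop.
_FOLD_TABLE = str.maketrans({ch: "." for ch in CDN_DELIMITERS})


def fold_delimiters(name):
    """Return name with every Appendix A delimiter replaced by '.'."""
    return name.translate(_FOLD_TABLE)


def split_labels(name):
    """Fold the delimiters, then split the name into its labels."""
    return fold_delimiters(name).split(".")


if __name__ == "__main__":
    # A two-label name typed with U+3002 as the delimiter.
    print(split_labels("\u4e2d\u6587\u3002cn"))
```

   Folding to a single delimiter first means that the rest of the
   processing only ever has to recognize the ASCII full stop.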