Internationalization Status and Directions: IETF, JET, and ICANN John C Klensin October 2002 © 2002 John C Klensin.
Four topics today
• IETF Work and Status
• Opportunities, risks, and registry restrictions
• Registry restrictions for CJK strings: the JET work
• Another look at multilingual TLDs

Disclaimer
Unless specified as a committee recommendation, policy recommendations have not been discussed enough in the ICANN IDN committee to know whether there is consensus

IETF Work and Status
• Several separate components
• Unicode handling and encoding
  – Stringprep
  – Punycode
• DNS-specific
  – Nameprep
  – IDNA
• Approved for publication as Proposed Standards

Unicode Handling Protocols
• Tables for matching and filtering
  – “Stringprep”
• Encoding to ASCII-compatible (ACE) form
  – “Punycode”

DNS-specific Internationalization
• Nameprep
  – A profile of “stringprep” for DNS internationalization
• IDNA
  – “Internationalizing Domain Names in Applications”
  – Base protocol, containing “ToUnicode” and “ToASCII” operations

Opportunities, risks, and registry restrictions
• IETF work is part of the solution
  – The problem it solves may still not be clearly understood
  – How to put a name into the DNS, and query it, given that name is appropriate
• Still leaves many risks, problems, and issues
  – Not IETF’s job to solve policy problems
  – ICANN recommendation to prohibit non-language characters was not accepted by IETF
  – Solution to these problems lies with ICANN and Zone administrators
  – “No solution” may equal chaos or effective Internet fragmentation

Risks, problems, and issues
• Character-related issues
  – Confusion of names
  – Alternative characters
  – Reserved name issues
  – Non-language characters
  – Mixed scripts
  … (not new … most discussed in Melbourne)

Extending existing remedies
• UDRP not prepared for this
  – Confusion of character appearances is not grounds for revocation
• WHOIS committee was asked to look at internationalization
  – Not reflected in report

Since the protocols won’t provide protection, some alternatives
• Registry restrictions: Per-zone or global restrictions on what can be registered
• Script homogeneity restrictions?
• Letting the market sort it out

Registry restrictions for CJK strings: the JET work
• So far, most advanced work on registry restrictions and specific character handling is for CJK
• Recognized problems earlier and got started
• Good cooperative effort, focusing on special needs of Chinese characters

Other Languages and Scripts
• CJK has special problems
  – Language overlaying
  – Recent character reforms
  – Japanese and Korean are mixed-script
• But every language and script has traps and potential ambiguities
  – Even English and ASCII

What are the JET Guidelines about?
• Problems
  – Some Chinese characters are different in different areas – same words, different characters – but need to match
  – Matching rules cannot be applied simply on a per-character basis
  – Can’t “fix” Chinese and wreck Korean or Japanese
  – And Korean and Japanese have their own issues

JET Guideline Approach
• Registry restrictions on what can be registered: invalid forms not permitted
• Careful handling of “variant” characters:
  – If a string is registered, preferred form must be used.
  – Reservation “package” of preferred name and variants
  – Variants of string can be registered only by the same registrant (or not at all)
• Definitions of permitted characters, preferences, and variant tables are per-zone (typically per-country)
• Need not restrict to SLD registrations

Variants
• About characters:
  – Tables for each national use of language
  – E.g., not required to agree on one universal table for Chinese (important, e.g., some areas have not adopted Simplified forms)
• Variant labels
  – Generated by combining variants of all characters present
  – Given ABCD, with two variants for B (X and Y) and one for C (Z), six potential labels: ABCD, AXCD, AYCD, ABZD, AXZD, AYZD
  – Some may then be excluded

JET Guidelines and other languages/scripts
• Details will differ, principles of what to look for may be useful
• Principle of registration restrictions is the important one: ultimately may be the only tool we have
• Zones bear some responsibility for overall stability of Internet, integrity of references, etc.

Restrictions by TLD Type
• Language and script restrictions are plausible for ccTLDs
  – Any such restrictions start with “this language (or script) is more important than that one” decision.
  – Harder with each additional supported script
• A generic TLD cannot prefer one language or script
  – So may not be able to adopt and use effective registration restriction rules.
  – Which makes IDNs much more dangerous.

Another look at multilingual TLDs
• TLDs with names other than Roman-derived ISO 3166-1 codes
• Motivation is not clear
  – Use of national language in country?
  – A “free” extra domain (or more than one) for commercial exploitation?
  – ???
• Important to understand problem

Administrative hierarchy structure of DNS
• Very hard to accurately administer parallel structures.
• No “see also” construction
• TLDs are special – must be administratively heterogeneous
• These are not issues if the reason for a “multilingual TLD” is “free TLD with different administration”

Options and tradeoffs
• New TLDs anyway
  – But IDN Committee recommended normal approval process, not a free ride
  – The administrative problems happen
  – Allocation is a nasty problem
  – So are countries with multiple official languages
• Translation

The Translation Issue
• Presentation
  – Ultimately, users don’t care what is in the DNS
  – They care, greatly, about what they see and type
• Localization
  – For a limited namespace, users can see whatever the application-writer likes
    • Two-letter code in, user-preference out
    • (or national preference, or local language preference, or…)
  – Problem: users need to understand that there is an internal/global form
    • But IDNA is already going to require this

The Role of the DNS
• Is the DNS the right place to solve these problems?
  – Many restrictions and requirements for central administrative hierarchy
  – Poor search support capability when exact name is not known, but “exact” gets harder with IDN
• Seeing evolution from product-name.TLD to product.company.TLD or http://company.TLD/product
• There are alternatives and “search engines” are only one group of them.

The Role of a Domain Administration
• Responsibility to the overall Internet community and to users
• For ccTLDs, ICANN probably cannot compel and should not try, but can recommend
• Registries who cause (or permit) messes that damage others will ultimately be held responsible.
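
The variant-label enumeration from the Variants slide (ABCD, with variants X and Y for B and Z for C) and the reservation “package” from the JET Guideline Approach slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the names variant_labels and Zone, the one-directional variant table, and the registrant names are all hypothetical, real tables are defined per zone, and the sketch does not model preferred forms or the later exclusion of some generated labels.

```python
from itertools import product


def variant_labels(label, variants):
    """Enumerate every label obtained by substituting variants per character.

    `variants` maps a character to a list of its variant forms (hypothetical
    table format). Characters missing from the table have no variants.
    """
    # At each position the choices are the character itself plus its variants.
    choices = [[ch, *variants.get(ch, ())] for ch in label]
    return ["".join(combo) for combo in product(*choices)]


class Zone:
    """Toy registry: registering a label reserves its whole variant package."""

    def __init__(self, variants):
        self.variants = variants
        self.owner = {}  # label -> registrant holding or reserving it

    def register(self, label, registrant):
        package = variant_labels(label, self.variants)
        # Refuse if any label in the package belongs to a different registrant.
        if any(self.owner.get(name, registrant) != registrant for name in package):
            return False
        for name in package:
            self.owner[name] = registrant
        return True


# Hypothetical table matching the example on the Variants slide.
table = {"B": ["X", "Y"], "C": ["Z"]}
print(variant_labels("ABCD", table))
# ['ABCD', 'ABZD', 'AXCD', 'AXZD', 'AYCD', 'AYZD']

zone = Zone(table)
print(zone.register("ABCD", "alice"))  # True: package of six labels reserved
print(zone.register("AXZD", "bob"))    # False: AXZD is in alice's package
print(zone.register("AYCD", "alice"))  # True: same registrant may use a variant
```

The enumeration order differs from the order on the slide, but the six labels are the same. A zone that chose the “or not at all” option would simply refuse every variant registration instead of checking the registrant.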

IETF Specification of Name Validity
• Something of a myth
  – DNS Protocol does not require LDH (letters, digits, hyphen) – recommends as good/safe practice
  – Hostname rules were NIC document, not technical standards-track
  – DNS rules of late 80s and early 90s (including RFC 1591) were IANA documents, not IETF
• IETF provides “how to” register and look up, and systems/technical constraints
• Specific syntax and character constraints are a zone administration and IANA/ICANN issue.

Independent of ICANN
• Domain administrations who
    • Care about the Internet
    • Exist to serve users, registrants, and the Internet community
  – will develop and use registration restrictions that minimize the risk of confusion and mismatches (accidental or deliberate)
• No one said this would be easy but…
  – Internationalization is very important
  – So is stability and name integrity
  – This appears to be the price of having both

Some closing thoughts
• Are there localization solutions that are effective and that meet user needs?
• Localization does not require ICANN approval or involvement
• In looking at the DNS to solve a range of i18n issues, are we sure we are asking the right questions?
• The primary role of ICANN is to preserve DNS stability. I hope it can examine this area, and move decisively, before it is too late.
• “Too late” could be only a month or two from now.
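
The division of labor described under “IETF Specification of Name Validity” (the IETF supplies the encoding and lookup mechanics, the zone supplies the character restrictions) can be illustrated with a small sketch. It relies on Python’s standard-library "idna" codec, which implements IDNA as it was later published in RFC 3490. Everything else is an assumption made for illustration: the function names, the single-script rule, and in particular the script guess, which takes the first word of each character’s Unicode name as a crude stand-in for a real script lookup. None of this is a rule from the IETF documents or the JET guidelines.

```python
import re
import unicodedata

# LDH: letters, digits, hyphen; 1 to 63 characters, no leading or trailing hyphen.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def to_ascii(label: str) -> str:
    """IDNA ToASCII for one label (nameprep, then Punycode with the ACE prefix)."""
    return label.encode("idna").decode("ascii")


def to_unicode(ace_label: str) -> str:
    """IDNA ToUnicode for one label."""
    return ace_label.encode("ascii").decode("idna")


def is_ldh(label: str) -> bool:
    return LDH_LABEL.fullmatch(label) is not None


def scripts_used(label: str) -> set:
    """Crude script guess (an assumption): first word of each letter's Unicode name.

    Digits and hyphens are ignored because they are not letters.
    """
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in label if ch.isalpha()}


def zone_accepts(label: str, allowed_scripts: set) -> bool:
    """Hypothetical per-zone rule: at most one script, and it must be allowed."""
    try:
        ace = to_ascii(label)
    except UnicodeError:
        return False  # rejected by the IDNA processing itself
    scripts = scripts_used(label)
    return is_ldh(ace) and len(scripts) <= 1 and scripts <= allowed_scripts


ace = to_ascii("b\u00fccher")            # "bücher"
print(ace)                               # xn--bcher-kva
print(to_unicode(ace) == "b\u00fccher")  # True
print(is_ldh(ace), is_ldh("-bad-"))      # True False

mixed = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A, not Latin "a"
print(zone_accepts("paypal", {"LATIN"}))          # True
print(zone_accepts(mixed, {"LATIN", "CYRILLIC"})) # False: two scripts in one label
```

The mixed-script label passes ToASCII without complaint and yields a perfectly valid LDH string, which is the point made in the talk: the protocol will encode it, so only a restriction applied by the zone can keep it out.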

For further reading
• IETF Proposed Standards for IDN encoding
  – Final drafts:
    • draft-hoffman-stringprep-03.txt
    • draft-ietf-idn-nameprep-11.txt
    • draft-ietf-idn-punycode-03.txt
    • draft-ietf-idn-idna-14.txt
• JET Guidelines
  • Current draft
    – draft-jseng-idn-admin-01.txt
• Role of the DNS
  • draft-klensin-dns-role-04.txt (and others)
• Local translation
  • draft-klensin-idn-tld-00.txt
• Searching, not exact matching
  • draft-klensin-dns-search-04.txt (and others)

Internet Drafts available from
• http://www.ietf.org/internet-drafts/xxx
• (and elsewhere)