Internationalization Status and Directions: IETF, JET, and ICANN John C Klensin October 2002 © 2002 John C Klensin.

Internationalization Status and
Directions: IETF, JET, and ICANN
John C Klensin
October 2002
© 2002 John C Klensin
Four topics today
• IETF Work and Status
• Opportunities, risks, and registry
restrictions
• Registry restrictions for CJK strings: the
JET work
• Another look at multilingual TLDs
Disclaimer
Unless specified as a committee
recommendation, policy
recommendations have not been
discussed enough in the ICANN IDN
committee to know whether there is
consensus
IETF Work and Status
• Several separate components
• Unicode handling and encoding
– Stringprep
– Punycode
• DNS-specific
– Nameprep
– IDNA
• Approved for publication as Proposed Standards
Unicode Handling Protocols
• Tables for matching and filtering
– “Stringprep”
• Encoding to ASCII-compatible (ACE) form
– “Punycode”
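A minimal sketch of the two building blocks on this slide, using only the Python standard library (the `stringprep` module and the built-in `punycode` codec). The sample label "bücher" is an illustrative assumption, not taken from the talk.

```python
import stringprep

# Stringprep-style table lookups (tables come from the stringprep specification).
soft_hyphen_dropped = stringprep.in_table_b1("\u00ad")  # "mapped to nothing" table
folded = stringprep.map_table_b2("A")                   # case-folding map for matching

# Punycode: Unicode label -> ASCII-compatible bytes, and back again.
label = "bücher"                    # illustrative label, an assumption
ace = label.encode("punycode")      # b'bcher-kva'
restored = ace.decode("punycode")   # round-trips to the original label

print(soft_hyphen_dropped, folded, ace, restored)
```

Note that Punycode alone does no filtering or matching; that is the job of the stringprep tables applied beforehand.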
DNS-specific Internationalization
• Nameprep
– A profile of “stringprep” for DNS
internationalization
• IDNA
– “Internationalizing Domain Names in
Applications”
– Base protocol, containing “ToUnicode” and
“ToASCII” operations
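The Nameprep, ToASCII, and ToUnicode operations named above can be tried out with Python's `encodings.idna` module, which implements the original IDNA specification. This is only a sketch: the label "Bücher" and the domain "Bücher.example" are illustrative assumptions.

```python
from encodings import idna

label = "Bücher"                              # illustrative label, an assumption

prepped = idna.nameprep(label)                # Nameprep: case-fold and normalize
ascii_label = idna.ToASCII(label)             # Nameprep + Punycode + "xn--" prefix
unicode_label = idna.ToUnicode(ascii_label)   # back to the prepared Unicode form

# The "idna" codec applies ToASCII to each label of a full domain name.
full_name = "Bücher.example".encode("idna")

print(prepped, ascii_label, unicode_label, full_name)
```

The round trip returns the Nameprep'd form ("bücher"), not the string the user typed, which is why applications have to treat the ASCII form as the internal one.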
Opportunities, risks, and registry
restrictions
• IETF work is part of the solution
– The problem it solves may still not be clearly understood
– How to put a name into the DNS, and query it, given that the name is
appropriate
• Still leaves many risks, problems, and issues
– Not IETF’s job to solve policy problems
– ICANN recommendation to prohibit non-language characters was
not accepted by IETF
– Solution to these problems lies with ICANN and Zone
administrators
– “No solution” may equal chaos or effective Internet fragmentation
Risks, problems, and issues
• Character-related issues
– Confusion of names
– Alternative characters
– Reserved name issues
– Non-language characters
– Mixed scripts
– … (not new … most discussed in Melbourne)
Extending existing remedies
• UDRP not prepared for this
– Confusion of character appearances is not grounds for
revocation
• WHOIS committee was asked to look at
internationalization
– Not reflected in report
Since the protocols won’t provide
protection, some alternatives
• Registry restrictions: Per-zone or global
restrictions on what can be registered
• Script homogeneity restrictions?
• Letting the market sort it out
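One way a registry might approximate a script homogeneity restriction is sketched below. It is a rough heuristic, not a real policy: it guesses a character's script from the first word of its Unicode character name, and the helper names `scripts_in` and `is_single_script` are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Rough script guess: first word of each character's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # digits and hyphen are shared across scripts
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_single_script(label):
    """True when all letters of the label appear to come from one script."""
    return len(scripts_in(label)) <= 1

print(is_single_script("paypal"))        # all Latin letters
print(is_single_script("p\u0430ypal"))   # second letter is Cyrillic U+0430
print(scripts_in("日本の"))               # ideographs plus hiragana
```

The last call shows why a strict rule is too blunt: ordinary Japanese labels mix ideographs and kana, so a usable restriction has to be defined per zone and per language.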
Registry restrictions for CJK
strings: the JET work
• So far, most advanced work on registry
restrictions and specific character handling
is for CJK
• Recognized problems earlier and got started
• Good cooperative effort, focusing on special
needs of Chinese characters
Other Languages and Scripts
• CJK has special problems
– Language overlaying
– Recent character reforms
– Japanese and Korean are mixed-script
• But every language and script has traps and
potential ambiguities
– Even English and ASCII
What are the JET Guidelines
about?
• Problems
– Some Chinese characters are different in
different areas – same words, different
characters – but need to match
– Matching rules cannot be applied simply on a
per-character basis
– Can’t “fix” Chinese and wreck Korean or
Japanese
– And Korean and Japanese have their own issues
JET Guideline Approach
• Registry restrictions on what can be registered:
invalid forms not permitted
• Careful handling of “variant” characters:
– If a string is registered, preferred form must be used.
– Reservation “package” of preferred name and variants
– Variants of string can be registered only by the same
registrant (or not at all)
• Definitions of permitted characters, preferences,
and variant tables are per-zone (typically per-country)
• Need not restrict to SLD registrations
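The reservation "package" idea can be sketched as a small in-memory zone. This is a simplified illustration, not the JET draft's procedure: the `Zone` class, its method, and the registrant names are hypothetical, and the sketch does not model permitted-character tables or the rule that the preferred form must be used.

```python
class Zone:
    """Toy registry: a label and its variants belong to one registrant."""

    def __init__(self):
        self.owner_of = {}  # label -> registrant, for preferred names and variants

    def register(self, registrant, preferred, variants=()):
        package = {preferred, *variants}
        # Refuse if any label in the package is already held by someone else.
        for label in package:
            holder = self.owner_of.get(label)
            if holder is not None and holder != registrant:
                return False
        # Reserve the whole package for this registrant.
        for label in package:
            self.owner_of[label] = registrant
        return True

zone = Zone()
first = zone.register("alice", "ABCD", ["AXCD", "AYCD"])
blocked = zone.register("bob", "AXCD")      # variant reserved for alice
allowed = zone.register("alice", "AXCD")    # same registrant may use it
print(first, blocked, allowed)
```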
Variants
• About characters:
– Tables for each national use of language
– E.g., not required to agree on one universal table for
Chinese (important, e.g., some areas have not adopted
Simplified forms)
• Variant labels
– Generated by combining variants of all characters
present
– Given ABCD, with two variants for B (X and Y) and
one for C (Z), there are six potential labels:
ABCD, AXCD, AYCD, ABZD, AXZD, AYZD
– Some may then be excluded
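The ABCD example above is a Cartesian product over the choices available at each character position. A minimal sketch, assuming a per-zone variant table represented as a plain dictionary (the function name `variant_labels` is hypothetical):

```python
from itertools import product

def variant_labels(label, variants):
    """All labels formed by choosing the original or a variant per character."""
    choices = [[ch] + variants.get(ch, []) for ch in label]
    return ["".join(combo) for combo in product(*choices)]

# Variant table from the slide: B has variants X and Y, C has variant Z.
labels = variant_labels("ABCD", {"B": ["X", "Y"], "C": ["Z"]})
print(labels)
```

The count is the product of the per-position choices (1 × 3 × 2 × 1 = 6 here), so labels with many variant-bearing characters grow quickly, which is one reason some combinations may need to be excluded.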
JET Guidelines and other
languages/scripts
• Details will differ, principles of what to
look for may be useful
• Principle of registration restrictions is the
important one: ultimately may be the only
tool we have
• Zones bear some responsibility for overall
stability of Internet, integrity of references,
etc.
Restrictions by TLD Type
• Language and script restrictions are plausible for
ccTLDs
– Any such restrictions start with “this language (or
script) is more important than that one” decision.
– Harder with each additional supported script
• A generic TLD cannot prefer one language or
script
– So may not be able to adopt and use effective
registration restriction rules.
– Which makes IDNs much more dangerous.
Another look at multilingual
TLDs
• TLDs with names other than Roman-derived ISO 3166-1 codes
• Motivation is not clear
– Use of national language in country?
– A “free” extra domain (or more than one) for
commercial exploitation?
– ???
• Important to understand problem
Administrative hierarchy
structure of DNS
• Very hard to accurately administer parallel
structures.
• No “see also” construction
• TLDs are special – must be administratively
heterogeneous
• These are not issues if the reason for a
“multilingual TLD” is “free TLD with
different administration”
Options and tradeoffs
• New TLDs anyway
– But IDN Committee recommended normal
approval process, not a free ride
– The administrative problems happen
– Allocation is a nasty problem
– So are countries with multiple official
languages
• Translation
The Translation Issue
• Presentation
– Ultimately, users don’t care what is in the DNS
– They care, greatly, about what they see and type
• Localization
– For a limited namespace, users can see whatever the
application-writer likes
• Two-letter code in, user-preference out
• (or national preference, or local language preference, or…)
– Problem: users need to understand that there is an
internal/global form
• But IDNA is already going to require this
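The "two-letter code in, user-preference out" idea can be sketched as a display-time table lookup in the application, leaving the DNS name untouched. Everything here is an assumption for illustration: the `DISPLAY_NAMES` table, its entries, and the function names are hypothetical and do not describe any registered TLD.

```python
# Hypothetical per-application table: global TLD code -> display form by language.
DISPLAY_NAMES = {
    "jp": {"ja": "日本"},
    "de": {"de": "deutschland"},
}

def localize(domain, lang):
    """Show a domain with the TLD rendered in the user's preferred language."""
    *rest, tld = domain.split(".")
    shown = DISPLAY_NAMES.get(tld.lower(), {}).get(lang, tld)
    return ".".join(rest + [shown])

def globalize(domain, lang):
    """Map what the user typed back to the internal/global form."""
    *rest, shown = domain.split(".")
    for code, names in DISPLAY_NAMES.items():
        if names.get(lang) == shown:
            return ".".join(rest + [code])
    return domain  # already in global form, or unknown

print(localize("example.jp", "ja"))
print(localize("example.jp", "fr"))   # no entry: falls back to the code
print(globalize("example.日本", "ja"))
```

Because the table is small and fixed, this needs no change to the DNS; the cost is that users must learn that a separate global form exists.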
The Role of the DNS
• Is the DNS the right place to solve these
problems?
– Many restrictions and requirements for central
administrative hierarchy
– Poor search support capability when exact name is not
known, but “exact” gets harder with IDN
• Seeing evolution from product-name.TLD to
product.company.TLD or
http://company.TLD/product
• There are alternatives and “search engines” are
only one group of them.
The Role of a Domain
Administration
• Responsibility to the overall Internet
community and to users
• For ccTLDs, ICANN probably cannot
compel and should not try, but can
recommend
• Registries who cause (or permit) messes
that damage others will ultimately be held
responsible.
IETF Specification of Name Validity
• Something of a myth
– DNS Protocol does not require LDH – recommends as
good/safe practice
– Hostname rules were NIC document, not technical
standards-track
– DNS rules of late 80s and early 90s (including RFC
1591) were IANA documents, not IETF
• IETF provides “how to” register and look up, and
systems/technical constraints
• Specific syntax and character constraints are a
zone administration and IANA/ICANN issue.
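Since the DNS protocol itself does not enforce LDH (letters, digits, hyphen), a zone that wants that rule has to check it at registration time. A minimal sketch of such a check, assuming the conventional form of the rule (1 to 63 characters, no leading or trailing hyphen); the function name `is_ldh_label` is hypothetical:

```python
import re

# Letters, digits, hyphen; must start and end with a letter or digit; max 63 chars.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_ldh_label(label):
    """True when the label fits the conventional LDH hostname rule."""
    return LDH_LABEL.fullmatch(label) is not None

print(is_ldh_label("example"))
print(is_ldh_label("-bad"))           # leading hyphen
print(is_ldh_label("bücher"))         # non-ASCII letter
print(is_ldh_label("xn--bcher-kva"))  # ACE form of the same label passes
```

The last two calls illustrate the division of labor: the encoded form satisfies the old syntax rule, so any restriction on the underlying characters has to come from zone policy.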
Independent of ICANN
• Domain administrations who
• Care about the Internet
• Exist to serve users, registrants, and the Internet community
– will develop and use registration restrictions that
minimize the risk of confusion and mismatches
(accidental or deliberate)
• No one said this would be easy but…
– Internationalization is very important
– So is stability and name integrity
– This appears to be the price of having both
Some closing thoughts
• Are there localization solutions that are effective and that
meet user needs?
• Localization does not require ICANN approval or
involvement
• In looking at the DNS to solve a range of i18n issues, are
we sure we are asking the right questions?
• The primary role of ICANN is to preserve DNS stability. I
hope it can examine this area, and move decisively, before
it is too late.
• “Too late” could be only a month or two from now.
For further reading
• IETF Proposed Standards for IDN encoding
– Final drafts:
• draft-hoffman-stringprep-03.txt
• draft-ietf-idn-nameprep-11.txt
• draft-ietf-idn-punycode-03.txt
• draft-ietf-idn-idna-14.txt
• JET Guidelines
• Current draft
– draft-jseng-idn-admin-01.txt
• Role of the DNS
• draft-klensin-dns-role-04.txt (and others)
• Local translation
• draft-klensin-idn-tld-00.txt
• Searching, not exact matching
• draft-klensin-dns-search-04.txt (and others)
Internet Drafts available from
• http://www.ietf.org/internet-drafts/xxx
• (and elsewhere)