Panel on Open Source Approaches to Unicode Enablement

Open-Source Approaches to Unicode Enablement
Panel Discussion
C14, C15: Panel on Open-Source Approaches to Unicode Enablement
Agenda

Panel Introductions
Library Descriptions and Demos
What is Open Source?
What is the Open Source experience?
Q and A
16th International Unicode Conference
Amsterdam, the Netherlands, March 2000
Today’s Panel

Arnt Gulbrandsen
Bob Verbrugge
Frank Tang
Helena Shih
Mark Leisher
Steven Loomis
Steven Watt
Tex Texin
Yves Arrouye

Library Descriptions and Demos

Troll: Qt Free Edition
CRL: Assorted Unicode Support
Mozilla: International Library of Mozilla
IBM: International Components for Unicode
Troll’s Qt Free Edition
Arnt Gulbrandsen
Troll Tech

CRL’s Unicode Support
Mark Leisher
Computing Research Laboratory
New Mexico State University

CRL’s Unicode Support

Goal: Provide example resources usable on Unix.
– Fonts.
– Encoding mapping tables.
– Unicode character information.
– Algorithms.
– Other resources.
– Resource availability.
CRL’s Unicode Support

Fonts.
Three bitmap fonts in BDF format were developed and made
available:
– Arabic
– Devanagari
– ClearlyU
CRL’s Unicode Support

Encoding mapping tables.
The Unicode Consortium provides mapping tables
for converting many of the more common character
sets to Unicode. The CSets archive provides
supplementary mapping tables for character sets
and encodings that are not supplied by the Unicode
Consortium.
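As a rough illustration of how such a mapping table might be consumed, the sketch below parses lines in the three-column layout used by the Unicode Consortium's published tables (a hex source code, a hex Unicode code point, and a '#' comment) and uses the result to decode single-byte text. The function names parse_mapping_table and decode_bytes and the sample entries are assumptions made for this example; they are not taken from the CSets archive.

```python
def parse_mapping_table(text):
    """Parse lines like '0xA1<TAB>0x0104<TAB># NAME' into {source code: code point}."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            # A source code listed without a Unicode mapping (undefined); skip it.
            continue
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def decode_bytes(data, table):
    """Decode single-byte data with the table; unmapped bytes become U+FFFD."""
    return "".join(chr(table.get(b, 0xFFFD)) for b in data)


# A few illustrative entries in the style of an ISO 8859-2 table (not a complete table).
SAMPLE = """\
# Sample excerpt
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xA1\t0x0104\t# LATIN CAPITAL LETTER A WITH OGONEK
0xB1\t0x0105\t# LATIN SMALL LETTER A WITH OGONEK
"""

table = parse_mapping_table(SAMPLE)
print(decode_bytes(b"A\xa1\xb1", table))
```

A dictionary keeps the sketch short; a real converter for multi-byte encodings would need a different lookup structure.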
CRL’s Unicode Support

Unicode character information.
To facilitate development of Unicode-capable
software, a simple API and library for character
information and partial bi-directional reordering was
developed early on, before standardization efforts
really gained momentum. It consists of the UCData
package and the Pretty Good Bidi Algorithm.
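To give a feel for the kind of per-character information such a library exposes, here is a minimal sketch using Python's standard unicodedata module. It is not the UCData API; the helper names describe and has_rtl are hypothetical and exist only for this example.

```python
import unicodedata


def describe(ch):
    """Collect a few basic Unicode properties for a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),          # e.g. 'Lu', 'Mn'
        "bidi_class": unicodedata.bidirectional(ch),   # e.g. 'L', 'R', 'AL'
        "combining_class": unicodedata.combining(ch),  # 0 for non-combining characters
    }


def has_rtl(text):
    """True if any character has a strong right-to-left bidi class (R or AL)."""
    return any(unicodedata.bidirectional(c) in ("R", "AL") for c in text)


for ch in ("A", "\u05d0", "\u0627", "\u0301"):
    print(describe(ch))
```

The bidi class is only the input to reordering: a check like has_rtl can tell a program whether reordering is needed at all, but it does not perform it.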
CRL’s Unicode Support

Algorithms.
To further encourage independent development of
Unicode-capable software, a few basic text search
algorithms were converted to use Unicode text. These include:
– A Boyer-Moore string search routine.
– A glob matching routine called Wildmat.
– An almost minimal DFA regular expression routine.
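The first item can be sketched as follows. This is not CRL's routine: it is the Horspool simplification of Boyer-Moore (bad-character shifts only), written over Python strings, which are sequences of code points. The function names are assumptions for illustration. The shift table is a dictionary rather than a fixed-size array, since an array indexed by every possible Unicode code point would be wasteful.

```python
def build_shift_table(pattern):
    """Bad-character shifts for every character of the pattern except the last."""
    m = len(pattern)
    # Later occurrences overwrite earlier ones, leaving the smallest safe shift.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text, pattern):
    """Return the index of the first match of pattern in text, or -1 if absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    shifts = build_shift_table(pattern)
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift by the entry for the character under the window's last position;
        # characters that do not occur in the pattern allow a full-length shift.
        pos += shifts.get(text[pos + m - 1], m)
    return -1


text = "plain ASCII, \u6771\u4eac\u30bf\u30ef\u30fc, and more"
print(horspool_search(text, "\u30bf\u30ef\u30fc"))
```

The comparison is code point by code point, so canonically equivalent but differently encoded strings will not match unless both are normalized first.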
CRL’s Unicode Support

Other resources.
Some of the other resources made available by CRL are:
– Code to test wchar_t type support in C/C++ compilers.
– Keyboard arrangements for various languages that have been collected over the years.

Resource Availability.
All of the resources mentioned are freeware and can be found
at http://crl.nmsu.edu/~mleisher/.
International Library for Mozilla
Frank Tang
Netscape Communications
Mozilla

International Components for Unicode (ICU)
Helena Shih and Steven Loomis
IBM Unicode Technology Center

Unicode support in the Industry

Lack of a complete set of features in most implementations.
Inconsistent across different environments. Win32 vs. POSIX, for example.
Poor portability.
Unable to share the resources with other products.
Almost no extensibility and customization.
Not a concern for most companies when a product is first designed.
[Figure: diagram labeled "ICU" showing the following platforms and products: Netfinity Server, Apple G3 Macintosh, IBM’s DB/2 Product, AS/400 e-Server 720, Microsoft NT Workstation, World Wide Web, S/390 Server, Sun Ultra 60 Workstation]

ICU Objectives

Quality Unicode & I18N support across platforms
Consistent results in both C/C++ and Java
Powerful, portable API available to the OpenSource development community
Important resource-sharing mechanism
Outside feedback & contributions improve quality and feature set
ICU Features

Parallel to the i18n architecture in JDK
All components multi-thread safe
Full Unicode string manipulation
Complete locale support (more than 145 locales)
Fast and flexible character set conversion
Efficient data loading mechanism
Hierarchical resource bundles with Unicode data
Extensive calendar and timezone support
Date, time, currency, number and message formatting

ICU Features

Locale-sensitive sorting (including Thai)
Locale-sensitive text boundary detection
Customizable transliteration interface
Unicode text compression algorithm
Fast and compliant Unicode 3.0 Bidi algorithm
Unicode 3.0 normalization support
Most up-to-date Unicode 3.0 character properties
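As a small illustration of why normalization support matters, the sketch below compares two encodings of the same visible text. It uses Python's standard unicodedata module rather than ICU, and the helper name same_text is an assumption for this example.

```python
import unicodedata


def same_text(a, b, form="NFC"):
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "caf\u00e9"     # 'é' as one precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

print(composed == decomposed)                       # False
print(same_text(composed, decomposed))              # True
print(len(unicodedata.normalize("NFD", composed)))  # 5
```

Without a normalization step, searching, sorting and equality tests can treat canonically equivalent strings as different.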
Platform Support

Reference Platforms:
– AIX
– OS/390
– AS/400
– RedHat Linux
– Solaris
– Windows 98, NT4.0 and Win2000
– HP-UX

Working Partners:
Sun, IBM, NCR, Xerox, Netscape, Progress, RealNames,
Versant, Compuware, GlobalSight, Hotmail, Lotus ...
ICU Documentation

API Documentation
– Updated from header files (like javadoc)
– Available on external web site

User Guide
– Work in progress, feedback welcome
– Initial draft available
ICU4J - ICU for Java

IBM developed extensive I18N library
I18N code added to Java JDK 1.1
Java code ported to C++ -> ICU
ICU available on alphaWorks
Both ICU and Java classes continue development
– Sometimes “leapfrogging” each other with features
ICU open source, moves to developerWorks
2000 March: Java Code open source as “ICU4J”

ICU4J Features

Builds on Java 2 feature set
Feature summary:
– Advanced text boundary detection
– Calendars: Hebrew, Hijri/Islamic, Japanese Gengou, Thai Buddhist
– Spelled-out numbers
– Normalization
– Transliteration
– Standard Unicode compression
Reference Information

ICU Web Sites
– http://oss.software.ibm.com/icu/

developerWorks Unicode site
– http://www.ibm.com/developer/unicode/

The Unicode Standard
– http://www.unicode.org/

developerWorks Java site
– http://www.ibm.com/developer/java/
Demos

Locale Explorer
xliterate-It!
Qt Demo
ICU OpenSource Objectives

Promotes a cross-platform Unicode strategy
Produces a Unicode technology implementation
Supports important OpenSource products
– Linux, Apache, Mozilla, XML etc.
Open-Source Models

The Apache model
– Web access for CVS repository
– Technical committees

Developer community support
– [email protected] support account
– news.alphaworks.ibm.com discussion newsgroup

Commercial product partnership
– RealNames, Versant, GE ...
Open-Source Models

The Troll Tech model
– Free and Professional Editions
– Distinguish private, open source use from commercial,
closed source use
– All contributions accepted and used in both versions.
– Source updated daily
Why contribute to Open Source?

Bob Verbrugge:
– Requires robust I18n and portability
– Implementing alone, cost is considerable
– Sharing development is cost effective
– Shared knowledge with experts
– Ability to influence the end-result
Why contribute to Open Source?

Steve Watt:
– Requires portability and interoperability
– Upgrading existing library to Unicode
version 3.0 is a sizable effort
– Commercial libraries did not meet our
needs
– Shared effort means our development
focus is now aligned with our needs
Why contribute to Open Source?

Steve Watt’s concerns:
– Giving away proprietary technology
– Design by committee
– Will release schedules fit product
schedules?
– Will library and product stay in synch?
– Do all participants have common
objectives?
Why contribute to Open Source?

Yves Arrouye:
– Share expertise, give something
– Benefits from features developed by others
• Normalization, optimized algorithms
• Character set conversions
– Access to source code
– Using multiple Open Source products
Why contribute to Open Source?

Yves Arrouye’s concerns:
– Management Perceptions
“If it’s free, it must be for play…”
– Entry requirements and qualifications to be
able to affect direction or design
– Patch integration, Release control and
schedules
– Build stability