IGNITE: Linguistic Infrastructure for Localisation: Language Data
Tenth Localisation Conference organised by the LRC
LRC-X The Development Localisation Event
University of Limerick +++ 13–14 September 2005
Linguistic Resources for Localisation
Reinhard Schäler (LRC)
Deirdre Farrell (VeriTest)
Annette Lee (VU Games)
Andreas Papadakis (Archetypon)
Florian Sachse (PASS)
A project co-funded by the European Union’s eContent Programme
Agenda
The IGNITE project
Partners
Context – Rationale – Overview
Phase I: Linguistic Resources
Summary
Discussion
Partners
Coordinator
University of Limerick, Localisation Research Centre (Ireland); contact: Reinhard Schäler
Contractors
PASS GmbH (Germany); contact: Florian Sachse
Lionbridge – VeriTest (Ireland); contact: Deirdre Farrell
Vivendi Universal Games (Ireland); contact: Annette Lee
Archetypon (Greece); contact: George Boukis
Each partner represents one of the stakeholders in localisation
Independent Research Organisation
Localisation Research Centre
www.localisation.ie
The Localisation Research Centre (LRC) is the
information, educational, and research centre
for the localisation community.
IGNITE will enable the LRC to offer an independent
testing and certification service to publishers of digital
content, tools and technologies, as well as to the
developers of localisation standards
Digital Content Developer
Vivendi Universal Games
Global leader in multi-platform interactive
entertainment.
Products for all major platforms, including PCs,
consoles, internet
700 title library includes multi-million unit selling
Rationale for participation
reduce costs & time to market, increase potential
market, and improve efficiency
participate along with key industry players in
developing an infrastructure of resources, tools,
technologies and standards
automation of localisation processes for digital
content, incorporating findings from IGNITE project
Localisation Service Provider
Archetypon
SME, founded in 1987
90% Revenue from the international market
One of the 500 leading high-growth companies in
Europe, for the second time in 2002 (GrowthPlus)
Rationale for participation
Evolve in a dynamic market
Automate localisation processes
Collaborate with partners in the industry
Localisation Technology Provider
PASS Engineering
Worldwide leading provider of high-quality localization
tools; developers of PASSOLO
Cutting-edge technology, across a wide variety of
platforms, powerful interfaces, highly competitive,
scalable pricing
Rationale for participation
PASS can make a very significant contribution
Expect a formalized set of test cases to run against
PASSOLO or its competitors
Expect better understanding of data-centric or
programmatic approaches based on a large set of
scenarios
IT Testing & Certification Expert
VeriTest
Established, Public Company (Nasdaq: LIOX)
Global IT Outsourcing Solutions
19 Solution Centers in 10 Countries
The industry's most trusted brand for high-quality, cost-effective outsourced testing, competitive analysis and
certification services
Rationale for participation
To be involved in developing supportable standards within the
industry
To develop, implement and audit the proposed test harness
and certification process
VeriTest is a division of Lionbridge
The localisation factory
a case study (2003)
The Setting
Project constraints
4m wordcount software strings
30 languages simultaneous release
13k localisable files
Localisation group in Dublin; 5,000 people world-wide distributed development team
Objectives
24/7, 100% automated process – no exceptions
Translation in parallel with development
Translation begins at code check-in
Translation “on demand” – no more “big project” model
Current throughput:
100,000 language check-ins per month
2 million files per month
98% of words leveraged
Average time to process a file: 45 seconds
Fully scalable “add-a-box model”
Simship of all 30 languages
International version testing before US release
Reduced no. of release engineers (20->2) resulting in US$20m saving per year
Positive ROI within 1 year
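The throughput figures above can be turned into a rough capacity estimate. The sketch below is illustrative arithmetic only, not taken from the case study. It assumes a 30-day month and that each "box" processes one file at a time, around the clock. The function name `boxes_needed` is hypothetical.

```python
import math

# Figures quoted on the slide
CHECK_INS_PER_MONTH = 100_000
FILES_PER_MONTH = 2_000_000
SECONDS_PER_FILE = 45


def boxes_needed(files_per_month, seconds_per_file, days_per_month=30):
    """Estimate how many machines are needed to keep up with the load.

    Assumption (not stated on the slide): each machine handles one file
    at a time and runs 24 hours a day for the whole month.
    """
    seconds_available = days_per_month * 24 * 60 * 60
    busy_seconds = files_per_month * seconds_per_file
    return math.ceil(busy_seconds / seconds_available)


# Average number of files touched by one language check-in
files_per_check_in = FILES_PER_MONTH // CHECK_INS_PER_MONTH

print(files_per_check_in)                               # 20
print(boxes_needed(FILES_PER_MONTH, SECONDS_PER_FILE))  # 35
```

Under these assumptions the quoted load corresponds to about 35 machines running continuously, which is consistent with the idea of scaling by adding boxes rather than people.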
Objectives and deliverables
Linguistic infrastructure resources
Language data
Tools and technologies
Standards
Access
Content developers
Service providers
Technology developers
Performance scenarios
Digital content - standards
Tools and technologies – standards
Standards - coverage
Standard verification and enhancement
[Diagram: the three IGNITE project phases]
Phase I: Linguistic Resources Support Network (IGNITE Consortium, IGNITE Contact Group)
Phase II: Localisation Process Environment (state-of-the-art technologies and process environment)
Phase III: Performance analysis
Linguistic resources: examples
Language data: digital content source/target, terminologies, translation memories
Tools: terminology DBs, TM systems, UI editors
Standards: OASIS, ISO, Unicode, W3C
Phase I: Linguistic resources
Language Data
Tools
Standards
Language data
Set of speech or language data and
descriptions in machine readable form
Used e.g.
for building, improving or evaluating natural
language and speech algorithms or systems
as core resources for the software localisation and
language services industries
for language studies
electronic publishing
international transactions
subject-area specialists
end users
Language data
Types of language data
Multimodal digital content in source and
target languages
Monolingual and multilingual terminology
Translation memories
Languages covered
Primarily those represented in consortium
Also those represented by contact group
Linguistic tools
Linguistic tools and technologies answer some
of the central questions around terminology
handling and update processing.
Terminology handling
Access to standard terminology in multiple languages
Maintenance of multilingual terminologies
Integration of late terminology changes
Consistency checker
Update processing
Version comparison, change control
Analysis and alignment of source and target
Use of exact or fuzzy matches
Beyond Translation Memory Systems
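The "exact or fuzzy matches" item above can be illustrated with a minimal sketch. This is not how any particular translation memory system works: it uses the character-based similarity ratio from Python's standard `difflib` module as a stand-in for a real matching algorithm. The function name `best_match`, the 0.75 threshold and the sample English/German segments are all hypothetical.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Look up a source segment in a translation memory.

    Returns a tuple (kind, stored_source, stored_target, score), where
    kind is "exact" or "fuzzy", or None if nothing reaches the threshold.
    """
    # Exact match: the segment is already stored verbatim.
    if segment in memory:
        return ("exact", segment, memory[segment], 1.0)

    # Fuzzy match: pick the stored segment with the highest similarity.
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[3]):
            best = ("fuzzy", source, target, score)
    return best


# Hypothetical sample memory (English source, German target)
memory = {
    "Save the file": "Datei speichern",
    "Open the file": "Datei öffnen",
}

print(best_match("Save the file", memory))   # exact hit
print(best_match("Save the files", memory))  # fuzzy hit on "Save the file"
print(best_match("Quit", memory))            # None, below the threshold
```

A fuzzy hit is only a suggestion: the stored target still has to be reviewed and adapted by a translator, which is why the score is returned alongside it.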
Linguistic tools
Linguistic tools in localisation
Terminology management systems
Translation memories
Machine translation
User interface and user assistance visual translation
environments
Language data analysis tools
Sophisticated matching tools
Natural language parsers
Extract-and-Insert tools
Parsers for natural language digital content in
compiled sources
Linguistic tools
Direct and online access
Cooperation with leading industry
associations (e.g. GALA and TILP)
Tools review and categorisation
Dissemination
Greater confidence when selecting tools
Development of market for tools
Standards
A large number of standards relevant to
linguistic resources in the context of
localisation have been published by a number
of organisations
International Organization for Standardization (ISO)
Localisation Industry Standards Association (LISA)
OASIS
The Free Standards Group Open Internationalization
Initiative (Openi18n.org)
TermNet
Unicode
W3C
Standards
Central repository
Review of standard development
process
Uptake and support
Demonstration
Effectiveness
Phase I: Linguistic resources – how?
IGNITE Contact Group
Digital content publishers
Coordinated by Vivendi Universal Games
Service providers
Coordinated by Archetypon
Tools developers
Coordinated by PASS
Standard organisations
Coordinated by VeriTest
How to contribute and benefit
IGNITE contact group – sign up!
Organised by type of enterprise and interest (content
developer, service provider, tools developer, certification
expert)
Contributors to the content, tools/technologies and
standards repositories
Early access to localisation resources, infrastructure and
test harness
Exposure through project literature and publications
Preferential invitation to Contact Group meetings,
workshops
Next steps
Phase II
Review approaches to standard process descriptions
(in other industries, e.g. manufacturing)
Distillation of a standard localisation process
Localisation Factory (automated localisation process
environment)
Phase III
Develop process and standard evaluation strategy
Build test harnesses
Report on performance evaluations
Discussion
Benefits for your company?
Are all angles covered?
Interested in contributing?
How to?
Is it feasible?