IGNITE: Linguistic Infrastructure for Localisation: Language Data

Tenth Localisation Conference organised by the LRC
LRC-X The Development Localisation Event
University of Limerick, 13–14 September 2005
Linguistic Resources for Localisation
Reinhard Schäler (LRC)
Deirdre Farrell (VeriTest)
Annette Lee (VU Games)
Andreas Papadakis (Archetypon)
Florian Sachse (PASS)
A project co-funded by the European Union’s eContent Programme
Agenda
 The IGNITE project
 Partners
 Context – Rationale - Overview
 Phase I: Linguistic Resources
 Summary
 Discussion
Partners
 Coordinator
 University of Limerick, Localisation Research Centre (Ireland); contact: Reinhard Schäler
 Contractors
 PASS GmbH (Germany); contact: Florian Sachse
 Lionbridge – VeriTest (Ireland); contact: Deirdre Farrell
 Vivendi Universal Games (Ireland); contact: Annette Lee
 Archetypon (Greece); contact: George Boukis
Each partner represents one of the stakeholders in localisation
Independent Research Organisation
Localisation Research Centre
www.localisation.ie
The Localisation Research Centre (LRC) is the
information, educational, and research centre
for the localisation community.
IGNITE will enable the LRC to offer an independent
testing and certification service to publishers of digital
content, tools and technologies, as well as to the
developers of localisation standards
Digital Content Developer
 Vivendi Universal Games
 Global leader in multi-platform interactive
entertainment.
 Products for all major platforms, including PCs,
consoles, internet
 700 title library includes multi-million unit selling
 Rationale for participation
 reduce costs & time to market, increase potential
market, and improve efficiency
 participate along with key industry players in
developing an infrastructure of resources, tools,
technologies and standards
 automation of localisation processes for digital
content, incorporating findings from IGNITE project
Localisation Service Provider
 Archetypon
 SME, founded in 1987
 90% Revenue from the international market
 One of the 500 leading high-growth companies in
Europe, for the second time in 2002 (GrowthPlus)
 Rationale for participation
 Evolve in a dynamic market
 Automate localisation processes
 Collaborate with partners in the industry
Localisation Technology Provider
 PASS Engineering
 Worldwide leading provider of high-quality localization
tools; developers of PASSOLO
 Cutting-edge technology, across a wide variety of
platforms, powerful interfaces, highly competitive,
scalable pricing
 Rationale for participation
 PASS can make a very significant contribution
 Expect a formalized set of test cases to run against
PASSOLO or its competitors
 Expect better understanding of data-centric or
programmatic approaches based on a large set of
scenarios
IT Testing & Certification Expert
 VeriTest
 Established, Public Company (Nasdaq: LIOX)
 Global IT Outsourcing Solutions
 19 Solution Centers in 10 Countries
 The industry's most trusted brand for high-quality, cost-effective outsourced testing, competitive analysis and
certification services
 Rationale for participation
 To be involved in developing supportable standards within the
industry
 To develop, implement and audit the proposed test harness
and certification process
VeriTest is a division of Lionbridge
The localisation factory
a case study (2003)
The Setting
Project constraints
4m words in software strings
Simultaneous release in 30 languages
13k localisable files
Localisation group in Dublin; 5,000 people world-wide; distributed development team
Objectives
24/7, 100% automated process – no exceptions
Translation in parallel with development
Translation begins at code check-in
Translation “on demand” – no more “big project” model
 Current throughput: 100,000 language check-ins per month
 2 million files per month
 98% of words leveraged
 Average time to process a file: 45 seconds
 Fully scalable “add-a-box” model
 Simship of all 30 languages
 International version testing before US release
 Reduced number of release engineers (20 to 2), resulting in savings of US$20m per year
 Positive ROI within 1 year
Objectives and deliverables
 Linguistic infrastructure resources
 Language data
 Tools and technologies
 Standards
 Access
 Content developers
 Service providers
 Technology developers
 Performance scenarios
 Digital content - standards
 Tools and technologies – standards
 Standards - coverage
[Diagram: IGNITE project phases]
Phase I: Linguistic Resources
Phase II: Localisation Process Environment
Phase III: Performance analysis
Other diagram labels: Standard verification and enhancement; State-of-the-art technologies and process environment; IGNITE Consortium; IGNITE Contact Group; Linguistic Resources Support Network
Examples of linguistic resources
 Language data: digital content source/target, terminologies, translation memories
 Tools: terminology DBs, TM systems, UI editors
 Standards: OASIS, ISO, Unicode, W3C
Phase I: Linguistic resources
 Language Data
 Tools
 Standards
Language data
 Set of speech or language data and
descriptions in machine readable form
 Used e.g.
 for building, improving or evaluating natural
language and speech algorithms or systems
 as core resources for the software localisation and
language services industries
 for language studies
 electronic publishing
 international transactions
 subject-area specialists
 end users
Language data
 Types of language data
 Multimodal digital content in source and
target languages
 Monolingual and multilingual terminology
 Translation memories
 Languages covered
 Primarily those represented in consortium
 Also those represented by contact group
Linguistic tools
 Linguistic tools and technologies answer some
of the central questions around terminology
handling and update processing.
 Terminology handling
 Access to standard terminology in multiple languages
 Maintenance of multilingual terminologies
 Integration of late terminology changes
 Consistency checker
 Update processing
 Version comparison, change control
 Analysis and alignment of source and target
 Use of exact or fuzzy matches
 Beyond Translation Memory Systems
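The "exact or fuzzy matches" step of update processing can be illustrated with a minimal sketch. The slides do not say how matches are scored, so everything here is an assumption made for illustration: the character-based similarity from Python's standard `difflib` module, the 0.75 threshold, the `best_match` helper, and the sample English-to-German memory. Commercial TM systems typically use their own, more elaborate scoring.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Return (kind, source, target, score) for the closest memory entry.

    kind is "exact", "fuzzy", or "none". The 0.75 threshold is an
    illustrative assumption, not a value taken from the slides.
    """
    # An identical source segment can be reused without review.
    if segment in memory:
        return ("exact", segment, memory[segment], 1.0)

    # Otherwise score every stored source segment and keep the best one.
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_source, best_score = source, score

    if best_source is not None and best_score >= threshold:
        return ("fuzzy", best_source, memory[best_source], best_score)
    return ("none", None, None, best_score)


# Hypothetical memory built from an earlier product version.
memory = {
    "Save the file": "Datei speichern",
    "Open the file": "Datei öffnen",
}

for segment in ["Save the file", "Save the files", "Print preview"]:
    kind, source, target, score = best_match(segment, memory)
    print(f"{segment!r}: {kind} ({score:.2f}) -> {target!r}")
```

In this sketch an unchanged string is reused as an exact match, a slightly edited string is offered to the translator as a fuzzy match with its earlier translation, and a new string falls through to fresh translation.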
Linguistic tools
 Linguistic tools in localisation
 Terminology management systems
 Translation memories
 Machine translation
 User interface and user assistance visual translation environments
 Language data analysis tools
 Sophisticated matching tools
 Natural language parsers
 Extract-and-Insert tools
 Parsers for natural language digital content in compiled sources
Linguistic tools
 Direct and online access
 Cooperation with leading industry
associations (e.g. GALA and TILP)
 Tools review and categorisation
 Dissemination
 Greater confidence when selecting tools
 Development of market for tools
Standards
 A large number of standards relevant to
linguistic resources in the context of
localisation have been published by a number
of organisations
 International Organization for Standardization (ISO)
 Localisation Industry Standards Association (LISA)
 OASIS
 The Free Standards Group Open Internationalization Initiative (Openi18n.org)
 TermNet
 Unicode
 W3C
Standards
 Central repository
 Review of standard development
process
 Uptake and support
 Demonstration
 Effectiveness
Phase I: Linguistic resources – how?
 IGNITE Contact Group
 Digital content publishers
 Coordinated by Vivendi Universal Games
 Service providers
 Coordinated by Archetypon
 Tools developers
 Coordinated by PASS
 Standards organisations
 Coordinated by VeriTest
How to contribute and benefit
 IGNITE contact group – sign up!
 Organised by type of enterprise and interest (content
developer, service provider, tools developer, certification
expert)
 Contributors to the content, tools/technologies and
standards repositories
 Early access to localisation resources, infrastructure and
test harness
 Exposure through project literature and publications
 Preferential invitation to Contact Group meetings,
workshops
Next steps
 Phase II
 Review approaches to standard process descriptions
(in other industries, e.g. manufacturing)
 Distillation of a standard localisation process
 Localisation Factory (automated localisation process
environment)
 Phase III
 Develop process and standard evaluation strategy
 Build test harnesses
 Report on performance evaluations
Discussion
Benefits for your company?
Are all angles covered?
Interested in contributing?
How to?
Is it feasible?