
CELCT
Center for the Evaluation of
Language and Communication
Technologies
www.celct.it
Introduction
CELCT (/selekt/):
• A joint initiative of DFKI and ITC-irst (now Bruno
Kessler Foundation)
• An autonomous institution located in Trento
• Started operating at the beginning of 2004
• Funded by the Autonomous Province of Trento
Motivations
The Importance of Evaluation
Evaluation plays an important role in advancing
technology
• Evaluation is useful to assess technology
(improving software quality)
• Evaluators should collaborate with the
research community, but remain independent of
technology developers
Mission
GOALS
• Setting up all the necessary infrastructure and skills to
support evaluation in Human Language-Multimodal
Communication Technologies (HL-MCTs)
• Co-operating with private partners in order to foster the
development of HL-MCT applications for the market
Overview of Activities
CELCT has been involved in the following activities:
• CLEF
• RTE-PASCAL
• I-CAB
• DUC
• Euromatrix
• PATExpert
• IWSLT
• Multimodal Interface & Web Site Evaluation
CLEF
Cross Language Evaluation Forum
• Coordination of the Multilingual Question
Answering Track in the 2005, 2006, and 2007 campaigns
• Organization of the QA@CLEF final workshops
and editing of the Workshop Proceedings for 2004, 2005, and 2006
RTE
Recognizing Textual Entailment
Challenge promoted by PASCAL (a network of excellence
which pioneers principled methods of pattern analysis,
statistical modeling and computational learning) aimed at
evaluating NLP systems’ ability to automatically recognize
semantic inference between two text fragments.
• Creation of the development and test data sets for RTE 1 and 2, and of the
guidelines, in collaboration with Bar-Ilan University,
Microsoft, MITRE, and NYU
• Organization of the final workshops (Southampton, UK,
April 12, 2005; Venice, April 10, 2006)
• Organization of the third edition (2007)
• 2008: collaboration with NIST
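As a rough illustration of how a campaign of this kind might score a system, the sketch below assumes that each text/hypothesis pair carries a yes/no gold judgement and that systems are compared by plain accuracy. Neither detail is stated on the slide; the class name `EntailmentPair`, the function `accuracy`, and the two example pairs are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    """One text/hypothesis pair with its gold judgement (hypothetical layout)."""
    pair_id: str
    text: str
    hypothesis: str
    entails: bool  # gold label: does the text entail the hypothesis?


def accuracy(gold, predictions):
    """Fraction of gold pairs whose predicted label matches the gold label.

    `predictions` maps pair_id -> bool; a missing prediction counts as wrong.
    """
    if not gold:
        raise ValueError("gold set is empty")
    correct = sum(1 for pair in gold if predictions.get(pair.pair_id) == pair.entails)
    return correct / len(gold)


# Made-up example pairs, not taken from any RTE data set.
gold = [
    EntailmentPair("1", "Maria works for the Red Cross.",
                   "Maria is affiliated with the Red Cross.", True),
    EntailmentPair("2", "Maria works for the Red Cross.",
                   "Maria lives in Italy.", False),
]
predictions = {"1": True, "2": True}

print(accuracy(gold, predictions))  # 0.5
```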
I-CAB
Italian Content Annotation Bank
Italian news corpus annotated with different kinds of semantic
information in collaboration with ITC-irst
• A benchmark intended as a reference work for Information
Extraction tasks, providing annotations of:
– temporal expressions: e.g. lunedì/Monday, oggi/today
– entities:
• persons: e.g. il presidente/the president, Maria/Mary
• organizations: e.g. Croce Rossa/The Red Cross
• geo-political entities (GPEs): e.g. Italia/Italy
• locations: e.g. Caucaso/Caucasus
– relations among entities: e.g. the affiliation relation connecting a
person to an organization
– events (specific occurrences involving participants): e.g. nascita/be-born, bancarotta/bankruptcy
I-CAB (2)
Formalism adopted:
• The activity has been carried out by adopting the standards
developed within the American ACE program
(ACE, Automatic Content Extraction,
http://www.nist.gov/speech/tests/ace).
• The English guidelines provided by the Linguistic Data
Consortium (LDC, http://www.ldc.upenn.edu) have been
adapted to Italian.
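The annotation layers listed for I-CAB (temporal expressions, typed entities, relations among entities) can be pictured as stand-off annotations over character offsets. The sketch below is only an illustration under that assumption: it is not the I-CAB or ACE file format, all class names are hypothetical, the example sentence is made up, and events are left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """The four entity types named on the slide."""
    PERSON = "person"
    ORGANIZATION = "organization"
    GPE = "geo-political entity"
    LOCATION = "location"


@dataclass(frozen=True)
class Span:
    """Character offsets into the document: start inclusive, end exclusive."""
    start: int
    end: int

    def text(self, document):
        return document[self.start:self.end]


@dataclass(frozen=True)
class TemporalExpression:
    span: Span


@dataclass(frozen=True)
class EntityMention:
    span: Span
    entity_type: EntityType


@dataclass(frozen=True)
class Relation:
    """A typed link between two entity mentions, e.g. affiliation."""
    relation_type: str
    source: EntityMention
    target: EntityMention


# Made-up sentence: "Today Maria works for the Red Cross."
document = "Oggi Maria lavora per la Croce Rossa."
today = TemporalExpression(Span(0, 4))
maria = EntityMention(Span(5, 10), EntityType.PERSON)
red_cross = EntityMention(Span(25, 36), EntityType.ORGANIZATION)
affiliation = Relation("affiliation", maria, red_cross)

print(today.span.text(document))      # Oggi
print(maria.span.text(document))      # Maria
print(red_cross.span.text(document))  # Croce Rossa
```

Keeping annotations separate from the text, as here, lets several layers refer to the same span without modifying the source document.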
DUC2005
Document Understanding Conference
A campaign dedicated to the evaluation of automatic
summarization systems
• In collaboration with Columbia University, CELCT
tested the Pyramid Method, an empirically based
method for evaluating the content of automatic
summaries, by annotating the results and providing
feedback to the organisers of the competition
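For readers unfamiliar with the Pyramid Method, the sketch below shows its core idea in a simplified form: each content unit is weighted by the number of human reference summaries that express it, and a system summary is scored by the weight it covers relative to the best weight attainable with the same number of units. The content-unit labels, function names, and example data are hypothetical, and the manual annotation step that CELCT carried out (deciding which units a summary expresses) is taken as given.

```python
from collections import Counter


def content_unit_weights(reference_summaries):
    """Weight of each content unit = number of reference summaries expressing it."""
    weights = Counter()
    for units in reference_summaries:
        weights.update(set(units))  # count each unit once per reference
    return weights


def pyramid_score(peer_units, weights):
    """Observed weight divided by the ideal weight for the same number of units."""
    peer = set(peer_units)
    observed = sum(weights.get(unit, 0) for unit in peer)
    # Ideal summary of the same size: take the highest-weighted units first.
    ideal = sum(sorted(weights.values(), reverse=True)[:len(peer)])
    return observed / ideal if ideal else 0.0


# Made-up annotation: three reference summaries and one system summary.
references = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}]
weights = content_unit_weights(references)  # A=3, B=2, C=1, D=1

print(pyramid_score({"A", "C"}, weights))  # 0.8
```

In this example the system summary covers weight 3 + 1 = 4, while the best two-unit summary would cover 3 + 2 = 5, giving 0.8.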
EUROMATRIX: Statistical and Hybrid
Machine Translation Between All European Languages
• Sixth EU Framework Programme for Research and
Technological Development-FP6
• Objective: SO 2.5.7 “Multimodal Interfaces”
• Partners:
1. Saarland University, Germany
2. University of Edinburgh, UK
3. Charles University, Prague, Czech Republic
4. CELCT, Italy
5. GROUP Technologies, Germany
6. MorphoLogic, Hungary
EUROMATRIX: goal
• To create basic translation systems between all
European language pairs.
• To advance the state of the art in machine
translation by combining statistical
techniques with linguistic knowledge sources in
a new hybrid machine translation system.
EUROMATRIX: role of CELCT
• Develop standardized evaluation procedures to
objectively assess performance
• Set the state of the art in machine translation
evaluation methods and techniques
• Organize two evaluation campaigns (10
language pairs chosen among the official
languages of the EU)
• Create resources, data, tools, and systems for the
automatic translation between all language pairs
PATEXPERT: Advanced Patent
Document Processing Techniques (1)
• EC project in collaboration with ITC-irst
• Objectives:
– develop a multimedia content representation formalism based
on Semantic Web technologies for selected technology areas
– investigate the retrieval, classification, multilingual generation
of concise patent information, assessment and visualization of
patent material encoded in this formalism, taking the
information needs of all user types as defined in a user
typology into account
– technological goal: to develop a showcase that demonstrates
the viability of PATExpert's approach to content representation
for real applications
PATEXPERT: role of CELCT
• Task: annotation of English patents
• Domains: machine tools and optical recording devices (English)
• Language: Legal/Patent
• Syntactic and structural information (PoS tagging, chunking, text zoning,
parsing)
• PoS tagging
- ~10k words (6-7 patents) annotated using the CLAWS4 tagset
- Represents the Gold Standard
• Chunking
- Same data as for PoS tagging
- Tagset will contain 5-10 tags
• Entity and relation annotation
- 6-8 entities and relations
- ~50k words for training (including annotation of the 10k words already
annotated with PoS tags and chunks)
IWSLT: International Workshop on
Spoken Language Translation
• In collaboration with ITC-irst within the C-STAR consortium (Consortium for Speech
Translation Advanced Research)
• CELCT prepared text and speech data for the
Italian-English translation track of the 2006
evaluation campaign (BTEC, Basic Travel
Expression Corpus).
• CELCT coordinated the evaluation campaign
in 2007 (Workshop in Trento, October).
EVALITA 2007
• Evalita 2007: Evaluating Natural Language
Tools for Italian
Bernardo Magnini, Amedeo Cappelli (eds.)
Year IV, Number 3
July-September 2007
EVALITA 2007
• Received 55 expressions of interest for the five tasks
• In the end, the number of participants was 30 (21 different
organizations, 8 non-Italian, 2 non-academic), with the
following distribution:
– Part of Speech tagging (11),
– Parsing (8),
– Word Sense Disambiguation (1),
– Temporal Expressions (4),
– Named Entity Recognition (6).
Multimodal Interfaces and Website
Evaluation
Since mid-2005 CELCT has offered a service of website and
multimodal interface evaluation:
• Based on the User-Centered Design (UCD) methodology for
system evaluation and design, applied to industrial projects
UCD approach
• Preliminary analysis of the client's requirements
• Client's tasks
• Design requirements and specifications
• Formative evaluation
• New design requirements and specifications
• Global evaluation
Role in Treble-Clef
Beneficiary no.: 6
Beneficiary short name: CELCT
Person months per work package (columns WP1-WP6; five values listed): 0.5, 2, 1, 2, 7.5
Total person months: 13
• Coordination of WP6 (Dissemination)
• Objectives: The aim of this work package is to disseminate
the results of the project. Tools and data collected and
produced during the activities carried out in the project, and
all the information about MLIA best practices, will be made
available to the scientific community and industry.