Showroom of Best Practice - E-MELD


E-MELD
“School” of Best Practice
Helen Aristar-Dry & Gayathri Sriram
The LINGUIST List
Eastern Michigan University
July 11, 2003
E-MELD 2003
The LINGUIST List Crew
Working late…
Using all available talent ….
Overview
• The E-MELD ‘School’ of Best Practice: latest version
– Purpose
• What is ‘best practice’?
• Why ‘best practice’?
– Organization
• Demo some of the facilities
A note about the name…
• Showroom of BP? … Nope, it’s got rooms.
• House of BP?
• Funhouse?
• Playhouse?
• Outhouse?
• Bazaar?
• Palace?
• Chateau?
• Shed?
School
What is Best Practice?
Practices designed to ensure that digital language resources:
• endure through time.
• can be reused by others, both now and in the future.
• are as independent as possible of computer environments, scholarly communities, and domains of application.
- Bird & Simons 2003
Best Practice as we know it …
. . . this afternoon
• Distinguish between the archival format and
the presentation format(s). BP is concerned
primarily with archival format.
• Archival formats should employ open file
formats and open standards.
• Examples of archive formats:
– Documents: plain text with XML markup.
– Images: 16-bit grayscale TIFF.
– Audio files: pure (uncompressed) WAV files.
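A quick way to check the audio recommendation above is to read the WAV header and confirm the file is uncompressed. The following is a minimal sketch using Python's standard `wave` module; the helper names `describe_wav` and `make_silent_wav` are illustrative only, and the silent in-memory file stands in for a real recording.

```python
import io
import wave

def describe_wav(source):
    """Return basic archival facts about a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return {
            "compression": wav.getcomptype(),        # "NONE" means uncompressed PCM
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate_hz": wav.getframerate(),
        }

def make_silent_wav(seconds=1, sample_rate_hz=44100):
    """Build a small uncompressed mono 16-bit WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)                          # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * seconds)
    buffer.seek(0)
    return buffer

info = describe_wav(make_silent_wav())
print(info)
```

The standard `wave` module reads only uncompressed PCM data, so a file it refuses to open is worth a closer look before archiving.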
Best Practice
• Write metadata for the language resource in
an approved format.
Recommended:
• OLAC format
• A format mapped to OLAC, e.g., IMDI
• Make the metadata available to a general
search engine.
Recommended:
• An OLAC service provider, e.g. LINGUIST List
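OLAC metadata builds on the Dublin Core element set, so a record can be pictured as a handful of Dublin Core elements describing the resource. The sketch below is an assumption-laden illustration, not the OLAC container format: the `record` wrapper element, the `build_record` helper, and all field values are placeholders. Only the Dublin Core namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"    # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

def build_record(fields):
    """Build a simple metadata record from (element name, value) pairs."""
    record = ET.Element("record")             # hypothetical wrapper, not the OLAC container
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record

record = build_record([
    ("title", "Example lexicon"),             # placeholder values, not real data
    ("creator", "Fieldworker, A."),
    ("date", "2003"),
    ("format", "text/xml"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A real submission would follow the schema published by OLAC and would add language and linguistic-type information in the form that schema requires.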
Best Practice
• For morphosyntactic markup:
countenance different terminology sets
but use an ontology of linguistic
concepts (GOLD) as an interlanguage
• Relate the different morphological
markup schemas to the ontology by
means of a metaschema.
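The interlanguage idea can be sketched as a pair of mappings: each project keeps its own tags, and each tag points to a shared concept. In the sketch below, the concept names (`PastTense`, `Plural`, `Dual`), the tag sets, and the toy corpora are all invented for illustration and are not taken from GOLD or from any E-MELD data.

```python
# Hypothetical concept identifiers standing in for ontology entries.
PROJECT_A_TO_CONCEPT = {"PST": "PastTense", "PL": "Plural"}
PROJECT_B_TO_CONCEPT = {"past": "PastTense", "plur": "Plural", "du": "Dual"}

def to_concepts(tags, mapping):
    """Translate a project's own tags into shared concept identifiers."""
    return [mapping[tag] for tag in tags if tag in mapping]

def find_words(corpus, mapping, concept):
    """Return the words whose tags map to the requested concept."""
    return [word for word, tags in corpus if concept in to_concepts(tags, mapping)]

# Toy corpora: (word, tags) pairs using each project's own terminology.
corpus_a = [("walked", ["PST"]), ("dogs", ["PL"])]
corpus_b = [("ran", ["past"]), ("cats", ["plur"])]

# One query phrased in terms of the shared concept reaches both resources.
past_forms = (find_words(corpus_a, PROJECT_A_TO_CONCEPT, "PastTense")
              + find_words(corpus_b, PROJECT_B_TO_CONCEPT, "PastTense"))
print(past_forms)
```

Neither project has to give up its terminology; only the mapping to the shared concepts needs to be maintained.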
Why Best Practice?
“Best practice is enduring practice”
(Simons, bc)
BP is important for all language
documentation . . .
. . . but especially for documentation
of endangered languages
Why Best Practice?
• According to the Ethnologue, 52
languages have only 1 speaker
left.
• Somewhere 52 field linguists are
making audiotapes, videotapes,
and transcripts….
What if . . .
– Ten are transcribing in MS Word 6
(which probably won’t be readable in 15 years)
What if . . .
–Ten more are using
compressed audio formats?
(and compressing away some
of the data)
What if . . .
–Two more forget to turn
on the tape recorder?
A true story….
The BBC Domesday Project…
So the School is designed to
• Help users preserve their valuable data for
generations to come.
– Data:
• Notes
• Images
• Audio & video
– Users:
• linguists, programmers, archivists
• (digital) beginners or advanced users
Objectives:
• Teach
• Motivate
• Facilitate
• Invite (suggestions &
participation)
What will the School offer?
– Information about the preservation and
digitization of data
– Tutorials to provide hands-on training
– Facilities for online operations on the
linguist’s own data, e.g., creation of metadata
– Tools (and links to tools) for client-side
operations, e.g., text annotation
– Reading material about various aspects of
BP
– Showcase of data from 10 endangered
languages digitized according to BP
How is the School organized?
– Information
– Tutorials
– Online facilities
– Client-side tools
– Reading material
– Showcase of data from 10 endangered languages
Classroom
Workroom
Tool Room
Reading Room
Exhibit Hall
The Exhibit Hall
Purpose: to show what can be done within the
BP framework
• Data (currently) from Biao Min and Mocovi
• Info on the language(s)
• Biao Min lexicon & metadata
– Archive formats
– Presentation formats (with some audio)
• Search: cross-language search at a fine-grained morphosyntactic level (thanks to the
ontology)
• Comments facility for users
• What else?
Classroom
Teach users how to:
– choose equipment & software
– create metadata and make it available for
search
– create an XML file, schema & metaschema
– create and use stylesheets to transform XML
files
– annotate & transcribe audio & video files
– acquire ethics
– What else??
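The step from an archival XML file to a presentation format can be sketched in a few lines. A stylesheet language such as XSLT is the usual tool for this; the sketch below swaps in Python's standard `xml.etree.ElementTree` as a stand-in, since the standard library has no XSLT processor. The `lexicon`, `entry`, `form`, and `gloss` element names and the sample entries are assumptions made for illustration, not an E-MELD schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# Archival form: plain text with XML markup (element names are hypothetical).
ARCHIVAL_XML = """<lexicon>
  <entry><form>example</form><gloss>sample gloss</gloss></entry>
  <entry><form>another</form><gloss>second &amp; last</gloss></entry>
</lexicon>"""

def to_html(xml_text):
    """Render each lexicon entry as an HTML list item (the presentation form)."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall("entry"):
        form = escape(entry.findtext("form", default=""))
        gloss = escape(entry.findtext("gloss", default=""))
        rows.append(f"<li><b>{form}</b>: {gloss}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

html_text = to_html(ARCHIVAL_XML)
print(html_text)
```

The archival file is never modified; the presentation can be regenerated or redesigned at any time from the same source.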
Workroom
Where the user gets to work on her own data, using
BP tools for:
• metadata creation (ORE)
• terminology mapping
• annotation & transcription
• lexicon creation (FIELD)
• What else?
Reading Room
– Reference materials
– Manuals
– Links to off-site tutorials
– White papers
– Glossary of terms (linked to
other pages on the site)
• What else?
Tool Room
Downloads of:
• FIELD (Laptop version)
• Standalone ORE
• Links to LDC, IMDI tools, etc. for
– Conversion
– Annotation
• What else?
The “School”
http://emeld.org/school