Working group #5: Archiving, Ethics, and Metadata

E-MELD 2004, Detroit, MI, July 15-18

Archiving

Current pages are more about how to build an archive than about how to build a better field corpus.

First suggestion: reverse that order. Put establishing an archive at the bottom, with a link to a page for the relatively few who will be interested.

How to build a better corpus

- First: change the name of the section! Field linguists don’t create archives; they create archive-ready corpora.
- Add a link called Labelling to the right-hand box and sprinkle the phrase “Label Everything” randomly throughout the document.
- Have links to concrete examples that print nicely wherever possible.

Archiving, cont

How to build a better corpus

- Change “collection” to “field corpus” or “language documentation corpus”;
- Make it more concrete, give a set of steps to follow;
- Steps should link to other parts of the School, e.g. metadata, ethics;
- Should be very granular, small bites.

Ethics

4 facets to ethical foundations for language documentation:
- Ethics: a general code of conduct;
- Informed consent: from consultants;
- Intellectual property rights (IPR): legal status of materials produced;
- Access management: for archives, over the long term.

School of Best Practice guide to ethically grounded fieldwork

I. Get informed;
II. Get consent;
III. Give something back.

Get informed

- What kinds of information would be helpful for field linguists?
- Statements from relevant entities:
  - LSA, AAA, APA: academic organizations’ statements of ethics;
  - DoBeS, AILLA, AIATSIS, ANLC, etc.: archives’ Codes of Conduct;
  - Terms & Conditions of use;
- IRB requirements for some representative universities;

Get informed, cont.

- Summaries of IPR laws in different countries;
- Statements from indigenous organizations, e.g. Kuna Cultural Congress, Mayan Academy;
- Tip sheets related to regions and/or language & culture areas, with advice and warnings about special legal and ethical issues to watch out for. Ex: paying illegal immigrant consultants in the U.S.

Get informed, cont.

- Examples of written/oral agreements between researchers/projects and speakers/communities;
- Samples of boilerplate for IRB documents;
- Case studies, anecdotes, and tidbits from experienced field linguists, about what went wrong and what went right in terms of getting permission to work with people and record people and publish results in various ways;

Get informed, cont. (last one)

- We need a place for discussing and archiving information about compensating consultants.
- Forms of compensation (money, gifts, labor) depend on everything: region, culture, religion, class.

Get consent

- We need downloadable, full-color, tri-fold brochures in all the major contact languages entitled “How to talk to your linguist about Intellectual Property Rights.”
- Also helpful to have anecdotes, ideas, etc. about how to talk about publishing with people who aren’t familiar with publishing media.

Get consent, cont.

Get consent to what? List typical potential uses of language documentation:
- Archive w/limited access: specify conditions;
- Archive @ public access: think about pros and cons;
- Publish whole work: transcribed/translated, CD, DVD;
- Publish excerpts: grammars, articles, books;
- Public performance: broadcast;
- Commercial uses: Madonna samples Achuar.

Get consent, cont.

How to document consent:
- Signed license agreements;
- Recorded license agreements (transcribed and translated);
- Relate agreement to resource via metadata;
- Could also document Codes of Conduct adhered to in the metadata for future reference.
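
One way to relate an agreement to a resource via metadata is sketched below in Python. This is only an illustration: the field names (`agreement_id`, `permitted_uses`, `codes_of_conduct`), the use labels, and the helper functions are all hypothetical and do not come from OLAC, IMDI, or any archive's actual schema.

```python
from dataclasses import dataclass, field, asdict

# Hypothetical labels for the two ways of documenting consent listed above.
VALID_KINDS = {"signed", "recorded"}


@dataclass
class ConsentRecord:
    """A consent agreement, described well enough to find it again later."""
    agreement_id: str            # label of the signed form or recorded agreement
    kind: str                    # "signed" or "recorded"
    permitted_uses: list         # e.g. ["archive_limited", "publish_excerpts"]
    codes_of_conduct: list = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unknown agreement kind: {self.kind}")


def attach_consent(resource_metadata: dict, consent: ConsentRecord) -> dict:
    """Return a copy of the resource's metadata with the agreement attached."""
    updated = dict(resource_metadata)
    updated["consent"] = asdict(consent)
    return updated


def is_use_permitted(resource_metadata: dict, use: str) -> bool:
    """True only if an attached agreement explicitly lists this use."""
    consent = resource_metadata.get("consent")
    return consent is not None and use in consent["permitted_uses"]
```

The sketch treats a missing agreement as "no permission", so a resource without documented consent never counts as cleared for any use.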

Give something back

We have to come up with better ways to make at least some part of our language documentation useful and accessible to speakers, as early in the documentation process as possible.

Give something back, cont

People want printed materials, maybe CDs and tapes. Print is still the most useful medium for most endangered language communities.

- Archives should be sure to offer formats that print nicely;
- Departments need to encourage and reward linguists for spending time on pedagogical materials;

Give something back, cont.

School of Best Practice could showcase examples of good things to create:
- Teaching materials: primers, simple dictionaries, Illustrated Encyclopedia of Animals (pictures with words in Rama, Spanish, and English);
- Calendars;
- Collections of stories with accompanying cassette tapes.

Metadata

Critique of the School’s ‘What is metadata?’ page:
- Needs to fit on one page;
- Should be very simple, a basic intro with links to details;
- List the basic required elements with a link to an Example that will print nicely;
- How to do it -> page with tools & templates;
- History -> page with links to OLAC, IMDI, etc.

Metadata, cont

- We must make it clear that this is not new stuff: linguists already collect the essentials.
- We need to make it clear that metadata and corpus management are useful for the field linguist, not just another rule imposed from the outside;
- Metadata is your friend! Knowing what you have and what goes with what is helpful!

Metadata, cont.

Got Metadata? A guide to managing information in your field project:

1. Set up your template of elements in your metadata logbook of choice:
   - Spiral notebook;
   - Excel spreadsheet;
   - Shoebox database;
   - IMDI Corpus Browser.

- We need clever people to create templates and make them available from the School.

Metadata, cont.

2. Figure out a labelling/identifier scheme for your materials;
3. Label every object (tape, cd, file, notebook) with its assigned label;
4. Use the labels as keys in your logbook so metadata records are associated with the things they describe;
5. Record a header on every recording with the same essential info.
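
The logbook steps could be sketched in Python as follows, for anyone who keeps their logbook in a spreadsheet or script. Everything here is an assumption made for illustration: the element names in `FIELDS`, the label pattern produced by `make_label`, and the `Logbook` class are hypothetical, not a template endorsed by the School, OLAC, or IMDI.

```python
import csv
import io

# Hypothetical template of elements; pick whatever set your project needs.
FIELDS = ["label", "object_type", "date", "speaker", "language", "genre", "notes"]


def make_label(language_code: str, session: int, item: int) -> str:
    """Build a label such as 'ABC-S001-I003' (an invented scheme)."""
    return f"{language_code.upper()}-S{session:03d}-I{item:03d}"


class Logbook:
    """Metadata records keyed by the label written on each object."""

    def __init__(self):
        self.records = {}

    def add(self, label: str, **elements) -> dict:
        if label in self.records:
            raise ValueError(f"label already in use: {label}")
        unknown = set(elements) - set(FIELDS)
        if unknown:
            raise ValueError(f"elements not in template: {sorted(unknown)}")
        record = {name: "" for name in FIELDS}  # start from the empty template
        record.update(elements)
        record["label"] = label
        self.records[label] = record
        return record

    def header_text(self, label: str) -> str:
        """Essential info to speak at the start of a recording."""
        r = self.records[label]
        return (f"Recording {r['label']}, {r['date']}, "
                f"speaker {r['speaker']}, language {r['language']}.")

    def to_csv(self) -> str:
        """Export the logbook as CSV text, e.g. for opening in a spreadsheet."""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        for label in sorted(self.records):
            writer.writerow(self.records[label])
        return buffer.getvalue()
```

Because `add` refuses a label that is already in use, each object keeps exactly one record, and the same label on the tape, the file, and the logbook row ties them together.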

Metadata, cont.

A miscellany of suggestions:
- A link on the Metadata section to info about creating a subcommunity, e.g. comparativists;
- A bb for discussion about a thesaurus of Genre (Role, etc.) terms for particular regions/language families;
- “Find a way to lead people from Google into the OLAC search domain”;
- Add an element for linking to a standard like GOLD (like IMDI content-encoding?)

Metadata & Citations

We need to devise a standard format for citing archived language documentation resources and get it out there to be reviewed, discussed, improved, and adopted.

People must start getting real professional credit for language documentation work.

Citations

Two examples:

Sánchez Morales, Germán. (1994). "Satornino y los soldados." Heidi Johnson, (Researcher). Format=audio. [online] http://www.ailla.utexas.org: Archive of the Indigenous Languages of Latin America. ZOH001R010I001.wav.

Johnson, Heidi. (1994). "Satornino y los soldados." Germán Sánchez Morales, (Consultant). Format=interlinear text. [online] http://www.ailla.utexas.org: Archive of the Indigenous Languages of Latin America. ZOH001R010I001.txt.
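
If a citation format along these lines were adopted, citations could be generated directly from metadata records. The Python sketch below follows the punctuation of the second example above (including "[online]" without a period). The `ResourceRecord` field names and the `format_citation` function are hypothetical, and the pattern itself is only the proposal shown here, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class ResourceRecord:
    """Hypothetical metadata fields needed to cite one archived resource."""
    creator: str            # "Surname, Given name"
    year: int
    title: str
    contributor: str        # "Given name Surname"
    contributor_role: str   # e.g. "Researcher" or "Consultant"
    media_format: str       # e.g. "audio" or "interlinear text"
    archive_url: str
    archive_name: str
    resource_id: str        # the archive's identifier for the file


def format_citation(r: ResourceRecord) -> str:
    """Render a record in the pattern of the two examples."""
    return (
        f'{r.creator}. ({r.year}). "{r.title}." '
        f"{r.contributor}, ({r.contributor_role}). "
        f"Format={r.media_format}. [online] {r.archive_url}: "
        f"{r.archive_name}. {r.resource_id}."
    )
```

Generating the citation from the record, rather than typing it by hand, keeps the resource identifier and contributor roles consistent with what the archive holds.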

Citations

MLA has guidelines for citing online resources, BUT they don’t look like normal citations.

We want these to look as much as possible like other legitimate publications.

Also: we need to think of ways to achieve some sort of peer-review process for at least some archived language documentation.