Machine Translation (Level 2)


A grammar rule

SVE.GRAM CL.IMP
--------------
<& PHR.CAT> :=: 'CL,
<& TYPE> :=: 'IMP,
<* WORD.CAT> = 'VERB,
<& VERB WORD.CAT> :=: 'VERB,
<* INFF> = 'IMP,
<& VERB INFF> :=: <* INFF>,
<& REG V1.LEM> :=: <* LEM>,
<& VERB DIAT> :=: <* DIAT>,
ADVANCE,
(ADVERBIALS1// CONTINUE),
DO(<& REG V1.LEM :VERBCOMP>),
DO(<& VERB LEX :VERBACTION>),
ADVANCE,
(ADVERBIALS2, ADVANCE// CONTINUE),
(END.OF.SENT, STORE/ CL_COORD);

Inactive filter WORD.CAT or PHR.CAT;

Anna Sågvall Hein, GSLT, January 2003

Transfer rules

• copy feature
• delete feature
• transfer feature
• define feature

Copy feature

LABEL TYPE
SOURCE <* TYPE> = ?TYPE
TARGET <* TYPE> = ?TYPE
TRANSFER

Delete feature

LABEL REG
SOURCE <* REG> = ANY
TARGET <*> = <*>
TRANSFER

Transfer feature

LABEL OBJ.DIR

SOURCE <* OBJ.DIR> = ?OBJ.DIR1

TARGET <* OBJ.DIR> = ?OBJ.DIR2

TRANSFER ?OBJ.DIR1 <=> ?OBJ.DIR2


Define feature

LABEL växellåda
SOURCE <* word.cat> = noun
       <* lex> = växellåda.nn.1
TARGET <* word.cat> = noun
       <* lex> = gearbox.nn.1

TRANSFER


A contextual lexical rule

%% i universalstativ: on universal stand

LABEL i_universalstativ
SOURCE <* phr.cat> = pp
       <* prep lex> = i.pp.1
       <* prep word.cat> = prep
       <* pobj> = ?x1
       <* pobj df head lex> = universalstativ.nn.1
TARGET <* phr.cat> = pp
       <* prep lex> = on.pp.1
       <* prep word.cat> = prep
       <* pobj> = ?x2
TRANSFER
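A contextual rule of this kind fires only when every source-side constraint holds, including the one on the head noun of the prepositional object. A minimal Python sketch of that matching step follows. It assumes nested dicts as feature structures and reads a path such as `<* pobj df head lex>` as the sequence pobj, df, head, lex; the function names and the `transfer_pobj` callback are hypothetical.

```python
def get_path(fs, path):
    """Follow a sequence of feature names; return None if any step is missing."""
    for name in path:
        if not isinstance(fs, dict) or name not in fs:
            return None
        fs = fs[name]
    return fs

# Source-side constraints of the i_universalstativ rule as (path, value) pairs.
SOURCE_CONSTRAINTS = [
    (("phr.cat",), "pp"),
    (("prep", "lex"), "i.pp.1"),
    (("prep", "word.cat"), "prep"),
    (("pobj", "df", "head", "lex"), "universalstativ.nn.1"),
]

def matches(fs):
    """True only if every source constraint is satisfied."""
    return all(get_path(fs, path) == value for path, value in SOURCE_CONSTRAINTS)

def apply_rule(fs, transfer_pobj):
    """Build the target PP, or return None when the context does not match."""
    if not matches(fs):
        return None
    return {
        "phr.cat": "pp",
        "prep": {"lex": "on.pp.1", "word.cat": "prep"},
        # ?x1 is handed to the transfer component, whose result binds ?x2.
        "pobj": transfer_pobj(fs["pobj"]),
    }

pp = {
    "phr.cat": "pp",
    "prep": {"lex": "i.pp.1", "word.cat": "prep"},
    "pobj": {"df": {"head": {"lex": "universalstativ.nn.1"}}},
}
# An identity function stands in for the recursive transfer of the object.
target_pp = apply_rule(pp, lambda pobj: pobj)
```

With any other head noun in `pobj`, `apply_rule` returns `None`, so the preposition would be left to a more general rule.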

A generation rule

LABEL CL.IMP

X1 ---> X2 X3 X4 : = CL = = IMP = =

A trace


Language modules in the new organisation of MULTRA

• dictionary in a database with different views
  – excl. contextually defined lexical transfer rules
• Swedish grammar (excl. morphology)
• transfer grammar
  – incl. contextually defined lexical transfer rules
• generation grammar (excl. morphology)
  – temporary generation dictionary (lexemes)

The lexical database

• sv-en_LinkLexicon
• en_Inflections
• en_LemmaLexicon
• en_LexemeLexicon
• en_Lexicon
• en_StemLexicon
• sv_Inflections
• sv_LemmaLexicon
• sv_LexemeLexicon
• sv_Lexicon
• sv_StemLexicon

Scaling up the dictionary

          Swedish lemmas   English lemmas
before    369              184
now       20,883           6,562

Scaling up the Swedish-English lexeme dictionary

• word alignment with UWA
• lemmatising the Swedish one-word units in the word links
• heuristic lemmatisation of the English counterparts
• basic cleaning
• professional revision
• generation of lexemes
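The middle steps of this list (lemmatising both sides of the word links and basic cleaning) can be sketched in Python as follows. Everything in the sketch is an assumption made for illustration: the word links, the Swedish lemma table, the stop-word list and the suffix-stripping heuristic are hypothetical and are not the procedure or data actually used in the project.

```python
from collections import Counter

# Hypothetical word links (Swedish token, English token) from an aligner.
WORD_LINKS = [
    ("växellådan", "gearbox"),
    ("växellådor", "gearboxes"),
    ("växellådan", "the"),
]
# Hypothetical lookup standing in for the Swedish lemmatiser.
SV_LEMMAS = {"växellådan": "växellåda", "växellådor": "växellåda"}
# Hypothetical cleaning filter: links to function words are discarded.
STOPWORDS = {"the", "a", "an"}

def en_lemma(token):
    """Crude suffix stripping; a stand-in for the heuristic lemmatisation."""
    token = token.lower()
    for suffix, replacement in (("ies", "y"), ("es", ""), ("s", "")):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)] + replacement
    return token

def lemma_links(word_links):
    """Count lemma pairs, skipping unknown Swedish forms and stop words."""
    counts = Counter()
    for sv_token, en_token in word_links:
        sv = SV_LEMMAS.get(sv_token.lower())
        if sv is None or en_token.lower() in STOPWORDS:
            continue  # basic cleaning
        counts[(sv, en_lemma(en_token))] += 1
    return counts

links = lemma_links(WORD_LINKS)
```

A heuristic this crude over-strips many English words, which is one reason a cleaning step and a professional revision follow in the list.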

The MATS process in 10 steps

1. extracting text sentences from SGML documents
2. tokenisation
3. extracting lemmas (SL + TL), lexemes (SL + TL), linguistic codes and valency rules from the db
4. expansion of lemma, lexeme and valency codes
5. mosy-code → av-structure
6. parsing

Continuation (steps 7-10)

7. transfer + syntactic generation, av-structure → mosy-code
8. dictionary look-up in db
9. finish (capital letter etc.)
10. recreate SGML

Lexicalistic translation

• Identify (lexical) translation units in the source sentence
• Translate each unit separately (considering the context)
• Order the result in agreement with a model of the target language

(formulation due to Lars Ahrenberg; see further the KOMA project)

See also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
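The three steps can be illustrated with a toy Python sketch. It is an assumption-laden simplification: units are whitespace tokens, each unit is translated by a hypothetical one-to-one lexicon without looking at context, and the target-language model is a hand-written set of permitted bigrams. None of this is taken from KOMA or from Beaven's system.

```python
from itertools import permutations

# Hypothetical toy lexicon and target-language model.
LEXICON = {"nu": "now", "sover": "sleeps", "hon": "she"}
BIGRAMS = {("<s>", "now"), ("now", "she"), ("she", "sleeps"), ("sleeps", "</s>")}

def identify_units(sentence):
    """Step 1: identify translation units (here simply whitespace tokens)."""
    return sentence.lower().split()

def translate_unit(unit):
    """Step 2: translate each unit separately (context is ignored in this sketch)."""
    return LEXICON[unit]

def score(words):
    """Count how many adjacent pairs the target model accepts."""
    padded = ["<s>", *words, "</s>"]
    return sum((a, b) in BIGRAMS for a, b in zip(padded, padded[1:]))

def translate(sentence):
    """Step 3: order the bag of target words by the target-language model."""
    bag = [translate_unit(unit) for unit in identify_units(sentence)]
    best = max(permutations(bag), key=score)
    return " ".join(best)

translation = translate("Nu sover hon")
```

The Swedish verb-second order is not carried over, because the order comes only from the target model; here `translation` is "now she sleeps". Trying every permutation grows factorially with sentence length, so this brute-force search is only workable for very short inputs.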


Lexicalistic systems?

• Systran?

• Multra?

• Scots

Interlingua translation

• See SN

Assignment 2: Hable Con Ella (babelfish)

• Translate the text from English into one of your favourite languages using Babelfish
• Make a general quality assessment of the translation
• Specify what transfer rules were applied
• Formulate them in the Multra formalism
• Specify the translation errors that were made and discuss how they may be handled