Non-deterministic Tree Automata Models for Statistical Machine
Chiara Moraglia
Branch of computational linguistics
The study of mathematical structures and
methods that pertain to linguistics.
Combines aspects of computer science,
mathematics and linguistics.
Words: anchor
Sentences: Cleaning fluid can be dangerous.
Claire kicked the bucket.
Machine translation that keeps in mind the
problem of ambiguity.
A sequence of reordering decisions and word
translation decisions, each with a probability
assigned based upon linguistic data.
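A minimal sketch of how such a sequence of decisions could be scored, assuming the decisions are treated as independent so that their probabilities multiply. The decision names and the numbers are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical decisions: (kind, description, probability).
# None of these values come from real linguistic data.
decisions = [
    ("reorder", "move the verb after the object", 0.3),
    ("translate", "first word", 0.6),
    ("translate", "second word", 0.5),
]

def derivation_probability(decisions):
    """Product of the decision probabilities (independence is assumed)."""
    return math.prod(p for _, _, p in decisions)

def derivation_log_probability(decisions):
    """Same score in log space, which avoids underflow on long sequences."""
    return sum(math.log(p) for _, _, p in decisions)

print(derivation_probability(decisions))
```

Working in log space is a common design choice because products of many small probabilities quickly become too small to represent.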
Two main reordering models:
1) phrase-based models: re-align phrases
(strings of words)
2) syntax-based models: can use tree
transducers to permute trees (syntactic
structure) with words as leaves
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Generalize the work on tree automata and
tree transductions to non-deterministic
models and explore the equivalence
properties that were proven to hold in the
deterministic case.
A hierarchical collection of labeled nodes
connected by edges, starting at a root node
https://upload.wikimedia.org/wikipedia/commons/f/f7/Binary_tree.svg
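One way to picture this definition is as a small data structure. The sketch below assumes a `Node` class and syntactic labels (S, NP, VP) that are not in the slides; the words of an example sentence sit at the leaves.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A labeled node; the edges are the links to its children."""
    label: str
    children: List["Node"] = field(default_factory=list)

def leaves(node):
    """Return the leaf labels from left to right."""
    if not node.children:
        return [node.label]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result

# Hypothetical syntax tree; the root node is labeled S.
tree = Node("S", [
    Node("NP", [Node("Claire")]),
    Node("VP", [Node("kicked"),
                Node("NP", [Node("the"), Node("bucket")])]),
])

print(leaves(tree))
```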
A tree transducer is a 5-tuple <F, H, Q, qin, R> where
i) F is a functional signature of input symbols
ii) H is a functional signature of output symbols
iii) Q is a finite set of states
iv) qin ∈ Q is the initial state
v) R is a finite set of rules <q, φ> → ζ, where ζ is either
1) <q’, ψ>, or
2) h(<q1, ψ1>, …, <qk, ψk>)
φ gives the conditions the current node must satisfy;
ψ says which node to move to from the current node
(Courcelle & Engelfriet, 2012)
A functional signature is a set of function
symbols, each with an associated arity ρ(f)
(the number of arguments the function takes)
E.g. f(x), ρ(f)=1
h(x,y,z), ρ(h)=3
(Courcelle & Engelfriet, 2012)
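A signature can be sketched as a mapping from symbols to arities, together with a check that a term respects them. The dictionary name `RHO`, the nested-tuple encoding of terms, and the constant `c` are assumptions made for this example.

```python
# Arity map: f and h are the symbols from the example above;
# the constant c (arity 0) is hypothetical, added so terms can bottom out.
RHO = {"f": 1, "h": 3, "c": 0}

def is_well_formed(term, rho):
    """A term is a tuple (symbol, arg1, ..., argk) with k == rho[symbol]."""
    symbol, *args = term
    if symbol not in rho:
        return False
    if len(args) != rho[symbol]:
        return False
    return all(is_well_formed(arg, rho) for arg in args)

# h(c, f(c), c) uses every symbol with the right number of arguments.
print(is_well_formed(("h", ("c",), ("f", ("c",)), ("c",)), RHO))
# f with no argument violates rho(f) = 1.
print(is_well_formed(("f",), RHO))
```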
i) F = {f, a, b} where ρ(f) = 2, ρ(a) = ρ(b) = 0
ii) H = {a, b, ε} where ρ(a) = ρ(b) = 1, ρ(ε) = 0
iii) Q = {qin, q1, q2}
iv) qin ∈ Q is the initial state
v) R =
1) <qin, labf(x1)> → <qin, down1>
2) <qin, labx(x1) ∧ bri(x1)> → x(<qi, up>)
3) <q1, True> → <qin, down2>
4) <q2, bri(x1)> → <qi, up>
5) <q2, rt(x1)> → ε
6) <qin, labx(x1) ∧ rt(x1)> → x(<q2, stay>)
where x ∈ {a, b} and i ∈ {1, 2}
(Courcelle & Engelfriet, 2012)
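The six rules can be simulated directly. The sketch below rests on my reading of the notation: x ranges over a and b, i over 1 and 2, bri(x1) means "the current node is the i-th child", and rt(x1) means "the current node is the root". Trees are nested tuples, a node is addressed by its path of child indices from the root, and the input tree f(a, f(b, b)) is a hypothetical example.

```python
def subtree(tree, path):
    """Follow child indices (1-based) from the root; tree = (label, child1, child2)."""
    for i in path:
        tree = tree[i]
    return tree

def run(tree):
    """Walk the tree with rules 1-6 and collect the output symbols in order."""
    state, path, out = "qin", (), []
    while True:
        label = subtree(tree, path)[0]
        is_root = path == ()
        if state == "qin" and label == "f":      # rule 1: go to the first child
            path = path + (1,)
        elif state == "qin" and is_root:         # rule 6: a leaf that is the root
            out.append(label)
            state = "q2"
        elif state == "qin":                     # rule 2: leaf that is the i-th child
            out.append(label)
            state, path = "q%d" % path[-1], path[:-1]
        elif state == "q1":                      # rule 3: go to the second child
            state, path = "qin", path + (2,)
        elif state == "q2" and is_root:          # rule 5: finish with epsilon
            out.append("ε")
            return out
        else:                                    # rule 4: q2 at a non-root node
            state, path = "q%d" % path[-1], path[:-1]

def to_term(symbols):
    """Nest the unary output symbols, e.g. ['a', 'ε'] becomes 'a(ε)'."""
    term = symbols[-1]
    for symbol in reversed(symbols[:-1]):
        term = "%s(%s)" % (symbol, term)
    return term

print(to_term(run(("f", ("a",), ("f", ("b",), ("b",))))))
```

Under this reading the transducer lists the leaf labels of the input from left to right as a chain of unary symbols closed by ε. A plain loop is enough here because every right-hand side contains at most one state, so the computation never branches.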
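[Figure: state diagram of the example transducer with states qin, q1 and q2. Transitions are labeled with the rule conditions (f, True, a or b & 1st child, a or b & 2nd child, a or b & root, 1st child, 2nd child, root) and the outputs a or b(...) and ε.]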
[Figure: an input tree (only the labels f and a survive) and its output tree a(b(b(ε))).]
A tree transducer is deterministic if the state
and the position in the tree uniquely
determine which rule is applied.
Otherwise, it is non-deterministic.
E.g. <qin, labf(x1)> → <qin, down>
<qin, labf(x1)> → <q1, up>
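A rough way to spot this kind of non-determinism is to group rules by their left-hand side. This sketch only compares conditions as strings, which is a simplification: a full check would have to decide whether two different conditions can hold at the same node. The tuple encoding of rules is an assumption.

```python
from collections import defaultdict

def conflicting_rules(rules):
    """Return the left-hand sides <state, condition> that have several right-hand sides."""
    by_lhs = defaultdict(list)
    for state, condition, rhs in rules:
        by_lhs[(state, condition)].append(rhs)
    return {lhs: rhss for lhs, rhss in by_lhs.items() if len(rhss) > 1}

# The two rules from the example above, encoded as (state, condition, rhs).
rules = [
    ("qin", "labf(x1)", ("qin", "down")),
    ("qin", "labf(x1)", ("q1", "up")),
]

print(conflicting_rules(rules))
```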
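[Figure: state diagram of a non-deterministic transducer with the single state qin. At a node labeled f it can output g(<f, >) or h(<f, >); at a node labeled a (<a, stay>) it outputs a.]
Modified from (Fülöp, 1981)
[Figure: the input f(f(a)) and its possible outputs g(g(a)), h(h(a)), g(h(a)) and h(g(a)).]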
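The set of possible outputs can be enumerated by trying every choice at every node. The sketch assumes my reading of the diagram: each f may be rewritten to either g or h, and every other label is copied unchanged. Trees are nested tuples, as an illustrative encoding.

```python
from itertools import product

def possible_outputs(tree):
    """All output trees obtained by choosing g or h independently at each f."""
    label, *children = tree
    choices = ["g", "h"] if label == "f" else [label]
    child_options = [possible_outputs(child) for child in children]
    return [(new_label, *combo)
            for new_label in choices
            for combo in product(*child_options)]

# Input f(f(a)): two f nodes, so 2 * 2 = 4 possible outputs.
for output in possible_outputs(("f", ("f", ("a",)))):
    print(output)
```

The number of outputs doubles with every f in the input, which is one reason to rank the candidates by probability rather than treat them all alike.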
The possible output trees would be assigned
probabilities.
Then the words would be translated into the
target language.
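A minimal sketch of the first step, assuming each candidate output tree receives a non-negative weight that is then normalized into a probability. The weights below are hypothetical; in practice they would be estimated from linguistic data.

```python
def normalize(weights):
    """Turn non-negative weights into probabilities that sum to 1."""
    total = sum(weights.values())
    return {tree: weight / total for tree, weight in weights.items()}

# Hypothetical weights for four candidate output trees.
weights = {"g(g(a))": 4.0, "g(h(a))": 3.0, "h(g(a))": 2.0, "h(h(a))": 1.0}

probabilities = normalize(weights)
best = max(probabilities, key=probabilities.get)

print(probabilities)
print(best)
```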
Courcelle, B., & Engelfriet, J. (2012). Graph Structure and
Monadic Second-Order Logic. Cambridge: Cambridge University Press.
Fülöp, Z. (1981). On attributed tree transducers. Acta Cybernetica,
5, 261-279.
Knight, K., & Koehn, P. (2003). What’s new in statistical machine
translation [PDF document]. Retrieved from
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Tree (data structure). (n.d.). Retrieved from
https://upload.wikimedia.org/wikipedia/commons/f/f7/Binary_tree.svg