Finite-state automata 3
Morphology
Day 14
LING 681.02
Computational Linguistics
Harry Howard
Tulane University
Course organization
• http://www.tulane.edu/~ling/NLP/
• NLTK is installed on the computers in this room!
• How would you like to use the Provost's $150?
25-Sept-2009
LING 681.02, Prof. Howard, Tulane University
SLP §2.2
Finite-state automata
2.2.6 Recognition as search
Non-deterministic recognition: Search
• In a non-deterministic FSA, every string in the language defined by the machine has at least one path through the machine that leads to an accept state.
• There is no path through the machine that leads to an accept state for a string not in the language.
• But not every path through the machine for an accepted string leads to an accept state.
Non-deterministic recognition
• So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept state.
• Failure occurs when all of the possible paths for a given string lead to failure.
Back to the example
[Figure: non-deterministic FSA with states q0, q1, q2, q3, q4 and arcs labeled b, a, a, a, !]
Example
[Figure: step-by-step trace of the search on the input tape b a a a !, showing the machine state at each tape position, including one dead-end path (marked X)]
Summary
• States in the search space are pairings of tape positions and states in the machine.
• By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.
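The summary can be sketched in Python. This is a minimal illustration, not the textbook's algorithm verbatim: the transition table is an assumed encoding of a sheep-language machine (/baa+!/) with a non-deterministic choice in q2, and the names `TRANSITIONS`, `START`, `ACCEPT`, and `recognize` are hypothetical. The machine is assumed to have no epsilon arcs.

```python
from collections import deque

# Assumed machine for the sheep language /baa+!/ (hypothetical encoding).
# Non-determinism: in q2, reading "a" may stay in q2 or move to q3.
TRANSITIONS = {
    ("q0", "b"): ["q1"],
    ("q1", "a"): ["q2"],
    ("q2", "a"): ["q2", "q3"],
    ("q3", "!"): ["q4"],
}
START = "q0"
ACCEPT = {"q4"}


def recognize(tape, transitions=TRANSITIONS, start=START, accept=ACCEPT):
    """Return True if at least one path through the machine accepts the tape."""
    # A search state pairs a machine state with a tape position.
    agenda = deque([(start, 0)])
    while agenda:
        state, pos = agenda.pop()  # take an as yet unexplored search state
        if pos == len(tape):
            if state in accept:
                return True  # success: one accepting path is enough
            continue  # end of tape in a non-accept state: this path fails
        # Every arc that matches the current symbol yields a new search state.
        for nxt in transitions.get((state, tape[pos]), []):
            agenda.append((nxt, pos + 1))
    return False  # failure: every possible path has been tried


print(recognize("baaa!"))  # True
print(recognize("ba!"))    # False
```

Because the tape position grows with every step and there are no epsilon arcs, the agenda always empties, so the search terminates.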
Keeping track
• But how do you keep track?
• Depth-first/last in first out (LIFO)/stack
Unexplored states are added to the front of the agenda, and they are explored by going to the most recent.
• Breadth-first/first in first out (FIFO)/queue
Unexplored states are added to the back of the agenda, and they are explored by going to the oldest.
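The two agenda disciplines can be compared with a small sketch. The `successors` table below is a hand-built, hypothetical search space (pairs of machine state and tape position for a sheep-language machine reading b a a a !), and `exploration_order` is an illustrative name. The next state is always taken from the front of the agenda; only the end at which new states are added changes.

```python
from collections import deque

# Hypothetical search space: each search state (machine state, tape position)
# maps to the search states reachable from it. States not listed are dead ends.
successors = {
    ("q0", 0): [("q1", 1)],
    ("q1", 1): [("q2", 2)],
    ("q2", 2): [("q2", 3), ("q3", 3)],
    ("q2", 3): [("q2", 4), ("q3", 4)],
    ("q3", 4): [("q4", 5)],
}


def exploration_order(successors, start, depth_first=True):
    """List search states in the order in which the agenda hands them out."""
    agenda = deque([start])
    order = []
    while agenda:
        current = agenda.popleft()  # always explore the front of the agenda
        order.append(current)
        new_states = successors.get(current, [])
        if depth_first:
            # LIFO: new states go to the front, so the newest is explored next.
            agenda.extendleft(reversed(new_states))
        else:
            # FIFO: new states go to the back, so the oldest is explored next.
            agenda.extend(new_states)
    return order


dfs = exploration_order(successors, ("q0", 0), depth_first=True)
bfs = exploration_order(successors, ("q0", 0), depth_first=False)
print(dfs)
print(bfs)
```

Depth-first follows one choice as far as it goes before backing up to ("q3", 3); breadth-first visits both choices at tape position 3 before moving on to position 4.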
Depth-first/LIFO/stack
[Figure: the agenda drawn as a stack of unexplored states; state labels shown: q2, q12, q27, q18, q31, q41, q50]
Breadth-first/FIFO/queue
[Figure: the agenda drawn as a queue of unexplored states; state labels shown: q2, q12, q27, q18, q31, q41, q50]
SLP §2.2
Finite-state automata
2.2.7 Comparison
Equivalence
• Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.
• That means that they have the same power: non-deterministic machines are not more powerful than deterministic ones in terms of the languages they can accept.
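One standard way to carry out such a conversion is the subset construction, in which each deterministic state stands for a set of non-deterministic states. The slides do not say which construction is meant, so the sketch below is an assumption; the sheep-language machine and the names `determinize` and `dfa_accepts` are hypothetical, and epsilon arcs are not handled.

```python
from collections import deque

# Assumed non-deterministic machine for /baa+!/ (hypothetical encoding).
NFA = {
    ("q0", "b"): ["q1"],
    ("q1", "a"): ["q2"],
    ("q2", "a"): ["q2", "q3"],
    ("q3", "!"): ["q4"],
}


def determinize(transitions, start, accept, alphabet):
    """Subset construction for a machine without epsilon arcs."""
    start_set = frozenset([start])
    dfa = {}            # (set of NFA states, symbol) -> set of NFA states
    dfa_accept = set()  # state sets that contain an NFA accept state
    seen = {start_set}
    agenda = deque([start_set])
    while agenda:
        current = agenda.popleft()
        if current & accept:
            dfa_accept.add(current)
        for symbol in alphabet:
            # Collect every NFA state reachable from the set on this symbol.
            target = frozenset(
                nxt
                for state in current
                for nxt in transitions.get((state, symbol), [])
            )
            if not target:
                continue  # no arc: the deterministic machine fails here
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                agenda.append(target)
    return dfa, start_set, dfa_accept


def dfa_accepts(dfa, start_set, dfa_accept, tape):
    """Run the deterministic machine: one path, no search needed."""
    current = start_set
    for symbol in tape:
        current = dfa.get((current, symbol))
        if current is None:
            return False
    return current in dfa_accept


dfa, start_set, dfa_accept = determinize(NFA, "q0", {"q4"}, "ba!")
print(dfa_accepts(dfa, start_set, dfa_accept, "baaa!"))  # True
print(dfa_accepts(dfa, start_set, dfa_accept, "ba!"))    # False
```

For this machine the choice in q2 is absorbed into the single deterministic state {q2, q3}, so recognition no longer needs an agenda.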
Why bother?
• Non-determinism doesn’t get us more formal power and it causes headaches, so why bother?
• More natural (understandable) solutions.
SLP §3
Words and transducers
Intro
Concepts and terminology
• study of spelling: orthography
• study of word composition: morphology
• to build a structured representation of a word or sentence: parsing
• input to this process: surface or input form
• a process that applies without limitations: productive
• Can all forms be stored in advance?
Concepts and terminology
• the minimal meaning-bearing unit in a language: morpheme
• the main unit: stem
• additional units: affix
• a unit that:
  • precedes the main one: prefix
  • follows the main one: suffix
  • surrounds the main one: circumfix
  • is inserted within the main one: infix
• a language in which the main unit can have many additional units: agglutinative
Concepts and terminology
• Combining an affix with a stem in a way that does not change the part of speech of the stem: inflection
• Combining an affix with a stem in a way that DOES change the part of speech of the stem: derivation
• Combining multiple stems: compounding
• Combining a stem with a phonologically reduced stem: cliticization
SLP §3
Words and transducers
§3.1 Survey of (mostly) English morphology
Inflectional morphology
stem    -s        -ing       preterite  past part.
walk    walks     walking    walked     walked
try     tries     trying     tried      tried
map     maps      mapping    mapped     mapped
eat     eats      eating     ate        eaten
catch   catches   catching   caught     caught
be      is        being      was        been
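The table hints at one answer to whether all forms must be stored in advance: regular forms can be computed from the stem by productive rules, and only the irregular cells need to be listed. The sketch below is a rough illustration under that assumption. The spelling rules are simplified heuristics (for instance, consonant doubling ignores stress, so a stem like visit would be mishandled), and `IRREGULAR`, `regular_forms`, and `inflect` are hypothetical names.

```python
VOWELS = set("aeiou")

# Only the cells that the rules below cannot derive are stored.
IRREGULAR = {
    "eat": {"preterite": "ate", "past_part": "eaten"},
    "catch": {"preterite": "caught", "past_part": "caught"},
    "be": {"s": "is", "ing": "being", "preterite": "was", "past_part": "been"},
}


def _consonant_y(stem):
    """True for stems like try: final y preceded by a consonant."""
    return len(stem) >= 2 and stem[-1] == "y" and stem[-2] not in VOWELS


def _doubles(stem):
    """Rough test for final-consonant doubling (map -> mapping)."""
    if len(stem) < 3:
        return False
    c1, v, c2 = stem[-3], stem[-2], stem[-1]
    return (c1 not in VOWELS and v in VOWELS
            and c2 not in VOWELS and c2 not in "wxy")


def regular_forms(stem):
    """Derive the four inflected forms by simplified spelling rules."""
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        s_form = stem + "es"            # catch -> catches
    elif _consonant_y(stem):
        s_form = stem[:-1] + "ies"      # try -> tries
    else:
        s_form = stem + "s"

    if stem.endswith("e") and not stem.endswith("ee"):
        ing_form = stem[:-1] + "ing"    # silent e is dropped before -ing
    elif _doubles(stem):
        ing_form = stem + stem[-1] + "ing"  # map -> mapping
    else:
        ing_form = stem + "ing"

    if stem.endswith("e"):
        ed_form = stem + "d"
    elif _consonant_y(stem):
        ed_form = stem[:-1] + "ied"     # try -> tried
    elif _doubles(stem):
        ed_form = stem + stem[-1] + "ed"    # map -> mapped
    else:
        ed_form = stem + "ed"

    return {"s": s_form, "ing": ing_form,
            "preterite": ed_form, "past_part": ed_form}


def inflect(stem):
    """Regular rules first, then stored irregular cells override them."""
    forms = regular_forms(stem)
    forms.update(IRREGULAR.get(stem, {}))
    return forms


for verb in ["walk", "try", "map", "eat", "catch", "be"]:
    print(verb, inflect(verb))
```

Note that eat and catch still get their -s and -ing forms from the rules; only their past forms come from the stored list.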
Next time
P4
SLP §3.2ff