Nordtalk 2002 - Spoken dialogue


Hans Dybkjær
SpeechLogic™, Prolog Development Center A/S
&
Laila Dybkjær
NISLab, University of Southern Denmark
SpeechLogic
& NISLab
Measuring transaction success
in spoken dialogue information systems
Nordtalk 2002
2002-12-05
LD/HD
Assessing results?
• Subjective listening
– Fine and important
– Not suitable for contracts
– Not suited for tracing progress
– Very dependent on mood of caller
• Transcript walkthroughs
– Fine, provides many observations
– Not suitable for contracts
– Not suited for tracing progress
• Transaction coding
– Suitable for contracts
– Suitable for tracing progress?
– Huge work...

Project and partners
• Holiday Account (“FerieKonto”) spoken dialogue service via the telephone
• September 2001 – December 2002
• Supported by the Danish government
• Three Danish partners:
– NISLab, SDU
– Prolog Development Center A/S (PDC)
– ATP-huset (hosts FerieKonto and other funds)
• Philips Speech Processing is sub-contractor to PDC
• Employers pay 700 million kr. to FerieKonto per year
• About 12,000 callers per year selected “general information” in the old touch-tone system

Facts on FAQ
• Phase 1, called “Vejled”, in operation since September
• Phase 2, FAQ, in operation from mid-December 2002
• Dialogue model
– About 40 A4-pages
– 80 semantic concepts in input
– 100+ different information stories in output
– About 800 (full) words in vocabulary
– About 2500 grammar lines (context-free with synthesized attributes)
– 450 pre-recorded phrases, many long

Characteristics
• System takes initiative and guides user
– User may take initiative and control system
• Barge-in, i.e. the user may interrupt the system
– But we don’t know where: for long output we don’t know how much of the logged output the user has heard
• Whatever the user says is recognised as something within the system vocabulary and grammar
• No sound output logged, only user input

Transactions
• No clear definition of transaction
• One dialogue may be one transaction (e.g. ticket
reservation or train information)
• One dialogue may contain several different
transactions (e.g. frequently asked questions)
• A simple way of looking at transactions:
– Start
– End (success, failure)
• Relate these to dialogue acts

Examples
• Success:
U: What is your fax number
S: Fax number ...
• Failure:
U: What is your fax number
S: E-mail address ...
• Wrong = unwanted reply:
S: Do you want our address?
U: No.
S: Our address is ...
(user gets unwanted information – not a transaction)
• Wrong = erroneous information:
S: Fax number 36 36 00 00
(actually PDC’s fax is 36 36 00 01)
• (’Wrong’ is outside the transaction scheme)
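The four example cases can be read as a small decision rule. The sketch below is one possible reading, not the coding scheme itself: the `Exchange` structure, its field names and the `classify` function are hypothetical and only mirror the examples on this slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Exchange:
    """One user request and the system reply that follows (hypothetical structure)."""
    asked_topic: Optional[str]      # topic the user asked for; None if the user asked for nothing
    replied_topic: str              # topic the system actually talked about
    reply_is_correct: bool = True   # False if the information given is itself erroneous


def classify(exchange: Exchange) -> str:
    """Return 'success', 'fail' or 'wrong' following the slide's examples."""
    if exchange.asked_topic is None:
        # Unwanted reply: the user did not ask for this, so it is not a transaction.
        return "wrong"
    if exchange.replied_topic != exchange.asked_topic:
        # The user asked about one topic and was answered on another.
        return "fail"
    if not exchange.reply_is_correct:
        # Right topic, but the information is erroneous.
        return "wrong"
    return "success"


if __name__ == "__main__":
    print(classify(Exchange("fax", "fax")))           # fax asked, fax given
    print(classify(Exchange("fax", "email")))         # fax asked, e-mail given
    print(classify(Exchange(None, "address")))        # user said no, address given anyway
    print(classify(Exchange("fax", "fax", False)))    # fax given, but with a wrong number
```

Whether a reply on the right topic is also factually correct cannot be read off the log alone, which is why `reply_is_correct` is an explicit input here rather than something the function computes.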

Dialogue acts
Act             Example
Offer/question  Should I repeat the address?
Information     Email [email protected]
Feedback        If you are an employee…
Accept          Yes
Reject          No thanks
Selection       Employee
Other           Who is most beautiful in this country?

Resulting tag set
Tag      Type                        Explanation
accept   <U a>                       User accepts system offer
repair   <S mf><S q> | <U r> <U q>   Corrections
other    <o>                         Unclear or null
offer    <S q>                       System offers information to user
reject   <U r>                       User rejects offer
select   <U s>                       User selects from offer list
check    <S tf(T)q(T)>               System makes explicit feedback
discard  n/a                         Discard call
fail                                 Transaction ends in failure
start                                New task initiated
success                              Transaction ends in success
wrong                                System responds with wrong information or topic
Type entries listed for fail, start, success and wrong (row alignment not recoverable): <U q(T)>, <U q(T)>..., <S f(T’) | f(bad T)>
An example dialogue
• S 0.1: - Welcome to FerieKonto's speech-based guidance system. - Here you can get standard information about payment of holiday allowance. - You can always ask for help or repetition.
  (Opening)
• S 0.2: - Say whether you - are an employee or unemployed in Denmark - are deregistered from the national register and live abroad - or have left the labour market because of age or health. - PAUSE5000
  (Offer)
• U 0.1: I am an employee
  Start (Select)
• S 1.1: - If you are an employee in Denmark or unemployed and have a FerieKonto certificate
  (Feedback)
• S 1.2: - you must fill it in and send it to us. - Besides yourself, your employer or your unemployment insurance fund or the social services department must sign. - PAUSE200
  (Information)
• S 1.3: - If the FerieKonto certificate is correctly filled in when we receive it, we transfer the holiday allowance to your account. - The processing time is about five working days. - However, you can get your holiday allowance at the earliest one month before your first day of holiday. - PAUSE1000
  (Information) Success
• S 1.4: - Do you want to hear the information again? - PAUSE5000
  Offer (End)
• U 1.1: no thanks
  Reject
• S 2.1: - Do you want to hear FerieKonto's address? - PAUSE5000
  Offer
• U 2.1: no thanks
  Reject
• S 3.1: - Do you want to go back to the start of the general guidance? - PAUSE5000
  Offer
• Event 1: nothing recorded
  (Other)
• S 4.1: - Sorry. - I did not hear what you said. - Say 'yes' to start over, 'goodbye' to leave the system. - PAUSE5000
  Repair
• U 4.1: no thanks
  Reject
• S 5.1: - Sorry. - The system can give information about payment if you are an employee or unemployed, are deregistered from the national register, or have left the labour market because of age or health. You can also ask for FerieKonto's address. - Say one of these options if you want to continue, or say 'goodbye' to end. - PAUSE5000
  Repair
• Event 2: nothing recorded
  (Other)
• S 6.1: - Goodbye and thank you.
  (Closing)
• Event 3: hangup
  (Other)

Another example dialogue
• S 0.1: - PAUSE1000 - Welcome to FerieKonto's automatic guidance. - You can let the system suggest topics or you can try using your own words. - You choose topics by speaking to the system.
  (Opening)
  - Choose between the topics: - PAUSE200 - 'employee'. - PAUSE200 - 'on leave'. - PAUSE200 - 'living abroad'. - PAUSE200 - or 'left the labour market' because of 'age' or 'health'. - PAUSE500
  Offer
  - Say a topic, say 'repeat' or ask for 'more options'?
  (Request)
• U 0.1: I am a student
  Start
• S 1.1: - Did you ask about - Main menu
  Check
• U 1.1: student
  Repair
• S 2.1: - If you are a student and receive SU (the state education grant), you can have your holiday allowance paid out even if you do not take time off from your studies. - But if you have a job alongside your studies, you must take time off from your job to have the holiday allowance paid out.
  (Information)
• S 2.2: - If you have a job, it is your employer who must sign your FerieKonto certificate; otherwise it is the social services department in your municipality.
  (Information) Success
• S 2.3: - PAUSE500

Another, continued
• S 2.4: - Is there anything else you would like to ask about?
  Offer (Request)
• U 2.1: should I write Saturdays on
  Start
• S 3.1: - From the holiday year 2002, which starts on 1 May, the holiday week has been changed to 5 holiday days, and Saturdays are therefore no longer counted as part of the holiday.
  (Information) Success
• S 3.2: - Do you want to know something about 'the holiday year'?
  Offer
• U 3.1: no goodbye
  (Reject + request) Start
• S 4.1: - Goodbye and thank you for your call.
  (Feedback) Success
• Event 1: nothing recorded
  (Other)
• Event 2: disconnect
  (Other)
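As a rough illustration of how such an annotated dialogue breaks down into transactions, the sketch below groups a tag sequence into spans running from `start` to `success` or `fail`. It assumes that the labels written without parentheses in the dialogue above are the transaction tags. The function name and the handling of unfinished transactions are assumptions made for the example, not part of the coding scheme.

```python
from typing import List, Optional, Tuple

END_TAGS = {"success", "fail"}


def segment_transactions(tags: List[str]) -> List[Tuple[str, ...]]:
    """Group a flat tag sequence into completed transactions (start ... success/fail)."""
    transactions: List[Tuple[str, ...]] = []
    current: Optional[List[str]] = None
    for tag in tags:
        if tag == "start":
            # A new task begins; an unfinished transaction, if any, is dropped (assumption).
            current = ["start"]
        elif current is not None:
            current.append(tag)
            if tag in END_TAGS:
                transactions.append(tuple(current))
                current = None
        # Tags outside any open transaction (e.g. a leading offer) are ignored.
    return transactions


# Transaction tags of "Another example dialogue", in order of appearance.
ANOTHER_EXAMPLE = [
    "offer", "start", "check", "repair", "success",
    "offer", "start", "success",
    "offer", "start", "success",
]

if __name__ == "__main__":
    for transaction in segment_transactions(ANOTHER_EXAMPLE):
        print(" -> ".join(transaction))
```

Read this way, the dialogue contains three transactions, all ending in success, and the first one needed a check and a repair before it succeeded.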

Transaction annotated data
• Test: 225 calls, three batches, March-May 2002
– Batch 1: primarily developers
– Batches 2 and 3: “invited” test persons
• Operation: 217 calls, one week, September 2002
– real customers with real problems
• Dataset:
– Vejled: A few thousand calls
– About 500 FAQ test calls

Annotation
• Transcribed using Philips Transcription Station
– Then transformed to XML and web
• Markup was done using an annotation tool developed
by PDC
– interface is a browser window
– annotation files stored in XML
• All dialogues annotated by same, experienced coder,
using the same coding scheme throughout
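The slides do not show the XML format of the annotation files. Purely as an illustration, the sketch below assumes a hypothetical layout in which each turn is a `turn` element carrying an optional `tag` attribute, and tallies the tags of one dialogue with Python's standard `xml.etree.ElementTree`. Element names, attribute names and the sample content are all invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotation file content; the real PDC format is not shown in the slides.
SAMPLE = """
<dialogue id="demo">
  <turn speaker="S" id="0.1" tag="offer"/>
  <turn speaker="U" id="0.1" tag="start"/>
  <turn speaker="S" id="1.1" tag="check"/>
  <turn speaker="U" id="1.1" tag="repair"/>
  <turn speaker="S" id="2.2" tag="success"/>
  <turn speaker="S" id="2.3"/>
</dialogue>
"""


def count_tags(xml_text: str) -> Counter:
    """Count transaction tags over all turns; turns without a tag are skipped."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        turn.get("tag") for turn in root.iter("turn") if turn.get("tag") is not None
    )


if __name__ == "__main__":
    for tag, count in sorted(count_tags(SAMPLE).items()):
        print(tag, count)
```

Summing such per-dialogue counters over a batch of calls would give one column of a tag-count table.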

Results table
Tag                          Test1  Test2  Test3  Total   SetA
accept                          30     38     10     78     37
discard                         13     41     34     88    118
fail                            18      8     10     36     19
offer                           99    125     84    303    231
other                           16     31     14     61     43
reject                          37     75     50    162     77
repair                          22     11     12     45     44
start                          142    139     72    353    120
success                        153    168     81    402    133
wrong                           26      6      9     41     18
Calls with fail                 11      8      9     28     19
Total no. of calls              50    104     71    225    217
Transaction success percent   89.5   95.5   89.0   91.8   87.5
Smooth call percent           70.3   87.3   75.7   79.6   80.8
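The slides do not spell out how the two percentages are computed. The sketch below uses formulas that are an assumption on my part but that reproduce every figure in the table: transaction success as `success / (success + fail)`, and smooth calls as the share of non-discarded calls that contain no `fail`. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchCounts:
    success: int
    fail: int
    discard: int
    calls_with_fail: int
    total_calls: int


# Counts copied from the results table.
BATCHES = {
    "Test1": BatchCounts(success=153, fail=18, discard=13, calls_with_fail=11, total_calls=50),
    "Test2": BatchCounts(success=168, fail=8, discard=41, calls_with_fail=8, total_calls=104),
    "Test3": BatchCounts(success=81, fail=10, discard=34, calls_with_fail=9, total_calls=71),
    "Total": BatchCounts(success=402, fail=36, discard=88, calls_with_fail=28, total_calls=225),
    "SetA": BatchCounts(success=133, fail=19, discard=118, calls_with_fail=19, total_calls=217),
}


def transaction_success_percent(b: BatchCounts) -> float:
    """Assumed formula: successful transactions over all ended transactions."""
    return 100.0 * b.success / (b.success + b.fail)


def smooth_call_percent(b: BatchCounts) -> float:
    """Assumed formula: non-discarded calls without any fail, over non-discarded calls."""
    kept = b.total_calls - b.discard
    return 100.0 * (kept - b.calls_with_fail) / kept


if __name__ == "__main__":
    for name, counts in BATCHES.items():
        print(
            name,
            round(transaction_success_percent(counts), 1),
            round(smooth_call_percent(counts), 1),
        )
```

Under this reading, discarded calls are left out of the smooth call percentage entirely. That matters most for SetA, where 118 of the 217 calls were discarded.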

Results comments
• Higher transaction success in test dialogues than in operational calls
• Primary causes of failure in test sets are:
– Dialogue model
– Language model
– Causes corrected before operation
• Difference in user groups
– Test users follow the dialogue; they only have artificial problems
• Primary causes of failure in operational calls are:
– Real customers ask for information not covered
– Typical questions to be covered by FAQ
• Problem with callers hanging up without saying anything in the dialogue

Smooth dialogues
• More precise overview of problems and their causes and
seriousness
– Same topic may have fail and success in same call
– Few or many repairs
– Distinction between unwanted and erroneous information
– Erroneous information is unacceptable (tomorrow is Friday, phone 36 36 00 01)
– Other information than asked for may be more or less serious (fax instead of phone, fax instead of email)
– Mistaking a yes for a no is usually not so serious (repairable) but can be a nuisance
– Misrecognitions
– Information blocks may contain more than asked for