Multimedia Communications, Dr. Abdulmotaleb El Saddik

www.site.uottawa.ca/~elsaddik
www.el-saddik.com
SEG 3210
User Interface Design & Implementation
Prof. Dr. Abdulmotaleb El Saddik
University of Ottawa (CBY A211)
(613) 562-5800 x 6277
elsaddik @ site.uottawa.ca
abed @ mcrlab.uottawa.ca
Unit E-Guidelines
(c) elsaddik
Unit E: Design Guidelines
1. A General Meta-Guideline
2. Interaction Styles vs. Interaction Elements
3. Coding Techniques and Visual Design
4. Response Time
5. Feedback and Error Handling
6. Command-Based Interfaces
7. Menu Driven Systems
8. Keyboard Shortcuts
9. Forms-Based Interfaces
10. Organizing a Windowing Interface
11. Question and Answer Interfaces
12. Information Query Interfaces
13. Voice I/O
14. Natural Language Interfaces
15. Other Types of I/O
16. Localization and Internationalization
17. On-Line Help
18. Guidelines and Standards Documents
I/O Techniques, Devices, and Technologies
An interaction technique is
• A way to carry out an interactive task
• Based on using a set of input and output devices or technologies
Input devices
• indirect pointing devices: mice, trackballs, joysticks, …
• direct pointing devices: tablets, touchscreens, touchpads, …
• keyboards
• microphones
• video cameras
• body sensors
• dolls (e.g., Barney)
• …
Input technologies
• speech recognition
• speaker recognition
• machine vision
• eye tracking
• head tracking
• gesture tracking
• touch sensing
• pressure sensing
• …
• Devices that detect touch position: touchscreens, touchpads, tablets.
• can have a display or just an input tablet
• many devices also detect the amount of pressure used when touching
• possibilities for pressure-sensitive interaction techniques
• Touch-sensing: sensing touch in mice and trackballs
(Ken Hinckley, Microsoft)
• Programs can react when the user touches the device
Output technologies
• speech synthesis
• audio-visual speech synthesis (e.g., a talking
head, an artificial person)
• haptic/tactile feedback
• stereoscopic displays
• …
Output processing hardware
• sound cards
• graphics cards
• interface cards
• …
Output devices
• monitors
• earphones
• loudspeakers
• tactile devices: technology has advanced to the level that allows us to mimic real touch sensations when using computers
• dolls (e.g., Barney)
• …
13. Voice I/O
Consider voice input in the following
circumstances:
• Hands are busy
• Environment is dirty
• Small device with keyboard not available
• Dictation by a slow typist
Structure voice command input so:
• Only one or two words are required
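The one-or-two-word guideline can be checked mechanically when a command vocabulary is defined. The following is a minimal sketch, assuming a hypothetical vocabulary; the phrases, action names, and the helper names check_vocabulary and interpret are illustrative and do not come from the slides.

```python
# Hypothetical command vocabulary; phrases and action names are illustrative.
COMMANDS = {
    "stop": "halt_machine",
    "start": "start_machine",
    "next page": "show_next_page",
    "zoom in": "zoom_in",
}

MAX_WORDS = 2  # guideline: only one or two words are required


def check_vocabulary(commands):
    """Return the phrases that break the one-or-two-word guideline."""
    return [p for p in commands if not 1 <= len(p.split()) <= MAX_WORDS]


def interpret(utterance):
    """Map a recognized utterance to an action name, or None if unknown."""
    phrase = " ".join(utterance.lower().split())  # normalize case and spacing
    return COMMANDS.get(phrase)
```

Running check_vocabulary over the vocabulary at design time flags any phrase that has grown longer than two words before it reaches users.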
Voice I/O (Why Speech?)
• Natural:
• speech is the most efficient, popular and wide-spread way to
communicate
• Efficient:
• in many cases speech is the most efficient communication method
[Chapanis, 1975]
• Expressive:
• Some things are quite impossible to express without using speech
(or natural language in general)
• Popular and preferred:
• Some people use verbal-acoustic problem solving methods
instead of visual-spatial (GUI-oriented) methods [Bradford, 1995]
Motives to use speech in user interfaces
1. The only possible method
2. The most efficient method
3. The most preferred method
4. A supportive method
5. An alternative method
6. A substitutive method
It is important to know the motives behind the use of speech!
Restrictions of speech in user interfaces
Restrictions related to the characteristics of speech
• transient, slow and serial
• Public
• the need to remember commands or key-words
Restrictions related to the communication skills
• discrete vs. continuous style of speaking
• interruptions
• the use of pauses
Problems of speech recognition
• Variations in speech-signal
• Linguistic variations
• phonetic, syntactic, semantic and discourse related variations
• Acoustic variations
• communication channel and environment related variations
• Speaker related variations
• internal and interpersonal variations
• Natural speech contains ungrammatical elements: hesitations, false starts, repairs, breaks, etc.
Voice output
Consider voice output in the following
circumstances:
• Eyes are busy or user will be away from screen
• Vision might not be possible
• e.g., blind users, variable lighting, fumes
• User has to perform a monitoring function
• Sporadic warning signals are better picked up if they are vocal
• Voice works best if there are numerous screens
• Pilots prefer voice to be used for warning messages, but visual
displays for everything else
• But when continually engaged in a task, pilots respond faster to
visual messages
Avoid voice output where the following are
important:
• Privacy
• Security
• Not disturbing others (offices)
Voice output (synthesized or recorded)
Recorded speech
• Excellent quality
• Natural
• Feasible only when the number of different phrases is limited
• Hard to make changes later
• we should get the same person to read new material
• also recording conditions etc. should be similar
• Each possible phrase should be recorded
• combining material from different recordings sounds unnatural and
unpleasant
Synthesized speech
• Quality is still significantly worse than natural speech
• Very flexible
• Mature technology
• Stable and reliable software available
• Requires reasonable amount of resources
• Cheap
• Available for most of the (western) languages
• Sometimes problems with text interpretation
• At the moment the control over the speech is rather limited
Speech Synthesis
• Formant synthesis
• based on acoustic features of speech: a set of filters is used to model
natural speech sounds
• intelligible, but not very pleasant (machine like)
• Concatenative synthesis
• speech is constructed by joining short samples of recorded natural speech
• longer samples make more natural-sounding speech, but the samples are also harder to collect
• Acoustic modeling synthesis
• models the vocal tract and simulates how it works; very complex to compute even with simplified models
• Prosody
• prosody: volume, speed and pitch variations and pauses in speech
• if speech has no prosody, it sounds very monotonous and is hard to understand
• synthesizers add basic prosody to speech
• synthesizers don’t know where to put emphasis, and complex sentences can cause problems
• user can add simple control tags to control speed, volume, pitch and pauses
• this control is, however, very rough and works at the word level
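The slides do not name a tag format for prosody control. As one possible illustration, the sketch below emits fragments in the style of W3C SSML, whose prosody element takes rate, pitch and volume attributes and whose break element inserts a pause. The choice of SSML and the helper names prosody, pause and speak are assumptions; a real synthesizer may expect a different markup or a fuller document header.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, rate=None, pitch=None, volume=None):
    """Wrap text in an SSML-style <prosody> tag with the given settings."""
    attrs = {"rate": rate, "pitch": pitch, "volume": volume}
    attr_text = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in attrs.items()
        if value is not None
    )
    return f"<prosody{attr_text}>{escape(text)}</prosody>"


def pause(milliseconds):
    """Insert a pause of the given length."""
    return f'<break time="{int(milliseconds)}ms"/>'


def speak(*parts):
    """Join fragments into a single <speak> element (header attributes omitted)."""
    return "<speak>" + "".join(parts) + "</speak>"
```

For example, speak(prosody("Warning", rate="slow", volume="loud"), pause(300), prosody("door open")) marks one word as slow and loud, pauses, then continues with default prosody, which matches the word-level granularity described above.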
Voice output (synthesized or recorded)
Quality and acceptance
• In general, users do not prefer synthetic speech
• Even an unnatural mixture of natural and synthetic speech is sometimes better than all-synthetic speech
• Some people have much more difficulty with synthesis than others
• People learn to listen to synthetic speech
• Many blind people who listen to a synthesizer daily can use it with very
fast speaking rates
Recommendations
Avoid lengthy segments of voice output
• The output is only communicated in real time (slower than reading)
• It must be simple enough to be immediately understood or to fit in short
term memory
• Early parts of longer messages are forgotten
Use a normal output rate of 180 words per minute (wpm)
• Allow experts to increase the rate to 240 wpm
(Note that this is still slower than the average reader)
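The rate recommendation translates directly into how long a message occupies the listener. The sketch below uses the 180 and 240 wpm figures from the recommendation; the function names and the rule that non-experts always get the normal rate are assumptions made for illustration.

```python
NORMAL_WPM = 180  # normal output rate from the recommendation above
EXPERT_WPM = 240  # highest rate offered to expert users


def output_rate(requested_wpm=None, expert=False):
    """Pick an output rate; only experts may exceed the normal rate (assumed policy)."""
    if requested_wpm is None or not expert:
        return NORMAL_WPM
    return min(requested_wpm, EXPERT_WPM)


def speaking_seconds(message, wpm=NORMAL_WPM):
    """Estimate how long a message takes to speak at the given rate."""
    words = len(message.split())
    return words * 60.0 / wpm
```

A 30-word message takes about 10 seconds at the normal rate and 7.5 seconds at the expert rate, which is one way to judge whether a segment counts as lengthy.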
Voice output (synthesized or recorded)
Recommendations
Allow users to easily
• Request repetition of messages
• Skip back a few seconds
• Skip ahead once they have heard what they need
• e.g. immediately choose the next action
Be aware of where to place key information in output
sentences:
• At end if output is highly variable
• Users better remember the last few words.
• e.g. temperature in weather reports
• e.g. ‘Boss said meeting changed to 3 p.m.’ vs. ‘Meeting changed to 3 p.m.
by the boss’
• At start if information is structured and user can skip ahead
• Where the user knows what to listen for, speed becomes critical.
• e.g. ‘3 messages in your mailbox’
• vs. ‘you have 3 messages’
• Place the required action after its description:
• e.g. ‘To exit your mailbox, press 9’
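The three placement rules above can be captured as small prompt templates so that every message in a system follows the same word order. This is a sketch only; the function names are hypothetical and the weather wording is an invented example.

```python
def variable_prompt(context, key_value):
    """Highly variable output: the key value goes last, where it is remembered."""
    return f"{context} {key_value}."


def structured_prompt(key_value, context):
    """Structured output the user may skip through: the key value goes first."""
    return f"{key_value} {context}."


def action_prompt(description, key):
    """Describe the outcome first, then name the required action."""
    return f"To {description}, press {key}."
```

For instance, action_prompt("exit your mailbox", 9) produces the mailbox example from the slide, and structured_prompt("3 messages", "in your mailbox") puts the count where an experienced listener hears it first.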
14. Natural Language Interfaces
A company VP came to work and saw a message
‘Talk to me - end all messages by hitting return’.
The VP thought the machine was connected to
the engineer:
• VP: Do you think I can use the computer this morning?
• Resp: Why do you ask?
• VP: I might be able to make some additional sales.
• Resp: Why aren’t you sure?
• VP: My prospective customers haven’t seen a demonstration.
• Resp: Does this mean very much to you?
• VP: Of course it does.
• Resp: You seem very positive about that.
• VP: This is ridiculous.
• Resp: What do you mean this is ridiculous.
• VP: Is there any reason why I can’t use the computer now?
• Resp: Tell me more about your prospective customers.
• VP: Damn it, call me at 491-1850
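Replies like those in the dialogue can be produced by plain keyword matching with no understanding behind it. The sketch below is an illustration of that idea only: the rules are hypothetical and are not the rules of the program in the anecdote.

```python
import re

# Hypothetical keyword rules; each pairs a pattern with a canned reply template.
RULES = [
    (re.compile(r"\bdo you think (.+?)\??$", re.I), "Why do you ask?"),
    (re.compile(r"\bi might (.+)", re.I), "Why aren't you sure?"),
    (re.compile(r"\bthis is (.+?)[.!]*$", re.I), "What do you mean this is {0}?"),
    (re.compile(r"\bmy (.+?)[.!?]*$", re.I), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."


def respond(line):
    """Return the reply of the first matching rule, echoing the captured words."""
    text = line.strip()
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT
```

A direct request such as the final line of the dialogue matches no rule and only gets the default reply, which shows how quickly the impression of intelligence breaks down.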
14. Natural Language Interfaces
When a system operates with an NL interface,
users expect it to behave like a human:
• Naive users will tend to think it ...
• Can understand arbitrary grammar and vocabulary
• Is ‘intelligent’.
• An NL interface can be very confusing if certain rules are
ignored
Rules of using NL
• Avoid leading the user to think the system is more capable than
it really is
• If the system understands only a limited vocabulary, guide the user
to use this.
Rules of using NL
• Respect implicit rules of human conversation
• Use active voice (not passive).
• e.g. Bad: The data was produced by X
• e.g. Good: X produced the data
• Be cooperative
• Suggest alternatives when rejecting an input
Or tell the user where to find alternatives.
• Prioritize the alternatives
• Correct or tolerate minor pronunciation errors and grammar violations
• Give meaningful responses
• Avoid excessive use of simple ‘yes’ and ‘no’.
• Avoid deluging the user with unwanted information.
• But allow the user to dig deeper if she or he wants
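The cooperative rules above (tolerate minor errors, suggest prioritized alternatives when rejecting an input) can be sketched with fuzzy matching from Python's standard difflib module. The vocabulary, the two similarity thresholds, and the name resolve_command are assumptions chosen for illustration.

```python
import difflib

VOCABULARY = ["open", "close", "print", "delete", "rename"]  # hypothetical
SUGGEST_CUTOFF = 0.6   # assumed: similar enough to offer as an alternative
ACCEPT_CUTOFF = 0.85   # assumed: similar enough to accept as a minor error


def resolve_command(word, vocabulary=VOCABULARY):
    """Return (command or None, message for the user)."""
    token = word.strip().lower()
    if token in vocabulary:
        return token, f"OK: {token}"
    # get_close_matches returns candidates sorted best match first
    close = difflib.get_close_matches(token, vocabulary, n=3, cutoff=SUGGEST_CUTOFF)
    if close:
        best = close[0]
        if difflib.SequenceMatcher(None, token, best).ratio() >= ACCEPT_CUTOFF:
            return best, f"Assuming you meant '{best}'."
        return None, "I did not understand. Did you mean: " + ", ".join(close) + "?"
    return None, "I did not understand. You can say: " + ", ".join(vocabulary) + "."
```

A near miss such as "delet" is accepted with a notice, a more distant input such as "clone" is rejected with ranked suggestions, and an unrelated input is answered with the supported vocabulary, so the user is never left with a bare "no".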
Rules of using NL
• Rephrase the user’s input
• Seek confirmation before taking any action
• Always build in a ‘clarification’ mechanism
• The user corrects a faulty system interpretation
• ... or the system prompts for more detail
• Avoid jargon and excessive familiarity or joking in the system.
• Use a restricted natural language
• Technology does not yet permit full NL
• Use the Wizard of Oz technique to find out the kinds of dialogs users will have with the machine
• A software engineer, hidden from the user, provides the computer’s side of the dialogue
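The rephrase, confirm and clarify rules in the list above can be combined into one small dialogue step. This is a minimal sketch under stated assumptions: the function name, the prompt wording, and the returned status labels are hypothetical, and ask stands for whatever front end collects the user's reply (the built-in input function, or a speech recognizer).

```python
def confirm_and_run(interpretation, action, ask):
    """Rephrase the interpreted request and run `action` only after confirmation.

    Returns ("done", result) when the user confirms, otherwise
    ("clarify", correction) so the caller can reinterpret the request.
    """
    reply = ask(f"You want to {interpretation}. Is that right? (yes/no) ")
    if reply.strip().lower() in {"y", "yes"}:
        return ("done", action())
    # Clarification mechanism: let the user correct the interpretation.
    correction = ask("What would you like to do instead? ")
    return ("clarify", correction.strip())
```

Passing ask as a parameter keeps the step independent of the input channel, so the same logic can be tried with typed input before a speech front end exists.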
Beware that:
• NL systems are very expensive to build
• Many do not work well
• They are hard to maintain and internationalize
• They may be slower and harder to use than more artificial-looking interfaces