Human Computer Interaction
Introducing evaluation
류현정
2005-09-01
The aims
Discuss how developers cope with real-world constraints
Explain the concepts and terms used to
discuss evaluation
Examine how different techniques are
used at different stages of development
Two main types of evaluation
Formative evaluation is done at different
stages of development to check that the
product meets users’ needs
Summative evaluation assesses the
quality of a finished product
Our focus is on formative evaluation
What to evaluate
Iterative design & evaluation is a
continuous process that examines:
Early ideas for conceptual model
Early prototypes of the new system
Later, more complete prototypes
Designers need to check that they
understand users’ requirements
Bruce Tognazzini tells you why you
need to evaluate
“Iterative design, with its repeating cycle of
design and testing, is the only validated
methodology in existence that will consistently
produce successful results. If you don’t have
user-testing as an integral part of your design
process you are going to throw buckets of
money down the drain.”
See AskTog.com for topical discussion about
design and evaluation
When to evaluate
Throughout design
From the first descriptions, sketches etc. of
users' needs through to the final product
Design proceeds through iterative cycles of
‘design-test-redesign’
Evaluation is a key ingredient for a successful
design
Approaches: Naturalistic
Naturalistic:
describes an ongoing process as it evolves
over time
observation occurs in realistic setting
ecologically valid
“real life”
External validity
degree to which research results apply to
real situations
Approaches: Experimental
Experimental
study relations by manipulating one or more independent
variables
experimenter controls all environmental factors
observe effect on one or more dependent variables (see the sketch below)
Internal validity
confidence that we have in our explanation of experimental
results
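A minimal sketch of the experimental approach described above, assuming hypothetical data and a Python/SciPy environment; the interface names, timing numbers, and the choice of an independent-samples t-test are illustrative, not taken from the slides:

```python
# Hypothetical between-subjects experiment:
#   independent variable = interface version (A vs. B)
#   dependent variable   = task completion time (seconds)
from scipy import stats

times_a = [41.2, 38.5, 44.0, 39.8, 42.3, 40.1]  # group using interface A
times_b = [35.7, 33.9, 36.8, 34.2, 37.5, 35.0]  # group using interface B

# Test whether manipulating the independent variable affected the dependent variable
t_stat, p_value = stats.ttest_ind(times_a, times_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```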
Trade-off: Natural vs Experimental
precision and direct control over experimental design
versus
desire for maximum generalizability in real life situations
Approaches: Reliability
Concerns
Would the same results be achieved if the test were
repeated?
Problem: individual differences:
best user 10x faster than slowest
best 25% of users ~2x faster than slowest 25%
Partial Solution
reasonable number and range of users tested
statistics provide confidence intervals of test results
95% confident that mean time to perform task X is 4.5 +/- 0.2
minutes means
95% chance true mean is between 4.3 and 4.7, 5% chance it's
outside that (see the sketch below)
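A minimal sketch of how such a confidence interval could be computed, assuming hypothetical task times and a Python/SciPy environment; the data below are made up to mirror the 4.5 +/- 0.2 minute example:

```python
# Hypothetical task-completion times for a sample of users (minutes)
import statistics
from scipy import stats

times = [4.3, 4.6, 4.4, 4.7, 4.5, 4.2, 4.8, 4.5, 4.6, 4.4]

n = len(times)
mean = statistics.mean(times)
sem = statistics.stdev(times) / n ** 0.5      # standard error of the mean
margin = stats.t.ppf(0.975, df=n - 1) * sem   # 95% two-sided t interval

print(f"95% CI for mean task time: {mean:.2f} +/- {margin:.2f} minutes")
```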
Approaches: Validity
Concerns
Does the test measure something of relevance to
usability of real products in real use outside of lab?
Some typical validity problems of testing vs real use
non-typical users tested
tasks are not typical tasks
physical environment different
quiet lab vs very noisy open offices vs interruptions
social influences different
motivation towards experimenter vs motivation towards boss
Partial Solution
use real users
tasks from task-centered system design
environment similar to real situation
Qualitative Evaluation
Techniques
Qualitative methods for
usability evaluation
Qualitative:
produces a description, usually in non-numeric terms
may be subjective
Methods
Introspection
Extracting the conceptual model
Direct observation
simple observation
think-aloud
constructive interaction
Query via interviews and questionnaires
Continuous evaluation via user feedback and field
studies
Querying Users via Interviews
Excellent for pursuing specific issues
vary questions to suit the context
probe more deeply on interesting issues as they arise
good for exploratory studies via open-ended questioning
often leads to specific constructive suggestions
Problems:
accounts are subjective
time consuming
evaluator can easily bias the interview
prone to rationalization of events/thoughts by user
user’s reconstruction may be wrong
Evaluating the 1984 OMS
Early tests of printed scenarios & user guides
Early simulations of telephone keypad
An Olympian joined team to provide feedback
Interviews & demos with Olympians outside US
Overseas interface tests with friends and family
Free coffee and donut tests
Usability tests with 100 participants
A ‘try to destroy it’ test
Pre-Olympic field-test at an international event
Reliability of the system with heavy traffic
Development of HutchWorld
Many informal meetings with patients, carers &
medical staff early in design
Early prototype was informally tested on site
Designers learned a lot, e.g.
language of designers & users was different
asynchronous communication was also needed
Redesigned to produce the portal version
Usability testing
User tasks investigated:
how users' identity was represented
communication
information searching
entertainment
User satisfaction questionnaire
Triangulation to get different perspectives
Findings from the usability test
The back button didn’t always work
Users didn’t pay attention to navigation
buttons
Users expected all objects in the 3-D view
to be clickable
Users did not realize that there could be
others in the 3-D world with whom to chat
Users tried to chat to the participant list
Key points
Evaluation & design are closely integrated in
user-centered design
Some of the same techniques are used in
evaluation & requirements but they are used
differently
(e.g., interviews & questionnaires)
Triangulation involves using a combination of
techniques to gain different perspectives
Dealing with constraints is an important skill for
evaluators to develop