
Evaluation
CS2391 Lecture n+1: Robert Stevens
http://img.cs.man.ac.uk/stevens
Introduction
• You’ve gathered requirements, designed your system, built the
artefact… but does it fulfil the user’s requirements?
• Basic usability
• Basic evaluation
• Evaluation styles
• Design evaluation
• Implementation evaluation
Usability Basics
• Allowing users to achieve a goal with efficiency, effectiveness
and satisfaction (sketched below)
• Utility is the functionality of a system
• A system can have utility without usability, but not vice versa
• Such a system is worthy, but unhelpful
• Have paradigms of good usability, e.g. GUI
• Also need theory to know why something is usable
• Really want principles to guide developers – engineering not
craft
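
As a minimal sketch of how “efficiency, effectiveness and satisfaction” can be measured in practice – the task data, function names and the 1–5 satisfaction scale are illustrative assumptions, not part of the lecture:

    from dataclasses import dataclass

    @dataclass
    class TaskResult:
        """One user's attempt at a task (hypothetical data)."""
        completed: bool      # effectiveness: did the user achieve the goal?
        time_taken_s: float  # efficiency: how long the attempt took
        satisfaction: int    # self-reported, 1 (awful) to 5 (great)

    def summarise(results):
        """Effectiveness = completion rate; efficiency = mean time of
        successful attempts; satisfaction = mean self-reported rating."""
        done = [r for r in results if r.completed]
        return {
            "effectiveness": len(done) / len(results),
            "efficiency_s": sum(r.time_taken_s for r in done) / len(done),
            "satisfaction": sum(r.satisfaction for r in results) / len(results),
        }

    results = [TaskResult(True, 42.0, 4), TaskResult(True, 55.5, 3),
               TaskResult(False, 120.0, 1)]
    print(summarise(results))  # effectiveness 0.67, efficiency 48.75 s, satisfaction 2.67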
Execution and Evaluation
[Diagram: the execution/evaluation cycle – the User and the System are linked through Input and Output. The four translations, defined on the next slide: articulation (User → Input), performance (Input → System), presentation (System → Output), observation (Output → User).]
Execution & Evaluation (2)
• Presentation: how the system renders its state, allowing the user
to evaluate that state and changes to it
• Observation: what the user notices of the presentation; can they
see what they need to?
• Articulation: Expression of a user’s execution plan
• Performance: the system’s execution of a plan, the results of
which are presented to the user
Usability Principles
a. Visibility of system status: the system should always keep
users informed
b. Match between system and the real world: the system should
speak the user's language
c. User control and freedom: functions chosen by mistake need a
clear 'emergency exit'
d. Consistency and standards: avoid ambiguity
e. Error prevention
f. Recognition rather than recall
g. Flexibility and efficiency of use
h. Aesthetic and minimalist design
i. Help users recognize, diagnose and recover from errors
j. Help and documentation
What is Evaluation?
• Do the design and implementation behave as we expect and
fulfil the user’s requirements?
• Not just an add-on at the end!
• Assess the design at various times during the life cycle
• Assess implementation prototypes, alpha and beta versions
• Evaluation saves time and money
• There are many types of evaluation, and the trick is to choose the
appropriate one
• Purpose is to uncover usability problems
Usability Thoughts
• Recall and recognition
• Making a system easier to use makes it more powerful
• Humans can switch topics fast – think of more than one thing at
once
• A computer system should be able to do the same
• Complex syntax often hides the task – need directness of
interaction
Styles of Evaluation
Evaluation
• Design evaluation
  • Cognitive walkthrough
  • Heuristic evaluation
  • Review-based evaluation
  • The use of models
• Implementation evaluation
  • Empirical
  • Observational
  • Query
Evaluation Styles (2)
• Cheaper to evaluate a design before the expense of
implementation
• Tends not to involve the end-users, except as consultants
• Evaluation of an implementation does involve end-users
• Design evaluation techniques can be used to evaluate
implementation
• The former are often paper-based and involve experts
• The latter are time-consuming, difficult and expensive, and can
involve large numbers of end-users
Types of User
• Not all users are Computer Scientists
• Different users have different needs
• Remember: Managers, system administrators and trainers
• Use end-users where possible and appropriate
• Important to have evaluatees who are representative of end-users
• Balance between under-use and over-use: users need a reward
for their time
Hawthorne Effect
• Users like to please the evaluator
• People respond well to having someone interested in them
• Simply by evaluating an artefact, experience of that artefact
improves
• Investigation of light levels in factories showed the investigation
itself was the most important factor
• Not much to be done about it – be aware
Goals of Evaluation
• Does the system have the correct functionality? Does it match
the user's task?
• A clerk used to searching by post-code should be able to
search by post-code
• Can the functionality be used: What is the effect on the user?
• What are the problems with the system?
• The last is part of the other two, but draws out the negative aspects
Laboratory Techniques
• A usability lab: one-way mirror; video and audio recorders
• Logging of system events (see the sketch below)
• Lacks context; unnatural for end-users, and natural collaborative
work is difficult
• Does allow close study, particularly of specialist task or
particular UI notion
• Good for single user tasks
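
A minimal sketch of the kind of system logging such a lab relies on; the event names, fields and log format here are assumptions for illustration:

    import json
    import time

    class InteractionLogger:
        """Appends timestamped interface events to a file, so a session
        can be replayed and analysed after the study."""
        def __init__(self, path):
            self.path = path

        def log(self, event, **details):
            record = {"t": time.time(), "event": event, **details}
            with open(self.path, "a") as f:
                f.write(json.dumps(record) + "\n")

    logger = InteractionLogger("session01.log")
    logger.log("click", target="search_button")
    logger.log("keystrokes", field="post_code", chars=7)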
Field Techniques
• See the user in context
• Allows a user to interact with all people, objects and actions
involved in a task
• Collaborative work can take place
• Noisy, difficult to record, etc.
• Can lack the detail possible in the laboratory
Cognitive Walkthrough
• Brings psychological theory into an informal and subjective walkthrough
1. Need a design: not necessarily complete, but the location and wording
of controls are helpful
2. A description of the task: Should be representative
3. A list of actions the user makes to perform the task
4. A description of the users and the experience expected of them
• Given to experts, who step through the actions and make an assessment of
usability (sketched below)
1. Are the users performing the task described by the action?
2. Can the users see the object of interaction (button etc)?
3. Can the user tell that it is the right action?
4. Once performed, does the user get appropriate feedback?
• End of execution & evaluation cycle
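
As a sketch only, the walkthrough amounts to looping the four questions over the action list and recording every “no”; the task, actions and expert judgement below are hypothetical:

    # The four walkthrough questions from the list above.
    QUESTIONS = [
        "Is the user performing the task described by the action?",
        "Can the user see the object of interaction (button etc.)?",
        "Can the user tell that it is the right action?",
        "Once performed, does the user get appropriate feedback?",
    ]

    def walk_through(actions, judge):
        """Step through each action; record a usability problem whenever
        the expert judges the answer to a question to be 'no'."""
        problems = []
        for action in actions:
            for question in QUESTIONS:
                if not judge(action, question):
                    problems.append(f"{action}: fails {question!r}")
        return problems

    # Hypothetical judgement: everything passes except feedback on 'press submit'.
    actions = ["open search form", "type post-code", "press submit"]
    judge = lambda a, q: not (a == "press submit" and "feedback" in q)
    print(walk_through(actions, judge))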
Heuristic Evaluation
• A set of heuristics (rules of thumb) developed by Jakob Nielsen
and Rolf Molich
• Each heuristic used to critique an interface
• A set of independent experts use the heuristics
• Problems found follow a Poisson distribution – five experts find
about 75% of problems (see the sketch below)
• Usability questions are used to guide and stimulate
• Essentially a checklist
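
The “five experts find about 75%” figure is usually derived from Nielsen and Landauer’s problem-discovery model, in which n independent evaluators find a proportion 1 − (1 − λ)^n of the problems; a sketch, assuming a per-evaluator detection rate λ ≈ 0.24 so that n = 5 gives roughly 75% (reported values of λ vary by study):

    def proportion_found(n_evaluators, lam=0.24):
        """Expected proportion of usability problems found by n independent
        evaluators, each detecting a fraction lam of the problems
        (1 - (1 - lam)**n; lam = 0.24 is an assumption)."""
        return 1 - (1 - lam) ** n_evaluators

    for n in range(1, 11):
        print(n, f"{proportion_found(n):.0%}")
    # 5 evaluators -> ~75%; each extra evaluator adds less and less.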
Review Based Evaluation
• Principles from experimental psychology and HCI literature
used to provide evaluation criteria
• E.g. menu design, the naming of items, icon design, language
design and memory attributes
• Cheaper than performing the experiment, but beware of context
in which a study was performed
• Like all expert based methods, it is all about stimulating basic
questions to be asked
• Try and ensure independence of experts
• Performance ratings, using scales and comment fields, should be used
Empirical Evaluation
• Evaluating the implementation (design evaluation methods can also
be used here)
• Empirical studies concentrate on end-users, rather than experts
• The controlled experiment technique: measure some attribute while
controlling other attributes of the system
• Various experimental conditions, which differ only in the value of some
variable
• Independent (manipulated) and dependent (measured) variables
• Differences in behaviour are attributed to the different values of the
independent variable that provide the different conditions (interface style,
pointing device, wording, etc.)
• The dependent variable must be measurable in some way – speed, mouse
clicks, satisfaction, etc.
• Use both subjective and objective measures (see the sketch below)
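
A sketch of how such an experiment’s variables might be written down; the interface styles and measures are illustrative assumptions:

    from dataclasses import dataclass

    @dataclass
    class Experiment:
        """Conditions differ only in the value of the independent variable;
        the dependent variables are what we measure."""
        independent: str
        conditions: list
        dependents: list

    exp = Experiment(
        independent="interface style",
        conditions=["menu-driven", "command-line"],  # only this varies
        dependents=["task time (s)",                 # objective
                    "error count",                   # objective
                    "satisfaction (1-5)"],           # subjective
    )
    print(exp)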
Empirical Techniques (2)
• A hypothesis is framed in terms of the variables: a change in the
independent variable causes a change in the dependent variable
• The experiment attempts to demonstrate this relationship
• This is achieved by disproving the null hypothesis, i.e. that there is no
relationship between the variables
• Use statistics to show that any differences seen are unlikely to have
happened by chance
• Experimental design: between groups and within groups
• Between groups: subjects are assigned to either the experimental group
or the control group; the latter ensures it is the independent variable
that counts
• Each subject does only one condition, avoiding learning effects, but the
design is prone to variation between subjects
• Within groups: each subject performs in all conditions; vary the condition
order to avoid learning effects (see the sketch below)
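
A between-groups sketch: the completion times are invented, and scipy.stats.ttest_ind is one standard way to test whether two independent samples differ by more than chance:

    from scipy import stats

    # Hypothetical task-completion times (seconds), one group per condition.
    menu_times = [41.2, 38.5, 45.0, 39.9, 43.1, 40.4]
    command_times = [47.8, 52.3, 49.1, 55.0, 46.2, 50.7]

    # Null hypothesis: interface style has no effect on completion time.
    t, p = stats.ttest_ind(menu_times, command_times)
    print(f"t = {t:.2f}, p = {p:.4f}")
    if p < 0.05:
        print("Reject the null hypothesis: the difference is unlikely to be chance.")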
Empirical Evaluation (3)
• Good for evaluation of individual design decisions: Colour,
dialogue, wording, etc.
• Less good for overall usability – systems and humans too
complex for controlled experiment
• Difficult to design
• Expensive in time, money and users
Observational Techniques
• Think aloud & Co-operative evaluation
• Observing the user’s actions in work context – the whole task
• Usually pre-determined, representative tasks and users explain
what they are doing (think aloud)
• Experimenter interacts with participant (subject) to elicit more
information
• Everything recorded (notes, system log, audio, video)
• Protocols are analysed (sketched below)
• Post-experiment walkthrough
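
A sketch of the protocol-analysis step: coded think-aloud remarks are tallied to show where users struggled; the codes and remarks are invented:

    from collections import Counter

    # Hypothetical think-aloud protocol: (time s, code, remark).
    protocol = [
        (12.0, "confusion", "where is the search button?"),
        (25.5, "success",   "found the post-code field"),
        (31.2, "confusion", "did that click do anything?"),
        (40.0, "error",     "submitted the wrong form"),
    ]

    # Tally the codes to see which problem types dominate the session.
    counts = Counter(code for _, code, _ in protocol)
    print(counts.most_common())  # [('confusion', 2), ('success', 1), ('error', 1)]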
Query Based Techniques
• Asking the user can be very informative
• Simple, but highly subjective
• Interviews and questionnaires (see earlier lectures)
• Good for large numbers and for high-level feedback
• Good for exploring alternative strategies, particularly in context
• Less systematic, more subjective (a scoring sketch follows below)
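
A sketch of scoring a simple questionnaire; the five-point agreement scale and the invented responses are assumptions, not a standard instrument:

    # Hypothetical responses: each row is one user's 1-5 agreement
    # ratings for four satisfaction statements.
    responses = [
        [4, 5, 3, 4],
        [2, 3, 2, 3],
        [5, 4, 4, 5],
    ]

    # Mean rating per question shows which aspects users liked least.
    per_question = [sum(col) / len(col) for col in zip(*responses)]
    print("mean per question:", per_question)
    print("overall:", sum(per_question) / len(per_question))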
Summary
• Need to test the appropriateness of functionality
• Also that the functionality can be used
• Efficiency, effectiveness and satisfaction
• Evaluation of design and its implementation
• Choose your users with care
• Reading – HCI: Dix, Finlay, Abowd & Beale; Chapter 11