How many participants do you need for usability testing?

How many users does it take to…?

As UX’ers undertaking usability projects, we often get asked how many participants we need for usability tests to ensure that results are reliable and actionable.

I have blogged about this before, but it is a question that keeps popping up, so I wanted to look at it again.

First off, what do the experts say? Let's start with Nielsen.

Nielsen graph on 5 users
Nielsen – Why You Only Need to Test with 5 Users (published 2000)

This says 5 users will uncover the most important usability issues, and that after that testing falls foul of the law of diminishing returns: each additional user gives you less and less insight.
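For anyone who wants to see where the curve comes from, here is a minimal sketch of the formula in Nielsen's article: the share of problems found is 1 - (1 - L)^n, where L is the share of problems a single user uncovers (Nielsen uses 31%) and n is the number of users. The function and parameter names are my own, purely for illustration.

```python
def share_of_problems_found(n_users: int, hit_rate: float = 0.31) -> float:
    """Expected share of usability problems seen at least once by n_users.

    hit_rate is the share of problems a single user uncovers
    (0.31 is the typical value quoted by Nielsen).
    """
    return 1 - (1 - hit_rate) ** n_users


# Print the curve for a few panel sizes.
for n in (0, 1, 3, 5, 10, 15):
    print(f"{n:>2} users: {share_of_problems_found(n):.0%}")
```

With the default 31% this gives roughly the 85% usually quoted for five users. The result is very sensitive to that one number: with a hypothetical hit rate of 10% per user, five users would uncover only about 41% of problems.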

Rolf Molich, who worked with Nielsen on 10 Usability Heuristics for User Interface Design (published 1995 but still relevant!), disagrees with this statement, arguing that it takes far more than 5 users to find 85% of all problems.

Molich has undertaken many years of research in his CUE Studies, looking at how usability professionals evaluate technology. However, Molich does go on to say that if we are trying to sell usability to an organisation, or if we are planning iterative continuous testing, 3-4 users may be enough for each round.

Nielsen agrees that breaking a test of 15 users into 3 different tests is better than one big usability test when testing iterative designs.

Jeff Gothelf, in his book on Lean UX, cites the example of the Meetup research team testing with 3 participants once a week on a Thursday, concluding that every round of testing ‘brings us closer to the truth’. Regular testing is an essential part of agile development.

#Lesson

A small number of participants is fine (3-5) BUT
Test early and Test often (and then Test again)

In his book on usability testing, Steve Krug says ~ Testing with 1 user is 100% better than testing with none ~

This chimes with Nielsen’s curve above, which shows that zero users give zero insights.

I am not 100% sure I agree with this. What if that user is, for some reason, different to other users?

The rapid iterative testing and evaluation (RITE) model developed by Microsoft also allows for changes to a design after one test with one user, BUT that system has checks in place to counteract any outliers. The RITE methodology is another useful tool for Agile.

My Top 5 Tips on usability testing and participants

  1. Don’t get hung up on the numbers; instead spend time, energy and $$ on getting the right participants, who are representative of the target group.
  2. Screen those participants (or make sure the recruitment agency does) to ensure they are articulate and can voice their ideas about what they are doing/thinking.
  3. Frame testing/research aims as hypotheses: something that you are testing to prove/disprove.
  4. Combine usability testing with analytics, both prior to testing (to inform the test plan and highlight problem areas) and after testing (to test hypotheses/findings).
  5. And of course ~ Test early and test often and then retest

There is no magic number for me, but if it is a one-off usability test (with no planned follow-up tests) I have found recruiting 6 participants and 1 standby to work well. With 6 people we generally tend to see the same top-line issues emerging and, importantly, this number allows us some time in between each test to discuss results with clients and observers and/or to tweak the test plan for the next participant.

> More Resources <

10 Usability Lessons from Steve Krug’s Don’t Make Me Think

Tom Tullis Tips on Usability

Usability and UX Research II – online usability testing

Unmoderated usability testing online with User Testing and whatusersdo

My colleague Sinead and I recently tried out these online usability testing tools. We were impressed; we found them easy to use, reliable and cheap!

After a simple sign-up process and setting up a simple test, completed videos come back almost overnight. These are screen recordings showing you what panelists do as they navigate through the task, with a commentary on what they are thinking while they do it.

User Testing logo, whatusersdo logo

User who?

The one concern I have is the lack of control over panel selection. You can ask one screening question, but it is very limited and panelists can self-select. The panelists who took part in our tests were mixed: some were good and others not so good. You can rate panelists and there are options to get your own customers involved, but I believe it gets a bit more complicated and expensive then.

For desktop testing, both companies have a large number of panelists to choose from and customers can rate panelists based on their performance. For mobile testing in the UK, whatusersdo have the edge on User Testing, with a larger panel of mobile testers; User Testing are currently growing theirs.

How?

Both solutions allow you to view videos, annotate them and create clips of user behaviour. We found the video editing and sharing capabilities of User Testing better, and easier to create highlight reels from, a feature that whatusersdo still has in beta. User Testing also gives you the opportunity to ask panelists one follow-up question.

Overall, both tools worked very well for general or straightforward usability tasks such as registration, and for tasks where we did not need our own customers’ feedback.

We have not tried how this works for prototypes.

Do

  • Run a pilot test or two beforehand to help you shape tasks and ensure you get the information you are looking for in the follow-up test
  • Be clear on what you are looking to test and keep tasks clear and uncomplicated; this helps your participants perform the correct tasks and helps you focus on the important data when analysing videos afterwards (which can take a while to go through!)
  • Use these tools to test competitor products

Resources:

List of Remote Usability and UX Research Tools:

Loop11  Online 101 usability videos 

Usability and UX Research – part 1

“When you have a usability hammer in your hand – everything looks like a nail” – Rolf Molich

picture usability lab
I have been involved now with usability testing for about 5 years and have been thinking lately about usability in general and the role of usability in UX research.
Here are some of my thoughts…

How many participants in a Usability Test?

Testing with one user is 100% better than testing with none – Steve Krug.

We have all seen Nielsen’s graph, which says 5 participants is the optimum number to uncover 85% of usability problems.

Is this sufficient to measure outputs? Jeff Sauro has a great blog called Measuring Usability, which gives a huge amount of info and advice on how to measure usability. He tells us that if we want to ensure a task-completion rate is at least 70%, we should plan on testing at least 8 users.
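A rough sketch of where a figure like this can come from. It assumes an adjusted-Wald interval and a one-sided 95% confidence level; both are my assumptions for illustration, not necessarily the calculation behind the advice above. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def completion_rate_lower_bound(successes: int, users: int,
                                confidence: float = 0.95) -> float:
    """One-sided lower bound on the true task-completion rate.

    Uses the adjusted-Wald method: add z^2/2 successes and z^2 trials
    before computing the usual normal-approximation interval.
    The method and the 95% level are assumptions for illustration.
    """
    z = NormalDist().inv_cdf(confidence)
    n_adj = users + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin)


# Compare a panel of 5 with a panel of 8 when everybody completes the task.
for users in (5, 8):
    bound = completion_rate_lower_bound(users, users)
    print(f"{users}/{users} completed: true rate is likely above {bound:.0%}")
```

Under these assumptions, 8 out of 8 completions puts the lower bound at about 70%, whereas 5 out of 5 only gets you to about 60%.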

It also depends on what method you are using. For eye-tracking, and particularly if you want to run statistics on the results, Nielsen recommends 30 participants.

Personally, if it is one day of general testing and you are not comparing specific designs, I think 6-8 users with a test length of 45 mins, with 5-10 mins in between (to jot down main points), is a good number. Anything over this number and you start hearing the same issues over and over.

I was recently at a UCD conference in London where Rolf Molich gave a talk on “debunking myths about usability”, and Nielsen’s 5 users graph was one of the myths on his list. Molich maintains you need more, a lot more, users to uncover 85% of usability problems. However, what he was recommending was smaller, more frequent tests at different stages of product development, and for this iterative continuous testing he maintains 3 or 4 users are sufficient.

Test little and test often.

What to test in a Usability test:

This was always the biggest question for me. I found, especially working with an agency, that clients would come in and say: something is not working on our site and we want a usability test of our site.

It might not even be a usability test they need (it may be more of a discovery piece). And to remind ourselves what usability is (according to Steve Krug in Don’t Make Me Think): “Usability means making sure something works well and that a person of average ability or experience can use it for its intended purpose without getting hopelessly frustrated”.

If it was a usability test they needed, the next step was to try and narrow down what exactly they should test: a particular function, area or concept.

This is where analytics can help us focus our efforts.   It can help us identify problem areas in the site that we may need to look at further.

Another approach when considering what tasks to test is to ask: what are the top three most important things people need to do when using this site? (Steve Krug)

It is worth spending time on this part of the project to ensure research outputs meet business needs and expectations.


How do we measure Usability?

Jeff Sauro talks about task completion rate as a percentage. When testing, the version of this I have used is not a percentage per se but rather a ‘ranking’ of each task in terms of how successfully the participant completed it.

1 = without hesitation, 2 = hesitation/alternative route, 3 = intervention, etc. And I have found this useful.
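A minimal sketch of how rankings like these could be tallied per task. Only levels 1 to 3 are named above, so the labels, the function name and the sample scores are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Labels for the three levels described above (anything beyond 3 is not covered here).
LABELS = {
    1: "without hesitation",
    2: "hesitation / alternative route",
    3: "intervention",
}


def summarise_task(scores: list[int]) -> dict:
    """Summarise one task: counts per level, mean rank, share completed unaided."""
    counts = Counter(scores)
    unaided = sum(1 for score in scores if score < 3)  # levels 1 and 2 need no help
    return {
        "counts": {label: counts.get(level, 0) for level, label in LABELS.items()},
        "mean_rank": round(mean(scores), 2),
        "unaided_rate": unaided / len(scores),
    }


# Hypothetical rankings for one task across six participants.
print(summarise_task([1, 1, 2, 3, 1, 2]))
```

The mean rank gives a quick way to compare tasks, while the counts show whether a middling average hides one participant who needed rescuing.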

Of course the SUS satisfaction questionnaire has been around for a long time and, while the language may be a bit dated, analysis by people like Tom Tullis shows that this questionnaire is still probably the most consistent after-test questionnaire. The problem is of course that, especially at the end of the test, participants rate tasks highly even though we have seen them struggle. (Jeff Sauro: around 14% of users who fail a task still rate it as super easy to do.)

The way we get around this is to rate satisfaction, ease of use etc. after each task, which I think improves the feedback. One measure we have been trying out recently to capture this is Jeff Sauro’s Single Measurement Score (below).
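For reference, SUS scoring follows a standard recipe: ten statements rated 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the total is multiplied by 2.5 to give a score out of 100. A small sketch, with a function name and sample answers that are my own:

```python
def sus_score(ratings: list[int]) -> float:
    """Score one participant's SUS questionnaire (ten ratings from 1 to 5)."""
    if len(ratings) != 10 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd items are positively worded
        else:
            total += 5 - rating   # even items are negatively worded
    return total * 2.5


# Hypothetical answers from one participant.
print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))
```

Note that the result is a score from 0 to 100, not a percentage, so it is best read against other SUS scores rather than on its own.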

measurement

And of course Time on Task is an industry standard. Using Noldus I have found this easy enough to collect during testing. Any time I have tried it on Morae, it upset the software. Often, we have found this measure unnecessary to collect, especially if it is already obvious the participant struggled with the task. It can be useful, though, when looking at the ‘Findability’ of different elements or when considering reactions to different designs.

What about emotion?

stick-figure

We have been trying a couple of different methods recently to look at emotion during testing. These range from asking participants to tick the faces that best describe their feeling to asking them to look at a stick man and describe what he is thinking (about the product). We have had mixed results and I think the most important aspect is the framing of the question – we are working on it!

Interviewing in Usability Testing

In an interview, the participant should talk 95% of the time and the interviewer 5% of the time – John Waterworth from Foolproof

The interview is an important part of usability testing. The value of usability testing is that you get data from several sources, e.g. observation, eye-tracking and interview, which helps a truer picture of the experience emerge.

Some questions I have found useful during interviews:

  • What are you thinking?
  • What is going on here?
  • What are you looking at?
  • What are you looking for?
  • If you were at home, what would you do now?
  • How would you describe to a friend what it is and why they would use it? (new product)

It is not advisable to ask ‘if you had a magic wand what would you…’, as participants may just make information up.

Nor to ask participants how X makes them feel – it may not make them feel anything at all! Better to repeat what they say when talking about emotions, e.g. Participant: “This is really boring.” Interviewer: “Boring?”

If the participant is a mumbler (and there are many!) it is important to repeat or clarify what they are saying – both for yourself and any observers in the back room. I have also found it useful to ask them to use the mouse when explaining different aspects of the screen – useful again for observers and also when going over screen recordings later.

Final Thoughts on Usability and UX research

Usability lab testing is expensive.  A couple of areas that I would like to further explore:

I am keen to experiment with (ever more popular) online usability testing products. Not to replace traditional lab testing but to see how they work alongside it.

And I would like to work more with other research methods (both online and off) to give further depth to usability findings, and to find out which methods/tools work and which ones are less useful.

And to consider in-house lab testing: how does that work, and does it impact validity?

To be continued…

Further reading/resources:

Eyetracking Web Usability – Jakob Nielsen

Don’t Make Me Think – Steve Krug

Rocket Surgery Made Easy – Steve Krug

Beyond the Usability Lab – Tom Tullis

www.measuringusability.com

Five users will find 85% of the usability problems – WRONG

So says Rolf Molich of Dialog Design in Denmark. He has been conducting Comparative Usability tests with usability experts over the last 10 years and has found that different experts find different usability problems. Also, contrary to Nielsen’s much-cited quote that “you only need to test with five users”, he says this is nowhere near enough to uncover the majority of usability issues.

Nielsen usability testing with Five users

Rolf says, however, that 5 users is a useful number for gaining feedback in iterative usability testing cycles.

He told us this and more at UCD2012, a conference I recently attended in London. Some other interesting points from his talk:

  • Research has shown that expert reviews find as many usability problems as usability testing (caveat here: not everyone is an expert)
  • Remember to include positive findings in a research report, so that features that users like are not pulled by the design team
  • And as he says: “when you have a hammer in your hand – everything looks like a nail”. Remember it’s critical issues you are testing for
  • Lab testing vs non-lab testing – Rolf says one of the most useful aspects of testing is that stakeholders can see the user engaging with the product; it doesn’t matter whether it is in a lab or not, as long as the stakeholder can see what is happening
  • Beware users’ opinions – they are just that, opinions. Better to focus on facts: what the user does

A very interesting talk, though as a usability researcher of 5 years or so, I still think a smaller number of test participants (5-8) can be very useful in uncovering the top usability issues, if not 85% of them.

But of course I agree that test little, test often is the way to go, with regular iterative testing cycles.