User testing is qualitative research. Its main purpose is to find and remove problems. Given this, I have always assumed that the removal of negatives is good for the design, even if we cannot know what the true frequency of a problem is in the wider population. That is, we know difficulty in the lab will mean difficulty in the real world, even if we don't know whether it's 20% or 60% of people who will have that problem.
In terms of positive results, I am less sure about this assumption. I have reported positive findings to show balance and to highlight what is working well, in order to encourage teams and preserve successful features (as www.usability.gov argues too). But I get increasingly uncomfortable when things such as colour preferences are extrapolated to the wider user population.
Reading about what makes a qualitative study, its key strength is showing rich, specific content in a specific context. The general argument is that we do not seek to generalise to a wider population; rather, we develop, or generalise to, a theory (Bryman, 2008; Creswell, 2009; Grbich, 2007).
I would argue that the temptation to generalise to a wider population is inherent. In fact, clients would ask: why on earth should I do any research if I cannot generalise to my wider customers? Why run a focus group if we cannot generalise beyond the session?
Williams (2000, in Bryman, 2008) argues that 'moderatum generalizations' are allowable: linkages can be made to similar groups. For example, the behaviour of football hooligans at one football club can be related to other case studies of different football hooligans.
At a broader level, I would assume that one of the points of creating theory in qualitative studies is that it is generalisable. I find the idea of 'theoretical sampling and saturation' interesting: you sample, collect data and analyse until only repeated information emerges (from Grounded Theory - Glaser & Strauss 1967, Strauss & Corbin 1988). Given this, if we have consistent analysis of a positive interaction, then we can assume it will work in general. We are not making a conclusion about frequency but a deeper, more abstract judgement: for example, the button 'affords' clicking through its visual design, and therefore it is a good design feature. However, to what level do we really theorise in user testing?
So where does this leave us? Feedback on interaction is difficult to get via other methodologies. I still want to report interactions that are working well for a design; otherwise I fear the design will be paralysed within continual redesign from scratch. We do violate the principles of generalisation - we are assuming that if everyone in the session understands the check-out process, the wider user population will too.
However, with other types of feedback, such as preferences, perhaps we should not take a 'some information is better than none' approach. User testing can generate hypotheses that should then be examined using other methods such as A/B testing, surveys or web analytics. For example, if 6 out of 8 people liked the content tone, that is an indication it could be working, but it is not a definitive 'yes'. This is where I'm a fan of triangulation: using multiple data points and methodologies to get a clearer idea of the state of the world.
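To make the '6 out of 8' point concrete, here is a quick back-of-the-envelope sketch (my illustration, not part of the original argument) using the standard Wilson score interval for a proportion. It shows just how wide the plausible population range is at such a small sample size:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - margin, centre + margin

# 6 of 8 participants liked the content tone
low, high = wilson_interval(6, 8)
print(f"Plausible population proportion: {low:.0%} to {high:.0%}")
# roughly 41% to 93% - far too wide to call it a definitive 'yes'
```

With 8 participants, the data are consistent with anywhere from a minority to a large majority of users liking the tone, which is exactly why such findings are best treated as hypotheses for A/B tests or surveys.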
I haven't decided what I think about this topic - how can we be pragmatic and helpful without misinforming? I definitely welcome discussion.
References:
Bryman, Alan (2008). Social Research Methods. 3rd edn. Oxford University Press.
Creswell, John W. (2009). Research Design. Sage Publications.
Grbich, Carol (2007). Qualitative Data Analysis. Sage Publications.
Is your first sentence a statement of how you want to limit the scope of the blog - i.e. only to consider qualitative research?
Surely the aim of testing is to discover mismatches between the design in the designer's head and the "real" world, with a view to eliminating misconceptions and tailoring the design to the most likely user personas.
I think it a good idea to build on the good whilst raising user issues. I don't think one can generalise other than to expect a design to comply with proven heuristics as a baseline and to build onto this the requirements of targeted users. Determining these requirements has to be a product of research into user needs, requirements and skill levels. Reference to industry standards and examples of best practice might be useful, but could be misleading depending on the degree to which users have been involved in the evolution of those standards.
Markets (and therefore user profiles), business environments, user skills and user demographics are all potentially dynamic, so one should expect design for UX to follow suit. This means being prepared to modify systems/designs on an ongoing basis - not forgetting, of course, that these days users are getting used to modifying content themselves to have things done their way, and who are we to say they are wrong?
Hi Tony,
I definitely want to think about anything to do with UX in general. That includes quantitative work, for sure.
I think what I find fascinating about research methods is how we find out about the 'real world'. How do we know what the requirements of our targeted population are?
It is a good point about heuristics and design practices. I guess these are a way of embedding the knowledge of many into the design - as long as they are valid principles, as you say.
Oh Giles mentioned an example technique for preferences - Microsoft's 'product reaction cards'.
Anyone have recommendations on soliciting preferences?
So nice to read a blog which actually has references and substance to the opinion. Keep it up :)