User performance depends on conditions

In early June, in a hotel lobby, I stopped to observe someone troubleshooting a wireless connection. I’ve faced this challenge myself, since every hotel seems to have a slightly different process for connecting.

The person I was observing was visually impaired and had his GUI enlarged by about 1000% or more. As he attempted to troubleshoot his wireless connection, he was very rapidly scrolling horizontally and vertically in order to read the text and view the icons in the Wireless Connection Status dialog box. The hugely enlarged GUI flew around the screen. His screen displayed only a small portion of the total GUI, but he never lost his place.

Only part of the screen is visible

In contrast, I lost my place repeatedly. I couldn’t relate the different pieces of information, so what I saw was effectively meaningless to me much of the time. His spatial awareness—his ability to navigate quickly around a relatively large area—was clearly more developed than mine.

I could not keep up with all of the text, either, even when he was reading it to me out loud: “It says ‘Signal Strength’ is 4 bars, but it won’t connect. See?” (Well, actually, I didn’t see.) Though I’m very familiar with this dialog box, I could only read the shorter words as they flew by on screen. The longer words were illegible to me. His ability to read rapidly moving whole words when only parts of them were visible at any given instant was much more developed than mine. I felt sheepish about being functionally illiterate under these conditions.

Flying text is hard to read

It was interesting to see how my own user performance depends on such a narrow range of conditions. I need to see the whole word and its context. I need to see at least half the dialog box at once. And, if the image is moving, it must be moving slowly.

Unusable sinks on Boeing planes

Usability isn’t just about web pages, as you’ll know if you’ve tried to dial a phone number on someone else’s cell phone. Or if you’ve tried to wash your hands on most Boeing airplanes built in the past 30 years:

Taps with awkward levers

The water only flows while you press the lever—one lever for cold water and one lever for warm water. It takes one hand continuously pressing to make the water flow. Rinsing one hand without the help of the other hand is difficult. Rinsing soap off is much easier when two hands do it together.

Some of the newer Boeing aircraft—like the 787 Dreamliner—may have better taps, but I’ve never been on one. An aircraft lasts decades, so passengers will be using those old Boeing sinks and taps for years to come. Airbus planes, on the other hand, have had ergonomic taps for years: one press starts the water flow, leaving both hands free for soaping and rinsing. After a fixed duration, the water stops, but you can always press again to restart it.

While I’m pointing out usability problems in the airline industry, Airbus doesn’t have clean hands, either. On the Airbus web site, type a word in the Search box—the word bathroom, for example—and then press ENTER. Nothing happens. The ENTER key doesn’t start the search, but a mouse click does.

Click OK to start searching

It’s ironic. A design that requires me to move a hand from the keyboard to the mouse is a lot like a design that requires me to move a hand from the sink basin to the tap lever.
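For what it’s worth, the fix on the web side is usually small. Here is a minimal sketch, assuming a hypothetical search box and button rather than Airbus’s actual markup or code, of letting the ENTER key start the search as well as the mouse click:

```typescript
// Hypothetical search box: let the Enter key do what clicking OK does.
// The element IDs are invented for illustration.
const searchInput = document.querySelector<HTMLInputElement>("#search-input");
const searchButton = document.querySelector<HTMLButtonElement>("#search-ok");

function runSearch(): void {
  const query = searchInput?.value.trim();
  if (query) {
    window.location.href = `/search?q=${encodeURIComponent(query)}`;
  }
}

searchButton?.addEventListener("click", runSearch);
searchInput?.addEventListener("keydown", (event) => {
  if (event.key === "Enter") {
    runSearch(); // hands stay on the keyboard
  }
});
```

Simpler still, putting the search field and its button inside a form gives Enter-to-submit behaviour for free. Either way, keyboard users aren’t forced onto the mouse.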

This sugar packet is a movie

Whether it’s ethnographic research, usability research, or marketing research, I’ve learned that the best insights aren’t always gleaned from scheduled research.

Here’s a photo of impromptu research, conducted by Betsy Weber, TechSmith’s product evangelist. I was her research subject. Betsy recorded me pushing sugar packets around a table as I explained how I’d like Camtasia to behave.

Jerome demos an idea to Betsy. Photo by Mastermaq

Betsy takes information like this from the field back to the Camtasia team. There’s no guarantee that my idea will influence future product development, but what this photo shows is that TechSmith listens to its users and customers.

The ongoing stream of research and information that Betsy provides ensures better design of products that will be relevant and satisfying for TechSmith customers down the line.

Cognitive psych in poll design

The WordPress community recently ran a poll. Users were asked to choose one of 11 visual designs. The leading design got only 18% of the vote, which gives rise to such questions as:

  • Is this a meaningful win? The leader only barely beat the next three designs, and 82% voted for other designs.

WordPress poll

I don’t know about the 18% versus 82%. I do wonder whether some of the entries triggered a cognitive process in voters that caused them to pay less attention to the other designs, which may bring the leading design’s razor-thin lead into question. This cognitive process—known as the “ugly option”—is used successfully by designers who deliberately apply cognitive psychology to entice users to act. I’ll explain why below, but first I want to explain my motivation for this blog post.

I’m using this WordPress poll as a jumping-off point to discuss the difficulty of survey design. I’m not commenting on the merit of the designs. (I never saw the designs up close.) And I’m certainly not claiming that people involved in the poll used cognitive psych to affect the poll’s outcome. Instead, in this blog, I’m discussing what I know about cognitive psychology as it applies to the design of surveys such as this recent WordPress.org poll.

Survey design affects user responses

If you’ve heard of the controversial Florida butterfly ballot in the USA’s presidential election in 2000, then you know ballot design—survey design—can affect the outcome. I live outside the USA, but as a certified usability analyst I regularly come across this topic in industry publications; since that infamous election, usability analysts in the USA have been promoting more research and usability testing to ensure good ballot design. I imagine that the Florida butterfly ballot would have tested poorly in a formative usability study.

The recent WordPress poll, however, would likely have tested well in a usability study to determine whether WordPress users could successfully vote for their choice. The question I have is whether the entries themselves caused a cognitive bias in favour of some entries at the expense of others.

It seems that one design was entered multiple times, as dark, medium, and light variations. That sounds like a good idea: “Let’s ask voters which one is better.” Interestingly, the visual repetition—the similar images—may have an unintended effect once other designs are added to the mix. Cognitive science tells us people are more likely to select one of the similar entries. Consider this illustration:

More people choose the leftmost image. The brain’s tendency to look for patterns keeps it more interested in the two similar images. The brain’s tendency to avoid the “ugly option” means it’ll prefer the more beautiful one of the two. Research shows that symmetry correlates with beauty across cultures, so I manipulated the centre image in Photoshop to make it asymmetrical, or “uglier”.

The ugly-option rule applies to a choice between different bundles of goods (like magazine subscriptions with different perks), different prices (like the bottles on a restaurant wine list), and different appearances (like the photos, above). It may have applied to the design images in the WordPress poll. The poll results published by WordPress.org list the intentional variations in the table of results:

  • DR1: Fluency style, dark
  • DR2: Fluency style, medium
  • DR3: Fluency style, light

The variants scored 1st, 4th, and 6th

In addition to these three, which placed 1st, 4th, and 6th overall, there may have been other sets of variations, because other entries may have resembled each other, too.

As a usability analyst and user researcher, I find this fascinating. Does the ugly-option rule hold true when there are 11 options? Was the dark-medium-light variation sufficient to qualify one of the three as ugly? Did the leading design win because it was part of a set that included an ugly option? And, among the 11 entries, how many sets were there?

There are ways to test this.

Test whether the poll results differ in the absence of an ugly-option set. A/B testing is useful for this: give half the users poll A, with only one of the dark-medium-light variants, and give the other half poll B, with all three variants included. You can then compare the two result sets. If there is a significant difference, further combinations can be tested to see whether other possible explanations can be ruled out.
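As a sketch of how that comparison could be run, with invented vote counts rather than the real poll data, a two-proportion z-test would tell you whether the leading design’s share differs between poll A and poll B:

```typescript
// Two-proportion z-test: does the leading design's vote share differ
// between poll A (one Fluency variant) and poll B (all three variants)?
// The vote counts below are invented for illustration.
function twoProportionZ(
  votesA: number, totalA: number,
  votesB: number, totalB: number
): number {
  const pA = votesA / totalA;
  const pB = votesB / totalB;
  const pooled = (votesA + votesB) / (totalA + totalB);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / totalA + 1 / totalB)
  );
  return (pA - pB) / standardError;
}

const z = twoProportionZ(220, 1000, 180, 1000);
console.log(`z = ${z.toFixed(2)}`); // |z| > 1.96 suggests a real difference at the 5% level
```

With 11 options you would probably compare the whole result tables (a chi-square test) rather than a single pair of shares, but the principle is the same: treat the two polls as independent samples and ask whether the difference could plausibly be chance.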

For more about the ugly option and other ways to make your designs persuasive, I recommend watching Kath Straub and Spencer Gerrol in the HFI webcast, The Science of Persuasive Design: Convincing is converting, with video and slides. There’s also an audio-only podcast and an accompanying white paper.

Eyetracking: “I’m typical”

If you’ve ever wondered where exactly on your web site or software your readers or users are looking, eye tracking will tell you that. The eye-tracking equipment emits a specific wavelength of light (invisible to humans) that helps the eye tracker to follow your eyes. As the light bounces off your retinas and back to the eye-tracker’s camera, its software calculates where you were looking, and for how long.

There are different ways to display the results. You can see the data as a “video” that shows a sequence of dots marking everywhere you looked; larger dots indicate longer fixations. You can also see the data as a cumulative heat map, similar to this:

Eye-tracking heat map

Here’s something interesting I learned about myself. When I participate in an eye-tracking study that studies a photograph—such as a full-page magazine ad—I look at all the same places for about the same duration as other participants in the study. I know this because the composite heat map, which combines the eye-tracking data of all the participants into one heat map, looks indistinguishable from my individual heat map. It turns out I’m normal, after all.
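To make the aggregation concrete, here is a rough sketch, with invented data shapes rather than any particular eye tracker’s export format, of how fixation durations can be binned into a grid for one participant or pooled across everyone for a composite heat map:

```typescript
// Sum fixation durations into a coarse grid; the grid is what gets
// rendered as a heat map. The data shapes are invented for illustration.
interface Fixation {
  x: number;          // screen position, in pixels
  y: number;
  durationMs: number; // how long the gaze rested here
}

function buildHeatmap(
  fixations: Fixation[],
  width: number, height: number, cellSize: number
): number[][] {
  const cols = Math.ceil(width / cellSize);
  const rows = Math.ceil(height / cellSize);
  const grid = Array.from({ length: rows }, () => new Array<number>(cols).fill(0));

  for (const f of fixations) {
    const col = Math.min(cols - 1, Math.floor(f.x / cellSize));
    const row = Math.min(rows - 1, Math.floor(f.y / cellSize));
    grid[row][col] += f.durationMs; // longer fixations make a cell "hotter"
  }
  return grid;
}

// A composite heat map simply pools every participant's fixations first.
const participants: { fixations: Fixation[] }[] = [
  { fixations: [{ x: 102, y: 88, durationMs: 310 }, { x: 640, y: 412, durationMs: 190 }] },
  { fixations: [{ x: 95, y: 80, durationMs: 260 }] },
];
const composite = buildHeatmap(participants.flatMap((p) => p.fixations), 1280, 1024, 40);
console.log(`${composite.length} x ${composite[0].length} cells`);
```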

Eye tracking has helped researchers answer questions such as these:

If you’re interested in eye tracking and usability and want to read more, try Eye Tracking as Silver Bullet for Usability Evaluations? by Markus Weber.

Usability of a potential design

Three-quarters of the way through a Five Sketches™ session, to help iterate and reduce the number of possible design solutions, the team turns to analysis. This includes a usability analysis.

Generative design, stage 3

After (1) informing and defining the problem without judgement, and (2) generating and sketching lots of ideas without judgement, it’s often a relief for the team to start (3) analysing and judging the potential solutions, taking into account the project’s business goals, development goals, and usability goals.

But what are the usability goals? How can a team quickly assess whether potential designs meet those usability goals? One easy answer is to provide the team with a project-appropriate checklist.

Make your own checklist. You can make your own or find one on the Internet. To make your own, start with a textbook that you’ve found helpful and inspiring. For me, that’s About Face by Alan Cooper. To this, I add things that my experience tells me will help the team—my “favourites” or my pet peeves. In this last category I might consult the Ribbon section of the Vista UX Guide, the User Interface section of the iPhone human-interface guidelines, and so on.

[local /wp-content/uploads/2009/04/make-usability-checklists.wmv]

Ten-year-old advice

Fresh advice, still:

“Usability goals are business goals. Web sites that are hard to use frustrate customers, forfeit revenue, and erode brands.

Executives can apply a disciplined approach to improve all aspects of ease-of-use. Start with usability reviews to assess specific flaws and understand their causes. Then fix the right problems through action-driven design practices. Finally, maintain usability with changes in business processes.”

—McCarthy & Souza, Forrester Research, September 1998

How to test earlier

Involving users throughout the software-development cycle is touted as a way to ensure project success. Does usability testing count as user contact? You bet! But since most companies test their products later in the process, when it’s difficult to react meaningfully to the user feedback, here are two ways to get your testing done sooner.

Prioritise. Help the Development team rank the importance of the individual programming tasks, and then schedule the important tasks to complete early.

  • If a feature must be present in order to have meaningful interaction, then develop it sooner.
  • For example, email software that doesn’t let you compose a message is meaningless. To get meaningful feedback from users, they need to be able to type an email.

    Developers often want to start with the technologically risky tasks. Addressing that risk early is good, but it must be balanced against the risk of a product that’s less usable or unusable.

  • If a feature need not be present or need not be working fully in order to have meaningful interaction, then provide hard-coded actions in the interim, and add those features later.
  • For example, if the email software lets users change the message priority from Standard to Important, hard-code it for the usability test so the priority is always Standard. (A short code sketch follows this list.)

  • If a less meaningful feature must be tested because of its importance to the business strategy, then develop it sooner.
  • For example, email software that lets users record a video may be strategically important for the company, though users aren’t expected to adopt it widely until most laptops ship with built-in cameras.
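To make the hard-coding idea above concrete, here is a minimal sketch, with hypothetical names and a hypothetical build flag rather than any particular product’s code, of pinning an unfinished feature for a usability-test build:

```typescript
// Hypothetical stub: the priority picker isn't needed for meaningful
// interaction yet, so the usability-test build pins it to "Standard".
type Priority = "Standard" | "Important";

const USABILITY_TEST_BUILD: boolean = true; // set only in the test build

function messagePriority(userSelection: Priority): Priority {
  if (USABILITY_TEST_BUILD) {
    return "Standard"; // hard-coded until the real feature is developed
  }
  return userSelection;
}

console.log(messagePriority("Important")); // "Standard" in the test build
```

The stub keeps the scenario flowing for study participants, and it is trivial to remove once the real feature is scheduled.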

Schedule. For each feature to be tested, get the Development team to allocate time to respond to usability recommendations, and then ensure this time is neither reallocated to problem tasks, nor used up during the initial development effort of the to-be-tested features. Engage the developers by:

  • Sharing the scenarios in advance.
  • Updating them on your efforts to recruit usability-study participants.
  • Retesting after developers incorporate your recommendations, and then reporting the improvements in user performance.

Development planning that prioritises programming tasks based on the need to test, and then allows time in the schedule to respond to recommendations, is more likely to result in usable, successful products.

User mismatch: discard data?

When you’re researching users, every once in a while you come across one who is an anomaly. You must decide whether to exclude their data points from the set or whether to adjust your model of the users.

Let me tell you about one such user. I’ll call him Bob (not his real name). I met Bob during a day of back-to-back usability tests of a specialised software package. The software has two categories of users:

  • Professionals who interpret data and use it in their design work.
  • Technicians who mainly enter and check data. A senior technician may do some of the work of a professional user.

When Bob walked in, I went through the usual routine: welcome; sign this disclaimer; tell me about the work you do. Bob’s initial answers identified him as a professional user. Once on the computer, though, Bob was unable to complete the first step of the test scenario. Rather than try to solve the problem, he sat back, folded his hands, and said: “I don’t know how to use this.” Since Bob was unwilling to try the software, I instead had a conversation (actually an unstructured interview) with him. Here’s what I learned:

  • Bob received no formal product training; he was taught by his colleagues. Typical of more than half the professionals.
  • Bob has a university degree that is only indirectly related to his job. Atypical of professionals.
  • He’s young (graduated three years ago). Atypical of professionals, but desirable because many of his peers are expected to retire in under a decade.
  • Bob moved to his current town because his spouse got a job there, and he would be unwilling to move to another town for work. Atypical: professionals in this industry typically work in remote locations for high pay.
  • Bob is risk averse. Typical of the professionals.
  • He is easily discouraged and isn’t inclined to troubleshoot. Atypical: professionals take responsibility for driving their own troubleshooting.
  • Bob completes the same task once or several times a day, with updated data each time. Atypical of professionals, but typical of technicians.

I decided to discard Bob’s data from the set.

The last two observations are characteristic of a rote user. Some professionals are rote users because they don’t know the language of the user interface, but this did not apply to Bob. There was a clear mismatch between the work Bob said he does and both his lack of curiosity and his non-performance in the usability lab. These usability tests took place before the 2008 economic downturn, when professionals in Bob’s industry were hard to find, so I quietly wondered whether hiring Bob had been a desperation move on the part of his employer.

If Bob had been a new or emerging type of user, discarding his data would have been a mistake. Imagine if Bob had been part of a new group of users:

  • What would be the design implications?
  • What would be the business implications?
  • Would we need another user persona to represent users like Bob?

Usability testing distant users

When a product’s users are scarce and widely dispersed, and your travel budget is limited, usability testing can be a challenge.

Remote testing from North America was part of the answer, for me. I’ve never used UserVue because the users I needed to reach were in Africa, Australia, South America, and Asia—continents that UserVue doesn’t reach. Even within North America, UserVue didn’t address the biggest problems I faced:

  • My study participants commonly face restrictive IT policies, so they cannot install our pre-release product and prerequisites.
  • I need to prevent study participants from risking their data by using it with a pre-release product.
  • There’s no way to force an uninstall after the usability test. Who else will see our pre-release?

Instead, I blended a solution of my own with Morae, Skype, Virtual Audio Cable, and GoToMeeting. I used GoToMeeting to share my desktop, which addresses all three of the problems listed above. I used Skype to get video and audio. I used Virtual Audio Cable to redirect the incoming voice from Skype to Morae Recorder’s microphone channel. Morae recorded everything except the PIP video. It worked. However, my studies were sometimes limited by poor Internet bandwidth to the isolated locations of my study participants.

Amateur facilitators. I realise this is controversial among usability practitioners, but beggars in North America can’t be choosers about how they conduct usability tests on other continents. I developed a one-hour training session for the team of travelling product managers. Training included a short PowerPoint presentation about the concepts, followed by use of Morae Recorder with webcam and headset while role-playing in pairs. The main points I had to get across:

  • Between study participants, reset the sample data and program defaults.
  • When you’re ready to start recording, first check that the video and audio are in focus and recording.
  • While you facilitate, do not lead the user. Instead, try paraphrasing and active listening (by which I mean vernacular elicitation). Remember that you’re not training the users, so task failure is acceptable, and useful to us.

I had a fair bit of influence over the quality of the research, since I developed the welcome script and test scenarios, provided the sample data, and analysed the Morae recordings once they arrived in North America. Due to poor Internet bandwidth to the isolated areas of my study participants, the product managers had to ship me the Morae recordings on DVD, by courier.

It worked. I also believe that amateur facilitation gave the product managers an additional opportunity to learn about customers.