Testing in the UX-design process

Three weeks ago, a client called me. They had just completed release 1.0 of a new Web application that will replace their current flagship product. The client was asking about summative usability testing to evaluate how well the product performs in the hands of users, because they want their customers to succeed.

Since the product is an enterprise-wide product that requires training, one thing the client specifically asked about was whether the Help is a help to users.

Do you need help?A quick heuristic review I did turned up no obvious problems in the Help, so we decided on user observation with scenarios. In a preparatory dry run done a few weeks ago, I supplied a participant with a few scenarios and some sample data. The participant I observed was unable to start two of the scenarios, and completed the third scenario incorrectly by adding data to the wrong database.

The Help didn’t help her. The participant was able to find the right Help topic, but she completely misinterpreted the first step in the Help’s instructions.

The team had not anticipated the apparent problem that turned up during the dry run. Assuming it is a real problem—and this can’t be more than an assumption given the sample size of 1—this story nicely illustrates the benefit of summative testing, as you’ll see below.

Best practices working together

The team, including a product manager, several developers, a technical communicator as Help author, and me as a contract usability analyst, used these best practices:

  • The Help author used a single-sourcing method. The most common GUI names, phrases, and sentences, are re-used, inserted into many topics from one source, like a variable. In almost every Help topic, the problematic first step was one such re-usable snippet.
  • The product manager assesses the bugs based on severity and cost, ensuring the low-hanging fruit and most serious of defects get priority.
  • In a heuristic review of the Help, I (wearing a usability-analyst hat) did not predict that the first step in most topics would be misinterpreted. Heuristic reviews, when conducted by a lone analyst, typically won’t predict all usability problems.
  • The developers use an Agile method. At this stage of their development cycle, they build a new version of the product every Friday, and, after testing, publish it the following Friday.

After the dry run uncovered the apparent problem, the product manager said: “Let’s fix it.” Since the Help author used re-usable snippets, rewording in one place quickly fixed the problem throughout the Help. And the company’s Agile software development method meant the correction has already been published.

Was this the right thing to do? Should an error found by one participant during a dry run of upcoming usability tests result in a change? The team’s best practices certainly made this change so inexpensive as to be irresistible. With the first corporate customer already migrated to the new product, my client has a lot riding on this. I can’t be certain this rewritten sentence has improved the Help, but—along with the other bugs they’ve fixed—I know it increases my client’s confidence and pride in their product’s quality.

It’ll be interesting to see what the upcoming user observations turn up.

Reminding myself of things I already know

The actual user-observation sessions are still ten days away, but the dry run reminded me of things I already know:

  • Despite each professional’s best efforts, there will always be unanticipated outcomes where users are involved. Users have a demonstrated ability to be more “creative” than designers, developers, and content authors, simply by misinterpreting and making unintended use/misuse of our work.
  • The best practices in each discipline can dovetail and work together to allow rapid iteration of the product by the team as a whole. A faster response means fewer users will be affected and the cost of support—and of the rapid iteration—will be lower. A good development process adjusts practices across teams (product management, research, development, user experience, design, tech-comm, quality assurance) so the practices dovetail rather than conflict.
  • Summative testing helps validate and identify what needs to be iterated. Testing earlier and more often means that fewer or perhaps no users will be affected. Testing earlier and more often is a great way to involve users, a requirement for user-centred design, or UCD. It also changes the role of testing from summative to formative, as it shapes the design of the product before release, rather than after.

Design requires courage and trust, not just user involvement

Designing is usually a rewarding activity, but the path from start to finish can be filled with frustration and even panic. I’ve seen design processes work—and come to the realisation that “My own designs benefited from rapid iteration!”

The benefit of designThese humbling experiences helped me learn to trust the process, even in the face of frustration or panic. It’s these experiences that give me the courage to follow the design process, even when it isn’t clear how to resolve the tension between conflicting design constraints.

In the face of an unknown, individuals and especially teams tend to turn to knowns. If needed, they’ll manufacture the known data, by deferring the choice to users. Here’s part of what Larry Constantin wrote about courage in software design, in a paper that advocates for user involvement at the right time:

Most damning and least recognized among the limitations of user-centered design is the way it subtly discourages courage. Courage is one of the central tenets of extreme programming and agile development methods. […] User-centered design makes it too easy for designers to abdicate responsibility in deference to user preference, user opinion, and user bias. In truth, it is hard to stick with something you know works when users are screwing up their faces at it. What if you are wrong? What if you are not as good a designer as you thought you were? It takes real courage and conviction to stand up for an innovative design in the face of users who complain that it is not what they expected or who want it to work just like some other software or who object to certain sorts of features as a matter of course. It takes responsible judgment to know when to listen to users and when to ignore them.

In the many design sessions I have facilitated, three times I’ve seen that lack of courage expressed by a participant. Each time, it sounded like a mix of panic and frustration:

The solution has been on the wall since the first round!

The design sessions I facilitate ask participants to saturate the design space with lots of ideas. They each bring five sketches—five substantially different ideas—and then, after sharing their ideas with the other participants, they rapidly iterate the first 15 or 20 sketches to develop even more. All this takes place before any analysis.

When the goal is to saturate the design space—to identify as many solutions as possible in a short time—there’s more to influence the design once the analysis begins. Inevitably, the design that the team decides on was not already on the wall. Motivated design participants quickly learn this, and—in most cases—become advocates of the process.

For most development teams, the Five Sketches™ process I introduce is a departure from the status quo, so it takes courage for their team members to take a stand, to say “I will use this process” for design problems that need it.

How to test earlier

Involving users throughout the software-development cycle is touted as a way to ensure project success. Does usability testing count as user contact? You bet! But since most companies test their products later in the process, when it’s difficult to react meaningfully to the user feedback, here are two ways to get your testing done sooner.

Prioritise. Help the Development team rank the importance of the individual programming tasks, and then schedule the important tasks to complete early.

  • Prioritise and schedule the tasksIf a feature must be present in order to have meaningful interaction, then develop it sooner.
  • For example, email software that doesn’t let you compose the message is meaningless. To get meaningful feedback from users, they need to be able to type an e-mail.

    Developers often want to start with the technologically risky tasks. Addressing that risk early is good, but it must be balanced against the risk of a product that’s less usable or unusable.

  • If a feature need not be present or need not be working fully in order to have meaningful interaction, then provide hard-coded actions in the interim, and add those features later.
  • For example, if the email software lets users change the message priority from Standard to Important, hard-code it for the usability test so the priority is always Standard.

  • If a less meaningful feature must to be tested because of its importance to the business strategy, then develop it sooner.
  • For example, email software that lets users record a video may be strategically important for the company, though users aren’t expected to adopt it widely until most laptops ship with built-in cameras.

Schedule. For each feature to be tested, get the Development team to allocate time to respond to usability recommendations, and then ensure this time is neither reallocated to problem tasks, nor used up during the initial development effort of the to-be-tested features. Engage the developers by:

  • Sharing the scenarios in advance.
  • Updating them on your efforts to recruit usability-study participants.
  • After developers incorporate your recommendations, retesting and then reporting improvements in user performance.

Development planning that prioritises programming tasks based on the need to test, and then allows time in the schedule to respond to recommendations, is more likely to result in usable, successful products.

Which user involvement works

User-centred design (UCD) advocated involving users in the design process. Have you wondered what form that user involvement could take, and which forms lead to the most successful outcomes?

I recently came across data that Mark Keil published a while ago. He surveyed software companies and correlated project outcomes with the type of user access that designers and developers had.

Type of contact with users Effectiveness
  For custom software projects
  Facilitate teams, hold structured workshop with users, or use joint-application development (JAD). ██████████
  Expose users to a UI prototype or early version to uncover any UI issues. ██████
  Expose users to a prototype or early version to discover the system requirements. ████
  Hold one-on-one semi-structured or open-ended interviews with users. ████
  Test the product internally (acceptance testing rather than QA testing for bugs) to uncover new requirements. ██
  Use an intermediary to define user goals and needs, and to convey them to designers and developers. ██
  Collect user questions, requirements, and problems indirectly, by e-mail or online locations.
  For packaged or mass-market software projects
  Listen to live/synchronous phone support, tech-support, or help-desk calls. ████████
  Hold one-on-one semi-structured or open-ended interviews with users. ██████
  Expose users to a UI prototype or early version to uncover any UI issues. ████
  Convene a group of users, from time to time, to discuss usage and improvements. ████
  Expose users to a prototype or early version to discover the system requirements. ██
  Test the product internally (acceptance testing rather than QA testing for bugs) to uncover new requirements. ██
  Consult marketing and sales people who regularly meet with and listen to customers. ██
  At trade shows, show a mock-up or prototype to users and get their feedback.
Not reported as effective in this 1995 source
  Conduct a (text) survey of a sample of users.
  Conduct a usability test to “tape and measure” users in a formal usability lab. (This study precedes such products as TechSmith Morae.)
  Observe users for an extended period, or conduct ethnographic research.
  Conduct focus groups to discuss the software.

Although Keil’s article includes quantitative data, his samples are small. I opted to show only the relative usefulness of various methods. My descriptions, above, are long because the original article uses 1995 terms that have shifted in meaning. I believe some of the categories now overlap, due to changes in technology and method. For example, getting users to try a prototype of the UI in order to uncover UI issues sounds like the early usability testing that I do with TechSmith Morae, yet the 1995 results gave these activities very different effectiveness ratings.

For details, see the academic article by Mark Keil (Customer-developer links in software development). Educational publishers typically require a fee for access.

These methods also relate to research I’m doing on epistemology of usability analysis.

Rigid UCD methodology fails?

I received an e-mail from someone at the 2008 IA Summit about Jared Spool’s declaration that UCD is dead:

——Forwarded message——
From: P
Date: Sun 13/04/2008, 2:54 PM

Hi Jerome,

I’m at the iA Summit in Miami right now, and hearing about all of the things that are going on makes me think of you. One of the interesting sessions was Jared Spool’s keynote speech. He conducted research into what makes certain companies better able to produce effective designs. He used this model to talk about the various approaches departments do to facilitate design:

Things that facilitate design

He said all design involves a process, whether it’s been formalized or not. Interesting, though not surprising: companies that have dogmatic UCD leadership or use a rigid UCD methodology are unlikely to create anything innovative. To innovate, you want to apply techniques in sometimes surprising ways to solve problems that they were not intended for (those are the “tricks”.)

OK, I’m going back out in the warm (hot!) weather.

– P

Of course, the lack of process doesn’t guarantee innovation, either, nor does it guarantee you’ll be able to repeat your (accidental) successes. I believe a successful design process must involve some form of generative design—as Five Sketches™ does—that’s based on knowledge of user condition. I also beleive that, once you’ve internalised those two things, you can use almost any form of facilitation to design good products.

Complicated GUI is fixable

According to usability guru Jakob Nielsen, the worst mistakes in GUI design are domain-specific. Usually, he says, applications fail because they:

  • solve the wrong problem.
  • have the wrong features for the right problem.
  • make the right features too complicated to understand.

Nielsen’s last point reminds me of what a product manager once told me: many users of highly specialised software think of themselves as experts, but only few are. His hypothesis? Elaborate sets of features are too numerous or complex to learn fully.

Cookie on a plateOne of my projects involved software for dieticians. The software allowed users to enter a recipe. The software would calculate the nutritive value per portion. Users learned the basic settings for an adequate result. They ignored the extra features that could take into consideration various complex chemical interactions between the recipe ingredients. The extra features—the visual and cognitive complexity—got ignored. Ironically, their very presence increased the likelihood that users would satisfice, or avoid the short-term pain of learning something new. When the product was developed, each extra bit seemed a good idea, and they may also each have helped sell the product. But, good idea or not, those extra features needed to be removed, or hidden from the majority of users, or redesigned.

Resolving the “extra features” problem
  1. If the extra features are superfluous, remove them. Usage data can help identify seldom-used features, and many of our products are capable of collecting usage data, though we currently only collect it after crashes and mainly during Beta testing. However, removing a seldom-used part of an existing feature is a complex decision, and one for the Product Manager to make. The difficulty lies in determining whether a feature would be used more if it were simpler to use. In that case, it may not be superfluous.
  2. If the extra features are used only occasionally by relatively few users, then hide them. The suggested GUI treatment for an occasional-by-few control is to expose it only in the context of a related task. Do not clutter the main application window, menu bar, or the main dialog boxes with controls for occasional-by-few tasks. Hiding the controls for an occasional-by-few task is supported by the Isaacs-Walendowski frequency-commonality grid:
  3.   If the
      feature is…
    by many
    by few
    GUI treatment:
    •  Visible.
    •  Few clicks.
    GUI treatment:
    •  Suggested.
    •  Few clicks.
    GUI treatment:
    •  Suggested.
    •  More clicks.
    GUI treatment:
    •  Hidden.
    •  More clicks.
  4. If the extra feature is to be a core feature, simplify it. I’m talking about a feature that the Product Manager believes would be used frequently or by many if users could figure it out. Burying or hiding such features isn’t the answer. You need to find ways to reduce complexity by designing the interaction well and by organising the GUI well. For this, Five Sketches™ can help.
What are the requirements, really?

All this begs the question: who can tell us which features are the extra features (the features to omit), which ones are occasional-by-few (the features to hide), and which ones are used frequently or by many users (the features on which to focus your biggest design-guns)? Nielsen says “Base your decisions on user research” and then test the early designs with users. He adds:

People don’t want to hear me say that they need to test their UI. And they definitely don’t want to hear that they have to actually move their precious butts to a customer location to watch real people do the work the application is supposed to support. The general idea seems to be that real programmers can’t be let out of their cages. My view is just the opposite: no one should be allowed to work on an application unless they’ve spent a day observing a few end users.  …More.

Conclusion: conduct user research and use what you learn to inform the design.