Chip-card usability: Remove the card to fail

I went to the corner store, made a purchase, and tried to pay with a chip card in a machine that verifies my PIN. My first attempt failed because I pulled my card out of the card reader too soon, before the transaction was finished. I should add that I removed my card when the machine apparently told me to.

The machine said: “REMOVE CARD”

And just as I pulled my card out, I noticed the other words: “PLEASE DO NOT”

Have you done this, too…?

Since making a chip-card payment is an everyday task for most of us, I wonder: “What design tweaks would help me—and everyone else—do this task correctly the first time, every time?” Who would have to be involved to improve the success rate?

Ideas for a usable chip-card reader

A bit of brainstorming produced a list of potential solutions.

  • Less shadow. Design the device so it doesn’t cast a shadow on its own screen. The screen of the card reader I used was sunk deeply below its surrounding frame, and the frame cast a shadow across the “PLEASE DO NOT” phrase. (See the illustration.)
  • Better lighting. Ask the installer to advise the merchant to reduce glare at the cash register, by shading the in-store lighting and windows.
  • Freedom to move. The device I used was mounted to the counter, so I couldn’t turn it away from the glare.
  • Layout. Place the two lines of text—”PLEASE DO NOT” and “REMOVE CARD”—closer together, so they’re perceived as one paragraph. When perceived as separate paragraphs, the words “REMOVE CARD” are an incorrect instruction.
  • Capitalisation. Use sentence capitalisation to show that “remove card” is only part of an instruction, not the entire instruction.
  • Wording. Give the customer a positive instruction: “Leave your card inserted” could work. But I’d test with real customers to confirm this.
  • Predict the wait time. Actively show the customer how much longer to wait before removing their card: 15 seconds…, 10 seconds…, and so on. (A sketch of this idea follows the list.)
  • Informal training. Sometimes, the cashier tells you on which side of the machine to insert your card, when to leave it inserted, and when to remove it.
  • Can you think of other ideas?
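
To make the wording and wait-time ideas concrete, here is a minimal sketch, in TypeScript, of a display routine that keeps the instruction positive and counts down the estimated wait. The function name renderPrompt and the 15-second estimate are hypothetical; a real card reader would drive its own display hardware.

```typescript
// Hypothetical card-reader display logic, sketched for illustration only.

function renderPrompt(secondsRemaining: number): string {
  if (secondsRemaining > 0) {
    // One positive instruction on a single line, with the estimated wait shown.
    return `Leave your card inserted (about ${secondsRemaining} seconds left)`;
  }
  return "Remove your card";
}

// Example: count down from an estimated 15-second wait in 5-second steps.
for (let s = 15; s >= 0; s -= 5) {
  console.log(renderPrompt(s));
}
```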

Listing many potential ideas—even expensive and impractical ones—is a worthwhile exercise, because a “poor” idea may trigger other ideas—affordable, good ideas. After the ideas are generated, they can be evaluated. Some would be costly. Some might solve one problem but cause another. Some are outside of the designers’ control. Some would have to have been considered while the device was still on the drawing board. Some are affordable and could be applied quickly.

Making improvements

Designers of chip-card readers have already made significant improvements by considering the customer’s whole experience, not just their use of the card-reader machine in isolation. In early versions, customers often forgot their cards in the reader. With a small software change, the card must now be removed before the cashier can complete the transaction. This dependency ensures customers take their card with them after they pay. One brand of card reader is designed for customers to insert their card upright, perpendicular to the screen. This makes the card more obvious, and—I’m giving the designer extra credit—the upright card provides additional privacy to help shield the customer’s PIN from prying eyes. These changes show that the design focus is now on more than just verifying the PIN; it’s about doing it quickly and comfortably, without compromising future use of the card. It’s about the whole experience.

A good hardware designer works with an interaction designer to make a device that works well in its environment. A good user-experience designer ensures customers can succeed with ease. A good usability analyst tests the prototypes or early versions of the device and the experience to find any glitches, and recommends how to fix them.

Unreliability of self-reported user data

Many people are bad at estimating how often and how long they’re on the phone. Interestingly, you can predict who will overestimate and who will underestimate their phone usage, according to the 2009 study “Factors influencing self-report of mobile phone use” by Dr Lada Timotijevic et al. For this study, a self-reported estimate is considered accurate if it is within 10% of the actual number:

Defining 'accuracy'

|  | Underestimated (% of people) | Accurate (% of people) | Overestimated (% of people) |
| --- | --- | --- | --- |
| Number of phone calls |  |  |  |
| High user | 71% | 10% | 19% |
| Medium user | 53% | 21% | 26% |
| Low user | 33% | 16% | 51% |
| Duration of phone calls |  |  |  |
| High user | 41% | 20% | 39% |
| Medium user | 27% | 17% | 56% |
| Low user | 13% | 6% | 81% |

If people are bad at estimating their phone use, does this mean that people are bad at all self-reporting tasks?

Not surprisingly, it depends how long it’s been since the event they’re trying to remember. It also depends on other factors. Here are some findings that should convince you to be careful with the self-reported user data you collect.

What’s the problem with self-reported data?

On questions that ask respondents to remember and count specific events, people frequently have trouble because their ability to recall is limited. Instead of answering “I’m not sure,” people typically use partial information from memory to construct or infer a number. In 1987, N.M. Bradburn et al found that U.S. respondents to various surveys had trouble answering such questions as:

  • During the last 2 weeks, on days when you drank liquor, about how many drinks did you have?
  • During the past 12 months, how many visits did you make to a dentist?
  • When did you last work at a full-time job?

To complicate matters, not all self-report data is suspect. Can you predict which data is likely to be accurate or inaccurate?

  • Self-reported Madagascar crayfish harvesting—quantities, effort, and harvesting locations—collected in interviews was shown to be reliable (2008, Julia P. G. Jones et al).
  • Self-reported eating behaviour by people with binge-eating disorders was shown “acceptably” reliable, especially for bulimic episodes (2001, Carlos M. Grilo et al).
  • Self-reported condom use was shown accurate over the medium term, but not in the short term or long term (1995, James Jaccard et al).
  • Self-reported numbers of sex partners were underreported, and sexual experiences and condom use were overreported, a year later, when compared to what the same people had self-reported at the time (2002, Maryanne Garry et al).
  • Self-reported questions about family background, such as father’s employment, result in “seriously biased” research findings in studies of social mobility in the Netherlands—by as much as 41% (2008, Jannes Vries and Paul M. Graaf).
  • Participation in a weekly worship service is overreported in U.S. polls. Polls say 40% but attendance data says 22% (2005, C. Kirk Hadaway and Penny Long Marler).

Can you improve self-reported data that you collect?

Yes, you can. Consider these:

  • Decomposition into categories. Estimates of credit-card spending get more accurate if respondents are asked for separate estimates of their expenditures on, say, entertainment, clothing, travel, and so on (2002, J. Srivastava and P. Raghubir).
  • For your quantitative or qualitative usability research or other user research, it’s easy to write your survey questions or your lines of inquiry so they ask for data in a decomposed form. (A small sketch of this appears after the list.)

  • Real-time data collection. Collecting self-reported real-time data from patients in their natural environments “holds considerable promise” for reducing bias (2002, Michael R. Hufford and Saul Shiffman).
  • This finding is from 2002. Social-media tools and handheld devices now make real-time data collection more affordable and less unnatural. For example, use text messages or Twitter to send reminders and receive immediate direct/private responses.

  • Fuzzy set collection methods. Fuzzy-set representations provide a more complete and detailed description of what participants recall about past drug use (2003, Georg E. Matt et al).
  • If you’re afraid of math but want to get into fuzzy sets, try a textbook (for example, Fuzzy set social science by Charles Ragin), audit a fuzzy-math course for social sciences (auditing is a low-stakes way to get things explained), or hire a tutor in math or sociology/anthropology to teach it to you.
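
To show what decomposition can look like in practice, here is a minimal sketch, in TypeScript, of estimating monthly credit-card spending from per-category answers rather than one global guess. The category names and amounts are made up for illustration.

```typescript
// Hypothetical survey-analysis snippet, for illustration only.
// Instead of asking "Roughly how much did you spend on your credit card last
// month?", ask for a separate estimate per spending category and sum them.

const categoryEstimates: Record<string, number> = {
  entertainment: 120,
  clothing: 85,
  travel: 240,
  groceries: 310,
};

const estimatedTotal = Object.values(categoryEstimates).reduce(
  (sum, amount) => sum + amount,
  0,
);

console.log(`Decomposed estimate of monthly spending: $${estimatedTotal}`);
```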

Also, when there’s a lot at stake, use multiple data sources to examine the extent of self-report response bias, and to determine whether it varies as a function of respondent characteristics or assessment timing (2003, Frances K. Del Boca and Jack Darkes). Remember that your qualitative research is also one of those data sources.

The delight of insight

One of the things I really like about usability research is that moment of insight, when I see a problem. In the comic book of my life, those moments look like this:

Oh!

This experience—this feeling of surprise and delight—is not about finding an error. It’s about learning how the product performs in the hands of users, so it can be improved. In a sense, GUI design is like a puzzle that must be assembled well in order for the user’s brain to see the intended picture. It’s rare that a team gets it exactly right on the first try. That’s why a good development process expects and plans for rapid iteration—and expects formative testing of either prototypes or early working versions to inform the next iteration.

Rapid iteration

Fling that thing! If it won’t fly, bend the wing!
Formative testing of Iteration 1 informs the design of Iteration 2.

For me, realising there’s a problem means we can iterate the product to address the problem. That’s exciting. But realising there’s a problem can also be disappointing or humbling. From time to time, I’ve said to myself:

“I thought this design was really good.”

“Why didn’t I see that problem coming? It’s SO obvious, in hindsight.”

This experience points to the problem with heuristic reviews done by an individual expert. (In contrast, experts who work in pairs come closer to finding all the problems in the set.) But either way, experts cannot predict all the usability and interaction problems in a software product. That’s why testing and a process that includes rapid iteration are necessary.

The insight and the subsequent exploration of the problem lead to the best parts of the experience: deciding how to present the test results most effectively (as a critique of the prototype, not as a criticism of the effort to date), and facilitating the process of designing the next iteration.

Learning from a poke in the face

During usability testing, I’m always fascinated to see how creatively users misinterpret the team’s design effort. I’ve seen users blame themselves when our design failed, and I’ve seen users yell at the screen because our GUI design was so frustrating.

One Wednesday, over a decade ago, the tables were turned.

I unintentionally “agreed” to let Facepoke—that social-networking site—invite everyone with whom I’d ever exchanged e-mail. Think about all the people you may have exchanged e-mail with. Former bosses and CEOs. Your kid’s teachers and the principal, too. People you used to date. Prospective business partners, or people you’ve asked for work but who turned you down. Your phone company, car-rental company, bank, and insurance company. Government agencies. The person you just told “I’m too busy to volunteer,” and your teammates from that course in 2005. Your e-mail records are full of people that you simply wouldn’t want on your Facepoke page.

How could I be so stupid?

See paragraph 1: User blames self for poor design.

Facepoke had been interrupting my flow for several days, offering to help me find Friends by examining my Gmail records.

  1. I gave in, chose three Friends, and clicked Invite.

  2. The screen flashed, but the list was still there.

  3. I clicked Invite again.

  4. Then came the moment of horror: I saw that the list had been changed! Switched! It was now a list of every e-mail address in my Gmail account that was not already associated with a Facepoke account.

    With that second click, I had “agreed” to let Facepoke invite everyone with whom I had ever exchanged e-mail. There was no confirmation, no “Invite 300 people? Really?!?”

  5. I sought in vain for a way to Undo.

With each passing minute, I thought of more and more people who would have received this inappropriate invitation to join me on Facepoke.

Why wasn’t there a confirmation?

See paragraph 1: User emotes in frustration.

Note to self: Always do better than this

In my usability- and design work, I will continue to ask: “What’s the worst that can happen?” I will promote designs that prevent the worst that can happen. I will not present two apparently identical choices back to back, one of little consequence, one of great consequence. I will allow users to control their account and to Undo or recover from their unintended actions. I will not make users feel like they’ve been misled.
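
To make that promise concrete, here is a minimal sketch, in TypeScript, of a guard that asks for confirmation whenever an action’s scope balloons beyond what the user explicitly selected. The names inviteContacts, confirmWithUser, and sendInvitations are hypothetical stand-ins, not any real site’s API.

```typescript
// Hypothetical invitation flow, sketched for illustration only.
// The two function parameters stand in for whatever dialog and mail APIs
// the real product would use.

async function inviteContacts(
  selected: string[], // addresses the user explicitly chose
  proposed: string[], // addresses the product proposes to invite
  confirmWithUser: (message: string) => Promise<boolean>,
  sendInvitations: (addresses: string[]) => Promise<void>,
): Promise<void> {
  // If the action's scope grew beyond what the user chose, confirm first.
  if (proposed.length > selected.length) {
    const ok = await confirmWithUser(
      `Invite ${proposed.length} people? You selected only ${selected.length}.`,
    );
    if (!ok) {
      return; // No silent, hard-to-undo bulk action.
    }
  }
  await sendInvitations(proposed);
}
```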

Software UX/GUI design in education

I was wondering whether the “design” of web sites and software is anything more than “intermediation” (inserting a layer between the user and the raw data), whether “intermediation” is just a synonym for “information architecture,” and whether “design” must therefore be something greater—something that includes the emotional impact of the experience. Or is that last phrase merely another way to say “user-experience design”?

Apparently, it was a day for wondering, because, next, I thought about the many excellent software developers I’ve worked with, and wondered how they would respond to my apparently pointless musings. Then I wondered: would the opinions of my software-development colleagues be informed by their formal education or their work experience, attendance at conferences, or professional development reading? [For me, as a usability practitioner and CUA, it’s all of the above.]

After this, I wondered how much software developers are formally taught about user-experience design and user-interface design in school.

A quick online search led me to the course lists, summarised below, for the different program types offered where I live. I’ve highlighted the two courses that specifically mention interface design. There’s no mention of usability, or of the all-encompassing user experience. There is one program at Capilano College that includes user-experience design, and my own course, Fundamentals of user-interface design, is only offered every two years through one of SFU’s continuing studies programs. Also, I’ve noticed an increase in the proportion of software-development students at monthly Vancouver User-Experience events. So change is in the wind.

What’s the situation in your community of practice?

It seems to me there’s a hole in the bucket, but we can mend it. The answer is simple. Go back to your school and ask to sit as an industry representative on the academic-advisory committee. The local chapter of your professional association can help open doors. Once appointed to the committee, participate in a curriculum review. This is a slow, formal, and somewhat political process—but it works. It’s a great way for experienced software developers and interaction designers to improve our communities of practice. And it looks good on a resume.

Bachelor degree, Computer science, Simon Fraser University:
CMPT310 Artificial Intelligence Survey.
CMPT411 Knowledge Representation.
CMPT412 Computational Vision.
CMPT413 Computational Linguistics.
CMPT414 Model-Based Computer Vision.
CMPT417 Intelligent Systems.
CMPT418 Computational Cognitive Architecture.
CMPT419 Special Topics in Artificial Intelligence.
Computer Graphics and Multimedia:
CMPT361 Introduction to Computer Graphics.
CMPT363 User Interface Design.
CMPT365 Multimedia Systems.
CMPT368 Introduction to Computer Music Theory and Sound Synthesis.
CMPT461 Image Synthesis.
CMPT464 Geometric Modeling in Computer Graphics.
CMPT466 Animation.
CMPT467 Visualization.
CMPT469 Special Topics in Computer Graphics.
Computing Systems:
CMPT300 Operating Systems I.
CMPT305 Computer Simulation and Modeling.
CMPT371 Data Communications and Networking.
CMPT379 Principles of Compiler Design.
CMPT401 Operating Systems II.
CMPT431 Distributed Systems.
CMPT432 Real-time Systems.
CMPT433 Embedded Systems.
CMPT471 Networking II.
CMPT479 Special Topics in Computing Systems.
CMPT499 Special Topics in Computer Hardware.
CMPT301 Information Systems Management.
CMPT354 Database Systems I.
CMPT370 Information System Design.
CMPT454 Database Systems II.
CMPT456 Information Retrieval and Web Search.
CMPT459 Special Topics in Database Systems.
CMPT470 Web-based Information Systems.
CMPT474 Web Systems Architecture.
CMPT373 Software Development Methods.
CMPT383 Comparative Programming Languages.
CMPT384 Symbolic Computing.
CMPT473 Software Quality Assurance.
CMPT475 Software Engineering II.
CMPT477 Introduction to Formal Verification.
CMPT480 Foundations of Programming Languages.
CMPT481 Functional Programming.
CMPT489 Special Topics in Programming Languages.
CMPT307 Data Structures and Algorithms.
CMPT308 Computability and Complexity.
CMPT404 Cryptography and Cryptographic Protocols.
CMPT405 Design and Analysis of Computing Algorithms.
CMPT406 Computational Geometry.
CMPT407 Computational Complexity.
CMPT408 Theory of Computer Networks/Communications.
CMPT409 Special Topics in Theoretical Computing Science.
MACM300 Introduction to Formal Languages and Automata with Applications.

Certificate, Software systems development, BC Institute of Technology:
SSDP1501 Systems Foundations 1. Application development, OOP, C#, Java, fundamentals of programming and program design.
SSDP2501 Systems Foundations 2. Web-based applications, architecture, web design, principles, HTML, XHTML, CSS.
SSDP3501 Systems Foundations 3. Medium and large-scale applications, dynamic web technologies, project management, relational databases, security issues of web applications.
SSDP4001 Specialty Topics. Enterprise-scale applications, ASP.net, advanced Java.
SSDP5001 Projects. Practical experience with an internal and external software-development project.

Certificate, Software engineering, University of British Columbia:
IE535 Software Teamwork: Taking Ownership for Success.
IE520 Introduction to Practical Test Automation.
IE523 Agile Development Methodologies.
IE527 Applied Practical Test Automation.
IE507 Object-Oriented Methods: Object-Oriented Modelling and Development with UML.
IE526 Principles and Components of Successful Test Team Management.
IE503 Requirements Analysis and Specification: A Practical Approach.
IE505 Software and System Testing: Real-World Perspective.
IE504 Software Architecture and Iterative Development Process: Managing Risk through Better Architecture.
IE510 Software Configuration Management: Controlling Evolution.
IE502 The Software Engineering Process.
IE506 Software Project Management.
IE509 Software Quality Assurance: More Than Just Testing.
IE511 Software Team Project.
IE525 Strategic Test Analysis and Effective Test Case Design.
IE528 Testing for the Global Market.
IE508 User Interface Design: Designing an Effective Software Interface.

How to test earlier

Involving users throughout the software-development cycle is touted as a way to ensure project success. Does usability testing count as user contact? You bet! But since most companies test their products later in the process, when it’s difficult to react meaningfully to the user feedback, here are two ways to get your testing done sooner.

Prioritise. Help the Development team rank the importance of the individual programming tasks, and then schedule the important tasks to complete early.

  • If a feature must be present in order to have meaningful interaction, then develop it sooner.
  • For example, email software that doesn’t let you compose the message is meaningless. To get meaningful feedback from users, they need to be able to type an e-mail.

    Developers often want to start with the technologically risky tasks. Addressing that risk early is good, but it must be balanced against the risk of a product that’s less usable or unusable.

  • If a feature need not be present or need not be working fully in order to have meaningful interaction, then provide hard-coded actions in the interim, and add those features later.
  • For example, if the email software lets users change the message priority from Standard to Important, hard-code it for the usability test so the priority is always Standard. (A sketch of this appears after the list.)

  • If a less meaningful feature must be tested because of its importance to the business strategy, then develop it sooner.
  • For example, email software that lets users record a video may be strategically important for the company, though users aren’t expected to adopt it widely until most laptops ship with built-in cameras.
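
As a concrete illustration of the hard-coding idea, here is a minimal sketch, in TypeScript, of how a usability-test build might stub out the priority feature while leaving message composition fully working. The names USABILITY_TEST_BUILD, DraftMessage, and setPriority are hypothetical.

```typescript
// Hypothetical e-mail client code, sketched for illustration only.

type Priority = "Standard" | "Important";

// Flag identifying the build handed to usability-test participants.
const USABILITY_TEST_BUILD: boolean = true;

interface DraftMessage {
  to: string[];
  subject: string;
  body: string;
  priority: Priority;
}

function setPriority(draft: DraftMessage, requested: Priority): DraftMessage {
  if (USABILITY_TEST_BUILD) {
    // Interim hard-coded behaviour: the control is visible in the GUI,
    // but the priority always stays Standard during the usability test.
    return { ...draft, priority: "Standard" };
  }
  // Real behaviour, to be completed in a later iteration.
  return { ...draft, priority: requested };
}
```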

Schedule. For each feature to be tested, get the Development team to allocate time to respond to usability recommendations, and then ensure this time is neither reallocated to problem tasks, nor used up during the initial development effort of the to-be-tested features. Engage the developers by:

  • Sharing the scenarios in advance.
  • Updating them on your efforts to recruit usability-study participants.
  • After developers incorporate your recommendations, retesting and then reporting improvements in user performance.

Development planning that prioritises programming tasks based on the need to test, and then allows time in the schedule to respond to recommendations, is more likely to result in usable, successful products.

User mismatch: discard data?

When you’re researching users, every once in a while you come across one who’s an anomaly. You must decide whether to exclude their data points from the set or whether to adjust your model of the users.

Let me tell you about one such user. I’ll call him Bob (not his real name). I met Bob during a day of back-to-back usability tests of a specialised software package. The software has two categories of users:

  • Professionals who interpret data and use it in their design work.
  • Technicians who mainly enter and check data. A senior technician may do some of the work of a professional user.

When Bob walked in, I went through the usual routine: welcome; sign this disclaimer; tell me about the work you do. Bob’s initial answers identified him as a professional user. Once on the computer, though, Bob was unable to complete the first step of the test scenario. Rather than try to solve the problem, he sat back, folded his hands, and said: “I don’t know how to use this.” Since Bob was unwilling to try the software, I instead had a conversation (actually an unstructured interview) with him. Here’s what I learned:

| My observation | For this product, this is… |
| --- | --- |
| Bob received no formal product training. He was taught by his colleagues. | Typical of more than half the professionals. |
| Bob has a university degree that is only indirectly related to his job. | Atypical of professionals. |
| He’s young (graduated 3 years ago). | Atypical of professionals, but desirable because many of his peers are expected to retire in under a decade. |
| Bob moved to his current town because his spouse got a job there. He would be unwilling to move to another town for work. | Atypical. Professionals in this industry often work in remote locations for high pay. |
| Bob is risk averse. | Typical of the professionals. |
| He is easily discouraged, and isn’t inclined to troubleshoot. | Atypical. Professionals take responsibility for driving their troubleshooting needs. |
| Bob completes the same task once or several times a day, with updated data each time. | Atypical of professionals. This is typical of technicians. |

I decided to discard Bob’s data from the set.

The last two observations are characteristic of a rote user. Some professionals are rote users because they don’t know the language of the user interface, but this did not apply to Bob. There was a clear mismatch between the work that Bob said he does and both his lack of curiosity and non-performance in the usability lab. These usability tests took place before the 2008 economic downturn, when professionals in Bob’s industry were hard to find, so I quietly wondered whether hiring Bob had been a desperation move on the part of his employer.

If Bob had been a new/emerging type of user, discarding his data would have been a mistake. Imagine if Bob had been part of a new group of users:

  • What would be the design implications?
  • What would be the business implications?
  • Would we need another user persona to represent users like Bob?

Usability testing distant users

When a product’s users are scarce and widely dispersed, and your travel budget is limited, usability testing can be a challenge.

Remote testing from North America was part of the answer, for me. I’ve never used UserVue because the users I needed to reach were in Africa, Australia, South America, and Asia—continents that UserVue doesn’t reach. Even within North America, UserVue didn’t address the biggest problems I faced:

  • My study participants commonly face restrictive IT policies, so they cannot install our pre-release product and prerequisites.
  • I need to prevent study participants from risking their data by using it with a pre-release product.
  • There’s no way to force an uninstall after the usability test. Who else will see our pre-release?

Instead, I blended a solution of my own from Morae, Skype, Virtual Audio Cable, and GoToMeeting. I used GoToMeeting to share my desktop, which addressed all three of the problems listed above. I used Skype to get video and audio. I used Virtual Audio Cable to redirect the incoming voice from Skype to Morae Recorder’s microphone channel. Morae recorded everything except the PIP video. It worked. However, my studies were sometimes limited by poor Internet bandwidth to the isolated locations of my study participants.

Amateur facilitators. I realise this is controversial among usability practitioners, but beggars in North America can’t be choosers about how they conduct usability tests on other continents. I developed a one-hour training session for the team of travelling product managers. Training included a short PowerPoint presentation about the concepts, followed by use of Morae Recorder with webcam and headset while role-playing in pairs. The main points I had to get across:

  • Between study participants, reset the sample data and program defaults.
  • When you’re ready to start recording, first check that the video and audio are in focus and recording.
  • While you facilitate, do not lead the user. Instead, try paraphrasing and active listening (by which I mean vernacular elicitation). Remember that you’re not training the users, so task failure is acceptable, and useful to us.

I had a fair bit of influence over the quality of the research, since I developed the welcome script and test scenarios, provided the sample data, and analysed the Morae recordings once they arrived in North America. Due to poor Internet bandwidth to the isolated areas of my study participants, the product managers had to ship me the Morae recordings on DVD, by courier.

It worked. I also believe that amateur facilitation gave the product managers an additional opportunity to learn about customers.

Blended usability approach “best”

I received a brochure in the mail, titled Time for a Tune-up: How improving usability can improve the fortunes of your web site. It recommends this blend of usability methods:

  • Expert reviews focus on workflows and give the best results when the scope is clearly defined.
  • Usability studies are more time-consuming than expert reviews, but are the best way to uncover real issues.
  • Competitor benchmarking looks at the wider context.

The brochure was written by online-marketing consultants with Web sites in mind, but its content is also relevant to other development activities.

Effectiveness of usability evaluation

Do you ever wonder how effective expert reviews and usability tests are? Apparently, they can be pretty good.

Rolf Molich and others have conducted a series of comparative usability evaluation (CUE) studies, in which a number of teams evaluated the same version of a web site or application. The teams chose their own preferred methods—such as an expert review or a usability test. Their reports were then evaluated and compared by a small group of experts.

What the first six CUE studies found

About 30% of reported problems were found by multiple teams. The remaining 70% were found by a single team only. Fortunately, of that 70%, only 12% were serious problems. In one CUE study, the average number of reported problems was 9, so a team would be overlooking 1 or 2 serious problems. The process isn’t perfect, but teams found 80% or more of the serious problems.

Teams that used usability testing found more usability problems than expert reviewers. However, expert reviewers are more productive (they found more issues per hour), as this graph from the fourth CUE study illustrates:

CUE study 4 results

Teams that found the most usability problems (over 15 when the average was 9) needed much more time than the other teams, as illustrated in the above graph. Apparently, finding the last few problems takes up the most time.

The CUE studies do not consider the politics of usability and software development. Are your developers sceptical of usability benefits? Usability studies provide credibility-boosting video as visual evidence. Are your developers using an Agile method? Expert reviews provide quick feedback for each iteration.

To learn more about comparative usability evaluation, read about the findings of all six CUE studies.