Amazing Apple Anecdote: When OS Testing Backfires

We’re back with another Amazing Apple Anecdote for you to enjoy. This week we learn how some controlled user testing of an OS backfired due to an amusing typo.

Andy Hertzfeld recalls (via Folklore):

Many of the academic types who were involved in creating the earliest implementations of the graphical user interface at Xerox PARC and various universities sort of sneered at the first generation of personal computers when they appeared in the mid-seventies, since the early personal computers were much less powerful than the machines that they were used to programming. There wasn’t that much you could do with only four kilobytes of memory and no disk drive.

But Larry Tesler, who was a key member of the Smalltalk team in the Learning Research Group at Xerox PARC, felt differently. He was really excited by the potential of personal computers, buying a Commodore PET as soon as one became available in 1977. He was one of the demonstrators at Apple’s famous Xerox PARC visit in December 1979, and he was so impressed by the Apple visitors that he quit PARC and started working at Apple on July 17, 1980, as the manager of the Lisa Applications team.

Larry championed consistency between applications, and made many contributions to what eventually became the Macintosh User Interface. He was also the leading advocate and implementor at Apple of user testing: actually trying out our software on real users and seeing what happened. Starting in the summer of 1981, Larry organized a series of user tests of the nascent Lisa software, recruiting friends and family to try out the software for the first time, while being observed by the Apple designers who recorded their reactions.

The user tests were conducted in a specially constructed room featuring a one-way mirror, so observers could watch the tests without being intrusive. The tests were conducted by a moderator who made sure the user felt comfortable and showed her the basics of using a mouse. Then, with no further instruction, users were asked to perform specific tasks, without help from the moderator, like editing some text and saving it. The moderator encouraged each user to mumble under her breath while doing the tasks, revealing her current thinking as much as possible. Each session was audio or videotaped for later analysis.

When the software required confirmation from the user, it displayed a small window called a “dialog box”, that contained a question, and presented two buttons, for positive or negative confirmation. The buttons were labeled “Do It” and “Cancel”. The designers observed that a few users seemed to stumble at the point that the dialog was displayed, clicking “Cancel” when they should have clicked “Do It”, but it wasn’t clear what they were having trouble with.

Finally, the team noticed one user that was particularly flummoxed by the dialog box, who even seemed to be getting a bit angry. The moderator interrupted the test and asked him what the problem was. He replied, “I’m not a dolt, why is the software calling me a dolt?”

It turns out he wasn’t noticing the space between the ‘o’ and the ‘I’ in ‘Do It’; in the sans-serif system font we were using, a capital ‘I’ looked very much like a lower case ‘l’, so he was reading ‘Do It’ as ‘Dolt’ and was therefore kind of offended.

After a bit of consideration, we switched the positive confirmation button label to ‘OK’ (which was initially avoided, because we thought it was too colloquial), and from that point on people seemed to have fewer problems.

So it’s clear that when it came to proofreading, Apple didn’t know how to Dolt. That’s it for this week, but we’ll be back soon with another amazing anecdote. For now, though, enjoy!