Making Computers Accessible to Everyone
Lessons learned from interface design for disabled people reflect some of the most advanced research in IT -- and mean better tools for all of us.
Neil Scott is Leader and Chief Engineer for the Archimedes Project at Stanford University's Center for the Study of Language and Information. For his invention of the Total Access System, described in this story, Scott was nominated in the 1997 Discover Magazine awards as one of the five top innovators in the United States in the field of computer hardware and electronics. The lead article in the January 2000 edition of San Francisco magazine featured him as one of fifteen Bay Area futurists who will shape the way people live, think, work and play in the new millennium.
On the surface, it appears that software developers pay little regard to how much time and effort users invest in mastering the user interface to each computer program. All around the world, users face a hiatus whenever they are forced to relearn the user interface for an updated version of a favorite word processor or e-mail program. For instance, each time Microsoft delivers a new version of Word, it disrupts the working patterns of millions of users, who spend time searching for tools that have mysteriously disappeared from their familiar menu locations. Sometimes the tools have been regrouped with other tools for reasons known only to the person who "improved" that particular section of the program, or they might have been given a new name, or they may have been replaced by a new and more sophisticated tool. I am sure I am not the only person whose hard drive is littered with upgrades I discarded because learning to use them required too much effort.
Version obsolescence is firmly entrenched in the computer industry as a way of forcing organizations to upgrade their hardware and software more frequently than necessary. Many different ploys are used to force version obsolescence on users. Old software bugs are fixed so you will upgrade from the previous version, and new bugs are introduced to ensure you will have to move up to a future version. Files from an earlier version are not recognized by the newer version thereby forcing you to upgrade any older systems you may have, even though they still perform their intended function perfectly. New computers are shipped with the latest version of the operating system that is incompatible with older systems on your network. You find that the new version is locked-in and can't be removed and, once again, you have no choice but to upgrade everyone to the newer systems. Yet another variation is to stop providing software drivers for certain types of peripherals so that you will need to upgrade them as well.
Every computer user is painfully aware of the effort required to learn a new or revised interface. Software developers deliberately make the basic interface appear similar to, and behave similarly to, everyone else's so novices will be able to begin using it quickly without assistance. They use a totally different strategy for advanced users, however. Here, a unique interface provides an advantage for the suppliers, since a user who has invested a great deal of time and effort in learning to use a particular product is unlikely to move to another product. The software industry has adopted the term "stickiness" to describe how well a product retains users. Each upgrade provides an opportunity to enshroud committed users ever more tightly in the product's interface. In a rational society, stickiness would result from user satisfaction rather than a software variation of flypaper.
There is a way, however, of avoiding this software flytrap. The developers of access tools for disabled people are extremely conscious of version obsolescence. Wherever possible, access tools are designed to work across multiple versions of applications and operating systems because disabled users don't have the resources to upgrade on a regular basis and the vendors of access tools are too small to support multiple versions of their products. Lessons learned from designing accessibility tools provide a way for organizations to take control of version obsolescence.
Disabled computer users are particularly vulnerable to version obsolescence because they depend on their computers for doing many of the things the rest of us take for granted. For many of them, a computer provides the only means for communicating with others, controlling the environment, and earning a living. The hiatus that occurs when they must learn a new interface or come to grips with illogical or unreliable software is much more than an inconvenience; it is a threat to their independence. Many blind people, for instance, lost their jobs or had them severely downgraded when the Windows operating system replaced DOS. They were no longer able to compete in the workplace when their DOS-based accessibility tools no longer worked with the Graphical User Interface (GUI). It took almost three years for GUI access software to achieve a reasonable level of performance. Today, access to the GUI is still an ongoing challenge for blind people.
There are basically two strategies for making computers accessible to people with disabilities. The first is to provide alternative ways for performing critical operations by modifying the computer hardware and/or software. Until recently, this was the only practical option, and it generally limited the usefulness of a particular system for everyone except the disabled person for whom it was configured. This was not an issue when most people interacted with only one computer, but it has become a significant problem now that individuals must interact directly, or indirectly, with multiple computers.
The Archimedes Project at Stanford University has developed a second access strategy that focuses on the access needs of the individual user rather than modifying the operation of the device that is being accessed. The Total Access System, shown in Figure 1, cleanly separates the user interface from the applications performed on the targeted computer by splitting the human/computer interface into three distinct parts.
- The Total Access Port (TAP) is a small microprocessor-based device connected between the physical screen, keyboard and mouse and the matching ports on the target computer. The screen, keyboard and mouse continue to perform their normal functions on the target system but, additionally, the TAP is able to emulate all aspects of their operation in software. A different TAP is required for each type of target device that is to be made accessible. Currently, TAPs have been developed for all of the common personal computer workstations (IBM, Sun, SGI, HP, Macintosh), industrial control networks, and household appliances. Additional TAPs will be designed as needs are identified.
- The accessor is a personal computing device that implements an interface closely matched to the individual needs, abilities and preferences of the particular user. Accessors are currently constructed by adding appropriate hardware and software to notebook or handheld computers. Inputs for accessors include special keyboards, ultra-sensitive switches, handwriting recognition, speech recognition, head tracking, and eye tracking. Outputs include special display screens, speech synthesis, Braille, animated graphics, multidimensional sound, haptic (force feedback) devices, robotic devices and appliance controllers. Future accessors may be purpose-designed to provide highly specialized capabilities and performance.
- The TAS Protocol is a standardized data communications link that connects any accessor to any TAP. The TAS Protocol allows all user accessible operations on a target system to be controlled by simple, consistent accessor instructions. A scripting language called TAPScript allows concise accessor commands to initiate and control complex tasks on the target system.
The most important property of the TAS is that an accessor requires no prior knowledge of a target system. A basic level of communication allows any accessor to immediately interact with any TAP. This is augmented through a dialog that automatically occurs when an accessor is first connected to a TAP. After each component identifies itself and declares its capabilities and needs, additional information that may be required to support future interactions is passed back and forth. The TAS Protocol ensures that a person with an accessor always has at least the same level of capability for interacting with a target as someone walking up and using the standard input and output devices. While not essential, an accessor may remember details of target systems to which it is frequently connected.
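The connection dialog described above can be pictured with a small sketch. Everything in it is hypothetical: the class names, the capability strings and the `connect` function are illustrative assumptions, since the article does not specify the actual TAS Protocol message format.

```python
from dataclasses import dataclass, field


@dataclass
class Tap:
    """Hypothetical stand-in for a Total Access Port on one target system."""
    target_name: str
    accepts: set  # kinds of input the TAP can emulate (assumed labels)


@dataclass
class Accessor:
    """Hypothetical stand-in for a user's personal accessor."""
    user_name: str
    provides: set  # kinds of input this accessor can generate (assumed labels)
    known_targets: dict = field(default_factory=dict)  # optional memory of targets


def connect(accessor: Accessor, tap: Tap) -> set:
    """Each side declares its capabilities; return the kinds both support."""
    shared = accessor.provides & tap.accepts
    # Remembering a frequently used target is optional, as the text notes.
    accessor.known_targets[tap.target_name] = shared
    return shared


speech = Accessor("example user", {"keyboard", "mouse"})
sun = Tap("Sun workstation", {"keyboard", "mouse", "screen"})
agreed = connect(speech, sun)
```

In this toy version the accessor needs no prior knowledge of the target: whatever it learns comes from the exchange itself.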
TAS isolates the user from the hardware platform, the operating system and the type of application that is being used. In other words, a single accessor allows a person to work equally well on a PC or a Sun, to control an ATM or to turn lamps on or off. Furthermore, one person might use a speech recognition accessor to perform all of these tasks while another might use an eye tracking accessor; in either case, the target devices would be completely unaware that anything other than the standard user interface was being used.
You can think of an accessor as being rather like a pair of spectacles in which lenses, uniquely matched to the characteristics of a person's eyes, provide undistorted access to the surrounding visual environment. An accessor, matched to the physical, visual, hearing, or cognitive characteristics of a user, provides undistorted access to the surrounding information infrastructure.
Many of the tasks we perform on a computer require different modes of interaction. The simplest example of this is entering text with a keyboard and pointing to an object on the screen with the mouse. People who can't use their hands -- or prefer not to -- often use speech recognition as an alternative way to perform these functions. While speech recognition does a fair job of entering text, it is a very clumsy way to control a mouse. Head tracking, on the other hand, performs mouse functions extremely well but is a clumsy alternative to the keyboard for entering text. Combining these strategies, however, produces a solution that is superior to using either alone. Performance can be improved even further by adding a foot switch for clicking the mouse buttons.
Adding speech recognition, head tracking and foot switches to a conventional computer without disturbing its normal operation is far from trivial: it may involve wrestling with hardware and software compatibility issues, searching for suitable software drivers, and compensating for degraded system performance. In contrast, TAS is designed to support multimodal operation directly, so adding alternative interfaces is quick and easy, and the performance of the target system is not degraded in any way.
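As a rough illustration of multimodal operation, the sketch below merges events from three accessors into a single stream of keyboard and mouse actions for the target. The event kinds, the source names and the `merge` function are assumptions made for this example, not part of the TAS design.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str      # which accessor produced the event (hypothetical names)
    kind: str        # "text", "move", or "click"
    payload: object  # dictated text, pointer coordinates, or button name


def merge(events):
    """Translate a mixed stream of accessor events into target-side actions."""
    actions = []
    for ev in events:
        if ev.kind == "text":
            actions.append(("keyboard", ev.payload))
        elif ev.kind == "move":
            actions.append(("mouse_move", ev.payload))
        elif ev.kind == "click":
            actions.append(("mouse_click", ev.payload))
        # Unknown kinds are ignored rather than guessed at.
    return actions


stream = [
    Event("speech", "text", "Dear Neil"),
    Event("head_tracker", "move", (120, 45)),
    Event("foot_switch", "click", "left"),
]
actions = merge(stream)
```

Each accessor does only what it is good at, and the target sees nothing but ordinary keyboard and mouse activity.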
One of our ongoing research efforts is focused on developing a collaborative infrastructure for TAS that will enable accessors to collaborate with each other to perform complex tasks. An example of this could be a speech recognition accessor that informs other accessors (such as a head tracker or foot switch) which target machine should receive any mouse commands that may be generated. This ability to use multiple accessors to perform complex tasks dramatically increases the effectiveness and potential productivity of an interface. There is a potential downside, however. Unintentional or misrecognized commands may lead to results that are very difficult to undo. One of the collaborative functions we are exploring, therefore, is a universal undo switch that can be used at any time to recover from the results of unintended accessor commands. Each accessor tells the undo switch how to undo each command before sending it to the target machine.
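One way to picture the undo switch is as a stack of inverse operations that is filled before each command is sent. The sketch below is a minimal illustration under that assumption; the `UndoSwitch`, `Target` and `send` names are hypothetical, and the toy inverse assumes the lamp was off before the command was issued.

```python
class UndoSwitch:
    """Hypothetical universal undo: stores an inverse for every command sent."""

    def __init__(self):
        self._inverses = []

    def register(self, inverse):
        self._inverses.append(inverse)

    def press(self):
        """Run the most recent inverse; return True if something was undone."""
        if not self._inverses:
            return False
        self._inverses.pop()()
        return True


class Target:
    """Toy target machine: a lamp that can be on or off."""

    def __init__(self):
        self.lamp_on = False


def send(target, undo, command, inverse):
    # The inverse is registered before the command reaches the target.
    undo.register(lambda: inverse(target))
    command(target)


def lamp_on(t):
    t.lamp_on = True


def lamp_off(t):
    t.lamp_on = False


target = Target()
undo = UndoSwitch()
send(target, undo, lamp_on, lamp_off)
state_after_command = target.lamp_on
undo.press()
state_after_undo = target.lamp_on
```

Because the inverse travels with the command, the switch itself needs no knowledge of what any particular accessor or target does.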
Accessibility-tool developers strive to create systems that enable disabled users to be competitive in the workplace, but there are many barriers that must be overcome in achieving this. Some are physical, some psychological, and some are due to stigmas and the perceptions and assumptions of employers and fellow employees. While there are numerous ways to get past each of these barriers, finding the optimum path for each disabled individual is extremely challenging. TAS simplifies the process by making it quick and easy to evaluate different approaches. Studying the TAS in a variety of working situations has shown that it is possible for disabled workers to achieve productivity levels that are equal to or better than their non-disabled peers.
One of the core strategies for removing barriers is to bury the complexities inside the technology. In other words, make things appear simple and logical to the user even when the underlying functions are complex. Mainstream designers have lost sight of this concept (if they ever had it). Software marketing is driven by how many different capabilities can be built into a product. Users of even the most basic software tools are often overwhelmed by the number of options they must navigate to perform each task. This problem is escalating as each upgrade of a program crams in more features, each of which requires another icon and/or menu entry. Making room for the new features within the existing menus often necessitates removing, renaming or relocating existing options. The need to search for familiar functions in an ever changing user interface makes it difficult for users (and accessibility programs) to develop the reflexive working habits required for working quickly.
So, how do we hide the complexity and restore simplicity and stability?
Since the birth of the computer, Artificial Intelligence (AI) researchers have been attempting to make computers behave like people, but they have been foiled by the magnitude of the task. In hindsight, it seems that many of them were trying to boil the ocean because they did not place sufficient constraints on the scope of what they were attempting to do. Traditional AI solutions have pretty much fallen by the wayside in the development of user interfaces.
There has been a recent resurgence of interest, however, in using some of the AI techniques to create intelligent agents that focus very tightly on selected facets of human/computer interaction. Rather than attempting to turn the computer into an artificial person, intelligent agents perform a limited range of interactions in a human-like manner. You can think of intelligent agents as small bundles of software that know how to perform a small, carefully defined task very well and, equally importantly, they also know when not to do anything.
A strategy called "task delegation" is the key to using this type of artificial agent. When an agent receives a request to perform a task, it breaks it into subtasks and broadcasts a request for assistance to a myriad of subordinate agents. Only those agents that recognize they know something about a particular subtask respond by repeating the process with their own subordinate agents. This process repeats until all of the subtasks are completed whereupon the results are passed back up to the agent that originated the request. If the agents are unable to perform a given task, there is a learning process that allows them to quiz the user to find out how to handle a similar task in the future.
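The delegation loop can be sketched in a few lines. This is a simplified illustration, not the Archimedes implementation: the `Agent` class, its `skills` and `splits` tables and the task names are all assumptions, only the first responding subordinate is used, and the learning step is reduced to returning `None`.

```python
class Agent:
    """Hypothetical agent that handles some tasks itself and delegates the rest."""

    def __init__(self, name, skills=None, splits=None, subordinates=None):
        self.name = name
        self.skills = skills or {}   # task -> function that returns a result
        self.splits = splits or {}   # task -> list of subtasks
        self.subordinates = subordinates or []

    def recognizes(self, task):
        return task in self.skills or task in self.splits

    def perform(self, task):
        """Return a list of results, or None if the task could not be handled."""
        if task in self.skills:
            return [self.skills[task]()]
        results = []
        for subtask in self.splits.get(task, []):
            # Broadcast: only subordinates that recognize the subtask respond.
            responders = [a for a in self.subordinates if a.recognizes(subtask)]
            if not responders:
                return None  # a fuller system might quiz the user here
            outcome = responders[0].perform(subtask)
            if outcome is None:
                return None
            results.extend(outcome)
        return results or None


namer = Agent("namer", skills={"name file": lambda: "notes for Neil"})
saver = Agent("saver", skills={"write file": lambda: "saved"})
root = Agent(
    "root",
    splits={"save as": ["name file", "write file"]},
    subordinates=[namer, saver],
)
result = root.perform("save as")
unknown = root.perform("print")
```

Agents that do not recognize a subtask simply stay silent, which is the "know when not to do anything" behavior described earlier.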
Archimedes researchers are adding intelligent agent software to accessors to disambiguate user commands -- that is, to make human language commands intelligible to the system. While all accessors have a need for disambiguation, its advantages can be clearly seen by looking at its potential impact on speech recognition. The promise of speech recognition has been that users can do everything they need to do with their computer merely by talking to it. The reality is that most people don't have a clue about what to say.
Dictating text into an e-mail message or memo is trivial with the latest speech recognizers. Almost anything you say is recognized and entered into the active text window. It is not trivial, however, to tell the program what to do with the text. Users must learn how to say command scripts such as "click file save as click tab notes for Neil click enter." An equivalent colloquial English language command might be, "Save this file as notes for Neil." It could also be "Call this file notes for Neil and save it." There is nothing that makes a particular instruction right or wrong, and it is natural for different people to express their intentions in different ways.
It is not natural, however, for computers to force users into using arcane scripts that don't even follow natural language -- the language we use in everyday communication. This is where intelligent agents enter the picture and offer real benefits. Within a constrained context, intelligent agents can now take messages such as the last two examples and reduce them to the cryptic command scripts expected by the computer. Other strategies such as Natural Language Processing (NLP) can be used to achieve similar results, but they require more design effort and greater computing resources than are practical in most accessors.
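To make the idea of a constrained context concrete, the sketch below reduces the two colloquial requests quoted earlier to the same command script. It uses plain pattern matching as a stand-in; the article does not describe how the Archimedes agents work internally, and the `PATTERNS` list and `to_script` function are assumptions for illustration only.

```python
import re

# Hypothetical phrasings, each capturing the file name the user wants.
PATTERNS = [
    re.compile(r"^save this file as (?P<name>.+)$", re.IGNORECASE),
    re.compile(r"^call this file (?P<name>.+) and save it$", re.IGNORECASE),
]


def to_script(utterance):
    """Return a command script for a recognized request, else None."""
    text = utterance.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            name = match.group("name")
            return f"click file save as click tab {name} click enter"
    return None  # outside the constrained context: do nothing


script_a = to_script("Save this file as notes for Neil.")
script_b = to_script("Call this file notes for Neil and save it.")
script_c = to_script("What time is it?")
```

Both phrasings yield the same script, while an unrelated request falls through untouched.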
Archimedes researchers are also exploring productivity-based applications that have no direct links to disability. In particular, we foresee substantial productivity gains in graphics-intensive applications such as computer-aided design and drafting (CADD) and photogrammetric analysis of satellite images. In addition to the potential advantages already discussed, we are also investigating another interesting result of using multimodal interfaces with these applications. Working with complex graphical images is often slowed down by the need for precise manipulation of the mouse or other pointing device. Combining spoken commands with pointing, for instance, can reduce the need for precision movements by allowing the user to say what is to happen to an object that is currently being pointed to by the mouse. For example, placing the cursor anywhere on a circle and saying "snap to center" is much quicker than clicking on a menu item and then moving the mouse to within the necessary snapping distance, particularly if there are other potential targets in close proximity. Mouse operations can also be sped up by using haptic or force feedback techniques originally developed to enable blind people to feel screen objects.
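The "snap to center" example can be sketched as a spoken command that is resolved against whatever the pointer is currently touching. The `Circle` type, the `tolerance` value and the function names below are hypothetical, and a real CADD package would expose its own geometry and picking routines.

```python
import math
from dataclasses import dataclass


@dataclass
class Circle:
    cx: float
    cy: float
    r: float


def circle_under_cursor(circles, x, y, tolerance=3.0):
    """Return the circle whose outline is nearest the cursor, within tolerance."""
    best, best_gap = None, tolerance
    for c in circles:
        # Distance from the cursor to the circle's outline.
        gap = abs(math.hypot(x - c.cx, y - c.cy) - c.r)
        if gap <= best_gap:
            best, best_gap = c, gap
    return best


def handle(spoken, circles, x, y):
    """Combine a spoken command with the current pointer position."""
    if spoken == "snap to center":
        c = circle_under_cursor(circles, x, y)
        if c is not None:
            return (c.cx, c.cy)
    return (x, y)  # no matching command or object: leave the cursor alone


circles = [Circle(0, 0, 10), Circle(100, 0, 5)]
snapped = handle("snap to center", circles, 10.5, 0)
unchanged = handle("hello", circles, 10.5, 0)
```

The user only has to touch the circle roughly; the spoken command supplies the precision that would otherwise require careful mouse movement.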
We believe that combining the hardware and software independence of the TAS with the disambiguation capabilities of intelligent agents will revolutionize the human/computer interface. TAS will allow different interaction modes to be combined in a mix-and-match manner as and when they are needed, and the intelligent agents will enable the user to express his or her intentions in a natural manner. While individuals with disabilities will be among the first to benefit, all computer users will ultimately find that this approach makes computers easier, faster, safer and more productive.
Released: June 22, 2000
iMP Magazine: http://www.cisp.org/imp/june_2000/scott/06_00scott.htm
© Copyright 2000. Neil G. Scott. All rights reserved.