Computers play an integral role in the social and economic lives of most people, but some people are unable to use a traditional keyboard or touchscreen. For example, some spinal or brain injuries lead to locked-in syndrome where patients are almost fully paralyzed and cannot speak, even though their cognitive abilities are intact (Laureys et al., 2005). Such patients have very limited interaction with their environment, and computer access can greatly enhance their quality of life. To provide computer access, human–computer interactions leverage the limited muscle control (e.g., an eye blink or a small muscle contraction) of the user to trigger binary signals in switch devices. As described in detail below, such devices work with a virtual keyboard to guide a cursor on the keyboard to select a desired character or function. This type of switch keyboard thereby allows patients to engage in local and on-line discussions with caregivers, friends, and employers (Bauby, 1997; Tavalaro & Tayson, 1997).
Several virtual keyboard designs have been proposed by commercial companies. For example, the SwitchXS keyboard provides various modes that implement mouse input and keyboard characters (AssistiveWare, 2013). In a common design, an initial switch activation initiates a cursor that follows a path across the keyboard and stops to select an item when the user triggers the switch device again. A default approach is for the user first to select a row containing a desired item and thereby direct the cursor to move across that row’s items for an additional user selection. Other keyboard designs use a similar strategy but define the selectable groups of items differently. For example, in the Logo keyboard (Norte & Lobo, 2007), two major scanning groups (numeric and alphabetic) are displayed; the user first triggers the switch to select one of these two groups, and then selects an item within that group as the cursor moves across it.
Because of the very limited type of user input (only a binary switch), the cursor must scan (or be guided by switch actions) across different keyboard items. Indeed, these types of systems are sometimes referred to as scanning keyboards (MacKenzie & Felzer, 2010). Such an approach means that some items are reached relatively quickly (those near the beginning of the cursor path) while other items take longer to select (those near the end of the cursor path). Thus, the arrangement of characters along the path significantly affects input efficiency, and many studies have explored ways to improve the design of virtual keyboards (not only for patients, but also for other specialized keyboards). A simple and effective way to improve keyboard efficiency is to place commonly used keyboard items early in the cursor path (Hughes, Warren, & Buyukkokten, 2002; Mayzner & Tresselt, 1965; Zhai, Hunter, & Smith, 2002).
Other factors also influence the usability of a virtual keyboard. For example, the scanning speed of the cursor cannot be too fast (users would not have enough time to respond correctly) nor too slow (users would have to wait unnecessarily for the cursor to reach a desired item). Francis and Johnson (2011) showed that character placement on a switch virtual keyboard can be treated as a mathematical optimization problem with a trade-off between speed and accuracy. They also proposed an algorithm to design a keyboard that optimized the cursor speed for a given desired accuracy.
An important component of keyboard optimization is the path the cursor follows across the keyboard. There are many different possible cursor scan paths (MacKenzie & Felzer, 2010; MacKenzie, 2012). Figure 1 shows a keyboard where the cursor follows a linear cursor path (see Additional file 1: Movie 1 for an animated version). Here, the cursor starts at the first key (top left) and then proceeds to each key one by one by wrapping from the end (right side) of one row to the beginning (left side) of the next row. When the cursor covers the key of interest, the user triggers the switch and thereby selects that item. With this kind of cursor path, selecting any virtual key requires only one action by the user and 1 to 64 cursor steps.
Figure 2 (Additional file 2: Movie 2) demonstrates a row–column cursor path, which is commonly used in switch keyboard designs. Here, the cursor starts at the top row and first scans across the rows to allow the user to select the row containing the target character. When a row is selected, the cursor then scans the columns in the selected row so that the user is able to select the target key by triggering the switch device again. With this kind of cursor path, selecting any virtual key requires two actions by the user and two to 16 cursor steps.
Figure 3 (Additional file 3: Movie 3) shows the same kind of keyboard where the cursor follows a quadrant cursor path. Here the cursor first moves across the four quadrants of the keyboard. When the user triggers the switch, the cursor follows a row–column path within the selected quadrant. With this kind of cursor path, selecting any virtual key requires three selections by the user (one for the quadrant, one for the row, and one for the column), and three to 12 cursor steps.
Figure 4 (Additional file 4: Movie 4) shows the same kind of keyboard where the cursor follows a binary cursor path. Here, the cursor first moves across the left and right halves of the keyboard. A selection by the user focuses the cursor path to the selected side, which is then divided into a top and bottom. Further selections keep dividing the number of remaining keys in half and guide the cursor toward the target key. With this kind of cursor path, selecting any virtual key requires six selections by the user and six to 12 cursor steps.
As Figs. 1, 2, 3, and 4 (and the movies in Additional files 1, 2, 3, and 4) demonstrate, different cursor paths can be assigned to the very same physical layout of the keyboard. In terms of guiding the cursor to a target key, it is the cursor path, rather than the physical layout of keys, that determines the efficiency of the keyboard. To enable users to learn and remember the cursor path, the physical layout of the keyboard may reflect the groupings of virtual keys along the path, but our analysis supposes that users know the cursor path regardless of its complexity.
The choice of a cursor path is important because it imposes requirements on the user and determines the time needed to select an item. For example, accessing a key on the keyboard for the linear cursor path requires only one selection from the user, while the binary cursor path requires six selections. If triggering the switch device is difficult for the user, it might be best to use a cursor path that needs few selections. On the other hand, the linear cursor path will take a long time to reach items at the end of the path (64 cursor steps for the final item), while the binary cursor path takes no more than 12 cursor steps. Several studies have explored the costs and benefits of different types of cursor paths (e.g., Koester & Simpson, 2014).
The need for severely disabled patients to communicate is so striking that researchers are continually exploring methods to improve their performance (e.g., Broderick & MacKay, 2009; Koester & Simpson, 2014; Lin, Wu, Chen, Yeh, & Wang, 2008; MacKenzie & Felzer, 2010; Miró-Borrás & Bernabeau-Solar, 2009; Morland, 1983; Sears & Zha, 2003). A challenge for these investigations is that different approaches require development of new tools for applying design constraints related to a particular instance. Here, we provide a general approach that can be modified to consider other constraints and aspects of switch keyboard designs.
In this paper we focus on the optimization of a switch virtual keyboard design, which formed the motivation for the basic optimization approach, by considering a variety of factors that influence the speed and accuracy of a keyboard. The rest of the paper is organized as follows. The next section quantitatively describes the keyboard design problem and develops a mixed integer programming (MIP) algorithm that solves a key part of the design problem. This algorithm is orders of magnitude faster than the one described in Francis and Johnson (2011), and many of the subsequent results are tractable only because the MIP algorithm provides a quick solution to a critical part of keyboard design optimization. The following section describes a behavioral experiment that measures performance for a switch keyboard. We then explain how to develop a performance model from the behavioral data. With the resulting model, the final section describes optimized keyboards for several different situations and compares and contrasts different keyboard designs. The experimental data files, analysis scripts, and optimization programs are available at the Open Science Framework (https://osf.io/vuaxj/?view_only=489c5b2f9b8b45cbb866416ef1e50ef4).
An algorithm to solve the speed/accuracy trade-off
We describe the design problem in a general way because it demonstrates the common aspects of keyboard design for many different situations. An instance of the virtual keyboard design problem consists of an integer set \(\mathcal {I}=\{1,2,\dots,N\}\) that refers to characters and an integer set \(\mathcal {K}=\{1,2,\dots,N\}\) that refers to keyboard locations. The design task is to assign the characters to keyboard locations. Let \(F_{i}\) be the frequency of the i-th character in a given text corpus. These character frequencies might be estimated from general databases, or might be based on the specific type of text entered by a specific user.
First consider the time needed to reach a location on the keyboard (speed of entry). Let \(S_{k}\) be the number of cursor steps required to reach location k from the start of the cursor path. For example, for the row–column cursor path shown in Fig. 2, to reach position 12 (supposing the keys are indexed in sequence from left-to-right and top-to-bottom), it will first take two steps to reach the second row, and then four steps to reach the fourth column in row 2. Therefore, in this cursor path, the total number of steps to reach position 12 is \(S_{12}=2+4=6\). Note that the value of \(S_{k}\) depends on the cursor path; different cursor paths may lead to quite different values of \(S_{k}\) even for the same key position. In addition, we let D be the duration that a cursor will stay on each step of the path. Therefore, the time to reach position k is \(D\times S_{k}\).
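As an illustration of how \(S_{k}\) depends on the cursor path, the following minimal sketch computes step counts for the four paths described earlier. It assumes an 8 × 8 layout with keys indexed 1 to 64 from left-to-right and top-to-bottom, that quadrants are scanned in the order top-left, top-right, bottom-left, bottom-right, and that in the binary path the first half of each split costs one cursor step and the second half costs two. The function names are hypothetical and are not taken from the authors' programs.

```python
SIDE = 8  # assumed 8 x 8 keyboard (64 keys)

def _row_col(k):
    """Zero-based (row, column) of key k, with keys numbered 1..64 row by row."""
    return divmod(k - 1, SIDE)

def linear_steps(k):
    """Linear path: the cursor visits keys one by one in index order."""
    return k

def row_column_steps(k):
    """Row-column path: scan down to the row, then across to the column."""
    row, col = _row_col(k)
    return (row + 1) + (col + 1)

def quadrant_steps(k):
    """Quadrant path: pick a quadrant, then row-column scan inside it."""
    row, col = _row_col(k)
    half = SIDE // 2
    quadrant = 2 * (row // half) + (col // half)  # 0..3, assumed scan order
    return (quadrant + 1) + (row % half + 1) + (col % half + 1)

def binary_steps(k):
    """Binary path: six halvings; a second-half choice costs one extra step."""
    row, col = _row_col(k)
    # Each set bit in the 3-bit row/column index marks a second-half choice.
    return 6 + bin(row).count("1") + bin(col).count("1")

if __name__ == "__main__":
    for name, fn in [("linear", linear_steps), ("row-column", row_column_steps),
                     ("quadrant", quadrant_steps), ("binary", binary_steps)]:
        counts = [fn(k) for k in range(1, SIDE * SIDE + 1)]
        print(name, "S_12 =", fn(12), "range:", min(counts), "to", max(counts))
```

Under these assumptions the sketch reproduces the step ranges quoted for each path and the worked value \(S_{12}=6\) for the row–column path.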
Now consider the accuracy of reaching a location on a keyboard. Let \(P_{k}\) be the probability of the user making an error while trying to guide the cursor to position k. It is reasonable to imagine that \(P_{k}\) increases as D decreases (with a slower cursor the user is less likely to make an error), but the exact nature of the relationship needs to be measured or modeled. We will measure and model the relationship in later sections.
To simplify subsequent calculations, we introduce an indicator variable \(X_{ik}\) that is equal to 1 if the i-th character is assigned to position k and 0 otherwise. Then we define the average entry time across all entries as
$$ C_{t}= \frac{\sum\limits_{i\in \mathcal{I}}\sum\limits_{k\in \mathcal{K}}F_{i}DS_{k} X_{ik}}{\sum\limits_{i\in \mathcal{I}}F_{i}} $$
(1)
where most of the terms in the numerator summation will be zero, except for where item i is at position k. At those positions, the numerator sums the entry time (\(DS_{k}\)) multiplied by the frequency, \(F_{i}\), of character i. We also define the average error rate as
$$ C_{e}=\frac{\sum\limits_{i\in \mathcal{I}}\sum\limits_{k \in \mathcal{K}}F_{i}P_{k}X_{ik}}{\sum\limits_{i\in \mathcal{I}}F_{i}}, $$
(2)
which gives the average error probability across all character entries.
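Equations 1 and 2 can be evaluated directly once an assignment is fixed. The sketch below is illustrative only: it stores the assignment as a list mapping each character to its position (equivalent to the indicator \(X_{ik}\)), and the three-character frequencies, step counts, and error probabilities are made-up toy values, not data from this study.

```python
def average_costs(freq, steps, err, assignment, duration):
    """Return (C_t, C_e) for one layout.

    freq[i]       -- frequency F_i of character i
    steps[k]      -- cursor steps S_k to reach position k
    err[k]        -- error probability P_k at position k
    assignment[i] -- position k holding character i (so X_ik = 1)
    duration      -- cursor duration D per step
    """
    total = sum(freq)
    c_t = sum(f * duration * steps[assignment[i]] for i, f in enumerate(freq)) / total
    c_e = sum(f * err[assignment[i]] for i, f in enumerate(freq)) / total
    return c_t, c_e

if __name__ == "__main__":
    # Toy values (assumptions for illustration): early positions are fast but error prone.
    freq = [5, 3, 2]
    steps = [1, 2, 3]
    err = [0.10, 0.05, 0.02]
    print(average_costs(freq, steps, err, [0, 1, 2], duration=0.5))  # frequent characters first
    print(average_costs(freq, steps, err, [2, 1, 0], duration=0.5))  # frequent characters last
```

With these toy numbers, placing frequent characters early lowers the average entry time but raises the average error rate, which is the tension the optimization has to resolve.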
In general, \(C_{t}\) and \(C_{e}\) trade off against each other. For example, placing the most commonly used characters near the beginning of the cursor path will decrease \(C_{t}\), but if those locations are error prone, then \(C_{e}\) will increase. We suggest that a practical way to trade off speed and accuracy is to identify a user-defined acceptable error rate, ε. We then describe the problem as assigning characters to the keyboard so that \(C_{e}\le \varepsilon\) and \(C_{t}\) is minimized.
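To make the problem statement concrete, the following sketch solves a toy instance by exhaustive enumeration of all assignments. This is not the method used in this paper; brute force is only feasible for a handful of keys and is shown purely to illustrate the role of ε. The function name and the toy values are assumptions for illustration.

```python
from itertools import permutations

def best_layout(freq, steps, err, duration, epsilon):
    """Return (C_t, C_e, assignment) minimizing C_t subject to C_e <= epsilon.

    assignment[i] is the position given to character i.
    Returns None when no assignment meets the error bound.
    """
    total = sum(freq)
    best = None
    for perm in permutations(range(len(freq))):
        c_e = sum(f * err[k] for f, k in zip(freq, perm)) / total
        if c_e > epsilon:
            continue  # violates the acceptable error rate
        c_t = sum(f * duration * steps[k] for f, k in zip(freq, perm)) / total
        if best is None or c_t < best[0]:
            best = (c_t, c_e, perm)
    return best

if __name__ == "__main__":
    # Toy values (assumptions): early positions are fast but error prone.
    freq, steps, err = [5, 3, 2], [1, 2, 3], [0.10, 0.05, 0.02]
    for eps in (0.07, 0.06, 0.04):
        print(eps, best_layout(freq, steps, err, duration=0.5, epsilon=eps))
```

In this toy instance, tightening ε forces the most frequent character away from the fastest position, and a sufficiently small ε leaves no feasible layout at all.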
Another practical constraint on keyboard design is that some groups of characters should be spatially grouped because of historical reasons or user preferences. For example, the numerical characters (i.e., 0–9) are commonly co-located and ordered. For the keyboards considered here, we grouped them together and put them in the tail of the keyboard layout, that is, the numerical characters are assigned to locations as follows:
$$\begin{aligned} \mathcal{G} = \{&(55, 55), (56, 56), (57, 57), (58, 58), (59, 59),\\ &(60, 60), (61, 61), (62, 62), (63, 63), (64, 64)\} \end{aligned} $$
(3)
where the indices 55–64 for the first coordinate correspond to the numerical characters 0–9 and the indices 55–64 for the second coordinate indicate the last ten locations on the keyboard. Figures 1, 2, 3, and 4 (and Additional file 1: Movie 1, Additional file 2: Movie 2, Additional file 3: Movie 3, and Additional file 4: Movie 4) show keyboards that follow these constraints.
We now describe how to find (Pareto) optimal solutions in which no other solution has both a lower average entry time and a satisfactory error rate. The task can be modeled as a MIP problem:
$$\begin{array}{*{20}l} \text{minimize}~ &C_{t}&& \end{array} $$
(4a)
$$\begin{array}{*{20}l} \text{subject to}~ & \sum\limits_{k\in \mathcal{K}}X_{ik}=1, &&\text{for all}\ i\in\mathcal{I}; \end{array} $$
(4b)
$$\begin{array}{*{20}l} &\sum\limits_{i\in \mathcal{I}}X_{ik}=1, &&\text{for all}\ k\in\mathcal{K}; \end{array} $$
(4c)
$$\begin{array}{*{20}l} &X_{ik}=1, &&\text{for each}~ (i, k)\in\mathcal{G}; \end{array} $$
(4d)
$$\begin{array}{*{20}l} &C_{e} \le \varepsilon, \end{array} $$
(4e)
$$\begin{array}{*{20}l} &X_{ik} \in \{0, 1\}, &&\text{for each}\ i\in\mathcal{I}; k\in\mathcal{K}; \end{array} $$
(4f)
$$\begin{array}{*{20}l} &C_{t}, C_{e} \ge 0. && \end{array} $$
(4g)
Objective (4a) minimizes the average time needed to reach items on the keyboard. Constraint (4b) ensures that each character is assigned to exactly one position on the keyboard. Constraint (4c) ensures that each position in the keyboard layout contains exactly one character. Constraint (4d) fixes the arrangement of the numerical characters as described above. Constraint (4e) ensures that the average error rate is no larger than the user-identified acceptable error rate ε. Constraints (4f) and (4g) are variable-type constraints: (4f) restricts each assignment variable to binary values, and (4g) keeps the average entry time and average error rate non-negative.
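It may help to see why this model is linear in the assignment variables. The restatement below is a sketch derived from Eqs. 1 and 2, not part of the original formulation: because the denominator \(\sum_{i\in \mathcal{I}}F_{i}\) is a positive constant for a given corpus, the error bound (4e) can be multiplied out, and minimizing \(C_{t}\) is equivalent to minimizing its numerator.

$$ C_{e}\le \varepsilon \quad\Longleftrightarrow\quad \sum\limits_{i\in \mathcal{I}}\sum\limits_{k\in \mathcal{K}}F_{i}P_{k}X_{ik} \le \varepsilon \sum\limits_{i\in \mathcal{I}}F_{i}, \qquad \min C_{t} \quad\Longleftrightarrow\quad \min \sum\limits_{i\in \mathcal{I}}\sum\limits_{k\in \mathcal{K}}F_{i}DS_{k}X_{ik} $$

Both expressions are weighted sums of the \(X_{ik}\) with fixed coefficients, so the problem is an assignment problem with one additional linear side constraint.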
As an initial check on the MIP approach, we took the \(S_{k}\), \(F_{i}\), D, and \(P_{k}\) terms as defined by Francis and Johnson (2011) and used Gurobi Optimizer 5.6 (Gurobi, 2013) to solve the MIP problem. The resulting solution of characters assigned to locations on the keyboard was very similar to what was produced by Francis and Johnson (2011) with a hill-climbing algorithm. However, the MIP algorithm was much faster. It took approximately 3 s to find a solution, whereas the hill-climbing algorithm took approximately 3 h to generate essentially the same keyboard layout. Furthermore, the MIP algorithm is guaranteed to find an optimal keyboard layout that satisfies the acceptable error rate condition (if one exists), while the hill-climbing algorithm yields a keyboard layout that may not be optimal.
For any given keyboard design task, most of the terms in the MIP model are readily available: the cursor duration, D, can be taken as a given value for any instance; the character frequencies, \(F_{i}\), can be estimated from a text corpus; the number of cursor steps, \(S_{k}\), can be calculated for a given cursor path; and characters assigned to fixed positions, \(\mathcal {G}\), can be readily created as needed. The only terms that remain to be identified are the error probabilities for different keyboard positions, \(P_{k}\). In the next section, we describe how to estimate these terms with a behavioral study.
Estimates of error probabilities
This section demonstrates one way to estimate the \(P_{k}\) terms that are needed to drive the optimization approach described above. \(P_{k}\) corresponds to the probability that a user guiding the cursor to key k on a switch keyboard makes an error at some point along the cursor path to that keyboard location. Ideally, these values would be guided by basic research on how quickly and reliably humans respond to dynamic stimuli. Although the literature on reaction time, timing, and predicted movements is enormous (Jensen, 2006; Posner, 1978), we were unable to identify published work that provides a model framework for characterizing the probability of correct responses for the kind of task involved in using a switch keyboard. Speed/accuracy trade-offs are commonly studied in areas such as traditional typing (Yamaguchi, Crump & Logan, 2013), but there the users largely control their actions rather than time their action to coincide with a stimulus event (the cursor being over an appropriate section of the keyboard). Likewise, the switch keyboard task seems related to simple reaction times, but effective use of the keyboard involves planning a precisely timed action rather than quickly responding to a stimulus. Such plans may occur well before the cursor covers a relevant part of the keyboard. Perhaps the area of basic research closest to the actions involved in using a switch keyboard is the measurement of coincidence timing (Smith & McPhee, 1987). Even here, though, the fit is not perfect, as the cursor movements are highly learned and initiated by switch keyboard users. Ultimately, we suspect that switch keyboard users memorize a planned set of timed actions that are learned for commonly used characters with a specific cursor duration (for example, see the user video at http://www.assistiveware.com/a-pivotal-role-in-the-household).
Here we describe an experiment that estimates these error probabilities for the different locations of the keyboard for several different cursor durations. These estimates will both contribute directly to the optimization methods described above and provide basic research into the accuracy of a sequence of timed actions. For the design task, the ideal experiment would estimate these probabilities from the same kind of individuals who will ultimately use the switch keyboard. In many cases, the ultimate users are patients with severe motor disabilities and their performance with a switch device might be quite different from a non-patient population. To complicate matters further, performance on the switch keyboard should be estimated after users have had substantial experience using the switch keyboard; otherwise the estimated probabilities will become inaccurate as users improve with additional practice.
With these constraints in mind, it seemed impractical (and perhaps unjustifiable) to ask patients to participate in a tedious experiment to measure performance on a switch keyboard until we had demonstrated the validity and value of the overall optimization approach. Thus, rather than using locked-in patients, we recruited seven students from Purdue University to complete up to 25 experimental sessions. The use of normal subjects rather than patients with motor disabilities restricts our conclusions about which keyboards should be used for a given person, but this was always going to be a conclusion of this kind of study. The optimized keyboard design for a given patient will depend on the characteristics of the user, their familiarity with the device, the type of text they enter, and their acceptable error rate. Since our intention is to demonstrate an optimization algorithm that can accommodate these individual characteristics, gathering data from normal students is appropriate (and much easier). Of course, interesting optimization outcomes may also be derived by specifically studying the characteristics of patients.
Subjects
Seven student participants used a button-style switch device (Origin Instruments, 2013) that is similar to a computer mouse but is specially designed for people with motor disabilities who use switch keyboards. A participant could attend at most two sessions each day (one in the morning and one in the afternoon). Before the formal experiments, a 1-h tutorial was given so that the participant could practice and become familiar with the keyboard and the switch device. To motivate participants to perform their best, each participant received a base payment of $5 for each session and a reward of $0.025 for each correct character entry. On average, participants earned around $45 for a session.