App icon similarity and its impact on visual search efficiency on mobile touch devices

Trapp, Anna K.; Wienrich, Carolin

doi:10.1186/s41235-018-0133-4

Original article
Open access
Published: 17 October 2018

Cognitive Research: Principles and Implications volume 3, Article number: 39 (2018)

Abstract

Users of mobile touch devices are often confronted with a great number of apps, challenging an efficient access to single applications. Especially when looking for infrequently used apps, users have to perform a visual search. We address this problem in two studies by applying knowledge about visual search efficiency to app icons on mobile touch devices. We aimed to transfer findings of similarity grouping for complex stimuli to a more applied setting and to investigate the effect of search efficiency on user experience. In Study 1 (N = 18), we varied set size and target presence as well as visual similarity between icons by color manipulation. Results indicated a highly efficient search when the target was easy to discriminate from the distractors and a less efficient search with increasing similarity. These results were replicated in a second, more realistic use case (N = 36). Regarding user experience, Study 2 showed that perceived usability and intuitiveness increased with search efficiency but that the overall liking also depended on the visual variety of the design. Moreover, although participants showed a general interest in a system supporting their search, most participants had concerns about data privacy with such a system. In conclusion, the results indicate that concepts and findings from basic attention research serve as fruitful heuristics for searches in more realistic (applied) settings. Furthermore, results showed that similarity manipulation with color works without controlling for other icon characteristics (e.g. luminance, shade). The findings might offer a new approach when designing for smooth interaction with mobile touch devices.

Significance

When users look at their mobile touch device to open an app, they first have to find the right icon. We investigate whether searches can be supported by colored app icons and how users feel about such a support system. It is known that search efficiency can be manipulated by similarity grouping. However, these results are mainly based on simple stimuli and a yes/no answer format. App icons are more complex because they have many different visual features and users have to tap the icon to start the app after navigating between different screens of the mobile device. This paper aims to transfer basic knowledge from visual search for these complex stimuli by applying the theoretical foundation of visual search in the context of mobile touch devices. This includes not only similarity grouping, effects of set size, and target presence, but also motoric processes. Additionally, we aim to connect search efficiency with user experience (UX). The results give a first insight into how icons can be designed to allow grouping, which can be used in further research regarding spatial learning processes, semantic categorization of apps, and individual preferences of app organization. The results indicate how important the overall design of mobile device screens is for UX. Moreover, a color grouping scheme might be a way to further empower designers to develop devices that match the user’s hedonistic and pragmatic needs.

Background

On mobile touch devices, efficiency refers to quick access to apps with a minimum of resources spent on the search. However, the increasing number of available apps forces users into a complex and inefficient visual search task. The complex search dynamic arises from factors such as the available screen size of mobile devices, similarity of app icons, the need to swipe through several smartphone pages, diverse use-environments (e.g. on a train, while walking), differing goals and use cases leading to altering target apps, and the need for complex motoric responses (e.g. touch displays with little or no haptic feedback). The search for an app icon is even further complicated because phone manufacturers often implement the possibility to adjust (“individualize”) the spatial icon array. Even though personalization can allow quicker access to certain apps, most phone manufacturers limit the degrees of freedom offered to arrange app icons and personalized arrangements can be altered by updates. Hence, human-centered app icon design faces the challenge of facilitating the complex visual search task on mobile touch devices. Our goal was to transfer knowledge from basic research on visual search to the field of human–computer interaction and app icon design, with the aim of providing mobile touch device users with a fluent experience when searching for an app. We first examined visual search efficiency for app icon selection by similarity grouping with universally applicable colored icons (Study 1). Visual search efficiency was then further investigated with regard to the appeal of these colored icons and their effect on the perception of the interaction qualities in terms of UX (Study 2). Three theoretical areas were considered: basic research regarding visual search; applied research dealing with visual search of icons; and the impact of (search) efficiency on UX.

Guidance of attention in visual search: Basic research

Artificial visual search

In basic experiments, visual search requires the scanning of spatial locations to identify a target item among a field of one or more distractor items. The increase in reaction time as a function of the number of items is often referred to as search rate or slope. It is a measure of the search efficiency (Treisman & Gelade, 1980). Low search rates show a more efficient search than high search rates (see Wolfe, 1996, for a review). The search rate is highly correlated with the target’s saliency, which results from physical attributes of the stimuli (i.e. bottom-up), as well as from the amount of selective attention paid to these contrasts (i.e. top-down) (see Moran & Desimone, 1985; Kastner & Ungerleider, 2000). Theoretical models of visual search (e.g. Guided Search Model; Cave & Wolfe, 1990; Wolfe, 1994) assume that attention is driven by an overall (master) map representation of integrated (bottom-up and top-down) priority signals whose output leads to a continuum of visual search results.
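To make the notion of a search slope concrete, the following minimal Python sketch fits a least-squares line to mean reaction times over set size and reads off the slope in milliseconds per item. The function name `search_slope` and all reaction times are hypothetical and serve only as an illustration; they are not data from the studies reported here.

```python
def search_slope(set_sizes, mean_rts):
    """Least-squares slope of mean reaction time over set size (ms per item)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


# Hypothetical mean reaction times in ms (illustration only).
set_sizes = [8, 16, 24]
efficient = [620, 628, 636]      # nearly flat function: efficient search
inefficient = [700, 1100, 1500]  # steep function: inefficient search

slope_efficient = search_slope(set_sizes, efficient)
slope_inefficient = search_slope(set_sizes, inefficient)
print(slope_efficient, slope_inefficient)
```

A nearly flat function (a slope close to zero) corresponds to an efficient search, whereas a steep function indicates that each additional item adds a substantial cost.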

The Guided Search Model discriminates between a pre-attentive and an attentive stage. The pre-attentive stage determines the weights of the values represented on the saliency map. The saliency map corresponds to the space in the visual field. Its values correspond to the significance of the visual information at the matching locations (Findlay & Gilchrist, 1998). At the attentive stage, attention is gradually guided to the highest points on the saliency map. Hence, the Guided Search Model describes how basic features (e.g. color) as well as internal states of the observer (e.g. goals) guide the deployment of attention via bottom-up and top-down processes. Many behavioral and neuroscientific investigations have confirmed the operation of these bottom-up and top-down processes in the guidance of attention in a variety of search tasks (e.g. Gaspar & McDonald, 2014; Geng, DiQuattro, & Helm, 2017; for reviews see Fecteau & Munoz, 2006 or Wolfe & Horowitz, 2017). But what makes guidance more or less efficient?

Guidance by similarity

Similarity strongly modulates the salience of a target. In the Attentional Engagement Theory, Duncan and Humphreys (1989) argued that different combinations of similarity relations between display items model the search efficiency, based on a visual grouping effect. More precisely, they describe a pre-attentive stage in which perceptually similar items (e.g. as indicated by similar orientation) form groups which can be jointly rejected or accepted for the selection process. Hence, the saliency of a target, and thereby the search efficiency, decreases with increasing target–distractor similarity (TDS) or with decreasing distractor–distractor similarity (DDS) (see Fig. 1). In the past, a wide range of similarity manipulation has been tested with regard to effective grouping (e.g. using letters: Corcoran & Jackson, 1977; using lines: Treisman & Gormican, 1988; using color patches: Farmer & Taylor, 1980). A key finding is that similarity differs along simple dimensions such as color and orientation (e.g. using orientation: Duncan & Humphreys, 1989; using color: Duncan, 1989; Bundesen & Pedersen, 1983).

Lessons from these models

Saliency is determined by bottom-up features (e.g. color contrast) and top-down factors (e.g. the relevance of color to the task). It guides attention during visual search. Moreover, target saliency can result from similarity relations, e.g. a red bird (target) in front of green leaves (distractors) will be highly salient because the target is dissimilar from the distractor while the distractors are highly similar to each other. In other words, the Guided Search Model (based on saliency) and the Attentional Engagement Theory (based on similarity relations) predict similar results for many search conditions. However, their predictions differ when comparing points b and d of the search continuum (see Fig. 1). Due to the fact that the Guided Search Model predicts a decreasing search efficiency as more items show the same characteristic in a task-relevant feature (e.g. red fruit in a search for red apples), steeper reaction time slopes are assumed for corner b than for corner d in Fig. 1. The opposite is predicted by Attentional Engagement Theory, according to which search rates increase with decreasing DDS and thus the requirement of the rejection of an increasing number of heterogeneously colored distractor items/groups in the process of visual search. We aim to transfer the knowledge from the above-mentioned basic research to the applied setting of visual search on mobile touch devices and to test the two predictions against each other. Hence, with the help of colored app icons, the predictions of the models were tested for four extreme points (compare corners a, c, b, and d of Fig. 1) of the visual search continuum.

Guidance of attention on mobile devices: Applied research

Well perceivable display shapes (e.g. through well-defined perceptual groups) also enable a fluent search (see, for example, Scott, 1993 for a detailed review) under more realistic search environments using app icons as search elements. McDougald and Wogalter (2014) focused on the influence of color to guide the user’s attention. They reported more correct descriptions of pictograms if relevant areas of the pictograms were highlighted by color than if there were no highlighting. The authors concluded that color directs the user’s attention to relevant areas and consequently facilitates comprehension. However, the impact of color-highlighting was only shown in terms of correct answers rather than search time. Hence, the effect on visual search efficiency could not be elucidated. In another study, participants were asked to rank colored icons according to their noticeability (Bzostek & Wogalter, 1999). The authors presented warnings at different screen locations, using different icons and colors. In line with the subjective perceptions, participants were able to notice warnings faster if they were presented with colored icons (blue and red) in comparison to black icons within a black inked text. Other factors such as icon location yielded no additional benefit in the color-present conditions. Results of both studies indicate that color effectively guided attention to relevant areas of the screen.

Other studies have shown that grouping of icons facilitates visual search. Niemelä and Saarinen (2000) found a more efficient search for spatially grouped icons compared to non-grouped icons or non-icon-items (words) on a computer screen. They presented 16 icons with file names in a 4 × 4 grid. Due to arranging four items spatially close together, these items formed perceptional groups that facilitated the search in comparison to random arrangements. Likewise, Brumby and Zhuang (2015) found a facilitation of visual search in menu interfaces due to semantic order and visual grouping, depending on the group size. Semantic order was implemented by listing words of one category together. Visual grouping was realized by framing words of one category. Visual grouping was more effective for larger semantically organized groups (six icons) than for smaller ones (three icons). They also showed the importance of semantic and visual accordance. Visual grouping of semantically unrelated icons was significantly slower in comparison to no grouping. Both studies showed that visual search on a computer screen can benefit from spatial grouping in certain conditions (e.g. semantic and visual accordance). However, neither a mobile nor a touch device was tested.

In sum, we know from applied research that guidance by simple features (e.g. color) and similarity grouping modulates the efficiency of visual search. Notwithstanding the evidence that color highlighting and similarity grouping are important factors of app arrangement on mobile touch devices, a systematic transfer of fundamental visual search results to this applied setting has not been done yet. A validation of results from visual search paradigms in the applied context of mobile touch devices might significantly contribute to the design of more efficient interfaces. Here, we also aim to test how users feel about applications modifying their search behavior. This aim is based on the user-centered design approach, which emphasizes the importance of including the user in the process of designing technology (ISO 9241–210, 2010; Norman & Draper, 1986).

User experience and the efficiency of use

The ISO standard on the ergonomics of human system interaction defines UX as “a person’s perceptions and responses that result from the use or anticipated use of a product, system or service” (ISO 9241–210, 2010, p. 7). Based on this definition, the Components of User Experience model (CUE model; Thüring & Mahlke, 2007; Minge, Thüring, Wagner, & Kuhr, 2016) assumes three major components of UX: the perception of instrumental qualities; the perception of non-instrumental qualities; and the experienced emotions. Instrumental qualities describe attributes of the system that are beneficial for the task such as usability and usefulness of the product. Non-instrumental qualities are not essential for completing the task and refer to aspects of visual attractiveness, aesthetics, and the enhancement of one’s own status. Emotions describe the inner state of the user that is affected by the interaction. All three components affect the consequences of the interaction with a product, such as the global UX evaluation and acceptance of the system, or the intention of reuse. Conversely, the components are also influenced by the user, the design of the product, and the context of the use. A wide range of studies found results supporting the framework of the CUE model (Aranyi & van Schaik, 2015; Ben-Bassat, Meyer, & Tractinsky, 2006; Hamborg, Hülsmann, & Kaspar, 2014; Lee & Koubek, 2012; Mahlke, Minge, & Thüring, 2006; Minge & Thüring, 2018; Thüring & Mahlke, 2007). These studies show that usability is understood as one of the key components for the perception of instrumental qualities, while visual attractiveness is one of the key components in the perception of non-instrumental qualities.

Focusing on usability, a connection between efficiency and UX appears. As usability is defined as the extent to which a product or service can be used effectively and efficiently while being satisfying to the user (ISO 9241–210, 2010), it becomes clear that efficiency of use is a major component of usability. In fact, efficiency measures in terms of temporal units per successfully completed task are frequently used in order to infer the objective usability of a device (Agarwal & Meyer, 2009; ISO 9241–210, 2010; Hamborg et al., 2014; Mahlke, 2008) and its intuitiveness (Blackler, Popovic, & Mahar, 2003; Blackler, Popovic, & Mahar, 2010). The more efficiently an interaction can be accomplished, the higher the objective usability and the intuitiveness of that device are. Moreover, efficiency measures correspond to reaction times in the visual search paradigm, as they are also a temporal measure per correct response. Hence, reaction time measures are comparable to “total task time”—or “time on task”—measures that are frequently employed in usability and UX testing. Furthermore, they could also be seen as an indicator of the efficiency of the interaction and not only of the efficiency of the search itself.

In sum, in an applied visual search paradigm on mobile touch devices, the efficiency of the search can be understood as an objective indicator of the usability. The efficiency, on the other hand, is also perceived by the user, which influences the perception of the intuitiveness and of the instrumental qualities of the device. Due to the fact that the present paper aims to optimize search screens on a mobile touch device, an increased efficiency should be reflected in a better UX and increased intuitiveness.

Aim of the present paper

Our goal was to increase the efficiency of app selection on mobile touch devices by using an organization scheme of app icons based on color. More precisely, we investigated whether basic findings from the visual search paradigm also hold for a more complex search situation with app icons and how users experience the colored icons, which presumably help to improve their search. In order to realize the first step of transfer, an artificial visual search with real app icons on a mobile touch device was conducted in Study 1, aiming to replicate effects of TDS as well as DDS. Reaction times were analyzed depending on set size and target presence. In Study 2, the effect of the TDS and DDS was replicated and its impact on UX and intuitiveness was investigated. Additionally, further aspects of the transformation of basic research to the applied setting of app search on mobile touch devices, such as response format (touching the target) and swiping, were realized. Finally, other important impacts, such as crowding (Pelli & Tillman, 2008) or guidance by memory (e.g. Chun & Jiang, 1998; Geyer, Zehetleitner, & Müller, 2010) were discussed as directions for future work.

Study 1

Methods

Participants

Eighteen participants (11 men) with a mean age of M = 25.6 (SD = 2.7) years took part in the first study. The sample size was computed with the help of the pwr package for R (Champely, 2018) and adjusted upward to fit the balancing process, expected d = 1, α-level = 0.05, power 1-β = 0.8. All participants had normal or corrected-to-normal vision, with no instances of color-blindness. Fifteen of the participants owned a smartphone. Participation was voluntary. Participants were offered course credits for participating. Before the beginning of the experiment, participants gave their informed consent according to the WMA Declaration of Helsinki.

Stimuli and design

A total of 26 icons were used; 25 of these were retrieved from the Apple App Store or from Google Play and one was designed by the authors. All icons measured 2.5 × 2.5 cm. There was a gap of 3 mm between icons. Participants sat at a desk. The test device lay on the desk. Participants were able to move freely while seated.^{Footnote 1} None of the icons included recognizable letters or numbers and all icons were transformed into a flat design (i.e. no color gradients) and set to black and white. Color was added to vary TDS as well as DDS between icons. To this end, 20 colors were chosen from the HSV color model. Each color belonged to one of four color groups (red, blue, yellow, and green). The colors differed in terms of their hue (red: 340°, 350°, 0°, 10°, 16°; yellow: 50°, 53°, 56°, 59°, 62°; green: 80°, 90°, 100°, 125°, 140°; blue: 210°, 216°, 222°, 228°, 234°), while saturation and value (in terms of lightness) were kept constant at 100%. The colors were chosen to provide high discriminability between color groups and high similarity within color groups. However, we ensured that the colors within one color group were still distinguishable. All icons were colored in each of the 20 colors, resulting in 520 colored icons. The icons differed in their overall luminance as well as the percentage of colored area, as real-life app icons do. Colors for distractor icons were chosen to originate either from the same color group as the target (high TDS) or from another color group (low TDS) (see Fig. 2). Distractor colors were chosen from the same color group (high DDS) or from three different color groups (low DDS). Targets were always presented in a unique color (0°, 222°, 56°, or 100°) while distractor colors were chosen from the other 16 possible colors. In accordance with smartphone displays, set size was varied between the levels of 8, 16, and 24 icons. The target was present in half of the trials (target presence). In sum, the study was based on a four-factorial within-subjects design (3 × 2 × 2 × 2) with the factors set size, target presence, TDS, and DDS.
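The color set described above can be reproduced in a few lines. The following Python sketch uses the standard-library `colorsys` module to convert the listed hue angles to 8-bit RGB values at full saturation and value. The names `HUES`, `TARGET_HUES`, `hue_to_rgb`, and `palette` are hypothetical; how the stimuli were actually rendered is not stated in the text, so this is only an assumption-laden illustration.

```python
import colorsys

# Hue angles (degrees) per color group, as listed in the text.
HUES = {
    "red": [340, 350, 0, 10, 16],
    "yellow": [50, 53, 56, 59, 62],
    "green": [80, 90, 100, 125, 140],
    "blue": [210, 216, 222, 228, 234],
}
TARGET_HUES = {0, 222, 56, 100}  # one unique target hue per color group
N_ICONS = 26


def hue_to_rgb(hue_deg):
    """Convert a hue angle to 8-bit RGB with saturation and value at 100%."""
    r, g, b = colorsys.hsv_to_rgb(hue_deg / 360.0, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


# Map (group, hue) to an RGB triple for all 20 colors.
palette = {
    (group, hue): hue_to_rgb(hue)
    for group, hues in HUES.items()
    for hue in hues
}

# Colors available for distractors: everything except the four target hues.
distractor_colors = [key for key in palette if key[1] not in TARGET_HUES]

# Every icon in every color.
n_colored_icons = N_ICONS * len(palette)
print(len(palette), len(distractor_colors), n_colored_icons)
```

The counts match the description: 20 colors in total, 16 of which remain for distractors, and 26 × 20 = 520 colored icons.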

Apparatus

The study was conducted on a 10-in. resistive touch device (Faytech) with a resolution of 1024 × 768, connected to a PC running ePrime (version 2.0).

Procedure

Trials started with a fixation cross in the middle of the screen. Participants were instructed to hold their forefinger over the fixation cross and tap the screen when they were ready to start. Once tapped, a colored target icon was presented for 2 s followed by another fixation cross. Again, participants had to tap to show they were ready before the search screen was presented. Each search screen consisted of 8, 16, or 24 colored icons, arranged in rows of four, and one gray box at the bottom of the screen. Participants were instructed to tap the target or to tap the gray box if the target was absent. After the response, a feedback screen was shown with information regarding the correctness of the response, the response time (in milliseconds) and how well they were performing during the ongoing block (percentage of correct trials).

The experiment was divided into 12 blocks according to set size, TDS, and DDS. Blocks with the same set size were presented consecutively while the order of set size presentation was balanced over participants. For each set size, the order of the four similarity conditions was randomized. Each block consisted of 24 randomized trials, of which 12 were target-present and 12 were target-absent trials. The total number of trials was 288 per subject (12 per condition). Before starting the main experiment, participants performed a short tutorial to become acquainted with the experimental procedure.
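The block structure and trial counts follow directly from the factorial design. The short Python sketch below enumerates the blocks (set size × TDS × DDS) and the design cells (blocks × target presence) to check the arithmetic; the variable names are hypothetical and the code does not reproduce the balancing or randomization of block order.

```python
from itertools import product

SET_SIZES = (8, 16, 24)
TDS = ("low", "high")
DDS = ("low", "high")
TARGET_PRESENT = (True, False)
TRIALS_PER_CELL = 12  # trials per condition, as stated in the text

# One block per combination of set size, TDS, and DDS.
blocks = list(product(SET_SIZES, TDS, DDS))

# One design cell per block and target-presence level.
cells = list(product(SET_SIZES, TDS, DDS, TARGET_PRESENT))

trials_per_block = TRIALS_PER_CELL * len(TARGET_PRESENT)
total_trials = trials_per_block * len(blocks)
print(len(blocks), len(cells), trials_per_block, total_trials)
```

This yields 12 blocks of 24 trials each, 24 design cells, and 288 trials per participant.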

Of the 26 icons, no icon was shown twice as a target in one block. However, an icon did reappear as a distractor. Furthermore, no icon was presented twice on a search screen. The target color (red, blue, yellow, or green), color groups, and target position were balanced for each participant.

Data analysis

The data analysis was conducted with R version 3.3.2 (R Core Team, 2016) and the following packages: afex (Singmann, Bolker, Westfall, & Aust, 2016); car (Fox & Weisberg, 2011); dplyr (Wickham & Francois, 2016); ez (Lawrence, 2016); ggplot2 (Wickham, 2009); MASS (Venables & Ripley, 2002); and tidyr (Wickham, 2016). Before the data analysis, all incorrectly answered trials (4.78%) were excluded from the dataset and reaction time data were transformed using a Box-Cox power transformation (λ = −0.55) to correct for positive skew (Venables & Ripley, 2002). The formula

$$ RT_{\mathrm{transformed}} = \frac{RT^{\lambda} - 1}{\lambda} $$

was used to compute transformed reaction time (RT_transformed) from reaction time (RT) with λ as the exponent of the power transformation. The optimal λ for this power transformation was determined by log-likelihood estimation. With this procedure, the order of the original data is retained. Thus, high or low values in reaction time translate into high or low values in the transformed reaction time, respectively. Subsequently, an outlier analysis was conducted to exclude trials with reaction times differing by more than two standard deviations from the mean, which was calculated separately for each participant and condition.^{Footnote 2} The remaining data (92.07%) were used to calculate slopes over set size for each participant and condition based on the variables TDS, DDS, and target presence. Hence, for each participant, there were eight slopes indicating the increase of reaction time with increasing set size. Due to the Box-Cox power transformation, these slope gradients were normally distributed. Subsequent analyses were based on slope values. All presented error bars were corrected for within-subject variance by the method suggested by Cousineau (2005). The alpha level was set at 5%. All post-hoc t-tests were corrected by Bonferroni–Holm correction.
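The transformation and the outlier criterion can be illustrated with a minimal Python sketch. The analysis itself was run in R; the functions `box_cox` and `trim_outliers` below are hypothetical stand-ins, the reaction times are invented, and the sketch assumes that the two-standard-deviation criterion is applied to the transformed values within a single participant and condition.

```python
from statistics import mean, stdev

LAMBDA = -0.55  # exponent reported in the text


def box_cox(rt, lam=LAMBDA):
    """Box-Cox power transform for lam != 0: (rt**lam - 1) / lam."""
    return (rt ** lam - 1.0) / lam


def trim_outliers(values, n_sd=2.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * s]


# Hypothetical reaction times (ms) for one participant in one condition,
# sorted in ascending order; the last trial is an obvious outlier.
rts = [540, 560, 575, 590, 600, 615, 630, 650, 700, 2400]

transformed = [box_cox(rt) for rt in rts]
kept = trim_outliers(transformed)
print(len(rts), len(kept))
```

Because the transform is monotonically increasing, the ascending order of the raw reaction times carries over to the transformed values, while the long right tail is compressed. In this example only the 2400 ms trial falls outside the two-standard-deviation band.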

Results

The statistical analysis was conducted with an ANOVA (type III) on slope values with the independent within factors TDS, DDS, and target presence. The results are presented in Table 1.

Table 1 ANOVA (type III) on slopes over set size

The results revealed a large main effect of TDS. As expected, the increase of reaction time over set size was much steeper (i.e. increasing slopes) when TDS was high than when it was low. DDS showed no significant impact on slope gradients. However, the expected interaction between TDS and DDS was present and revealed a medium effect size. The interaction is visualized in Fig. 3. The graph on the right shows that slopes were smaller when TDS was low (points a and c) compared to high TDS (points b and d). However, the slopes differed significantly between DDS levels at both TDS levels. When TDS was low, slope values were significantly smaller with high DDS compared to low DDS (points a and c, respectively), t(17) = 5.15, p < 0.001, d = 1.23. Conversely, when TDS was high, slope values were larger with high DDS compared to low DDS (points b and d, respectively), t(17) = −4.39, p < 0.001, d = 1.05.

Regarding target presence, the ANOVA revealed a small effect: slopes were steeper if the target was absent. As expected, this effect interacted with TDS: target presence had no effect when TDS was low, t(17) = −0.27, p = 0.787, d = 0.07, but a strong effect when TDS was high, t(17) = 5.25, p < 0.001, d = 1.24. This is in line with expectations and suggests once again an efficient search process when TDS is low and an inefficient process when TDS is high.

We reran the analysis with only those individuals who owned a smartphone. Within this subgroup, the main effect of target presence was only marginally significant and the interaction between TDS and DDS was more pronounced. However, the overall results were in line with the results of the entire group.

Discussion

Study 1 aimed at transferring effects of TDS as well as DDS to app icons on mobile touch devices. For this purpose, reaction times were analyzed depending on similarity (between colored app icons), set size (number of app icons), and target app presence. As expected, grouping of similar app icons occurred and enabled an efficient app selection. Hence, our results mirror those from basic (laboratory) visual search studies. Regarding the different predictions of the Guided Search Model and the Attentional Engagement Theory, the results support the prediction of the Guided Search Model (compare corner b and d in Fig. 3 to Fig. 1). When the target is highly similar to the distractors (high TDS), the Guided Search Model predicts a weaker guidance of attention by a basic attribute (e.g. color), the more distractors carry this attribute. In the same scenario, the Attentional Engagement Theory predicts more efficient grouping and rejection with increasing distractor similarity (i.e. leading to steeper reaction time slopes for corner d than b in Fig. 1). When realizing point b, we created an extreme variant of high TDS and high DDS in which all icons shared the same color group. Thus, grouping by color could not offer any relevant information for target–distractor discrimination. Consequently, the present pattern of results is not fully compatible with the prediction of Attentional Engagement Theory. Instead, the results suggest that priority map models of the search process such as Guided Search are more appropriate to predict search performance in app icon search.

In accordance with basic research findings, interaction effects of TDS and target presence revealed characteristic patterns of efficient and inefficient search processes. With regard to the more applied line of research, our findings are in line with previous research recognizing the influence of color highlighting of computer pictograms (e.g. Bzostek & Wogalter, 1999; McDougald & Wogalter, 2014) or emphasizing the impact of grouping of icons (e.g. Brumby & Zhuang, 2015; Niemelä & Saarinen, 2000).

Although basic research results were confirmed for a mobile touch device setting, a major limitation is that the visual search on mobile touch devices was not adequately replicated. In real life, users mentally visualize the app icon and its features before a search. The location and color of icons are learned and memorized through the interaction with the device. In Study 1, these two dimensions were excluded by randomization in order to focus on effects of similarity grouping. The impact of spatial learning and memory, the preparation of the search by learned colors, as well as the interplay of these complex cognitive processes are further addressed in the general discussion.

Furthermore, icons on mobile touch devices are frequently presented on multiple screens and the user might have to swipe through several screens to find the desired app. That scenario calls for an investigation of similarity grouping and learning across multiple screens. Finally, hedonic aspects of color organization schemes and of an accelerated search were not considered in Study 1. Thus, it seemed doubtful whether the increased efficiency of a visual search would be accompanied by a better UX. Study 2 aimed to extend the ecological validity of the results of Study 1 by implementing a visual search on multiple screens and investigating the impact of similarity grouping on UX.

Study 2

The goal of Study 2 was to confirm the effect of color guidance and similarity on reaction time in a more realistic context (objective 1) and to analyze the importance of a quick search for soft dimensions like UX and intuitiveness (objective 2). To this end, we aimed to bring the search task closer to a realistic one. On this account, the icons were presented on multiple screens. Additionally, we presented each icon with a name indicating the icon’s purpose to simulate the fact that a user’s search for an app is always based on a certain goal (e.g. checking for train connections, not for a train icon as such). Hence, the search space was enlarged and users had to swipe to navigate between the different search screens. However, in order to make Study 2 comparable with Study 1, the presentation of the exact target icon before the search display was retained in Study 2. Regarding the efficiency of the search, we expected to find similar results to those obtained in Study 1. Based on the CUE model (Thüring & Mahlke, 2007; Minge et al., 2016), we further expected that this efficiency would correlate with UX components. Mainly, it should affect intuitiveness and perceived usability. Additionally, we expected users to notice the more efficient search if there is a pop-out effect of the target.