Multisensory inclusive design with sensory substitution
Cognitive Research: Principles and Implications volume 5, Article number: 37 (2020)
Sensory substitution techniques are perceptual and cognitive phenomena used to represent one sensory form with an alternative. Current applications of sensory substitution techniques are typically focused on the development of assistive technologies whereby visually impaired users can acquire visual information via auditory and tactile cross-modal feedback. But despite their evident success in scientific research and furthering theory development in cognition, sensory substitution techniques have not yet gained widespread adoption within sensory-impaired populations. Here we argue that shifting the focus from assistive to mainstream applications may resolve some of the current issues regarding the use of sensory substitution devices to improve outcomes for those with disabilities. This article provides a tutorial guide on how to use research into multisensory processing and sensory substitution techniques from the cognitive sciences to design new inclusive cross-modal displays. A greater focus on developing inclusive mainstream applications could lead to innovative technologies that could be enjoyed by every person.
Sensory substitution devices transform the representation of one sensory input into a new representation with a different sensory form. For example, a visual feed from a camera can be turned into sound to be heard or into tactile stimuli that can be felt. The most common applications of this to date are in developing assistive technologies which aid vestibular problems, visual impairments and hearing impairments. State-of-the-art sensory substitution techniques can contribute significantly to our understanding of how the brain processes and represents sensory information. This progressively advances cognitive theories with respect to multisensory perception and cognition, attention, mental imagery and brain plasticity. Sensory substitution techniques provide a novel opportunity to dissociate the stimulus, the task and the sensory modality, and thus offer a unique way to explore the level of representation that is most crucial for cognition. Due to their versatility, sensory substitution phenomena have the potential to help translate the principles underpinning cognitive theories of multisensory perception into other interdisciplinary research areas, such as human-computer interaction and artificial intelligence. In this review, we provide a novel framework which has two main aims: (i) to explain how applying sensory substitution techniques in a multisensory context for inclusive design can have a wide benefit to society beyond individuals with disabilities and (ii) to explain how inclusive cross-modal displays which utilise sensory substitution techniques will contribute to future cognitive theories of sensory processing.
Disability has previously been regarded as something which professionals should seek to cure and attempt to provide rehabilitation for, enabling individuals to make strides towards a more ‘normal’ existence (Dewsbury, Clarke, Randall, Rouncefield, & Sommerville, 2004). This view of disability has been rejected by disability rights activists, and today, barriers created by society are more commonly seen as being disabling for those with ‘impairments’ (Oliver, 2013). In this review, we argue that new cross-modal displays should be developed which target a mainstream audience and have an inclusive design. This approach to design is about creating products or services which address the needs of the widest possible audience and can therefore be used by anyone, regardless of their age or abilities (Design Council, 2019). We focus specifically on sensory substitution techniques, which build on cognitive mechanisms to represent one sensory form with an alternative and can be utilised to develop inclusive cross-modal displays.
What does it mean for design to be inclusive?
The social model of disability arose following the Civil Rights Movement during the 1950s and 1960s (Gallagher, Connor, & Ferri, 2014). This model rejects the idea that disability should be viewed as a personal tragedy, a so-called medical model of disability and, instead, emboldens those who have impairments to demand that disabling barriers in society are dismantled (Burchardt, 2004). Coinciding with this change in perspective of disability rights has been a shift within design and engineering to inventing products and services which fulfil the needs of all users, regardless of any impairments. This is known as inclusive design, which is also termed Design for All (within Europe) and Universal Design (in Japan and America). Inclusive design seeks to remove barriers that people with different levels of capabilities may encounter by keeping potential barriers in mind at every stage of the design process (Newell, 2003).
For mainstream products to have an inclusive design, they should either have the ability to cater to all users without the need for any modification or adaptation or they should have the capacity for specialised access equipment to be attached to the product, giving the original product greater functionality (Newell, 2003). Currently, few design courses teach a social model of disability, and perhaps as a result of this, many mainstream products continue to overlook the needs of individuals with impairments (Gieben-Gamal & Matos, 2017). People who have impairments are continually viewed as somewhat of a niche population, and separate products, often called assistive technologies, are used to support them (Gieben-Gamal & Matos, 2017). However, importantly, good inclusive design which improves access to those with impairments can bring benefits to everyone. When products are made which overcome barriers faced by a subset of users, their effectiveness for all users is often improved (Persson, Åhman, Yngling, & Gulliksen, 2015).
That no design can realistically meet the needs and desires of every individual in the population is widely accepted (Bichard, Coleman, & Langdon, 2007). However, when inclusive designs consider the users to be consumers or customers, a competition between different designs to become the most desirable product will ultimately grow, and this will result in a variety of products to suit different preferences (Newell, 2003). The creation of a diversity of inclusive products provides users who have impairments with the freedom to make choices about the ways in which they would like to engage with the environment.
Future technologies could benefit from applying research from the cognitive sciences to new devices which can be enjoyed by everyone. Embedding cognitive theories into design principles can improve how we build inclusive technologies and interact with them (Obrist, Gatti, Maggioni, Vi, & Velasco, 2017; Oviatt, 1999). This paper presents the argument that the development of new cross-modal displays with multisensory modes for the mainstream will serve to benefit all users, regardless of any impairments they might have.
Crossing the boundaries from cognitive science to human-computer interaction research
The current paper will demonstrate how cognitive theories can contribute to shaping future technologies. We provide a tutorial guide on how to use research into multisensory processing to design new inclusive cross-modal displays. We first provide an overview of the concepts surrounding multisensory processing and outline the three guiding principles which are necessary for multisensory processing to occur. Next, we give an overview of the possible outcomes which can be achieved by a cross-modal display and explain how these differ from one another. We provide an overview of sensory substitution techniques including how they work and how they have been implemented. We then explain how sensory substitution techniques can be utilised for the purposes of creating inclusive technologies. Finally, we suggest a number of future applications of inclusive cross-modal displays, which everyone, regardless of ability, could enjoy.
Understand the literature on multisensory processing
Gaining an overview of the key terms used in the literature and of the key principles underlying multisensory perception is important when building an inclusive cross-modal display. In this section, we provide a framework for multisensory processing by defining the key terms used by researchers in this area. We then describe the guiding principles underlying multisensory processing.
Framework for multisensory processing
Multisensory perception is where information from more than one sense is processed simultaneously. The terminology associated with multisensory perception can be inconsistent and confusing, and the same terms are sometimes used by cognitive scientists, computer scientists, applied researchers and others to describe different concepts. Stein et al. (2010) provide a useful guideline for a common nomenclature for multisensory phenomena. To deal with the semantic inconsistencies, they suggest using the generic term ‘multisensory processing’ to describe any multisensory phenomena such as multisensory integration or multisensory combination. Table 1 further defines some of the key terms used to describe concepts associated with multisensory processing. Recognising that these concepts vary in the properties to which they have been attributed is important. Some of the concepts refer to a neural or behavioural response, some a display type and some a sensory source (Table 1). For example, the term ‘multisensory’ refers to the internal neural and behavioural response when multiple senses are stimulated. This is fundamentally distinct from the term ‘cross-modal’, which refers to the external sensory source in the environment which emits information that can be processed by more than one sense.
There are three guiding principles which are necessary for multisensory processing to occur: spatial coincidence, temporal coincidence and inverse effectiveness (Kayser & Logothetis, 2007). We next provide an overview of each of these principles.
Principles of spatial and temporal coincidence
In order for multisensory processing to happen, cross-modal information needs to come from spatially aligned sensory sources (Stein, 1998). This is known as the spatial coincidence principle. Cross-modal information also needs to come from sources which are in close temporal proximity to one another (Stein & Wallace, 1996). This is known as the temporal coincidence principle. For example, we can see temporal coincidence failing when technical glitches cause actors’ speech to become asynchronous to their lip movements. In this case, the auditory information is not temporally aligned with the visual information. Cross-modal information that is not spatially or temporally aligned may be perceived as if it comes from separate sensory sources, causing a depression in the multisensory response and instead leading to separate unisensory responses (Stein, 1998; Stein & Wallace, 1996).
Principles of spatial and temporal coincidence are manipulated in a number of human-computer interaction (HCI) studies to investigate the margins between congruent and incongruent cross-modal cues for improving user experience and performance. For example, congruent visual and audio/tactile stimuli can be used to increase the perceived quality of buttons on touch-screen devices (Hoggan, Kaaresoja, Laitinen, & Brewster, 2008). HCI researchers also make use of the principle of spatial and temporal coincidence by purposefully utilising incongruent multisensory stimuli in some technologies. For example, one limiting factor of current virtual reality technologies is the way in which users underestimate the distance between themselves and the target object, known as distance compression. To manage distance compression, incongruent audio-visual stimuli can be designed to artificially align the two senses (Finnegan, O’Neill, & Proulx, 2016).
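As an illustration only, the two coincidence principles can be expressed as a design-time check on a pair of cross-modal cues. The sketch below is written in Python; the `Cue` structure, the function name and both tolerance values are hypothetical placeholders rather than empirically derived thresholds, and suitable tolerances would need to be established for each display and task.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """A single cue emitted by a cross-modal display (hypothetical structure)."""
    modality: str          # e.g. "visual", "auditory", "tactile"
    azimuth_deg: float     # horizontal position of the source
    elevation_deg: float   # vertical position of the source
    onset_ms: float        # time at which the cue starts


def is_coincident(a: Cue, b: Cue,
                  max_offset_deg: float = 10.0,
                  max_delay_ms: float = 100.0) -> bool:
    """Return True if two cues are spatially and temporally aligned.

    Both tolerances are illustrative assumptions, not published thresholds.
    """
    # Spatial coincidence: angular separation between the two sources.
    offset = math.hypot(a.azimuth_deg - b.azimuth_deg,
                        a.elevation_deg - b.elevation_deg)
    # Temporal coincidence: absolute difference between the onsets.
    delay = abs(a.onset_ms - b.onset_ms)
    return offset <= max_offset_deg and delay <= max_delay_ms
```

Under these assumed tolerances, a visual cue and an auditory cue that start 50 ms apart from nearly the same location would pass the check, whereas the same pair presented 400 ms apart would fail it, mirroring the asynchronous speech example above.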
Principle of inverse effectiveness
For multisensory processing to happen, the cross-modal stimuli which are compared with one another need to be approximately equally reliable. If a cue from one modality elicits a stronger behavioural response than the cue from another modality when presented together from the same sensory source, the multisensory processing of the sensory source will be weakened (Perrault, Vaughan, Stein, & Wallace, 2003; Stanford, 2005; Stanford & Stein, 2007; Stein & Wallace, 1996). This is known as the inverse effectiveness principle. The multisensory calibration, and thus the reliability, of our senses usually emerges at different critical periods during development (Bremner, 2017). For example, children are thought not to become optimally proficient in integrating visual and tactile information until at least the age of eight (Gori, Del Viva, Sandini, & Burr, 2008; Nardini, Jones, Bedford, & Braddick, 2008; Scheller, Proulx, de Haan, Dahlmann-Noor, & Petrini, 2020). Similarly, acquired senses (e.g., via sensory augmentation devices) can be calibrated with intact senses and adjusted for multisensory processing through learning and experience, thereby increasing their reliability (Proulx, Brown, Pasqualotto, & Meijer, 2014). The principle of inverse effectiveness also gives rise to the phenomenon that users of augmentation displays must rely heavily on their intact senses until the newly acquired sense becomes equally reliable. Furthermore, if more than one acquired sense is utilised via sensory augmentation devices, their reliability is expected to be equal to each other because they are both unfamiliar to the user (Proulx et al., 2014).
Determine the possible outcomes for cross-modal displays
When building an inclusive cross-modal display, gaining a comprehensive understanding of the possible outcomes from the device is important. Two possible outcomes exist from cross-modal displays: multisensory integration or multisensory combination (Table 1). Multisensory integration and multisensory combination are two of the terms which are used inconsistently in the literature (Stein et al., 2010). Applied researchers tend to use the term multisensory integration, but the use of this term is sometimes misleading. Fundamental differences exist between these outcomes, which require some explanation to prevent misunderstandings occurring later in the design process. Therefore, we next provide an explanation of these two separate outcomes.
Multisensory integration is where cross-modal cues are integrated to give a perception which is significantly different from the perception experienced when only one cue is processed (Stein et al., 2010). We can conceptualise this by imagining how we might perceive a piece of fruit, such as a pear. If we only see the pear, we might say its size is approximately 6 units. However, we might perceive the pear to be slightly bigger, perhaps 8 units, if we were given the pear to hold with our eyes closed. If we hold the pear while looking at it, assuming both senses are equally reliable, the size of the pear would be perceived to be approximately 7 units. Multisensory integration happens when both senses are used to inspect the pear, reducing uncertainty regarding its size and giving an estimate that is somewhere between each individual unisensory estimate (Rock & Victor, 1964).
Multisensory integration is viewed as the neural process of integrating redundant sensory cues in an optimal fashion (Rohde, van Dam, & Ernst, 2016). Here, the level of redundancy is a function of the principles of spatial and temporal coincidence and of inverse effectiveness. By this definition, a high level of redundancy occurs when the reliability of multiple senses is approximately equal and when the sensory sources are spatially and temporally aligned. Thus, in our pear example, a high level of redundancy is present. To perceive the pear accurately as 7 units, we rely equally on our vision and our touch. The level of redundancy can change when environmental circumstances temporarily reduce the reliability of one or more of our senses. For example, when the pear is inspected under a magnifying glass, our vision becomes less reliable for estimating its real size. When the discrepancy between the reliability of different senses increases, the level of redundancy decreases. When redundancy is too low, no multisensory integration occurs. For example, if we were to hold the pear while looking at it under a magnifying glass, our tactile perception would still suggest it has a size of 8 units, yet our visual perception might suggest it has a size of 20 units. An automatic process is thought to happen during multisensory integration, whereby greater statistical weight is assigned to the more reliable sensory source (Talsma, Senkowski, Soto-Faraco, & Woldorff, 2010). Thus, when magnification is applied, more weight is assigned to our tactile sense, and the high discrepancy in reliability means redundancy becomes too low for multisensory integration to occur. Rather than a multisensory percept, we perceive the visual and tactile information as unisensory. Since more weight is assigned to our tactile sense, we perceive the pear to have a size of 8 units.
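The pear example can be worked through numerically. The following Python sketch assumes a reliability-weighted average in which each cue is weighted by the inverse of its variance, which is one common way of formalising the idea that greater statistical weight is assigned to the more reliable sensory source. The variance values and the `max_reliability_ratio` cut-off are hypothetical numbers chosen only to reproduce the example; they are not measured quantities.

```python
def integrate_cues(visual_estimate: float, visual_variance: float,
                   tactile_estimate: float, tactile_variance: float,
                   max_reliability_ratio: float = 4.0) -> tuple[float, bool]:
    """Combine a visual and a tactile size estimate.

    Returns (perceived_size, integrated). Reliability is modelled as
    1 / variance. If one sense is more than `max_reliability_ratio` times
    as reliable as the other (an assumed cut-off), redundancy is treated
    as too low and the more reliable unisensory estimate is returned.
    """
    visual_reliability = 1.0 / visual_variance
    tactile_reliability = 1.0 / tactile_variance

    ratio = (max(visual_reliability, tactile_reliability)
             / min(visual_reliability, tactile_reliability))
    if ratio > max_reliability_ratio:
        # No integration: the percept follows the more reliable sense.
        if tactile_reliability >= visual_reliability:
            return tactile_estimate, False
        return visual_estimate, False

    # Integration: reliability-weighted average of the two estimates.
    total = visual_reliability + tactile_reliability
    perceived = (visual_reliability * visual_estimate
                 + tactile_reliability * tactile_estimate) / total
    return perceived, True


# Equally reliable senses: vision says 6 units, touch says 8 units.
normal_view = integrate_cues(6.0, 1.0, 8.0, 1.0)

# Magnifying glass: vision says 20 units but is far less reliable.
magnified_view = integrate_cues(20.0, 25.0, 8.0, 1.0)
```

With these assumed values, `normal_view` is an integrated estimate of 7 units, and `magnified_view` falls back to the tactile estimate of 8 units without integration, matching the two cases described above.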
Multisensory combination is another possible outcome of multisensory processing. This again provides a more accurate estimation of something in space, such as an object, compared to unisensory perception, but the process to arrive at this estimation is fundamentally different from the process of multisensory integration. In this case, the perception we experience from one sense provides complementary information to the perception derived from another sense. The two experiences are combined to give a more robust estimation of the sensory source (Bülthoff & Mallot, 1988). Using the pear example, when we see the pear from the front we might perceive the pear to be approximately 6 units. Next, we pick the pear up and we can feel a bulge on the back of the pear. We could not tell from looking at the pear that its shape is asymmetrical and its back is much more convex than its front; this information is not redundant. This new information causes us to change our perception of the size of the pear (Newell, Ernst, Tjan, & Bülthoff, 2001). We now know that the pear must be bigger than 6 units, therefore our multisensory perception has provided a more accurate estimate than our visual perception alone could provide.
A summary to show how the outcomes of multisensory processing can be applied when prototyping cross-modal display modes is provided in Fig. 1. Cross-modal displays can achieve either multisensory integration or multisensory combination. In achieving multisensory integration, a display provides the user with redundant sensory information. This enables the user to use multiple senses to acquire a more accurate estimate of the sensory source than would be achieved by a unisensory alternative. In achieving multisensory combination, on the other hand, cross-modal displays are not limited to providing redundant information. Instead, they enable the user to use multiple senses to gain additional information, which has the effect of maximising what they can perceive from the environment.
Most studies within HCI research aim to achieve multisensory integration (Stein et al., 2010). While multisensory integration is a common natural phenomenon, such integration is very difficult to achieve using artificial devices such as sensory augmentation devices. This is due in part to the length of time that is necessary for adaptation. Spatially and temporally aligning sensory information to resemble a natural sensory source is also challenging in digital environments. Few HCI researchers refer to multisensory combination. However, some have used complementary cross-modal cues to decrease users’ cognitive load on mobile devices (Hoggan & Brewster, 2007). While not mentioned explicitly, this is an example of multisensory combination because the perceived experience of touchscreen buttons was enhanced by the combination of visual and complementary audio/tactile feedback (Hoggan et al., 2008). A gap currently exists between cognitive theory and applied research because HCI researchers find it difficult to demonstrate that devices are effective in achieving multisensory integration. However, a greater focus on multisensory combination could enable applied researchers to evidence the effectiveness of their devices more easily. We therefore recommend that applied researchers aim for multisensory combination when building new inclusive multisensory devices.
Utilising sensory substitution techniques to design inclusive cross-modal displays
The cross-modal displays that we are particularly interested in, when thinking about how to create inclusive technology, utilise sensory substitution techniques. These devices have the potential to be used by the mainstream, to enhance any user’s capability to interact with their environment. Our next section explains how sensory substitution techniques work, who currently uses them, and the potential they have for expansion into a broader market.
The mechanisms underpinning sensory substitution techniques
A single physical feature in the environment can clearly be processed by multiple senses. For example, the edge of a cup can be seen and also touched. Traditionally, it was thought that the brain consisted of independent unisensory modules which processed information before multisensory percepts occurred through bottom-up facilitation only (Choi, Lee, & Lee, 2018). However, this view was later challenged by evidence showing our brains execute metamodal computations and tasks through utilising an integrated network, known as the metamodal organisation of the brain (Pascual-Leone & Hamilton, 2001). This hypothesis takes the view that the brain is not necessarily organised by sensory modality but rather by the computational or functional task being carried out (Proulx et al., 2014). For example, seeing and touching the edge of a cup will lead to the activation of multiple senses. Due to the metamodal organisation of the brain, multiple senses will evoke shared cognitive forms to perceive the edge of the cup. The metamodal hypothesis has been repeatedly supported by a growing body of empirical evidence (Brefczynski-Lewis & Lewis, 2017; Ortiz-Terán et al., 2016; Ricciardi, Bonino, Pellegrini, & Pietrini, 2014; Ricciardi & Pietrini, 2011). Indeed, research demonstrates our brains have evolved the ability to use incoming information from multiple senses to create a coherent perception of the environment (Ghazanfar & Schroeder, 2006; Spence, 2011). Furthermore, the bottom-up sensory responses are shown to be modulated by top-down facilitations (e.g., memory and attention) such that previously acquired associations can enhance task-relevant multisensory responses. For a detailed review, see Choi et al. (2018).
As a result of the metamodal organisation of the brain, sensory information and cognitive forms are learnt and hence gradually associated with one another in relation to bottom-up and top-down facilitations. Since two entirely separate concepts can have shared cognitive forms it is possible for seemingly random concepts to become associated with one another. For example, while an inedible object would not appear to have a taste in one’s mind, research finds that individuals conceive boulders to be sour (Woods, Spence, Butcher, & Deroy, 2013). According to the metamodal theory, this is because boulders and a sour taste are represented by a shared cognitive form in the brain; therefore, the two concepts have become associated. Other unusual findings include lemons being conceived to be fast and prunes to be slow (Woods et al., 2013). Strangely, research has even found both sighted individuals and the early blind, who never experienced colour perception, perceive the colour red to be heavy (Barilari, de Heering, Crollen, Collignon, & Bottini, 2018; Woods et al., 2013). These abstract associations are argued to be the result of shared conceptual dimensions between cognitive forms which are common across cultures and languages (Barilari et al., 2018; Spence, 2011; Spence & Parise, 2012b).
Cognitive scientists have investigated these associations across cultures and languages by looking at the mappings between different sensory forms, which are termed cross-modal correspondences (Spence & Parise, 2012b). An example of cross-modal correspondences is where a higher-pitched signal of an auditory form is associated with a higher vertical elevation of a visual form (Melara & O’Brien, 1987), and a louder sound, with a brighter visual form (Marks, 1974). This means that, due to the metamodal organisation of the brain, stimulating one sense can result in the activation of the same cognitive form that would have been activated when a different sense was stimulated (Fig. 2). The representation of features of a sensory experience, such as seeing, using a different sensory form, such as hearing, is called sensory substitution (Esenkaya & Proulx, 2016). Accordingly, sensory substitution techniques take advantage of cross-modal correspondences by evoking in the brain a cognitive form by stimulating a different sense from the one usually stimulated. Sensory substitution devices (SSDs) are essentially cross-modal displays (Kaczmarek, Webster, Bach-y-Rita, & Tompkins, 1991) which take advantage of the way in which complementary cross-modal cues are associated with one another (Parise & Spence, 2012; Spence, 2011; Spence & Parise, 2012b).
Sensory substitution devices for assisting those with disabilities
Currently, SSDs are mainly regarded as technology which can assist individuals who have disabilities. Research into SSDs for assisting those who are blind or partially sighted dominates the field, over any other disability. This section provides an overview of the ways in which SSDs work in the context of individuals who are blind and describes current uses of SSDs.
Applying sensory substitution techniques to the visually impaired enables access to visual information via non-visual cross-modal cues (Chebat et al., 2018; Maidenbaum, Levy-Tzedek, Chebat, & Amedi, 2013; Proulx & Harder, 2008). The mappings between visual and auditory forms, in terms of elevation and pitch, and brightness and loudness, can be utilised via SSDs to represent some features of a visual form with an auditory form (Meijer, 1992). In the long term, these pairings may be strongly associated such that late blind people can have visual imagery similar to that of the perception of sight (Esenkaya & Proulx, 2016; Ortiz et al., 2011; Ward & Meijer, 2010). With SSDs, it is possible to acquire visual information by means of sonifications (Meijer, 1992) or two-dimensional tactile cues (Bach-y-Rita & W. Kercel, 2003). It is also possible to acquire auditory information by means of vibrotactile cues (Butts, 2015; Eagleman, Novich, Goodman, Sahoo, & Perotta, 2017).
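To make the elevation-to-pitch and brightness-to-loudness mappings concrete, the following Python sketch converts a small greyscale image into a sequence of tones. It is a simplified illustration and not the algorithm of any particular device: the left-to-right column scan, the linear frequency spacing, the frequency range and all function names are assumptions made for the sketch.

```python
def row_to_frequency(row: int, n_rows: int,
                     min_hz: float = 500.0, max_hz: float = 5000.0) -> float:
    """Map an image row to a pitch: higher rows get higher frequencies.

    Row 0 is the top of the image. The frequency range and the linear
    spacing are illustrative assumptions.
    """
    if n_rows == 1:
        return min_hz
    height = (n_rows - 1 - row) / (n_rows - 1)  # 1.0 at the top, 0.0 at the bottom
    return min_hz + height * (max_hz - min_hz)


def sonify_image(image: list[list[float]]) -> list[list[tuple[float, float]]]:
    """Turn a greyscale image into a list of time steps.

    `image` is a list of rows with brightness values between 0.0 and 1.0.
    Each column becomes one time step (assumed left-to-right scan), and
    each non-black pixel becomes a (frequency_hz, amplitude) pair in which
    brightness is used directly as loudness.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    soundscape = []
    for col in range(n_cols):
        tones = [(row_to_frequency(row, n_rows), image[row][col])
                 for row in range(n_rows)
                 if image[row][col] > 0.0]
        soundscape.append(tones)
    return soundscape


# A 3 x 2 image: a bright pixel at the top left, a dimmer one in the middle right.
example_image = [
    [1.0, 0.0],
    [0.0, 0.5],
    [0.0, 0.0],
]
example_soundscape = sonify_image(example_image)
```

In this sketch the bright top-left pixel yields a loud, high-pitched tone in the first time step, and the dimmer pixel in the middle row yields a quieter, lower-pitched tone in the second.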
In developing assistive devices for individuals with visual impairment, research has predominantly investigated visual-to-auditory and visual-to-tactile sensory substitution techniques. For examples of devices, see EyeMusic (Abboud, Hanassy, Levy-Tzedek, Maidenbaum, & Amedi, 2014), Vibe (Durette, Louveton, Alleysson, & Hérault, 2008), See ColOr (Bologna, Deville, & Pun, 2009), The PSVA (Capelle, Trullemans, Arno, & Veraart, 1998), Elektroftalm (Starkiewicz & Kuliszewski, 1963) and the Optophone (d’Albe, 1914). These techniques have been shown to be successful in emotion conveyance, object recognition, localisation, avoidance and navigation tasks (see Table 2 for references). The vast majority of research on visual-to-auditory devices has focused on the associations between the direction of pitch and movement. For some devices, e.g., The vOICe, higher pitched sonification signals are paired with higher elevations of visual signals. Another device, Synaestheatre, incorporated multiple auditory components and associated these signals with movement. A 3D sensor recorded depth information, which was presented as spatialised sounds so that azimuth (the horizontal angle) and elevation could be conveyed (Hamilton-Fletcher, Obrist, Watten, Mengucci, & Ward, 2016).
Devices which use visual-to-tactile sensory substitution techniques utilise cross-modal pairings, which are more intuitive and analogical than visual-to-auditory sensory substitution techniques. For example, a circle can be directly conveyed on the skin (e.g., on the back or tongue) via tactile cues presented in a two-dimensional circular pattern. To enhance navigation, tactile sensory substitution techniques have been used to represent magnetic North or to provide positional information using a tactile belt or vest (Jones, Nakamura, & Lockyer, 2004; Rochlis, 1998; Visell, 2009). For other examples of visual-to-tactile SSDs, see Tongue Display Unit (Sampaio, Maris, & Bach-y-Rita, 2001), TVSS (Bach-y-Rita, Collins, Saunders, White, & Scadden, 1969; Bach-y-Rita & W. Kercel, 2003), Optacon (Linvill & Bliss, 1966) and Optohapt (Geldard, 1966). Another line of research has manipulated the strength of tactile vibrations to convey distance information that would otherwise be perceived visually. For example, see EyeCane (Maidenbaum, Levy-Tzedek, Chebat, Namer-Furstenberg, & Amedi, 2014), ETA (electronic travel aid) and EOA (electronic orientation aid) (Dakopoulos & Bourbakis, 2010; Farcy et al., 2006; Liu, Liu, Xu, & Jin, 2010), UltraCane and UltraBike (Sound Foresight Technology, 2019a, 2019b). Visual-to-tactile sensory substitution techniques have enabled users to successfully complete a variety of object recognition, localisation, avoidance and navigation tasks (Table 2).
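The idea of conveying distance through vibration strength can be sketched as a simple mapping. The Python example below is a hypothetical illustration: it assumes that nearer obstacles produce stronger vibration, that intensity falls off linearly, and that the sensor has a maximum range of 5 m. None of these choices is taken from the devices cited above.

```python
def distance_to_vibration(distance_m: float, max_range_m: float = 5.0) -> float:
    """Map a measured distance to a vibration intensity between 0.0 and 1.0.

    Assumptions for illustration: closer objects vibrate more strongly,
    the fall-off is linear, and anything at or beyond `max_range_m`
    produces no vibration.
    """
    if distance_m < 0.0:
        raise ValueError("distance must be non-negative")
    if distance_m >= max_range_m:
        return 0.0
    return 1.0 - distance_m / max_range_m
```

Under these assumptions an obstacle touching the sensor gives full intensity, an obstacle at 2.5 m gives half intensity, and anything beyond 5 m is silent. A real device might instead use a non-linear mapping or vary pulse rate rather than amplitude.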
Many approaches have successfully conveyed colour information using cross-modal auditory and tactile feedback. For a detailed review of SoundView, Eyeborg, Kromophone, See ColOr, ColEnViSon, EyeMusic and Creole, see Hamilton-Fletcher and Ward (2013) and Hamilton-Fletcher, Wright, and Ward (2016). For example, EyeMusic utilises the cross-modal correspondences between musical instruments and colour to convey colour information.
In recent years, a number of cross-modal displays that utilise both auditory and tactile feedback have been prototyped and studied in the context of spatial cognition, with encouraging results: EyeCane (Amedi & Hanassy, 2011) and SoV (Hoffmann, Spagnol, Kristjánsson, & Unnthorsson, 2018). In Fig. 3, a low fidelity audio-tactile cross-modal display prototype developed by the authors is also shown. Here, BrainPort (Wicab, 2019) and The vOICe (Meijer, 1992), two commercially available devices that utilise sensory substitution techniques, are used. BrainPort is a visual-to-tactile sensory substitution device that delivers visual information captured from a live camera via an electro-tactile interface, which is placed on users’ tongues. The vOICe is a visual-to-auditory sensory substitution device that transforms a live camera feed into sonifications. Inside the box is a camera connected to these devices. The live camera feed captures an aerial map of multiple targets and delivers this information to the users in tactile, sonification or tactile-sonification forms. Here, the user can acquire spatial information necessary for navigation via electro-tactile stimulation on her tongue and also via sonifications which are delivered by bone conduction headphones (Jicol et al., 2020). The cross-modal display prototype here applies the principles of spatial and temporal coincidence by aligning the sensory information available to the camera. This is simply achieved by fixing the camera in place with a scaffold inside a box. The box also ensures the consistency of environmental factors for experimentation purposes. The principle of inverse effectiveness enables users to rely equally on the cross-modal feedback from two novel display modes. This equal reliance is possible because users have not previously used BrainPort or The vOICe.
From a theoretical perspective, this research investigates the relationship between multisensory integration and multisensory combination and how sensory substitution techniques are represented in multisensory processing. From an applied point of view, this research aimed to develop inclusive cross-modal displays that can efficiently deliver the same sensory information in different sensory forms.
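The kind of visual-to-auditory transformation performed by devices such as The vOICe can be illustrated with a minimal sketch. It assumes the commonly described image-to-sound mapping in which a greyscale image is scanned from left to right, a pixel's vertical position sets pitch and its brightness sets loudness. The function names, frequency range, sample rate and column duration are illustrative assumptions, not the parameters of the actual device.

```python
import math


def row_frequencies(n_rows, f_low=500.0, f_high=5000.0):
    """Exponentially spaced pitches; row 0 (top of the image) is the highest.

    The 500-5000 Hz range is an illustrative assumption.
    """
    if n_rows == 1:
        return [f_high]
    ratio = f_high / f_low
    return [f_low * ratio ** ((n_rows - 1 - r) / (n_rows - 1))
            for r in range(n_rows)]


def sonify_image(image, sample_rate=16000, column_duration=0.05):
    """Turn a greyscale image into audio samples in [-1, 1].

    `image` is a list of rows, each a list of brightness values in [0, 1].
    Columns are played one after another from left to right. Within a column,
    each pixel contributes a sine wave whose pitch depends on its row and
    whose loudness depends on its brightness.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    freqs = row_frequencies(n_rows)
    samples_per_column = round(sample_rate * column_duration)
    samples = []
    for c in range(n_cols):
        for i in range(samples_per_column):
            t = (c * samples_per_column + i) / sample_rate
            value = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                        for r in range(n_rows))
            samples.append(value / n_rows)  # normalise so |sample| <= 1
    return samples
```

In this sketch a bright pixel near the top of the image produces a loud, high-pitched tone, and its horizontal position determines when that tone is heard during the scan. The returned samples could be written to an audio file or streamed to headphones.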
Overall, sensory substitution techniques have been prototyped using both modality-specific and cross-modal display modes and have been successfully demonstrated as assistive technologies in a variety of use cases.
Challenges to widespread adoption of sensory substitution techniques
Despite their documented success in laboratory settings, sensory substitution techniques have not yet gained widespread adoption within the visually impaired population (Chebat et al., 2018). Different groups of researchers have offered various explanations for this (Chebat et al., 2018; Spence, 2014; Lenay & Declerck, 2018; Auvray & Farina, 2017; Deroy & Auvray, 2012). Sensory substitution techniques have mainly been criticised for their lack of generalisability beyond the laboratory (Lenay & Declerck, 2018), and some have argued that it is simply implausible that one sense can substitute another (Auvray & Harris, 2014). These arguments, however, are largely aimed at claims that sensory substitution techniques literally substitute a sensory form (i.e., ‘seeing with the brain’ (Bach-y-Rita et al., 1969), ‘seeing with the skin’ (White, Saunders, Scadden, Bach-Y-Rita, & Collins, 1970) or ‘seeing with sound’ (Meijer, 2019)).
Approximately 30% of assistive devices are reportedly abandoned before they are even implemented (Phillips & Zhao, 2010). Possible reasons for the abandonment of assistive technologies include the lack of a user-centric approach, difficulty of procurement, poor performance, inability to meet changes in user needs and unaffordable financial costs (Chebat et al., 2018; Phillips & Zhao, 2010). The early abandonment of assistive prototypes arguably has a detrimental impact on individuals with impairments and on wider society. We next explain how adopting an inclusive design mindset when applying sensory substitution techniques could overcome some of the current barriers to the implementation of cross-modal displays for supporting those with disabilities. We also explain how this could lead to further benefits for wider society.
Sensory substitution techniques as inclusive cross-modal displays
Despite our rich multisensory capabilities within the physical world (Calvert, Spence, & Stein, 2004), relatively few studies have investigated using cross-modal display modes to enhance the ways in which individuals interact with their environments (Sreetharan & Schutz, 2019). Sensory substitution techniques have the potential to transform, extend and augment our perceptual capacities by enabling novel forms of interaction with the environment (Auvray & Myin, 2009; Lenay, Canu, & Villon, 1997; Lenay, Gapenne, Hanneton, Marque, & Genouëlle, 2003). As cross-modal correspondences exist across cultures and languages, they could support a wide range of people, regardless of their capabilities and needs (Jordan & Vanderheiden, 2013). Sensory substitution techniques and cross-modal displays therefore have huge potential to serve different purposes than those served by the assistive technologies described so far.
Extensive research into sensory substitution techniques suggests that various sensory forms (e.g., auditory or tactile) could be utilised interchangeably to access the same sensory information (see Table 2). In this way, digital interactions could be made to switch between different senses when delivering the same information. The technologies could be made adaptable, allowing them to flexibly deliver sensory information depending on user preferences and needs. This would allow sensory substitution techniques to be implemented in a variety of inclusive use cases from extended reality platforms to information and communication applications (Lenay et al., 1997, 2003). Beyond assistive technology, a small number of technologies are currently in development which aim to enhance individuals’ intact sensory capabilities when their sensory signal strength is temporarily weakened. Cross-modal displays which employ sensory substitution techniques to enhance sensory capabilities are classified as sensory augmentation devices (National Research Council, 2008). The inexpensive application of sensory substitution techniques is possible in the context of sensory augmentation devices with customisable builds and settings (Dublon & Paradiso, 2012). So far, sensory substitution techniques have been applied in tactile gloves equipped with ultrasound sensors, which firefighters use when their vision is restricted. The tactile cues provide information about distance, thereby enhancing mobility (Carton & Dunne, 2013). Sensory substitution techniques have also been applied in technologies used by the military, and the alerting systems used in cars to signal an incoming obstacle can be thought of as taking advantage of sensory substitution techniques (Grah et al., 2015; National Research Council, 2008). The potential exists for the use of sensory substitution techniques to benefit a wide range of users.
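As a rough illustration of how a sensory augmentation device might turn a range reading into a tactile cue, the sketch below maps obstacle distance to vibration intensity. The linear mapping, the function name and the range limits are all hypothetical choices made for this example; they are not taken from the firefighter glove or any other cited system.

```python
def distance_to_vibration(distance_m, min_range_m=0.2, max_range_m=4.0):
    """Map an obstacle distance in metres to a vibration intensity in [0, 1].

    Closer obstacles vibrate more strongly. Anything at or beyond
    max_range_m produces no vibration, and anything at or inside
    min_range_m produces full intensity. Both limits are assumed values.
    """
    if distance_m <= min_range_m:
        return 1.0
    if distance_m >= max_range_m:
        return 0.0
    return (max_range_m - distance_m) / (max_range_m - min_range_m)
```

Exposing the range limits as parameters reflects the point made above about customisable builds and settings: the same mapping could be tuned to different users or environments.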
As sensory substitution stands between perception and cognition (Arnold, Pesnot-Lerousseau, & Auvray, 2017; Esenkaya & Proulx, 2016), exploring sensory substitution phenomena in a broader multisensory context could contribute new insights into how different sensory information and forms are interconnected with each other via cognitive forms. If we can better understand the ways in which sensory substitution techniques work, we may be able to better support individuals who have disabilities by developing inclusive technologies. This could eventually overcome some of the adoption challenges that have been identified with sensory substitution techniques applied as assistive technologies.
Developing a single product or service that appeals to a great number of people is challenging. Inclusive design does not claim to provide omnipotent and omnipresent solutions to address every barrier to usability and accessibility. Instead, the inclusive designer aims to develop flexible and adjustable technologies that appeal to all of us. We suggest this can be done by considering our shared perceptual and/or cognitive capabilities. While individuals vary in terms of their sensory experiences, with some individuals experiencing impairments, more individuals will be similar in their cognitive processing of sensory information. Rather than aim to compensate for impaired sensory forms using specialist devices built for sub-populations, technologies which take advantage of the metamodal organisation of the brain could be used by all individuals.
Future applications of inclusive cross-modal displays
Sensory substitution techniques have the potential to enable users to alternate between different cross-modal display modes, which would allow a wide range of users to access the same device. A simple example is the way that pedestrians use navigation applications which require frequent screen-dependent feedback. This means of human-computer interaction is more difficult for a pedestrian with a visual impairment, which has resulted in the development of multiple specialist solutions. While specialist solutions are helpful, an inclusive alternative could co-exist, which would benefit all parties. A navigation application which utilises cross-modal display modes would enable users to switch between auditory and tactile sensory channels as required. The same information would be provided in each of the sensory channels. A visually impaired pedestrian would benefit from this technology since they could receive information about the environment via auditory and/or tactile display modes. Meanwhile, the use of auditory and/or tactile display modes would allow a sighted pedestrian to navigate their environment without relying on visual feedback from a screen, allowing for greater enjoyment of their surroundings. In this way, both users benefit from the use of the same cross-modal displays.
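The navigation example can be sketched in code as a single, modality-independent cue that is rendered in whichever display modes the user has switched on. Everything here is hypothetical: the `NavigationCue` type, the two renderers, the two-actuator wearable and the pulse thresholds are assumptions made for illustration, not a description of an existing application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NavigationCue:
    direction: str      # "left", "right" or "straight"
    distance_m: float   # distance to the manoeuvre, in metres


def render_auditory(cue):
    """Express the cue as a phrase that could be passed to speech output."""
    if cue.direction == "straight":
        return f"Continue straight for {cue.distance_m:.0f} metres"
    return f"Turn {cue.direction} in {cue.distance_m:.0f} metres"


def render_tactile(cue):
    """Express the cue as (actuator, pulse_count) for a two-actuator wearable."""
    actuator = {"left": "left", "right": "right", "straight": "both"}[cue.direction]
    # The nearer the manoeuvre, the more pulses (assumed thresholds).
    if cue.distance_m < 10:
        pulses = 3
    elif cue.distance_m < 50:
        pulses = 2
    else:
        pulses = 1
    return (actuator, pulses)


RENDERERS = {"auditory": render_auditory, "tactile": render_tactile}


def deliver(cue, modes):
    """Render one cue in every display mode the user has enabled."""
    return {mode: RENDERERS[mode](cue) for mode in modes}
```

For instance, `deliver(NavigationCue("left", 8.0), ["auditory", "tactile"])` yields both a spoken phrase and a vibration pattern for the same turn. Keeping the cue separate from its renderers is one way to ensure that each sensory channel carries the same information.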
Other future cross-modal displays using sensory substitution techniques could include artistic applications; games; extended reality environments; portable and intuitive systems; and mobility, communication or education platforms (Lenay et al., 1997, 2003; National Research Council, 2008). Sensory substitution techniques can be used to enrich our experiences with the digital world by complementing, and hence reducing, some of the visual information using non-visual cross-modal cues (Hoggan & Brewster, 2007). For example, cross-modal displays could be deployed in conveying emotions via novel sensory forms which do not have a screen dependency, which could improve our tangible interactions with one another.
These examples are all hypothetical. To our knowledge, no mainstream technologies currently make use of sensory substitution techniques. However, enormous potential exists to develop such cross-modal devices in the future. The scientific literature offers a vast number of sensory substitution techniques with distinct methods of transforming sensory signals. Investigating their information capacity and perceived resolutions (e.g., Richardson et al., 2019) expands state-of-the-art knowledge regarding multisensory and cross-modal information processing. If these cognitive mechanisms were utilised by applied researchers, it would be possible to develop innovative technologies which improve access to external information and enhance the sensory capabilities of all individuals, regardless of any sensory impairments.
Wider benefits of inclusive cross-modal displays
In recent years, the concepts of cross-modal cognition have spread into multiple disciplines, and examples of their applications can be found in neural networks, artificial intelligence and cognitive robotics (Corradi, Hall, & Iravani, 2017; Hawkins & Blakeslee, 2005; Li, Zhu, Tedrake, & Torralba, 2019; Di Nuovo & Cangelosi, 2015). Now there are new opportunities to make use of sensory substitution techniques in a similar way. These techniques are argued to be ‘universal brain-computer interfaces’ (Danilov & Tyler, 2005) because they make use of the brain’s capacity to inclusively process information, regardless of its original form. In this way, sensory substitution techniques allow us to make sense of information otherwise inaccessible to our natural sensory organs. In this context, sensory substitution techniques can be considered the cognitive transmutation of information to interface. Thinking about sensory substitution in this way brings opportunities to the ways in which we solve modern problems. For example, instead of converting an already existing graphical game (e.g., Pacman or Space Invaders) into an auditory form to improve accessibility for those who are blind, sensory substitution techniques could be utilised to create new forms of multisensory entertainment, to be enjoyed by users with and without visual impairments simultaneously. Why not develop new tools and approaches for novel forms of art (Kim, 2015; Todd Selby, 2011) that can be enjoyed by a wider range of people? Why not focus on multisensory tangible interactions to democratise the ‘pixel empire’ (Ishii, 2019) equally with other senses? Inclusive cross-modal displays have the potential to change how we interact with technology and how technology interacts with us.
Inclusion is as much about technology, art, policies, social institutions, and commercial models as it is about how one accepts and tolerates others in society. It is a mindset that can be applied in thinking, designing and creating, thereby encouraging all individuals to exist in equilibrium with one another. Overall, these premises offer an inclusive alternative to the usability and accessibility perspectives that are built on a legacy of traditional frameworks, commercial models, and social and academic conversations which view disability as something which accompanies individuals, rather than something which is created by environmental barriers. Human-technology interactions can take advantage of the information processing capability of the metamodal brain in a multisensory context. Rather than creating tools which are merely assistive to compensate for sensory impairments, research and development into sensory substitution techniques could be unified by a motivation for inclusion. New technologies which benefit all individuals could be developed. Accumulated knowledge might then be transferred laterally in a multidisciplinary context, and practically applied to inclusive innovations that appeal to us all.
Availability of data and materials
Electronic travel aid
Electronic orientation aid
Human computer interactions
Tactile vision sensory substitution
Sensory substitution devices
Abboud, S., Hanassy, S., Levy-Tzedek, S., Maidenbaum, S., & Amedi, A. (2014). EyeMusic: introducing a ‘visual’ colorful experience for the blind using auditory sensory substitution. Restorative Neurology and Neuroscience, 32(2), 247–257. https://doi.org/10.3233/RNN-130338.
Akita, J., Komatsu, T., Ito, K., Ono, T., & Okamoto, M. (2009). CyARM: haptic sensing device for spatial localization on basis of exploration by arms. Advances in Human-Computer Interaction, 2009, 1–6. https://doi.org/10.1155/2009/901707.
Amedi, A. & Hanassy, S. (2011). Infra red based devices for guiding blind and visually impaired persons. https://patents.google.com/patent/WO2012090114A1/en. Accessed 11 Nov 2019.
Arnold, G., Pesnot-Lerousseau, J., & Auvray, M. (2017). Individual differences in sensory substitution. Multisensory Research, 30(6), 579–600. https://doi.org/10.1163/22134808-00002561.
Auvray, M., & Farina, M. (2017). Patrolling the boundaries of synaesthesia. In O. Deroy (Ed.), Synaesthesia: philosophical & psychological challenges (pp. 248–274). Oxford: Oxford University Press.
Auvray, M., Hanneton, S., & O’Regan, J. K. (2007). Learning to perceive with a visuo-auditory substitution system: localisation and object recognition with ‘The vOICe’. Perception, 36(3), 416–430. https://doi.org/10.1068/p5631.
Auvray, M., & Harris, L. R. (2014). The state of the art of sensory substitution. Multisensory Research, 27(5–6), 265–269. https://doi.org/10.1163/22134808-00002464.
Auvray, M., & Myin, E. (2009). Perception with compensatory devices: from sensory substitution to sensorimotor extension. Cognitive Science, 33(6), 1036–1058. https://doi.org/10.1111/j.1551-6709.2009.01040.x.
Bach-y-Rita, P., Collins, C. C., Saunders, F. A., White, B., & Scadden, L. (1969). Vision substitution by tactile image projection. Nature, 221(5184), 963–964. https://doi.org/10.1038/221963a0.
Bach-y-Rita, P., & Kercel, S. W. (2003). Sensory substitution and the human–machine interface. Trends in Cognitive Sciences, 7(12), 541–546. https://doi.org/10.1016/J.TICS.2003.10.013.
Barilari, M., de Heering, A., Crollen, V., Collignon, O., & Bottini, R. (2018). Is red heavier than yellow even for blind? I-Perception, 9(1), 204166951875912. https://doi.org/10.1177/2041669518759123.
Bermejo, F., Di Paolo, E. A., Hüg, M. X., & Arias, C. (2015). Sensorimotor strategies for recognizing geometrical shapes: a comparative study with different sensory substitution devices. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.00679.
Bichard, J. A., Coleman, R., & Langdon, P. (2007). Does my stigma look big in this? Considering acceptability and desirability in the inclusive design of technology products. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4554 LNCS (PART 1), 622–631. https://doi.org/10.1007/978-3-540-73279-2_69.
Bologna, G., Deville, B., & Pun, T. (2009). On the use of the auditory pathway to represent image scenes in real-time. Neurocomputing, 72(4–6), 839–849. https://doi.org/10.1016/J.NEUCOM.2008.06.020.
Borenstein, J. (1990). The NavBelt - a computerized multi-sensor travel aid for active guidance of the blind. In Proceedings of the CSUN’s Fifth Annual Conference on Technology and Persons with Disabilities, (pp. 107–116) 10.1.1.23.9115.
Borenstein, J., Ulrich, I., & Shoval, S. (2000). Computerized obstacle avoidance systems for the blind and visually impaired. In H. N. L. Teodorescu, & L. Jain (Eds.), Intelligent systems and technologies in rehabilitation engineering, (pp. 414–448). https://doi.org/10.1201/9781420042122.ch14.
Botzer, A., Shvalb, N., & Ben-Moshe, B. (2018). Using sound feedback to help blind people navigate. In Proceedings of the 36th European Conference on Cognitive Ergonomics - ECCE’18, (pp. 1–3). https://doi.org/10.1145/3232078.3232083.
Brefczynski-Lewis, J. A., & Lewis, J. W. (2017). Auditory object perception: a neurobiological model and prospective review. Neuropsychologia, 105, 223–242. https://doi.org/10.1016/j.neuropsychologia.2017.04.034.
Bremner, A. J. (2017). Multisensory development: calibrating a coherent sensory milieu in early life. Current Biology, 27(8), R305–R307. https://doi.org/10.1016/J.CUB.2017.02.055.
Brown, D., Macpherson, T., & Ward, J. (2011). Seeing with sound? Exploring different characteristics of a visual-to-auditory sensory substitution device. Perception, 40(9), 1120–1135. https://doi.org/10.1068/p6952.
Bülthoff, H. H., & Mallot, H. A. (1988). Integration of depth modules: stereo and shading. Journal of the Optical Society of America A, 5(10), 1749. https://doi.org/10.1364/JOSAA.5.001749.
Burchardt, T. (2004). Capabilities and disability: the capabilities framework and the social model of disability. Disability and Society, 19(7), 735–751. https://doi.org/10.1080/0968759042000284213.
Butts, A. M. (2015). Enhancing the perception of speech indexical properties of cochlear implants through sensory substitution. Tempe: Arizona State University.
Calvert, G. A., Spence, C., & Stein, B. E. (2004). The handbook of multisensory processing. Cambridge: MIT Press.
Cancar, L., Díaz, A., Barrientos, A., Travieso, D., & Jacobs, D. M. (2013). Tactile-sight: a sensory substitution device based on distance-related vibrotactile flow. International Journal of Advanced Robotic Systems, 10(6), 272. https://doi.org/10.5772/56235.
Capelle, C., Trullemans, C., Arno, P., & Veraart, C. (1998). A real-time experimental prototype for enhancement of vision rehabilitation using auditory substitution. IEEE Transactions on Biomedical Engineering, 45(10), 1279–1293. https://doi.org/10.1109/10.720206.
Cardin, S., Thalmann, D., & Vexo, F. (2007). A wearable system for mobility improvement of visually impaired people. The Visual Computer, 23(2), 109–118. https://doi.org/10.1007/s00371-006-0032-4.
Carton, A., & Dunne, L. E. (2013). Tactile distance feedback for firefighters. In Proceedings of the 4th Augmented Human International Conference on - AH ‘13, (pp. 58–64). https://doi.org/10.1145/2459236.2459247.
Chebat, D.-R., Harrar, V., Kupers, R., Maidenbaum, S., Amedi, A., & Ptito, M. (2018). Sensory substitution and the neural correlates of navigation in blindness. In Mobility of visually impaired people, (pp. 167–200). https://doi.org/10.1007/978-3-319-54446-5_6.
Chebat, D.-R., Schneider, F. C., Kupers, R., & Ptito, M. (2011). Navigation with a sensory substitution device in congenitally blind individuals. NeuroReport, 22(7), 342–347. https://doi.org/10.1097/WNR.0b013e3283462def.
Choi, I., Lee, J.-Y., & Lee, S.-H. (2018). Bottom-up and top-down modulation of multisensory integration. Current Opinion in Neurobiology, 52, 115–122. https://doi.org/10.1016/J.CONB.2018.05.002.
Corradi, T., Hall, P., & Iravani, P. (2017). Object recognition combining vision and touch. Robotics and Biomimetics, 4(1), 2. https://doi.org/10.1186/s40638-017-0058-2.
d’Albe, E. E. F. (1914). On a type-reading optophone. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 90(619), 373–375. https://doi.org/10.1098/rspa.1914.0061.
Dakopoulos, D., & Bourbakis, N. G. (2010). Wearable obstacle avoidance electronic travel aids for blind: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(1), 25–35. https://doi.org/10.1109/TSMCC.2009.2021255.
Danilov, Y., & Tyler, M. (2005). BrainPort: an alternative input to the brain. Journal of Integrative Neuroscience, 04(04), 537–550. https://doi.org/10.1142/S0219635205000914.
Design Council. (2019). Design council. https://www.designcouncil.org.uk/. Accessed 11 Nov 2019.
Dewsbury, G., Clarke, K., Randall, D., Rouncefield, M., & Sommerville, I. (2004). The anti-social model of disability. Disability and Society, 19(2), 145–158. https://doi.org/10.1080/0968759042000181776.
Di Nuovo, A., & Cangelosi, A. (2015). Artificial mental imagery in cognitive robots interaction. In 2015 IEEE Symposium Series on Computational Intelligence, (pp. 91–96). https://doi.org/10.1109/SSCI.2015.23.
Dublon, G., & Paradiso, J. A. (2012). Tongueduino. In Proceedings of the 2012 ACM Annual Conference Extended Abstracts on Human Factors in Computing Systems Extended Abstracts - CHI EA ‘12, (p. 1453). https://doi.org/10.1145/2212776.2212482.
Dunai, L., Peris-Fajarnés, G., Lluna, E., & Defez, B. (2013). Sensory navigation device for blind people. Journal of Navigation, 66(3), 349–362. https://doi.org/10.1017/S0373463312000574.
Durette, B., Louveton, N., Alleysson, D., & Hérault, J. (2008). Visuo-auditory sensory substitution for mobility assistance: testing TheVIBE. Workshop on Computer Vision Applications for the Visually Impaired, 1–13.
Deroy, O., & Auvray, M. (2012). Reading the world through the skin and ears: a new perspective on sensory substitution. Frontiers in Psychology, 3. https://doi.org/10.3389/fpsyg.2012.00457.
Eagleman, D. M., Novich, S. D., Goodman, D., Sahoo, A., & Perotta, M. (2017). Method and system for providing adjunct sensory information to a user. https://patents.google.com/patent/US10198076B2/en.
Esenkaya, T., & Proulx, M. J. (2016). Crossmodal processing and sensory substitution: is ‘seeing’ with sound and touch a form of perception or cognition? Behavioral and Brain Sciences, 39, e241. https://doi.org/10.1017/S0140525X1500268X.
Farcy, R., Leroux, R., Jucha, A., Damaschinin, R., Grégoire, C., & Zogaghi, A. (2006). Electronic travel aids and electronic orientation aids for blind people: technical, rehabilitation and everyday life points of view. In Conference on Assistive Technology for Vision and Hearing Impairment (CVHI).
Faugloire, E., & Lejeune, L. (2014). Evaluation of heading performance with vibrotactile guidance: The benefits of information–movement coupling compared with spatial language. Journal of Experimental Psychology: Applied, 20(4), 397–410. https://doi.org/10.1037/xap0000032.
Finnegan, D. J., O’Neill, E., & Proulx, M. J. (2016). Compensating for distance compression in audiovisual virtual environments using incongruence. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ‘16, (pp. 200–212). https://doi.org/10.1145/2858036.2858065.
Froese, T., McGann, M., Bigge, W., Spiers, A., & Seth, A. K. (2012). The enactive torch: a new tool for the science of perception. IEEE Transactions on Haptics, 5(4), 365–375. https://doi.org/10.1109/TOH.2011.57.
Gallagher, D. J., Connor, D. J., & Ferri, B. A. (2014). Beyond the far too incessant schism: special education and the social model of disability. International Journal of Inclusive Education, 18(11), 1120–1142. https://doi.org/10.1080/13603116.2013.875599.
Geldard, F. A. (1966). Cutaneous coding of optical signals: the optohapt. Perception & Psychophysics, 1(11), 377–381. https://doi.org/10.3758/BF03215810.
Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10(6), 278–285. https://doi.org/10.1016/J.TICS.2006.04.008.
Gieben-Gamal, E., & Matos, S. (2017). Design and disability. Developing new opportunities for the design curriculum. The Design Journal, 20(sup1), S2022–S2032. https://doi.org/10.1080/14606925.2017.1352721.
Gori, M., Del Viva, M., Sandini, G., & Burr, D. C. (2008). Young children do not integrate visual and haptic form information. Current Biology, 18(9), 694–698. https://doi.org/10.1016/j.cub.2008.04.036.
Grah, T., Epp, F., Wuchse, M., Meschtscherjakov, A., Gabler, F., Steinmetz, A., & Tscheligi, M. (2015). Dorsal haptic display: a shape-changing car seat for sensory augmentation of rear obstacles. In Proceedings of the 7th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, (pp. 305–312). https://doi.org/10.1145/2799250.2799281.
Grant, P., Spencer, L., Arnoldussen, A., Hogle, R., Nau, A., Szlyk, J., et al. (2016). The functional performance of the BrainPort V100 device in persons who are profoundly blind. Journal of Visual Impairment & Blindness, 110(2), 77–88. https://doi.org/10.1177/0145482X1611000202.
Hamilton-Fletcher, G., Obrist, M., Watten, P., Mengucci, M., & Ward, J. (2016). “I Always Wanted to See the Night Sky”: Blind user preferences for sensory substitution devices. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ‘16, (pp. 2162–2174). https://doi.org/10.1145/2858036.2858241.
Hamilton-Fletcher, G., & Ward, J. (2013). Representing colour through hearing and touch in sensory substitution devices. Multisensory Research, 26(6), 503–532.
Hamilton-Fletcher, G., Wright, T. D., & Ward, J. (2016). Cross-modal correspondences enhance performance on a colour-to-sound sensory substitution device. Multisensory Research, 29(4–5), 337–363.
Hawkins, J., & Blakeslee, S. (2005). On intelligence (1st ed.). New York: Henry Holt and Co.
Hoffmann, R., Spagnol, S., Kristjánsson, Á., & Unnthorsson, R. (2018). Evaluation of an audio-haptic sensory substitution device for enhancing spatial awareness for the visually impaired. Optometry and Vision Science, 95(9), 757–765. https://doi.org/10.1097/OPX.0000000000001284.
Hoggan, E., & Brewster, S. (2007). Designing audio and tactile crossmodal icons for mobile devices. In Proceedings of the Ninth International Conference on Multimodal Interfaces - ICMI ‘07, (p. 162). https://doi.org/10.1145/1322192.1322222.
Hoggan, E., Kaaresoja, T., Laitinen, P., & Brewster, S. (2008). Crossmodal congruence. In Proceedings of the 10th International Conference on Multimodal Interfaces - ICMI ‘08, (p. 157). https://doi.org/10.1145/1452392.1452423.
Ishii, H. (2019). SIGCHI lifetime research award talk. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems - CHI EA ‘19, (pp. 1–4). https://doi.org/10.1145/3290607.3313769.
Ito, K., Okamoto, M., Akita, J., Ono, T., Gyobu, I., Takagi, T., et al. (2005). CyARM: an alternative aid device for blind persons. In CHI ‘05 Extended Abstracts on Human Factors in Computing Systems - CHI ‘05, (p. 1483). https://doi.org/10.1145/1056808.1056947.
Jicol, C., Lloyd-Esenkaya, T., Proulx, M. J., Lange-Smith, S., Scheller, M., O’Neill, E., & Petrini, K. (2020). Efficiency of sensory substitution devices alone and in combination with self-motion for spatial navigation in sighted and visually impaired. Frontiers in Psychology, 11, 1443.
Jones, L. A., Nakamura, M., & Lockyer, B. (2004). Development of a tactile vest. In 12th International Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2004, HAPTICS ‘04. Proceedings, (pp. 82–89). https://doi.org/10.1109/HAPTIC.2004.1287181.
Jordan, J. B., & Vanderheiden, G. C. (2013). Modality-independent interaction framework for cross-disability accessibility. In Cross-cultural design. Methods, practice, and case studies, (pp. 218–227). https://doi.org/10.1007/978-3-642-39143-9_24.
Kaczmarek, K. A., Webster, J. G., Bach-y-Rita, P., & Tompkins, W. J. (1991). Electrotactile and vibrotactile displays for sensory substitution systems. IEEE Transactions on Biomedical Engineering, 38(1), 1–16. https://doi.org/10.1109/10.68204.
Kayser, C., & Logothetis, N. K. (2007). Do early sensory cortices integrate cross-modal information? Brain Structure and Function, 212(2), 121–132. https://doi.org/10.1007/s00429-007-0154-0.
Kim, C. S. (2015). Christine Sun Kim: The enchanting music of sign language | TED Talk. https://www.ted.com/talks/christine_sun_kim_the_enchanting_music_of_sign_language. Accessed 11 Nov 2019.
Kupers, R., Chebat, D. R., Madsen, K. H., Paulson, O. B., & Ptito, M. (2010). Neural correlates of virtual route recognition in congenital blindness. Proceedings of the National Academy of Sciences, 107(28), 12716–12721. https://doi.org/10.1073/pnas.1006199107.
Lenay, C., Canu, S., & Villon, P. (1997). Technology and perception: the contribution of sensory substitution systems. In Proceedings Second International Conference on Cognitive Technology Humanizing the Information Age, (pp. 44–53). https://doi.org/10.1109/CT.1997.617681.
Lenay, C., & Declerck, G. (2018). Technologies to access space without vision. Some empirical facts and guiding theoretical principles. In Mobility of visually impaired people, (pp. 53–75). https://doi.org/10.1007/978-3-319-54446-5_2.
Lenay, C., Gapenne, O., Hanneton, S., Marque, C., & Genouëlle, C. (2003). Sensory substitution: limits and perspectives. In Y. Hatwell, A. Streri, & E. Gentaz (Eds.), Touching for knowing: cognitive psychology of haptic manual perception, (pp. 275–292). Amsterdam: John Benjamins Publishing.
Levy-Tzedek, S., Novick, I., Arbel, R., Abboud, S., Maidenbaum, S., Vaadia, E., & Amedi, A. (2012). Cross-sensory transfer of sensory-motor information: visuomotor learning affects performance on an audiomotor task, using sensory-substitution. Scientific Reports, 2(1), 949. https://doi.org/10.1038/srep00949.
Li, Y., Zhu, J. Y., Tedrake, R., & Torralba, A. (2019). Connecting touch and vision via cross-modal prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (pp. 10609–10618).
Linvill, J. G., & Bliss, J. C. (1966). A direct translation reading aid for the blind. Proceedings of the IEEE, 54(1), 40–51. https://doi.org/10.1109/PROC.1966.4572.
Liu, J., Liu, J., Xu, L., & Jin, W. (2010). Electronic travel aids for the blind based on sensory substitution. In 2010 5th International Conference on Computer Science & Education, (pp. 1328–1331). https://doi.org/10.1109/ICCSE.2010.5593738.
Maidenbaum, S., Levy-Tzedek, S., Chebat, D.-R., & Amedi, A. (2013). Increasing accessibility to the blind of virtual environments, using a virtual mobility aid based on the ‘EyeCane’: feasibility study. PLoS One, 8(8), e72555. https://doi.org/10.1371/journal.pone.0072555.
Maidenbaum, S., Levy-Tzedek, S., Chebat, D. R., Namer-Furstenberg, R., & Amedi, A. (2014). The effect of extended sensory range via the EyeCane sensory substitution device on the characteristics of visionless virtual navigation. Multisensory Research, 27(5–6), 379–397. https://doi.org/10.1163/22134808-00002463.
Marks, L. E. (1974). On associations of light and sound: the mediation of brightness, pitch, and loudness. The American Journal of Psychology, 87(1–2), 173–188.
Meijer, P. (1992). An experimental system for auditory image representations. IEEE Transactions on Biomedical Engineering, 39(2), 112–121. https://doi.org/10.1109/10.121642.
Meijer, P. (2019). Seeing with sound. https://www.seeingwithsound.com/. Accessed 11 Nov 2019.
Melara, R. D., & O’Brien, T. P. (1987). Interaction between synesthetically corresponding dimensions. Journal of Experimental Psychology: General, 116(4), 323–336. https://doi.org/10.1037/0096-3445.116.4.323.
Nagel, S. K., Carl, C., Kringe, T., Märtin, R., & König, P. (2005). Beyond sensory substitution—learning the sixth sense. Journal of Neural Engineering, 2(4), R13–R26. https://doi.org/10.1088/1741-2560/2/4/R02.
Nardini, M., Jones, P., Bedford, R., & Braddick, O. (2008). Development of cue integration in human navigation. Current Biology, 18(9), 689–693. https://doi.org/10.1016/J.CUB.2008.04.021.
National Research Council (2008). Emerging cognitive neuroscience and related technologies. https://doi.org/10.17226/12177. Accessed 11 Nov 2019.
Nau, A., Bach, M., & Fisher, C. (2013). Clinical tests of ultra-low vision used to evaluate rudimentary visual perceptions enabled by the BrainPort vision device. Translational Vision Science & Technology, 2(3), 1. https://doi.org/10.1167/tvst.2.3.1.
Newell, A. (2003). Inclusive design or assistive technology. Inclusive Design, 172–181. https://doi.org/10.1007/978-1-4471-0001-0_11.
Newell, F. N., Ernst, M. O., Tjan, B. S., & Bülthoff, H. H. (2001). Viewpoint dependence in visual and haptic object recognition. Psychological Science, 12(1), 37–42. https://doi.org/10.1111/1467-9280.00307.
Obrist, M., Gatti, E., Maggioni, E., Vi, C. T., & Velasco, C. (2017). Multisensory experiences in HCI. IEEE Multimedia, 24(2), 9–13. https://doi.org/10.1109/MMUL.2017.33.
Oliver, M. (2013). The social model of disability: thirty years on. Disability and Society, 28(7), 1024–1026. https://doi.org/10.1080/09687599.2013.818773.
Ortiz, T., Poch, J., Santos, J. M., Requena, C., Martínez, A. M., Ortiz-Terán, L., et al. (2011). Recruitment of occipital cortex during sensory substitution training linked to subjective experience of seeing in people with blindness. PLoS One, 6(8), e23264. https://doi.org/10.1371/journal.pone.0023264.
Ortiz-Terán, L., Ortiz, T., Perez, D. L., Aragón, J. I., Diez, I., Pascual-Leone, A., & Sepulcre, J. (2016). Brain plasticity in blind subjects centralizes beyond the modal cortices. Frontiers in Systems Neuroscience, 10. https://doi.org/10.3389/fnsys.2016.00061.
Oviatt, S. (1999). Ten myths of multimodal interaction. Communications of the ACM, 42(11), 74–81. https://doi.org/10.1145/319382.319398.
Parise, C. V., & Spence, C. (2012). Audiovisual crossmodal correspondences and sound symbolism: a study using the implicit association test. Experimental Brain Research, 220(3–4), 319–333. https://doi.org/10.1007/s00221-012-3140-6.
Pascual-Leone, A., & Hamilton, R. (2001). The metamodal organization of the brain. Progress in Brain Research, 134, 427–445. https://doi.org/10.1016/s0079-6123(01)34028-1.
Pasqualotto, A., & Esenkaya, T. (2016). Sensory substitution: the spatial updating of auditory scenes ‘mimics’ the spatial updating of visual scenes. Frontiers in Behavioral Neuroscience, 10. https://doi.org/10.3389/fnbeh.2016.00079.
Perrault, T. J., Vaughan, J. W., Stein, B. E., & Wallace, M. T. (2003). Neuron-specific response characteristics predict the magnitude of multisensory integration. Journal of Neurophysiology, 90(6), 4022–4026. https://doi.org/10.1152/jn.00494.2003.
Persson, H., Åhman, H., Yngling, A. A., & Gulliksen, J. (2015). Universal design, inclusive design, accessible design, design for all: different concepts—one goal? On the concept of accessibility—historical, methodological and philosophical aspects. Universal Access in the Information Society, 14(4), 505–526. https://doi.org/10.1007/s10209-014-0358-z.
Phillips, B., & Zhao, H. (2010). Predictors of assistive technology abandonment. Assistive Technology, 5(1), 36–45. https://doi.org/10.1080/10400435.1993.10132205.
Proulx, M. J., Brown, D. J., Pasqualotto, A., & Meijer, P. (2014). Multisensory perceptual learning and sensory substitution. Neuroscience & Biobehavioral Reviews, 41, 16–25. https://doi.org/10.1016/J.NEUBIOREV.2012.11.017.
Proulx, M. J., & Harder, A. (2008). Sensory substitution: visual-to-auditory sensory substitution devices for the blind. Tijdschrift Voor Ergonomie, 6(33).
Renier, L., Laloyaux, C., Collignon, O., Tranduy, D., Vanlierde, A., Bruyer, R., & De Volder, A. G. (2005). The Ponzo illusion with auditory substitution of vision in sighted and early-blind subjects. Perception, 34(7), 857–867. https://doi.org/10.1068/p5219.
Ricciardi, E., Bonino, D., Pellegrini, S., & Pietrini, P. (2014). Mind the blind brain to understand the sighted one! Is there a supramodal cortical functional architecture? Neuroscience & Biobehavioral Reviews, 41, 64–77. https://doi.org/10.1016/J.NEUBIOREV.2013.10.006.
Ricciardi, E., & Pietrini, P. (2011). New light from the dark: what blindness can teach us about brain function. Current Opinion in Neurology, 24(4), 357–363. https://doi.org/10.1097/WCO.0b013e328348bdbf.
Richardson, M., Thar, J., Alvarez, J., Borchers, J., Ward, J., & Hamilton-Fletcher, G. (2019). How much spatial information is lost in the sensory substitution process? Comparing visual, tactile, and auditory approaches. Perception, 48(11), 1079–1103. https://doi.org/10.1177/0301006619873194.
Rochlis, J. (1998). A vibrotactile display for aiding extravehicular activity (EVA) navigation in space. Cambridge: Massachusetts Institute of Technology.
Rock, I., & Victor, J. (1964). Vision and touch: an experimentally created conflict between the two senses. Science, 143(3606), 594–596. https://doi.org/10.1126/science.143.3606.594.
Rohde, M., van Dam, L. C. J., & Ernst, M. (2016). Statistically optimal multisensory cue integration: a practical tutorial. Multisensory Research, 29(4–5), 279–317.
Sampaio, E., Maris, S., & Bach-y-Rita, P. (2001). Brain plasticity: ‘visual’ acuity of blind persons via the tongue. Brain Research, 908(2), 204–207. https://doi.org/10.1016/S0006-8993(01)02667-1.
Scheller, M., Proulx, M. J., de Haan, M., Dahlmann-Noor, A., & Petrini, K. (2020). Late- but not early-onset blindness impairs the development of audio-haptic multisensory integration. Developmental Science, (September 2019), 1–17. https://doi.org/10.1111/desc.13001.
Segond, H., Weiss, D., & Sampaio, E. (2005). Human spatial navigation via a visuo-tactile sensory substitution system. Perception, 34(10), 1231–1249. https://doi.org/10.1068/p3409.
Selby, T. (2011). Todd Selby x Christine Sun Kim | NOWNESS. https://www.nowness.com/story/todd-selby-x-christine-sun-kim. Accessed 11 Nov 2019.
Shoval, S., Borenstein, J., & Koren, Y. (1998). Auditory guidance with the Navbelt: a computerized travel aid for the blind. IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 28(3), 459–467. https://doi.org/10.1109/5326.704589.
Siegle, J. H., & Warren, W. H. (2010). Distal attribution and distance perception in sensory substitution. Perception, 39(2), 208–223. https://doi.org/10.1068/p6366.
Sound Foresight Technology. (2019a). UltraBike. https://www.ultracane.com/ultra_bike. Accessed 11 Nov 2019.
Sound Foresight Technology. (2019b). UltraCane. https://www.ultracane.com/about_the_ultracane. Accessed 11 Nov 2019.
Spence, C. (2011). Crossmodal correspondences: a tutorial review. Attention, Perception, & Psychophysics, 73(4), 971–995. https://doi.org/10.3758/s13414-010-0073-7.
Spence, C. (2014). The skin as a medium for sensory substitution. Multisensory Research, 27(5–6), 293–312. https://doi.org/10.1163/22134808-00002452.
Spence, C., & Parise, C. V. (2012). The cognitive neuroscience of crossmodal correspondences. I-Perception, 3(7), 410–412. https://doi.org/10.1068/i0540ic.
Sreetharan, S., & Schutz, M. (2019). Improving human–computer interface design through application of basic research on audiovisual integration and amplitude envelope. Multimodal Technologies and Interaction, 3(1), 4. https://doi.org/10.3390/mti3010004.
Stanford, T. R. (2005). Evaluating the operations underlying multisensory integration in the cat superior colliculus. Journal of Neuroscience, 25(28), 6499–6508. https://doi.org/10.1523/JNEUROSCI.5095-04.2005.
Stanford, T. R., & Stein, B. E. (2007). Superadditivity in multisensory integration: putting the computation in context. NeuroReport, 18(8), 787–792. https://doi.org/10.1097/WNR.0b013e3280c1e315.
Starkiewicz, W., & Kuliszewski, T. (1963). The 80-channel elektroftalm. In L. Clark (Ed.), Proceedings of the International Congress on Technology and Blindness, (2nd ed., p. 157). New York: American Foundation for the Blind.
Stein, B. E., & Wallace, M. T. (1996). Comparisons of cross-modality integration in midbrain and cortex. Progress in Brain Research, 112, 289–299. https://doi.org/10.1016/s0079-6123(08)63336-1.
Stein, B. E. (1998). Neural mechanisms for synthesizing sensory information and producing adaptive behaviors. Experimental Brain Research, 123(1–2), 124–135. https://doi.org/10.1007/s002210050553.
Stein, B. E., Burr, D., Constantinidis, C., Laurienti, P. J., Alex Meredith, M., Perrault, T. J., … Lewkowicz, D. J. (2010). Semantic confusion regarding the development of multisensory integration: a practical solution. European Journal of Neuroscience, 31(10), 1713–1720. https://doi.org/10.1111/j.1460-9568.2010.07206.x.
Stiles, N. R. B., & Shimojo, S. (2015). Auditory sensory substitution is intuitive and automatic with texture stimuli. Scientific Reports, 5(1), 15628. https://doi.org/10.1038/srep15628.
Stiles, N. R. B., Zheng, Y., & Shimojo, S. (2015). Length and orientation constancy learning in 2-dimensions with auditory sensory substitution: the importance of self-initiated movement. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.00842.
Stoll, C., Palluel-Germain, R., Fristot, V., Pellerin, D., Alleysson, D., & Graff, C. (2015). Navigating from a depth image converted into sound. Applied Bionics and Biomechanics, 2015, 1–9. https://doi.org/10.1155/2015/543492.
Striem-Amit, E., Cohen, L., Dehaene, S., & Amedi, A. (2012). Reading with sounds: sensory substitution selectively activates the visual word form area in the blind. Neuron, 76(3), 640–652. https://doi.org/10.1016/J.NEURON.2012.08.026.
Stronks, H. C., Mitchell, E. B., Nau, A. C., & Barnes, N. (2016). Visual task performance in the blind with the BrainPort V100 Vision Aid. Expert Review of Medical Devices, 13(10), 919–931. https://doi.org/10.1080/17434440.2016.1237287.
Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted interplay between attention and multisensory integration. Trends in Cognitive Sciences, 14(9), 400–410. https://doi.org/10.1016/J.TICS.2010.06.008.
van Erp, J. B. F., Van Veen, H. A. H. C., Jansen, C., & Dobbins, T. (2005). Waypoint navigation with a vibrotactile waist belt. ACM Transactions on Applied Perception, 2(2), 106–117. https://doi.org/10.1145/1060581.1060585.
Visell, Y. (2009). Tactile sensory substitution: models for enaction in HCI. Interacting with Computers, 21(1–2), 38–53. https://doi.org/10.1016/j.intcom.2008.08.004.
Ward, J., & Meijer, P. (2010). Visual experiences in the blind induced by an auditory sensory substitution device. Consciousness and Cognition, 19(1), 492–500. https://doi.org/10.1016/J.CONCOG.2009.10.006.
Wicab. (2019). Wicab, Inc. BrainPort Technologies. United States. https://www.wicab.com/wicab-inc. Accessed 11 Nov 2019.
White, B. W., Saunders, F. A., Scadden, L., Bach-y-Rita, P., & Collins, C. C. (1970). Seeing with the skin. Perception & Psychophysics, 7(1), 23–27. https://doi.org/10.3758/BF03210126.
Woods, A. T., Spence, C., Butcher, N., & Deroy, O. (2013). Fast lemons and sour boulders: testing crossmodal correspondences using an internet-based testing methodology. I-Perception, 4(6), 365–379. https://doi.org/10.1068/i0586.
Zelek, J. S., Bromley, S., Asmar, D., & Thompson, D. (2003). A haptic glove as a tactile-vision sensory substitution for wayfinding. Journal of Visual Impairment & Blindness, 97(10), 621–632. https://doi.org/10.1177/0145482X0309701007.
The authors thank Bora Esenkaya for his generous discussions leading to the conceptualisation of this paper.
This research has been funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 665992, and the UK’s EPSRC Centre for Doctoral Training in Digital Entertainment (CDE), EP/L016540/1.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Lloyd-Esenkaya, T., Lloyd-Esenkaya, V., O’Neill, E. et al. Multisensory inclusive design with sensory substitution. Cogn. Research 5, 37 (2020). https://doi.org/10.1186/s41235-020-00240-7