Beyond small-scale spatial skills: Navigation skills and geoscience education

Background Research examining the relation between spatial skills and the science, technology, engineering and mathematics (STEM) fields has focused on small-scale spatial skills, even though some STEM disciplines—particularly the geography and geoscience (GEO) fields—involve large-scale spatial thinking at the core of their professional training. In Study 1, we compared large-scale navigation skills of experienced geologists with those of experienced psychologists, using a novel virtual navigation paradigm as an objective measure of navigation skills. In Study 2, we conducted a longitudinal study with novice Geographic Information Systems (GIS) students to investigate baseline navigational competence and improvement over the course of an academic semester. Results In Study 1, we found that geologists demonstrated higher navigational competence and were more likely to be categorized as integrating separate routes, compared to their non-STEM counterparts. In Study 2, novice GIS students showed superior baseline navigational competence compared to non-STEM students, as well as better spatial working memory and small-scale mental rotation skills, indicating self-selection. In addition, GIS students’ spatial skills improved more over the course of a semester than those of non-STEM students. Conclusions Our findings highlight the importance of large-scale spatial thinking for enrollment and success in the GEO fields but likely also across the broader range of thinking involving spatial distributions. We discuss the potential of GIS tools to develop spatial skills at an early age.


Significance
People with strong spatial skills are more likely to pursue and succeed in science, technology, engineering and mathematics (STEM) careers. Fortunately, spatial skills are malleable, and improvements in them are both durable and generalizable (Uttal et al., 2013). Improving spatial skills during educational training may be an effective way of increasing a gender-balanced STEM workforce. However, there are two gaps in current knowledge, which the current research addresses. First, existing research focuses almost exclusively on small-scale spatial skills, leaving a gap in our understanding of the relation between large-scale navigation skills and STEM learning, specifically related to the Geography and Geoscience (GEO) fields. Using a large-scale virtual environment, we tested the navigation skills of expert geologists and compared their performance with that of expert psychologists. Second, there is a need for a sustainable spatial training plan that can be integrated into classrooms; one possibility is the integration of Geographic Information Systems (GIS) tools in STEM teaching. The current project tested the effectiveness of GIS training in improving spatial skills in novice students.

Background
Humans need spatial skills to survive and function in a spatial world: to navigate from point A to point B, to manipulate objects and to invent tools. Strong spatial skills also predict interest and success in science, technology, engineering, and mathematics (STEM) disciplines (Kell, Lubinski, Benbow, & Steiger, 2013;Shea, Lubinski, & Benbow, 2001;Wai, Lubinski, & Benbow, 2009). However, these studies involve paper-and-pencil assessments of "small-scale" object-based manipulations such as mental rotation. Decades of behavioral research, and more recent findings from neuroscience, suggest that spatial thinking is a multidimensional construct involving different cognitive mechanisms and distinct neural networks for dealing with the space of objects (small scale) or environments (large scale; Aguirre & D'Esposito, 1999;Morris & Parslow, 2004;Philbeck, Behrmann, Black, & Ebert, 2000). Thus, an important dimension to consider when investigating the relation between STEM learning and spatial skills is scale. An important gap in our understanding of the reciprocal relation between spatial skills and STEM success involves whether large-scale spatial skills, like small-scale skills, predict enrollment in STEM fields and contribute to STEM success. Do good navigators make good scientists? Do good scientists develop navigationally relevant skills?
Traditional attempts to define and categorize spatial ability into constituent spatial skills (Carroll, 1993;Eliot, 1987;Linn & Petersen, 1985) have not highlighted the scale distinction, likely because psychometric tests did not really tap into large-scale skills. Montello (1993) discusses the importance of scale in understanding psychological spaces and defines four classes of psychological spaces based on the projective and not the absolute size of space relative to the human body-figural, vista, environmental and geographical. Small-scale spatial skills are needed at the figural (e.g., a small object) and vista (e.g., a single room) scales, where an individual can visually observe all spatial characteristics with minimal movement (i.e., from a single vantage point). However, large-scale spatial thinking comes into play at the environmental (e.g., a city) scale, where an individual may obtain information about the spatial properties of the space through considerable locomotion and at the geographical scale (e.g., a country), where direct locomotion must be replaced by symbolic learning from maps and models in order to obtain spatial information about the space. If large-scale navigation skills are indeed relevant for success in STEM fields, they should be most relevant in fields that require spatial reasoning on a large scale. The core of professional training in the GEO STEM disciplines (we use GEO to encompass the geography and geoscience disciplines that focus on spatial patterns and include geology, geography, geographic information systems, geophysics, oceanography and atmospheric science, among others) is engagement in spatial encoding and transformation on an environmental and geographical scale. Thus, GEO disciplines may rely on and hone large-scale thinking, which may not be the case in STEM fields like chemistry and physics or even engineering. So far, the disproportionate focus on small-scale spatial skills and their relation to general STEM learning ignores the heterogeneity of both spatial skills as well as that of STEM training.
A more nuanced approach motivated by findings from neuroscience and psychology (Chatterjee, 2008) is to categorize spatial skills based on the use of intrinsic/extrinsic object information and static/dynamic movement information, as illustrated in Fig. 1 (Newcombe, 2018;Newcombe & Shipley, 2015;Uttal et al., 2013). Common spatial measures used to test spatial skill in the laboratory exist in each of the four quadrants. For example, the small-scale spatial skill of mental rotation involves movement (dynamic) of a single object (intrinsic), whereas the large-scale skill of navigation frequently involves movement (dynamic) of oneself with respect to a set of objects (extrinsic). Research on skills in the extrinsic-dynamic cell at the bottom right has been largely empty.
One of the main reasons for the gap in extrinsic-dynamic spatial research is the challenge associated with conducting lengthy and standardized real-world navigation experiments (Choi,McKillop,Ward & L'Hirondelle,Fig. 1 Categorization of spatial skills by intrinsic/extrinsic object information and static/dynamic movement information 2006; Heth, Cornell & Alberts, 1997;Holscher, Tenbrink & Wiener, 2011;Schinazi, Nardi, Newcombe, Shipley, & Epstein, 2013). Virtual environment navigation paradigms overcome this challenge. One paradigm-Virtual Silcton-has been used with hundreds of participants of varying ages (Blacker, Weisberg, Newcombe, & Courtney, 2017;Galati, Weisberg, Newcombe, & Avraamides, 2017;Nazareth, Weisberg, Margulis & Newcombe, 2018;Weisberg & Newcombe, 2016;Weisberg, Schinazi, Newcombe, Shipley, & Epstein, 2014). In Virtual Silcton, participants explore a virtual environment modeled along the lines of a college campus comprising two main routes and two connecting routes. Participants use computer arrow keys to navigate through the virtual world and to learn the names and locations of target buildings along the two main routes. Using this methodological tool, participants exhibit large and robust individual differences in their navigation skills. Both adults and children can be categorized into three distinct navigator types: integrators, non-integrators and imprecise navigators. Integrators can connect different routes to create a cognitive map of the environment; non-integrators can represent independent routes but are unable to relate them to each other; and imprecise navigators have trouble even representing independent routes. With this new tool, we can now ask the question of whether large-scale spatial skills contribute to STEM success. In this article, we aim to address the gap in the spatial and STEM literature by conducting two studies of GEO disciplines to evaluate the relation between navigation skills and experience with these kinds of science. In two studies, we investigate individuals who differ in their mode of learning and levels of domain expertise but have visualization and manipulation of environmental and geographical spaces at the core of their training. There were two approaches, involving two designs with two different disciplines at two different levels of expertise. In Study 1, we investigated large-scale navigation skills in experienced solid earth geologists-individuals who have acquired a master's or Ph.D. degree in geology accompanied by field experience-in contrast with psychologists with equivalent years of experience in that field. In Study 2, we examined self-selection and learning in undergraduates taking a GIS course in contrast to those selecting a nonspatial course.

Study 1
Geology is the study of observable large-scale environmental features to deduce natural events that may have occurred over a period of millions of years. Geologists study physical processes that span large-scale spatial relations of tectonic plates to more microscopic phenomenon like the spatial organization of mineral grains (Hegarty, Crookes, Dara-Abrams, & Shipley, 2010;Kastens, Agrawal, & Liben, 2008;Kastens et al., 2009). Thus, one would expect that an experienced geologist is comfortable making judgments pertaining to spatial pattern detection and transformation associated with geological events. However, do these domain-specific skills extend to spatial skills? Previous research has established a connection between small-scale spatial skills and geology training (Kali & Orion, 1996;Orion, Ben-Chaim, & Kali, 1997;Piburn et al., 2005). In a recent study, Hambrick et al. (2012) studied 67 geologists with varying levels of experience who completed a realistic bedrock-mapping task along with a battery of cognitive ability tasks. The authors found that visuospatial ability predicted performance on the bedrock-mapping task for novice but not experienced geologists, suggesting that high domain knowledge may sometimes allow the circumvention of domain-general cognitive limitations in domain-specific tasks.
In an initial study of large-scale spatial skills, Hegarty et al. (2010) used an online questionnaire to collect self-reports on spatial skills at different scales from 796 scientists and specialists in different disciplines. They found that geoscientists (here, the term geoscientists refers to specialists in geology, oceanography and meteorology, but does not include geography or GIS) reported the highest levels of navigational competence and confidence as measured by the Santa Barbara Sense of Direction Scale (SBSOD). Geographers were a close second. Other scientists reported lower skill levels. Although self-reported navigation correlates with objective measures of navigation behavior (e.g., Weisberg & Newcombe, 2016;Weisberg et al., 2014), it is an indirect measure of ability and it is important to determine whether geologists do indeed have better navigation skills.
In Study 1, we anticipated that our sample of experienced geologists would point more accurately between different points in the environment and be more likely to be categorized as integrators, demonstrating higher navigational competence as compared to experienced psychologists. Thus, Study 1 addresses the gap in spatial skills literature by directly linking large-scale navigation skills to the GEO fields and taking the claim beyond self-reporting.

Method
Participants A total of 28 experienced geologists (12 female; mean (M) age = 40.7 years, standard deviation (SD) = 9.7) were recruited via email as well as in person at geology-centered conferences with the goal of collecting as much data as possible over a period of 1 year. The majority of geologists (n = 20) held a Ph.D. at the time of participation in this study. Data were also collected from geologists who had completed a terminal master's program (n = 5) and Ph.D. students nearing completion of their degree (n = 3). Geologists who had not completed a Ph.D. at the time of this study performed as well as participants who had completed a Ph.D., and, as such, their data were included during analysis. All 28 geologists completed the virtual environment navigation tasks. Eighteen geologists were also administered a psychometric measure, detailed in the following, and a questionnaire to collect demographics, handedness, specific education level, area of specialty and whether and how much time they had in the field. Of the 18 geologists for whom we have data, 17 identified as white and one identified as mixed race.
For the expert comparison group, a total of 27 experienced psychologists (12 female; M age = 37 years, SD = 11.63) were recruited via email. The majority of psychologists (n = 19) held a Ph.D. at the time of participation in this study. Data were also collected from psychologists who had completed a terminal master's program (n = 4) and Ph.D. students nearing completion of their degree (n = 4). All 27 psychologists completed the virtual environment navigation tasks and were also administered a psychometric measure, detailed in the following, and a questionnaire on demographics, handedness, specific education level and area of specialty; however, data for one psychologist was not recorded due to a computer crash. Of the 26 psychologists who reported on racial information, 16 psychologists identified as white, six as Asian, one as African-American and three as mixed race.
Our larger comparison group comprised 294 undergraduate students (168 female, two did not report gender) between the ages of 18 and 40 years from a large urban research university who participated in one of four studies which assessed them on Virtual Silcton performance. These data were reported previously in two manuscripts (Weisberg & Newcombe, 2016;Weisberg et al., 2014). In those studies, undergraduates who did not complete the second session of any study were excluded, but all undergraduates for whom we have Virtual Silcton data are included here. Age was not collected. One undergraduate identified as American Indian, 26 as Asian, two as Black/ Hispanic, 37 as Black, six as Hispanic, 10 as White, 133 as White/non-Hispanic, six as other, four omitted this information and data were not collected for 69. Finally, we included the 77 geoscientists tested by Hegarty et al. (2010) for a comparison of self-reported SBSOD scores.
The current research received the university's Institutional Review Board approval (Protocol number 13394: 'Computer-Based Spatial Abilities').

Materials
Geologists who were recruited via email (N = 10) completed the study on their own personal computers. None of these participants reported any technological issues. Geologists who were recruited at conferences (N = 18) and all other participants completed the study on a Windows 10 64-bit computer. The computer had an Intel Core i7-4720HQ CPU @ 2.60 GHz and Nvidia GeForce GTX 960 M video card. The virtual environment (VE) was displayed on a 34.54 cm × 19.43 cm LCD monitor with a refresh rate of 60 Hz and resolution of 1920 × 1080. The VE was modeled on a real-world college campus (Schinazi et al., 2013;Weisberg et al., 2014) using Unity3D and Google Sketchup. The VE was designed to replicate the saliency and spatial location of buildings and nonbuilding objects like trees, trashcans and so forth, without replicating the exact architecture of the real-world structures (Schinazi et al., 2013).

Virtual environment navigation paradigm (Virtual Silcton)
Virtual Silcton is a desktop-based virtual environment (VE) navigation paradigm. It comprises two main routes in different areas of the same VE and two connecting routes (see Fig. 2). Each main route consists of four unique target buildings for a total of eight target buildings. In the learning phase, participants were first instructed to learn the names and locations of each of the eight target buildings by virtually walking along each main route indicated by red arrows. They were told to pay attention to the front door of each building, as that was the specific spot they would be asked to point at later in the experiment. Target buildings in the VE were indicated by a blue gem hovering near the name of the target building. The two main routes were counterbalanced between participants. Participants walked from the start of each route to the end and then back to the start; thus, each route was completed twice before moving on to the next route. They were told not to veer off the path marked by red arrows, but that they could take as much time as they liked on each route. Each of the routes was surrounded by invisible walls, which kept the participant along the arrowed routes. Participants used the arrow keys on a computer keyboard to move along the virtual paths and a computer mouse to look 360 o around the VE. The experimenter encouraged participants to practice using the controls and to ask clarification questions before beginning the task. After learning the four target buildings on each main route, participants learned how the eight target buildings were related by walking down two connecting routes.
Before starting the two connecting routes, participants were told that these paths would "connect" or "go in between" the first two paths they had just learned. The experimenter noted that these connecting routes would not include any new buildings for participants to remember, and that instead their role was to help participants understand how the buildings related to one another. Similar to the main routes, the connecting routes were counterbalanced between participants (but always occurred after the main routes were learned). Participants were reminded to stay on the route marked by red arrows and the invisible walls along the connecting routes prevented participants from veering off-course.
In the testing phase, participants completed two spatial tasks-a pointing task and a model-building task-which tested the participant's ability to create accurate and integrated representations of the virtual environment. In addition to the two spatial tasks, participants completed a cued building recognition task.

Pointing task
In the pointing task, participants were located next to one of the eight target buildings and were prompted to point in the direction of each of the other seven buildings using a virtual crosshair (see Fig. 3). Thus, three of the seven buildings would be on the same route as that of the participant in the VE and four buildings would be on the second main route. Participants pointed a virtual crosshair by rotating on the horizontal plane using the mouse in the direction of the front door of the target building and recorded their response by clicking. They were instructed to point their crosshair, specifically, at the front door of each building, and to be careful to only click once to record their answer. This process was repeated for each of the eight buildings in the VE. A pointing error score for each participant was calculated based on the absolute value of the participant's answer minus the correct answer. If that value exceeded 180, we corrected it by subtracting the value from 360. Performance on the pointing task was subdivided into a within-route and a between-route pointing performance based on the position of the target building in relation to the participant's pointing location in the VE. A within-route error score was calculated for trials in which the target building was on the same route as that of the participant. A between-route error score was calculated for trials in which the target building was on a different main route to that of the participant. There were a total of 24 within-route trials and 32 between-route trials.

Model-building task
In the model-building task, participants were told that they would construct a map of the virtual environment using a bird's-eye view. Participants were shown an aerial view of the eight buildings and their names alongside a blank box on a computer screen. Participants had to drag and drop the miniature models of the eight buildings into the blank box at spatial locations relative to each other in order to recreate a and the layout of buildings on each route. The letter-number combinations are used to indicate the start and end points along each of the main and connecting routes (i.e., participants walked from point 1 to point 2 and then back to point 1 for each of the main and connecting routes, thus traversing each route twice). Participants had to learn the names and locations of four target buildings on each of the two main routes. The presentation of the main routes were counterbalanced (A first or B first) and the presentation of the connecting routes were counterbalanced (C first or D first) map-like representation of the virtual environment in the blank box. A bidimensional regression analysis (Friedman & Kohler, 2003;Tobler, 1994) was used to calculate the R 2 value for each participant. The R 2 value corrects for rotational, translational and scale differences between the participant map and the actual map, and indicates the remaining proportion of variance in the participant's map accounted for by the actual map. It can be interpreted as configurational accuracy.

Psychometric measures
The Santa Barbara Sense of Direction Scale (SBSOD; Hegarty, Richardson, Montello, Lovelace, & Subbiah, 2002) is a self-report measure of one's "sense of direction" or the ability to orient oneself within an environment. The measure consists of 15 items using a 7-point Likert scale with statements such as "I very easily get lost in a new city."

Procedure
After consenting to participate, participants completed a short demographic form to collect information such as age, education level and area of expertise. Participants then completed a computerized version of the SBSOD, followed by the virtual environment navigation paradigm. Participants were instructed to learn the names and locations of the eight target buildings as they explored the two main routes and two connecting routes in Virtual Silcton. Participants were then asked to complete the pointing and model-building tasks. Finally, participants were debriefed and thanked for their participation. The entire study, from start to finish, took approximately 45 min-1 h to complete.

Results
We first evaluated whether self-reported navigation skill as measured by the SBSOD differed between psychologists and geologists in the current sample. We also included the larger sample of 77 geoscientists (41 female; M age = 34.98 years, SD = 11.96) tested by Hegarty et al. (2010) and undergraduate students tested by Weisberg et al. (2014Weisberg et al. ( , 2016. As hypothesized, one-way ANOVA revealed significant differences across the four groups, F(3,417) = 28.88, p < 0.001. A post-hoc test revealed no significant differences in SBSOD scores between the geologists in the current study (M = 5.12, SD = 1.06) and geoscientists in the Hegarty et al. survey (M = 5.50, SD = 0.86) (p = 0.18, d = 0.40, Bayes factor (B) = 1.04), suggesting that the current sample is not likely to be different from the discipline at large.

Types of navigators
Previously we found that navigators clustered along two dimensions-performance on within-route and betweenroute pointing-into three groups (Weisberg & Newcombe, 2016;Weisberg et al., 2014). One group performed well on both tasks (integrators), and another performed poorly on both (imprecise navigators). A third group performed well on within-route pointing but poorly on between-route pointing (non-integrators). The ratio of navigators falling into each of these groups was approximately 1:2:1 (integrators:non-integrators:imprecise navigators). Figure 4 displays the scatter plot resulting from plotting the performance on between-route trials on the x axis and the within-route pointing performance on the y axis. As is visually apparent, more geologists are in the lower left of the graph, proportionally, than psychologists and undergraduate students, relative to the lower-right and upper-right quadrants. To address this analytically, due to the small sample size of geologists, we used the cutoff values from the undergraduate data to determine the number of participants in each navigator group (integrator:non-integrator:imprecise navigator). This resulted in a significant cluster difference between geologists (16:10:2), psychologists (9:9:8) and undergraduate students (84:131:79), χ 2 (4, N = 348) = 11.88, p = 0.02, Cramer's V = 0.13. A post-hoc test-using adjusted residuals and a Bonferroni correction for multiple comparisons between nine cells (three groups × three types of navigators)-showed that the number of geologists categorized as integrators (p = 0.0019) was significantly higher than integrators among psychologists and undergraduates, respectively. No other cells were significantly different from each other. Fig. 4 Scatterplot of performance on between-route and within-route pointing trials grouped by geologists and non-STEM undergraduates. Quadrants are based on cluster membership cutoff values-good between/good within (integrators), good between/bad within (non-integrators) and bad between/bad within (imprecise navigators)-established in previous studies using Virtual Silcton (Weisberg & Newcombe, 2016;Weisberg et al., 2014). STEM science, technology, engineering and mathematics

Discussion
Study 1 broadens our understanding of the relation between spatial skills and STEM to an underinvestigated kind of large-scale spatial reasoning, namely the extrinsic-dynamic spatial processes related to navigation. Our results align with the findings of Hegarty et al. (2010) in that geologists report significantly higher levels of navigation skills as compared to psychologists who have completed comparable years of study and experience in the field and to non-STEM undergraduate students. Using a virtual navigation paradigm, we found that geologists indeed exhibit higher navigational competenceas measured by pointing and model-building tasks-as compared to psychologists and non-STEM students. Thus, geologists not only report higher levels of navigation skills but also demonstrate superior navigation performance than their non-STEM counterparts, comparable in years of professional training. The lack of significant differences in navigation performance between psychologists and non-STEM students further lends support to the hypothesis that additional years of non-STEM education alone do not suffice to improve large-scale spatial skills substantially. Thus, our results provide evidence for the link between large-scale navigation skills and training in the GEO fields. Such a link is not entirely surprising given the integral nature of navigation in data collection for geology. Solid earth data are often collected over a large field area where one data collection location is not visible from another. For a field-based project, most geology Ph.D. students would collect data over an area greater than 100 km 2 for their theses. Thus, GEO experts must coordinate multiple extrinsic relationships-between the location of the self and the map to record where data were collected, among data points on the map, between the spatial relations in a rock sample and regional spatial patterns, and among locations in the field area to decide where next to collect data (Shipley & Tikoff, in press). Geologists, who are good at efficiently covering a mapping area, spending more time in the areas that provided the highest quality information for discerning among possible interpretations, tend to be more accurate in inferring the underlying geological structure (Baker, Johnson, Callahan, & Petcovic, 2016). Finally, there is the practical advantage of being a good navigator in a field that often requires working away from established paths-good navigators are more likely to return quickly and successfully to their vehicles at the end of the day.
One of the limitations of Study 1, however, is that it is impossible to hypothesize about the domain expert's spatial competence prior to their domain training. Do GEO experts get better at large-scale and/or small-scale spatial tasks because of their training? Or do high-spatial individuals self-select to specific STEM disciplines? Or do both effects occur? To overcome this drawback as well as to delineate the role of fieldwork in improving spatial skills, we investigated novices in a related GEO discipline requiring large-scale spatial reasoning and pattern detection across space and time.

Study 2
In Study 2, we explored the links between self-selection to a STEM field-Geographic Information Systems (GIS)-and improvement in navigation skills after extended exposure to domain knowledge from that field. GIS involves the use of an integrated toolbox of hardware and software systems and processes designed to allow an individual to store, retrieve, visualize and transform spatial data. Over the last three decades, GIS applications have extended beyond the field of geography and into various educational domains (Madsen & Rump, 2012) with the ultimate goal to enhance our ability to address planning and management problems (National Research Council, 2006). Not unlike the field of geology, GIS entails large-scale spatial reasoning and transformations, albeit through a different medium of learning. Where geology expertise often relies on fieldwork in the real world, GIS training focuses on a technology-assisted ability to store, visualize and manipulate digitized spatial information. So, does a suite of spatial visualization and analyses software at a figural scale demand high largescale spatial thinking and does domain-specific knowledge in this GEO field translate into better spatial skills, specifically navigation skills? Lee and Bednarz (2009) found that students enrolled in a GIS course outperformed a control group on a spatial test. In addition, GIS participants showed significant improvement in spatial thinking during the semester. However, the questions on the spatial test created to measure spatial thinking skill were closely related to the GIS course work and as such may not have been reflective of domain-general large-scale and small-scale spatial skills. Similarly, Hall-Wallace and McAuliffe (2002) found a significant positive correlation between smallscale spatial skills-measured by the surface development and cubes comparison tasks-and GIS learning. Although limited, there is a growing body of research investigating the relation between spatial thinking skills and GIS learning (e.g., Albert & Golledge, 1999;Baker & Bednarz, 2003;Britz & Webb, 2016;Kim & Bednarz, 2013). However, research so far has been limited to small-scale spatial thinking and to spatial tests closely related to the GIS curriculum.
In Study 2, we compared large-scale and small-scale spatial skills of novice GIS students with students enrolled in a nonspatial communications (COM) course at the start (T1) and end (T2) of an academic semester. As in Study 1, participants in Study 2 completed a virtual navigation paradigm in addition to mental rotation and spatial working memory tasks. Spatial and nonspatial skill at T1 was used as a baseline to examine improvement over the course of a semester. We hypothesized that: GIS students will have significantly better spatial skills at T1 as compared to COM students; GIS students will show greater improvement in spatial skills, specifically in navigation skills, from T1 to T2 compared to COM students; and mental rotation and spatial working memory may mediate the relation between academic course and spatial skills improvement.

Method
Participants A total of 90 undergraduate students (55 female) agreed to participate in the current study. Participants were recruited from introductory GIS (n = 47; 26 female) and communication (n = 43; 29 female) courses at Temple University. GIS introductory courses at the university where data were collected have an average class size of 12 students, and the goal was to collect as much data as possible over a period of 2 years (two Fall and two Spring semesters). Of the 90 participants who signed up for the study, 70 participants completed both pre-test (T1) and post-test (T2) components of the study. An equal number of GIS and COM students dropped out at T2. Age was not recorded but ranged between 18 and 25 years, which was an eligibility criterion for participation. Of those participants who chose to disclose their racial and ethnic information, four participants identified as American Indian, eight as Asian, seven as Black/African American, two as more than one race, one as Native Hawaiian, 31 as White and two as other race. The current research received the university's Institutional Review Board approval (Protocol number 23379: 'Exploring Links between STEM Success and Spatial Skills: Undergraduate GIS Courses and a Spatial Turn of Mind'). Participants received a $15 gift card on completion of T1 and an additional $20 gift card on completion of T2.

Materials
The study was administered on a Windows 7 64-bit computer. The computer had an Intel Core i5-6600 CPU @ 3.30 GHz and Nvidia GeForce GT 610 video card. The virtual environment (VE) was displayed on a 40 cm × 62 cm LCD monitor with a refresh rate of 60 Hz and resolution of 1680 × 1050.

Virtual environment navigation paradigm (Virtual Silcton)
The virtual environment navigation paradigm in Study 2 was identical to that of Study 1. After exploring the VE, participants completed the pointing task followed by the model-building task. In addition to Virtual Silcton, participants completed three psychometric and self-report measures: a mental rotation test, a spatial working memory task and a verbal ability test.

Psychometric and self-report measures
The Mental Rotation Test (MRT; Vandenberg & Kuse, 1978, adapted by Peters et al., 1995 consists of 20 items each made up of one target figure and four response items. Two of the four response items are identical to the target figure, but presented at varying orientations. The remaining two items are mirror images of the target figure in varying orientations. Participants were asked to identify the two response items that were identical but rotated images of the target figure. Before beginning the task, participants were given three practice trials. If they got any of the practice problems incorrect, they reviewed their answers with the experimenter and found the right one before moving on to the actual task. Participants received 2 points for each correct response and lost 2 points for each incorrect response. The Spatial Working Memory Complex Span (Symmetry span; Unsworth, Heitz, Schrock, & Engle, 2005) was also used. For the spatial working memory (SWM) task, participants had to remember the location of one red square in a 4 × 4 matrix of otherwise white squares. They then had to judge whether a separate array of black and white squares were bilaterally symmetrical or not. After a series of between three and five items (e.g., red square, symmetry judgment, red square, symmetry judgment, etc.), participants must recall the red square locations in the correct order. Participants' scores are calculated by summing all correctly recalled items.
The Wide Range Achievement Test, Word Reading Subtest (WRAT-4; Wilkinson & Robertson, 2006) is a measure of verbal IQ that correlates very highly with the WAIS-III and WISC-IV (Strauss, Sherman, & Spreen, 2006). The WRAT-4 Word Reading Subtest requires participants to pronounce 55 individual words. Each participant's score is the number of words pronounced correctly out of 55.
The Philadelphia Verbal Ability Scale (PVAS; Hegarty et al., 2010) is a self-report measure of how good participants feel their verbal ability is. The measure comprises 10 items using a 7-point Likert scale (Cronbach's α = 0.78) with statements like "I am very good at scrabble." The Philadelphia Spatial Ability Scale (PSAS; Hegarty et al., 2010) is a self-report measure of how well participants feel they can perform on common small-scale spatial tasks. The measure comprises 16 items using a 7-point Likert scale (Cronbach's α = 0.77) with statements like "I enjoy putting together puzzles."

Procedure
All participants completed the pre test (T1) within the first 3 weeks of the semester and completed the post test (T2) during the last 3 weeks of the semester. We ensured that the number of weeks between the pre and post tests stayed approximately constant across participants. During the pre test (T1), participants signed a consent form informing them about the two-timepoint study. Participants could opt out at any point during the study. On consenting to participate, the investigator first administered the WRAT. Participants then filled out a short demographic form and completed the online version of the mental rotation task. This was followed by the virtual environment navigation paradigm. Participants were instructed to explore the two main routes and two connecting routes in Virtual Silcton, and to complete the pointing and model-building tasks. Finally, participants completed an e-prime version of the SWM measure. The entire study from start to finish took approximately 1 h per session and not more than 2.5 h for both sessions.

Results
To evaluate our three hypotheses, we ran repeated-measures ANOVA followed by post-hoc tests to compare baseline competency and improvement over time in the navigation and mental rotation skills of GIS and COM participants. We also ran mediation models to investigate the role of mental rotation and spatial working memory in improving navigation skills.
Prior to analysis, the data were evaluated for multivariate outliers by examining leverage indices for each individual (Jaccard & Wan, 1993). No outliers were detected. Further, a Levene's test for homogeneity of variance demonstrated that the assumption of equal variances was met (all p > 0.05). Our sample had missing data (approximately 22% attrition; i.e., participants who completed T1 but did not return for testing at T2). To deal with the missing data, we ran a multiple imputation analysis using SPSS v20 and followed the guidelines for multiple imputation analysis specified in Jeličić, Phelps, and Lerner (2009) (see also Rezvan, Lee, & Simpson, 2015, for a review). The MI analysis was conducted using 23 imputations so as to exceed the percentage of attrition that was found to be approximately 22% (White, Royston, & Wood, 2011). All of the following analyses were conducted using the imputed dataset and all figures/tables present in the imputed dataset. Table 1 presents descriptive statistics for the spatial tasks and psychometric measures grouped by participant course. In order to test for baseline competency and improvement in spatial performance over time moderated by participant course, we ran repeated-measures ANO-VAs followed by post-hoc t tests. GIS and COM participants were comparable on nonspatial verbal ability as measured by the WRAT, t(88) = 1.49, p = 0.14, d = 0.32, and the PVAS, t(88) = 0.94, p = 0.35, d = 0.20. There were also no significant differences between the groups on the PSAS, t(88) = 1.64, p = 0.10, d = 0.35.

Within-route pointing error
There was a significant main effect of participant course, F(1,88) = 11.53, p < 0.001, partial η 2 = 0.12, and time, F(1,88) = 13.74, p < 0.001, partial η 2 = 0.14. However, there was no significant interaction between time and course, F(1,88) = 1.88, p = 0.17, partial η 2 = 0.02 (see Fig. 5a). Thus, overall, GIS participants outperformed COM participants on the within-route pointing trials and there was significant improvement from T1 to T2 for both groups. However, there was no significant difference in the rates of improvement from T1 to T2. A t test revealed that at baseline GIS participants were significantly better than COM participants on the within-route pointing task, t(88) = 2.17, p = 0.03, d = 0.46. This task was further divided into seen and unseen within-route trials based on the intervisibility of target buildings along a route. The pattern of results is consistent with the overall within-route pointing error, with no significant differences between trials when the target was visible or not.

Spatial working memory
There was a significant main effect of participant course, F(1,88) = 10.49, p = 0.002, partial η 2 = 0.12, and time, F(1,88) = 30.88, p < 0.001, partial η 2 = 0.26, but no significant interaction between time and course, F(1,88) = 1.40, p = 0.24, partial η 2 = 0.02. Thus, overall, GIS participants outperformed COM participants on SWM and there was significant improvement from T1 to T2 for both groups. However, there was no significant difference in the rates of improvement from T1 to T2. At baseline, GIS participants were significantly better than COM participants on the SWM task, t(88) = 3.18, p < 0.001, d = 0.67. These analyses were also run using listwise deletion instead of multiple imputations. All results stayed the same except in the case of model-building performance. Between-route pointing error. c Model-building (R 2 value). d Mental rotation skill. Overall, GIS participants significantly outperformed COM participants and there was significant improvement in both groups from T1 to T2. In addition, GIS participants improved at a faster rate than COM participants on all measures except the within-route pointing task. Error bars reflect ±1 standard error of the mean. COM Communication, GIS Geographic Information Systems, T1 pre test (start of academic semester), T2 post test (end of academic semester). For within-and betweenpointing errors, a low value (error) indicates high accuracy There was no significant main effect of time or time × course interaction. However, listwise deletion is a less optimal strategy for dealing with missing data in a longitudinal design and can reduce statistical power with small sample sizes (Acock, 2005). Hence, we used results of the multiple imputations to interpret our findings. The presented analyses were also run controlling for verbal ability as measured by the WRAT. There were no changes in our findings; we do not present these additional analyses for the sake of brevity. Figure 6 is a quiver (velocity) plot of participant performance along two dimensions: performance on withinroute and between-route pointing. The arrow length and direction represent the scaled magnitude of change and the direction of change in pointing performance from T1 to T2 (down and to the left represent improvement). As is visually apparent, on average GIS participants (blue arrow) showed more improvement than COM participants (yellow arrow). To test this pattern statistically, we ran a constrained cluster analysis (number of clusters = 3), similar to that conducted in Study 1 and in previous studies (Weisberg & Newcombe, 2016;Weisberg et al., 2014). At T1, there was no significant relation between cluster-membership (integrators:non-integrators:imprecise navigators) between GIS (11:26:10) and COM (10:15:18) participants, χ 2 (2, N = 90) = 5.12, p = 0.08, Cramer's V = 0.24. At T2, a chi-square analysis found a significant difference in cluster membership between GIS (15:25:7) and COM (7:19:17) participants, χ 2 (2, N = 90) = 7.73, p = 0.02, Cramer's V = 0.29. GIS participants were more likely to be integrators and COM participants were more likely to be imprecise navigators. Overall, 60 participants (GIS = 33, COM = 27) recorded no change in cluster membership from T1 to T2, 17 participants (GIS = 10, COM = 7) demonstrated a positive change (i.e., they moved into a better navigator category from T1 to T2) and 13 participants (GIS = 4, COM = 9) recorded a negative change (i.e., they moved into a lower navigator category from T1 to T2). However, there was no significant difference in change in cluster membership between the two groups, χ 2 (2, N = 90) = 2.88, p = 0.24, Cramer's V = 0.18.

Role of mental rotation and spatial working memory in navigation skills
We found significant differences in mental rotation and spatial working memory between GIS and COM participants, which were parallel to differences on the navigation tasks. Hence, mental rotation and spatial working memory at T1 were assessed as potential mediators influencing the relation between course and navigation performance at T2, controlling for baseline performance at T1. However, neither MRT nor SWM were found to mediate the relation between course and pointing for SWM as a potential mediator. Thus, the differences in navigation performance are attributable to improvements on navigation-specific processes, and not small-scale spatial skill or spatial working memory.
Finally, we examined the interaction of all three variables: time, course and participant sex. For the within-route pointing task, there was a significant interaction, F(1,85) = 0.4.59, p = 0.04, partial η 2 = 0.05. There was no significant interaction for the between-route pointing, model-building, MRT and spatial working memory tasks.

Discussion
Study 2 addresses the bidirectional relation between GEO training and spatial thinking skills. We had three main hypotheses in Study 2. First, we hypothesized that GIS students will have significantly better spatial skills at baseline as compared to COM students. Our results indicate that some (but not all) baseline spatial skills of GIS students are better than those of COM participants. Thus, high-spatial individuals may be self-selecting to the high-spatial GIS discipline. However, differences in spatial skills were limited to the large-scale within-route pointing task and the small-scale mental rotation task. At baseline, GIS participants are not significantly better than COM participants at between-route pointing, which involves integrating the different routes in the VE, or at creating a map of the environment in the modelbuilding task. One explanation for this could be that our sample size for the two groups was too small to detect significant differences for small to medium effect sizes for these variables (d < 0.3, Cramer's V = 0.24). However, it is possible that students enrolling in the high-spatial GIS course may have some relevant spatial skills but that this advantage is non-existent or weak for exactly those kinds of abilities supported by thinking about spatial distributions and integrating them. Second, we hypothesized that GIS students will show greater improvement in spatial skills, specifically in navigation skills from T1 to T2 as compared to COM students. As hypothesized, GIS students recorded a significantly faster rate of improvement from T1 to T2 in the between-route pointing and model-building tasks compared to COM students. In addition, GIS students showed a significantly faster rate of improvement in small-scale mental rotation skills as compared to COM students. Interestingly, spatial working memory improved for both GIS and COM students from T1 to T2 but there were no significant differences in the rates of improvement. These gains may reflect retesting effects. Taken together, our findings not only suggest the predictive role of spatial skills in self-selection to the high-spatial GIS courses but support GIS as a potential tool for improving spatial skills. It is important to note here that results of the listwise deletion approach to missing data suggest that model-building performance should be interpreted with some caution.
Finally, we hypothesized that mental rotation and spatial working memory might mediate the relation between academic course and improvement in spatial skills. However, our data did not support those ideas; baseline small-scale mental rotation skills did not mediate the relation between academic course and improvement in large-scale navigational proficiency. This finding supports the dissociation between small-scale object-based and large-scale environmental space transformations (Hegarty, Montello, Richardson, Ishikawa, & Lovelace, 2006;Hegarty & Waller, 2004;Newcombe & Shipley, 2015). Perspective-taking skills seem to play a more important role than mental rotation skills in predicting navigational performance in the VE (Nazareth, Weisberg, Margulis, & Newcombe, in press;Schinazi et al., 2013) and should be investigated in research on GIS in the future. It is more surprising that baseline spatial working memory was not found to mediate the relation between academic course and improvement in large-scale navigational proficiency, because previous research has found both verbal and spatial working memories to correlate with navigation performance (Weisberg & Newcombe, 2016).
Why did GIS seem to improve mental rotation? Arguably, GIS technology engages small-scale spatial manipulations on a computer screen, which would explain improvements in small-scale mental rotation skill over the course of a semester. For example, an introductory GIS course may require a student to solve a social or management issue by creating a graphical representation, using computer software (e.g., create a map using geographic information) and analyzing spatial patterns. In contrast, the introductory communication course for the control group may involve discussions on, but not graphical visualizations of, social and strategic communication issues.
What about the GIS curriculum aids in the development of large-scale navigation skills? One explanation is that when GIS tools are used appropriately in the classroom, the technology improves the quality of learning by immersing a student in spatial analysis and making all geographic assumptions explicit through graphical visualizations (Meyer, Butterick, Olkin, & Zack, 1999). Interactive pattern learning coupled with the visual component of GIS facilitates the understanding of the underlying geographic and spatial principles, and consequently can help in the development of spatial reasoning skills (Goodchild, 1993). In a way, GIS tools reduce the ambiguity associated with abstractions in scale, projections, geometry and topology (Bednarz & Ludwig, 1997;Self, Gopal, Golledge, & Fenstermaker, 1992). Even introductory GIS courses-like those in the current study-include large components of extrinsic-dynamic types of spatial relations and application and require students to develop GIS-based solutions to geographic (large-scale) modeling tasks. A focus on mapping principles, map overlays and cartography may further help develop perspective-taking skills, which consequently improves large-scale navigation proficiency. Of course, variations in the content and style of teaching GIS software at the university level could greatly influence improvement in large-scale navigation.

Conclusion
Spatial skills appear to be at the core of several scientific disciplines. However, there may be differences in the amount and type of spatial demands in the STEM fields. For example, physicists and geographers study phenomena that occur at different scales. There is growing evidence for dissociations between small-scale object-based spatial skills like mental rotation and large-scale perspective-taking and navigation skills through behavioral (Hegarty et al., 2010) and functional magnetic resonance imaging (Lambrey, Doeller, Berthoz, & Burgess, 2012) findings, allowing us to identify gaps in the literature linking spatial thinking to enrollment and success in the different STEM fields. One such gap is the study of large-scale navigation skills and its relation to training and expertise in the GEO fields. Existing literature focuses almost exclusively on small-scale spatial skills and therefore little is known about large-scale spatial skills like navigation, which may be particularly important for the GEO fields. The current study provides evidence for the link between large-scale navigational competence and geology training. In Study 1 we found that geologists not only report higher self-ratings but also demonstrate higher navigational competence in a VE than non-STEM undergraduates.
Spatial skills are malleable, and gains through spatial training are durable and transfer to other skills (Uttal et al., 2013). Thus, early improvement in large-scale and small-scale spatial skills may be one route toward increasing the STEM workforce overall, and addressing a potential factor responsible for the underrepresentation of women in STEM. However, we lack a sustainable spatial training model that can be integrated into classrooms with minimum disruption in existing school and university curricula; achieving this goal requires the assessment of spatial training tools and interventions that impact the relation between STEM and spatial skills. GIS software and courses present a viable spatial training tool that can be integrated into existing school and university curricula. The effective use of GIS to promote spatial thinking depends on our ability to understand the technology, its benefits and shortcomings and its relation to specific spatial skills. Although the relation between the field of geography and the development of a "spatial turn of mind" has received some attention (e.g., Albert & Golledge, 1999;Goodchild & Janelle, 2010;Oldakowski, 2001), particularly with regard to cartography or map-reading, there is a lack of research examining how the use of GIS tools may enhance spatial thinking skills (Britz & Webb, 2016). From a cognitive perspective, the lack of systematic empirical research examining the effectiveness of GIS training in improving spatial thinking makes it difficult to identify how spatial skills are impacted by new spatial visualization software. We are already beginning to see the benefits of geospatial curriculum at the school level on small-scale spatial thinking (Jant, Uttal, & Kolvoord, 2014). The current article extends the literature on the benefits of GIS training to large-scale navigational skills at the university level. In Study 2, we found that novice GIS students show higher baseline mental rotation skills and, to some extent, navigational skills. However, over the course of an academic semester, GIS students improve at a faster rate than non-STEM undergraduates in both large-scale and small-scale spatial skills.
In conclusion, the current study broadens our understanding of the relation between spatial skills and STEM fields to a hitherto underinvestigated type of spatial reasoning-navigation skills. Logically, large-scale spatial skills involved in navigation should be related to STEM fields like the GEO disciplines that involve encoding and transformation of geographical and environmental space, and the current study empirically supports the bidirectional nature of this linkage.

Limitations
In Study 1, we were limited by the amount of testing time available with expert geologists and psychologists. As a result, we were unable to administer many small-scale and large-scale assessments. Geologists may not only be better on large-scale navigation skills but may also have superior mental rotation and perspective-taking skills, which mediate the relation between discipline and navigation performance. In the absence of these data, we were unable to test more complex statistical models of difference in spatial skills between the experimental and comparison groups. Secondly, the convenience afforded by a virtual navigation paradigm comes at the cost of important proprioceptive and vestibular cues and a limited field of view (FOV), which are important for navigation (Maguire, Burgess & O'Keefe, 1999;Richardson, Montello and Hegarty, 1999). Arguably, GIS students may simply have more experience using virtual interfaces, and in the absence of this advantage may not demonstrate better navigation skills as compared to communication students in a real-world environment. Finally, we only used one measure of spatial working memory (i.e., Symmetry span) and hence findings pertaining to working memory should be interpreted with caution.