The effect of landmark visualization in mobile maps on brain activity during navigation: A virtual reality study

Cheng, Bingjie; Wunderlich, Anna; Gramann, Klaus; Lin, Enru; Fabrikant, Sara I.

doi:10.3389/frvir.2022.981625

ORIGINAL RESEARCH article

Front. Virtual Real., 15 November 2022

Sec. Virtual Reality and Human Behaviour

Volume 3 - 2022 | https://doi.org/10.3389/frvir.2022.981625

This article is part of the Research Topic: Human Spatial Perception, Cognition, and Behaviour in Extended Reality

Enru Lin¹

¹Department of Geography and Digital Society Initiative, University of Zurich, Zürich, Switzerland
²Department of Biological Psychology and Neuroergonomics, Technische Universität Berlin, Berlin, Germany

The frequent use of GPS-based navigation assistance has been found to negatively affect spatial learning. Displaying landmarks effectively while providing wayfinding instructions on such services could facilitate spatial learning because landmarks help navigators to structure and learn an environment by serving as cognitive anchors. However, simply adding landmarks on mobile maps may tax additional cognitive resources and thus adversely affect cognitive load in mobile map users during navigation. To address this potential issue, we designed the present experiment to investigate how the number of landmarks (i.e., 3 vs. 5 vs. 7 landmarks), displayed on a mobile map one at a time at intersections during turn-by-turn instructions, affects spatial learning, cognitive load, and visuospatial encoding during map consultation in a virtual urban environment. Spatial learning of the environment was measured using a landmark recognition test, a route direction test, and Judgments of Relative Directions (JRDs). Cognitive load and visuospatial encoding were assessed using electroencephalography (EEG) by analyzing power modulations in distinct frequency bands as well as peak amplitudes of event-related brain potentials (ERPs). Behavioral results demonstrate that landmark and route learning improve when the number of landmarks shown on a mobile map increases from three to five, but that there is no further benefit in spatial learning when depicting seven landmarks. EEG analyses show that relative theta power at fronto-central leads and P3 amplitudes at parieto-occipital leads increase in the seven-landmark condition compared to the three- and five-landmark conditions, likely indicating an increase in cognitive load in the seven-landmark condition. Visuospatial encoding was indicated by greater theta ERS and alpha ERD at occipital leads with a greater number of landmarks on mobile maps.
We conclude that the number of landmarks visualized when following a route can support spatial learning during map-assisted navigation but with a potential boundary—visualizing landmarks on maps benefits users’ spatial learning only when the number of visualized landmarks shown does not exceed users’ cognitive capacity. These results shed more light on neuronal correlates underlying cognitive load and visuospatial encoding during spatial learning in map-assisted navigation. Our findings also contribute to the design of neuro-adaptive landmark visualization for mobile navigation aids that aim to adapt to users’ cognitive load to optimize their spatial learning in real time.

1 Introduction

1.1 Landmarks and navigation

Imagine that your new colleague at work asks for directions from your workplace to the main train station. You will probably give route directions such as, “go straight until you see a school, then turn left,” or “turn right at the church.” Schools and churches are examples of prominent features in an environment known as landmarks. Landmarks are defined as “geographic objects that structure human mental representations of space” (Richter and Winter, 2014). They mark a salient feature in the environment and serve as points of reference that allow for spatial orienting and structuring of the environment (Presson and Montello, 1988). Ample evidence has shown that landmarks facilitate wayfinding efficiency (Wenig et al., 2017; Yesiltepe et al., 2021) and spatial memory of environments (Credé et al., 2019; Ligonnière et al., 2021).

Despite the long-standing literature on the importance of landmarks for human navigation (Richter and Winter, 2014), existing mobile navigation systems typically do not directly refer to landmarks when providing turn-by-turn wayfinding directions. The omission of landmarks in navigation systems could be one reason why navigation systems are often found to negatively affect spatial learning (Parush et al., 2007; Anacta et al., 2017; Wenig et al., 2017; Ligonnière et al., 2021). When guided by turn-by-turn directions, navigators tend to passively follow the given route shown on mobile maps and do not actively make spatial decisions (Fenech et al., 2010; Clemenson et al., 2021). Turn-by-turn directions drive navigators’ attention away from environmental features and lead to divided attention between navigation-assistive devices and the traversed environments (Gardony et al., 2013, 2015). Guided by such navigation devices, navigators are thus not supported in the active cognitive investment of encoding environmental information such as landmarks, route directions, and spatial relations of landmarks in the environment into memory (Parush et al., 2007; McKinlay, 2016; Dahmani and Bohbot, 2020; Sugimoto et al., 2021). As a consequence, overreliance on navigation systems may be detrimental to users’ spatial skills (Ishikawa, 2019; Ruginski et al., 2019). Considering the increasing reliance of navigators on mobile map applications in a digital age (Zenrin, 2017; Ishikawa, 2019), the importance of spatial learning abilities in healthy aging (Merhav and Wolbers, 2019; Ramanoël et al., 2022) and for education (Uttal and Cohen, 2012), there is a need to counter the negative effects of using GPS-based navigation systems on users’ spatial learning.

The inclusion of landmarks in mobile maps for pedestrian navigation has been proposed to counter the negative effects of using GPS-based navigation systems on users’ spatial learning (Raubal and Winter, 2002; Duckham et al., 2010; Liu et al., 2022). Indeed, Wunderlich and Gramann (2021) showed that pedestrian navigation assistance that presents landmarks at intersections with verbal directions improved navigators’ spatial knowledge acquisition compared to standard navigation instructions in mobile applications that communicate turn-by-turn directions using metric distance information (e.g., “turn right in 200 m”). However, few studies have empirically examined the effects of landmark visualization in mobile maps on wayfinders’ spatial learning (Li, 2020; Münzer et al., 2020). Although landmarks can help users to process their environments for learning, processing them also requires cognitive resources. The depiction of landmarks on mobile maps could therefore increase users’ cognitive load during an already demanding wayfinding task, especially in unfamiliar environments (Montello, 2005; Farr et al., 2012). Hence, excessive landmark information on mobile maps may lead to cognitive overload, considering users’ limited cognitive capacity (Baddeley, 2003). Cognitive overload may not only diminish the benefits of displayed landmark information in spatial learning but may also lead to decreased navigation efficiency (Münzer et al., 2012) and/or failure to acquire accurate spatial knowledge (Wen et al., 2013). Therefore, it is important to investigate how landmarks displayed on mobile maps affect cognitive load during navigation and spatial learning.

1.2 Cognitive load

Cognitive load is defined as the total amount of resources being used in information processing in the present task (Sweller, 1988; Sweller et al., 1998; Baddeley, 2003). Cognitive resources are available for three types of cognitive load: 1) intrinsic cognitive load, which is associated with the nature of the task itself; 2) extraneous cognitive load, which arises when cognitive resources are used for irrelevant information; and 3) germane cognitive load, which is characterized as learning relevant information (Sweller et al., 1998). The literature on cognitive capacity suggests that learning performance plateaus (or even drops) when the number of learning items exceeds learners’ limited cognitive capacity, typically four items (or chunks) (Luck and Vogel, 1997; Cowan, 2001; Baddeley, 2003; Alvarez and Cavanagh, 2004). Navigators’ cognitive load is thus an essential aspect to consider when visualizing landmarks on mobile maps; the number of landmarks displayed should optimize spatial learning and wayfinding performance without leading to overload.

Based on the previous literature on working memory capacity, we selected three, five, and seven landmarks as manipulations of low, medium, and high cognitive load in the present study, respectively. We used these three sets of landmarks as a starting point to investigate the potential impact of the number of landmarks shown on mobile map displays and how they affect cognitive load and spatial learning. Prior research that investigated cognitive load during navigation commonly used a dual-task paradigm where participants complete a working memory task while learning an environment during navigation. Cognitive load in such paradigms is assessed by measuring the impact of the secondary (working memory) task on performance in the primary spatial learning task (Meneghetti et al., 2021) and/or by using self-reports (e.g., NASA-Task Load Index, Hart and Staveland, 1988). However, self-reports are typically administered after participants have completed the navigation tasks and therefore do not capture cognitive load in real time, while dual-task paradigms interrupt the spatial learning and navigation process.

We thus turned to electroencephalography (EEG), an established method to measure human electrocortical activity that allows for assessing brain dynamic features that might reflect cognitive load without interrupting the navigation task at hand. EEG measures neural activity with high temporal resolution (milliseconds) and thus captures the neural dynamics accompanying cognitive processes, such as spatial orienting (Gramann et al., 2010, 2021), spatial learning (Gehrke et al., 2018), visual processing (Wang et al., 2018), and memory processing (Onton et al., 2005; Maurer et al., 2015). Therefore, EEG is a more sensitive way to capture cognitive processes and their subcomponents (Cohen, 2014), compared to behavioral assessments and introspective self-reports.

Previous EEG studies have shown that one’s level of cognitive load is associated with power modulations in distinct frequency bands (e.g., Klimesch, 1999). Changes in EEG power reflect changes in synchronization of neuronal activity at different frequencies. Several frequencies in the EEG power spectrum, most notably the theta (4–8 Hz) and alpha (8–12 Hz) bands, have been associated with spatial navigation (Kahana et al., 1999; Bohbot et al., 2017; Do et al., 2021), memory processes (Klimesch, 1999; Klimesch et al., 2008; Sauseng et al., 2010), and attention (Pennekamp et al., 1994; Gevins and Smith, 2003; Sauseng et al., 2005; Doesburg et al., 2009). Specifically, a large body of research has found that theta frequency band power recorded over the frontal cortex increases in response to stimulus presentation (i.e., event-related synchronization, ERS), signaling increasing load during cognitive tasks (Smith et al., 2001; Jensen and Tesche, 2002; Gevins and Smith, 2003; Scharinger et al., 2017; Ratcliffe et al., 2022). ERS in the theta band in frontal regions has been proposed to reflect cognitive resource expenditure and to relate to the integration and control of a variety of cognitive processes, such as visuospatial and verbal working memory (Sauseng et al., 2010). Previous research has also found that alpha frequency band power (8–12 Hz) in parietal regions decreases (i.e., event-related desynchronization, ERD) with increasing cognitive load (Stipacek et al., 2003; Doesburg et al., 2009). Decreased alpha power in parietal regions may indicate individuals’ maintenance of attention and working memory toward a focal task (Pfurtscheller et al., 1996; Sauseng et al., 2005; Fukuda et al., 2015) and a higher state of arousal (Carp and Compton, 2009).

Cognitive load also manifests through the modulation of components in event-related potentials (ERPs). ERPs refer to averaged time-varying EEG activity that is time-locked to a particular event during a task (Fu and Parasuraman, 2006). Of the ERP research on cognitive load, a large body has investigated the P3 component (Kok, 2001; Fu and Parasuraman, 2006; Polich, 2007; Ghani et al., 2020). The P3 component is a relatively large and slow positive deflection that appears approximately 300–800 ms after stimulus onset (Watter et al., 2001). The P3 has a maximum amplitude over the posterior cortex (Kok, 2001; Polich, 2007). Prior research has suggested that P3 amplitude reflects the demands of a task on cognitive resources (e.g., attention, working memory) and indicates the cognitive load of the task (Kok, 2001). Moreover, previous studies have shown that greater task complexity, higher stimulus complexity, and a greater overall amount of information provided to participants lead to more pronounced P3 amplitudes (Kok, 2001; Watter et al., 2001; Polich, 2007; Ghani et al., 2020). For example, in single-task/attention focus paradigms (Van der Stelt et al., 1998) and in focused compared to divided attention (Heinze et al., 1990), P3 amplitude was greater when participants paid attention to target stimuli compared to unattended stimuli. In dual-task paradigms, the P3 amplitude evoked by a secondary task is typically reduced when the difficulty of a primary task is increased, indicating a reallocation of processing resources away from the secondary task to the primary task (Kramer et al., 1991; Watter et al., 2001).

Taken together, the findings of previous studies using EEG point toward several possible indices of cognitive load. To investigate the changes in cognitive load affected by the number of landmarks, we thus examined theta oscillations at fronto-central leads, alpha oscillations at parieto-occipital leads, and P3 amplitude over the parieto-occipital cortex.

1.3 Visuospatial encoding

Considering that landmarks seen in the environment and visualized on maps include both visual features (e.g., color, shape, and texture) and spatial features (e.g., geometries, distance, and density) that are essential for navigation support, it is particularly important to assess navigators’ visuospatial information processing during map-assisted navigation. Psychophysiological studies have suggested that theta oscillations in the occipital and parietal regions underlie visual and spatial information encoding, respectively. Increased theta oscillations in the occipital region have been frequently implicated in visual processing and selective visual attention (Gladwin and De Jong, 2005; McDermott et al., 2017; Wang et al., 2018). Similarly, increased theta oscillations in the parietal region have been found during computerized spatial tasks as well as spatial navigation tasks in naturalistic virtual environments (Plank et al., 2010; Delaux et al., 2021; Do et al., 2021). Research on alpha oscillations in the occipital region during visuospatial processing shows the opposite: alpha power has been found to decrease with increased visual attention (Klimesch et al., 1998; Nelli et al., 2017; Wang et al., 2018; Delaux et al., 2021).

ERP studies have found that early visual attention can modulate P1 (80–120 ms) amplitude (Fu and Parasuraman, 2006) in posterior areas as measured at occipital and parietal leads. The posterior P1 is related to early visual encoding of visual stimuli presented to viewers, and its amplitude in the occipital region has been shown to increase with greater visual attention allocation to these visual stimuli (Luck et al., 1990; Hillyard and Anllo-Vento, 1998; Awh et al., 2000; Handy et al., 2001; Fu and Parasuraman, 2006). To further investigate whether and how the number of landmarks on mobile maps affects visuospatial encoding, we thus examined theta and alpha oscillations as well as P1 amplitude at occipital sites.

1.4 Navigation in virtual reality

Recording EEG during navigation in the real world is challenging, as EEG experiments usually require markers for event-related analyses of the signal. There is also little control over stimulus context and presentation (e.g., participants’ familiarity with the environment, mobile map consultation) and external environmental factors (e.g., traffic, weather) in real-world settings. We thus turned to virtual reality (VR), which reproduces real-life environments and can be used with EEG. VR displays three-dimensional (3D) dynamic images with high quality (Sanchez-Vives and Slater, 2005). VR can also integrate movement inputs by combining it with other interfaces, such as joysticks, foot pedals, or treadmills, which provides a more immersive experience and naturalistic sensory feedback (Gramann, 2013) compared to desktop-based experiments. With such unique characteristics, VR provides high ecological validity and is commonly employed in experiments that investigate navigation processes (Darken and Peterson, 2002). Indeed, studies have shown that spatial learning outcomes and cognitive load in virtual environments are fairly similar to spatial learning and cognitive load in the real world (Armougum et al., 2019; Clemenson et al., 2020; Pastel et al., 2022).

Furthermore, VR technology enables more control over experimental protocols. Researchers can create novel virtual cities with similar styles and manipulate augmented objects in the virtual environments while keeping other features in the environment constant. This is not possible in the real world. Therefore, in our current experiment, we employed VR technology to create three virtual urban environments with European-style architecture. We depicted three, five, and seven landmarks on mobile maps during navigation for each city. We also integrated VR technology with an EEG device to measure navigators’ brain activity during navigation in virtual environments.

1.5 The present study

We utilized a within-participant design with presentation of three different numbers of landmarks (three, five, and seven) to investigate how the number of landmarks displayed on a mobile map affects navigators’ 1) spatial learning, 2) cognitive load, and 3) visuospatial encoding when they were asked to follow a given route in an urban virtual environment. The current paper investigated cognitive load and visuospatial encoding in navigators and focused on electrocortical activity in the fronto-central, parieto-occipital, and occipital regions while they were viewing mobile maps. We tested the following hypotheses:

Spatial learning (H1): As the behavioral outcome of the load induction of landmark depiction, we expected navigators’ spatial learning to be better when the number of landmarks depicted on the map along a given route increases from three to five. Further, we expected that depicting seven landmarks along the route would generally exceed navigators’ cognitive capacities and thus counter the benefit of showing landmarks on the mobile map. Therefore, participants’ spatial learning performance is not expected to further increase or may even decrease when the number of landmarks increases from five to seven as a result of increased cognitive load.

Cognitive load (H2): Following the findings of previous studies (e.g., Gevins and Smith, 2003), we expected that when participants view the mobile map, 1) theta ERS would increase at fronto-central leads; 2) alpha ERD at parieto-occipital leads would be more pronounced; and 3) P3 amplitude at parieto-occipital leads would increase along with increasing numbers of landmarks shown on the mobile map. This is because cognitive load increases when navigators have to process more landmarks.

Visuospatial encoding (H3): We expected theta ERS in posterior leads to increase, and P1 amplitude and alpha ERD in occipital leads to be more pronounced during map viewing. This is because navigators need to encode more visuospatial information in the brain with increasing numbers of landmarks displayed on the mobile map.

2 Materials and methods

2.1 Apparatus and materials

We used a three-sided cave automatic virtual environment (CAVE) to simulate stereoscopic vision with frame sequential projection (1,280 × 800 pixel resolution at 120 Hz). Participants navigated through three virtual urban environments at 3.8 m/s using a foot-operated controller (3D Rudder, Aix-en-Provence, France). Tilting the controller with their feet towards the front and back resulted in forward and backward translation, respectively. Tilting the controller towards the left and right resulted in left and right rotation, respectively. Participants were allowed to turn their heads when navigating along the route. Six cameras in the CAVE tracked participants’ head direction through the donned 3D glasses, and the perspective view in the virtual cities changed according to participants’ head direction. The three city models used for navigation were developed using ArcGIS City Engine 2018 (Esri, CA, United States). We employed three European-style urban environments including low-rise buildings with heights between 5 m and 25 m, streets, trees, and open spaces (Figure 1). The sizes of the three cities are 426,562 m², 510,910 m², and 516,868 m², respectively. The experimental tasks were rendered using Unity 3D 2018.4 LTS (Unity Technologies; San Francisco, CA, United States) and MiddleVR for Unity 1.0 (Truchtersheim, France).

FIGURE 1. (A) Bird’s eye view of one of the virtual cities; (B) participants’ view of the environment during navigation; and (C) a participant wearing stereoscopic goggles, seated on a chair approx. 30 cm away from the center of the VR system (CAVE). The participant placed her feet on a foot-operated controller, which allowed her to navigate through the environment, and was connected to the EEG during the navigation experiment. The virtual city displayed on the CAVE screen was in mono-mode to present a first-person perspective of the virtual environments during navigation.

2.2 Study design

Participants completed three within-participant landmark conditions (three, five, and seven landmarks), which were visualized on a digital map while they navigated along a given route through three different virtual urban environments, respectively. The three navigation routes each consisted of five intersections with lengths between 800 m and 900 m. Each route further contained seven salient buildings that served as landmarks: the starting building (home), five landmarks at the five intersections, and the destination building (goal). Displaying visually salient buildings (e.g., of varying sizes, shapes, and colors; Itti and Koch, 2001) on mobile maps can help navigators to identify the buildings easily in the environment (Kapaj et al., 2021). Including visually salient buildings as landmarks on mobile maps can help reduce uncertainty and confusion, should there be a conflict between transient landmarks and turn directions displayed on the mobile maps (Gardony et al., 2015; Tenbrink et al., 2020).

In the 3-landmark condition, the start building, destination, and the salient building at the third intersection were displayed on the map (Figure 2A). In the 5-landmark condition, the landmarks at the first and fourth intersections were visualized on the map in addition to the three landmarks in the 3-landmark condition (Figure 2B). In the 7-landmark condition, the landmarks at the second and fifth intersections were visualized on the mobile map in addition to the five landmarks in the 5-landmark condition (Figure 2C). The selections of landmark positions for each landmark condition were done to ensure that the distributions of landmarks along the route were equally spaced. The assignment of the three landmark conditions to the three cities was counterbalanced. The order in which participants underwent the landmark conditions was also counterbalanced.
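The nesting of the three conditions can be written down as a small lookup, which makes it easy to see that each condition extends the previous one. This is only an illustrative sketch: the position names (`home`, `intersection_1`, ..., `goal`) and the function `displayed_landmarks` are hypothetical labels chosen here, not identifiers from the study's software.

```python
# Route positions in travel order: home, the five intersections, goal.
# All names are hypothetical labels used for illustration only.
ROUTE_POSITIONS = [
    "home", "intersection_1", "intersection_2", "intersection_3",
    "intersection_4", "intersection_5", "goal",
]

# Each condition adds two landmarks to the previous one, as described in the text.
CONDITIONS = {3: {"home", "intersection_3", "goal"}}
CONDITIONS[5] = CONDITIONS[3] | {"intersection_1", "intersection_4"}
CONDITIONS[7] = CONDITIONS[5] | {"intersection_2", "intersection_5"}


def displayed_landmarks(n_landmarks):
    """Return the positions whose landmark is drawn on the map, in route order."""
    shown = CONDITIONS[n_landmarks]
    return [position for position in ROUTE_POSITIONS if position in shown]
```

Calling `displayed_landmarks(5)` lists the start, the first, third, and fourth intersections, and the destination, matching the 5-landmark description above.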

FIGURE 2. The three different landmark conditions in three different virtual environments. Panels (A–C) depict the map conditions with three, five, and seven landmarks displayed on the mobile map, respectively.

2.3 Mobile map assistance

Along each navigation route, participants were shown a mobile map at specific points along the route: in the middle of a straight segment of the followed route where the next intersection was visible; shortly before the next intersection; and shortly after the past intersection, resulting in 17 views of the mobile map in total (Figure 3A). The mobile map was displayed in the center of the VR screen for 5 s. It showed the current intersection and the route direction oriented with participants’ heading direction. It provided turn-by-turn directions (i.e., a black line) and participants’ current location (i.e., a blue dot), following the design of current navigation system displays (Figure 3B). Depending on the landmark condition, the respective 3D landmark at the intersections was visualized on the mobile map (Figure 3B). The landmarks visualized on the map are exactly like the landmarks seen in the navigated environments, including their first-person viewing perspectives along the route. To reflect a real-life scenario of mobile map consultation (Brügger et al., 2019), the virtual urban environment faded away when the map was displayed, and participants’ navigation through the virtual environment was disabled. The 17 map-onset events were used for event-related EEG analyses (i.e., ERP and frequency band power analysis).

FIGURE 3. (A) Red dots along the black navigation route indicate the 17 map pop-up spots during navigation; (B) a mobile map that rotates along with the participant’s head direction, as seen by the participant at the location of the green dot in panel (A). The blue dot in panel (B) indicates the participant’s current location in the virtual city. The black line indicates the path the participant needs to follow. A 3D landmark building or an imprint of the building is shown on the map at a turning intersection, depending on the landmark condition.

2.4 Procedure

The experiment was conducted in German or English based on participants’ language preferences. After giving their informed consent, participants were introduced to the procedure of the experiment. Subsequently, they completed a questionnaire assessing their self-reported spatial and navigation abilities using the Santa Barbara Sense of Direction Scale (SBSOD; Hegarty et al., 2002) and their spatial orientation skills using the Perspective Taking/Spatial Orientation Test (PT/SOT; Hegarty and Waller, 2004) before they were connected to the EEG. Next, participants were given the instructions for the navigation task. Participants then practiced walking in a training virtual city with the 3D rudder and using the electronic pointing device to answer the spatial learning tests. After the training trial, once participants had no further questions, the main experiment started.

The main experiment consisted of three blocks. Each experimental block comprised a map-assisted navigation task and spatial learning tests. During the navigation phase, participants were asked to follow the route indicated on the map as quickly as possible to a specific destination, and to learn the landmarks along the route that were displayed on the map. In all three landmark conditions, participants were also told that some landmarks at the intersections that were not visualized on the map would be tested after navigation. Participants finished the navigation task when they arrived at the destination. We instructed participants to learn the landmarks and the environments intentionally, which simulates real-world scenarios when navigators attempt to learn the environments to make independent route decisions for future wayfinding.

After each navigation trial in each city, participants’ spatial knowledge acquisition, which was subcategorized as landmark knowledge, route knowledge, and survey knowledge (Siegel and White, 1975), was assessed with a landmark recognition test, a route memory test, and a judgment of relative direction (JRD) test, respectively (Figure 4).

FIGURE 4. Spatial learning tests: Panels (A), (B), and (C) illustrate how a participant responded in the landmark recognition test, the route direction test, and the JRD test respectively. All tests were carried out in the CAVE using a 3D pointing device.

After the main experiment, participants completed the Corsi block-tapping task (CBTT) on a computer to assess their visuospatial capacity.

2.5 Spatial learning tests and analyses

2.5.1 Landmark recognition test

Seven buildings that were seen along the route (i.e., starting building, destination building, and five buildings at the intersections) and that served as landmarks on the map, and seven buildings from the same traversed environment but not seen along the route were presented to participants. Participants were asked whether they had seen the building along the route and to answer with “yes” or “no” using an electronic pointing device (Figure 4A). Signal detection theory (SDT, Parks, 1966) was used to analyze participants’ landmark recognition performance (Huang et al., 2012; Wunderlich and Gramann, 2020, 2021). Buildings that served as landmarks along the route were considered as “signal” while buildings not seen along the route were considered as “noise.” D-prime (d’) indicates participants’ recognition discriminability where a higher d’ score reflects better discriminability in landmark recognition.
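As an illustration of the signal detection measure, d’ is conventionally computed as the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below uses only the Python standard library. The log-linear correction (adding 0.5 to each cell so that rates never reach 0 or 1) is an assumption made here for the example; the text does not state which correction, if any, the authors applied, and the function name `d_prime` is hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (0.5 added to each cell) keeps the
    rates strictly between 0 and 1; the source does not specify a correction.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Seven route buildings ("signal") and seven unseen buildings ("noise"):
perfect = d_prime(hits=7, misses=0, false_alarms=0, correct_rejections=7)
partial = d_prime(hits=5, misses=2, false_alarms=2, correct_rejections=5)
```

With seven signal and seven noise items per city, a participant who answers every item correctly receives a higher d’ than one who makes two misses and two false alarms, and equal hit and false-alarm rates yield a d’ of zero.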

FIGURE 5. Landmark and route learning improved when more than three landmarks were shown. No improvement was observed in JRD performance when more landmarks were shown. The means of d’, choice accuracy, and absolute response error in each landmark condition are presented in the three plots with the error bars representing the 95% CI of the mean.

2.5.2 Route direction test

For buildings to which participants answered “yes,” they were additionally asked which direction they took in reference to these buildings. Participants used the electronic pointing device to choose between “left,” “right,” “straight,” and “destination” (Figure 4B). “Destination” indicated that participants recognized the last building along the route, which signaled the end of the navigation task. This was explained to participants during the training trial. The paradigms of the landmark recognition and route direction tests were adapted from tests used in previous navigation studies (Huang et al., 2012; Wunderlich and Gramann, 2020, 2021). Performance on the route direction test was calculated as the percentage of correctly answered trials over the total number (seven) of landmarks.

2.5.3 Judgments of relative directions (JRDs)

The assessment of JRDs is a well-established method to assess acquired (metric) survey knowledge of navigators (Huffman and Ekstrom, 2018). Participants were asked to imagine standing at a first landmark while facing a second landmark and to point to a third landmark. For each JRD, participants pointed the electronic pointing device to the estimated direction of the third landmark and confirmed their decision by pressing a button. Each JRD consisted of three of the seven landmarks seen on the route (Figure 4C). Fourteen JRDs were pseudo-randomly selected out of the 35 possible trials. Each of the seven landmarks appeared six times across the 14 JRD trials. Pointing error was defined as the absolute angular difference between the estimated direction and the actual direction of a target landmark relative to a reference landmark (i.e., JRD error).
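The JRD error described above can be sketched in a few lines. The example assumes landmarks are given as 2D map coordinates and that angles are measured counterclockwise from the imagined heading; both conventions, and all function names, are assumptions made for illustration and are not taken from the study's analysis code. The key step is wrapping the difference so that the absolute error always lies between 0° and 180°.

```python
import math
from itertools import combinations

# Unordered triples of the seven landmarks: C(7, 3) = 35 candidate JRDs.
ALL_TRIPLES = list(combinations(range(7), 3))


def bearing_deg(origin, target):
    """Direction from origin to target in degrees (counterclockwise from +x)."""
    return math.degrees(math.atan2(target[1] - origin[1], target[0] - origin[0]))


def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def correct_jrd_angle(standing, facing, target):
    """Signed angle to the target relative to the imagined heading."""
    return wrap_deg(bearing_deg(standing, target) - bearing_deg(standing, facing))


def jrd_error(estimated_deg, standing, facing, target):
    """Absolute angular error (0 to 180 degrees) of one pointing response."""
    return abs(wrap_deg(estimated_deg - correct_jrd_angle(standing, facing, target)))
```

For example, a participant imagined at (0, 0) facing a landmark at (0, 10) should point 90° clockwise (−90° in this convention) to reach a landmark at (10, 0); a response of 170° then yields an error of 100°, not 260°, because of the wrap-around. The counts are also consistent: 14 trials with three landmarks each give 42 landmark slots, which is six appearances for each of the seven landmarks.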

2.6 Power analysis

We conducted a power analysis for a mixed-effect model prior to the experiment. We estimated a small-to-medium effect size (d = 0.3–0.5) for the three within-subjects conditions with 14 JRD trials in each condition. The power analysis suggested testing 50 participants to obtain a statistical power of 73% for a small effect and 89% for a medium effect, respectively.

2.7 Participants

We recruited 49 participants (29 females) between the ages of 18 and 35 years (M = 25.6 years, SD = 4.09) for this study. One participant reported a mental ailment during the experiment and requested to have their data excluded; this participant was therefore excluded from the analysis. All participants were reimbursed 30 CHF for their participation. We conducted this study in compliance with the ethical standards of the University of Zurich Ethics Board, the Swiss Psychological Society, and the American Psychological Association.

2.8 Electroencephalography (EEG)

2.8.1 Data collection

Participants’ brain activity was continuously recorded using a 64-channel EEG device with active electrodes (LiveAmp, Brain Products GmbH, Gilching, Germany). The impedance of the channels was kept below 10 kOhm. Electrodes were placed according to the extended 10% system (Oostenveld and Oostendorp, 2002). All electrodes were referenced to FCz with a ground electrode at Fpz. The EEG was recorded at a 500 Hz sampling rate. The raw EEG signal was streamed wirelessly via a Bluetooth adapter (UBT21) and was recorded continuously for the navigation task and the CBTT task. The EEG signal and all trigger events were synchronized using Windows Operating System’s interprocess communication (I.P.C.).

2.8.2 Data processing

The BeMoBIL pipeline (Klug et al., 2022) was used to preprocess and clean the EEG data in the MATLAB (Mathworks, Inc.) toolbox EEGLAB (Delorme and Makeig, 2004). We first downsampled the raw EEG data to 250 Hz, and then removed spectral peaks at 50 Hz, corresponding to power line frequency, using the ZapLine Plus function (Klug and Kloosterman, 2022). Then, we applied the automated rejection function clean_artifacts from EEGLAB to identify noisy channels with ten iterations. We removed channels that were detected as bad channels more than four times and interpolated them by spherical interpolation of neighboring channels and applied re-referencing to the common average. On the cleaned dataset, we performed an independent component analysis (ICA) using an adaptive mixture independent component analysis (AMICA) algorithm (Palmer et al., 2011). Subsequently, for each resultant independent component (IC), we computed an equivalent current dipole model using DIPFIT routines from EEGLAB (Oostenveld and Oostendorp, 2002). We used the ICLabel algorithm (Pion-Tonachini et al., 2019) with the default classifier to classify the resultant ICs as eye, brain, or other components and removed ICs that reflected eye movements with a probability of 70% or higher. Next, we applied a 1–30 Hz bandpass filter to remove higher-frequency signals that are not relevant to our analysis. After these preprocessing steps, we excluded the EEG data from one participant because of severe artifacts.

We corrected the event latencies in wireless synchronization according to the projector (33 ms) and EEG trigger (100 ms) latencies, and then extracted 17 map-onset epochs from the continuous data with a time window of 0–5 s with respect to map onset and with a pre-event baseline of −1 to 0 s. We then performed an automatic epoch artifact detection and rejection using the function pop_autorej in EEGLAB: Epochs that fluctuated more than ±80 μV were excluded (Duncan et al., 2009). We used a probability threshold of three standard deviations for the detection of improbable data. A maximum of 10% of total trials were rejected per iteration (five iterations in total). On average, we excluded 4.19% of all trials (0.7 out of 17 epochs) based on these criteria.
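The amplitude criterion can be illustrated with a short sketch. It covers only the ±80 μV threshold, not the probability-based criterion or the iterative procedure of pop_autorej. The function name `reject_by_amplitude`, the nested-list data layout, and the reading of "fluctuated more than ±80 μV" as any sample exceeding that absolute value are assumptions made for illustration.

```python
def reject_by_amplitude(epochs, threshold_uv=80.0):
    """Split epochs into kept and rejected lists.

    `epochs` is a list of epochs; each epoch is a list of channels and each
    channel is a list of samples in microvolts (a hypothetical layout).
    An epoch is rejected if any sample exceeds +/- threshold_uv.
    """
    kept, rejected = [], []
    for epoch in epochs:
        peak = max(abs(sample) for channel in epoch for sample in channel)
        (rejected if peak > threshold_uv else kept).append(epoch)
    return kept, rejected


# Toy data: two epochs with two channels each (values in microvolts).
clean_epoch = [[5.0, -12.0, 30.0], [8.0, 3.0, -20.0]]
noisy_epoch = [[5.0, -95.0, 30.0], [8.0, 3.0, -20.0]]
kept, rejected = reject_by_amplitude([clean_epoch, noisy_epoch])
print(len(kept), len(rejected))

# Reported rejection rate expressed as epochs out of 17.
mean_rejected_epochs = round(0.0419 * 17, 1)
print(mean_rejected_epochs)
```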

To examine the general effect of the number of landmarks on cognitive load while participants were consulting mobile maps during navigation, we averaged the map-onset events along each navigation route.

We selected the following regions of interest (ROIs): fronto-central (FC1, FCz, FC2), parieto-occipital (PO1, POz, PO2), and occipital (O1, Oz, O2) regions for the analyses in the time-frequency domain and ERPs. The ROIs and electrode clusters were chosen based on previous literature reporting maximal effects of cognitive load for theta-frequency band power in the fronto-central region and for alpha band power in the parietal-occipital region (Dong et al., 2015; Scharinger et al., 2017; Wei and Zhou, 2020), as well as visuospatial processing in the occipital and parieto-occipital regions (Handy et al., 2001; Wei and Zhou, 2020).

2.8.3 ERS/ERD analysis

We calculated frequency band power using the function spectopo from EEGLAB, which uses MATLAB’s pwelch function to calculate power spectral density (PSD). For PSD estimation, we used a 2-s Hanning window that led to a frequency resolution of 0.5 Hz to capture spectral changes in the EEG data. We set four frequency bands with the following frequency ranges: delta (1–3.9 Hz), theta (4–7.9 Hz), alpha (8–12.9 Hz), and beta (13–29.9 Hz), and computed the absolute spectra of the four frequency bands within the 0–5 s epoch following stimulus onset. To reduce inter-individual deviation, we computed relative power indices for each band (i.e., delta, theta, alpha, and beta) as power in a given frequency band relative to the entire bandwidth (i.e., 1–30 Hz) (Wang et al., 2015; Nishiyori et al., 2021) using the following formula:

Relative theta power = [absolute theta power/absolute power of (theta + alpha + beta + delta)]* 100
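As a minimal sketch of this formula, the snippet below sums a PSD into the four bands and expresses each band as a percentage of their total. The PSD values are made up, and the dictionary layout and the name `relative_band_power` are assumptions for illustration; the study itself used spectopo in EEGLAB.

```python
BANDS_HZ = {
    "delta": (1.0, 3.9),
    "theta": (4.0, 7.9),
    "alpha": (8.0, 12.9),
    "beta": (13.0, 29.9),
}


def relative_band_power(psd):
    """Return the relative power (%) per band from a PSD given as {freq_hz: power}."""
    absolute = {
        band: sum(p for f, p in psd.items() if low <= f <= high)
        for band, (low, high) in BANDS_HZ.items()
    }
    total = sum(absolute.values())
    return {band: 100.0 * power / total for band, power in absolute.items()}


# Toy PSD on a 0.5 Hz grid (2-s window -> 1 / 2 s = 0.5 Hz resolution),
# with a made-up 1/f-like shape purely for illustration.
psd = {k * 0.5: 1.0 / (k * 0.5) for k in range(2, 60)}
relative = relative_band_power(psd)
print({band: round(value, 1) for band, value in relative.items()})
```

Because each band is divided by the sum of all four bands, the relative values always add up to 100%, which is what makes them comparable across participants with different overall signal power.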

To obtain baseline power, we calculated relative power indices during the time before the navigation experiment started when participants were sitting on a chair and viewing a dark blue screen. We extracted baseline epochs with a length of 1 s from this pre-experiment phase. Baseline epochs had 200 ms overlap with subsequent epochs.

We then calculated ERD (negative values) and ERS (positive values) with respect to the pre-experiment baseline (Pfurtscheller and Lopes da Silva, 1999; Krause et al., 2000; Dong et al., 2015) for theta and alpha bands using the following formula:

ERD or ERS = (relative power during map-event - relative power during baseline)/relative power during baseline.
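The ERD/ERS formula can be written as a one-line function. The input values below are hypothetical and only show how the sign of the result separates ERS from ERD; the function name `erd_ers` is an assumption.

```python
def erd_ers(event_power: float, baseline_power: float) -> float:
    """Relative change of band power during the map event versus baseline.

    Positive values indicate ERS, negative values indicate ERD.
    """
    if baseline_power == 0:
        raise ValueError("baseline power must be non-zero")
    return (event_power - baseline_power) / baseline_power


# Hypothetical relative power values (in %), not taken from the study.
theta_change = erd_ers(event_power=24.0, baseline_power=20.0)  # increase -> ERS
alpha_change = erd_ers(event_power=27.0, baseline_power=30.0)  # decrease -> ERD
print(theta_change, alpha_change)
```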

2.8.4 ERP analysis

We corrected single-trial EEG data epochs with a pre-stimulus baseline from −200 to 0 ms. Based on visual inspections of ERP plots, we selected the following time windows: P1 (80–150 ms) at occipital sites and P3 (450–700 ms) at the parietal-occipital region for individual peak detection. Peak amplitude was calculated by taking the mean of the maximum peak value in the respective search windows and the neighboring +1 and −1 sample points (in total three data samples) (Wunderlich and Gramann, 2020).
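The peak measure described above (the maximum in the search window averaged with its two neighboring samples) might look like the sketch below. The waveform is synthetic, and the function name `peak_amplitude` and the handling of peaks at the epoch edge are assumptions rather than details from the study.

```python
def peak_amplitude(samples, times_ms, window_ms):
    """Mean of the maximum sample in the window and its two direct neighbors."""
    start, end = window_ms
    in_window = [i for i, t in enumerate(times_ms) if start <= t <= end]
    peak = max(in_window, key=lambda i: samples[i])
    # Clip the neighborhood at the edges of the epoch.
    neighbors = range(max(peak - 1, 0), min(peak + 1, len(samples) - 1) + 1)
    return sum(samples[i] for i in neighbors) / len(neighbors)


# Toy baseline-corrected ERP sampled every 4 ms (250 Hz) from -200 to 796 ms,
# with a made-up triangular bump peaking at 100 ms.
times_ms = list(range(-200, 800, 4))
erp = [max(0.0, 5.0 - abs(t - 100) / 10.0) for t in times_ms]
p1_amplitude = peak_amplitude(erp, times_ms, window_ms=(80, 150))
print(p1_amplitude)
```

Averaging over three samples (12 ms at 250 Hz) makes the measure less sensitive to single-sample noise than taking the maximum alone.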

2.9 Statistical analysis: Multilevel linear regression

To assess the effect of the number of landmarks on cognitive load and visual processing, we analyzed theta ERS, alpha ERD, and the peak amplitudes of P1 and P3 in R, fitting a separate linear regression model for each parameter. The α level was set at 0.05 for all analyses. Multilevel/hierarchical modeling is a generalized form of regression analysis that enables hypothesis testing for nested study designs such as multiple trials within participants (Gelman, 2006) and for missing values in predictors (Fitzmaurice and Molenberghs, 2008).

We developed the multilevel models using the lme4 package in R version 4.0 (Bates et al., 2011). To identify the maximal appropriate random effects structure that would converge, we included by-participant and by-item intercepts and slopes in the random structure based on a within-participants design. Subsequently, we simplified the maximal random effects model until it converged. Our multilevel model can be described with the following equation:

$$\text{Theta ERS}_{ij} = \beta_{00} + \beta_{01} \cdot \text{Condition}_{j} + \mu_{0j} + \gamma_{ij}$$

The mixed-effect regression follows a hypothesis-driven confirmatory approach and models the effect of the number of landmarks on cognitive load measured by EEG and on spatial learning. We dummy-coded categorical variables (i.e., condition: number of landmarks) to 0 and 1 for each contrast. We then fitted separate models for each EEG feature (i.e., theta ERS, alpha ERD, and ERP component amplitudes) and spatial learning outcomes (i.e., landmark recognition, route direction, and JRDs).
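Dummy coding of the three-level condition factor can be sketched as below. The column names and the helper `dummy_code` are hypothetical, and obtaining the third pairwise contrast by changing the reference level is one common approach, assumed here rather than confirmed by the text.

```python
def dummy_code(condition: int, reference: int, levels=(3, 5, 7)) -> dict:
    """Return 0/1 indicator columns for every non-reference level."""
    return {
        f"cond_{level}": int(condition == level)
        for level in levels
        if level != reference
    }


# With 3 landmarks as the reference, the coefficients of cond_5 and cond_7
# estimate the "5 vs. 3" and "7 vs. 3" contrasts.
for condition in (3, 5, 7):
    print(condition, dummy_code(condition, reference=3))

# Re-coding with 5 landmarks as the reference gives a "7 vs. 5" indicator.
print(7, dummy_code(7, reference=5))
```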

3 Results

Participants spent on average 8.11 min (SD = 1.63 min) to navigate from the starting position to the destination in the three cities. There is no significant difference in navigation time between the three landmark conditions (ps > 0.507). In the following sections, we describe the results of spatial learning and EEG measures in detail.

3.1 Spatial learning performance

Overall, the 48 participants produced 144 d’ scores, 144 route direction scores, and 1981 JRD responses. Due to technical reasons in Unity, 35 JRD responses were lost. The mean d’ was 1.83 (SD = 0.76), the mean percentage of correct route direction choices was 62% (SD = 26%), and the mean of the absolute JRD response error was 72.64° (SD = 48.06°). For a complete overview of the results of the multilevel models, see Supplementary Table S1 in the supplementary materials.
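The reported counts follow directly from the design, as this short check shows. It assumes one route per landmark condition and 14 JRD trials per route, as described in the methods.

```python
participants = 48
conditions = 3            # one navigation route per landmark condition
jrd_trials_per_route = 14
lost_jrd_responses = 35

scores_per_test = participants * conditions
planned_jrd_responses = scores_per_test * jrd_trials_per_route
recorded_jrd_responses = planned_jrd_responses - lost_jrd_responses

print(scores_per_test, planned_jrd_responses, recorded_jrd_responses)
# prints: 144 2016 1981
```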

The multilevel regression models reveal significant effects of the number of landmarks on landmark recognition and route learning. For the landmark recognition task, recognition discriminability d’ increases by 0.51 when the number of landmarks increases from three to five [β = 0.51, 95% CI (0.30, 0.72), p < 0.001]. There is no further increase in landmark recognition discriminability from five landmarks to seven landmarks [β = −0.11, 95% CI (−0.32, 0.10), p = 0.31]. D-prime increases by 0.4 when the number of landmarks increases from three to seven [β = 0.40, 95% CI (0.19, 0.61), p < 0.001].

For the route direction choice task, we find a similar pattern for the number of landmarks. Route direction memory significantly increases by 12% on average from three landmarks to five landmarks [β = 0.12, 95% CI (0.57, 0.67), p < 0.001], and does not improve further from five landmarks to seven landmarks [β = −0.02, 95% CI (−0.09, 0.06), p = 0.71]. Route direction memory also significantly improves by 10% on average from three landmarks to seven landmarks [β = 0.10, 95% CI (0.02, 0.18), p = 0.01].

For the JRD performance, the linear mixed effect model revealed no significant main effect of the number of landmarks (ps > 0.68). Figure 5 displays the relationship between the number of landmarks and spatial learning (i.e., landmark recognition, route direction knowledge, and JRDs).

3.2 Event-related synchronization (ERS)/event-related desynchronization (ERD)

3.2.1 Cognitive load

The multilevel linear regression models reveal that theta ERS in the fronto-central region is significantly greater in the 7-landmark condition compared to theta ERS in the 3- and 5-landmark conditions [7 vs. 3: β = 0.07, 95% CI (0.01, 0.14), p = 0.026; 7 vs. 5: β = 0.10, 95% CI (0.04, 0.16), p = 0.002]. We do not observe significant differences in theta ERS in the fronto-central region between the 3-landmark and 5-landmark conditions [5 vs. 3: β = −0.03, 95% CI (−0.09, 0.04), p = 0.391].

We do not find statistically significant differences in alpha ERD in the parieto-occipital region between the three different landmark conditions [5 vs. 3: β = −0.03, 95% CI (−0.09, 0.02), p = 0.220; 7 vs. 3: β = −0.04, 95% CI (−0.10, 0.01), p = 0.120; 7 vs. 5: β = −0.01, 95% CI (−0.06, 0.05), p = 0.738].

3.2.2 Visuospatial encoding

Supporting our hypothesis, theta ERS in the parieto-occipital leads increases significantly with increasing numbers of landmarks [5 vs. 3: β = 0.06, 95% CI (0.01, 0.11), p = 0.027; 7 vs. 3: β = 0.15, 95% CI (0.10, 0.20), p < 0.001; 7 vs. 5: β = 0.09, 95% CI (0.04, 0.14), p = 0.001]. The same pattern of theta ERS is observed in the occipital region [5 vs. 3: β = 0.06, 95% CI (0.01, 0.11), p = 0.027; 7 vs. 3: β = 0.15, 95% CI (0.10, 0.20), p < 0.001; 7 vs. 5: β = 0.08, 95% CI (0.02, 0.14), p = 0.008].

Alpha ERD in the occipital region is smallest in the 3-landmark condition and significantly smaller than in the 5- and 7-landmark conditions [5 vs. 3: β = −0.09, 95% CI (−0.16, −0.03), p = 0.006; 7 vs. 3: β = −0.07, 95% CI (−0.14, −0.01), p = 0.031], which is again in line with our hypothesis. No significant difference is observed between the 5-landmark and 7-landmark conditions [7 vs. 5: β = 0.02, 95% CI (−0.05, 0.09), p = 0.541].

Figure 6 below depicts the averaged theta ERS and alpha ERD across the three landmark conditions. Supplementary Table S2 in the supplementary materials presents a complete overview of the results of the multilevel models for ERD/ERS.

FIGURE 6. Mean ERD/ERS values of the map-event window (i.e., 0–5 s) for mean fronto-central theta ERS and mean parieto-occipital alpha ERD indicating cognitive load changes (top panel), and mean parieto-occipital theta ERS, occipital theta ERS and occipital alpha ERD indicating visuospatial encoding (bottom panel). Error bars indicate ±1.96 standard error (i.e., 95% CI) of the mean. Means in the 3-landmark condition are depicted in black, serving as a baseline in each violin plot. Significant differences between means at p < 0.05 are shown with different colors within the same violin plot. Means presented in the same color within the same violin plot indicate no significant difference between the means at p ≥ 0.05.
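The error bars described in the caption correspond to the mean ± 1.96 standard errors. A minimal sketch of that computation, using invented values rather than study data and a hypothetical helper `ci95`, is:

```python
from statistics import mean, stdev


def ci95(values):
    """Return (mean, lower, upper) using mean +/- 1.96 * standard error."""
    m = mean(values)
    se = stdev(values) / len(values) ** 0.5
    return m, m - 1.96 * se, m + 1.96 * se


# Hypothetical ERS values for a handful of participants (not study data).
theta_ers = [0.12, 0.05, 0.20, 0.08, 0.15, 0.10]
m, lower, upper = ci95(theta_ers)
print(round(m, 3), round(lower, 3), round(upper, 3))
```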

3.3 Event-related potentials (ERPs)

3.3.1 Cognitive load

The linear mixed-effect models reveal that P3 amplitude in the parieto-occipital region in the 7-landmark condition is significantly greater than in the 3- and 5-landmark conditions, which is in accordance with our hypothesis. P3 amplitude increases by 143% on average from the 3-landmark to 7-landmark condition [7 vs. 3: β = 1.43, 95% CI (0.47, 2.39), p = 0.004] and by 139% on average from the 5-landmark to 7-landmark condition [7 vs. 5: β = 1.39, 95% CI (0.44, 2.35), p = 0.004]. We do not find a significant difference between the 3- and 5-landmark conditions [5 vs. 3: β = 0.04, 95% CI (−0.92, 1.00), p = 0.936].

3.3.2 Early visual encoding

We do not find any significant difference between the three conditions in P1 amplitude in the occipital region or in the parieto-occipital region (ps > 0.139).

Figure 7 plots the group-mean amplitude of the ERPs and the detected peak amplitude for each landmark condition. Supplementary Table S3 in the supplementary materials presents a complete overview of the results of the multilevel models of the ERPs.

FIGURE 7. (A,B) Grand averaged amplitudes of ERPs for each landmark condition at (A) parieto-occipital leads (PO3, POz, PO4), and (B) occipital leads (O1, Oz, O2). The ERP signals served as the basis for individual peak detection; areas shaded in gray indicate the time window where (A) the P3 (450–700 ms), and (B) P1 (80–150 ms) were extracted for each participant. (C,D) Violin plots display the distribution of detected peak amplitudes together with mean and ±1.96 standard error (i.e., 95% CI) in each landmark condition for (C) parieto-occipital P3, and (D) occipital P1. The line plot in the top panel and means in the bottom panel highlighted in purple indicate significant differences at p < 0.05.

4 Discussion

In the current paper, we set out to examine cognitive load and visuospatial encoding measured with EEG in a naturalistic, map-assisted urban navigation experiment carried out in virtual reality. We were driven by the research question of whether and how varying the number of landmarks displayed along a route on a mobile map during navigation would affect navigators’ spatial learning, cognitive load, and visuospatial encoding. We found that navigators’ landmark and route learning improve without neurophysiological indications of increased cognitive load; that is, we do not find a significant increase in frontal theta ERS or parieto-occipital P3 amplitude when the number of landmarks increases from the lowest number (three) to a medium number (five), which does not support our hypothesis of cognitive load. However, cognitive load increases significantly when the maximum assessed number of landmarks (seven) is shown on the mobile map, as indicated by increased theta ERS and increased P3 amplitudes of the event-related potential, which is in line with our cognitive load hypothesis. This might explain why we do not find further benefits in navigators’ spatial learning: Navigators experienced cognitive overload in the 7-landmark condition and could not improve their spatial learning beyond a medium level of cognitive load. The results partly support our initially stated hypotheses informed by prior research (based mostly on stationary studies using artificial stimuli) on cognitive capacity and learning, cognitive load, and visuospatial encoding in the context of navigation and wayfinding. In the following sections, we discuss our main results in more detail.

4.1 Spatial learning

Our current finding that spatial learning performance appears to plateau at five (medium number) landmarks shown on a mobile map in a map-assisted navigation task suggests that for most participants, their cognitive capacity was saturated at five landmarks, which is in line with our hypothesis of spatial learning. This finding is consistent with prior research on cognitive capacity suggesting that learning performance reaches a plateau after the learnt items exceed capacity (Luck and Vogel, 1997; Baddeley, 2003). In our current navigation study, following a given route through an environment is associated with intrinsic cognitive load, as the navigation task (i.e., controlling the foot pedal to steer body movement) had to be coordinated with the changing visual information provided by the environment and with the landmark learning task, thus rendering the navigation task a dual- or multiple-task setting. Increasing the number of depicted landmarks from three to five benefits spatial learning and possibly induces germane cognitive load due to the relevance of landmarks to the spatial learning task at hand. Depicting landmarks on mobile maps at intersections may make navigators more aware of, and more attentive to, landmarks at navigation decision points and landmark-based actions, which in turn leads to better spatial memory of the traversed environment (Gardony et al., 2013). However, increasing the depicted landmarks from five to seven does not further improve spatial learning, possibly due to overload in germane cognitive load.

The effects of the number of shown landmarks can be generalized to landmark and route knowledge but cannot be applied to survey knowledge. In our current study, participants’ JRDs demonstrate high pointing errors. JRD performance in the current study is poorer than the performance reported in most prior related studies (e.g., Zhang et al., 2014; Credé et al., 2020). One possible explanation for this is that navigating in a novel environment only once is not sufficient for most navigators to obtain reliable configurational layout knowledge (Frankenstein et al., 2012; Huffman and Ekstrom, 2018). Previous studies have found that JRDs improve significantly with increasing exposure to the environment (Zhang et al., 2014; Huffman and Ekstrom, 2018). Another possible explanation for the poor JRD performance is the lack of body-based cues during navigation in our current experiment, such as motor, vestibular, and proprioceptive cues (Gramann, 2013). All of these body-based cues are known to facilitate survey knowledge acquisition (Chrastil and Warren, 2013).

4.2 Cognitive load–theta ERS and P3 amplitude

Previous cognitive load research found a relationship between increasing task demands and increasing frontal midline theta ERS (Krause et al., 2000; Maurer et al., 2015; Scharinger et al., 2017). Similarly, we found that fronto-central midline theta increases with the number of landmarks visualized on a mobile map during navigation in visually complex urban VR environments.

Our finding that parieto-occipital P3 amplitude increases with increasing numbers of landmark depictions is also consistent with previous literature. It was shown that parieto-occipital P3 amplitude increases along with increasing demands on cognitive capacity (Kok, 2001; Polich, 2007; Scharinger et al., 2017; Wei and Zhou, 2020). Importantly, our findings on frontal theta power converge with those of parieto-occipital P3 amplitude. They are aligned with prior research demonstrating that more pronounced theta power occurs along with more pronounced P3 amplitude during cognitive tasks (Spencer and Polich, 1999; Scharinger et al., 2017; Wei and Zhou, 2020). By demonstrating a similar effect on parieto-occipital P3 amplitude and fronto-central theta ERS in the context of navigation and wayfinding, our research extends the existing literature on cognitive load and cognitive capacity, which is based mostly on studies conducted in highly controlled laboratory settings using simple stimuli. However, we did not observe differences in alpha ERD between the three landmark conditions. This could be attributed to conflicting relationships found between alpha ERD and cognitive load in recent years (Palva and Palva, 2007; Jensen and Mazaheri, 2010). It has been proposed recently that alpha ERS and alpha ERD might support two different cognitive mechanisms: attention orientation and attention maintenance, respectively (Capilla et al., 2014; Puma et al., 2018). As our current experimental design does not distinguish between these two mechanisms, they might have occurred at the same time in our extracted alpha power oscillations during map viewing, and the two processes might have canceled each other out.

Notably, theta ERS is determined not only by the amount of displayed information that needs processing but also by the cognitive effort being spent to complete a cognitive task (Onton et al., 2005). Similarly, P3 amplitude is related not only to intrinsic task demands, but also to the internal expenditure of cognitive resources on the task at hand (Näätänen, 1992; Kok, 2001). Our results in the context of navigation and wayfinding show no increase in cognitive load when the number of landmarks increases from three to five, suggesting that the medium number of chosen landmarks (five) does not tax additional cognitive resources, compared to showing the lowest evaluated number of landmarks (three). This pattern differs when the highest number of landmarks (seven) is shown. We found increased cognitive load, and thus that more cognitive resources were indeed consumed, compared to the other two landmark conditions.

Taking the findings of spatial learning outcomes and cognitive load together, it appears that showing five landmarks on a mobile map improves spatial learning performance compared to just showing three without significantly taxing additional cognitive resources. Presenting seven landmarks, and thus more information on a mobile map to navigators, however, does not further improve spatial learning, probably due to cognitive overload. Our research thus identified a potential boundary condition to the proposed benefit of visualizing landmarks in mobile maps on spatial learning: visualizing landmarks on maps benefits users’ spatial learning only when the number of visualized landmarks does not exceed users’ cognitive capacity during navigation. As such, this paper contributes to the ongoing debate in the field of map-assisted navigation by showing, on the one hand, that landmarks have an important role because they do facilitate spatial learning (Liao et al., 2017). On the other hand, we show that depicting increasing numbers of landmarks on mobile maps does not necessarily lead to corresponding increases in spatial learning, and that the role of landmarks in navigation and spatial learning should not be exaggerated (Montello, 2017). Future work can further examine other potential boundary conditions to the landmark effect on spatial cognition during navigation. To further ensure that our findings on EEG/ERP modulations reflect cognitive load in navigation contexts, future studies should also combine other instruments for measuring cognitive load, such as self-reports on cognitive load using the NASA-TLX questionnaire (Hart and Staveland, 1988) and pupil diameter assessments with mobile eye trackers (Krejtz et al., 2018).

4.3 Visuospatial encoding–theta ERS and alpha ERD

Our finding of increased theta ERS at parieto-occipital leads with increasing numbers of depicted landmarks, which is in line with our hypothesis of visuospatial encoding, replicates the results of previous literature on posterior theta oscillations during spatial navigation in urban virtual environments (Fischer et al., 2020; Do et al., 2021). Our results on theta ERS and alpha ERD at occipital leads are also congruent with the literature on visual stimuli encoding and visual attention (Wang et al., 2018; Delaux et al., 2021). These findings together suggest that when a greater number of landmarks is available for visuospatial encoding on a mobile map during navigation, brain activity related to visuospatial encoding increases similarly. However, an increase of available visuospatial information on mobile maps does not necessarily lead to an increase in cognitive load. This, therefore, extends the cartographic literature on how the presentation of visual information on maps affects map viewers’ visual processing demand (Garlandini and Fabrikant, 2009), cognitive load (Bunch and Lloyd, 2006), decision making (Korporaal et al., 2020), and spatial behavior (Brügger et al., 2019). Specifically, our findings provide further insights to the field of cognitive cartography regarding the way in which landmark depiction is visually processed by map users during map-assisted navigation, and how this visual information can assist navigation and spatial learning. Our findings on visuospatial processing of map information and cognitive load also contribute to the literature on mobile maps by emphasizing that different cognitive processes occur during map-assisted navigation (Montello, 2002; Lobben, 2004) and spatial learning (Allen, 2003). These two different cognitive processes should be considered separately when designing cognitively supportive mobile maps.

4.4 Limitations and future work

The current work provides the first evidence for the impact of the number of landmarks visualized in a mobile map on cognitive load and visuospatial encoding during navigation and wayfinding in virtual urban environments. The use of three vs. five vs. seven landmarks, based on classic studies on cognitive load theory (e.g., Baddeley, 2003), turns out to be a very useful stepping stone for further research on the role of landmarks in map-assisted wayfinding. The current findings need to be considered together with our specific navigation settings, including the route following task, the map style, the route length, the number of traversed intersections, and the chosen types of 3D landmarks (i.e., point features). Future studies following our paradigm could examine the relationship between spatial learning and cognitive load without depicting landmarks on a mobile map, which would resemble the state-of-the-art of available navigation systems, or when presenting more than seven landmarks. Another series of studies could, for example, examine how the length of a route to be followed affects cognitive load and spatial learning.

Based on our chosen experimental setup, we could assess only 17 map-onset events for each participant and per condition, even though we used a within-participant design to control for inter-subject variability in brain activity and to increase statistical power. Classic prior research with desktop setups has typically used a large number of repetitions of a given event (such as stimulus presentation) to measure cognitive processing. However, using such a method to increase the number of events for analysis can be challenging in naturalistic settings (Wascher et al., 2014). In the case of our pedestrian wayfinding paradigm, showing a mobile map repeatedly could directly interfere with navigators’ wayfinding performance (e.g., trying to avoid obstacles). To alleviate this problem, future studies could try using eye blinks as events, as they are self-generated by navigators (Wascher et al., 2014; Wunderlich and Gramann, 2020) and therefore do not interrupt the navigation task. Another option would be to employ a VR system coupled with an eye-tracking system that could be leveraged to generate large numbers of eye fixation events for a fine-grained cognitive load analysis while maintaining high ecological validity.

5 Conclusion and implications

Our current empirical research on the effect of landmark visualization on cognitive load and visuospatial encoding further exemplifies the important role of landmarks in map-assisted navigation and wayfinding. Specifically, varying the number of landmarks depicted on mobile maps used in a naturalistic VR navigation setting was found to affect spatial learning as well as cognitive load and visuospatial information encoding measured by EEG. Moreover, our findings also suggest a potential boundary effect to the proposed benefit of depicting landmarks on mobile maps during navigation on spatial learning. To support effective (i.e., accurate) spatial learning, a mobile map with a medium number of landmarks (i.e., five landmarks) seems to be optimal without overtaxing cognitive resources. In contrast, showing a low number of landmarks (i.e., three landmarks) compared to five landmarks leads to worse spatial learning. Showing a high number of landmarks (i.e., seven landmarks) compared to five landmarks does not further benefit spatial learning while taxing more cognitive resources. By examining the effect of increasing the number of landmarks depicted on mobile maps on cognitive load and visuospatial encoding in a naturalistic navigation task, the present research synthesizes the fields of navigation research, mobile map design, neuroergonomics, and spatial cognition with implications for the development of brain-machine interfaces used for navigation. Our findings also provide a starting point for the design of tailored navigation assistance systems that respond to users’ cognitive load to optimize spatial learning while still maintaining navigation efficiency.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the Ethics Board at the University of Zurich, Switzerland. The patients/participants provided their written informed consent to participate in this study.

Author contributions

BC, KG, and SF designed the study. BC performed data collection. BC performed data analysis. AW, KG, and EL assisted with data analysis. BC drafted the manuscript and all authors were involved in revising the manuscript. The authors read and approved the final manuscript.

Funding

This work was supported by the H2020 European Research Council (ERC) Advanced Grant GeoViSense (740426). https://cordis.europa.eu/project/id/740426. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments

We thank Armand Kapaj for his assistance in the data collection and Ian Ruginski for his advice on the experimental design and assistance with multilevel modeling.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frvir.2022.981625/full#supplementary-material

References

Allen, G. L. (Editor) (2003). Human Spatial Memory: Remembering Where (1st ed.). New York, NY: Psychology Press. doi:10.4324/9781410609984

Alvarez, G. A., and Cavanagh, P. (2004). The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychol. Sci. 15 (2), 106–111. doi:10.1111/j.0963-7214.2004.01502006.x

Anacta, V. J. A., Schwering, A., Li, R., and Muenzer, S. (2017). Orientation information in wayfinding instructions: Evidences from human verbal and visual instructions. GeoJournal 82 (3), 567–583. doi:10.1007/s10708-016-9703-5