- VTT Technical Research Centre of Finland, Espoo, Finland
Biosensing techniques are progressing rapidly, promising the emergence of sophisticated virtual reality (VR) headsets with versatile biosensing that enables an objective yet unobtrusive way to monitor the user’s physiology. Additionally, modern artificial intelligence (AI) methods provide interpretations of multimodal data to obtain personalised estimations of the user’s oculomotor behaviour, visual perception, and cognitive state, and they can even control, adapt, and create the virtual audiovisual content in real-time. This article proposes a visionary approach for personalised virtual content adaptation via novel and precise oculomotor feature extraction from a freely moving user and sophisticated AI algorithms for cognitive state estimation. The approach is presented with an example use-case of a VR flight simulation session, explaining in detail how cognitive workload, decreases in alertness level, and cybersickness symptoms could be estimated and modulated in real-time by using the proposed methods and embedded stimuli. We believe the envisioned approach will lead to significant cost savings and societal impact and will thus be a necessity in future VR setups. For instance, it will increase the efficiency of a VR training session by optimizing the task difficulty based on the user’s cognitive load, and it will decrease the probability of human errors by guiding visual perception via content adaptation.
Introduction
Head-mounted displays (HMDs) offer an unobtrusive platform for head-area sensing. In this article, the term head-area sensing denotes a virtual reality (VR) headset enabling unobtrusive monitoring of a variety of biosignals, such as eye and head movements, pupil size, heart rate, skin conductivity, and brain activity, providing versatile information on human physiology and psychophysiology. Existing VR headsets include built-in eye-trackers, and an increasing number of versatile biosensing capabilities are emerging. Further, by enabling VR content adaptation based on seamless real-time estimation of human visual perception and cognitive state, we could, for instance, analyse and modify human perception and attention allocation in a specific task (e.g., safety-critical monitoring tasks), optimize the learning and rehabilitation effect during a session, or even guide individual experience paths (e.g., in entertainment and tourism related use-cases).
In this article we envision why biosignal-based personalised virtual content adaptation should be a necessity in future VR systems. The functionality of such a system is presented with an example use-case of a VR flight simulation training session. We explain how signal and feature processing, adaptive classification, and decision making with virtual content parametrisations are composed to operate together as a sophisticated AI system. We illustrate three distinct occasions that would be most likely to affect the performance of the trainee: an increase in cognitive workload, a decrease in alertness level, and the onset of cybersickness symptoms. The primary focus here is on eye and oculomotor parameters, as well as head movements, given their crucial role in visual perception.
Since visual perception is tightly linked to the visual content, the discussion is limited to VR environments (in which the content is fully controllable). However, some of the protocols could be implemented in augmented reality (AR), especially in fully rendered mixed reality (MR), where the visual scene is known (Rauschnabel et al., 2022).
Perception is an active cognitive process we use to form our understanding of the complex and dynamic world around us. It involves receiving and processing sensory information selectively filtered by attention. Cognitive state, such as alertness, directly influences attention allocation and visual perception (Lim and Dinges, 2010). Therefore, knowledge on the person’s cognitive state is essential for understanding their perception and attentional processes, strategies, behaviour, and performance in specific tasks.
Currently, the user’s cognitive state, especially stress and cognitive workload, can be detected from biosignals with machine learning methods in constrained environments as a binary indicator (Giannakakis et al., 2022). VR experiments have achieved a similar goal by limiting the data to an integrated eye-tracker (Shadiev and Li, 2023) and by including wearable measurement devices (Weibel et al., 2023), (Tao et al., 2022), (Miltiadous et al., 2022) or custom hardware attached to the headset (Luong et al., 2020). However, external devices are more error-prone and introduce synchronization challenges compared to integrated sensors. Moreover, we have found that by combining multiple biosignals (such as electro-oculography, EOG; electroencephalogram, EEG; electrocardiogram, ECG; electrodermal activity, EDA) it is possible to achieve better performance in multiclass classification of stress and cognitive workload (Pettersson et al., 2020), (Tervonen et al., 2023). These points suggest that novel VR headsets with integrated biosensing will make it possible to deliver versatile and engaging stimuli in less constrained environments and to estimate cognitive states at a more fine-grained level.
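The multimodal, multiclass classification mentioned above can be illustrated with a minimal sketch. The snippet below is not the pipeline of (Pettersson et al., 2020) or (Tervonen et al., 2023); it only shows the general idea of early fusion, i.e., concatenating per-segment features from several modalities (EOG, EEG, ECG, EDA) and training one multiclass classifier. All feature counts and the synthetic data are illustrative placeholders.

```python
# Minimal sketch (not the authors' pipeline): early fusion of biosignal features
# and a multiclass cognitive state classifier on synthetic placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_segments = 300

# Hypothetical per-segment features from each modality.
eog = rng.normal(size=(n_segments, 4))   # e.g., blink rate, saccade rate, peak velocity
eeg = rng.normal(size=(n_segments, 6))   # e.g., band powers
ecg = rng.normal(size=(n_segments, 3))   # e.g., mean HR, HRV indices
eda = rng.normal(size=(n_segments, 2))   # e.g., tonic level, SCR count

X = np.hstack([eog, eeg, ecg, eda])      # early fusion: concatenate modalities
y = rng.integers(0, 3, size=n_segments)  # 3 classes, e.g., low/medium/high workload

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```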
In their extensive 2021 review, Halbig and Latoschik noted that although the integration of biosignals with VR applications is promising, there were no VR headsets with biosignal capabilities available at that time (Halbig and Latoschik, 2021). Now, 3 years later, many commercial manufacturers (e.g., OpenBCI, HP, Emteqlabs, LooxidLabs, Wearable Sensing) are increasingly integrating biosensing such as EOG, electromyogram, ECG, EEG, EDA, photoplethysmogram, and facial cameras, in addition to eye-tracking (video-oculography, VOG), into their headsets.
Existing knowledge of human perception and active vision has mainly been derived from experiments conducted in static 2D setups with a limited field-of-view (FoV), e.g., (Ugwitz et al., 2022), and the corresponding oculometric algorithms are also designed for such setups. While advancements in headset-integrated sensing and VR have opened new opportunities for studying the complexities of real-life perceptual experiences, e.g., (Agtzidis et al., 2019), (Haskins et al., 2020), (Merzon et al., 2022), they also enable us to update the oculomotor feature extraction algorithms to operate in more detail in new, more dynamic settings (large FoV, 360° VR, even 3D) with a freely moving person (Startsev and Zemblys, 2022).
In the near future, the user’s experience can be modified with personalised content adaptation in VR. This is made possible by combining headsets with time-synchronized VR content and biosignals, robust real-time oculomotor feature estimation in a dynamic 3D environment, and sophisticated real-time AI algorithms for cognitive state estimation. Examples of bio-feedback and content adaptation approaches already exist in the literature, such as (Miltiadous et al., 2022), (Halbig and Latoschik, 2021), (Qu et al., 2022). However, most of these examples present offline solutions or are related to use-cases involving meditation or relaxation.
We envision that personalised content adaptation will improve immersion and user experience, as well as yield more fine-grained knowledge of the user’s visual perception, cognitive state, and performance. Ultimately, the approach could enable behavioural changes, enhance and accelerate learning or rehabilitation, and even decrease the probability of human error. These, in turn, will lead to cost savings and a significant societal impact.
Personalised content adaptation
The functionality of the personalised content adaptation is presented with an example use-case of a VR flight simulation training session involving three cognitive state estimations: cognitive workload, alertness level, and cybersickness symptoms. The cognitive states are estimated by using all the biosignals obtained with biosensors integrated into a VR headset. Next, we describe how the eye and oculomotor features are estimated from a freely moving person in VR, what an embedded stimulus is and how it could be used in content adaptation, and the AI aspects of realising the real-time adaptation.
Oculomotor parameter estimation in VR
Figure 1 demonstrates the detection of eye and oculomotor features during a VR flight training session. The headset tracks the user’s head and eye movements, and concurrent feature extraction segments the time series into different eye and head movement events based on the signal changes.
Figure 1. Schematic overview of an illustrative VR flight simulation session where the headset tracks the user’s head movements (the light cyan signal) and eye movements with both VOG (the blue signal) and EOG (the black signal). Feature extraction classifies the head and eye movement events into different classes based on their types and concurrence: (A) a detailed illustration of a horizontal saccade, (B) an example of a simultaneous saccade and head movement followed by a vestibulo-ocular reflex (VOR), (C) a detailed illustration of a blink.
Both the head and the eyes are stable during fixations (Figure 1: light purple). While fixating, an object is held in foveal vision to gather information, and a longer fixation time may indicate, for example, deeper cognitive processing (Rayner, 1998). Fixation patterns, such as scan paths and fixation dispersion, contain information about the content and about how a person scans the visual field, which can change due to fatigue or neurological dysfunction (Shiferaw et al., 2018), (Cox and Aimola Davies, 2020). The eyes can maintain fixation on a moving object with smooth pursuit (Figure 1: cyan). Since smooth pursuit is difficult to perform without a moving target, it is discussed in more detail later with embedded stimuli.
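As one concrete (and simplified) example of how such fixation events could be extracted, the sketch below implements a dispersion-threshold style detector: gaze samples are grouped into a fixation while their spread stays under a threshold for at least a minimum duration. It is an illustrative assumption, not the feature extraction used in Figure 1, and the threshold values are placeholders.

```python
# Hedged sketch of dispersion-based fixation detection (I-DT style).
import numpy as np

def detect_fixations(x, y, fs, min_dur=0.1, max_disp=1.0):
    """Return (start, end) sample indices of fixations.
    x, y: gaze angles in degrees; fs: sampling rate in Hz;
    min_dur: minimum fixation duration (s); max_disp: dispersion threshold (deg)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)

    def disp(a, b):  # horizontal + vertical gaze spread within the window
        return (x[a:b].max() - x[a:b].min()) + (y[a:b].max() - y[a:b].min())

    min_len = int(min_dur * fs)
    fixations, i, n = [], 0, len(x)
    while i + min_len <= n:
        j = i + min_len
        if disp(i, j) <= max_disp:
            while j < n and disp(i, j + 1) <= max_disp:  # grow the window while compact
                j += 1
            fixations.append((i, j))
            i = j
        else:
            i += 1
    return fixations

# Example: two noisy fixations separated by an 8-degree gaze jump.
fs = 120.0
t = np.arange(0, 2, 1 / fs)
gaze_x = np.where(t < 1.0, 0.0, 8.0) + 0.05 * np.random.default_rng(0).normal(size=t.size)
gaze_y = np.zeros_like(gaze_x)
print(detect_fixations(gaze_x, gaze_y, fs))
```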
Gaze is shifted with fast eye movements, saccades (Figure 1: grey; Figure 1A), with or without a head movement. Smaller gaze shifts (<15°) are usually made without a head movement (Bahill et al., 1975a). In studies with a fixed head position, saccade parameters (e.g., rate, duration, peak velocity, main sequence (Bahill et al., 1975b)) have been shown to be sensitive to changes in cognitive state such as alertness level (Hirvonen et al., 2010), acute stress (Startsev and Zemblys, 2022), and cognitive load (Qu et al., 2022). We believe these parameters can be reliably estimated during the VR session and used to estimate the cognitive workload of the trainee during the flight simulation.
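For illustration, saccade parameters of this kind could be derived with a simple velocity-threshold detector, as sketched below: samples whose angular velocity exceeds a threshold are grouped into saccades, and each saccade yields an amplitude, duration, and peak velocity (the ingredients of the main sequence). The threshold and the synthetic gaze trace are illustrative assumptions, not part of the original work.

```python
# Hedged sketch of velocity-threshold saccade detection and main-sequence features.
import numpy as np

def saccade_features(gaze_deg, fs, vel_thresh=30.0):
    """gaze_deg: 1-D horizontal gaze angle (deg); fs: sampling rate (Hz).
    Returns a list of (amplitude_deg, duration_s, peak_velocity_deg_s)."""
    gaze_deg = np.asarray(gaze_deg, dtype=float)
    vel = np.gradient(gaze_deg) * fs          # angular velocity in deg/s
    above = np.abs(vel) > vel_thresh          # samples exceeding the threshold
    feats, i = [], 0
    while i < len(above):
        if above[i]:
            j = i
            while j < len(above) and above[j]:
                j += 1
            amp = abs(gaze_deg[j - 1] - gaze_deg[i])
            feats.append((amp, (j - i) / fs, np.abs(vel[i:j]).max()))
            i = j
        else:
            i += 1
    return feats

# Example: a smooth ~10-degree gaze shift around t = 0.5 s.
fs = 500.0
t = np.arange(0, 1, 1 / fs)
gaze = 10.0 / (1.0 + np.exp(-(t - 0.5) * 80))
for amp, dur, pv in saccade_features(gaze, fs):
    print(f"amplitude {amp:.1f} deg, duration {dur*1e3:.0f} ms, peak velocity {pv:.0f} deg/s")
```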
When the target object is far away, the head tends to continue moving after gaze fixates on the target, causing the eyes to make a compensatory movement via the vestibulo-ocular reflex (VOR) to keep the target in foveal vision (Figure 1: purple; Figure 1B). Given the coordinated nature of head and eye movements during attention allocation, the latency between eye and head movement, the amplitude ratio, and the direction of the movements can serve as indicators of the user’s performance, strategy, and cognitive state in VR. Moreover, dysfunction of the VOR (e.g., jerky eye movements instead of smooth ones) can indicate motion sickness and dizziness (Wallace and Lifshitz, 2016), (Clément and Reschke, 2018), (Biswas et al., 2024) and could be used to estimate cybersickness episodes during the flight training session.
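A very simplified way to quantify VOR performance is the gain between eye-in-head velocity and head velocity during head movement, where a well-functioning compensatory reflex gives a gain near 1. The sketch below is our illustrative assumption of such a computation, not a clinical VOR test; the velocity threshold is a placeholder.

```python
# Hedged sketch: VOR gain as the ratio of eye-in-head velocity to head velocity.
import numpy as np

def vor_gain(eye_deg, head_deg, fs, vel_thresh=20.0):
    """eye_deg: eye-in-head angle (deg); head_deg: head angle (deg); fs: Hz."""
    eye_vel = np.gradient(np.asarray(eye_deg, dtype=float)) * fs
    head_vel = np.gradient(np.asarray(head_deg, dtype=float)) * fs
    moving = np.abs(head_vel) > vel_thresh    # consider only clear head movement
    if not moving.any():
        return float("nan")
    # A compensatory eye movement mirrors the head; the minus sign flips it so
    # that a perfect reflex yields a gain of about 1.
    return float(np.mean(-eye_vel[moving] / head_vel[moving]))

# Example with a perfectly compensatory eye movement (gain ~ 1).
fs = 200.0
t = np.arange(0, 1, 1 / fs)
head = 20.0 * np.sin(2 * np.pi * 1.0 * t)     # 1 Hz head oscillation, 20 deg amplitude
print(vor_gain(-head, head, fs))
```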
Blink and pupil size parameters are robust to head movements and can thus be monitored throughout the VR session. Spontaneous eye blink rate (EBR), blink duration, and blink waveform parameters, as well as eyelid velocity and acceleration (see Figure 1: yellow; Figure 1C), are sensitive to changes in cognitive state, e.g., vigilance (Schleicher et al., 2008). EBR is also mediated by central dopaminergic activity and indicates cognitive performance (e.g., reinforcement learning and motivation, and cognitive flexibility) (Jongkees and Colzato, 2016). The luminance of the stimulus influences pupil size. Nevertheless, variations in pupil size can serve as an informative marker of cognitive state, especially excitement and engagement (Bradley et al., 2008), if the illumination remains constant or is known.
Eye and oculomotor reactions induced by embedded protocols
The visual scene guides the eyes and thus the oculomotor parameters. To obtain more versatile information on the user’s oculomotor behaviour and cognitive state, embedded stimuli could be included in the content at suitable moments. For instance, if the content includes a salient target, the user will most likely make a reflexive saccade towards the target, enabling the estimation of the saccade latency and of occasions when the target has been missed (Figure 1A). In addition, smooth pursuit can be induced by adding, e.g., a flying object to the flight simulation content. Dysfunction of the pursuit system could be an indicator of, e.g., fatigue (Stone et al., 2019) or cognitive impairment caused by alcohol (Tyson et al., 2021).
Simultaneous eye and head movement can be induced by implementing a large and rapid target movement across the FoV, prompting the user to execute a simultaneous head movement and saccade, potentially followed by a VOR (Figure 1B). With the help of such embedded stimuli, e.g., cybersickness symptoms can be monitored during the VR session.
Most of the presented eye parameters are under voluntary control and are thereby closely associated with the user’s motivation and engagement; for example, the user can voluntarily ignore the embedded target. However, in certain setups VR may incorporate stimuli that elicit startle responses. Startle responses (e.g., blink and pupil) are predominantly unconscious defensive reactions triggered by sudden or threatening stimuli, such as a loud noise or light flashes (Koch, 1999). The latency of the startle response reflects the functioning of the startle reflex controlled by the brainstem (Koch, 1999), providing insight into both affective and cognitive processes (Bradley and Sabatinelli, 2003).
AI aspects and cognitive state estimation
Head-area sensing in VR may benefit from AI in several ways, ranging from adaptive feature extraction, user cognitive state detection, and content adaptation to synthetic virtual content creation based on user preferences and assistance in reaching the VR session objectives by guiding the overall session management. Figure 2 presents a schematic overview of the data flow from sensing to personalised content adaptation.
Figure 2. Flow of head-area and biosignal data from sensing to calibration, feature extraction, modelling, and personalised content adaptation.
We illustrate the proposed approach (Figure 2) by using the example of estimating and tuning the trainee’s 1) cognitive load, 2) alertness level, and 3) cybersickness occasions during a VR flight simulator session. In this imaginary use-case, a user carries out a flight simulator session. The system operates in the background with the capabilities to run and personalise the VR content, perform automated feature extraction, and run state detectors to reach a certain overall objective set for a particular user.
The example: In the first phase of the session, the basic VR flight content is rendered, and separate parallel personalised models of cognitive workload, alertness, and cybersickness provide corresponding estimations based on the session objectives (e.g., steadily increasing cognitive load, maintaining the alertness level, and eliminating cybersickness). The content parameters of the first phase of the embedded protocol simulation are tuned for evaluating the individual model performance measures within each of the three cognitive state test cases. The cases can be run sequentially, or some of them can be combined. After the first phase, each of the models provides an individual state estimation (e.g., cognitive workload medium, alertness low, cybersickness high) for suggesting corresponding embedded protocol content parameter updates (derived from the measured physiology and oculometry together with model explanations). The content parameter updates are converted in the background into new simulation sequences (embedded protocol adaptation), which are then played (sequentially or combined) with fine-tuned estimation models in the second phase of the simulation. These steps can be repeated until the objectives of the training session are fulfilled.
More concretely, a flight simulation session has the objective of training a student to fly in various weather conditions and manage different malfunctions while flying from one location (A) to another (B). Here, personalised content adaptation optimises the task difficulty with embedded stimuli to maximize the learning effect. During a normal takeoff and climb to cruising altitude, the machine learning (ML) models for workload, alertness, and cybersickness are personalised. At the same time, cybersickness symptoms and vigilance level are checked. If, for instance, the system notices that the student’s low alertness level is not optimal for learning, stimulating elements are automatically added to the content (e.g., turbulence, a flock of birds). When the student’s alertness level has increased, the actual task can begin. The task difficulty is increased by adding embedded stimuli, for instance, increasing the crosswind. Afterwards, the student’s performance (e.g., correct altitude and heading) and cognitive workload are checked, and the task difficulty is increased automatically with various embedded stimuli (e.g., a malfunction or another weather condition) in order to keep the workload at an optimal level.
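The closed loop described above could be sketched as a simple rule-based policy mapping the three state estimates to content updates. Everything in the snippet (the StateEstimate fields, the target ranges, and stimulus names such as add_turbulence) is a hypothetical placeholder used only to make the control flow concrete; a deployed system would use learned, personalised models rather than fixed thresholds.

```python
# Hedged sketch of the adaptation loop: cognitive state estimates in,
# task difficulty and embedded stimuli out. All names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class StateEstimate:
    workload: float       # 0 (low) .. 1 (high)
    alertness: float      # 0 (low) .. 1 (high)
    cybersickness: float  # 0 (none) .. 1 (severe)

def adapt_content(est: StateEstimate, difficulty: float) -> tuple[float, list[str]]:
    """Return updated task difficulty and a list of embedded stimuli to insert."""
    stimuli = []
    if est.cybersickness > 0.6:
        # Safety first: calm the scene and reduce optical flow before anything else.
        return max(0.0, difficulty - 0.2), ["reduce_motion", "stabilise_horizon"]
    if est.alertness < 0.4:
        stimuli.append("add_turbulence")          # alerting event, e.g., flock of birds
    if est.workload < 0.3:
        difficulty = min(1.0, difficulty + 0.1)   # e.g., increase the crosswind
    elif est.workload > 0.7:
        difficulty = max(0.0, difficulty - 0.1)   # ease off to stay in the optimal band
    return difficulty, stimuli

# Example: a moderately loaded but drowsy trainee.
print(adapt_content(StateEstimate(workload=0.5, alertness=0.3, cybersickness=0.1), 0.4))
```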
As perceptual, attentional, and physiological processes are specific to each individual, modelling and content adaptation should account for individual differences. Personalisation requires some prior information, which is not available for a new user, i.e., when a cold start occurs. Since new users will likely require a short period of time to get used to the setup, an AI-assisted calibration process can be run while collecting the required baseline information to calibrate the setup and personalise the cognitive state detection.
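One simple ingredient of such personalisation, shown below as an assumption rather than the authors' calibration method, is normalising each extracted feature against the user's own baseline collected during the familiarisation period, so that downstream state detectors operate on user-relative deviations instead of absolute values.

```python
# Hedged sketch: baseline (z-score) calibration of features against a user's own data.
import numpy as np

class BaselineCalibrator:
    def fit(self, baseline_features: np.ndarray):
        """baseline_features: (n_segments, n_features) from the calibration period."""
        self.mean_ = baseline_features.mean(axis=0)
        self.std_ = baseline_features.std(axis=0) + 1e-9  # avoid division by zero
        return self

    def transform(self, features: np.ndarray) -> np.ndarray:
        """Return user-relative (z-scored) feature values."""
        return (features - self.mean_) / self.std_

# Example with a synthetic heart-rate-like feature baseline (three features).
rng = np.random.default_rng(1)
calib = BaselineCalibrator().fit(rng.normal(loc=60, scale=5, size=(30, 3)))
print(calib.transform(np.array([[75.0, 62.0, 58.0]])))  # deviation from this user's baseline
```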
Biosignal events and the corresponding extracted features are computed based on the visual scene, the task at hand, the presence of head movements, and the temporality of the physiological phenomenon behind the biosignal feature. For instance, blinks occur every 3.5 s on average, making the calculation of blink rate over any shorter segments pointless. Moreover, cognitive responses and oculomotor behaviour emerge with varying velocity and duration, making some of them useful for fast-paced content adaptations and others more relevant for longer-term domain-specific applications. Essentially, adaptive feature sampling and segmentation involve three distinct time frames.
i) Super-fast reactions (a few seconds at most), such as panicking, which should be caught for fast safety-critical interventions.
ii) Short reactions (about a minute), such as cognitive load and acute stress, detectable from, e.g., ECG, EDA, and eye parameters.
iii) Slow reactions (3–10 min or more), such as flow and engagement, which require slower interventions.
These interventions largely consist of the addition of embedded stimuli, such as modifying the visual scene to show only essential information, although switching to an automated operating mode might also be needed in some cases and applications. Reactions to the interventions are monitored and fed back as input in the feedback loop.
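To make the three time frames concrete, the sketch below summarises a single feature stream over sliding windows of a few seconds, about a minute, and several minutes in parallel; the window lengths and the 4 Hz example stream are illustrative assumptions only.

```python
# Hedged sketch of multi-timescale feature segmentation over one feature stream.
import numpy as np

WINDOWS_S = {"super_fast": 3, "short": 60, "slow": 300}  # illustrative window lengths (s)

def windowed_means(signal: np.ndarray, fs: float) -> dict[str, float]:
    """Return the mean of the most recent window for each time frame."""
    out = {}
    for name, seconds in WINDOWS_S.items():
        n = int(seconds * fs)
        out[name] = float(np.mean(signal[-n:])) if len(signal) >= n else float("nan")
    return out

# Example: 10 minutes of a 4 Hz feature stream (e.g., a tonic EDA level).
fs = 4.0
stream = np.random.default_rng(2).normal(size=int(fs * 600))
print(windowed_means(stream, fs))
```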
In this scenario, the AI models responsible for user calibration, cognitive state estimation, and adaptations with embedded protocols work in conjunction with another AI model that generates synthetic data. Augmenting the calibration data with a synthetic counterpart could help to improve the performance of the state detection, especially in a cold start. The true potential of generative AI, however, comes from counterfactual prediction of the adaptations needed to steer user states in a desired direction, and then creating the required virtual world. Such an AI could, for example, shift interactive storytelling from active, explicit decisions made by the user to implicit deduction of one’s wishes and the creation of a corresponding multimodal narrative, with potential applications ranging from training and entertainment to rehabilitation. These processes relate directly to the creation and optimization of the virtual space for each user and situation. We see these topics as the most challenging technical advancements required in the near future to achieve the vision.
Discussion
The use of personalised content adaptation will lead to significant societal impact and cost savings by improving the efficiency of VR sessions (e.g., in education and rehabilitation) as well as by decreasing the probability of human errors through guiding attention allocation (e.g., in safety-critical tasks). For these reasons, we claim that our approach will be a necessity in future VR setups.
We have illustrated the details and potential of our approach with the flight simulator training use-case. However, real-time content and embedded stimuli adaptation opens the possibility of making desired interventions in a variety of use-cases. The application domains range from entertainment and the workplace to education, with examples including:
• Entertainment, in VR games and movies, to enhance the experience by, e.g., modifying the level of engagement.
• Education, in training contexts such as flight simulators, to optimize the content for maximizing learning, e.g., real-time optimization of the task difficulty.
• Safety-critical work, such as control rooms, where there is a demand to estimate the cognitive state of the human in the loop.
• Wellbeing, wellness, and rehabilitation (e.g., stroke, post-traumatic stress disorder).
A comfortable user experience in VR requires synchrony between the audiovisual content and the immediate controlling events, such as head turns moving the FoV and eye tracking sharpening the image once movement halts. However, the envisioned AI processes for visual perception and cognitive state modelling, adaptations with embedded stimuli, and especially the creation of virtual worlds all have potentially significant computational costs. The orchestration of these AI processes may require architecturally complex solutions and integration with edge or cloud processing to ensure that the models work accurately and in timely conjunction.
The potential applications require processing sensitive personal data, and some of the states that can be detected are private to the user. Besides computational challenges, developers need to consider the ethical aspects of their applications, including user privacy and anonymity, information security, and fairness. Data processing and AI based systems are also increasingly regulated, with the General Data Protection Regulation and the recent AI Act in the EU, and the non-binding AI Bill of Rights in the US. Such regulations and guidelines help developers ensure responsible data processing and use of AI and should therefore be closely followed.
The envisioned approach will revolutionise our understanding of human visual perception, cognitive state, and behaviour in VR. With the integration of context detection, it can be further extended to AR with an increasing number of real-world components, and even implemented in real-life settings by using smart eyewear. Such a setting would allow for an extremely diverse analysis of human perception, cognition, and behaviour in everyday life.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Author contributions
KP: Conceptualization, Funding acquisition, Methodology, Supervision, Visualization, Writing–original draft, Writing–review and editing. JT: Conceptualization, Methodology, Visualization, Writing–original draft, Writing–review and editing. JH: Visualization, Writing–original draft, Writing–review and editing. JM: Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Visualization, Writing–original draft, Writing–review and editing.
Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The work was funded by the Academy of Finland (grant 334092) and VTT.
Acknowledgments
The image of a human figure used in Figures 1, 2 is designed by Freepik (AI image generator beta)1.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Footnotes
1https://www.freepik.com/ai/image-generator
References
Agtzidis, I., Startsev, M., and Dorr, M. (2019). “360-degree video gaze behaviour: a ground-truth data set and a classification algorithm for eye movements,” in Proceedings of the 27th ACM international conference on multimedia (Nice France: ACM), 1007–1015. doi:10.1145/3343031.3350947
Bahill, A. T., Abler, D. L., and Stark, L. (1975a). Most naturally occurring human saccades have magnitudes of 15 degrees or less. Investig. Ophthalmol. 14 (6), 468–469.
Bahill, A. T., Clark, M. R., and Stark, L. (1975b). The main sequence, a tool for studying human eye movements. Math. Biosci. 24 (3–4), 191–204. doi:10.1016/0025-5564(75)90075-9
Biswas, N., Mukherjee, A., and Bhattacharya, S. (2024). ‘Are you feeling sick?’ A systematic literature review of cybersickness in virtual reality. ACM Comput. Surv. 56, 1–38. doi:10.1145/3670008
Bradley, M. M., Miccoli, L., Escrig, M. A., and Lang, P. J. (2008). The pupil as a measure of emotional arousal and autonomic activation. Psychophysiology 45 (4), 602–607. doi:10.1111/j.1469-8986.2008.00654.x
Bradley, M. M., and Sabatinelli, D. (2003). “Startle reflex modulation: perception, attention, and emotion,” in Experimental methods in neuropsychology. Neuropsychology and cognition, vol. 21. Editor K. Hugdahl (Boston, MA: Springer US), 65–87. doi:10.1007/978-1-4615-1163-2_4
Clément, G., and Reschke, M. F. (2018). Relationship between motion sickness susceptibility and vestibulo-ocular reflex gain and phase. J. Vestib. Res. 28 (3–4), 295–304. doi:10.3233/VES-180632
Cox, J. A., and Aimola Davies, A. M. (2020). Keeping an eye on visual search patterns in visuospatial neglect: a systematic review. Neuropsychologia 146, 107547. doi:10.1016/j.neuropsychologia.2020.107547
Di Stasi, L. L., Antolí, A., and Cañas, J. J. (2011). Main sequence: an index for detecting mental workload variation in complex tasks. Appl. Ergon. 42 (6), 807–813. doi:10.1016/j.apergo.2011.01.003
Giannakakis, G., Grigoriadis, D., Giannakaki, K., Simantiraki, O., Roniotis, A., and Tsiknakis, M. (2022). Review on psychological stress detection using biosignals. IEEE Trans. Affect. Comput. 13 (1), 440–460. doi:10.1109/TAFFC.2019.2927337
Halbig, A., and Latoschik, M. E. (2021). A systematic review of physiological measurements, factors, methods, and applications in virtual reality. Front. Virtual Real. 2, 694567. doi:10.3389/frvir.2021.694567
Haskins, A. J., Mentch, J., Botch, T. L., and Robertson, C. E. (2020). Active vision in immersive, 360° real-world environments. Sci. Rep. 10 (1), 14304. doi:10.1038/s41598-020-71125-4
Hirvonen, K., Puttonen, S., Gould, K., Korpela, J., Koefoed, V. F., and Müller, K. (2010). Improving the saccade peak velocity measurement for detecting fatigue. J. Neurosci. Methods 187 (2), 199–206. doi:10.1016/j.jneumeth.2010.01.010
Jongkees, B. J., and Colzato, L. S. (2016). Spontaneous eye blink rate as predictor of dopamine-related cognitive function—a review. Neurosci. and Biobehav. Rev. 71, 58–82. doi:10.1016/j.neubiorev.2016.08.020
Koch, M. (1999). The neurobiology of startle. Prog. Neurobiol. 59 (2), 107–128. doi:10.1016/S0301-0082(98)00098-7
Lim, J., and Dinges, D. F. (2010). A meta-analysis of the impact of short-term sleep deprivation on cognitive variables. Psychol. Bull. 136 (3), 375–389. doi:10.1037/a0018883
Luong, T., Martin, N., Raison, A., Argelaguet, F., Diverrez, J.-M., and Lecuyer, A. (2020). “Towards real-time recognition of users mental workload using integrated physiological sensors into a VR HMD,” in 2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Porto de Galinhas (Brazil: IEEE), 425–437. doi:10.1109/ISMAR50242.2020.00068
Merzon, L., Pettersson, K., Aronen, E. T., Huhdanpää, H., Seesjärvi, E., Henriksson, L., et al. (2022). Eye movement behavior in a real-world virtual reality task reveals ADHD in children. Sci. Rep. 12 (1), 20308. doi:10.1038/s41598-022-24552-4
Miltiadous, A., Aspiotis, V., Sakkas, K., Giannakeas, N., Glavas, E., and Tzallas, A. T. (2022). “An experimental protocol for exploration of stress in an immersive VR scenario with EEG,” in 2022 7th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM), Ioannina, Greece (IEEE), 1–5. doi:10.1109/SEEDA-CECNSM57760.2022.9932987
Pettersson, K., Tervonen, J., Närväinen, J., Henttonen, P., Määttänen, I., and Mäntyjärvi, J. (2020) “Selecting feature sets and comparing classification methods for cognitive state estimation,” in Presented at the 2020 IEEE 20th international conference on bioinformatics and bioengineering (BIBE). IEEE, 683–690. doi:10.1109/BIBE50027.2020.00115
Qu, C., Che, X., Ma, S., and Zhu, S. (2022). Bio-physiological-signals-based VR cybersickness detection. CCF Trans. Pervasive Comp. Interact. 4 (3), 268–284. doi:10.1007/s42486-022-00103-8
Rauschnabel, P. A., Felix, R., Hinsch, C., Shahab, H., and Alt, F. (2022). What is XR? Towards a framework for augmented and virtual reality. Comput. Hum. Behav. 133, 107289. doi:10.1016/j.chb.2022.107289
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychol. Bull. 124 (3), 372–422. doi:10.1037/0033-2909.124.3.372
Schleicher, R., Galley, N., Briest, S., and Galley, L. (2008). Blinks and saccades as indicators of fatigue in sleepiness warnings: looking tired? Ergonomics 51 (7), 982–1010. doi:10.1080/00140130701817062
Shadiev, R., and Li, D. (2023). A review study on eye-tracking technology usage in immersive virtual reality learning environments. Comput. and Educ. 196, 104681. doi:10.1016/j.compedu.2022.104681
Shiferaw, B. A., Downey, L. A., Westlake, J., Stevens, B., Rajaratnam, S. M. W., Berlowitz, D. J., et al. (2018). Stationary gaze entropy predicts lane departure events in sleep-deprived drivers. Sci. Rep. 8 (1), 2220. doi:10.1038/s41598-018-20588-7
Startsev, M., and Zemblys, R. (2022). Evaluating eye movement event detection: a review of the state of the art. Behav. Res. 55 (4), 1653–1714. doi:10.3758/s13428-021-01763-7
Stone, L. S., Tyson, T. L., Cravalho, P. F., Feick, N. H., and Flynn-Evans, E. E. (2019). Distinct pattern of oculomotor impairment associated with acute sleep loss and circadian misalignment. J. Physiol. 597 (17), 4643–4660. doi:10.1113/JP277779
Tao, K., Huang, Y., Shen, Y., and Sun, L. (2022). Automated stress recognition using supervised learning classifiers by interactive virtual reality scenes. IEEE Trans. Neural Syst. Rehabil. Eng. 30, 2060–2066. doi:10.1109/TNSRE.2022.3192571
Tervonen, J., Närväinen, J., Mäntyjärvi, J., and Pettersson, K. (2023). Explainable stress type classification captures physiologically relevant responses in the Maastricht Acute Stress Test. Front. Neuroergonomics 4, 1294286. doi:10.3389/fnrgo.2023.1294286
Tyson, T. L., Feick, N. H., Cravalho, P. F., Flynn-Evans, E. E., and Stone, L. S. (2021). Dose-dependent sensorimotor impairment in human ocular tracking after acute low-dose alcohol administration. J. Physiol. 599 (4), 1225–1242. doi:10.1113/JP280395
Ugwitz, P., Kvarda, O., Juříková, Z., Šašinka, Č., and Tamm, S. (2022). Eye-tracking in interactive virtual environments: implementation and evaluation. Appl. Sci. 12 (3), 1027. doi:10.3390/app12031027
Wallace, B., and Lifshitz, J. (2016). Traumatic brain injury and vestibulo-ocular function: current challenges and future prospects. Eye Brain 8, 153–164. doi:10.2147/EB.S82670
Keywords: visual perception, oculomotor behavior, cognitive state estimation, virtual reality, adaptive sampling, artificial intelligence
Citation: Pettersson K, Tervonen J, Heininen J and Mäntyjärvi J (2024) Head-area sensing in virtual reality: future visions for visual perception and cognitive state estimation. Front. Virtual Real. 5:1423756. doi: 10.3389/frvir.2024.1423756
Received: 26 April 2024; Accepted: 29 August 2024;
Published: 20 September 2024.
Edited by:
Jesús Gutiérrez, Universidad Politécnica de Madrid, Spain
Reviewed by:
Aaron L. Gardony, U. S. Army Combat Capabilities Development Command Soldier Center, United States
Copyright © 2024 Pettersson, Tervonen, Heininen and Mäntyjärvi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: K. Pettersson, kati.pettersson@vtt.fi