
SYSTEMATIC REVIEW article

Front. Robot. AI, 19 February 2024
Sec. Human-Robot Interaction

A scoping review of gaze and eye tracking-based control methods for assistive robotic arms

  • 1Faculty Economy, Work-Life Robotics Institute, University of Applied Sciences Offenburg, Offenburg, Germany
  • 2Ubiquitous Computing, Department of Electrical Engineering and Computer Science, University of Siegen, Siegen, Germany

Background: Assistive Robotic Arms (ARAs) are designed to assist physically disabled people with daily activities. Existing joysticks and head controls are not applicable for severely disabled people such as people with Locked-in Syndrome. Therefore, eye tracking control is part of ongoing research. The related literature spans many disciplines, creating a heterogeneous field that makes it difficult to gain an overview.

Objectives: This work focuses on ARAs that are controlled by gaze and eye movements. By answering the research questions, this paper provides details on the design of the systems, a comparison of input modalities, methods for measuring the performance of these controls, and an outlook on research areas that gained interest in recent years.

Methods: This review was conducted as outlined in the PRISMA 2020 Statement. After identifying a wide range of approaches in use, the authors decided to use the PRISMA-ScR extension for scoping reviews to present the results. The identification process was carried out by screening three databases. After the screening process, a snowball search was conducted.

Results: 39 articles and 6 reviews were included in this article. Characteristics related to the system and study design were extracted and presented in three groups based on the use of eye tracking.

Conclusion: This paper aims to provide an overview for researchers new to the field by offering insight into eye tracking based robot controllers. We have identified open questions that need to be answered in order to provide people with severe motor function loss with systems that are highly usable and accessible.

1 Introduction

1.1 Motivation and research questions

Assistive robotics is a broad field that describes the use of robots to assist the elderly or people with physical and cognitive disabilities. The field covers several types of robotic applications, for example, social, service, and surgical robots, walking aids, exoskeletons, power wheelchairs, therapy robots, and assistive robotic arms. The term Assistive Robotic Arm (ARA) is only one of several keywords used in the literature. The rationale for this work is based on the varied naming found in publications. Robotic arms used to assist people in everyday life have been called, among others, wheelchair-mounted robotic arm, assistive robotic manipulator, and assistive robot, which complicates the retrieval of related work. In this paper, we will use the term Assistive Robotic Arm.

Controlling a robot with eye movements could be an appropriate solution for people with severe physical impairments of arms, head movement, and speech. In Germany, for example, 7.8 million people are severely disabled. 11% have disabilities of the arms and legs, 10% of the spine and upper body, and 9% have a cerebral disorder (Destatis, 2022). This results in approximately 2.3 million people who could benefit from an eye tracking control. Locked-in patients in particular could benefit from an ARA. Locked-in Syndrome can result from brainstem lesions such as stroke, traumatic brain injury, and tumors, or from brainstem infections or degenerative diseases. Depending on the individual case, rehabilitation therapy can restore body functions (Smith and Delargy, 2005). Amyotrophic Lateral Sclerosis (ALS) is a progressive disease that immobilizes the patient over time. In the locked-in state and in the late stages of ALS, there is a high probability that the motor function of the eye is still intact, showing possibilities for the use of eye tracking (Edughele et al., 2022).

Currently, robots are used primarily in industrial applications to automate work. Adapting such robots as assistive systems requires interdisciplinary knowledge due to the variety of daily tasks and environments. Such tasks can be grouped into Activities of Daily Living (ADL) (WHO, 2001). They involve routines such as cleaning, eating, and personal hygiene that ensure a good quality of life. Today, ARAs can be controlled by joysticks. In these cases, the robot, which moves in three dimensions, is controlled solely by the user. Shared control, where Artificial Intelligence (AI) is used to reduce the cognitive load of the user in complex tasks, is being explored (Bien et al., 2003; Aronson et al., 2021). However, concerns about Human Robot Interaction (HRI) and safety are raised due to the close proximity to the user (Bien et al., 2003). A central concern in assistive robotics is to ensure the wellbeing of the user. Among the challenges of realizing the various applications and ensuring the safety of the user, the usability and accessibility of ARAs must be addressed. If the system is not easily accessible and adapted to individual needs, people tend not to use it. A related challenge is the Midas touch problem, which describes unintended robot commands triggered by misinterpreted gaze. This leads to user frustration (Alsharif et al., 2016; Di Maio et al., 2021). This challenge is compounded by the need to control the robot in 3-dimensional space using 2-dimensional eye movements. For these reasons, this work addresses the research questions in Table 1.

TABLE 1. Research questions and motivation.

The main contributions of this work are based on the answers to these questions.

1. An overview of publications in the heterogeneous field of eye movement based robot control.

2. A detailed specification of the technology used (Anke Fischer-Janzen, Study Descriptions, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/StudyDesciption.md, last accessed: 09.02.2024) and the studies conducted (Anke Fischer-Janzen, Overview of Measurements, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/OverviewOfMeasurements.md, last accessed: 09.02.2024)

3. Future trends and open questions in eye-movement based robot control and a comparison of approaches

1.2 Technical background

Eye movements are used in a variety of assistive technology applications. Popular examples are eye typing interfaces that give the user the ability to speak and eye-controlled mice that allow the person to use a computer (Al-Rahayfeh and Faezipour, 2013). Other applications include the control of electric wheelchairs that provide mobility to the user (Cowan et al., 2012). Social robots use eye tracking to extract facial features and interpret the feelings and needs of the user (Kyrarini et al., 2021).

Fixations and saccades are the most common eye movements. Fixation describes the gaze resting on a particular point. The duration of a fixation depends on the object or location being fixated and ranges from tens of milliseconds to several seconds. Fixations can be measured as dwell time, which is the amount of time a user fixates on an object. This is often used as a parameter to interpret the user’s intent or to give a command to the robot. Saccades are rapid movements of the eye. They last only a few milliseconds and can be as fast as 500°/s (Holmqvist and Andersson, 2017). These numbers vary in the literature. This may be due to the fact that everyone is anatomically and behaviorally unique, including in how they move their eyes.
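
To illustrate how dwell time can be turned into a selection or robot command, the following minimal sketch checks whether the gaze has stayed within a small tolerance radius for a given duration. The sample format, radius, and 500 ms threshold are illustrative assumptions, not values taken from any of the reviewed systems.

```python
import math

def dwell_select(samples, radius_px=30, dwell_ms=500):
    """Return the anchor point of the first gaze cluster whose dwell time
    exceeds dwell_ms; samples are (timestamp_ms, x, y) gaze points."""
    start_idx = 0
    for i, (t, x, y) in enumerate(samples):
        t0, x0, y0 = samples[start_idx]
        # Restart the dwell window whenever gaze leaves the tolerance radius.
        if math.hypot(x - x0, y - y0) > radius_px:
            start_idx = i
            continue
        if t - t0 >= dwell_ms:
            return (x0, y0)  # dwell threshold reached: trigger a selection
    return None  # no selection: brief glances do not issue commands
```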

As the environment and the person move, or the depth of field changes, the eye responds with vergence movements and smooth pursuit. Vergence movements describe, for example, the movement of the eyes during reading: the axes of the eyes move from parallel (distance vision) to an intersection on the page (near vision). This prevents double vision. Smooth pursuit describes the eye movement that occurs while looking at an object when the head or the object itself moves. This eye movement is not voluntary. The slow speed of the eye, with less than 30°/s in smooth pursuit, distinguishes it from saccades (Holmqvist and Andersson, 2017). Most of the systems evaluated use fixation on objects or directions, but in head-mounted eye tracking devices smooth pursuit could have an impact on the function of the system.
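
Since these eye movements are distinguished mainly by angular velocity (smooth pursuit below roughly 30°/s, saccades much faster), a simple velocity-threshold classifier illustrates the distinction. The concrete thresholds below are assumptions for illustration only and would need tuning per device and user.

```python
def classify_eye_movement(deg_per_s, pursuit_max=30.0, saccade_min=100.0):
    """Classify a single angular gaze velocity sample (degrees per second)
    with crude velocity thresholds."""
    if deg_per_s < pursuit_max:
        # Near-zero velocity suggests a fixation, slow drift smooth pursuit.
        return "fixation" if deg_per_s < 5.0 else "smooth_pursuit"
    if deg_per_s >= saccade_min:
        return "saccade"
    return "unclassified"  # velocities between the two thresholds
```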

There are several techniques to measure these movements. The techniques can be divided into Infrared Oculography (IOG), Video Oculography (VOG), Electrooculography (EOG), and electromagnetic coils (Klaib et al., 2021). All techniques are used to measure vertical and horizontal eye movements. Depending on the system, additional movements such as eyebrow movements, blinks, and pupil dilation can be recorded.

IOG uses infrared sensors to detect IR light. Unlike VOG, which records visible light, the invisible IR light is less distracting to the user, especially in darker environments. Most eye trackers require additional illuminators to create the following effect: when using the Pupil Corneal Reflection Point method, the illumination creates a reflection on the cornea. By tracking the vector between the center of the pupil and this reflection, eye movement can be determined (Klaib et al., 2021). As will be presented, most eye trackers are based on infrared light because it provides more contrast of the pupil. This is based on the dark pupil detection method, where, by illuminating the eye with IR light, the pupil appears darker than the iris (Edughele et al., 2022). Another way to achieve this effect is to place the illuminator at a different angle to the camera (see Figure 1). With this method, the light reflected from the retina is blocked by the iris, resulting in a darker pupil. In contrast, for bright pupil detection the illuminator is placed close to the optical axis of the camera, creating what is known in photography as the red-eye effect: the light reflected from the retina is captured by the camera (Tobii, 2023). Depending on the algorithm, one or the other pupil contrast can lead to better measurement results. With IOG and VOG it is also possible to measure the pupil size in addition to the horizontal and vertical movements of the eye, as is done in behavioral studies (Tobii, 2023).

FIGURE 1. A comparison between bright pupil and dark pupil detection. The corneal reflection used for determining the Pupil Corneal Reflection Point is shown on the cornea. Adapted from Tobii (2023), the anatomy is presented in a simplified form.
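
As a rough illustration of the dark pupil principle and the pupil–corneal reflection vector, the sketch below thresholds an infrared eye image for the darkest blob (pupil) and the brightest glint (corneal reflection). It assumes OpenCV 4 and a cropped grayscale eye image; the threshold values are arbitrary examples, and this is not the detection pipeline of any reviewed system.

```python
import cv2
import numpy as np

def pupil_cr_vector(eye_gray):
    """Estimate the pupil centre (darkest blob) and corneal reflection
    (brightest glint) in an IR eye image and return their offset vector."""
    blurred = cv2.GaussianBlur(eye_gray, (7, 7), 0)

    # Dark pupil: threshold the darkest pixels and keep the largest blob.
    _, pupil_mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)
    pupil_contours, _ = cv2.findContours(pupil_mask, cv2.RETR_EXTERNAL,
                                         cv2.CHAIN_APPROX_SIMPLE)

    # Corneal reflection: the brightest pixels form a small glint.
    _, cr_mask = cv2.threshold(blurred, 230, 255, cv2.THRESH_BINARY)
    cr_contours, _ = cv2.findContours(cr_mask, cv2.RETR_EXTERNAL,
                                      cv2.CHAIN_APPROX_SIMPLE)

    if not pupil_contours or not cr_contours:
        return None  # detection failed for this frame

    (px, py), _ = cv2.minEnclosingCircle(max(pupil_contours, key=cv2.contourArea))
    (cx, cy), _ = cv2.minEnclosingCircle(max(cr_contours, key=cv2.contourArea))

    # The pupil-to-glint vector is the quantity that is mapped to gaze.
    return np.array([px - cx, py - cy])
```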

EOG measures corneo-retinal standing potentials by detecting the electric field and is used to measure horizontal and vertical eye movements as well as eyebrow movements. This method is based on placing multiple electrodes on the user’s face in specific regions. Changes in the electric field are generated by the eye movements (Klaib et al., 2021). These systems can be used with a hybrid Brain Computer Interface (hBCI). A Brain Computer Interface (BCI) uses Electroencephalography (EEG) to measure neural signals. EEG is a mostly non-invasive neuroimaging technique used to measure and record the electrical activity generated by the brain through electrodes placed on the scalp. These signals can be used to interpret the user’s intention and to control external devices such as speech computers and robots (Chaudhary et al., 2022; Karas et al., 2023). hBCIs can use additional inputs such as EOG to improve the quality of the measurement, as will be presented in the results. For example, in Huang et al. (2019) a Steady-State Visually Evoked Potential (SSVEP) is used to control the system. An SSVEP is an electrical signal evoked by the brain’s response to a visual flickering stimulus with a constant frequency. In response to this stimulus, the brain generates oscillatory electrical activity at the same frequency, which can be measured by EEG.
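
Since an SSVEP appears as elevated EEG power at the flicker frequency and its harmonics, a very reduced detection sketch can compare the spectral power at the stimulus frequency against the surrounding spectrum. This NumPy-only example illustrates the principle and is not the classifier used by Huang et al. (2019); the frequency bands are assumptions.

```python
import numpy as np

def ssvep_score(eeg, fs, stim_hz, band_hz=0.5):
    """Ratio of EEG power around the flicker frequency (plus its first
    harmonic) to the mean broadband power; eeg is a 1-D signal, fs the
    sampling rate in Hz."""
    spectrum = np.abs(np.fft.rfft(eeg * np.hanning(len(eeg)))) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)

    def band_power(f0):
        mask = (freqs >= f0 - band_hz) & (freqs <= f0 + band_hz)
        return spectrum[mask].mean()

    target = band_power(stim_hz) + band_power(2 * stim_hz)
    baseline = spectrum[(freqs > 3) & (freqs < 40)].mean()
    return target / baseline  # values well above 1 suggest an SSVEP response
```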

Since no related publications using electromagnetic coils were found in this work, we refer to Holmqvist and Andersson (2017), Klaib et al. (2021), and Edughele et al. (2022) for further information. These techniques are used to classify the retrieved publications as shown in Table 4. In some cases a system is labeled as VOG/IOG: no detailed constraints on the eye tracker technique could be found in the literature, and the described approach is possible with both infrared and visible light.

2 Materials and methods

This review was conducted in accordance with the PRISMA 2020 Statement. Due to the heterogeneity of the literature, the PRISMA-ScR extension (Tricco et al., 2018) was used to present the results. In terms of the research questions, this review focuses on eye tracking based control for robots. In the following, the databases, search terms, and inclusion and exclusion criteria are listed.

2.1 Search strategy and selection process

Three databases listed in Table 2 were searched to identify publications of interest. During the planning phase, non-standardized Internet searches were conducted as a preliminary evaluation to gather further information and identify search terms, including commercial systems and patents. After the systematic identification and screening process, a snowball search was conducted to identify relevant publications from other databases such as Scopus. Publications in English and German were included in the review process.

TABLE 2. Databases used in the identification phase.

The search terms used for identification were permutations of the words “eye tracking,” “robot,” “eye,” “shared control,” “gaze” and “assistive.” Combinations of two search terms such as “eye” AND “robot” were neglected due to the number of publications found from other research topics outside the scope. As can be seen in Figure 2, the vast majority of reports were manually excluded due to ambiguous terms. Searching for “assistive robotic arms” (ARA) or “wheelchair mounted arms” (WMRA), which would imply an aid for physically disabled people, would have excluded most of the relevant papers. The general use of these terms is not common and varies between disciplines. Similar problems arise because “eye” and “gaze” are used as synonyms, leading to ambiguous results in robotics, such as “eye-in-hand,” which describes a camera mounted on the robot’s end effector to improve automated grasping.
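
A small script can illustrate how such term combinations are built before being adapted to each database’s query syntax. The number of terms per query and the quoting style are assumptions for illustration; the actual queries were constructed manually per database.

```python
from itertools import combinations

TERMS = ["eye tracking", "robot", "eye", "shared control", "gaze", "assistive"]

def build_queries(n_terms=3):
    """Return AND-combined queries of n_terms search terms each."""
    return [" AND ".join(f'"{t}"' for t in combo)
            for combo in combinations(TERMS, n_terms)]

# build_queries()[:2] ->
# ['"eye tracking" AND "robot" AND "eye"',
#  '"eye tracking" AND "robot" AND "shared control"']
```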

FIGURE 2. The flow diagram shows the identification and screening process according to PRISMA 2020.

2.2 Inclusion and exclusion criteria

Table 3 shows the inclusion and exclusion criteria. Publications were included in the evaluation if the robot was controlled by eye movements. Accordingly, robot type and eye tracking method were included as evaluation items in Table 4. Any system used in a non-assistive task was excluded. This decision led to the inclusion of industrial applications with shared workspaces, as the system could also be used by people with motor impairments in their working life. Other excluded applications can be summarized as surgical assistance systems and telepresence robots in hazardous environments. In these cases the robot is not specifically designed to assist people with physical disabilities. Rehabilitation robots are used to mobilize the person. In such tasks, the robot’s movements are adapted to meet the patient’s anatomy. Visual feedback can be helpful to improve the rehabilitation process. However, as will be shown, the application of eye tracking varies greatly and many systems are exoskeletons. An exoskeleton is not controlled by gaze alone. To ensure safe use, torque, force, and pressure sensors are used to prevent unhealthy forces on the body. Rehabilitation robots are usually limited to a certain movement set, which differs from the wide range needed in everyday life. Therefore, the architecture of a rehabilitation system is different from that of an ARA, making a valid comparison impossible. The first research question depends on several variables: the space in which the robot can move, the level of automation, and whether the systems presented have been tested by users (able-bodied or disabled). If no information was provided, this did not lead to exclusion due to the small number of publications included.

TABLE 3. Inclusion and exclusion criteria.

TABLE 4. List of articles included. Eye tracking devices were divided into VOG, IOG, and EOG. Robotic arms were distinguished between their use in assisted living (ARA), industrial (R), educational (EDR) or collaborative robots (CR). Eye movements are described as fixations (F), saccades (S), blinks (B) and winks (W) and SSVEP (E). The last three columns describe the number of able-bodied (AP) and disabled participants (DP), and trials (T) in the user study. More information can be found in the tables provided in Fischer-Janzen (2023).

2.3 Data charting process and data items

The data chart presented in Table 4 and online was constructed iteratively. Variables were selected with respect to author information, hardware and software setup, and empirical study criteria. The rationale was to find parameters that allow the reader to gain insight into the design and development of eye tracking based ARAs and, if already developing such a system, to find similar approaches. While reading the included publications, the importance of the selected parameters for the original authors was estimated and the table was adapted. This resulted in two tables with 17 (Anke Fischer-Janzen, Study Descriptions, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/StudyDesciption.md, last accessed: 09.02.2024) and 12 (Anke Fischer-Janzen, Overview of Measurements, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/OverviewOfMeasurements.md, last accessed: 09.02.2024) elements, respectively. Since these tables were too large to fit in this paper, the most important items were extracted and are presented in Table 4 to facilitate reading. For the author information, four items were selected (names, year, title, and DOI). The system information was divided into six items (technology overview, eye movement detection device, eye tracking technique and sensors, wearable or remote eye tracking device, robot manipulation space dimensions, and type of robot). The software was described in three parameters (algorithms and models used in the approach, programming environment, and type of eye movement used to control the robot). Finally, the empirical information was specified in four parameters (task description, number of participants, number of repetitions, and empirical test performed) and an additional three parameters describing the reported measurements, divided into task-related, computational, and empirical parameters. This list was prepared by the first author and discussed with all authors.

2.4 Critical appraisal and synthesis methods

The interpretation of the results is limited to the statements of the cited literature. The list may not be exhaustive for research areas beyond eye tracking techniques in the context of robotics, given the search terms used. As indicated by several authors, telemanipulation of robots includes various control input devices. Related research on input modalities such as joysticks may provide different perspectives to the discussion presented here. A risk of bias may exist for some of the evaluation items. Multimodal control and shared control also exist in other areas of research. For example, gaze-based control is increasingly used to control electric wheelchairs and computer interfaces. An in-depth look at algorithmic design proposals is beyond the scope of this paper and not feasible due to the heterogeneity of the included work. However, the question arises to what extent the results are transferable. In order to reduce the risk of bias, reviews on the aforementioned topics were examined, interpreted, and included in the discussion.

The retrieved data will be evaluated in terms of challenges, benefits, limitations, and identified research interests. The challenges, benefits, and limitations are extracted and generalized in the case of duplicates. Sources will be provided for each summarized item. The research interests are extracted by sorting the publications according to their aim and contribution. Three timelines were created for three categories: telemanipulation, directional gaze, object-oriented gaze. Keywords were evaluated to indicate category membership.

The presentation of articles varied greatly depending on whether the researchers were working in computer science, robotics, or behavioral science. Therefore, research interests were extracted through a summary of keywords resulting from these parameters to reduce the influence of personal and field bias. Author-defined keywords were collected and ranked according to their frequency of occurrence.
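
The keyword ranking described above can be reproduced with a few lines of Python; the snippet below only illustrates the counting step, and the example keywords are hypothetical.

```python
from collections import Counter

def rank_keywords(keyword_lists):
    """keyword_lists: one list of author-defined keywords per publication.
    Returns (keyword, count) pairs sorted by frequency of occurrence."""
    counts = Counter(kw.strip().lower()
                     for keywords in keyword_lists
                     for kw in keywords)
    return counts.most_common()

# Hypothetical example:
# rank_keywords([["eye tracking", "shared control"],
#                ["eye tracking", "assistive robotics"]])
# -> [('eye tracking', 2), ('shared control', 1), ('assistive robotics', 1)]
```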

3 Results

The adapted PRISMA 2020 flow diagram is shown in Figure 2. 3,227 publications were identified and screened for duplicates. This was followed by title and abstract screening. Duplicates were found by applying a self-written Matlab script. Duplicates found later were added to the number of “duplicate records removed” accordingly. After screening the abstracts, 62 publications remained. These full texts were checked against the exclusion criteria. A snowball search based on the cited authors and an Internet search of the lead authors of each publication yielded 16 additional publications of interest. A total of 39 articles and 6 reviews were included in this work.
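
The duplicate screening was done with a self-written Matlab script; a comparable check in Python could normalize titles and flag exact matches, roughly as sketched below. The normalization rule is an assumption, not the authors’ actual matching criterion.

```python
import re

def find_duplicates(records):
    """records: list of dicts with at least a 'title' field.
    Returns the indices of records whose normalized title was already seen."""
    seen, duplicates = {}, []
    for i, record in enumerate(records):
        key = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
        if key in seen:
            duplicates.append(i)  # candidate duplicate of records[seen[key]]
        else:
            seen[key] = i
    return duplicates
```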

3.1 Excluded publications

Due to the rapid progress in research topics related to robotics and eye tracking, this section provides a brief overview. In particular, telepresence robots, social robots, and the use of hBCI and BCI were found to be of growing research interest. Rehabilitation and surgical assistance were mentioned as possible applications. The 36 excluded publications were used to identify these topics.

Six publications were excluded due to their publication type. Three of these were dissertations. Alsharif’s Ph.D. thesis is partially represented in her publication, which is referred to later among the included articles (Alsharif, 2017). Shahzad and Mehmood (2010) presented a master’s thesis on controlling articulated robot arms using eye tracking. The third, Nunez-Varela (2012), focused on task-driven object manipulation based on gaze locations on the object (intention read from gaze). The MyEccPupil (HomeBrace GmbH) is a commercially available eye tracking controller for a wheelchair-mounted robotic arm. Although other eye tracking based systems were searched for worldwide, no other systems could be found. Two related patents were found (Payton et al., 2013; Norales et al., 2016).

Telepresence robots (Mejia and Kajikawa, 2017; Zhang et al., 2019), smart or electric wheelchairs (Cowan et al., 2012; McMurrough et al., 2012; McMurrough et al., 2013; Jiang et al., 2016; Meena et al., 2016; 2017; Callejas-Cuervo et al., 2020; Edughele et al., 2022), and social or humanoid robots (Ma et al., 2015; Mejia and Kajikawa, 2017; Nocentini et al., 2019; Gonzalez-Aguirre et al., 2021; Kyrarini et al., 2021; Liang and Nejat, 2022; Robinson et al., 2022) are also controlled by gaze. Such systems greatly facilitate the daily life of people with upper limb impairments. Moving other devices such as cameras by gaze can help in the control of surgical robots for laparoscopy (Zhu et al., 2010; Despinoy et al., 2013). Similar to direction corrections in the surgical context, gaze can be used to interpret human intentions. Human intent detection was used to infer the desired action the robot should accomplish and thus enhance the performance of a shared control (Jain and Argall, 2019). Human reflexes, including pupil dilation in response to certain events, can also be used to improve shared control (Aronson et al., 2018; Aronson and Admoni, 2019; 2020; Aronson et al., 2021).

Eye tracking is a versatile technology and is used, among other things, to detect whether the participant’s gaze is fixated on the monitor presenting the robot control interface (Postelnicu et al., 2019). Projects such as ASPICE focus on a selection of input modalities to match the needs of the participants in order to assist patients with different tasks in the form of an hBCI (Cincotti et al., 2008). A robotic dog (AIBO) was used to test the modalities.

A variety of robotic applications can be found in rehabilitation. In stroke rehabilitation, robots are used to move the extremities based on gaze, which is detected by EEG, EOG, and VOG (Kirchner et al., 2013; Maimon-Dror et al., 2017; Crea et al., 2018; Shafti et al., 2019; Shafti and Faisal, 2021). So far, restoration of hand function has been shown, but the “fluent, reliable and safe operation of a semi-autonomous whole-arm exoskeleton restoring ADLs” has yet to be demonstrated (Crea et al., 2018). Approaches for home physical therapy can be realized with exoskeletons (Kirchner et al., 2013; Pedrocchi et al., 2013; Maimon-Dror et al., 2017) or robotic gloves (Noronha et al., 2017; Shafti et al., 2019; Shafti and Faisal, 2021). Assisting with drinking tasks using an exoskeleton is mentioned in Crea et al. (2018). Other tasks include food preparation using virtual reality and an exoskeleton (Novak and Riener, 2013) or drawing on a screen without a robotic system (Shehu et al., 2021).

3.2 Related reviews

This section provides a brief overview of reviews in related research areas. A review of input modalities has been conducted by Clark and Ahmad (2021), who present eye tracking, computer vision and EEG approaches, emotion recognition, gestures, and lie detection. Advantages and challenges are presented along with an extensive literature review. It is also stated that non-verbal and non-touch based approaches are important for the future development of intuitive and natural feeling robot control. Esposito et al. (2021) reviewed biosignal-based Human Machine Interaction (HMI), covering biopotential signals such as EEG, muscle-mechanical motion, body motion, and hybrid approaches. This review focused on the use of EOG, among others. Ramkumar et al. (2018) presented an overview and classification of EOG-based Human Computer Interfaces (HCI), considering data from 2015 to 2020. In addition, Schäfer and Gebhard (2019) compared five hands-free input modalities that are important in robotic arm control research. Dünser et al. (2015) presented a similar approach and compared four input modalities. The results of both works are discussed in Section 4.3.1. In addition, Pasqualotto et al. (2015) compared BCI and eye tracking in eye typing studies using a spelling program with people with severe motor impairments.

Robotics in healthcare is a promising approach to addressing the shortage of skilled workers. The goal is to support patients in physical therapy and everyday life as well as caregivers in their daily tasks. Human recognition, emotion recognition, and speech recognition are used for the realization. Kyrarini et al. (2021) present robots in different scenarios, such as care robots like Pepper, hospital robots that help with logistics, physical therapy robots like exoskeletons and walking aids, and finally assistive robots like FRIEND, Jaco 2, and Baxter.

Finally, a review of wearable interaction was found that defined head and eye movements used to control wearables (Siean and Vatavu, 2021). The results showed that there is limited research on the accessibility of such systems for people with motor impairments. The main findings of this review revealed four suggestions for future research: 1) exploring a variety of wearables, as the current focus is on head wearables; 2) multimodal input modalities and input modalities that maximize motor abilities; 3) more user studies; and 4) IoT.

3.3 Included works

The literature found is the basis for answering research question RQ1: “What approaches have been explored in the field of gaze-controlled robotic arms to assist people with (severe) upper limb impairments?”. Table 4 lists all articles that met the inclusion criteria. It is divided into three sections called telemanipulation, directional gaze, and object-oriented gaze. This separation was chosen because the interaction between the participant and the system can vary. Telemanipulation describes the use of a display that allows visual feedback and the presentation of buttons to manipulate the robot. Directional gaze approaches are based on looking in a direction to move the robot. Switching between robot manipulation dimensions (e.g., x-y axis to x-z axis) or grasping mode is achieved by including blinking or using additional input modalities. Object-oriented gaze entails the need to determine gaze in 3D space, implement object detection algorithms, and adapt trajectory planning to automate the task. For all three categories, we distinguish between 1) assistive systems in everyday life and 2) miscellaneous applications, which include industrial applications, comparative work, and tasks that are not performed in everyday life.

In Table 4, the Technology column is divided into the type of eye tracking device used, the location of the device, and the type of robotic arm mentioned in each publication. Details on the composition of the systems can be found in the online table (Fischer-Janzen, 2023). As can be seen, most systems use either eye tracking glasses (head-worn, 17 of 39 publications) or remote eye tracking devices (12 of 39 publications). Approaches using EOG devices are represented by 9 of 39 publications. Robotic arms are divided into industrial robots (R: 11/39 pub.), collaborative robots (CR: 7/39 pub.), assistive robotic arms (ARA: 9/39 pub.), educational robots (EDR: 4/39 pub.), and modular prostheses (PR: 2/39 pub.). As a second characteristic, robot motion was divided into pointing to objects or positions (12/39 pub.), grasping objects (10/39 pub.), and performing tasks (17/39 pub.). Most authors chose one task, such as an ADL, to prove the functionality of their system. If the system did not have an effector to grasp the object or to interact with it in some other way (e.g., with a magnet), it was reported as “point.” If the object was grasped but no further interaction was reported, the classification “grasp” was chosen. Note that “task” also describes pick-and-place tasks, as many tasks, such as setting a table, can be accomplished by this object manipulation. The type of eye movement was evaluated to determine the effects of the Midas Touch Problem or similar effects. In total, 32 of the 39 publications used fixation based approaches (F). 4 of 39 publications also used saccades. In addition, blinking (B) was detected in seven publications. The workspace in which the robot is controlled was divided into 2D, moving in a plane, and 3D, which is necessary, for example, to place objects on shelves. Recognizing 2D gaze for 3-dimensional control of a robot is challenging. Most of the authors presented solutions to control the robot in 3-dimensional Cartesian space (29/39 pub.). The last three columns describe the characteristics of the conducted user studies (28/39 pub.). Six studies included participants with disabilities. In addition, Figure 3 presents a timeline of all included works and provides insights into the research focus, measurements, and application of artificial feedback.

FIGURE 3. Presented characteristics in all subfigures: research focus, evaluation parameters/metrics, use of artificial feedback, and, optionally, use in a non-assistive living environment. (A) Publications with a focus on telemanipulation. (B) Publications with a focus on directional gaze. (C) Publications with a focus on object-oriented gaze.

3.3.1 Telemanipulation

Telemanipulation is usually realized by displaying digital buttons on a screen. They represent either directions and rotations of the Tool Center Point (TCP) or the robot gripper, individual joint positions, or complete tasks. In the latter case, the robot will act independently to achieve the goal of the task. In terms of the items shown in Table 4, most telemanipulation motions are described as “task” because complex tasks can be accomplished by manually controlling the robot using directional buttons.
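
To make the button-based control concrete, the sketch below maps a gaze-selected on-screen button to a small Cartesian step of the TCP. The button set and step size are illustrative assumptions and not taken from a specific reviewed system.

```python
# Unit direction associated with each on-screen button in the robot base
# frame; a dwell-based selection returns one button label.
BUTTON_DIRECTIONS = {
    "left": (-1, 0, 0), "right": (1, 0, 0),
    "backward": (0, -1, 0), "forward": (0, 1, 0),
    "down": (0, 0, -1), "up": (0, 0, 1),
}

def tcp_step(button, step_m=0.02):
    """Translate a selected button into a small TCP displacement in meters."""
    dx, dy, dz = BUTTON_DIRECTIONS[button]
    return (dx * step_m, dy * step_m, dz * step_m)
```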

The tasks used to test these telemanipulation systems were drinking (Huang et al., 2019; Stalljann et al., 2020), opening doors (Dragomir et al., 2021), drawing (Scalera et al., 2021a; b), printing cloths (Sharma et al., 2020; 2022), assisting with industrial tasks in shared workspaces (Di Maio et al., 2021), serving a meal or drink, picking up objects, changing CDs or tapes, making tea, and shaving (Bien et al., 2004).

The development of eye tracking based robot control began in the early 2000s (see Figure 3). Kim et al. (2001) presented the first system, based on a self-developed eye tracking setup and interface, to control an industrial robotic arm using on-screen buttons. Similar systems have been presented to control a robotic arm via a GUI (Yoo et al., 2002; Sunny et al., 2021). Using a GUI means that the person does not need to have the same field of view as the robot because the scene can be displayed on the screen. This is helpful for people who are bedridden. For example, an EEG/EOG-based system capable of controlling a robot from another building was presented by Iáñez et al. (2010).

Although the display used in such telemanipulation approaches provides a solid basis for using either head-mounted or remote eye trackers, many approaches were found that use or compare multimodal inputs. KARES II was one of the first systems using multiple input modalities such as eye-mouse, haptic suit, face recognition, and EEG (Bien et al., 2003; 2004). For this purpose, a PUMA-type robotic arm, an eye tracking setup, and a haptic suit were developed. In this work the mouth was tracked to estimate drinking intent, requiring the combination of several input modalities. The HARMONIE system demonstrates an approach using intracranial electroencephalography (iEEG) signals and eye tracking as an hBCI (McMullen et al., 2014). Eye tracking was used to move a cursor and select targets.

In the experiment by Jones et al. (2018), the robot was used to play chess at different levels of difficulty. The two input modalities, eye tracking and joystick, were compared. Webb et al. (2016) presented a system that enhanced teleoperation for a robot by using gaze to manipulate it toward a fixated object. The object was approached using a joystick controller. A comparison between head and gaze performance in a robot control task evaluated different applications of continuous and discrete control events (Stalljann et al., 2020). This study demonstrated the importance of including people with disabilities, as there were measurable differences between the participant with quadriplegia and the able-bodied participants.

In several approaches, the use of multiple subsystems was targeted. In particular, the control of an electric wheelchair and an ARA, as done by Dragomir et al. (2021) and Huang et al. (2019), is of interest because it provides more mobility to the user. The design philosophies of such systems should include task-oriented design, “human friendliness” including safety precautions, and “modularization of subsystems” as stated by Bien et al. (2004). They combined a robotic arm, an electric wheelchair, and a mobile platform.

Another important advantage of the display is its ease of use for displaying visual feedback. In the approach of Zeng et al. (2017), the hBCI was enhanced by using Augmented Reality (AR) inputs to correct an eye tracking pick-and-place task. In this AR environment, colored rectangles were presented as visual feedback to help sort colored objects. Others were visualizations of virtual button presses (Kim et al., 2001), tactile feedback in a multimodal shoulder control (Bien et al., 2004), and highlighting as well as text descriptions (Iáñez et al., 2010; McMullen et al., 2014; Jones et al., 2018), as shown in Figure 3.

3.3.2 Directional gaze

Directional gaze is defined as rapid eye movements (saccades) with additional optional fixation events in a particular direction. Directional gaze can be helpful in improving the performance of the system, as quite large areas can be discriminated for a given movement. For example, Alsharif et al. (2016) demonstrated one of the first systems that worked without a screen and with gaze gestures, in which participants could control a robotic arm with specific sequences of eye movements. The directional control allowed the user to perform various pick-and-place tasks. The end-effector could be translated and rotated through the interface.

Most of the grouped systems performed pick-and-place tasks (Khan et al., 2012), but this straightforward approach has also been tested with other tasks, such as writing and drawing (Ubeda et al., 2011; Dziemian et al., 2016), according to the principle “look to the right, draw a line to the right.” Other approaches let the robot perform a free trajectory, in which the user visually focused on markers on a wall and the robot followed by interpreting EOG signals (Zhang et al., 2013). Five other publications on EOG fall into this category (Khan et al., 2012; Ubeda et al., 2011; Perez Reynoso et al., 2020; Rusydi et al., 2014a; b). One challenge is matching the 3D motion of the robot to the 2D motion of the eye. A solution was presented by Perez Reynoso et al. (2020), who adapted the system to user-specific parameters using a fuzzy inference system. The response time was reduced and 3-dimensional movements could be performed with 2-dimensional eye inputs by using fuzzy classifiers. With this approach, the system is able to use specific coordinates to separate the signal into multiple locations in a workspace.
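
A minimal version of the “look to the right, move to the right” principle can be sketched as a mapping from the gaze offset relative to the center of the field of view to a discrete movement command. The dead zone and the four-direction discretization are assumptions; real systems such as the fuzzy approach of Perez Reynoso et al. (2020) are considerably more elaborate.

```python
import math

def gaze_to_direction(dx, dy, deadzone=0.1):
    """Map a normalized gaze offset from the center (x right, y up) to one
    of four movement commands, or None inside the dead zone."""
    if math.hypot(dx, dy) < deadzone:
        return None  # resting gaze near the center issues no command
    angle = math.degrees(math.atan2(dy, dx)) % 360
    if angle < 45 or angle >= 315:
        return "right"
    if angle < 135:
        return "up"
    if angle < 225:
        return "left"
    return "down"
```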

Other multimodal approaches, such as Wang et al. (2015), combined an EEG system with an eye tracking device by implementing a trained Hidden Markov Model (HMM) to improve the performance of the hBCI. The authors stated: “Our goal was to enable flexible and unscripted control while ensuring high reliability.” A multimodal system including gaze and EEG was used to perform a grasping task with multiple obstacles; gaze was used to correct the robot’s trajectory and EEG to control the speed of the end effector (Wang et al., 2018). Especially in shared workspaces, adding speech and buttons as input can improve accessibility in assembly tasks (Bannat et al., 2009; Rusydi et al., 2014).

3.3.3 Object-oriented gaze

Object-oriented gaze describes the fixation on an object to indicate an interaction to the system. This can improve the user experience because humans tend to look at the object they want to interact with. In addition, the user is not forced to look away, since most of the task execution is automated and the robot is not controlled by looking in a certain direction as in the previous section. This creates the need for computer vision to extract features from the objects, as applied in most cases.

Tasks such as picking ingredients with intention recognition (Huang and Mutlu, 2016), interacting with everyday objects (Onose et al., 2012; Ivorra et al., 2018), or grasping a pair of scissors (Yang et al., 2021) have been realized using eye tracking techniques. The AIDE project uses EEG and EOG to move a robotic arm. The state-of-the-art algorithm uses AI to improve object selection from textureless objects in real time. Mouth poses have been used to improve user safety. The solution can be adapted to control an electric wheelchair (Ivorra et al., 2018).

Object-oriented gaze is a robust solution to the problem of grasping objects in 3D space. The user does not have to switch modes to move between x-y and x-z axes. 3D gaze estimation can be used to identify the location of objects (Tostado et al., 2016; Li et al., 2017; Wöhle and Gebhard, 2021). This information facilitates robot trajectory planning. An algorithm for continuous calibration of the eye tracking device to track the gaze as a 3D point in a scene was presented by Tostado et al. (2016) and realized with a stereo vision camera and machine learning approaches. This system is usable for people with “strabism or other eye alignment defects.” Magnetic, Angular Rate, and Gravity (MARG) sensors and eye tracking were fused to estimate head position in the work of Wöhle and Gebhard (2021). Yang et al. (2021) introduced a system using Apriltags to facilitate joint control with the robot. Points of interest were detected by gaze. Intention detection was used to determine whether an object should be grasped at a particular location. For 3D gaze estimation, head movements were additionally tracked. By combining artificial stereo vision and eye tracking as an input device, Cio et al. (2019) showed that the system performed well in a human-guided grasping task. Even in the presence of obstacles, a task success rate of 91% was achieved. Catalán et al. (2017) presented a multimodal control architecture that uses two eye tracking devices to estimate the location of objects in a scene.

In contrast to telemanipulation using a screen, an approach using mixed reality has been found that also allows the use of artificial visual feedback (Park et al., 2022). This system has been used in industrial applications to realize a shared workspace. Through gaze selection, objects can be defined to be removed by a robot from a given space. In a service application, a system for predicting user intent in a drink mixing task was developed.

4 Discussion

This section discusses research questions RQ2 and RQ3. To answer these questions, a more general overview of the behavioral and technical interpretation of eye movements is presented. A comparison of input modalities, future trends, and open questions will answer RQ3 to increase transparency in this diverse field.

4.1 Interpreting gaze in robot control

As shown, most researchers use fixation to define events. In this context, fixation is defined as focusing on an object or direction. The resulting risk of the Midas Touch Problem can be reduced by a well-chosen dwell time of about 200–700 ms. The Midas Touch Problem is reported in several included publications (Bien et al., 2004; Drewes and Schmidt, 2007; McMullen et al., 2014; Velichkovsky et al., 2014; Alsharif et al., 2016; Dziemian et al., 2016; Webb et al., 2016; Meena et al., 2017; Jones et al., 2018; Stalljann et al., 2020). In some experiments, dwell time led to either pleasant or irritating situations: it was interpreted as entertaining in the experiment of Bednarik et al. (2009), where it was seen as a challenge in a game, or as frustrating when it caused wrong decisions in object manipulation (Jones et al., 2018). Saccades are rarely used. One reason found is that saccades are highly noisy, unintentional, and not goal-directed, which makes them difficult to interpret (Dziemian et al., 2016). Saccadic movements are used by Scalera et al. (2021a) to determine the trajectory of the eye movement and implement virtual brush strokes, which were then interpreted by the robot. Saccadic movements are also used to determine electrical potential changes in EOG signals (Khan et al., 2012).

Some approaches use additional input modalities such as gaze gestures or blinks, although these can be highly intuitive and enrich the received gaze information (Duguleana and Mogan, 2010; Iáñez et al., 2010; Alsharif et al., 2016; Dziemian et al., 2016). A gaze gesture is described as a sequence of eye movement elements (Drewes and Schmidt, 2007) or as “the number of strokes performed in a predefined sequence” (Alsharif et al., 2016) and includes fixations and saccades as well as blinks and winks. If these sequences are easy to remember, this may have an impact on usability.

In addition to the behavioral aspect of interpretation, the provision of interpretable data to the robot must be ensured. Not every publication provided detailed information about the software architecture. The solution was usually divided into two phases: 1) filtering and analysis of the eye tracking data and, if used, of additional cameras and sensors, and 2) robot trajectory planning. In the first step, OpenCV was used several times to detect the position of the pupil. Other systems, like Pupil Core from Pupil Labs and the eye trackers from Tobii, already provide an API to receive this data. The trajectory planning was mostly done with ROS MoveIt or GraspIt. The first ROS distribution was released in 2010. Therefore, at the time of the first publications mentioned, ROS did not exist, so it is most likely that the robot was controlled directly, for example, by programmable logic controllers (PLC). New Python-based toolkits, modules, and environments are being developed to facilitate the control of robots and the implementation of AI models (e.g., PyBullet, OpenVINO, OpenAI Gym, and many more).
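
For the second phase, a gaze-derived 3D target can be handed to a motion planner. The sketch below assumes a ROS 1 workspace with the moveit_commander Python bindings and a planning group named “arm” (both assumptions) and omits all error handling; it is meant only to show where the eye tracking output enters the robot-side pipeline.

```python
import sys
import rospy
import moveit_commander

def move_to_gaze_target(xyz):
    """Plan and execute a motion of the end effector to a gaze-derived
    3D point given in the planning frame of the robot."""
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("gaze_to_motion", anonymous=True)
    group = moveit_commander.MoveGroupCommander("arm")  # assumed group name
    group.set_position_target(list(xyz))  # position-only goal for the TCP
    success = group.go(wait=True)
    group.stop()
    group.clear_pose_targets()
    return success
```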

4.2 Benefits, challenges and limitations of eye tracking

Gaze is considered an intuitive control modality because it requires low cognitive load, is proactive, and directly correlates with action intentions (Tostado et al., 2016). It provides natural, effortless, and rich information that can be interpreted by the robot (Li et al., 2017; Zeng et al., 2017), especially since humans tend to look at objects they want to interact with (Li et al., 2017). When designing interfaces for assistive devices, it is important to consider people’s ability to retrain their motor functions, which leads to different approaches to solving a challenge. During the training phase, the setup may be perceived as unnatural and distracting, resulting in additional cognitive load (Tostado et al., 2016). This may change over time. Considering that the system will be worn for several hours in real-world applications, head-mounted eye tracking devices should be lightweight and adjustable to reduce discomfort and pressure on the head (Wöhle and Gebhard, 2021). In terms of social factors and the individuality of each user, there are important influences that challenge head-mounted and remote eye trackers. Parts of the environment are usually included in the scene video recorded by the eye tracking camera, which raises limitations regarding privacy regulations (Wöhle and Gebhard, 2021). As stated by the European Data Protection Board, the increased use of smart cameras today generates a large amount of additional data beyond the captured video itself. This leads to an increased risk of secondary use for unexpected purposes such as marketing and performance monitoring (Jelinek, 2019). Especially in combination with the use of the robot in ADL, such as for maintaining personal hygiene, ways must be provided to avoid the monitoring of such sensitive data. According to a WHO report, more than half of humanity has either myopia or presbyopia (WHO, 2019), requiring corrective devices such as contact lenses or glasses. Most eye tracking devices are not usable with glasses or lead to higher inaccuracies (Schäfer and Gebhard, 2019; Edughele et al., 2022).

Inadequate accuracy is often reported for eye tracking devices (Huang and Mutlu, 2016; Jones et al., 2018; Stalljann et al., 2020). The signal-to-noise ratio (SNR) in EOG and VOG is sometimes insufficient (Rusydi et al., 2014; Dziemian et al., 2016; Wang et al., 2018; Huang et al., 2019; Schäfer and Gebhard, 2019). This makes it difficult for users to complete the task and limits bandwidth, resulting in incorrect button and object selection (Dziemian et al., 2016; Huang and Mutlu, 2016; Wang et al., 2018). In VOG and IOG applications, head movements can lead to moving areas of interest (Edughele et al., 2022) due to the limited performance of gaze-mapping algorithms (Yang et al., 2021). In experiments, this results in restrictions on head motion (Yang et al., 2021). Other limitations include lighting conditions, camera field of view, and object overlap (Esposito et al., 2021). Eye tracking glasses should be calibrated regularly or secured with a strap, as slippage can lead to errors (Stalljann et al., 2020). A common problem with IMUs or gyroscopes built into the glasses, which are often used to track head or body movements, is DC offset; in gyroscopes, magnetic materials can also cause data drift (Schäfer and Gebhard, 2019; Wöhle and Gebhard, 2021). EOG signals must be filtered and segmented to obtain the desired eye movements for control, as unintended eye movements are also recorded (Huang et al., 2019). Gaze directions were difficult to track in previous work (Huang et al., 2019), but solutions are emerging in current work (Perez Reynoso et al., 2020). Information is lost through filtering (Iáñez et al., 2010). One reason why EOG signals are easy to use is the linear relationship between the signal and eye movement displacement (Rusydi et al., 2014).

4.2.1 Comparison of input modalities

The decision of which input modality is more appropriate depends strongly on the technology used, the goal of the task, and the control algorithm. A number of comparative studies with different experimental setups were identified in this review (Dünser et al., 2015; Pasqualotto et al., 2015; Jones et al., 2018; Schäfer and Gebhard, 2019; Stalljann et al., 2020). Figure 4 shows the articles and the input modalities. The task description is given to provide a deeper insight into the study design.

FIGURE 4. Identified comparative studies describing the task and the used input modalities.

Overall, VOG and IOG were rated more positively than the other modalities. Positive aspects were identified in subjective ratings, such as low workload/cognitive load (Pasqualotto et al., 2015; Stalljann et al., 2020) and comfort (Schäfer and Gebhard, 2019), and in robust functionality (Schäfer and Gebhard, 2019). Pasqualotto et al. (2015) suggested its use as a communication device, similar to Schäfer and Gebhard (2019), who suggested its use for discrete events, such as trigger events, based on the traceability of fast eye movements. Negative aspects were reported as poor performance in the Fitts’ law test, which is associated with low accuracy (Dünser et al., 2015; Jones et al., 2018). Fitts’ law describes the relationship between the time it takes to move quickly to a target area, the distance to that target, and its size (Dünser et al., 2015).
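
For reference, Fitts’ law is commonly written in its Shannon formulation, with movement time MT, target distance D, target width W, and empirically fitted constants a and b (the formulation below is the standard one, not a quotation from the reviewed studies):

```latex
% Fitts' law, Shannon formulation
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)
```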

Contrasting results have been found with other modalities such as the thumbstick (Jones et al., 2018) and the mouse (Dünser et al., 2015). Jones et al. (2018) found better performance when combining input modalities than when using eye tracking alone, as the combination was faster and yielded better workload results. However, Dünser et al. (2015) reported worse results for the eye tracking approach than for the mouse and touchscreen in both objective and subjective measures. One reason for this may be that most participants had used one of these before (Yoo et al., 2002; Jones et al., 2018). It is difficult to determine whether the differences in parameters are due to a learning effect or pure perception.

Head tracking using a Magnetic, Angular Rate, and Gravity (MARG) sensor has also been positively evaluated (Schäfer and Gebhard, 2019; Stalljann et al., 2020). This type of sensor can provide comprehensive data on human movement. Stalljann et al. (2020) reported that button activation to perform the task was less demanding, strenuous, and frustrating with this sensor. Shoulder or head control is considered less intuitive than hand control because the information density is reduced. It is also mentioned that it can be more natural than other input devices (Bien et al., 2003). In comparison, shoulder control was found to be a suitable candidate for moving a robotic arm in continuous control and was rated better than eye tracking in one case (Stalljann et al., 2020). The participant with tetraplegia was able to perform better with eye tracking, as measured by the lower error rate. The difference to the MARG sensor was not significant in the group of able-bodied participants.

EEG, EOG, and BCI were rated lower than all other input modalities due to low signal transmission (Schäfer and Gebhard, 2019), additional hardware required (Schäfer and Gebhard, 2019), and more effort and time required to perform the task (Pasqualotto et al., 2015).

4.3 Future trends

The results for research question RQ3 showed increased mention of the following topics: multimodal control, including 3D gaze estimation and hBCI, and shared control. The authors see high potential in these topics as they can reduce the error rate of eye tracking devices used as stand-alone systems, improve the usability of such systems, and address individual needs.

4.3.1 Multimodal control inputs

Multimodal control inputs can improve the user experience for each participant. They are combined to adapt to the degree and type of impairment (Bien et al., 2004; Siean and Vatavu, 2021) and to the task (Bannat et al., 2009; Di Maio et al., 2021), increasing the intuitive control of the complex system (Bannat et al., 2009; Li et al., 2017; Sharma et al., 2022). The system of Bien et al. (2004) presents a hierarchy and design for different levels of disability. The user was able to move the system with the shoulder if the upper body motor functions were still intact. If the patient has a progressive disease, they can later switch to eye tracking control. The use of multiple input modalities leads to improved performance (Esposito et al., 2021), such as combining gaze and joystick control (Webb et al., 2016) or gaze and head tracking (Kim et al., 2001).

In this work, 6 of the 39 identified and included papers involved the use of hBCI that included eye tracking devices. Eye tracking was used for rapid intention detection, such as moving the end-effector in a particular direction (Wang et al., 2015; 2018), for object pose estimation (Onose et al., 2012; Ivorra et al., 2018), or for triggering switch events (Zeng et al., 2017). For detailed information, we refer to Esposito et al. (2021); Hong and Khan (2017); Nicolas-Alonso and Gomez-Gil (2012); Pasqualotto et al. (2015); Trambaiolli and Falk (2018).

3D gaze estimation can be achieved by combining different input modalities. By estimating the intersection of the lines of sight, the 3D gaze position can be determined with respect to the scene camera in head-mounted glasses (Tostado et al., 2016; Yang et al., 2021). Due to head movements, the point in a world coordinate system needed for robot manipulation is not stable. Two approaches that combine head tracking with eye tracking have been identified. The first approach used motion detection from video cameras placed around the user (Onose et al., 2012; Wöhle and Gebhard, 2021; Yang et al., 2021). The second approach uses accelerometers, IMUs, and MARG sensors placed on the user’s neck and head (Wöhle and Gebhard, 2021). The latter is mostly used as a control based on head motion unrelated to 3D gaze estimation (Schäfer and Gebhard, 2019; Stalljann et al., 2020).
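
A common geometric way to obtain the 3D gaze point from the two lines of sight is the midpoint of the shortest segment between them. The sketch below implements only this triangulation step and is not the calibration pipeline of Tostado et al. (2016) or Yang et al. (2021); eye positions and gaze directions are assumed to be given in a common (e.g., scene camera) coordinate frame.

```python
import numpy as np

def gaze_point_3d(o_l, d_l, o_r, d_r):
    """Midpoint of the shortest segment between the left and right gaze rays.
    o_l, o_r are eye positions, d_l, d_r unit gaze directions (3-vectors)."""
    o_l, d_l, o_r, d_r = map(np.asarray, (o_l, d_l, o_r, d_r))
    w0 = o_l - o_r
    a, b, c = d_l @ d_l, d_l @ d_r, d_r @ d_r
    d, e = d_l @ w0, d_r @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:
        return None  # (nearly) parallel gaze axes: no usable vergence point
    s = (b * e - c * d) / denom  # parameter along the left ray
    t = (a * e - b * d) / denom  # parameter along the right ray
    p_l = o_l + s * d_l
    p_r = o_r + t * d_r
    return (p_l + p_r) / 2.0
```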

4.3.2 Shared control

In the literature, semi-autonomous robot control was found as a synonym for shared control. The robot can act and perform movements or tasks automatically, but is dependent on a user. The goal is to realize a comfortable and intuitive control of the device (Bannat et al., 2009; Li et al., 2017; Sharma et al., 2022). One solution is to incorporate intention recognition into the design process (Bien et al., 2004; Huang and Mutlu, 2016; Shafti and Faisal, 2021). Intention can be read from Areas of Interest (AOI) in eye tracking approaches, such as the handle of a cup when the user intends to grasp it, or by tracking facial features. Further literature on this area of research can be found in Jain and Argall (2019); Aronson and Admoni (2019); Bonci et al. (2021), and Hentout et al. (2019).

Task-oriented design is one of the key benefits of shared control. Activities of daily living are composed of multiple subtasks. For example, drinking can be divided into reaching for and grasping a bottle, filling the glass, placing the bottle, taking the glass, and reaching toward the user. A human will decide at which step to stop, and it is not necessary to perform them all in one go. In most of the presented approaches, the task is realized by the user fixating the object and correcting the system (Bien et al., 2003; 2004; McMullen et al., 2014; Wang et al., 2015; Huang and Mutlu, 2016). A comparison of the resulting degrees of autonomy is shown for a feeding task by Bhattacharjee et al. (2020). The main finding is that the user should always feel in control of the situation.
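
The subtask structure of such an ADL can be illustrated as a simple sequence in which the robot only advances after an explicit user confirmation (e.g., a dwell selection), so the user can stop at any step. The subtask names and the confirmation mechanism are illustrative assumptions, not a description of any reviewed system.

```python
DRINKING_SUBTASKS = [
    "reach_bottle", "grasp_bottle", "pour_into_glass",
    "place_bottle", "grasp_glass", "bring_glass_to_user",
]

def next_step(completed, user_confirms):
    """Return the next subtask to execute, or None if the task is finished
    or the user has not confirmed the next step."""
    if completed >= len(DRINKING_SUBTASKS) or not user_confirms:
        return None  # the user stays in control and can stop at any point
    return DRINKING_SUBTASKS[completed]
```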

One way to implement shared control is to use AI, as has been done in several works. Object detection can be readily performed with models such as YOLO or RetinaNet (Ivorra et al., 2018; Park et al., 2022). Neural networks have been used to learn robot motion (Bien et al., 2004), to model EOG functions (Perez Reynoso et al., 2020), to model eye-hand coordination behavior during grasping (Li et al., 2017), or for face recognition (Sharma et al., 2022). The use of AI helps to automate tasks. These models are becoming increasingly important, especially in robotics but also in behavioral science.
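
In practice, detectors such as YOLO or RetinaNet return labeled bounding boxes, and shared control then has to decide which detected object the user is fixating. The sketch below assumes the detections are already available (the detector call itself is omitted) and selects the box whose center is closest to the current gaze point, provided the gaze actually falls inside it; the data layout and threshold are assumptions for illustration.

```python
import math
from typing import List, Optional, Tuple

# (label, confidence, x_min, y_min, x_max, y_max) in image coordinates
Detection = Tuple[str, float, float, float, float, float]

def fixated_object(gaze_xy: Tuple[float, float],
                   detections: List[Detection],
                   min_conf: float = 0.5) -> Optional[str]:
    """Return the label of the detection the gaze point most plausibly targets."""
    gx, gy = gaze_xy
    best, best_dist = None, float("inf")
    for label, conf, x0, y0, x1, y1 in detections:
        if conf < min_conf or not (x0 <= gx <= x1 and y0 <= gy <= y1):
            continue
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        dist = math.hypot(gx - cx, gy - cy)
        if dist < best_dist:
            best, best_dist = label, dist
    return best

# Usage sketch with two overlapping detections.
dets = [("cup", 0.91, 280, 190, 360, 280), ("bottle", 0.88, 340, 100, 420, 300)]
print(fixated_object((350, 230), dets))   # -> "cup" (closer box center)
```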

4.4 Open questions

To conclude RQ3, open questions were identified that need to be answered to make progress in the design of eye tracking control systems for robotic arms. Frequently mentioned research topics concern performance measurement, the inclusion of people with disabilities in studies, and the use of artificial feedback to improve the usability and accessibility of such systems.

4.4.1 Assistive robotics performance measurements

How the performance of an ARA is measured depends strongly on the system, the domain of origin (e.g., computer science), and the study design. The most common parameters reported in the included papers were task completion time and success rate (Anke Fischer-Janzen, Overview of Measurements, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/OverviewOfMeasurements.md, last accessed: 09.02.2024). These parameters are not comparable between studies because of differences in design. All reported parameters were divided into task-related, computational, and empirical parameters. Computational parameters are mostly reported as accuracy (Bien et al., 2004; Onose et al., 2012; Wang et al., 2015; Li et al., 2017; Zeng et al., 2017; Huang et al., 2019; Yang et al., 2021). In the case of object recognition, parameters such as prediction accuracy, projection accuracy (McMullen et al., 2014; Huang and Mutlu, 2016; Ivorra et al., 2018), recognition rate (Zhang et al., 2013; Yang et al., 2021), and reaction time (Huang and Mutlu, 2016) are used. Computational cost, as reported by Ivorra et al. (2018), was used to rank different algorithms. Task-related parameters are most often given as task completion time and success rate. Counting specific events also helps the reader to estimate the challenges in the system; for example, authors report the number of required gestures, which indicates the effort required by the user (Alsharif et al., 2016), or the number of collisions when obstacles are present (Huang et al., 2019). In the case of eye tracking, gaze parameters such as dwell time (fixation duration), gaze estimation accuracy, or areas of interest are monitored (Holmqvist and Andersson, 2017; Edughele et al., 2022). Other error measures, such as Euclidean and Cartesian errors, describe the drift between the intended position and the robot position (Tostado et al., 2016; Li et al., 2017; Perez Reynoso et al., 2020; Wöhle and Gebhard, 2021).

To ensure better comparability between studies, we suggest specifying such parameters explicitly. The completion time itself depends on several factors; dividing the task into several steps and reporting the corresponding empirical and computational parameters (e.g., the reaction time of the user, the computation time, and the execution time of the robot) has been found to be helpful. Reproducible results can be obtained by additionally specifying the trajectory length and the maximum speed of the robot. Such boundary conditions for reproducibility were missing in some publications.
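
To make such task-related parameters easier to compare across studies, they can be derived from logged trial data in a uniform way. The sketch below computes task completion time, success rate, and the mean Euclidean end-point error from a list of trial records; the record fields are illustrative assumptions about what a study might log, not a prescribed logging format.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

@dataclass
class Trial:
    start_s: float                           # timestamps in seconds
    end_s: float
    success: bool
    target_xyz: Tuple[float, float, float]   # intended end-effector position (m)
    reached_xyz: Tuple[float, float, float]  # actual end-effector position (m)

def summarize(trials: List[Trial]) -> dict:
    """Aggregate task-related metrics over all logged trials."""
    times = [t.end_s - t.start_s for t in trials]
    errors = [math.dist(t.target_xyz, t.reached_xyz) for t in trials]
    return {
        "mean_completion_time_s": mean(times),
        "success_rate": sum(t.success for t in trials) / len(trials),
        "mean_euclidean_error_m": mean(errors),
    }

# Usage sketch with two logged trials.
log = [Trial(0.0, 14.2, True,  (0.40, 0.10, 0.25), (0.41, 0.09, 0.26)),
       Trial(0.0, 21.7, False, (0.40, 0.10, 0.25), (0.35, 0.15, 0.30))]
print(summarize(log))
```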

User satisfaction is evaluated through subjective measurements. The following tests and questionnaires were identified: NASA TLX (NASA Task Load Index), USE questionnaire (Usefulness, Satisfaction, and Ease of use), SUS questionnaire (System Usability Scale), and QUEST 2.0 (Quebec User Evaluation of Satisfaction with assistive Technology). Non-standardized tests were often used to analyze specific features of a system. This section was expanded to include the QUEAD (Questionnaire for the Evaluation of Physical Assistive Devices), which is based on the TAM (Technology Acceptance Model), to provide another test method applicable to assistive robotics. Table 5 provides a quick overview for new researchers. The question to be answered in future work is to what extent these questionnaires can be used to evaluate the performance of eye tracking based robot control. Although most of the questionnaires are well established and hurdles in their execution and evaluation are known, their application to this novel topic still needs to be explored.
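
As an example of how one of these standardized questionnaires is scored, the SUS yields a single 0-100 value from ten items rated 1-5: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 (Brooke, 1996). A minimal implementation:

```python
from typing import Sequence

def sus_score(responses: Sequence[int]) -> float:
    """System Usability Scale score from ten 1-5 Likert responses."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten responses in the range 1-5.")
    contributions = [(r - 1) if i % 2 == 0 else (5 - r)   # indices 0,2,... are items 1,3,...
                     for i, r in enumerate(responses)]
    return 2.5 * sum(contributions)

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))   # -> 85.0
```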

TABLE 5. Standardized questionnaires for subjective evaluation of assistive robotics.

4.4.2 User-centered design

In this review, six of the 39 publications included people with physical disabilities. A lack of inclusion of disabled people in such studies can also be seen in other areas, such as the accessibility of wearables (Siean and Vatavu, 2021). Participants' opinions about a system are crucial for improving it. Differences in evaluation between able-bodied and disabled participants have been reported several times (Bien et al., 2004; Cincotti et al., 2008; Pasqualotto et al., 2015; Stalljann et al., 2020). For example, Stalljann et al. (2020) found that healthy participants could easily switch between controls and interpret the artificial feedback well, whereas this was much more difficult for the tetraplegic participant. Onose et al. (2012) cited difficulties caused by changing postures in a multi-day study. In particular, severely immobile users may find it difficult to remain in the field of view of stationary eye tracking devices (McMullen et al., 2014).

According to Lulé et al. (2008), the decision against life-prolonging treatments in ALS patients is based on the fear of losing autonomy through lack of mobility and impaired communication. This has a major impact on the Quality of Life (QoL). Eye tracking based controls can help to increase the possibilities of an autonomous life (Edughele et al., 2022). At the same time, everyday use of such systems poses risks and challenges. For an ARA mounted on an electric wheelchair, the instruments needed by a tetraplegic person can include respiratory and gastric catheters as well as communication devices (Edughele et al., 2022). The robot therefore has to move carefully in close proximity to the person so as not to interfere with tubes and cables. In addition to these medical challenges, research in this area should be guided by which activities and needs must be addressed to improve QoL for each individual. Performing ADLs with an ARA may be a promising way to improve QoL. Care is mostly provided by family or professional caregivers. They take care of personal hygiene, dressing, and medication and, in the case of assistive devices such as an ARA, of maintenance and daily setup of the system (Chung et al., 2013). Especially when the patient relies on invasive devices, a clean environment is necessary and the system should be easy to keep sterile. In addition, the system should be easy to deploy to further reduce the time pressure on caregivers.

Based on these impressions, the question arises as to what other effects have not yet been uncovered in case studies. Further studies need to be conducted with real-world robots controlled by eye tracking to assess the effects on impaired and able-bodied people. Multiple stakeholders are involved in creating a user-friendly design; the design process should therefore include input from caregivers and family members to ensure ease of use. Due to the novelty of the technology, no details on this could be found in the included publications.

4.4.3 Artificial feedback

Artificial feedback is well studied in the field of electric prostheses and other computer-based user interfaces. It provides an easy-to-interpret signal for closed-loop interaction between the user and the robot and, when applied correctly, contributes to a good user experience. The question arises as to how such feedback can be used in robotic applications with eye tracking. Figure 3 shows the use of artificial feedback in the form of auditory, visual, and tactile feedback in the included publications. Across all included publications, 25 authors reported no use of feedback, 11 used visual feedback, two used auditory feedback, and one used tactile feedback.
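
One way to think about integrating such feedback is as a small dispatcher that maps system events (object selected, command accepted, error) onto whichever channels a given user can perceive and the hardware supports. The sketch below is a generic illustration; the channel back-ends (display, speaker, vibration motor) are hypothetical callbacks, not APIs of the reviewed systems.

```python
from typing import Callable, Dict, List

class FeedbackDispatcher:
    """Routes system events to the feedback channels configured for a user."""

    def __init__(self) -> None:
        self._channels: Dict[str, List[Callable[[str], None]]] = {}

    def register(self, event: str, channel: Callable[[str], None]) -> None:
        self._channels.setdefault(event, []).append(channel)

    def notify(self, event: str) -> None:
        for channel in self._channels.get(event, []):
            channel(event)

# Usage sketch: visual + auditory cues on selection, vibration on error.
dispatcher = FeedbackDispatcher()
dispatcher.register("object_selected", lambda e: print(f"[display] highlight ({e})"))
dispatcher.register("object_selected", lambda e: print(f"[speaker] short beep ({e})"))
dispatcher.register("robot_error",     lambda e: print(f"[vibration] 200 ms pulse ({e})"))
dispatcher.notify("object_selected")
```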

In the group of telemanipulation systems, half of the authors used visual feedback to indicate system states to the user. In most APIs, button click visualizations, colored boxes, outlines, and bounding boxes are easy to program and give the user visual feedback about their interaction with the system. Transferring these feedback methods to a controller without a display seems complicated and may be unnecessary if the robot responds quickly, which may be one reason why most studies did not suggest the need for further artificial feedback. However, such feedback can provide important information to the user, resulting in higher user satisfaction (Zeng et al., 2017). In the identified publications, two authors used visual feedback without a display. Wang et al. (2018) used the robotic arm itself to provide visual cues by moving the end-effector to trigger each task. Park et al. (2022) used virtual objects displayed in augmented reality to visualize the movement of the object, which was then performed by the robot. The use of virtual and augmented reality allows this discipline to easily integrate visual feedback into real-world tasks.

Auditory feedback can be easily applied when a tablet is used (Edughele et al., 2022). Examples were provided by Ivorra et al. (2018) and Dziemian et al. (2016): in these papers, a brief tone indicated either the selection of an object (Ivorra et al., 2018) or the successful selection of a movement command (Dziemian et al., 2016). Design rules for auditory feedback can be found in the work of Tajadura-Jiménez et al. (2018).

Tactile feedback can be realized with small vibration motors such as those used in smartphones, which could be mounted on the eye tracking glasses (Rantala et al., 2014; Fischer et al., 2023). Bien et al. (2004) used it to provide status information to the user. Several design rules for vibrotactile feedback should be considered (Gemperle et al., 2001; Choi and Kuchenbecker, 2013; Rantala et al., 2020); they mostly relate to body-worn devices or to actuators integrated into eye tracking glasses.

While this is a good start, more research needs to be done on what feedback is practical and whether it affects user or system behavior.

4.5 Limitations

Interest in assistive devices will grow in the coming years due to the needs of the elderly and disabled as well as the shortage of caregivers. Each of the research interests presented above is a vast domain in itself. Therefore, a bias based on the restriction to robotic applications cannot be excluded. To reduce this bias and to enrich the discussion, some of the excluded works were nevertheless referred to. The focus was on research that considered eye tracking in combination with a robotic arm. Due to the interdisciplinary nature of the field, literature focusing on only one of the topics may report different advantages and disadvantages.

The bias was estimated by comparing the keywords given by the authors with the search terms. Figure 5 shows the results for the keywords listed in Section 2. The top 10 keywords used in all publications are 1. robot, 2. human, 3. interface, 4. control, 5. gaze, 6. robotic, 7. eye, 8. interaction, 9. assistive, 10. computer. The reason why 17 relevant publications were only found in the snowball search is the absence of keywords such as “assistive,” “impairment,” and “shared,” and the fact that the corresponding database was not included in the initial search. Regarding the search terms used in the methods, “tracking” was not among the top 10 keywords in the list of systematically searched papers. The keywords “brain,” “machine,” “BCI,” and “EOG” were identified in the snowball search. Since the focus of this paper is on eye tracking based systems, the identification and screening process was not repeated. Regarding the top 10 list of all publications, a new search would not have yielded any new results, because most of the terms are synonyms of previously used search terms or would still have missed key terms.
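
The keyword comparison above amounts to counting normalized author keywords across the included publications and ranking them, which can be reproduced in a few lines; the keyword lists below are placeholders, not the actual data behind Figure 5.

```python
from collections import Counter
from typing import List

# Placeholder keyword lists, one per publication (the real data is in the
# online tables referenced in this article).
publications: List[List[str]] = [
    ["eye tracking", "assistive robot", "human-robot interaction"],
    ["gaze control", "robotic arm", "shared control"],
    ["eye tracking", "robotic arm", "BCI"],
]

def top_keywords(pubs: List[List[str]], n: int = 10):
    counts = Counter()
    for keywords in pubs:
        for kw in keywords:
            counts.update(kw.lower().split())   # split phrases into single terms
    return counts.most_common(n)

print(top_keywords(publications))
```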

FIGURE 5. Search term analysis.

5 Conclusion

We presented an overview of 39 works with real robotic applications controlled by eye tracking. These approaches require interdisciplinary knowledge from the fields of robotics, human-robot interaction, and behavioral science. The interdisciplinary nature of the field often leads to different keywords for the same topic, making it difficult to find literature. It has been shown that the interpretation of gaze depends on the hardware used, the task the robot has to perform, and the stage of the user's disability. Research interests were found in topics such as multimodal inputs and shared control. Insight was provided into performance measurements, the inclusion of disabled people in study designs, and artificial feedback. Benefits and challenges in terms of eye movement detection methods and various important user parameters were described. The comparison of input modalities showed that the optimal devices depend strongly on the user's abilities. The tables presented online (Anke Fischer-Janzen, Study Descriptions, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/StudyDesciption.md, last accessed: 09.02.2024 and Anke Fischer-Janzen, Overview of Measurements, URL: https://github.com/AnkeLinus/EyeTrackingInRobotControlTasks/blob/main/OverviewOfMeasurements.md, last accessed: 09.02.2024) will be expanded to further facilitate the identification of assistive robotic arm controllers.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author contributions

AF-J: Conceptualization, Data curation, Formal Analysis, Writing–original draft, Writing–review and editing. TW: Resources, Writing–review and editing. KvL: Resources, Writing–review and editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Acknowledgments

We would like to thank the employees of the Work-Life Robotics Institute at the University of Applied Sciences Offenburg and the Ubiquitous Computing group at the University of Siegen for the opportunity to discuss this work.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frobt.2024.1326670/full#supplementary-material

References

Al-Rahayfeh, A., and Faezipour, M. (2013). Eye tracking and head movement detection: a state-of-art survey. IEEE J. Transl. Eng. health Med. 1, 2100212. doi:10.1109/JTEHM.2013.2289879

Alsharif, S. (2017). Gaze-based control of robot arm in three-dimensional space. Bremen: Ph.D. thesis, University Bremen.

Alsharif, S., Kuzmicheva, O., and Gräser, A. (2016). “Gaze gesture-based human robot interface,” in Conference “Technische Unterstützungssysteme, die die Menschen wirklich wollen” (Hamburg: Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg), 339–348.

Aronson, R. M., and Admoni, H. (2019). “Semantic gaze labeling for human-robot shared manipulation,” in Proceedings of the 11th ACM symposium on eye tracking research & applications. Editors K. Krejtz, and B. Sharif (New York, NY, United States: ACM), 1–9. doi:10.1145/3314111.3319840

Aronson, R. M., and Admoni, H. (2020). “Eye gaze for assistive manipulation,” in Companion of the 2020 ACM/IEEE international conference on human-robot interaction. Editors T. Belpaeme, J. Young, H. Gunes, and L. Riek (New York, NY, United States: ACM), 552–554. doi:10.1145/3371382.3377434

Aronson, R. M., Almutlak, N., and Admoni, H. (2021). “Inferring goals with gaze during teleoperated manipulation,” in 2021 IEEE/RSJ international conference on intelligent robots and systems (IROS) (IEEE), 7307–7314. doi:10.1109/IROS51168.2021.9636551

Aronson, R. M., Santini, T., Kübler, T. C., Kasneci, E., Srinivasa, S., and Admoni, H. (2018). “Eye-hand behavior in human-robot shared manipulation,” in Proceedings of the 2018 ACM/IEEE international conference on human-robot interaction. Editors T. Kanda, S. Ŝabanović, G. Hoffman, and A. Tapus (New York, NY, United States: ACM), 4–13. doi:10.1145/3171221.3171287

Bannat, A., Gast, J., Rehrl, T., Rösel, W., Rigoll, G., and Wallhoff, F. (2009). “A multimodal human-robot-interaction scenario: working together with an industrial robot,” in Human-computer interaction. Novel interaction methods and techniques. Editor J. A. Jacko (Berlin, Heidelberg: Springer Berlin Heidelberg), 5611, 302–311. doi:10.1007/978-3-642-02577-8_33

Bednarik, R., Gowases, T., and Tukiainen, M. (2009). Gaze interaction enhances problem solving: effects of dwell-time based, gaze-augmented, and mouse interaction on problem-solving strategies and user experience. J. Eye Mov. Res. 3. doi:10.16910/jemr.3.1.3

Bhattacharjee, T., Gordon, E. K., Scalise, R., Cabrera, M. E., Caspi, A., Cakmak, M., et al. (2020). “Is more autonomy always better?,” in Proceedings of the 2020 ACM/IEEE international conference on human-robot interaction. Editors T. Belpaeme, J. Young, H. Gunes, and L. Riek (New York, NY, USA: ACM), 181–190. doi:10.1145/3319502.3374818

Bien, Z., Chung, M.-J., Chang, P.-H., Kwon, D.-S., Kim, D.-J., Han, J.-S., et al. (2004). Integration of a rehabilitation robotic system (kares ii) with human-friendly man-machine interaction units. Aut. Robots 16, 165–191. doi:10.1023/B:AURO.0000016864.12513.77

Bien, Z., Kim, D.-J., Chung, M.-J., Kwon, D.-S., and Chang, P.-H. (2003). “Development of a wheelchair-based rehabilitation robotic system (kares ii) with various human-robot interaction interfaces for the disabled,” in Proceedings 2003 IEEE/ASME international conference on advanced intelligent mechatronics (AIM 2003) (IEEE), 902–907. doi:10.1109/AIM.2003.1225462

Bonci, A., Cen Cheng, P. D., Indri, M., Nabissi, G., and Sibona, F. (2021). Human-robot perception in industrial environments: a survey. Sensors Basel, Switz. 21, 1571. doi:10.3390/s21051571

Brooke, J. (1996). “Sus: a ’quick and dirty’ usability scale,” in Usability evaluation in industry. Editors P. W. Jordan, B. Thomas, I. L. McClelland, and B. Weerdmeester (London: CRC Press), 1–7.

Callejas-Cuervo, M., González-Cely, A. X., and Bastos-Filho, T. (2020). Control systems and electronic instrumentation applied to autonomy in wheelchair mobility: the state of the art. Sensors Basel, Switz. 20, 6326. doi:10.3390/s20216326

Catalán, J. M., Díez, J. A., Bertomeu-Motos, A., Badesa, F. J., and Garcia-Aracil, N. (2017). “Multimodal control architecture for assistive robotics,”. Converging clinical and engineering research on neurorehabilitation II. Editors J. Ibáñez, J. González-Vargas, J. M. Azorín, M. Akay, and J. L. Pons (Springer International Publishing), 513–517. doi:10.1007/978-3-319-46669-9_85

Chaudhary, U., Vlachos, I., Zimmermann, J. B., Espinosa, A., Tonin, A., Jaramillo-Gonzalez, A., et al. (2022). Spelling interface using intracortical signals in a completely locked-in patient enabled via auditory neurofeedback training. Nat. Commun. 13, 1236. doi:10.1038/s41467-022-28859-8

Choi, S., and Kuchenbecker, K. J. (2013). Vibrotactile display: perception, technology, and applications. Proc. IEEE 101, 2093–2104. doi:10.1109/JPROC.2012.2221071

Chung, C.-S., Wang, H., and Cooper, R. A. (2013). Functional assessment and performance evaluation for assistive robotic manipulators: literature review. J. spinal cord Med. 36, 273–289. doi:10.1179/2045772313Y.0000000132

Cincotti, F., Mattia, D., Aloise, F., Bufalari, S., Schalk, G., Oriolo, G., et al. (2008). Non-invasive brain-computer interface system: towards its application as assistive technology. Brain Res. Bull. 75, 796–803. doi:10.1016/j.brainresbull.2008.01.007

Cio, Y.-S. L.-K., Raison, M., Leblond Menard, C., and Achiche, S. (2019). Proof of concept of an assistive robotic arm control using artificial stereovision and eye-tracking. IEEE Trans. neural Syst. rehabilitation Eng. a Publ. IEEE Eng. Med. Biol. Soc. 27, 2344–2352. doi:10.1109/TNSRE.2019.2950619

Clark, A., and Ahmad, I. (2021). “Interfacing with robots without the use of touch or speech,” in The 14th PErvasive technologies related to assistive environments conference (New York, NY, USA: ACM), 347–353. doi:10.1145/3453892.3461330

Cowan, R. E., Fregly, B. J., Boninger, M. L., Chan, L., Rodgers, M. M., and Reinkensmeyer, D. J. (2012). Recent trends in assistive technology for mobility. J. neuroengineering rehabilitation 9, 20. doi:10.1186/1743-0003-9-20

Crea, S., Nann, M., Trigili, E., Cordella, F., Baldoni, A., Badesa, F. J., et al. (2018). Feasibility and safety of shared eeg/eog and vision-guided autonomous whole-arm exoskeleton control to perform activities of daily living. Sci. Rep. 8, 10823. doi:10.1038/s41598-018-29091-5

Demers, L., Monette, M., Lapierre, Y., Arnold, D., and Wolfson, C. (2002). Reliability, validity, and applicability of the quebec user evaluation of satisfaction with assistive technology (quest 2.0) for adults with multiple sclerosis. Disabil. Rehabil. 24, 21–30. doi:10.1080/09638280110066352

Demers, L., Weiss-Lambrou, R., and Ska, B. (1996). Development of the quebec user evaluation of satisfaction with assistive technology (quest). Assistive Technol. official J. RESNA 8, 3–13. doi:10.1080/10400435.1996.10132268

Despinoy, F., Torres, L., Roberto, J., Vitrani, M.-A., and Herman, B. (2013). “Toward remote teleoperation with eye and hand: a first experimental study,” in 3rd joint workshop on new technologies for computer/robot assisted surgery (CRAS2013), 1–4.

Destatis (2022). Press release of 22 June 2022: 7.8 million severely disabled people living in Germany.

Di Maio, M., Dondi, P., Lombardi, L., and Porta, M. (2021). “Hybrid manual and gaze-based interaction with a robotic arm,” in 2021 26th IEEE international conference on emerging technologies and factory automation (ETFA) (IEEE), 1–4. doi:10.1109/ETFA45728.2021.9613371

Dragomir, A., Pana, C. F., Cojocaru, D., and Manga, L. F. (2021). “Human-machine interface for controlling a light robotic arm by persons with special needs,” in 2021 22nd international carpathian control conference (ICCC) (IEEE), 1–6. doi:10.1109/ICCC51557.2021.9454664

Drewes, H., and Schmidt, A. (2007). “Interacting with the computer using gaze gestures,” in Human-computer interaction - interact 2007 (Berlin, Heidelberg: Springer), 1–14.

Duguleana, M., and Mogan, G. (2010). “Using eye blinking for eog-based robot control,” in Emerging trends in technological innovation: first IFIP WG 5.5SOCOLNET doctoral conference on computing, electrical and industrial systems, DoCEIS 2010, costa de caparica, Portugal, february 22-24, 2010 proceedings. Editors L. Camarinha-Matos, P. Pereira, and L. Ribeiro (Springer), 343–350.

Dünser, A., Lochner, M., Engelke, U., and Rozado Fernández, D. (2015). Visual and manual control for human-robot teleoperation. IEEE Comput. Graph. Appl. 35, 22–32. doi:10.1109/MCG.2015.4

Dziemian, S., Abbott, W. W., and Faisal, A. A. (2016). “Gaze-based teleprosthetic enables intuitive continuous control of complex robot arm use: writing & drawing,” in 2016 6th IEEE international conference on biomedical robotics and biomechatronics (BioRob) (IEEE), 1277–1282. doi:10.1109/BIOROB.2016.7523807

Edughele, H. O., Zhang, Y., Muhammad-Sukki, F., Vien, Q.-T., Morris-Cafiero, H., and Opoku Agyeman, M. (2022). Eye-tracking assistive technologies for individuals with amyotrophic lateral sclerosis. IEEE Access 10, 41952–41972. doi:10.1109/ACCESS.2022.3164075

Esposito, D., Centracchio, J., Andreozzi, E., Gargiulo, G. D., Naik, G. R., and Bifulco, P. (2021). Biosignal-based human-machine interfaces for assistance and rehabilitation: a survey. Sensors Basel, Switz. 21, 6863. doi:10.3390/s21206863

Fischer, A., Gawron, P., Stiglmeier, L., Wendt, T. M., and van Laehrhoven, K. (2023). Evaluation of precision, accuracy and threshold for the design of vibrotactile feedback in eye tracking applications. J. Sens. Sens. Syst. 12, 103–109. doi:10.5194/jsss-12-103-2023

Fischer-Janzen, A. (2023). Ankelinus/eyetrackinginrobotcontroltasks: V4 updated version of data base. doi:10.5281/zenodo.10034509

Gao, M., Kortum, P., and Oswald, F. (2018). Psychometric evaluation of the use (usefulness, satisfaction, and ease of use) questionnaire for reliability and validity. Proc. Hum. Factors Ergonomics Soc. Annu. Meet. 62, 1414–1418. doi:10.1177/1541931218621322

Gemperle, F., Ota, N., and Siewiorek, D. (2001). “Design of a wearable tactile display,” in Proceedings fifth international symposium on wearable computers (IEEE Comput. Soc), 5–12. doi:10.1109/ISWC.2001.962082

Gonzalez-Aguirre, J. A., Osorio-Oliveros, R., Rodríguez-Hernández, K. L., Lizárraga-Iturralde, J., Morales Menendez, R., Ramírez-Mendoza, R. A., et al. (2021). Service robots: trends and technology. Appl. Sci. 11, 10702. doi:10.3390/app112210702

Hart, S. G., and Staveland, L. E. (1988). “Development of nasa-tlx (task load index): results of empirical and theoretical research,”. Human mental workload (Elsevier), 52, 139–183. doi:10.1016/S0166-4115(08)62386-9

Hentout, A., Aouache, M., Maoudj, A., and Akli, I. (2019). Human–robot interaction in industrial collaborative robotics: a literature review of the decade 2008–2017. Adv. Robot. 33, 764–799. doi:10.1080/01691864.2019.1636714

Holmqvist, K., and Andersson, R. (2017). Eye-tracking: a comprehensive guide to methods, paradigms and measures. Oxford University Press.

Hong, K.-S., and Khan, M. J. (2017). Hybrid brain-computer interface techniques for improved classification accuracy and increased number of commands: a review. Front. neurorobotics 11, 35. doi:10.3389/fnbot.2017.00035

Huang, C.-M., and Mutlu, B. (2016). “Anticipatory robot control for efficient human-robot collaboration,” in 2016 11th ACM/IEEE international conference on human-robot interaction (HRI) (IEEE), 83–90. doi:10.1109/HRI.2016.7451737

Huang, Q., Zhang, Z., Yu, T., He, S., and Li, Y. (2019). An eeg-/eog-based hybrid brain-computer interface: application on controlling an integrated wheelchair robotic arm system. Front. Neurosci. 13, 1243–1249. doi:10.3389/fnins.2019.01243

Iáñez, E., Azorín, J. M., Fernández, E., and Úbeda, A. (2010). Interface based on electrooculography for velocity control of a robot arm. Appl. Bionics Biomechanics 7, 199–207. doi:10.1080/11762322.2010.503107

Ivorra, E., Ortega, M., Catalán, J. M., Ezquerro, S., Lledó, L. D., Garcia-Aracil, N., et al. (2018). Intelligent multimodal framework for human assistive robotics based on computer vision algorithms. Sensors Basel, Switz. 18, 2408. doi:10.3390/s18082408

Jain, S., and Argall, B. (2019). Probabilistic human intent recognition for shared autonomy in assistive robotics. ACM Trans. human-robot Interact. 9, 1–23. doi:10.1145/3359614

Jelinek, A. (2019). Guidelines 3/2019 on processing of personal data through video devices. European Data Protection Board. Tech. rep.

Jiang, H., Zhang, T., Wachs, J. P., and Duerstock, B. S. (2016). Enhanced control of a wheelchair-mounted robotic manipulator using 3-d vision and multimodal interaction. Comput. Vis. Image Underst. 149, 21–31. doi:10.1016/j.cviu.2016.03.015

Jones, E., Chinthammit, W., Huang, W., Engelke, U., and Lueg, C. (2018). Symmetric evaluation of multimodal human–robot interaction with gaze and standard control. Symmetry 10, 680. doi:10.3390/sym10120680

Karas, K., Pozzi, L., Pedrocchi, A., Braghin, F., and Roveda, L. (2023). Brain-computer interface for robot control with eye artifacts for assistive applications. Sci. Rep. 13, 17512. doi:10.1038/s41598-023-44645-y

Khan, A., Memon, M. A., Jat, Y., and Khan, A. (2012). Electro-occulogram based interactive robotic arm interface for partially paralytic patients. ITEE J. 1.

Kim, D. H., Kim, J. H., Yoo, D. H., Lee, Y. J., and Chung, M. J. (2001). A human-robot interface using eye-gaze tracking system for people with motor disabilities. Transaction Control, Automation Syst. Eng. 3, 229–235.

Kirchner, E. A., Albiez, J. C., Seeland, A., Jordan, M., and Kirchner, F. (2013). “Towards assistive robotics for home rehabilitation,” in Proceedings of the international conference on biomedical electronics and devices (SciTePress - Science and and Technology Publications), 168–177. doi:10.5220/0004248501680177

Klaib, A. F., Alsrehin, N. O., Melhem, W. Y., Bashtawi, H. O., and Magableh, A. A. (2021). Eye tracking algorithms, techniques, tools, and applications with an emphasis on machine learning and internet of things technologies. Expert Syst. Appl. 166, 114037–114124. doi:10.1016/j.eswa.2020.114037

Kyrarini, M., Lygerakis, F., Rajavenkatanarayanan, A., Sevastopoulos, C., Nambiappan, H. R., Chaitanya, K. K., et al. (2021). A survey of robots in healthcare. Technologies 9, 8–26. doi:10.3390/technologies9010008

Li, S., Zhang, X., and Webb, J. D. (2017). 3-d-gaze-based robotic grasping through mimicking human visuomotor function for people with motion impairments. IEEE Trans. bio-medical Eng. 64, 2824–2835. doi:10.1109/TBME.2017.2677902

Liang, N., and Nejat, G. (2022). A meta-analysis on remote hri and in-person hri: what is a socially assistive robot to do? Sensors Basel, Switz. 22, 7155. doi:10.3390/s22197155

Lulé, D., Häcker, S., Ludolph, A., Birbaumer, N., and Kübler, A. (2008). Depression and quality of life in patients with amyotrophic lateral sclerosis. Dtsch. Arzteblatt Int. 105, 397–403. doi:10.3238/arztebl.2008.0397

Lund, A. (2001). Measuring usability with the use questionnaire. Usability User Exp. Newsl. STC Usability SIG 8.

Ma, J., Zhang, Y., Cichocki, A., and Matsuno, F. (2015). A novel eog/eeg hybrid human-machine interface adopting eye movements and erps: application to robot control. IEEE Trans. bio-medical Eng. 62, 876–889. doi:10.1109/TBME.2014.2369483

Maimon-Dror, R. O., Fernandez-Quesada, J., Zito, G. A., Konnaris, C., Dziemian, S., and Faisal, A. A. (2017). Towards free 3d end-point control for robotic-assisted human reaching using binocular eye tracking. IEEE Int. Conf. Rehabilitation Robotics 2017, 1049–1054. doi:10.1109/ICORR.2017.8009388

McKendrick, R. D., and Cherry, E. (2018). A deeper look at the nasa tlx and where it falls short. Proc. Hum. Factors Ergonomics Soc. Annu. Meet. 62, 44–48. doi:10.1177/1541931218621010

McMullen, D. P., Hotson, G., Katyal, K. D., Wester, B. A., Fifer, M. S., McGee, T. G., et al. (2014). Demonstration of a semi-autonomous hybrid brain-machine interface using human intracranial eeg, eye tracking, and computer vision to control a robotic upper limb prosthetic. IEEE Trans. neural Syst. rehabilitation Eng. a Publ. IEEE Eng. Med. Biol. Soc. 22, 784–796. doi:10.1109/TNSRE.2013.2294685

McMurrough, C., Ferdous, S., Papangelis, A., Boisselle, A., and Makedon, F. (2012). “A survey of assistive devices for cerebral palsy patients,” in Proceedings of the 5th international conference on PErvasive technologies related to assistive environments (New York, NY, United States: ACM), 1–8. doi:10.1145/2413097

McMurrough, C., Ranatunga, I., Papangelis, A., Popa, D. O., and Makedon, F. (2013). “A development and evaluation platform for non-tactile power wheelchair controls,” in Proceedings of the 6th international conference on PErvasive technologies related to assistive environments - petra ’13. Editors F. Makedon, M. Betke, M. S. El-Nasr, and I. Maglogiannis (New York, New York, USA: ACM Press), 1–4. doi:10.1145/2504335.2504339

Meena, Y. K., Cecotti, H., Wong-Lin, K., and Prasad, G. (2017). “A multimodal interface to resolve the midas-touch problem in gaze controlled wheelchair,” in 2017 annual international conference of the IEEE engineering in medicine and biology society (EMBC) (IEEE), 905–908. doi:10.1109/EMBC.2017.8036971

Meena, Y. K., Chowdhury, A., Cecotti, H., Wong-Lin, K., Nishad, S. S., Dutta, A., et al. (2016). “Emohex: an eye tracker based mobility and hand exoskeleton device for assisting disabled people,” in 2016 IEEE international conference on systems, man, and cybernetics (SMC) (IEEE), 002122–002127. doi:10.1109/SMC.2016.7844553

Mejia, C., and Kajikawa, Y. (2017). Bibliometric analysis of social robotics research: identifying research trends and knowledgebase. Appl. Sci. 7, 1316. doi:10.3390/app7121316

Nicolas-Alonso, L. F., and Gomez-Gil, J. (2012). Brain computer interfaces, a review. Sensors Basel, Switz. 12, 1211–1279. doi:10.3390/s120201211

Nocentini, O., Fiorini, L., Acerbi, G., Sorrentino, A., Mancioppi, G., and Cavallo, F. (2019). A survey of behavioral models for social robots. Robotics 8, 54. doi:10.3390/robotics8030054

Norales, E. R., Brasset, D., and Invernizzi, P. (2016). Robotized surgery system with improved control (US9360934 B2).

Noronha, B., Dziemian, S., Zito, G. A., Konnaris, C., and Faisal, A. A. (2017). “Wink to grasp - comparing eye, voice & emg gesture control of grasp with soft-robotic gloves,” in IEEE international conference on rehabilitation robotics, 1043–1048. doi:10.1109/ICORR.2017.8009387

Novak, D., and Riener, R. (2013). Enhancing patient freedom in rehabilitation robotics using gaze-based intention detection. IEEE Int. Conf. Rehabilitation Robotics 2013, 6650507. doi:10.1109/ICORR.2013.6650507

Nunez-Varela, J. (2012). Gaze control for visually guided manipulation. Birmingham: University of Birmingham. Phd thesis.

Onose, G., Grozea, C., Anghelescu, A., Daia, C., Sinescu, C. J., Ciurea, A. V., et al. (2012). On the feasibility of using motor imagery eeg-based brain-computer interface in chronic tetraplegics for assistive robotic arm control: a clinical test and long-term post-trial follow-up. Spinal Cord. 50, 599–608. doi:10.1038/sc.2012.14

Park, K.-B., Choi, S. H., Moon, H., Lee, J. Y., Ghasemi, Y., and Jeong, H. (2022). “Indirect robot manipulation using eye gazing and head movement for future of work in mixed reality,” in 2022 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW) (IEEE), 483–484. doi:10.1109/VRW55335.2022.00107

Pasqualotto, E., Matuz, T., Federici, S., Ruf, C. A., Bartl, M., Olivetti Belardinelli, M., et al. (2015). Usability and workload of access technology for people with severe motor impairment: a comparison of brain-computer interfacing and eye tracking. Neurorehabilitation neural repair 29, 950–957. doi:10.1177/1545968315575611

Payton, D. W., and Daily, M. J. (2013). Systems, methods, and apparatus for neuro-robotic tracking point selection (US8483816 B1).

Pedrocchi, A., Ferrante, S., Ambrosini, E., Gandolla, M., Casellato, C., Schauer, T., et al. (2013). Mundus project: multimodal neuroprosthesis for daily upper limb support. J. neuroengineering rehabilitation 10, 66–20. doi:10.1186/1743-0003-10-66

Perez Reynoso, F. D., Niño Suarez, P. A., Aviles Sanchez, O. F., Calva Yañez, M. B., Vega Alvarado, E., and Portilla Flores, E. A. (2020). A custom eog-based hmi using neural network modeling to real-time for the trajectory tracking of a manipulator robot. Front. neurorobotics 14, 578834–578923. doi:10.3389/fnbot.2020.578834

Postelnicu, C.-C., Girbacia, F., Voinea, G.-D., and Boboc, R. (2019). “Towards hybrid multimodal brain computer interface for robotic arm command,” in Augmented cognition: 13th international conference, AC 2019, held as part of the 21st HCI international conference, HCII 2019, Orlando, FL, USA, July 26-31, 2019, proceedings. Editors D. D. Schmorrow, and C. M. Fidopiastis (Cham: Springer), 461–470.

Ramkumar, S., Sathesh Kumar, K., Dhiliphan Rajkumar, T., Ilayaraja, M., and Sankar, K. (2018). A review-classification of electrooculogram based human computer interfaces. Biomed. Res. 29, 1078–1084. doi:10.4066/biomedicalresearch.29-17-2979

Rantala, J., Kangas, J., Akkil, D., Isokoski, P., and Raisamo, R. (2014). “Glasses with haptic feedback of gaze gestures,” in CHI ’14 extended abstracts on human factors in computing systems. Editors M. Jones, P. Palanque, A. Schmidt, and T. Grossman (New York, NY, USA: ACM), 1597–1602. doi:10.1145/2559206.2581163

Rantala, J., Majaranta, P., Kangas, J., Isokoski, P., Akkil, D., Špakov, O., et al. (2020). Gaze interaction with vibrotactile feedback: review and design guidelines. Human–Computer Interact. 35, 1–39. doi:10.1080/07370024.2017.1306444

Robinson, N., Tidd, B., Campbell, D., Kulić, D., and Corke, P. (2022). Robotic vision for human-robot interaction and collaboration: a survey and systematic review. ACM Trans. Human-Robot Interact. 12, 1–66. doi:10.1145/3570731

Rusydi, M., Okamoto, T., Ito, S., and Sasaki, M. (2014a). Rotation matrix to operate a robot manipulator for 2d analog tracking objects using electrooculography. Robotics 3, 289–309. doi:10.3390/robotics3030289

Rusydi, M. I., Sasaki, M., and Ito, S. (2014b). Affine transform to reform pixel coordinates of eog signals for controlling robot manipulators using gaze motions. Sensors Basel, Switz. 14, 10107–10123. doi:10.3390/s140610107

Scalera, L., Seriani, S., Gallina, P., Lentini, M., and Gasparetto, A. (2021a). Human–robot interaction through eye tracking for artistic drawing. Robotics 10, 54. doi:10.3390/robotics10020054

Scalera, L., Seriani, S., Gasparetto, A., and Gallina, P. (2021b). “A novel robotic system for painting with eyes,” in Advances in Italian mechanism science. Editors V. Niola, and A. Gasparetto (Cham: Springer International Publishing), 91, 191–199. doi:10.1007/978-3-030-55807-9_22

Schäfer, J., and Gebhard, M. (2019). “Feasibility analysis of sensor modalities to control a robot with eye and head movements for assistive tasks,” in Proceedings of the 12th ACM international conference on PErvasive technologies related to assistive environments. Editor F. Makedon (New York, NY, USA: ACM), 482–488. doi:10.1145/3316782.3322774

Schmidtler, J., Bengler, K., Dimeas, F., and Campeau-Lecours, A. (2017). Questionnaire for the evaluation of physical assistive devices (quead) - manual. doi:10.13140/RG.2.2.31113.44649

Shafti, A., and Faisal, A. A. (2021). “Non-invasive cognitive-level human interfacing for the robotic restoration of reaching & grasping,” in 2021 10th international IEEE/EMBS conference on neural engineering (NER) (IEEE), 872–875. doi:10.1109/NER49283.2021.9441453

Shafti, A., Orlov, P., and Faisal, A. (2019). Gaze-based, context-aware robotic system for assisted reaching and grasping (Piscataway, NJ: IEEE).

Shahzad, M. I., and Mehmood, S. (2010). Control of articulated robot arm by eye tracking. Karlsakrona, Sweden: School of Computing Blekinge Institute of Technology. Master thesis.

Sharma, V. K., Murthy, L. R. D., and Biswas, P. (2022). Comparing two safe distance maintenance algorithms for a gaze-controlled hri involving users with ssmi. ACM Trans. Accessible Comput. 15, 1–23. doi:10.1145/3530822

Sharma, V. K., Saluja, K., Mollyn, V., and Biswas, P. (2020). “Eye gaze controlled robotic arm for persons with severe speech and motor impairment,” in ACM symposium on eye tracking research and applications. Editors A. Bulling, A. Huckauf, E. Jain, R. Radach, and D. Weiskopf (New York, NY, USA: ACM), 1–9. doi:10.1145/3379155.3391324

Shehu, I. S., Wang, Y., Athuman, A. M., and Fu, X. (2021). Remote eye gaze tracking research: a comparative evaluation on past and recent progress. Electronics 10, 3165. doi:10.3390/electronics10243165

Siean, A.-I., and Vatavu, R.-D. (2021). “Wearable interactions for users with motor impairments: systematic review, inventory, and research implications,” in The 23rd international ACM SIGACCESS conference on computers and accessibility. Editors J. Lazar, J. H. Feng, and F. Hwang (New York, NY, USA: ACM), 1–15. doi:10.1145/3441852.3471212

Smith, E., and Delargy, M. (2005). Locked-in syndrome. BMJ 330, 406–409. doi:10.1136/bmj.330.7488.406

Stalljann, S., Wöhle, L., Schäfer, J., and Gebhard, M. (2020). Performance analysis of a head and eye motion-based control interface for assistive robots. Sensors Basel, Switz. 20, 7162. doi:10.3390/s20247162

Sunny, M. S. H., Zarif, M. I. I., Rulik, I., Sanjuan, J., Rahman, M. H., Ahamed, S. I., et al. (2021). Eye-gaze control of a wheelchair mounted 6DOF assistive robot for activities of daily living (research square). doi:10.21203/rs.3.rs-829261/v1

Tajadura-Jiménez, A., Väljamäe, A., Bevilacqua, F., and Bianchi-Berthouze, N. (2018). “Principles for designing body-centered auditory feedback,” in The wiley handbook of human computer interaction. Editors K. L. Norman, and J. Kirakowski (Chichester, UK: John Wiley & Sons, Ltd), 371–403. doi:10.1002/9781118976005.ch18

Tobii, C. (2023). Dark and bright pupil tracking.

Tostado, P. M., Abbott, W. W., and Faisal, A. A. (2016). “3d gaze cursor: continuous calibration and end-point grasp control of robotic actuators,” in 2016 IEEE international conference on robotics and automation (ICRA) (IEEE), 3295–3300. doi:10.1109/ICRA.2016.7487502

Trambaiolli, L. R., and Falk, T. H. (2018). “Hybrid brain–computer interfaces for wheelchair control: a review of existing solutions, their advantages and open challenges,” in Smart wheelchairs and brain-computer interfaces (Elsevier), 229–256. doi:10.1016/B978-0-12-812892-3.00010-8

Tricco, A. C., Lillie, E., Zarin, W., O’Brien, K. K., Colquhoun, H., Levac, D., et al. (2018). Prisma extension for scoping reviews (prisma-scr): checklist and explanation. Ann. Intern. Med. 169, 467–473. doi:10.7326/M18-0850

Ubeda, A., Iañez, E., and Azorin, J. M. (2011). Wireless and portable eog-based interface for assisting disabled people. IEEE/ASME Trans. Mechatronics 16, 870–873. doi:10.1109/TMECH.2011.2160354

Velichkovsky, B. B., Rumyantsev, M. A., and Morozov, M. A. (2014). New solution to the midas touch problem: identification of visual commands via extraction of focal fixations. Procedia Comput. Sci. 39, 75–82. doi:10.1016/j.procs.2014.11.012

Wang, H., Dong, X., Chen, Z., and Shi, B. E. (2015). “Hybrid gaze/eeg brain computer interface for robot arm control on a pick and place task,” in Annual international conference of the IEEE engineering in medicine and biology society. IEEE engineering in medicine and biology society. Annual international conference 2015, 1476–1479. doi:10.1109/EMBC.2015.7318649

Wang, Y., Xu, G., Song, A., Xu, B., Li, H., Hu, C., et al. (2018). “Continuous shared control for robotic arm reaching driven by a hybrid gaze-brain machine interface,” in 2018 IEEE/RSJ international conference on intelligent robots and systems: october, 1-5, 2018, Madrid, Spain, Madrid municipal conference centre (Piscataway, NJ: IEEE).

Webb, J. D., Li, S., and Zhang, X. (2016). “Using visuomotor tendencies to increase control performance in teleoperation,” in 2016 American control conference (ACC) (IEEE), 7110–7116. doi:10.1109/ACC.2016.7526794

WHO (2001). International classification of functioning, disability, and health. Icf.

WHO (2019). World report on vision.

Wöhle, L., and Gebhard, M. (2021). Towards robust robot control in cartesian space using an infrastructureless head- and eye-gaze interface. Sensors Basel, Switz. 21, 1798. doi:10.3390/s21051798

Yang, B., Huang, J., Sun, M., Huo, J., Li, X., and Xiong, C. (2021). “Head-free, human gaze-driven assistive robotic system for reaching and grasping,” in 2021 40th Chinese control conference (CCC) (IEEE), 4138–4143. doi:10.23919/CCC52363.2021.9549800

Yoo, D. H., Kim, J. H., Kim, D. H., and Chung, M. J. (2002). “A human-robot interface using vision-based eye gaze estimation system,” in IEEE/RSJ international conference on intelligent robots and system (IEEE), 1196–1201. doi:10.1109/IRDS.2002.1043896

Zeng, H., Wang, Y., Wu, C., Song, A., Liu, J., Ji, P., et al. (2017). Closed-loop hybrid gaze brain-machine interface based robotic arm control with augmented reality feedback. Front. neurorobotics 11, 60–13. doi:10.3389/fnbot.2017.00060

Zhang, G., Hansen, J. P., Minakata, K., Alapetite, A., and Wang, Z. (2019). “Eye-gaze-controlled telepresence robots for people with motor disabilities,” in The 14th ACM/IEEE international conference on human-robot interaction (HRI), march 11-14, 2019, Daegu, South Korea (Piscataway, NJ: IEEE).

Zhang, J., Guo, F., Hong, J., and Zhang, Y. (2013). “Human-robot shared control of articulated manipulator,” in 2013 IEEE international symposium on assembly and manufacturing (ISAM) (IEEE), 81–84. doi:10.1109/ISAM.2013.6643493

Zhu, D., Gedeon, T., and Taylor, K. (2010). “Head or gaze? Controlling remote camera for hands-busy tasks in teleoperation: a comparison,” in Proceedings of the 22nd conference of the computer-human interaction special interest group of Australia on computer-human interaction. Editor M. Brereton (New York, NY: ACM), 300–303.

Keywords: robot, assistive robotics, input modalities, eye tracking, assisted living, EOG, hybrid BCI, human robot interaction (HRI)

Citation: Fischer-Janzen A, Wendt TM and Van Laerhoven K (2024) A scoping review of gaze and eye tracking-based control methods for assistive robotic arms. Front. Robot. AI 11:1326670. doi: 10.3389/frobt.2024.1326670

Received: 23 October 2023; Accepted: 29 January 2024;
Published: 19 February 2024.

Edited by:

Mizanoor Rahman, The Pennsylvania State University (PSU), United States

Reviewed by:

Stefano Tortora, University of Padua, Italy
Kecheng Shi, University of Electronic Science and Technology of China, China
Jiahui Pan, South China Normal University, China
Madi Babaiasl, Saint Louis University, United States

Copyright © 2024 Fischer-Janzen, Wendt and van Laerhoven. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Anke Fischer-Janzen, anke.fischer-janzen@hs-offenburg.de