A Bi-Directional Deep Learning Interface for Gaze-Controlled Wheelchair Navigation: Overcoming the Midas Touch Problem
Abstract
We present a gaze-based augmented reality (AR) control interface for electric wheelchairs, addressing the challenges faced by individuals with mobility impairments. Development proceeded in three stages: model training with offline evaluation, Virtual Reality (VR) simulation, and physical deployment. First, we trained deep learning models, comparing Transformers and LSTMs, to predict locomotion intentions from gaze data. While gaze predicts steering intentions well, it sometimes diverges from locomotion goals. To tackle this, we classify gaze movements as indicative of locomotor intention or not. This novel approach addresses the Midas Touch Problem of gaze-based control. Datasets were collected in controlled VR environments featuring different tasks. We find that datasets whose tasks encourage diverse navigation and gaze behaviors enable strong generalization. The online VR evaluation phase enabled safe and immersive testing, allowing us to assess system performance and integrate feedback for user guidance. Our approach provided smoother navigation control than traditional "Where-You-Look-Is-Where-You-Go" methods, and providing feedback improved user ratings of the system. In the final stage, the system was deployed on a physical wheelchair equipped with an AR device that presents the predictions to the user, allowing real-world evaluation. Despite differences in user behavior between VR and physical environments, the system successfully translated gaze inputs into precise and safe navigation commands. Users were able to steer the wheelchair solely with their eyes while remaining free to look at destinations beside the path.
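To make the intention-classification idea concrete, the sketch below shows one plausible form of such a model: a small LSTM that labels a short window of gaze samples as "locomotor intention" or not, so that only intentional gaze is forwarded to the steering controller. This is not the authors' implementation; the feature set, window length, and model size are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): a binary gaze-intention
# classifier over short windows of gaze data, the kind of gating that
# mitigates the Midas Touch Problem.
import torch
import torch.nn as nn

class GazeIntentLSTM(nn.Module):
    def __init__(self, n_features: int = 4, hidden: int = 64, n_layers: int = 2):
        super().__init__()
        # Assumed per-timestep features, e.g. (gaze yaw, gaze pitch, head yaw, head pitch).
        self.lstm = nn.LSTM(n_features, hidden, n_layers, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # logit: locomotor intention vs. not

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features) — a sliding window of recent gaze samples.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)

if __name__ == "__main__":
    model = GazeIntentLSTM()
    window = torch.randn(8, 120, 4)  # e.g. 8 windows of 120 gaze samples each
    logits = model(window)
    # Forward gaze to the steering controller only when intention is predicted;
    # otherwise the user can look around freely without moving the wheelchair.
    drive = torch.sigmoid(logits) > 0.5
    print(drive)
```

A Transformer encoder over the same windows would be a drop-in alternative for the LSTM, matching the model comparison described in the abstract.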