Welcome to AirGestAR
Hand gestures provide a natural and intuitive way for users to interact with AR/VR applications. However, the most popular commercially available devices, such as the Google Cardboard and Wearality, still employ only primitive modes of interaction, such as a magnetic trigger or conductive lever, and offer limited user-input capability. Truly instinctive gestural interaction works only with inordinately priced devices such as the Microsoft HoloLens, Magic Leap, and Meta, which use proprietary hardware and are still not commercially available.
The Idea
In this paper, we explore the possibility of leveraging deep learning to recognize complex three-dimensional, marker-less gestures (Bloom, Click, Zoom-In, Zoom-Out) in real time using monocular camera input from a single smartphone. This framework can be used with frugal smartphones to build powerful AR/VR systems that work in real time for large-scale deployments, eliminating the need for specialized hardware. We have created a hand gesture dataset for training LSTM networks for gesture classification and have published it online. We also report the performance of our proposed method in terms of classification accuracy and computational time.
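As a rough illustration of the classification stage, the sketch below trains a small LSTM on fixed-length sequences of per-frame hand features using Keras. The sequence length (`SEQ_LEN`), feature dimensionality (`NUM_FEATURES`), layer sizes, and placeholder data are all assumptions for demonstration, not the exact configuration or dataset from the paper.

```python
# Minimal sketch of an LSTM gesture classifier. Assumes each sample is a
# fixed-length sequence of per-frame hand features (e.g. fingertip or
# hand-centroid coordinates extracted from the monocular camera feed).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

SEQ_LEN = 30        # frames per gesture window (assumed)
NUM_FEATURES = 4    # per-frame hand features, e.g. (x, y, dx, dy) (assumed)
NUM_CLASSES = 4     # Bloom, Click, Zoom-In, Zoom-Out

model = Sequential([
    LSTM(64, input_shape=(SEQ_LEN, NUM_FEATURES)),  # temporal modelling
    Dense(NUM_CLASSES, activation="softmax"),       # one score per gesture
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# X: (num_samples, SEQ_LEN, NUM_FEATURES) feature sequences,
# y: integer gesture labels in [0, NUM_CLASSES). Random placeholders here;
# in practice these come from the recorded hand gesture dataset.
X = np.random.rand(8, SEQ_LEN, NUM_FEATURES).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=(8,))
model.fit(X, y, epochs=1, batch_size=4)
```

At inference time, the same model can be run on a sliding window of the most recent frames, with the argmax of the softmax output taken as the predicted gesture.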