Career Profile

Graduate student researcher at the Robotics Institute, Carnegie Mellon University. Passionate about applications of Computer Vision in Augmented and Virtual Reality, Telepresence, Hand Gesture Recognition, Robotics, Artificial Intelligence, Machine Learning and Deep Learning. Keen interest in Data Analytics, Big Data, IoT and Cryptocurrencies.


Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications

Varun Jain, Shivam Aggarwal, Suril Mehta, Ramya Hebbalaguppe
A GAN-based framework capable of generating photo-realistic videos with labelled hand bounding boxes and fingertip positions. [ Paper ] [ Webpage ]

GestARLite: An On-Device Pointing Finger Based Gestural Interface for Smartphones and Video See-Through Head-Mounts

Varun Jain, Gaurav Garg, Ramakrishna Perla, Ramya Hebbalaguppe
A lightweight hand-gesture recognition framework in First Person View for wearable devices. It works on commercial off-the-shelf smartphones attached to generic video see-through head-mounts like the Google Cardboard and challenges the interaction paradigms of expensive HMDs such as the Microsoft HoloLens and ARCore/ARKit enabled premium smartphones. [ Paper ] [ Webpage ]

DrawInAir: A Lightweight Gestural Interface Based on Fingertip Regression

Gaurav Garg, Srinidhi Hegde, Varun Jain*, Ramakrishna Perla*, Lovekesh Vig, Ramya Hebbalaguppe
Deep learning for real-time pointing hand gesture classification that works on monocular RGB input. [ Paper ] [ Webpage ]

Scalable Measurement of Air Pollution using COTS IoT Devices

Varun Jain, Mansi Goel, Mukulika Maity, Vinayak Naik, Ramachandran Ramjee
Analysed the spatio-temporal coverage of air pollution in Delhi and proposed a framework to estimate air pollution (PM values) for a locality using the existing infrastructure of monitoring stations and factors such as traffic conditions and greenery. Built regression models to estimate pollution levels and evaluated them across areas in Delhi. [ Paper ]
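The estimation step can be pictured as an ordinary least-squares regression over locality features. The feature names and numbers below are purely illustrative stand-ins, not the paper's data:

```python
import numpy as np

# Illustrative features per locality: [traffic index, greenery index,
# distance to nearest monitoring station (km)] -> observed PM value.
# These values are made up for the sketch.
X = np.array([
    [0.9, 0.1, 1.0],
    [0.4, 0.6, 2.5],
    [0.7, 0.2, 0.5],
    [0.2, 0.8, 3.0],
])
y = np.array([210.0, 95.0, 180.0, 60.0])

# Add an intercept column and fit ordinary least squares.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def estimate_pm(traffic, greenery, dist):
    """Predict the PM value for a locality from its features."""
    return np.array([traffic, greenery, dist, 1.0]) @ coef

print(round(float(estimate_pm(0.5, 0.4, 2.0)), 1))
```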

AirGestAR: Leveraging Deep Learning for Complex Hand Gestural Interaction with Frugal AR Devices

Varun Jain, Ramakrishna Perla, Ramya Hebbalaguppe
Deep learning for recognizing complex 3-dimensional marker-less gestures (Bloom, Click, Zoom-In, Zoom-Out) in real-time using monocular camera input from a single smartphone. [ Paper ] [ Webpage ]


Applied Scientist II

Feb 2021 - Present

Bringing state-of-the-art segmentation to Microsoft Teams.

Led the transition to temporal segmentation, driving a reduction of up to 20% in error rates.

Research Engineer Intern

May 2020 - Aug 2020
Facebook Reality Labs

Blind augmentation and normalization in the input domain do not significantly help make face encoders robust to real-world variations such as identities, appearances, lighting and headset assembly. Hence, we use analysis-by-synthesis to exploit the information available at test time: the target image (with lighting/appearance information).

Explored metric learning to learn a generic feature space that can be used to refine predictions at run-time. This is done through fully convolutional networks (so that the pixels stay aligned), and comparison in this feature space minimizes the distance between current predictions and the reference results.

Research Collaborator

Jan 2020 - Jan 2021
Facebook Reality Labs

Worked on High-Fidelity Bidirectional Telepresence communication in VR using photo-realistic avatars. This involved solving problems such as 3D avatar animation generation via self-supervised multi-view image translation using a limited number of IR sensors.

Developed a Variational Auto-Encoder to predict texture and geometry from IR images and experimented with input-level augmentation techniques, such as adding UV maps, to make the model robust to variations in identities, appearances, and lighting. This resulted in an overall improvement of 6% over the existing methods.
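At the core of such a Variational Auto-Encoder are the reparameterization trick and a KL regularizer on the latent code. Below is a minimal NumPy sketch of those two pieces only; the actual encoder/decoder operating on IR images is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    In a real autodiff framework this keeps sampling differentiable
    w.r.t. the encoder outputs; plain NumPy here is just illustrative.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

mu, log_var = np.zeros(3), np.zeros(3)
z = reparameterize(mu, log_var)
print(z.shape, kl_divergence(mu, log_var))
```

With a standard-normal posterior (zero mean, unit variance) the KL term vanishes, which is the regularizer's fixed point.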

Research Associate

Aug 2018 - Jul 2019
TCS Research

Worked at the intersection of Computer Vision, Deep Learning and Augmented Reality. Explored the possibility of porting powerful deep-learning models to commodity smartphones to solve problems in the domain of AR.

Proposed the new state-of-the-art in 2D temporal hand gesture recognition for egocentric videos. Our DrawInAir framework uses a CNN architecture to detect hands and a DSNT layer to regress over the fingertip coordinates which are tracked by a Bi-LSTM to classify gestures.
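The DSNT layer mentioned above turns a spatial heatmap into numeric coordinates by taking an expectation over a normalized grid. A minimal NumPy sketch of that transform (the shapes and demo heatmap are illustrative):

```python
import numpy as np

def dsnt(heatmap):
    """Differentiable spatial-to-numerical transform (sketch).

    Softmax-normalizes the heatmap into a probability map, then takes
    the expected coordinate over a grid normalized to [-1, 1].
    """
    h, w = heatmap.shape
    e = np.exp(heatmap - heatmap.max())      # numerically stable softmax
    prob = e / e.sum()
    xs = (2 * np.arange(w) + 1) / w - 1      # column centres in [-1, 1]
    ys = (2 * np.arange(h) + 1) / h - 1      # row centres in [-1, 1]
    x = (prob.sum(axis=0) * xs).sum()        # E[x]
    y = (prob.sum(axis=1) * ys).sum()        # E[y]
    return x, y

# A heatmap peaked at the centre regresses to roughly (0, 0).
hm = np.zeros((5, 5))
hm[2, 2] = 10.0
print(dsnt(hm))
```

Because the output is an expectation rather than an argmax, gradients flow through it, which is what lets the fingertip coordinates be regressed end-to-end.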

Worked on memory-efficient Deep Neural Network architectures that enable on-device hand gesture recognition for frugal HMDs.

Designed a Deep Neural Network architecture to recognize complex HoloLens-like, marker-less, 3D temporal hand gestures in real-time using monocular RGB input without any depth information.

Founder, Tech Lead

2015 - 2017

Worked as the founder and product developer of AirZen, which measures pollution levels in and around people's homes using a Raspberry Pi with various sensors and gives suitable health advice. Also built an Android app and a server implementation for the same. Project mentored by Ms. Jyoti Vashishtha Sinha and incubated by the Incubation Cell, IIITD.

Find us at [ ] . Coverage by [ The Statesman ], [ Accenture Innovation Jockeys '16 ], [ NASSCOM TechNgage '16 ].

Research Intern

Winter 2016, Summer 2017
TCS Research

Worked in the domain of Augmented Reality, exploring the possibility of leveraging deep learning for recognizing 3-dimensional marker-less temporal hand gestures in real-time using the monocular camera input of a smartphone. This involved estimating a 3D hand pose and running a classification network.

Innovation Internship

Summer 2016
IIIT Delhi

Worked on a startup that aims to provide the benefits of ‘Road Rationing’ and reduce overall congestion, commuting time and pollution by distributing road usage evenly across all commuters.

Research Intern

Summer 2015
IIIT Delhi

Worked under Ms. Jyoti Vashishtha Sinha to determine the influence of various pollutants, such as PM, CO, CO2, NO2 and O3, on pre-existing diseases and conditions such as asthma and bronchitis.

Google Student Ambassador

2014 - 2018
IIIT Delhi

Worked as the Google Student Ambassador for the college.


Sangoshthi: A Mobile Learning Platform for Community Health Workers

2017 - 2018

Worked closely with the Government of Delhi and NGOs to help improve health conditions in rural areas. Developed a mobile-based peer learning platform that uses IVR and the Internet to host real-time video interactions for rural health workers. The work bagged a $100,000 grant from the Bill & Melinda Gates Foundation and has reached over 500 health workers.
Find project at [ PDF ]

NOSY: Nasal Features as a Biometric

Spring 2018

Examined the usage of 2-dimensional image-based nasal features as a biometric for face recognition systems. The efficacy of nasal features is studied under various conditions such as occluded faces, disguises and reconstructive surgeries. Evaluated architectures such as FaceNet, DeepFace, DeepID, VGG-Face and traditional Haar-based cascades.
Find project at [ PDF ] [ Github ]

Don’t Hurt Me, But Hulk Me: Your Personal Fitness Manager

Spring 2018

As part of the Pattern Recognition course, designed a framework to provide an appropriate fitness regime to a user depending upon their workout routine: (i) user identification from gait, (ii) recognizing an activity from skeletal movements, and (iii) suggesting improvements in posture.
Find project at [ PDF ] [ Github ]

Smart VR Helmet

Fall 2017

As part of the Wearables course, developed a smart Virtual Reality enabled helmet to improve children's safety while travelling on two-wheelers.
Find project report at [ PDF ]

ViZDoom: AI Bot for FPS Games

Fall 2017

Explored Deep Recurrent Q-Learning techniques for playing the first-person shooter game Doom in partially observable environments. Implemented several Action-Navigation architectures in TensorFlow that use separate deep neural networks for exploring environments and fighting enemies. Experimented with augmenting high-level game features, reward shaping and sequential updates for efficient training.
Find project at [ Video ]
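The underlying update that Deep Recurrent Q-Learning approximates with a recurrent network over game frames is the tabular TD(0) rule. A toy sketch with made-up states and actions:

```python
# Tabular Q-learning update: the core rule that Deep Recurrent Q-Learning
# approximates with a recurrent network over (partially observed) frames.
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One TD(0) update for state s, action a, reward r, next state s_next."""
    best_next = max(Q[s_next].values()) if Q.get(s_next) else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

# Hypothetical two-state Doom world, purely for illustration.
Q = {"corridor": {"move": 0.0, "shoot": 0.0},
     "room": {"move": 1.0, "shoot": 0.5}}
q_update(Q, "corridor", "shoot", 1.0, "room")
print(Q["corridor"]["shoot"])  # alpha * (r + gamma * max Q(room)) = 0.1 * 1.99
```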


2014 - 2016

Developed a photo-realistic simulation environment in the Unity Engine. It was used to implement static obstacle avoidance, lane following and basic planning and control algorithms for the fully autonomous vehicle ‘Swarath’ as part of the Mahindra Spark the Rise challenge.
Find project at [ Video ]

Software Defined Networking

Spring 2017

As part of the Software Defined Networking course, developed an Open vSwitch application that discovers the network topology, finds the top two shortest paths between given nodes and divides network traffic among those paths.
Find project report at [ PDF ]
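The path-finding step can be sketched with a heap-based variant of Dijkstra's algorithm that yields the k shortest paths. The toy topology below is illustrative; the original ran as an Open vSwitch controller application:

```python
import heapq

def k_shortest_paths(graph, src, dst, k=2):
    """Up to k shortest paths (by total weight) from src to dst.

    graph: {node: [(neighbour, weight), ...]}. Simple heap-based
    variant; fine for a small controller topology.
    """
    heap = [(0, [src])]
    found, pops = [], {}
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        pops[node] = pops.get(node, 0) + 1
        if node == dst:
            found.append((cost, path))
            continue
        if pops[node] > k:           # already expanded enough times
            continue
        for nbr, w in graph.get(node, []):
            heapq.heappush(heap, (cost + w, path + [nbr]))
    return found

# Toy topology: two disjoint routes from s to t.
net = {"s": [("a", 1), ("b", 2)], "a": [("t", 2)], "b": [("t", 2)]}
print(k_shortest_paths(net, "s", "t"))  # s-a-t (cost 3), then s-b-t (cost 4)
```

Once the two paths are known, traffic can be divided between them, e.g. by installing flow rules that hash connections onto one path or the other.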

Entrepreneurship Project: TagTraqr

Summer 2016

Developed business plans and marketing strategies for TagTraqr, which offers easy smartphone-based tracking of belongings using Bluetooth LE and crowd-sourced data.
Find project report at [ PDF ]

Predicting Value of College Degree

Fall 2016

As part of the Machine Learning course, developed a framework to predict post-college student debt and earnings after six years of working. The project used basic machine learning concepts such as regression. Also designed a framework to predict stock market behaviour.
Find project report at [ PDF 1 ] [ PDF 2 ]


Spring 2016

As part of the Fundamentals of Database Systems course, developed a Java application to manage the inventory/stock of any large-scale operation, with the facility to place and track orders.
Find project at [ Github ]

Complete Network Solution For Serviced Apartments

Spring 2016

As part of the Computer Networks course, designed a simple and efficient wireless and wired connection system for a fully serviced apartment model.


Fall 2015

Developed likeProphesy, an application that predicts the number of likes a post on a social networking site might fetch. Used Java to process data fetched via the Facebook Graph API, applying basic concepts of machine learning and statistics.
Find project at [ Github ]


Spring 2015

Developed a home automation/control system named JARVIS that uses a Raspberry Pi and Bluetooth to sense the number of people in a room and control its electronic resources accordingly.
Find project at [ blog page ].

Weather Prediction

Spring 2015

Developed software that analyses past climatic data and predicts climate anomalies such as warming trends and droughts. The data is retrieved from the National Oceanic and Atmospheric Administration and processed using Python.
Find project poster at [ PDF ]

Webpage Data Analysis | Word Cloud

Spring 2015

Developed a Python program that analyzes webpages to list frequently occurring words and renders them as a ‘word cloud’.
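The frequency-counting core of such a tool fits in a few lines; the sample text and stopword list below are illustrative stand-ins for fetched page content:

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5,
                     stopwords=frozenset({"the", "a", "and", "of", "to"})):
    """Count word occurrences in page text, skipping common stopwords.

    In the original project the text came from fetched webpages; here a
    plain string stands in for the page content.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(top_n)

sample = "the cloud of words: a word cloud shows each word scaled by count"
print(word_frequencies(sample, top_n=2))
```

A word-cloud renderer then maps each count to a font size, drawing the most frequent words largest.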

ARM Simulator

Spring 2015

Designed an ARM processor simulator in C as part of the Computer Organization course.
Find project at [ Github ]


Fall 2014

Developed a drag-and-drop interface for programming a microcontroller.
Find project at [ Wikidot Blog Page ]