Overview


This project implements an autonomous drone system that responds to gesture-based commands from a ground pilot. The drone interprets these gestures in real time and executes the corresponding actions with accurate control, enabling flexible and dynamic operation across a range of applications. Wireless communication protocols transmit the gesture-based commands from a Ground Control Station (GCS) to the drone.

Gesture Recognition


  • Gesture recognition is performed by training an ML model on a dataset of labelled distances between all possible pairs of hand landmarks
  • Hand landmarks are detected, and their x, y, z coordinates extracted, using the MediaPipe library
  • A RandomForestClassifier is the ML model used for training

Approach for Data Collection


  • Using MediaPipe, we extract the features of each hand landmark (i.e. its x, y, z coordinates)
  • For all possible pairs of the 21 hand landmarks, i.e. C(21,2) = 210 pairs, the distances between them are calculated and stored in a labelled CSV dataset for the corresponding gesture
  • Before storing them, we divide each distance by the width of the user’s hand (the distance between landmarks 5 and 17); this makes the features scale-relative, so the model is not tied to one person’s hand size
  • For each gesture we collect 5k rows of 210 distances (for better training), at a collection speed of 10 rows/sec
  • Once the dataset is ready for all gestures, we move on to training
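The feature extraction above can be sketched as follows. This is a minimal illustration, not the project's actual script: it assumes the 21 landmarks have already been read out of MediaPipe's result as (x, y, z) tuples, and the function name `gesture_features` is hypothetical.

```python
from itertools import combinations
from math import dist

def gesture_features(landmarks):
    """Turn 21 (x, y, z) hand landmarks into 210 normalized pairwise distances.

    `landmarks` is assumed to be a list of 21 (x, y, z) tuples, e.g. read from
    a MediaPipe hand-landmarks result.
    """
    # Hand width: distance between landmark 5 (index MCP) and 17 (pinky MCP)
    hand_width = dist(landmarks[5], landmarks[17])
    # C(21, 2) = 210 pairwise distances, each normalized by hand width
    return [dist(a, b) / hand_width for a, b in combinations(landmarks, 2)]
```

Because every distance is divided by the same hand width, the feature vector is unchanged if the whole hand appears larger or smaller in the frame, which is what makes the dataset transferable between users.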


Model Training


  • The training script loads the CSV file, takes the string labels (like “takeoff”) and maps them to integers
  • The data is then shuffled and split into 80% training data and 20% testing data
  • Using StandardScaler, every feature is rescaled to mean = 0 and standard deviation = 1, which prevents imbalance in the feature distribution
  • A RandomForestClassifier is trained with parameters such as 300 decision trees and a tree depth limit of 20
  • The model is evaluated using a classification report and a confusion-matrix heatmap
  • During evaluation a threshold of 0.8 is applied: if the predicted probability is < 0.8, the result is treated as ‘unknown’
  • Once trained and evaluated, the model is saved in .pkl file format, which is then used for prediction
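The training pipeline above can be sketched with scikit-learn. Randomly generated data stands in here for the real labelled CSV, and the two-gesture label map is only illustrative; the parameters (80/20 split, StandardScaler, 300 trees, depth 20, 0.8 threshold) are the ones described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the labelled CSV: 210 distance features per row,
# string labels mapped to integers.
labels = {"takeoff": 0, "land": 1}
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 210))
X[100:] += 2.0                               # make the two classes separable
y = np.array([0] * 100 + [1] * 100)

# Shuffled 80/20 split, then zero-mean / unit-variance scaling per feature
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# 300 decision trees with a depth limit of 20
model = RandomForestClassifier(n_estimators=300, max_depth=20, random_state=0)
model.fit(X_tr, y_tr)

# Reject low-confidence predictions as 'unknown' (here encoded as -1)
proba = model.predict_proba(X_te)
pred = np.where(proba.max(axis=1) >= 0.8, proba.argmax(axis=1), -1)
```

The fitted scaler and model can then be saved together into a .pkl file with `pickle.dump`, so the prediction script applies exactly the same feature scaling at runtime.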

Command Mapping and Forwarding


  • Our final script has 2 parts: the connection/command-execution code and the gesture recognition code
  • Before running the code, MAVLink traffic from MAVProxy is forwarded to 2 UDP ports: one connects to Mission Planner and the other to the connection Python code
  • The connection code is then run; it connects to the vehicle, performs all safety checks, and uses multithreading to create a separate thread in which the gesture recognition code runs
  • This separation of tasks ensures stability and prevents potential crashes or command delays
  • The recognition code continuously detects gestures and sends them to the command execution code, which stores them in a deque of size 15 (the oldest entries are discarded first)
  • The best stable gesture is the one that occurs in more than 70% of the deque; it is selected and its respective MAVProxy command is forwarded. This adds stability and prevents fluctuations in gesture recognition that could lead to a crash
  • Two safety checks are added: first, when the stable gesture changes, the drone pauses and waits for 1.5 seconds; second, before the land command is forwarded, it is ensured that the stable gesture has been land for at least 5 seconds
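The deque-based stability filter above can be sketched in a few lines. This is a simplified illustration (the function name `stable_gesture` is hypothetical); the real script additionally applies the 1.5 s pause on a gesture change and the 5 s confirmation before land.

```python
from collections import Counter, deque

WINDOW = 15            # deque size; oldest predictions are discarded first
STABLE_FRACTION = 0.7  # a gesture must fill >70% of the window to be "stable"

recent = deque(maxlen=WINDOW)

def stable_gesture(prediction):
    """Push one raw per-frame prediction; return the stable gesture, or None."""
    recent.append(prediction)
    gesture, count = Counter(recent).most_common(1)[0]
    if len(recent) == WINDOW and count > STABLE_FRACTION * WINDOW:
        return gesture
    return None
```

A single misclassified frame can never flip the selected command, because at least 11 of the last 15 predictions must agree before a MAVProxy command is forwarded.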

Setup and Running


SITL Setup

Installation Requirements

  • Ground Control Station
  • Ubuntu 22.04.5 LTS
  • Python 3.10.11
  • ArduPilot
  • Configure ArduPilot firmware

Gazebo Setup

  • ArduCopter
  • ArduCopter Gazebo Plugin

Simulation Run

Run the applications in order, each in a new terminal.

  1. Gazebo
  2. ArduCopter
  3. Command Script

Implementation


Team Members


  • Aman Sharma
  • Shriya Nerlikar
  • Harshit Raj
  • Shivansh Satyam

Mentors