Overview


This project implements an autonomous drone system that responds to gesture-based commands from a ground pilot. The drone interprets these gestures in real time and executes the corresponding actions with accurate control, enabling flexible and dynamic operation across a range of applications. Wireless communication protocols transmit the gesture-based commands from a Ground Control Station (GCS) to the drone.

Gesture Recognition


  • Gesture recognition is performed by training an ML model on a dataset of labelled distances between all possible pairs of hand landmarks
  • Hand landmarks are detected, and their x, y, z coordinates extracted, using the MediaPipe library
  • A RandomForestClassifier is the ML model used for training

Approach for Data Collection


  • Using MediaPipe, we extract the features of each hand landmark (i.e. its x, y, z coordinates)
  • For all possible pairs of the 21 hand landmarks, i.e. C(21,2) = 210 pairs, the distances between them are calculated and stored in a labelled CSV dataset for the corresponding gesture
  • Before storing them, we divide each distance by the width of the user’s hand (the distance between landmarks 5 and 17); this makes the features scale-relative, so the model is not tied to one person’s hand size
  • For each gesture we collect 5k rows of 210 distances (for better training), at a collection speed of 10 rows/sec
  • Once the dataset is ready for all gestures, we move on to training
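The feature extraction above can be sketched as follows. This is a minimal illustration, not the project's actual script: it assumes the 21 landmarks have already been read out of MediaPipe's result as (x, y, z) tuples, and the function name `gesture_features` is hypothetical.

```python
from itertools import combinations
from math import dist

def gesture_features(landmarks):
    """Turn 21 (x, y, z) hand landmarks into 210 normalized pairwise distances.

    `landmarks` is assumed to be a list of 21 (x, y, z) tuples, e.g. read from
    a MediaPipe hand-landmarks result.
    """
    # Hand width: distance between landmark 5 (index MCP) and 17 (pinky MCP)
    hand_width = dist(landmarks[5], landmarks[17])
    # C(21, 2) = 210 pairwise distances, each normalized by hand width
    return [dist(a, b) / hand_width for a, b in combinations(landmarks, 2)]
```

Because every distance is divided by the same hand width, the feature vector is unchanged if the whole hand appears larger or smaller in the frame, which is what makes the dataset transferable between users.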


Model Training


  • The training script loads the CSV file, takes the string labels (like “takeoff”) and maps them to integers
  • The data is then shuffled and split into 80% training data and 20% testing data
  • Using StandardScaler, every feature is rescaled to mean = 0 and standard deviation = 1, which prevents imbalance in the feature distribution
  • A RandomForestClassifier is trained with parameters such as 300 decision trees and a tree depth limit of 20
  • The model is evaluated using a classification report and a confusion-matrix heatmap
  • During evaluation a threshold of 0.8 is applied: if the predicted probability is < 0.8, the result is treated as ‘unknown’
  • Once trained and evaluated, the model is saved in .pkl file format, which is then used for prediction
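The training pipeline above can be sketched with scikit-learn. Randomly generated data stands in here for the real labelled CSV, and the two-gesture label map is only illustrative; the parameters (80/20 split, StandardScaler, 300 trees, depth 20, 0.8 threshold) are the ones described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the labelled CSV: 210 distance features per row,
# string labels mapped to integers.
labels = {"takeoff": 0, "land": 1}
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 210))
X[100:] += 2.0                               # make the two classes separable
y = np.array([0] * 100 + [1] * 100)

# Shuffled 80/20 split, then zero-mean / unit-variance scaling per feature
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# 300 decision trees with a depth limit of 20
model = RandomForestClassifier(n_estimators=300, max_depth=20, random_state=0)
model.fit(X_tr, y_tr)

# Reject low-confidence predictions as 'unknown' (here encoded as -1)
proba = model.predict_proba(X_te)
pred = np.where(proba.max(axis=1) >= 0.8, proba.argmax(axis=1), -1)
```

The fitted scaler and model can then be saved together into a .pkl file with `pickle.dump`, so the prediction script applies exactly the same feature scaling at runtime.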

Command Mapping and Forwarding


  • Our final script has 2 parts: the connection/command-execution code and the gesture recognition code
  • Before running the code, MAVLink traffic from MAVProxy is forwarded to 2 UDP ports: one connects to Mission Planner and the other to the connection Python code
  • The connection code is then run; it connects to the vehicle, performs all safety checks, and uses multithreading to create a separate thread in which the gesture recognition code runs
  • This separation of tasks ensures stability and prevents potential crashes or command delays
  • The recognition code continuously detects gestures and sends them to the command execution code, which stores them in a deque of size 15 (the oldest entries are discarded first)
  • The best stable gesture is the one that occurs in more than 70% of the deque; it is selected and its respective MAVProxy command is forwarded. This adds stability and prevents fluctuations in gesture recognition that could lead to a crash
  • Two safety checks are added: first, when the stable gesture changes, the drone pauses and waits for 1.5 seconds; second, before the land command is forwarded, it is ensured that the stable gesture has been land for at least 5 seconds
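The deque-based stability filter above can be sketched in a few lines. This is a simplified illustration (the function name `stable_gesture` is hypothetical); the real script additionally applies the 1.5 s pause on a gesture change and the 5 s confirmation before land.

```python
from collections import Counter, deque

WINDOW = 15            # deque size; oldest predictions are discarded first
STABLE_FRACTION = 0.7  # a gesture must fill >70% of the window to be "stable"

recent = deque(maxlen=WINDOW)

def stable_gesture(prediction):
    """Push one raw per-frame prediction; return the stable gesture, or None."""
    recent.append(prediction)
    gesture, count = Counter(recent).most_common(1)[0]
    if len(recent) == WINDOW and count > STABLE_FRACTION * WINDOW:
        return gesture
    return None
```

A single misclassified frame can never flip the selected command, because at least 11 of the last 15 predictions must agree before a MAVProxy command is forwarded.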

Setup and Running


SITL Setup

Installation Requirements

  • Ground Control Station
  • Ubuntu 22.04.5 LTS
  • Python 3.10.11
  • ArduPilot
  • Configure ArduPilot firmware

Gazebo Setup

  • ArduCopter
  • ArduCopter Gazebo Plugin

Simulation Run

Run the applications in order, each in a new terminal.

  1. Gazebo
  2. ArduCopter
  3. Command Script

Implementation


Team Members


  • Aman Sharma
  • Shriya Nerlikar
  • Harshit Raj
  • Shivansh Satyam

Mentors