Hand Tracking R2D2
I worked with a group to build a robot that autonomously tracks hand movements using computer vision and embedded systems. An iPhone camera housed in R2D2 identifies and tracks the hand: if the hand is detected closer to one side of the camera frame, R2D2's head rotates to bring the hand back to the center of the frame. The robot can also recognize five specific hand gestures, each of which triggers the ESP32's buzzer.

Two scripts run simultaneously: the camera/image-processing code, written in Python and run locally, and the R2D2 code on the ESP32. The image-processing code connects to the iPhone camera and can detect five hand gestures: high five, peace sign, thumbs down, thumbs up, and OK. It also splits the camera frame into three zones (left, middle, right), detects which zone the hand is in, and calculates the distance from the hand's center to the center of the frame. Depending on the gesture detected (for the buzzer) and the hand's distance from center (for the motor rotation), commands are sent to the ESP32.
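
For illustration, here is a condensed sketch of the tracking half, assuming the iPhone camera is exposed as an OpenCV capture device and pyserial is used for the USB link; the serial port, the MOVE command format, and the choice of landmark are assumptions for this sketch, not the project's exact code:

```python
import cv2
import mediapipe as mp
import serial

# Hypothetical serial port; adjust for your machine.
esp32 = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)

cap = cv2.VideoCapture(0)  # iPhone camera exposed as a capture device
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Landmark 9 (middle-finger knuckle) approximates the hand's center.
        hand_x = results.multi_hand_landmarks[0].landmark[9].x  # normalized 0..1
        offset = hand_x - 0.5  # signed distance from the frame's center

        # Three zones: left third, middle third, right third.
        if hand_x < 1 / 3:
            zone = "LEFT"
        elif hand_x > 2 / 3:
            zone = "RIGHT"
        else:
            zone = "MIDDLE"

        if zone != "MIDDLE":
            # Illustrative command format: MOVE:<signed offset>, newline-terminated.
            esp32.write(f"MOVE:{offset:.2f}\n".encode())
```

Gesture classification is omitted above for brevity; in the same loop, a recognized gesture would be sent as its own line (e.g. GESTURE:PEACE) for the ESP32 to map to a buzzer tone.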

R2D2 houses an iPhone and a NEMA 17 stepper motor wired to the ESP32. The stepper motor responds to commands from the image-processing code, rotating left or right by an amount determined by how far the hand is from the center of the frame. The ESP32 code implements several optimizations, including proportional step scaling (fewer steps when the hand is closer to the center), to achieve smooth tracking without overshooting. The ESP32's buzzer also plays a distinct tone for each of the five hand gestures.
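
A minimal MicroPython sketch of the ESP32 side, assuming an A4988-style step/dir driver and arbitrary pin numbers (the actual wiring, tone frequencies, and command format are not specified in this write-up):

```python
from machine import Pin, PWM
import sys
import time

step_pin = Pin(26, Pin.OUT)   # assumed step pin
dir_pin = Pin(27, Pin.OUT)    # assumed direction pin
buzzer = PWM(Pin(25), freq=440, duty=0)

MAX_STEPS = 40  # cap per command so a single update can't overshoot

# Hypothetical gesture-to-tone mapping (Hz)
TONES = {"HIGH_FIVE": 262, "PEACE": 330, "THUMBS_UP": 392,
         "THUMBS_DOWN": 440, "OK": 523}

def move(offset):
    # Proportional step scaling: |offset| ranges 0..0.5, so the step
    # count shrinks as the hand nears the center of the frame.
    steps = min(MAX_STEPS, int(abs(offset) * 2 * MAX_STEPS))
    dir_pin.value(1 if offset > 0 else 0)
    for _ in range(steps):
        step_pin.value(1)
        time.sleep_us(800)
        step_pin.value(0)
        time.sleep_us(800)

def beep(gesture):
    buzzer.freq(TONES.get(gesture, 440))
    buzzer.duty(512)   # 50% duty: tone on
    time.sleep_ms(200)
    buzzer.duty(0)     # tone off

# Read newline-delimited commands from the USB serial link.
while True:
    line = sys.stdin.readline().strip()
    if line.startswith("MOVE:"):
        move(float(line[5:]))
    elif line.startswith("GESTURE:"):
        beep(line[8:])
```

Scaling the step count by the hand's offset means large corrections when the hand is far from center and small ones near it, which is what keeps the head from oscillating around the target.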

The camera/image-processing code can be accessed here.
The R2D2 (ESP32) code can be accessed here.

Main Python and MicroPython Skills Used:
- Establishing a USB serial connection
- MediaPipe integration, specifically using hand landmarks
- Stepper motor control and sensitivity scaling
- Creating structured commands for serial communication (a sketch follows this list)
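
Tying the first and last skills together, here is a hedged sketch of how the host side might open the USB serial connection and frame structured commands; the port-matching heuristic and the KEY:VALUE format are assumptions for illustration, not the project's exact protocol:

```python
import serial
import serial.tools.list_ports

def open_esp32(baud=115200):
    # Heuristic port discovery; match on your board's actual USB bridge name.
    for port in serial.tools.list_ports.comports():
        if "USB" in port.device or "usbserial" in port.device:
            return serial.Serial(port.device, baud, timeout=1)
    raise RuntimeError("no ESP32-like serial port found")

def send(conn, key, value):
    # One command per line, KEY:VALUE, so the MicroPython side can
    # parse each message with a single readline().
    conn.write(f"{key}:{value}\n".encode())

# e.g. send(conn, "GESTURE", "PEACE") or send(conn, "MOVE", "-0.18")
```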
