AirSketch is an advanced computer vision application that enables users to draw in the air using hand gestures. It utilizes real-time hand tracking and gesture recognition to create a virtual drawing canvas. This project demonstrates the integration of computer vision techniques, gesture recognition algorithms, and real-time video processing.
- Python 3.7+: Core programming language 🐍
- OpenCV 4.5+: Computer vision library for image processing and drawing 📷
- MediaPipe 0.8.9+: Machine learning framework for hand tracking ✋
- NumPy 1.19+: Numerical computing library for efficient array operations 📊
- Real-time Hand Tracking: Utilizes MediaPipe's hand landmark detection model to track 21 3D hand landmarks at 30+ FPS.
- Gesture Recognition: Implements custom logic that detects a raised index finger to start and stop drawing.
- Dynamic Color Selection: Provides an interactive UI for real-time color switching during drawing.
- Adaptive Smoothing: Employs a distance-based point sampling technique to reduce jitter and improve line quality.
- Performance Optimization: Incorporates frame resolution reduction and efficient drawing algorithms to minimize latency.
- Multi-layer Rendering: Combines the input video feed with the drawing canvas using alpha blending for a seamless user experience.
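As a concrete illustration of the blending step in the last item, a minimal sketch (the weights and frame size are illustrative, not the project's values):

```python
import cv2
import numpy as np

frame = np.full((480, 640, 3), 80, dtype=np.uint8)   # stand-in for a webcam frame
canvas = np.zeros_like(frame)                        # drawing layer, same size
cv2.line(canvas, (100, 240), (540, 240), (0, 0, 255), 4)

# Alpha blending: 70% live frame, 30% canvas.
blended = cv2.addWeighted(frame, 0.7, canvas, 0.3, 0)
```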
The application follows a modular architecture:
- Input Module: Captures and preprocesses video frames from the webcam.
- Hand Detection Module: Utilizes MediaPipe to detect and track hand landmarks.
- Gesture Recognition Module: Analyzes hand landmark positions to recognize drawing gestures.
- Drawing Module: Manages the canvas state and renders lines based on recognized gestures.
- UI Module: Handles the creation and interaction with the color selection and clear button interface.
- Output Module: Combines processed frames, UI elements, and the drawing canvas for final display.
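Per frame, these modules chain into a loop roughly like the following sketch; every function name here is a hypothetical stand-in for the corresponding module, not the project's actual API:

```python
import cv2
import numpy as np

# Hypothetical stand-ins for the modules above; in the real application each
# would be its own component with actual logic.
def detect_hands(frame): return None             # Hand Detection Module
def recognize_gesture(landmarks): return None    # Gesture Recognition Module
def update_canvas(canvas, gesture): return canvas  # Drawing Module
def draw_ui(frame): return frame                 # UI Module

cap = cv2.VideoCapture(0)                        # Input Module
canvas = None
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                   # mirror view for natural drawing
    if canvas is None:
        canvas = np.zeros_like(frame)            # blank drawing layer
    gesture = recognize_gesture(detect_hands(frame))
    canvas = update_canvas(canvas, gesture)
    frame = draw_ui(frame)
    output = cv2.addWeighted(frame, 0.7, canvas, 0.3, 0)  # Output Module
    cv2.imshow("AirSketch", output)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```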
Utilizes MediaPipe's palm detection model followed by its hand landmark model to identify 21 3D landmarks on each detected hand.
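A minimal sketch of that two-stage pipeline through MediaPipe's Hands API (parameter values are illustrative); the `hand_landmarks` object it yields is what the gesture check below consumes:

```python
import cv2
import mediapipe as mp

# Hands runs palm detection first, then the landmark model, and returns
# 21 normalized 3D landmarks per detected hand.
hands = mp.solutions.hands.Hands(
    static_image_mode=False,       # video mode: reuse detections across frames
    max_num_hands=1,
    min_detection_confidence=0.5,
)

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if ok:
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        hand_landmarks = results.multi_hand_landmarks[0]
        print(len(hand_landmarks.landmark))  # 21
```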
```python
def is_index_finger_raised(hand_landmarks):
    # Landmark 8 is the index fingertip; landmark 6 is the index PIP joint.
    # Image y grows downward, so a smaller y means the tip is above the joint.
    return hand_landmarks.landmark[8].y < hand_landmarks.landmark[6].y
```
This function compares the y-coordinates of the index fingertip (landmark 8) and the finger's PIP joint, its middle knuckle (landmark 6). Because MediaPipe's normalized coordinates place the origin at the top-left of the frame, a smaller y-value means the fingertip sits above the knuckle, so the finger is raised.
```python
# Distance-based sampling: draw a segment only when the fingertip has
# moved more than min_distance pixels since the last drawn point.
if prev_point and np.linalg.norm(np.array(index_finger_tip) - np.array(prev_point)) > min_distance:
    cv2.line(canvas, prev_point, index_finger_tip, colors[colorIndex], line_thickness)
    prev_point = index_finger_tip
```
By rendering a segment only after the fingertip has travelled more than `min_distance` pixels, this algorithm suppresses jitter and cuts down on draw calls, improving both line quality and performance.
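One detail the fragment glosses over: MediaPipe landmarks are normalized to [0, 1], so the fingertip must be scaled to pixel coordinates before `cv2.line` and the pixel-space `min_distance` check can use it. A hedged sketch (`to_pixels` is a hypothetical helper, not from the project source):

```python
from types import SimpleNamespace

def to_pixels(landmark, frame_width, frame_height):
    """Map a normalized MediaPipe landmark (x, y in [0, 1]) to pixel coordinates."""
    return (int(landmark.x * frame_width), int(landmark.y * frame_height))

# Stand-in for hand_landmarks.landmark[8] (the index fingertip).
tip = SimpleNamespace(x=0.5, y=0.25)
print(to_pixels(tip, 640, 480))  # (320, 120)
```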
- Frame Resolution: Reduced to 640x480 to balance processing speed and visual quality.
- Hand Detection Confidence: Set to 0.5 to optimize the trade-off between accuracy and speed.
- Drawing Optimization: Direct line drawing instead of complex smoothing algorithms to reduce latency.
- UI Rendering: Pre-rendered UI elements to minimize per-frame computation.
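The first two settings translate to configuration along these lines (a sketch, not the project's exact code):

```python
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)    # reduced resolution for speed
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

hands = mp.solutions.hands.Hands(
    min_detection_confidence=0.5,         # accuracy/speed trade-off
)
```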
- Ensure Python 3.7+ is installed.
- Install required libraries:
```bash
pip install opencv-python mediapipe numpy
```
- Clone the repository:
```bash
git clone https://github.com/yourusername/AirSketch.git
cd AirSketch
```
Run the application:
```bash
python Air_Sketch.py
```
- Use your index finger to draw in the air. ✍️
- Touch the color buttons at the top of the screen to change the drawing color. 🎨
- Use the 'CLEAR' button to reset the canvas. 🔄
- Press 's' to save your drawing, '+'/'-' to adjust line thickness, and 'q' to quit.
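The keyboard handling behind these shortcuts could look roughly like this inside the main loop (a sketch; the output filename and thickness bounds are assumptions, not taken from the project source):

```python
import cv2
import numpy as np

canvas = np.zeros((480, 640, 3), dtype=np.uint8)
line_thickness = 4

key = cv2.waitKey(1) & 0xFF
if key == ord('s'):
    cv2.imwrite("air_sketch.png", canvas)           # save the drawing
elif key == ord('+'):
    line_thickness = min(line_thickness + 1, 20)    # thicker lines
elif key == ord('-'):
    line_thickness = max(line_thickness - 1, 1)     # thinner lines
elif key == ord('q'):
    pass  # in the real loop: break out and release the capture
```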
- Implement multi-hand support for collaborative drawing.
- Integrate machine learning for gesture customization.
- Develop 3D drawing capabilities using depth estimation techniques.
- Optimize for mobile devices using TensorFlow Lite.
Contributions are welcome! Please fork the repository and submit a pull request with your improvements.
This project is licensed under the MIT License - see the LICENSE.md file for details.
- MediaPipe team for their hand-tracking solution
- OpenCV contributors for the comprehensive computer vision toolkit