GSoC 2025 Project #24 GestureCap :: Markerless gesture-recognition and motion-capture ML/AI for music and speech generation; develop neuroscientific and psychological theories (music creativity, music-movement-dance interactions) (350h)

Title: Markerless gesture-recognition and motion-capture ML/AI-based tool to drive music and speech generation, and to develop neuroscientific/psychological theories of music creativity and music-movement-dance interactions

Mentors: Louis Martinez <louis.martinez@telecom-paris.fr>, Yohai-Eliel Berreby <yohaiberreby@gmail.com>, and Suresh Krishna <suresh.krishna@mcgill.ca>

Skill level: Intermediate - Advanced

Required skills: Comfortable with Python and modern AI tools. Experience with image/video processing and with deep-learning-based image-processing models. Familiarity with C/C++ programming, low-latency sound generation, image processing, and audiovisual displays, as well as with MediaPipe, is an advantage, though not necessary. Familiarity with Max/Csound/Pure Data is preferred.

Time commitment: Full time (350 hours)

Forum for discussion

About: Last year, we developed GestureCap (GSoC 2024 report · GitHub), a tool that uses markerless gesture recognition and motion capture (via Google’s MediaPipe) to create a working framework whereby movements can be translated into sound with short latency, allowing, for example, new gesture-driven musical instruments and dancers who control their own music while dancing.
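
To make this pipeline concrete, here is a minimal sketch, not the GestureCap implementation itself, of the kind of loop such a system runs: MediaPipe hand tracking on webcam frames, with a fingertip coordinate mapped to a pitch value sent over OSC to a synth (for example, a Pure Data or Max patch listening on a UDP port). The OSC address, port, and pitch mapping below are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the GestureCap codebase): MediaPipe hand
# tracking mapped to an OSC message that a Pure Data / Max patch could turn
# into sound. The address "/gesture/pitch" and port 9000 are placeholders.
import cv2
import mediapipe as mp
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)  # assumed synth host/port
hands = mp.solutions.hands.Hands(max_num_hands=1,
                                 min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV delivers BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Landmark 8 is the index fingertip; y is normalized to [0, 1]
        # with 0 at the top of the frame, so invert it so that a raised
        # hand maps to a higher pitch.
        tip = results.multi_hand_landmarks[0].landmark[8]
        midi_pitch = 48 + (1.0 - tip.y) * 36  # hand height -> ~3 octaves
        client.send_message("/gesture/pitch", midi_pitch)
    cv2.imshow("preview", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
        break

cap.release()
hands.close()
```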

Aims: This year, we aim to build on this initial proof of concept to create a usable tool that enables gesture-responsive music and speech generation, and to characterize the low-latency properties of this system and the sense of agency it enables. The development of GestureCap will facilitate both artistic creation and scientific exploration of multiple areas, including, for example, how people engage interactively with vision, sound, and movement and combine their respective latent creative spaces. Such a tool will also have therapeutic/rehabilitative applications for people with limited ability to generate music, in whom agency and creativity in producing music have been shown to have beneficial effects.
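
On the latency-characterization aim: as a hedged sketch of what measuring this could look like (an assumed approach, not GestureCap's actual instrumentation), one can time each processing stage per frame and report percentiles. Note that this captures only compute latency; a true gesture-to-sound figure would also include camera capture and audio-output buffering.

```python
# Sketch of per-frame latency measurement (assumed approach). Only the
# compute stage is timed; camera exposure/transfer and audio buffering
# must be measured separately for end-to-end gesture-to-sound latency.
import time
import statistics

latencies_ms = []

def timed_call(stage, frame):
    """Run one pipeline stage and record its wall-clock duration."""
    t0 = time.perf_counter()
    result = stage(frame)
    latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    return result

def report():
    """Summarize the distribution; the tail (p95/max) matters more than
    the mean for perceived responsiveness."""
    qs = statistics.quantiles(latencies_ms, n=100)
    print(f"median {qs[49]:.1f} ms, p95 {qs[94]:.1f} ms, "
          f"max {max(latencies_ms):.1f} ms")

# Example with a dummy stage standing in for detection + sound mapping:
for _ in range(200):
    timed_call(lambda frame: frame, None)
report()
```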

Website: GSoC 2024 report · GitHub

Tech keywords: Sound/music generation, Image processing, Python, MediaPipe, Wekinator, AI, Deep-learning

Thanks for sharing the information about the GestureCap project for GSoC 2025. I am very keen to work on this project and extend the efforts of 2024.

As a machine learning, deep learning, and computer vision practitioner, I find that the concept of gesture-controlled music and audio synthesis resonates with me, and I am intrigued by the project’s cross-disciplinary combination of AI, music, and motion capture. I would be keen to learn more about advancing gesture recognition with MediaPipe, especially in areas like low-latency performance optimization, time-synchronized gesture tracking, and deeper integration with audio synthesis.

It would be a big help if you could let me know how I can contribute and whether there are any microtasks or first steps I can take to get started. I look forward to your guidance!

Sincerely,
Harsh Gupta
GitHub: 4444Harsh

Dear Mentors,

I hope you are doing well. My name is Yash Pathak, and I am a third-year engineering student from India with experience in AI, machine learning, and real-time control systems. I came across the GestureCap project in the GSoC 2025 project list and found it highly aligned with my skills and interests.

I have experience in Python, deep learning, image processing, and real-time signal processing. I have previously worked with MediaPipe, OpenCV, and neural networks for gesture recognition and AI-driven applications. The idea of using markerless gesture recognition for music and speech generation excites me, and I would love to contribute to improving the system’s latency, accuracy, and user interaction.

I have already explored the GestureCap repository/documentation. I would greatly appreciate your guidance on how to get started.

Looking forward to the opportunity to contribute to GestureCap!

Best regards,
Yash Pathak

yashpradeeppathak@gmail.com
https://www.linkedin.com/in/vindicta07/