This is a new exploratory project that aims to use recent advances in markerless motion-capture software (MediaPipe, Jarvis, DeepLabCut) to create a working framework in which movements are translated into sound with short latency, enabling, for example, new gesture-driven musical instruments and dancers who control their own music while dancing. The project requires familiarity with Python and the ability to interface with external packages such as MediaPipe and Jarvis. Familiarity with low-latency sound generation, image processing, and audiovisual displays is an advantage, though not necessary. The development of such tools will facilitate both artistic creation and scientific exploration of several areas, for example how people engage interactively with vision, sound, and movement and combine their respective latent creative spaces. Such a tool will also have therapeutic and rehabilitative applications for people with limited ability to make music, in whom agency and creativity in music-making have been shown to produce beneficial effects.
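To illustrate the kind of gesture-to-sound mapping the project envisions, here is a minimal Python sketch. It assumes the normalized landmark coordinates that MediaPipe produces (y runs from 0 at the top of the frame to 1 at the bottom); the function name and the exponential pitch mapping are illustrative choices, not part of any existing project code.

```python
def hand_height_to_frequency(y_norm: float,
                             f_min: float = 220.0,
                             f_max: float = 880.0) -> float:
    """Map a normalized vertical landmark coordinate to a frequency in Hz.

    y_norm follows the MediaPipe convention: 0.0 at the top of the frame,
    1.0 at the bottom. The mapping is exponential, so equal amounts of
    hand movement correspond to equal musical intervals, and raising the
    hand (smaller y_norm) raises the pitch.
    """
    y_norm = min(max(y_norm, 0.0), 1.0)   # clamp to the visible frame
    t = 1.0 - y_norm                      # invert: hand up -> higher pitch
    return f_min * (f_max / f_min) ** t


# Hand at the bottom of the frame -> lowest pitch (220 Hz, A3);
# hand at the top -> highest pitch (880 Hz, A5); midway -> 440 Hz (A4).
print(hand_height_to_frequency(1.0))  # 220.0
print(hand_height_to_frequency(0.5))  # 440.0
print(hand_height_to_frequency(0.0))  # 880.0
```

In a real pipeline this value would be recomputed per video frame and streamed to a low-latency synthesizer; the sketch only shows the mapping step that sits between the pose tracker and the sound engine.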
Skill level: Intermediate/advanced
Required skills: Comfortable with Python. Experience with image/video processing and with deep-learning-based image-processing models. Familiarity with C/C++ programming, low-latency sound generation, image processing, and audiovisual displays, as well as with MediaPipe and/or DeepLabCut/Jarvis, is an advantage, though not necessary.
Time commitment: Full-time (350 h)
Lead mentor: Suresh Krishna
Project website: github.com/m2b3/Gestures
Backup mentors: Yohai-Eliel Berreby
Tech keywords: Sound/music generation, Image processing, Python, MediaPipe, Wekinator
This project is thematically very similar to LivePose from Metalab, and consultations with Metalab will be possible: sat-mtl / tools / LivePose · GitLab
My name is Soham Mulye, and I am a second-year CSE student from Mumbai.
I’ve previously worked with image processing and various ML models.
Here’s a link to one of my projects, on which I collaborated with my classmates and which won second place in the OpenCV Spatial AI competition:
I’d like to contribute to this project.
Could you please advise me on where and how to begin contributing?
Welcome aboard, @soham_mulye. It is quite possible that this particular sub-project will migrate to the Metalab Livepose GSoC org, where it belongs more naturally. I will announce that here, and then you can start the conversation along with the Metalab folks. Stay tuned.
Thank you @suresh.krishna for your reply; I will wait for your communication in this regard.
In the meantime, are there any similar projects, or other projects based on deep learning or image processing, that this organization will mentor and that I could start contributing to?
Thank you for your suggestion.
I went over the project, as well as the previous year’s project. The eye-tracking portion seems very interesting, and I believe I could contribute to it. However, since this year’s project primarily focuses on the development of an application, I don’t think I could make a significant contribution, as I am not familiar with app development.
It is a Windows program written in Python. App was perhaps a bad choice of words. It is not intended to be an Android/iOS app. @soham_mulye
I would love to work on the project then. Could you please guide me on where and how I should start contributing? @suresh.krishna
Hi @suresh.krishna. I am Shikha Sharma, a third-year student at IIT Kanpur. I am interested in this project and am willing to start working on it. I did a project on a face-recognition-based smart attendance system under the Microsoft Engage Mentorship Program ’22, and I also learnt various image-processing techniques and algorithms in my Image Processing course. I would really love to get started with contributing. I’ve attached my CV for your reference.
Link: Shikha Sharma_Resume.pdf - Google Drive
@Shikha_Sharma - please see the thread above.