Hi everyone,
My name is Dhanush, and I’m a Computer Science student interested in AI systems, ML/DL, and LLM-based tools for research workflows. I’m planning to apply to GSoC 2026 with INCF and wanted to introduce myself and start a discussion around a couple of project ideas that caught my interest.
I’ve recently worked on projects involving RAG pipelines, semantic search, embeddings, and agentic AI workflows, where LLMs are used for multi-step reasoning and structured outputs.
Two projects I’m particularly interested in are:
33 - AStats: an agentic-AI approach to applied statistical practitioner workflows
(Mentors: Jonathan Morris, Yohai-Eliel Berreby, Suresh Krishna)
Building an agentic system for dataset exploration and statistical workflows from scratch sounds very interesting to me, especially the question of how LLM-based agents can assist practitioners in exploratory and confirmatory analysis.
40 - Semantic Search for Neuroimaging Datasets (Neurobagel)
(Mentors: Alyssa Dai, Arman Jahanpour, Brent McPherson, Sebastian Urchs)
I’m also very interested in this project since it involves local embeddings and semantic search over dataset metadata, which aligns with some of the retrieval and embedding systems I’ve worked with.
I’ve started exploring the repositories and have already opened a few PRs (currently waiting for review/merge). In the meantime, I’m continuing to explore the codebases and draft my proposal.
If mentors or contributors have suggestions on areas worth exploring early or issues that would be good starting points, I’d really appreciate the guidance.
Looking forward to learning from and contributing to the community.
Thanks!
Dhanush
GitHub | LinkedIn