Prototype: Local RAG Agent for KnowledgeSpace (Privacy-Focused & Metal Optimized) - Seeking Mentor Feedback

Hi everyone,

My name is Sai Vinay, and I am a Data Science undergraduate interested in the intersection of LLMs and Neuroinformatics.

In preparation for the upcoming GSoC 2026 cycle (I am aiming for the KnowledgeSpace/Agent projects), I built a proof-of-concept tool called Neuro-RAG-Assistant.

Repository: Saivinay24/neuro-rag-assistant on GitHub. It is a local RAG assistant optimized for Apple Metal (GPU), meant to speed up knowledge discovery in neuroscience academic documentation.

What it does: It lets researchers run semantic searches over local neuroscience textbooks (PDFs). FAISS handles the similarity search over the indexed text, and a quantized Phi-3 model, run through llama.cpp, produces the answers. Everything runs on the local machine, so documents never leave it and there are no API costs.
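
For readers who want the gist without opening the repo, the retrieve-then-prompt flow can be sketched roughly as below. This is a minimal, dependency-free illustration and not the repository's actual code: a toy bag-of-words vector stands in for a real embedding model, a brute-force cosine ranking stands in for the FAISS index, and the final call to Phi-3 through llama.cpp is left out. The names `embed`, `retrieve`, and `build_prompt`, as well as the sample text chunks, are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Rank chunks by similarity to the query (brute force; a vector index
    such as FAISS would do this step at scale)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble the prompt that would be passed to the local LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical snippets standing in for text extracted from textbook PDFs.
chunks = [
    "The hippocampus is critical for forming new episodic memories.",
    "Action potentials propagate along the axon via voltage-gated sodium channels.",
    "The cerebellum coordinates fine motor control and balance.",
]

query = "Which brain region supports episodic memory formation?"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In the real pipeline the prompt string would be handed to the quantized model instead of printed, and the chunk vectors would be computed once and stored in the index rather than recomputed for every query.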

My questions for mentors: As I refine this, I want to align it with INCF standards.

  1. Are there specific metadata schemas (like BIDS derivatives) that are preferred for indexing “unstructured” text data in KnowledgeSpace?
  2. Would integration with the NWB (Neurodata Without Borders) documentation be a valuable test case for this RAG system?

Any feedback on the code or approach would be greatly appreciated!

Best, Sai Vinay