GSoC 2026 Project #7: Open Source Community Sustainability Greenfield

Mentors: Bradly Alicea <bradly.alicea@outlook.com>, Sarrah Bastawala <sarrahbastaw@google.com>, Jesse Parent <jesse@jopro.org>

Skill level: Advanced

Required skills: Expertise in, or the ability to integrate, multiple development environments is an important baseline skill. The ability to extract model representations from complex systems is helpful. Knowledge of open-source development practices and an interest in interdisciplinary research are a must.

Time commitment: Full-time (350 hours)

About: Open-source communities are only as powerful as their ability to collectively complete tasks and projects. One way to strengthen the functional capacity of such a community is to model the collective behavioral and cognitive aspects of day-to-day project engagement. Your work will involve the maintenance, development, and further implementation of two models from past years: a Reinforcement Learning model and a hybrid Agent-based/Large Language Model. The candidate will build an analytical model that incorporates features such as general feedback loops (recurrent relationships) and causal loops (reciprocal causality). This might take the form of a traditional boxes-and-arrows (input-output) model, or something more exotic such as Reinforcement Learning.

Aims: In 2024, GitHub activity was simulated with a CodeSpace Environment. This included generating issues of varying difficulty for both contributor and maintainer agents. Implementing CodeSpace resulted in the following capabilities: different coding ability and experience levels; a discussion space where agents can discuss various approaches to a particular task; an automated pull request lifecycle; and multiple decision-making algorithms for allocating tasks to contributors, with corresponding simulation metrics. For 2025, you might help to improve upon the underlying models (Add Collaboration Algorithm for Multiple Agents on a Single Issue · Issue #64 · OREL-group/GSoC · GitHub) or add a ConversationSpace (Add ConversationSpace (to Simulate Slack) · Issue #60 · OREL-group/GSoC · GitHub) within this framework to simulate an IRC (Internet Relay Chat) / Slack / Discord model, an essential part of many open source communities. Last year’s project utilized RAG (Retrieval Augmented Generation) (Integrate RAG within ConversationSpace and GithubDiscussion · Issue #62 · OREL-group/GSoC · GitHub), but other approaches backed by research are also welcome. Our goal is to develop one or more maintainers of the platform who are also capable of research software engineering (https://www.hpcwire.com/off-the-wire/ncsa-innovators-bridging-research-and-software-engineering/).
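
As a rough illustration of the task-allocation idea described in the Aims, here is a minimal Python sketch. It is not taken from the existing CodeSpace code: the `Issue` and `Contributor` classes, the 1-5 difficulty and experience scales, and the `allocate_greedy` function are hypothetical names chosen for this example. It shows just one possible decision rule, in which each issue goes to the least-experienced free contributor who can still handle it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    issue_id: int
    difficulty: int  # hypothetical scale: 1 (easy) to 5 (hard)


@dataclass
class Contributor:
    name: str
    experience: int  # hypothetical scale, comparable to Issue.difficulty
    assigned: Optional[int] = None  # id of the issue currently being worked on


def allocate_greedy(issues, contributors):
    """Assign each issue to the least-experienced free contributor able to do it.

    Harder issues are placed first so that experienced contributors are not
    used up on easy tasks. Returns a dict {issue_id: contributor name or None}.
    """
    allocation = {}
    for issue in sorted(issues, key=lambda i: i.difficulty, reverse=True):
        candidates = [
            c for c in contributors
            if c.assigned is None and c.experience >= issue.difficulty
        ]
        if candidates:
            chosen = min(candidates, key=lambda c: c.experience)
            chosen.assigned = issue.issue_id
            allocation[issue.issue_id] = chosen.name
        else:
            allocation[issue.issue_id] = None  # stays open for a later step
    return allocation


issues = [Issue(1, 2), Issue(2, 5), Issue(3, 1)]
contributors = [Contributor("ana", 5), Contributor("bo", 2), Contributor("cy", 1)]
print(allocate_greedy(issues, contributors))
```

Swapping `allocate_greedy` for another rule (random assignment, contributor self-selection, or a learned policy) while keeping the same inputs and outputs is one simple way to compare decision-making algorithms under identical simulated conditions.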

What can I do before GSoC?

You can join the Orthogonal Lab Slack and GitHub, as well as attend our Saturday Morning NeuroSim meetings. You might also become familiar with the existing codebase:

group/GSoC/tree/main/Open%20Source%20Sustainability%20using%20RL

Project website: https://orthogonal-research.weebly.com

Tech keywords: Computational Modeling, Reinforcement Learning, Language Models

Hi, I recently came across this GSoC project and have spent time diving into the LLAMOSC codebase in detail.
During my exploration, I identified that the collaborative algorithm (--algorithm c) existed in sim.py but was not connected to the main simulation loop, and that the ConversationSpace data pipeline was not logging contributor discussions in collaborative mode. I addressed these issues and was able to run the first successful end-to-end simulation under the collaborative algorithm, including team formation (Lead/Reviewer/Support roles), LLM-generated collaboration logs, and working ConversationSpace engagement metrics.
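
To make the team-formation step concrete, here is a minimal Python sketch of one way the three roles could be assigned. It is an illustration only and is not copied from sim.py: the `form_team` function, the experience scores, and the rule "most experienced agent leads, second reviews, the rest support" are all assumptions made for this example.

```python
def form_team(experience_by_agent, team_size=3):
    """Pick a team and assign Lead / Reviewer / Support roles by experience.

    experience_by_agent: dict mapping a (hypothetical) agent name to a numeric
    experience score. Returns a dict {agent name: role}.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    # Rank agents from most to least experienced; ties are broken by name so
    # the result is deterministic.
    ranked = sorted(experience_by_agent, key=lambda a: (-experience_by_agent[a], a))
    team = ranked[:team_size]
    roles = {}
    for position, agent in enumerate(team):
        if position == 0:
            roles[agent] = "Lead"
        elif position == 1:
            roles[agent] = "Reviewer"
        else:
            roles[agent] = "Support"
    return roles


agents = {"ana": 5, "bo": 2, "cy": 4, "dee": 1}
print(form_team(agents))
```

With the sample scores above, the two most experienced agents become Lead and Reviewer and the third-ranked agent becomes Support; an agent outside the top `team_size` is left free for other issues.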
My proposal builds on this work by focusing on fully integrating the collaborative mode, introducing a real-world data ingestion pipeline from IRC, Slack, and Discord archives, and refactoring RAG with a persistent vector store and prompt-level context injection.
Code is available here:
https://github.com/Chrishhh9874/GSoC/tree/feat/collaborative-algorithm-conversationspace
I have submitted my proposal and would be happy to discuss it further if there is an opportunity. Thank you for your time.

Kewen (Chris) Chen