Mentors: Bradly Alicea <bradly.alicea@outlook.com>, Sarrah Bastawala <sarrahbastaw@google.com>, Jesse Parent <jesse@jopro.org>
Skill level: Advanced
Required skills: Expertise in, or the ability to integrate, multiple development environments is an important baseline skill. The ability to extract model representations from complex systems is helpful. Knowledge of open-source development practices and an interest in interdisciplinary research are a must.
Time commitment: Full-time (350 hours)
About: Open-source communities are only as powerful as their ability to collectively complete tasks and projects. One way to strengthen the functional capacity of such a community is to model the collective behavioral and cognitive aspects of day-to-day project engagement. This project involves the maintenance, development, and further implementation of two models from past years: a Reinforcement Learning model and a hybrid Agent-based/Large Language Model. The candidate will build an analytical model that incorporates features such as general feedback loops (recurrent relationships) and causal loops (reciprocal causality). This might take the form of a traditional boxes-and-arrows (input-output) model, or something more exotic such as Reinforcement Learning.
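As a rough illustration of what a causal loop (reciprocal causality) can look like in code, the sketch below couples two quantities so that each one drives the other. The variable names (engagement, responsiveness), the linear update rule, and the default coupling and decay values are all hypothetical and are not taken from the existing project models.

```python
def simulate_loop(engagement, responsiveness, steps, coupling=0.1, decay=0.05):
    """Discrete-time reciprocal loop between two hypothetical quantities.

    At every step each variable is pushed up in proportion to the other
    (the causal loop) and shrinks in proportion to itself (decay).
    Returns the full trajectory as a list of (engagement, responsiveness).
    """
    history = [(engagement, responsiveness)]
    for _ in range(steps):
        # Compute both updates from the previous state before overwriting it.
        new_engagement = engagement + coupling * responsiveness - decay * engagement
        new_responsiveness = responsiveness + coupling * engagement - decay * responsiveness
        engagement, responsiveness = new_engagement, new_responsiveness
        history.append((engagement, responsiveness))
    return history


if __name__ == "__main__":
    for step, (e, r) in enumerate(simulate_loop(1.0, 0.0, steps=5)):
        print(step, round(e, 4), round(r, 4))
```

With coupling set to zero the loop is broken and both quantities simply decay, which gives a quick baseline for comparing against the coupled case.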
Aims: In 2024, GitHub activity was simulated with a CodeSpace Environment. This included generating issues of varying difficulty for both contributor and maintainer agents. Implementing CodeSpace resulted in the following capabilities: agents with different coding abilities and experience levels, a discussion space where agents can discuss approaches to a particular task, an automated pull request lifecycle, multiple decision-making algorithms for allocating tasks to contributors, and corresponding simulation metrics. For 2025, you might help to improve upon the underlying models (Add Collaboration Algorithm for Multiple Agents on a Single Issue, Issue #64, OREL-group/GSoC) or add a ConversationSpace (Add ConversationSpace (to Simulate Slack), Issue #60, OREL-group/GSoC) within this framework to simulate an IRC (Internet Relay Chat) / Slack / Discord model, an essential part of many open source communities. Last year’s project utilized RAG (Retrieval Augmented Generation) (Integrate RAG within ConversationSpace and GithubDiscussion, Issue #62, OREL-group/GSoC), but other approaches backed by research are also welcome. Our goal is to develop one or more maintainers of the platform who are also capable of research software engineering (https://www.hpcwire.com/off-the-wire/ncsa-innovators-bridging-research-and-software-engineering/).
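To make the idea of task allocation among agents of varying experience more concrete, here is a minimal sketch of one possible greedy allocation rule. It is not the algorithm used in the existing CodeSpace implementation; the Issue and Contributor classes, the 1 to 5 difficulty and experience scales, and the allocate function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    issue_id: int
    difficulty: int  # hypothetical scale: 1 (easy) to 5 (hard)


@dataclass
class Contributor:
    name: str
    experience: int  # hypothetical scale: 1 (novice) to 5 (expert)


def allocate(issues, contributors):
    """Greedy allocation, assuming one issue per contributor.

    Issues are handled hardest first. Each goes to the least experienced
    free contributor whose experience is at least the issue difficulty.
    Issues nobody qualifies for are left out of the returned mapping.
    """
    assignments = {}
    free = sorted(contributors, key=lambda c: c.experience)
    for issue in sorted(issues, key=lambda i: i.difficulty, reverse=True):
        for candidate in free:
            if candidate.experience >= issue.difficulty:
                assignments[issue.issue_id] = candidate.name
                free.remove(candidate)
                break  # stop scanning once the issue is assigned
    return assignments


if __name__ == "__main__":
    issues = [Issue(1, 2), Issue(2, 5), Issue(3, 1)]
    contributors = [Contributor("ana", 5), Contributor("ben", 2), Contributor("cy", 1)]
    print(allocate(issues, contributors))
```

Matching each issue to the least experienced qualifying contributor keeps experts available for harder issues; other rules (random assignment, bidding, learned policies) could be swapped in behind the same interface and compared using the simulation metrics.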
What can I do before GSoC?
You can join the Orthogonal Lab Slack and GitHub, as well as attend our Saturday Morning NeuroSim meetings. You might also become familiar with the existing codebase:
- LLAMOSC (hybrid Agent-based/Large Language Model): Open Source Sustainibility using LLMs (main branch), OREL-group/GSoC
- MARLSOC (Multi-agent Reinforcement Learning): https://github.com/OREL-group/GSoC/tree/main/Open%20Source%20Sustainability%20using%20RL
Project website: https://orthogonal-research.weebly.com
Tech keywords: Computational Modeling, Reinforcement Learning, Language Models