GSoC 2026 Project #18: SciCommons: Full-stack development and AI features for front-end, back-end and browser extensions

Mentors: Armaan Alam <armanalam032001@gmail.com>, Mohd Faisal Ansari <fa058593@gmail.com>, Suresh Krishna <suresh.krishna@mcgill.ca>

Skill level: Intermediate – Advanced

Required Skills: If working on the front-end, familiarity with Next.js, React, Server Side Rendering, TanStack Query, and component-based UI systems such as ShadCN/Radix UI is expected. If working on the backend, familiarity with Django and REST APIs is useful. For AI/NLP work, experience with LLMs, NLP pipelines, and open-source models is preferred. Experience with browser extensions or research tools is a plus.

Time commitment: Full time (350 hours)

About: Over the last few GSoC cycles, we have built SciCommons (https://www.scicommons.org), a web platform to support open scientific discussion, article reviews, ratings, and community-driven evaluation of research. The core frontend (Next.js) and backend (Django) are now stable and in active development and testing, and the platform is approaching a wider public release. The project is now moving into a phase of expanding features, improving usability, and building tools around the platform.

Aims: This year, we invite GSoC contributors to help extend SciCommons in several directions: improving the front-end experience, building a browser extension to connect SciCommons directly with relevant websites, adding AI-based tools for a range of tasks including literature discovery and summarization, and developing features for community-run journals and editorial workflows. The goal is to make SciCommons not just a place to comment on papers, but a complete environment for discovering, reading, discussing, and organizing scientific knowledge. The work will focus on the following areas:

  • Front-end and UX improvements for reading, reviewing, and discussion workflows

  • A browser extension to save papers, view discussions, and annotate articles from anywhere on the web

  • AI tools for semantic search, recommendations, and paper/discussion summarization using open-source models

  • Journal and community management features (editorial flows, moderation, curation, integrations)

Website: https://alphatest.scicommons.org

Tech keywords: Science publishing, social web, science portals, Next.js, Django, browser extensions, natural language processing, large language models, AI-assisted literature discovery.


Hi everyone,

My name is Kaja Obinna and I’m interested in contributing to SciCommons for GSoC 2026.

A bit about me: I have hands-on experience building Python/Flask backend systems and working with Celery, Redis, and Docker in production-like environments, which maps closely to the SciCommons backend.

I’ve already explored both repos and found several open issues I can contribute to, particularly around Celery async task implementation (issue #8) and the local development setup (issue #125). I plan to open PRs on these shortly.

For GSoC, I’m most interested in the Django backend and AI/NLP features, specifically improving async task handling, building out API endpoints, and contributing to the literature discovery and summarization tools.

A few questions:

  1. Are there backend areas you’d like contributors to prioritize during the application period?
  2. Is there a preferred channel for discussing implementation approaches before opening a PR?
  3. Any good first issues beyond the ones I’ve already identified?

Looking forward to working with the team.

GitHub: xenacode-art (Erik Obinna)


Hello, thank you for the interest and you are welcome. Please join the GSoC2026 community on test.scicommons.org. We are “dogfooding” the site there.

Hello everyone,

My name is Juned Khan and I am a BCA 5th semester student with experience in the MERN stack and TypeScript. I have built several full-stack projects and have also worked with AI APIs like Gemini.

I am very interested in contributing to the SciCommons project, especially in the areas of frontend improvements, browser extensions, and AI-based tools.

Could you please guide me on how I can start exploring the codebase and where new contributors can begin?

Thank you!


Please go through the messages above.

Hello everyone,
My name is Katerina and I’m very interested in contributing to the SciCommons project for GSoC 2026. I have a background in software engineering and I also hold a Master’s degree in Neuroscience. During my studies I completed a six-month research internship at a neuroscience laboratory at UCL, where I worked with research workflows and scientific data.
From a technical perspective, I have experience with backend development (Java / REST APIs) and I am also interested in AI/NLP applications for scientific literature, such as semantic search, summarization and knowledge discovery.
The goal of SciCommons to improve scientific discussion and discovery resonates a lot with my research background, so I would be very excited to contribute to features around:

  • AI-assisted literature discovery and summarization
  • research discussion workflows
  • scientific knowledge organization

I have started exploring the platform and would love to begin contributing.
A few questions:

  1. Are there recommended repositories or “good first issues” for new contributors?
  2. Are there specific areas (frontend, backend, AI features, browser extension) where help is most needed right now?
  3. Is there a development guide or preferred workflow for first contributions?

Looking forward to learning more and contributing to the project.

Best regards,
Katerina Eleftheriadi


@katerinaeleftheriadi - welcome. Please go through the messages above. We will be happy to see you at the test.scicommons.org community site.

Hello everyone,

I’m a Statistics undergraduate at UNICAMP in Brazil and I’m very interested in contributing to SciCommons.

My background is mainly in data systems and AI applied to research workflows. Recently I worked on building an end-to-end pipeline that processes large collections of documents using NLP and LLM-based RAG systems, as well as developing research data tools for scientists, including database systems and a front-end interface for querying research datasets. I’ve also worked on machine learning projects in healthcare and scientific computing.

I had a couple of questions for the mentors:

For the AI/NLP components, are there specific open-source models or frameworks currently being considered for semantic search or summarization?
For the browser extension idea, is the goal mainly paper bookmarking and annotation, or also deeper integration with discussion threads and community feedback on SciCommons?

I would love to start exploring the repository and contribute if there are beginner issues or areas where help is currently needed.

Thank you!


@Julia_Soares_de_Souz - please go through the messages above, and join the community at test.scicommons.org as indicated above.

The answers to your questions are: a) no specific components yet, and b) deeper integration. For bookmarking, there is Zotero.

You may also be interested in the AStats project offered this year through INCF.

All the best.


Thank you for the suggestion about the AStats project as well! I will take a look at it.

In the meantime, I’ve joined the SciCommons test platform and will explore the community features and workflows as you suggested and come back with more specific ideas soon.

Hi everyone,

I explored the SciCommons platform and really liked the idea of community-driven peer review and article discussions.

While going through the platform, I noticed that reading and understanding research papers can still be quite time-consuming, and there’s currently limited support for intelligent discovery or summarization.

I’m particularly interested in contributing to the AI/NLP side by working on features like:

  • Paper summarization for quick understanding
  • Semantic search for better discovery of relevant research
  • Discussion summarization for extracting key insights from reviews

I have experience building AI-based systems (including embeddings, NLP pipelines, and backend APIs), and I’m planning to prototype a small feature around paper summarization to explore how it could integrate into SciCommons.

I wanted to ask:
Are there any existing AI pipelines or planned directions in this area that contributors should align with?

Looking forward to contributing!


Welcome. @NitinX

Please read the entire discussion above and join the scicommons community as indicated…

Hi everyone!

I’m Dev Gajjar, an Integrated MSc IT student from India, and I’m interested in contributing to SciCommons for GSoC 2026. I’ve spent time on the platform and read through the GSoC aims and reports to understand how the project evolved, so I have a real picture of where the codebase stands and what this iteration is building toward.

What excites me about this project specifically is the combination of challenges: it’s not just a frontend job or just an AI job; it’s making all three layers (UX, backend, and intelligent features) work together for researchers who have very specific expectations about how academic tools should behave.

My stack:

  • Frontend: Next.js, React, Angular, TanStack Query, ShadCN/Radix UI, SSR
  • Backend: Django, Node.js, REST APIs, MongoDB, Redis, distributed systems, metadata
  • AI/NLP: LLMs, prompt engineering, fine-tuning, semantic search, AWS Bedrock
  • Infrastructure: Docker, CI/CD, AWS, Linux

I’m planning to contribute PRs before submitting any proposal. Looking forward to being part of this community. @suresh.krishna @armanalam03


Hi @suresh.krishna and @armanalam03,

I’m Yash, a 3rd-year student at IIT Roorkee, and I’ve just begun ‘dogfooding’ the test site. My interest in Project #18 stems from the unique challenge of scaling a scientific discussion platform while integrating native AI discovery tools.

I currently lead the Information Management Group (IMG) at my institute, where I manage the end-to-end development of platforms serving 15,000+ active users via Django/DRF and PostgreSQL. This production experience has taught me that high-load scientific platforms require more than just feature implementation—they require architectural resilience.

Regarding the AI/summarization pipeline: I previously engineered an AI news aggregator using BERT for real-time classification and automated content generation. For SciCommons, I’m interested in moving beyond basic API calls to implement an asynchronous task priority queue system (Celery/Redis). This ensures that heavy LLM inference for paper summarization doesn’t impact the responsiveness of the core scientific discussion threads.
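
A minimal sketch of the priority idea, using only Python's standard library rather than Celery or Redis. The priority constants, the `PriorityTaskQueue` class, and the placeholder `summarize` and `post_comment` functions are all hypothetical and are not taken from the SciCommons codebase. It only shows that an interactive task submitted after a slow summarization job is still served first.

```python
import itertools
import queue

# Hypothetical priority levels: a lower number is served first.
PRIORITY_INTERACTIVE = 0   # e.g. saving a comment in a discussion thread
PRIORITY_BACKGROUND = 10   # e.g. LLM-based paper summarization


class PriorityTaskQueue:
    """In-process stand-in for a task broker with priorities."""

    def __init__(self):
        self._queue = queue.PriorityQueue()
        # Unique sequence number: keeps FIFO order within one priority
        # and prevents tuple comparison from ever reaching the callables.
        self._counter = itertools.count()

    def submit(self, priority, name, func, *args):
        self._queue.put((priority, next(self._counter), name, func, args))

    def run_all(self):
        """Run every queued task in priority order; return (name, result) pairs."""
        results = []
        while not self._queue.empty():
            _priority, _seq, name, func, args = self._queue.get()
            results.append((name, func(*args)))
        return results


def summarize(text):
    # Placeholder for a slow LLM call: returns the first sentence only.
    return text.split(".")[0].strip() + "."


def post_comment(body):
    # Placeholder for a fast, user-facing operation.
    return f"comment saved: {body}"


tasks = PriorityTaskQueue()
tasks.submit(PRIORITY_BACKGROUND, "summarize", summarize,
             "Open review helps science. It can be slow.")
tasks.submit(PRIORITY_INTERACTIVE, "comment", post_comment, "Nice paper")

completed = tasks.run_all()
order = [name for name, _ in completed]
print(order)
```

In a Celery deployment the same effect is usually obtained by routing slow tasks to a dedicated queue with its own workers; that configuration is not shown here.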

I’ve requested to join the community on test.scicommons.org and will be sharing a technical breakdown of potential ‘optimistic update’ improvements for the UI there shortly. Looking forward to contributing!


@suresh.krishna @arnab1896 For AI paper summarization, should the user upload the PDF, or should the AI bot pop up when the user opens a paper?

Both options are possible.

Hi everyone,

My name is Valentine Emmanuel, and I’m interested in contributing to SciCommons for GSoC 2026.

I’m a Mathematics student and a full-stack developer with hands-on experience in Django, Next.js/React, TypeScript, and Python. I’ve built API-driven systems, worked with async workflows (Celery/Redis), and designed modern frontend interfaces.

I’ve gone through the backend repository in depth and have a clear picture of where the project stands. A few specific areas I’ve identified where I can contribute:

  1. Article status-change notifications - communities/articles_api.py has about 11 TODO comments marking missing notifications across the review workflow (submitted, approved, under review, accepted/rejected, published). The email infrastructure with Celery tasks and HTML templates is already in place; the notifications just need to be wired up.

  2. API test coverage - the articles, communities, and posts apps currently only have model-level tests. There are no API-level tests for any of those endpoints, and myapp/test_api.py is empty. I’d like to help build out a proper test suite there.

  3. AI/NLP features - the DiscussionSummary model and admin endpoints are in place but summarization is currently manual. I’m interested in integrating open-source LLMs to power automated summarization, semantic search over article abstracts, and paper recommendations, which seems to line up with the GSoC goals.

  4. Frontend - I’m also interested in the Next.js side (reading/review UX, discussion flows) and can pick that up once I’ve established a baseline on the backend.
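
As a rough illustration of the search-over-abstracts idea in item 3, here is a minimal sketch that ranks abstracts by cosine similarity to a query. It deliberately swaps in a bag-of-words term-frequency vector for a real embedding model, so it matches shared words rather than meaning. The `embed`, `cosine`, and `search` helpers and the sample abstracts are hypothetical and do not come from the SciCommons repository.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for an embedding model: a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, abstracts, top_k=2):
    """Return up to top_k titles, most similar first, skipping zero-score matches."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(text)), title)
              for title, text in abstracts.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:top_k] if score > 0]


# Hypothetical sample data.
abstracts = {
    "paper-a": "Neural networks for protein structure prediction",
    "paper-b": "Community peer review of open access articles",
    "paper-c": "Protein folding dynamics measured with spectroscopy",
}

results = search("protein structure", abstracts)
print(results)
```

Replacing `embed` with vectors from an open-source sentence-embedding model would keep the ranking logic unchanged while matching on meaning instead of exact words.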

I plan to start with a small focused PR on one of the above before going broader.

A few questions:

  • Of the above, which is most useful to the team right now?
  • Is there a preferred branch and PR process (issue-first, draft PRs, etc.)?
  • Are there any areas explicitly reserved for the core team?

Looking forward to contributing.

Best regards,
Valentine Emmanuel


Hello @HardProga - thank you for your interest. At this stage, given how close the deadline is, we are not inviting more PRs, since PRs as ambitious as the ones you mention require an onboarding process. In particular, we are cautious about backend changes. So once the GSoC coding period starts, we will see who is still around (both GSoC intern(s) and volunteers) and then, depending on the sustained commitment available, we can discuss how to divide up ideas and tasks. If you are going to submit a proposal, you can do that without a PR. All the best!


Hi, thank you for the clarification.


@suresh.krishna Hello sir, I hope you are doing fine.
Here is an update: I have been following SciCommons for a long time, since the start of January 2026.
Since then I have discussed many things with you about the sites and testing, including how you made the sureshDev branch for testing, various migrations, Docker issues, and bugs, and I have learnt many things from you!
So far I have successfully merged 10+ PRs. I have many more in mind, but as you mentioned, you are not accepting more PRs as the deadline is getting near.
Apart from that, you and @armanalam03 sir asked me to give my ideas on backend fixes for the Docker setup and version issues. As a result, the Docker build is now in good shape.
On my side, I fell seriously ill for 10 days, but now I am ready to go. I have started working on my proposal and would like to get your opinion at least once before submission. I am trying to implement those potential GSoC ideas and present them to you in the proposal. I hope we can get in touch when I share my proposal with you.
Thank you for reading this.
I was hoping to add more PRs, but no worries; I will now work on the proposal (halfway done).
Just an update from my side.
