GSoC 2026 Project #19: ActiveVision: continued development of a data and model portal for the study of goal-directed vision

Mentors: Buxin Liao <buxin.liao@mail.mcgill.ca>, Katarzyna Jurewicz <jurewicz.ka@gmail.com>, Suresh Krishna <suresh.krishna@mcgill.ca>

Skill level: Intermediate – Advanced

Required Skills: Familiarity with open-source vision and multimodal AI models. Fluency in Python and PyTorch. Familiarity with Slurm and working with clusters preferred. Basic web-development skills or interest in learning them will be useful.

Time commitment: Full time (350 hours)

About: Salience map research in computer vision has extensively examined where human observers look in images and videos during free viewing. Although cognitive psychology has recognized the role of behavioral goals for over 50 years, the integration of task dependence into quantitative models and large open datasets is a recent development. This project aims to create an open portal that consolidates existing machine learning/AI models and eye-tracking datasets related to goal-directed vision (e.g., visual search) and provides tools for model testing and validation. A key focus is multimodal AI, particularly language-vision integration. The platform will also serve as a prototype for similar data-and-model initiatives on public hardware platforms.

Aims: In last year's GSoC project (see "GSoC 2025 report.md" on GitHub), we created a library of machine vision models and a toolbox for applying them to scanpath datasets. Over the past year, both models and datasets have improved substantially. This year's project aims to bring the library and toolbox up to date and, in addition, to create a user-facing web portal on Compute Canada that will facilitate the submission and evaluation of models on scanpath datasets.
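
To give applicants a concrete sense of what "evaluation of models on scanpath datasets" can involve, here is a minimal Python sketch of one classic comparison: discretize each scanpath onto a grid and compute a normalized string-edit (Levenshtein) distance between the resulting cell sequences. This is an illustration only, not the toolbox's actual API or the metric the portal will use. The function names, the 4x4 grid, and the `(x, y)` pixel format are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical fixation format for this sketch: (x, y) in image pixels.
Fixation = Tuple[float, float]


def to_cells(scanpath: Sequence[Fixation], width: float, height: float,
             grid: int = 4) -> List[int]:
    """Map each fixation to the index of the grid cell it falls in."""
    cells = []
    for x, y in scanpath:
        # Clamp so fixations on or beyond the image border stay in range.
        col = max(0, min(int(x / width * grid), grid - 1))
        row = max(0, min(int(y / height * grid), grid - 1))
        cells.append(row * grid + col)
    return cells


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def scanpath_similarity(pred: Sequence[Fixation], human: Sequence[Fixation],
                        width: float, height: float, grid: int = 4) -> float:
    """Return 1 - normalized edit distance; 1.0 means identical cell sequences."""
    a = to_cells(pred, width, height, grid)
    b = to_cells(human, width, height, grid)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Made-up example data: a model scanpath vs. a human scanpath on a 640x480 image.
human_path = [(100, 100), (300, 120), (500, 400)]
model_path = [(110, 90), (310, 130), (100, 400)]
score = scanpath_similarity(model_path, human_path, width=640, height=480)
print(f"similarity = {score:.3f}")
```

A grid-based edit distance ignores fixation durations and treats all cell mismatches equally, so a real evaluation pipeline would likely combine it with other scanpath metrics.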

Project website: m2b3/ActiveVisionPortal on GitHub (the Google Summer of Code 2025 work, with the organization INCF) and m2b3/SciCommons-frontend on GitHub

Tech keywords: Python, PyTorch, Visual search, Saliency, Science portals, Vision AI, Vision-language models.