GSoC Project Idea 6.1: Make published datasets accessible on the platform

malin · January 11, 2019, 11:21am

The aim of this part of the project is to develop code to collect published datasets (raw images, segmentations, skeletons, synapses, graphs, volume meshes etc.) and convert them into formats that are compatible with the BrainCircuits.io platform and the CATMAID data model.

We have a large list of datasets and online data sources scattered across the internet, in a wide variety of formats and with different ways of accessing them. After learning about the structure of the target formats, we would want you to get your hands dirty and go through this list of datasets. We want you to understand each dataset’s content and write code to convert it so that it is accessible and browsable on the BrainCircuits.io platform and in CATMAID. This will be a nice learning experience, and we expect that you’ll become very fast at doing this after a few datasets, juggling a number of tools and libraries along the way.

Skills required: Python, SQL; Careful and Systematic Work

Some of the things you’ll be dealing with: JSON, XML-parsing, REST APIs, SQL Queries, NumPy Arrays, HDF5, Jupyter Notebooks, Dataset descriptions in publications, …

Nice-to-have Skills but not strictly required: Linux, Bash, SysAdmin, WordPress

Mentor: Stephan Gerhard, PhD, Zurich, Switzerland.

Johnover_Board · February 2, 2019, 3:18pm

Hi, I am Joel V Zachariah, a junior-year student from India. I am interested in getting started on this project, as the task involves domains that I am familiar with or eager to learn. I would like to hear from the project mentor, Mr. Stephan, on how to proceed. I am currently familiarizing myself with the BrainCircuits system and the CATMAID data model (trying to solve some beginner-friendly issues to start with). A few pointers would help me focus on the right trajectory. I look forward to hearing more.

unidesigner · February 5, 2019, 9:21am

Hi Joel, thanks for your interest in the project. I’d suggest first setting up a CATMAID instance on your local machine according to the install instructions on catmaid.org, and installing catpy. Then, create a simple skeleton in SWC format and POST it to the CATMAID instance via an API call using catpy.
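As a rough starting point for the "simple skeleton in SWC format" step, here is a minimal sketch in plain Python. It assumes the standard seven-column SWC layout (node id, type code, x, y, z, radius, parent id, with -1 as the parent of the root). The function names, the file name `simple_skeleton.swc`, and the coordinates are hypothetical and chosen only for illustration. The upload itself is deliberately left out: the endpoint and the catpy call to use should be taken from the CATMAID and catpy documentation.

```python
from typing import List, Tuple

# One SWC node: (node_id, type_code, x, y, z, radius, parent_id)
SwcNode = Tuple[int, int, float, float, float, float, int]


def make_linear_skeleton(n_nodes: int = 5, step: float = 100.0) -> List[SwcNode]:
    """Build a straight, unbranched skeleton along the x axis (illustrative only)."""
    nodes: List[SwcNode] = []
    for i in range(n_nodes):
        node_id = i + 1
        # The root node has parent -1; every other node points to its predecessor.
        parent_id = -1 if i == 0 else node_id - 1
        # Type code 0 means "undefined" in the SWC convention.
        nodes.append((node_id, 0, i * step, 0.0, 0.0, 1.0, parent_id))
    return nodes


def to_swc(nodes: List[SwcNode]) -> str:
    """Serialize nodes to SWC text: one space-separated node per line."""
    lines = ["# id type x y z radius parent"]
    for node_id, type_code, x, y, z, radius, parent_id in nodes:
        lines.append(
            f"{node_id} {type_code} {x:.1f} {y:.1f} {z:.1f} {radius:.1f} {parent_id}"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical output file name; this file is what would then be uploaded.
    with open("simple_skeleton.swc", "w") as handle:
        handle.write(to_swc(make_linear_skeleton()))
```

Running the script writes a five-node skeleton to disk. The coordinates are in whatever units your local CATMAID project uses, so adjust `step` to fit the image stack you have loaded.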
You can also check the ‘good first issues’ on the CATMAID GitHub, select one that you’d be able to implement, and submit a pull request.
https://github.com/catmaid/CATMAID/issues?q=is%3Aissue+is%3Aopen+label%3A"good+first+issue"
Let me know if you have any questions.
Best, Stephan

Davide95 · February 27, 2019, 12:51pm

Hi to everyone.
I’m here because I’m interested in two projects (6.1 and 1) and this is one of the two.

My name is Davide, I’m an MSc student of Computer Science and I have a BSc in Computer Science. I’m fluent in Python (e.g. https://github.com/Davide95/pydbscan) and SQL (for instance, see the DB section of this software that I’ve developed: https://github.com/KDE/brooklyn). I’ve used Jupyter Notebooks and NumPy at university. As for the “Linux, Bash, SysAdmin, WordPress” skill set, I’ve used these technologies in the past at work (see my LinkedIn).
If you want to know more about me:

I’m wondering if someone could give me more information about the proposal.
Thanks in advance.

unidesigner · February 28, 2019, 8:31am

Hi Davide!
Thanks for your interest in the project! Your skills seem to be a very good fit for the project, and don’t worry about the nice-to-have skills. I can give you some more in-depth information about the project if you write to me by e-mail at info@braincircuits.io.
Best regards,
Stephan