GSoC Project Idea 13: Improving unit testing and test coverage for the TE-Dependence ANAlysis (tedana) toolbox

malin · January 28, 2019, 12:03pm

Mentor(s): Kirstie Whitaker and Taylor Salo

Context and motivation: Traditional fMRI denoising makes a priori assumptions about the shape of noise fluctuations across time. Multi-echo fMRI (ME-fMRI) enables data-driven denoising by collecting multiple echoes in a single fMRI volume, offering a significant improvement over standard approaches. Supporting this, previous ME-fMRI denoising methods such as ME-ICA (multi-echo independent component analysis) have been shown to improve data quality. However, existing implementations lack provenance for data inclusion criteria and are difficult to extend or improve.

The tedana Python package [1][2] is designed to serve as both a canonical multi-echo denoising pipeline with robust default settings and a toolbox into which researchers can integrate new methods for denoising. In creating tedana as a Python package, we have remained committed to best-practice principles in open-source development, including extensive documentation for both users and contributors as well as an open governance structure [3].

The proposed project aims to further develop tedana’s testing suite in order to improve test coverage and to make the codebase more robust to improvements and additions. Given the development team’s current emphasis on integrated workflows, appropriate unit tests have not been written for a large portion of the core functionality of the package. The goal of the GSoC project is to improve test coverage of tedana by modularizing existing workflows and writing unit tests for existing code.

The GSoC student will develop their skills working with Python, employ modern software testing suites to improve the reproducibility and robustness of code, and receive explicit mentorship in open and collaborative working using git and GitHub.

Tool description: The tedana test suite is implemented as a collection of pytest-compatible testing functions. Tests will be evaluated on the continuous integration platforms CircleCI [4] and Travis [5], and will employ coverage profilers like CodeCov [6].
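As a rough illustration of what a pytest-compatible testing function looks like, here is a minimal sketch. The `weighted_mean` helper is hypothetical and is not part of tedana's API; it only stands in for a small, modular piece of a workflow that can be tested in isolation.

```python
import math


def weighted_mean(values, weights):
    """Hypothetical helper (not tedana code): weighted mean, one value per echo."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per echo")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def test_weighted_mean_of_constant_signal():
    # A constant signal should be unchanged by any choice of weights.
    assert math.isclose(weighted_mean([2.0, 2.0, 2.0], [1.0, 2.0, 3.0]), 2.0)


def test_weighted_mean_rejects_mismatched_lengths():
    # pytest.raises would be the idiomatic check; a plain try/except keeps
    # this sketch free of third-party imports.
    try:
        weighted_mean([1.0, 2.0], [1.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

pytest collects functions whose names start with `test_` and reports any failing `assert`, so small functions like these can run on every commit on a continuous integration platform.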

Improved modularization of existing workflows will be done in conjunction with members of the tedana developer community, leveraging the community’s distributed expertise both in Python programming knowledge and familiarity with tedana.

Project description and aims: This project is aimed towards students seeking to develop their coding skills and to gain familiarity with collaborative development. The successful candidate will gain 1) real world experience engaging with a wide range of researchers and developers and 2) experience with test-driven development.

Measurable outcomes include increased test coverage of the tedana package and implemented checksums [7] in regression testing.
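One way checksums could be used in regression testing is sketched below, under stated assumptions: the `denoise` function is a hypothetical stand-in for a pipeline step, and the rounding precision and the choice of SHA-256 are illustrative rather than anything tedana prescribes.

```python
import hashlib
import struct


def checksum(values, decimals=6):
    """SHA-256 digest of a sequence of floats, rounded before hashing."""
    digest = hashlib.sha256()
    for v in values:
        # Adding 0.0 turns -0.0 into 0.0 so both hash identically.
        digest.update(struct.pack("<d", round(v, decimals) + 0.0))
    return digest.hexdigest()


def denoise(signal):
    """Hypothetical stand-in for a pipeline step: remove the mean."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def test_denoise_output_matches_reference_checksum():
    signal = [1.0, 2.0, 3.0, 4.0]
    # In a real suite the reference digest would be computed once from a
    # trusted run and stored alongside the tests.
    reference = checksum([-1.5, -0.5, 0.5, 1.5])
    assert checksum(denoise(signal)) == reference
```

Comparing digests instead of full arrays keeps reference data small, at the cost of only reporting that an output changed, not where. Rounding before hashing reduces, but does not eliminate, spurious failures from floating-point differences across platforms.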

Skills needed/desired: Interested students should be comfortable with Python and GitHub, with a desire to learn the continuous integration platforms CircleCI [4] and Travis [5], and coverage profilers like CodeCov [6]. A basic familiarity with neuroimaging data formats and preprocessing is also desirable. A commitment to open and collaborative working is essential. All contributors to tedana are expected to comply with the tedana code of conduct at all times [8].

Key words: Python; usability; brain imaging; reproducible research

Relevant external links:

souravsingh · February 3, 2019, 10:38am

Hello,

I am interested in working on the project. I am familiar with Travis and CircleCI and have worked heavily with fMRI data formats, as I am working under Dr. Bertrand Thirion on a large-scale analysis project.

I would like to discuss what kind of test coverage would need to be done.

KirstieJane · February 4, 2019, 8:35am

This is fab @souravsingh!! Thank you for getting in touch

You can see from the code coverage report (https://codecov.io/gh/ME-ICA/tedana) that some of the codebase has very good coverage while other parts do not. Basically we’re looking to get that starburst looking much more green.

One of the ways that we’ll assess the GSoC applications we receive will be according to the ideas that the students put forward themselves. So have a dig around the GitHub repo and see what you think would be a sensible place to start, and we’re super happy to give feedback on your plan.

souravsingh · February 6, 2019, 5:38am

Thanks for the reply. Would it be possible to add type-checking as part of the testing process? It won’t contribute to the coverage of the code, but it can help catch some of the bugs that arise when a user passes the wrong type.

KirstieJane · February 7, 2019, 8:41pm

Ah! So sorry for the slow reply.

Definitely really useful to add different checks. Did you see the link to checksums? I think that’s going to be a really valuable addition to the project

souravsingh · February 11, 2019, 6:35am

Hello,

Thanks for the reply!

I have taken a look at the code coverage and tests and I have formulated a small plan for the same. I am currently working on fully drafting the plan for the project and intend to share the plan here.

souravsingh · February 16, 2019, 3:19pm

@KirstieJane I am sharing a draft proposal for the project, which can be accessed here: https://docs.google.com/document/d/12FPCjzhZgOGFklKjdMjIpr8-zUpKbR0-nh1J4v8uSso/edit?usp=sharing

I have written a timeline up to the first evaluation period, so I would need help drafting the full proposal and making it even better. I have enabled commenting on the doc to help with this.

souravsingh · February 22, 2019, 12:49pm

Hello @KirstieJane, Were you able to take a look at the draft proposal for the project? I have made some edits for the same.

KirstieJane · March 20, 2019, 11:15pm

Hi @souravsingh!

I’m so sorry I dropped the ball here! Are you still interested in the project? I will entirely understand if you’ve found one where folks actually replied to your message

Let me know if you are and I can take a look at the project on Saturday

souravsingh · March 22, 2019, 2:37am

Hello @KirstieJane, I am really interested in working on the project. It would be nice if you can review the proposal and help me in creating a nice proposal.

KirstieJane · March 25, 2019, 9:11am

Hi @souravsingh! I just added some comments to the google doc! Looking good

souravsingh · April 2, 2019, 7:53am

@KirstieJane I have changed the timeline to make it biweekly. Can you take a look and see if it needs improvement?

souravsingh · April 2, 2019, 12:31pm

I have shared the draft proposal on the GSoC website.

KirstieJane · April 4, 2019, 9:28pm

Thanks @souravsingh! The edits look great!

souravsingh · April 5, 2019, 4:36pm

Thanks a lot @KirstieJane. Is there something else that could be added to enhance the proposal?

ankiitgupta7 · April 5, 2019, 10:41pm

Hey @souravsingh, looked at your proposal draft. Your credentials look great.
I thought you could improve the write-up, as I see some minor text formatting errors in places, and I have suggested edits for these in your draft. You can use Grammarly to help with this. Also, you can ask your mentor, @KirstieJane, if she can share a sample proposal so that you can see whether there’s any scope for improvement.
All the Best!

souravsingh · April 9, 2019, 6:58pm

@KirstieJane I have uploaded my final proposal to GSoC. I would like to thank you for helping me and I look forward to working under you.

KirstieJane · April 10, 2019, 11:11am

Thank you @souravsingh!! I’m looking forward to reviewing the applications this weekend

souravsingh · April 19, 2019, 10:30pm

@KirstieJane Would it be possible to start some preliminary work on the project and add some tests for the modules?