Hello everyone,
I’m a B.Tech student in Electronics, currently preparing for Google Summer of Code 2026, and I’ve recently started exploring open-source work related to machine learning and time-series analysis.
As part of my learning, I’m working on a small exploratory prototype that focuses on understanding why time-series ML models fail during training and evaluation. Right now, this includes very basic aspects such as:
- observing overfitting and underfitting via loss curves,
- simple sanity checks on training vs validation behavior,
- and structuring the code in a way that makes debugging more systematic (a rough sketch of what I mean is just after this list).
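To make the first two bullets concrete, here is the kind of minimal, framework-agnostic check I have in mind. The `diagnose` helper, its thresholds, and the loss arrays below are all illustrative choices of mine, not from any library; real curves would come from an actual training loop.

```python
import numpy as np

def diagnose(train_loss, val_loss, patience=3, gap_ratio=1.5):
    """Label a (train, val) loss-curve pair with a coarse diagnosis.

    patience:  epochs val loss may rise before we call it overfitting
    gap_ratio: val/train ratio above which the generalization gap is suspicious
    """
    train_loss, val_loss = np.asarray(train_loss), np.asarray(val_loss)
    best = np.argmin(val_loss)
    rising_epochs = len(val_loss) - 1 - best  # epochs since val loss last improved
    if rising_epochs >= patience and train_loss[-1] < train_loss[best]:
        return f"overfitting: val loss rising for {rising_epochs} epochs (best at epoch {best})"
    if train_loss[-1] > 0.9 * train_loss[0]:
        return "underfitting: train loss barely decreased; model capacity or LR may be too low"
    if val_loss[-1] > gap_ratio * train_loss[-1]:
        return "large generalization gap: check split hygiene and regularization"
    return "no obvious pathology in these curves"

# Synthetic loss curves purely for illustration:
overfit_train = [1.0, 0.6, 0.4, 0.25, 0.15, 0.08, 0.04]
overfit_val   = [1.1, 0.8, 0.6, 0.55, 0.60, 0.70, 0.85]
print(diagnose(overfit_train, overfit_val))

underfit_train = [1.0, 0.99, 0.98, 0.97, 0.97, 0.96, 0.96]
underfit_val   = [1.05, 1.04, 1.03, 1.03, 1.02, 1.02, 1.02]
print(diagnose(underfit_train, underfit_val))
```

The point of keeping it this simple is that the same helper works no matter which framework produced the per-epoch losses.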
I’m aware that these are foundational ideas and not novel in themselves. My goal at this stage is to build a solid conceptual understanding before moving on to more advanced issues such as data leakage, distribution shift, and robustness on real neural or signal-based datasets.
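For the leakage point specifically, here is a small numpy-only sketch of one classic mistake I want my tooling to catch: scaling a series with statistics that include the validation segment. The random-walk series and the split point are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(500))  # synthetic random-walk series

# Split chronologically, never shuffled, for time series.
split = 400
train, val = series[:split], series[split:]

# Leaky: mean/std computed over the FULL series, so the train features
# already "know" the level of the future segment.
leaky_train = (train - series.mean()) / series.std()

# Leakage-free: statistics fitted on the training segment only, then
# reused unchanged to transform the validation segment.
mu, sigma = train.mean(), train.std()
clean_train = (train - mu) / sigma
clean_val = (val - mu) / sigma

print(f"full-series mean={series.mean():.2f} vs train-only mean={mu:.2f}")
print(f"last train point, leaky scaling: {leaky_train[-1]:+.2f}  "
      f"train-only scaling: {clean_train[-1]:+.2f}")
print(f"val mean under train-only scaling: {clean_val.mean():+.2f} "
      "(non-zero drift hints at distribution shift)")
```

The last print also shows why leakage and distribution shift are entangled: only with train-only statistics does the drift between the two segments become visible at all.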
I’d really appreciate:
- pointers to recommended practices or case studies in this area,
- common pitfalls beginners often miss when analyzing time-series models,
- or suggestions on how to meaningfully extend such basic tools toward research-relevant workflows.
Thank you for your time, and I’m looking forward to learning from the community.