GSoC 2021 project

vfdev edited this page Apr 4, 2021 · 14 revisions

Google Summer of Code 2021 project

Getting started

There are a few basic requirements for new contributors:

To be considered as a GSoC student, you need at least 3 accepted PRs to the project. They can be something small, like a doc fix or a simple bug fix. Please see the help wanted issues.

Please also see this GitHub Discussions thread for updated information.

Projects

Improve Metrics module

The Metrics module in PyTorch-Ignite offers features for evaluating a trained PyTorch model in an online manner, in any computation setting (single process or distributed configuration). In particular, the module provides essential metrics for classification and segmentation tasks, as well as various regression metrics.
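
As a rough illustration of what "online" evaluation means here, the sketch below accumulates running counts batch by batch and computes the final value only at the end. It is plain Python, not the PyTorch-Ignite API: the class name `RunningAccuracy`, its `merge` method, and the two simulated processes are assumptions made for illustration. In a real distributed run, the counters would be summed with a collective operation across processes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunningAccuracy:
    """Hypothetical online accuracy metric (not the Ignite implementation)."""

    num_correct: int = 0
    num_examples: int = 0

    def reset(self) -> None:
        # Clear the accumulated state before a new evaluation run.
        self.num_correct = 0
        self.num_examples = 0

    def update(self, y_pred: Sequence[int], y: Sequence[int]) -> None:
        # Accumulate counts for one batch; no per-sample data is stored.
        if len(y_pred) != len(y):
            raise ValueError("y_pred and y must have the same length")
        self.num_correct += sum(int(p == t) for p, t in zip(y_pred, y))
        self.num_examples += len(y)

    def merge(self, other: "RunningAccuracy") -> None:
        # Stand-in for a distributed sum: add another process's counters.
        self.num_correct += other.num_correct
        self.num_examples += other.num_examples

    def compute(self) -> float:
        if self.num_examples == 0:
            raise RuntimeError("compute() called before any update()")
        return self.num_correct / self.num_examples


# Two simulated processes that see different numbers of examples.
rank0 = RunningAccuracy()
rank0.update([0, 1, 1], [0, 1, 0])
rank0.update([1], [1])

rank1 = RunningAccuracy()
rank1.update([0, 0], [1, 0])

# Combine the partial states, then compute the global value once.
total = RunningAccuracy()
total.merge(rank0)
total.merge(rank1)
accuracy = total.compute()
```

Summing the counters, rather than averaging each process's accuracy, keeps the result correct when processes handle different numbers of examples.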

The plan for the improvements contains the following topics:

  • All metrics should work in distributed configuration (#1284)
    • Update the current implementation of the few metrics that still do not support distributed configuration
    • Implement the tests based on current testing practice
    • Explore asymmetric distributed metrics computation: PyTorch uneven distributed input support
    • Provide configurable distributed metrics reduce/gather methods (#1242)
  • Provide new metrics for object detection task: implement mean Average Precision metric
    • Implement the metric and its tests
  • Provide new metrics for NLP: essential metrics for common tasks (ROUGE, BLEU, etc.)
    • Implement the metric and its tests
  • Provide new metrics for GANs: FID, PPL, others (see #998)
    • Implement the metric and its tests
  • Work on enabling label-wise metrics (Accuracy etc.) for multi-label problems (#513)
    • Prototype new API to add label-wise option for multi-label metrics
    • Implement chosen API and implement the tests
  • Add minor improvements:
    • better support of sklearn metrics
    • classification metrics with micro/macro options
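
To make the label-wise and micro/macro items above more concrete, here is a minimal plain-Python sketch of precision on binary multi-label data. The function name `multilabel_precision` and the `average` parameter with values `None`, `"micro"`, and `"macro"` are assumptions for illustration only; they are not the API proposed in #513.

```python
from typing import List, Optional, Sequence, Union


def multilabel_precision(
    y_pred: Sequence[Sequence[int]],
    y: Sequence[Sequence[int]],
    average: Optional[str] = None,
) -> Union[List[float], float]:
    """Hypothetical precision for 0/1 multi-label data.

    average=None returns one value per label (label-wise),
    "macro" averages the per-label values,
    "micro" pools the counts of all labels before dividing.
    """
    if not y or not y[0] or len(y_pred) != len(y):
        raise ValueError("y_pred and y must be non-empty and have the same length")
    num_labels = len(y[0])
    true_positives = [0] * num_labels
    predicted_positives = [0] * num_labels
    for pred_row, true_row in zip(y_pred, y):
        if len(pred_row) != num_labels or len(true_row) != num_labels:
            raise ValueError("every row must have one entry per label")
        for j, (p, t) in enumerate(zip(pred_row, true_row)):
            predicted_positives[j] += int(p == 1)
            true_positives[j] += int(p == 1 and t == 1)

    # A label that is never predicted gets precision 0.0 by convention here.
    per_label = [
        tp / pp if pp else 0.0
        for tp, pp in zip(true_positives, predicted_positives)
    ]
    if average is None:
        return per_label
    if average == "macro":
        return sum(per_label) / num_labels
    if average == "micro":
        total = sum(predicted_positives)
        return sum(true_positives) / total if total else 0.0
    raise ValueError(f"unknown average option: {average!r}")


# Four samples, three labels.
y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1]]

per_label = multilabel_precision(y_pred, y_true)
macro = multilabel_precision(y_pred, y_true, average="macro")
micro = multilabel_precision(y_pred, y_true, average="micro")
```

On this data the macro and micro values differ because the first label has more predicted positives than the others, which is the kind of behaviour a label-wise option makes visible.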

A student can choose any of these topics. Assuming ~175 hours per project, we advise picking at most 2 topics.

Expected outcome

We would expect at least 2 items from the list below to be done for this project:

  • Implemented one or two new metrics (e.g. mAP, ROUGE, BLEU, FID, etc).
    • Optionally, create a basic example with its usage.
  • Implemented new tests to ensure that all of the library's metrics work in distributed configuration
  • Implemented configurable distributed metrics reduce/gather methods
  • Implemented label-wise API for related metrics

In addition, it would be nice to have a short blog post communicating about the work done.

Preferred skills and mindset

  • Fluent with Python
  • Already trained neural networks with PyTorch
  • Willing to maintain an AI-related open-source project
  • Curiosity and motivation to learn new technical things

Complexity rating

4 / 5

Potential mentors:

Development of a Higher-level API

The library provides a very flexible way to construct a model's trainer; however, this API could optionally be simplified further for a number of common tasks (#912).

The plan for this project:

  • Explore existing open-source solutions
    • List of most interesting implementations: #912.
    • A few points on how to assess an existing solution:
      • generic or task specific (can handle one or multiple models/optimizers, multiple data sources etc)
      • out-of-the-box integration of software engineering features like experiment tracking systems, auto-checkpointing, automatic-mixed-precision, auto-batching, etc.
      • API simplicity (e.g. 10k parameters vs compose things)
      • Code maintenance
    • Read 1 or 2 short online materials on how to design an API (for example, link)
  • Create a short document presenting all studied solutions with pros/cons
  • Prototype your new API based on studied approaches
    • Try to implement a self-supervised training or multi-models training algorithm using your API
    • Nice to have features:
      • Automatic integration of torch native Automatic Mixed Precision
      • Distributed options
      • Automatic batch size via toma
  • Implement new API in the project's codebase
    • Implement necessary tests
  • Create examples using new API
  • (Optionally) Short blog post communicating about the work done.
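
To make the "10k parameters vs compose things" criterion from the plan more concrete, the following plain-Python sketch shows the "compose things" style: a trainer that takes a user-defined step function and a list of hooks instead of a long list of options. All names (`ComposedTrainer`, `add_hook`, `train_step`) are hypothetical and the toy optimisation problem is made up; this is not the API discussed in #912.

```python
from typing import Any, Callable, Dict, Iterable, List

StepFn = Callable[[Any], Dict[str, float]]
Hook = Callable[[int, Dict[str, float]], None]


class ComposedTrainer:
    """Hypothetical trainer assembled from small, replaceable pieces."""

    def __init__(self, step_fn: StepFn) -> None:
        self.step_fn = step_fn
        self.hooks: List[Hook] = []

    def add_hook(self, hook: Hook) -> "ComposedTrainer":
        # Returning self lets callers chain several add_hook calls.
        self.hooks.append(hook)
        return self

    def run(self, data: Iterable[Any], max_epochs: int = 1) -> int:
        iteration = 0
        for _ in range(max_epochs):
            for batch in data:
                iteration += 1
                output = self.step_fn(batch)
                # Logging, checkpointing, etc. would be plugged in as hooks.
                for hook in self.hooks:
                    hook(iteration, output)
        return iteration


# Toy problem: fit a single weight w to the target values with plain SGD.
state = {"w": 0.0}


def train_step(batch: float) -> Dict[str, float]:
    loss = (state["w"] - batch) ** 2
    grad = 2.0 * (state["w"] - batch)
    state["w"] -= 0.1 * grad
    return {"loss": loss}


history: List[float] = []
trainer = ComposedTrainer(train_step).add_hook(
    lambda iteration, output: history.append(output["loss"])
)
num_iterations = trainer.run([1.0, 1.0, 1.0], max_epochs=2)
```

The trade-off to weigh when assessing existing solutions is that this style keeps the core small and generic, while a parameter-heavy API is quicker to use for the cases it anticipates.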

Expected outcome

  • Short document presenting all studied solutions with pros/cons
  • Implemented new Higher-level API with tests
  • Implemented 1-2 examples with new Higher-level API
  • (Optionally) Short blog post communicating about the work done.

Preferred skills and mindset

  • Fluent with Python
  • Already trained neural networks with PyTorch
  • Not afraid of try/fail/succeed work
  • Want to learn how to design an API

Complexity rating

4 / 5

Potential mentors: