Learn Avalanche in 5 Minutes
A Short Guide for Researchers on the Run
Last updated
A Short Guide for Researchers on the Run
Last updated
Avalanche is mostly about making the life of a continual learning researcher easier.
What are the three pillars of any respectful continual learning research project?
Benchmarks
: Machine learning researchers need multiple benchmarks with efficient data handling utils to design and prototype new algorithms. Quantitative results on ever-changing benchmarks has been one of the driving forces of Deep Learning.
Training
: Efficient implementation and training of continual learning algorithms; comparisons with other baselines and state-of-the-art methods become fundamental to asses the quality of an original algorithmic proposal.
Evaluation
: Training utils and Benchmarks are not enough alone to push continual learning research forward. Comprehensive and sound evaluation protocols and metrics need to be employed as well.
With Avalanche, you can find all these three fundamental pieces together and much more, in a single and coherent, well-maintained codebase.
Let's take a quick tour on how you can use Avalanche for your research projects with a 5-minutes guide, for researchers on the run!
In this short guide we assume you have already installed Avalanche. If you haven't yet, check out how you can do it following our How to Install guide.
Avalanche is organized in five main modules:
Benchmarks
: This module maintains a uniform API for data handling: mostly generating a stream of data from one or more datasets. It contains all the major CL benchmarks (similar to what has been done for torchvision).
Training
: This module provides all the necessary utilities concerning model training. This includes simple and efficient ways of implement new continual learning strategies as well as a set pre-implemented CL baselines and state-of-the-art algorithms you will be able to use for comparison!
Evaluation
: This modules provides all the utilities and metrics that can help in evaluating a CL algorithm with respect to all the factors we believe to be important for a continually learning system.
Models
: In this module you'll be able to find several model architectures and pre-trained models that can be used for your continual learning experiment (similar to what has been done in torchvision.models).
Logging
: It includes advanced logging and plotting features, including native stdout, file and Tensorboard support (How cool it is to have a complete, interactive dashboard, tracking your experiment metrics in real-time with a single line of code?)
In the graphic below, you can see how Avalanche sub-modules are available and organized as well:
We will learn more about each of them during this tutorial series, but keep in mind that the Avalanche API documentation is your friend as well!
All right, let's start with the benchmarks module right away π
The benchmark module offers three main features:
Datasets: a comprehensive list of PyTorch Datasets ready to use (It includes all the Torchvision Datasets and more!).
Classic Benchmarks: a set of classic Continual Learning Benchmarks ready to be used (there can be multiple benchmarks based on a single dataset).
Generators: a set of functions you can use to generate your own benchmark starting from any PyTorch Dataset!
Datasets can be imported in Avalanche as simply as:
Of course, you can use them as you would use any PyTorch Dataset.
The Avalanche benchmarks (instances of the Scenario class), contains several attributes that describe the benchmark. However, the most important ones are the train
and test streams
.
In Avalanche we often suppose to have access to these two parallel stream of data (even though some benchmarks may not provide such feature, but contain just a unique test set).
Each of these streams
are iterable, indexable and sliceable objects that are composed of experiences. Experiences are batch of data (or "tasks") that can be provided with or without a specific task label.
Avalanche maintains a set of commonly used benchmarks built on top of one or multiple datasets.
What if we want to create a new benchmark that is not present in the "Classic" ones? Well, in that case Avalanche offers a number of utilities that you can use to create your own benchmark with maximum flexibility: the benchmark generators!
The specific scenario generators are useful when starting from one or multiple PyTorch datasets and you want to create a "New Instances" or "New Classes" benchmark: i.e. it supports the easy and flexible creation of a Domain-Incremental, Class-Incremental or Task-Incremental scenarios among others.
Finally, if your ideal benchmark does not fit well in the aforementioned Domain-Incremental, Class-Incremental or Task-Incremental scenarios, you can always use our generic generators:
filelist_scenario
paths_scenario
dataset_scenario
tensor_scenario
You can read more about how to use them the full Benchmarks module tutorial!
The training
module in Avalanche is build on modularity and it has two main goals:
provide a set of standard continual learning baselines that can be easily run for comparison;
provide the necessary utilities to implement and run your own strategy in the most efficient and simple way possible thanks to the building blocks we already prepared for you.
If you want to compare your strategy with other classic continual learning algorithms or baselines, in Avalanche this is as simple as creating an object:
The simplest way to build your own strategy is to create a python class that implements the main train
and eval
methods.
Let's define our Continual Learning algorithm "MyStrategy" as a simple python class:
Then, we can use our strategy as we would do for the pre-implemented ones:
While this is the easiest possible way to add your own strategy, Avalanche supports more sophisticated modalities (based on callbacks) that lets you write more neat, modular and reusable code, inheriting functionality from a parent classes and using pre-implemented plugins.
Check out more details about what Avalanche can offer in this module following the "Training" chapter of the "From Zero to Hero" tutorial!
The evaluation
module is quite straightforward: it offers all the basic functionalities to evaluate and keep track of a continual learning experiment.
This is mostly done through the Metrics and the Loggers. The Metrics provide a set of classes which implements the main continual learning metrics like A_ccuracy_, F_orgetting_, M_emory Usage_, R_unning Times_, etc.
Metrics should be created via the utility functions (e.g. accuracy_metrics
, timing_metrics
and others) specifying in the arguments when those metrics should be computed (after each minibatch, epoch, experience etc...).
The Loggers specify a way to report the metrics (e.g. with Tensorboard, on console or others). Loggers are created by instantiating the respective class.
Metrics and loggers interact via the Evaluation Plugin: this is the main object responsible of tracking the experiment progress. Metrics and loggers are directly passed to the EvaluationPlugin
instance. You will see the output of the loggers automatically during training and evaluation! Let's see how to put this together in few lines of code:
For more details about the evaluation module (how to write new metrics/loggers, a deeper tutorial on metrics) check out the extended guide in the "Evaluation" chapter of the "From Zero to Hero" Avalanche tutorial!
EvaluationYou've learned how to install Avalanche, how to create benchmarks that can suit your needs, how you can create your own continual learning algorithm and how you can evaluate its performance.
Here we show how you can use all these modules together to design your experiments as quantitative supporting evidence for your research project or paper.
You can run this chapter and play with it on Google Colaboratory:
Notebook currently unavailable.