Models

First things first: let's start with a good model!

Welcome to the "Models" tutorial of the "From Zero to Hero" series. In this notebook we will talk about the features offered by the models Avalanche sub-module.

Support for PyTorch Modules

Every continual learning experiment needs a model to train incrementally. You can use any torch.nn.Module, even pretrained models. The models sub-module provides the most commonly used architectures in the CL literature.

You can use any model provided in the official PyTorch ecosystem, as well as the ones provided by pytorchcv!

!pip install avalanche-lib==0.5
# a few of the ready-to-use architectures provided by the avalanche.models sub-module
from avalanche.models import SimpleCNN
from avalanche.models import SimpleMLP
from avalanche.models import SimpleMLP_TinyImageNet
from avalanche.models import MobilenetV1

model = SimpleCNN()
print(model)
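
Any torchvision model works in the same way. Below is a minimal sketch (not part of the Avalanche API; the 10-class output size is an illustrative choice): a resnet18 whose final layer is replaced to match the target benchmark. Pretrained weights can be loaded through the usual torchvision arguments if you want to start from ImageNet features.

import torch.nn as nn
from torchvision.models import resnet18

# a plain torchvision model is a regular nn.Module, so Avalanche strategies can train it as-is
model = resnet18()
# replace the final fully-connected layer so the output size matches the benchmark
# (10 classes here is just an illustrative choice)
model.fc = nn.Linear(model.fc.in_features, 10)
print(model.fc)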

Dynamic Model Expansion

A continual learning model may change over time. As an example, a classifier may add new units for previously unseen classes, while progressive networks add a new set of units after each experience. Avalanche provides DynamicModules to support these use cases. DynamicModules are torch.nn.Modules that provide an additional method, adaptation, which is used to update the model's architecture. The method takes a single argument: the current experience.

For example, an IncrementalClassifier updates the number of output units:

from avalanche.benchmarks import SplitMNIST
from avalanche.models import IncrementalClassifier

# SplitMNIST with 5 experiences of 2 classes each; class ids keep their original labels (0-9)
benchmark = SplitMNIST(5, shuffle=False, class_ids_from_zero_in_each_exp=False)
# a linear classifier whose output layer grows as new classes are observed
model = IncrementalClassifier(in_features=784)

print(model)
for exp in benchmark.train_stream:
    model.adaptation(exp)
    print(model)

As you can see, after each call to the adaptation method, the model adds 2 new units to account for the new classes. Notice that no learning occurs at this point since the method only modifies the model's architecture.

Keep in mind that when you use Avalanche strategies you don't have to call the adaptation method yourself: strategies automatically call the model's adaptation and update the optimizer to include the new parameters.
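
To make this concrete, here is a minimal sketch (assuming the standard Naive strategy; the nn.Flatten wrapper and the hyperparameters are illustrative choices) that trains a dynamic model on the SplitMNIST benchmark defined above without ever calling adaptation manually:

import torch.nn as nn
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
from avalanche.models import IncrementalClassifier
from avalanche.training import Naive

# illustrative dynamic model: flatten the MNIST images and feed them to an incremental linear head
model = nn.Sequential(nn.Flatten(), IncrementalClassifier(in_features=784))
optimizer = SGD(model.parameters(), lr=0.01)

strategy = Naive(
    model=model, optimizer=optimizer, criterion=CrossEntropyLoss(),
    train_mb_size=128, train_epochs=1,
)

for exp in benchmark.train_stream:
    strategy.train(exp)  # adaptation and optimizer update happen inside train()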

Multi-Task models

Some models, such as multi-head classifiers, are designed to exploit task labels. In Avalanche, such models are implemented as MultiTaskModules. These are dynamic models (since they need to be updated whenever they encounter a new task) that have an additional task_labels argument in their forward method. task_labels is a tensor with a task id for each sample.

from avalanche.benchmarks import SplitMNIST
from avalanche.models import MultiHeadClassifier

# return_task_id=True attaches a different task label to each experience,
# and class ids restart from 0 inside each experience
benchmark = SplitMNIST(5, shuffle=False, return_task_id=True, class_ids_from_zero_in_each_exp=True)
model = MultiHeadClassifier(in_features=784)

print(model)
for exp in benchmark.train_stream:
    model.adaptation(exp)
    print(model)

When you use a MultiHeadClassifier, a new head is initialized whenever a new task is encountered. Avalanche strategies automatically recognize multi-task modules and provide task labels to them.
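
To see the role of task_labels directly, here is a small sketch that calls the adapted multi-head model by hand; the random batch and the all-zeros task tensor are purely illustrative:

import torch

x = torch.randn(32, 784)                         # a fake mini-batch of flattened MNIST-like inputs
task_labels = torch.zeros(32, dtype=torch.long)  # one task id per sample (all from task 0 here)
out = model(x, task_labels)                      # each sample is routed to the head of its task
print(out.shape)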

How to define a multi-task Module

If you want to define a custom multi-task module you need to override two methods: adaptation (if needed) and forward_single_task. The forward method of the base class splits the mini-batch by task id and passes single-task mini-batches to forward_single_task.

import torch.nn as nn

from avalanche.models import MultiTaskModule

class CustomMTModule(MultiTaskModule):
    def __init__(self, in_features, initial_out_features=2):
        super().__init__()
        # illustrative placeholder: a single linear head shared by all tasks
        self.classifier = nn.Linear(in_features, initial_out_features)

    def adaptation(self, experience):
        super().adaptation(experience)
        # your adaptation goes here

    def forward_single_task(self, x, task_label):
        # your forward goes here.
        # task_label is a single integer;
        # the mini-batch is split by task id inside the base forward method.
        return self.classifier(x)
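
As a quick usage sketch (the mixed task labels below are illustrative), the base-class forward splits the batch by task id and dispatches each sub-batch to forward_single_task:

import torch

mt = CustomMTModule(in_features=784)
for exp in benchmark.train_stream:
    mt.adaptation(exp)  # registers the task labels seen so far

x = torch.randn(4, 784)
task_labels = torch.tensor([0, 0, 1, 1])  # two samples from task 0, two from task 1
print(mt(x, task_labels).shape)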

Alternatively, if you only want to convert a single-head model into a multi-head model, you can use the as_multitask wrapper, which converts the model for you.

from avalanche.models import as_multitask

model = SimpleCNN()
print(model)

# 'classifier' is the attribute name of SimpleCNN's output layer, which gets replaced by per-task heads
mt_model = as_multitask(model, 'classifier')
print(mt_model)
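
The wrapped model is itself a MultiTaskModule, so its forward now expects per-sample task labels as well. A minimal sketch (the 3x32x32 input shape assumes SimpleCNN's defaults; the all-zeros task tensor is illustrative):

import torch

x = torch.randn(2, 3, 32, 32)                   # SimpleCNN expects 3-channel, 32x32-style inputs
task_labels = torch.zeros(2, dtype=torch.long)  # both samples assigned to task 0
print(mt_model(x, task_labels).shape)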
