Dealing with transformations (groups, appending, replacing, freezing).
While torchvision (and other) datasets typically have a fixed set of transformations, AvalancheDataset also provides some additional functionalities. AvalancheDataset
s can:
Have multiple transformation "groups" in the same dataset (like separate train and eval transformations).
Manipulate transformation by freezing, replacing and removing them.
The following sub-sections show examples on how to use these features. It is warmly recommended to run this page as a notebook using Colab (info at the bottom of this page).
Let's start by installing Avalanche:
AvalancheDatasets can contain multiple transformation groups. This can be useful to keep train and test transformations in the same dataset and to have different sets of transformations. For instance, you can easily add ad-hoc transformations to using for replay data.
For classification dataset, we follow torchvision conventions. Therefore, make_classification_dataset
supports transform
, which is applied to input (X) values, and target_transform
, which is applied to class labels (Y). The latter is rarely used. This means that a transformation group is a pair of transformations to be applied to the X and Y values of each instance returned by the dataset. In both torchvision and Avalanche implementations, a transformation must be a function (or other callable object) that accepts one input (the X or Y value) and outputs its transformed version. A comprehensive guide on transformations can be found in the torchvision documentation.
In the following example, a MNIST dataset is created and then wrapped in an AvalancheDataset. When creating the AvalancheDataset, we can set train and eval transformations by passing a transform_groups parameter. Train transformations usually include some form of random augmentation, while eval transformations usually include a sequence of deterministic transformations only. Here we define the sequence of train transformations as a random rotation followed by the ToTensor operation. The eval transformations only include the ToTensor operation.
Of course, one can also just use the transform
and target_transform
constructor parameters to set the transformations for both the train and the eval groups. However, it is recommended to use the approach based on transform_groups (shown in the code above) as it is much more flexible.
.train()
and .eval()
The default behaviour of the AvalancheDataset is to use transformations from the train group. However, one can easily obtain a version of the dataset where the eval group is used. Note: when obtaining the dataset of experiences from the test stream, those datasets will already be using the eval group of transformations so you don't need to switch to the eval group ;).
You can switch between the train and eval groups using the .train()
and .eval()
methods to obtain a copy (view) of the dataset with the proper transformations enabled. As a general rule, methods that manipulate the AvalancheDataset fields (and transformations) always create a view of the dataset. The original dataset is never changed.
In the following cell we use the avl_mnist_transform dataset created in the cells above. We first obtain a view of it in which eval transformations are enabled. Then, starting from this view, we obtain a version of it in which train transformations are enabled. We want to double-stress that .train()
and .eval()
never change the group of the dataset on which they are called: they always create a view.
One can check that the correct transformation group is in use by looking at the content of the transform/target_transform fields.
In AvalancheDatasets the train and eval transformation groups are always available. However, AvalancheDataset also supports custom transformation groups.
The following example shows how to create an AvalancheDataset with an additional group named replay. We define the replay transformation as a random crop followed by the ToTensor operation.
However, once created the dataset will use the train group. You can switch to the group using the .with_transforms(group_name)
method. The .with_transforms(group_name)
method behaves in the same way .train()
and .eval()
do by creating a view of the original dataset.
The replacement operation follows the same idea (and benefits) of the append one. By using .replace_current_transform_group(transform, target_transform)
one can obtain a view of the original dataset in which the transformaations for the current group are replaced with the given ones. One may also change tranformations for other groups by passing the name of the group as the optional parameter group
. As with any transform-related operation, the original dataset is not affected.
Note: one can use .replace_transforms(...)
to remove previous transformations (by passing None
as the new transform).
The following cell shows how to use .replace_transforms(...)
to replace the transformations of the current group:
One last functionality regarding transformations is the ability to "freeze" transformations. Freezing transformations menas permanently glueing transformations to the dataset so that they can't be replaced or changed in any way (usually by mistake). Frozen transformations cannot be changed by using .replace_transforms(...)
.
One may wonder when this may come in handy... in fact, you will probably rarely need to freeze transformations. However, imagine having to instantiate the PermutedMNIST benchmark. You want the permutation transformation to not be changed by mistake. However, the end users do not know how the internal implementations of the benchmark works, so they may end up messing with those transformations. By freezing the permutation transformation, users cannot mess with it.
Transformations for all transform groups can be frozen at once by using .freeze_transforms()
. As always, those methods return a view of the original dataset.
The cell below shows a simplified excerpt from the PermutedMNIST benchmark implementation. First, a PixelsPermutation instance is created. That instance is a transformation that will permute the pixels of the input image. We then create the train end test sets. Once created, transformations for those datasets are frozen using .freeze_transforms()
.
In this way, that transform can't be removed. However, remember that one can always append other transforms atop of frozen transforms.
The cell below shows that replace_transforms
can't remove frozen transformations:
This completes the Mini How-To for the functionalities of the AvalancheDataset related to transformations.
Here you learned how to use transformation groups and how to append/replace/freeze transformations in a simple way.
Other Mini How-Tos will guide you through the other functionalities offered by the AvalancheDataset class. The list of Mini How-Tos can be found here.