We include baselines (Finetuning, Freezing and Incremental Joint Training) and the approaches from "Class-incremental learning: survey and performance evaluation" (arxiv). The regularization-based approaches are EWC, MAS, PathInt, LwF, LwM and DMC. The rehearsal approaches are iCaRL, EEIL and RWalk. The bias-correction approaches are IL2M, BiC and LUCIR.
When running an experiment, the approach used can be defined in `main_incremental.py` using `--approach`. Each approach is called by the name of its respective `*.py` file. All approaches inherit from the class `Inc_Learning_Appr`, which has the following main arguments:
- `--nepochs`: number of epochs per training session (default=200)
- `--lr`: starting learning rate (default=0.1)
- `--lr-min`: minimum learning rate (default=1e-4)
- `--lr-factor`: learning rate decreasing factor (default=3)
- `--lr-patience`: maximum patience to wait before decreasing the learning rate (default=5)
- `--clipping`: clip gradient norm (default=10000)
- `--momentum`: momentum factor (default=0.0)
- `--weight-decay`: weight decay (L2 penalty) (default=0.0)
- `--warmup-nepochs`: number of warm-up epochs (default=0)
- `--warmup-lr-factor`: warm-up learning rate factor (default=1.0)
- `--multi-softmax`: apply a separate softmax for each task (default=False)
- `--fix-bn`: fix batch normalization after the first task (default=False)
- `--eval-on-train`: show train loss and accuracy (default=False)
If an approach has specific arguments, those are defined in the `extra_parser()` of its approach file and are also listed below. All of this information is also available by using `--help`.
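As an illustration, the snippet below sketches what such an `extra_parser()` could look like for an approach exposing the documented LwF options `--lamb` and `--T`. It is a hedged sketch based on standard `argparse` usage, not the exact code of any approach file:

```python
import argparse

def extra_parser(args):
    """Hedged sketch of an approach-specific parser (modelled on the documented
    LwF options --lamb and --T); real approach files may differ in detail."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--lamb', default=1, type=float,
                        help='forgetting-intransigence trade-off (default=%(default)s)')
    parser.add_argument('--T', default=2, type=int,
                        help='temperature scaling (default=%(default)s)')
    # parse_known_args() returns (parsed approach args, remaining args for the main parser)
    return parser.parse_known_args(args)
```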
For all approaches using exemplars, the corresponding arguments are:
- `--num-exemplars`: fixed memory, total number of exemplars (default=0)
- `--num-exemplars-per-class`: growing memory, number of exemplars per class (default=0)
- `--exemplar-selection`: exemplar selection strategy (default='random')
where `--num-exemplars` and `--num-exemplars-per-class` cannot be used at the same time (the two memory policies are contrasted in the sketch below). We extend LwF, EWC, MAS and Path Integral to allow exemplar rehearsal.
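A quick, hypothetical illustration of how the two memory policies scale with the number of classes seen so far (the numbers are made up for the example and are not defaults):

```python
# Hypothetical numbers, chosen only to contrast the two policies.
num_classes_seen = 50

# --num-exemplars 2000: fixed total memory shared by all seen classes,
# so the per-class share shrinks as more classes arrive.
exemplars_per_class = 2000 // num_classes_seen   # 40 exemplars per class

# --num-exemplars-per-class 20: constant per-class budget,
# so the total memory grows with the number of classes.
total_memory = 20 * num_classes_seen             # 1000 exemplars in total
```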
To add a new approach:
- Create a new file similar to `finetuning.py` (a minimal skeleton is sketched after this list). The file name will be the one used with `--approach`.
- Implement the method as needed and override the necessary functions and methods from `incremental_learning.py`.
- Add any approach-specific arguments to the approach parser, and do not modify `calculate_metrics()` unless necessary, so that the metrics remain comparable.
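Below is a hedged, minimal sketch of what such a file could contain. The constructor signature, the overridable methods and the helper attributes of `Inc_Learning_Appr` should be checked in `incremental_learning.py`; the names used here (e.g. `my_trade_off`) are hypothetical:

```python
import torch
from .incremental_learning import Inc_Learning_Appr

class Appr(Inc_Learning_Appr):
    """Hedged sketch of a new approach; verify the exact constructor signature and
    overridable methods in incremental_learning.py before relying on this."""

    def __init__(self, model, device, my_trade_off=1.0, **kwargs):
        super().__init__(model, device, **kwargs)
        # `my_trade_off` is a hypothetical approach-specific argument; it would be
        # exposed through this file's extra_parser(), as sketched earlier.
        self.my_trade_off = my_trade_off

    def criterion(self, t, outputs, targets):
        # Base cross-entropy over the concatenated heads; a real approach would add
        # its own regularization or distillation term here, weighted by my_trade_off.
        return torch.nn.functional.cross_entropy(torch.cat(outputs, dim=1), targets)
```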
--approach finetuning
Learning approach which learns each task incrementally while not using any data or knowledge from previous tasks. By default, weights corresponding to the outputs of previous classes are not updated. This can be changed by using `--all-outputs`. This approach allows the use of exemplars.
--approach freezing
Learning approach which freezes the model after training the first task, so that only the heads are learned afterwards. The task after which the model is frozen can be changed by using `--freeze-after num_task (int)`. As in Finetuning, by default only the outputs corresponding to the current task are updated; this can be changed by using `--all-outputs`.
--approach joint
Learning approach which has access to all data from previous tasks and serves as an upper-bound baseline. Joint training can be combined with Freezing by using `--freeze-after num_task (int)`; by default this option is disabled (default=-1).
--approach lwf
arxiv | TPAMI 2017
- `--lamb`: forgetting-intransigence trade-off (default=1)
- `--T`: temperature scaling (default=2); see the sketch below for how both options enter the loss
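As a rough illustration of how these two options enter an LwF-style objective, the sketch below combines cross-entropy on the new task with a temperature-scaled distillation term on the old outputs. It is the standard textbook formulation, hedged as an approximation of (not a substitute for) the actual implementation:

```python
import torch.nn.functional as F

def lwf_style_loss(old_head_logits, prev_model_old_logits, new_head_logits, targets,
                   lamb=1.0, T=2.0):
    """Hedged sketch: new-task cross-entropy plus temperature-scaled distillation
    that keeps the old-task outputs close to the previous model's, weighted by lamb."""
    ce = F.cross_entropy(new_head_logits, targets)
    distill = F.kl_div(F.log_softmax(old_head_logits / T, dim=1),
                       F.softmax(prev_model_old_logits / T, dim=1),
                       reduction='batchmean') * (T * T)
    return ce + lamb * distill
```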
--approach icarl
arxiv | CVPR 2017 | code
- `--lamb`: forgetting-intransigence trade-off (default=1)
--approach ewc
arxiv | PNAS 2017
- `--lamb`: forgetting-intransigence trade-off (default=5000)
- `--alpha`: trade-off for how old and new Fisher information are fused (default=0.5); see the sketch below
- `--fi-sampling-type`: sampling type for Fisher information (default='max_pred')
- `--fi-num-samples`: number of samples for Fisher information (-1: all available) (default=-1)
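To make the role of `--lamb` and `--alpha` concrete, here is a hedged sketch of the standard quadratic EWC penalty and of one plausible way old and new Fisher estimates could be fused; the actual fusion rule in the code may differ:

```python
def ewc_penalty(named_params, old_params, fisher, lamb=5000.0):
    """Standard quadratic EWC penalty: each parameter is pulled towards its value
    after the previous task, weighted by its Fisher importance and by lamb."""
    loss = 0.0
    for name, p in named_params:
        if name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]).pow(2)).sum()
    return lamb / 2 * loss

def fuse_fisher(fisher_old, fisher_new, alpha=0.5):
    """Hedged sketch of an --alpha-weighted fusion of old and new Fisher estimates."""
    return {n: alpha * fisher_old[n] + (1 - alpha) * fisher_new[n] for n in fisher_new}
```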
--approach path_integral
arxiv | ICML 2017 | code
- `--lamb`: forgetting-intransigence trade-off (default=0.1)
- `--damping`: damping (default=0.1)
--approach mas
arxiv | ECCV 2018 | code
- `--lamb`: forgetting-intransigence trade-off (default=1)
- `--alpha`: trade-off for how old and new Fisher information are fused (default=0.5)
- `--fi-num-samples`: number of samples for Fisher information (-1: all available) (default=-1)
--approach r_walk
arxiv | ECCV 2018 | code
- `--lamb`: forgetting-intransigence trade-off (default=1)
- `--alpha`: trade-off for how old and new Fisher information are fused (default=0.5)
- `--damping`: damping (default=0.1)
- `--fi-sampling-type`: sampling type for Fisher information (default='max_pred')
- `--fi-num-samples`: number of samples for Fisher information (-1: all available) (default=-1)
--approach eeil
arxiv | ECCV 2018 | code
- `--lamb`: forgetting-intransigence trade-off (default=1)
- `--T`: temperature scaling (default=2)
- `--lr-finetuning-factor`: finetuning learning rate factor (default=0.01)
- `--nepochs-finetuning`: number of epochs for balanced training (default=40)
- `--noise-grad`: add noise to gradients (default=False)
--approach lwm
arxiv | CVPR 2019
- `--beta`: trade-off for distillation loss (default=1)
- `--gamma`: trade-off for attention loss (default=1)
- `--gradcam-layer`: which layer to take for GradCAM calculations (default='layer3')
- `--log-gradcam-samples`: how many GradCAM examples to log (default=0)
--approach dmc
arxiv | WACV 2020 | code
- `--aux-dataset`: auxiliary dataset (default='imagenet_32_reduced')
- `--aux-batch-size`: batch size for auxiliary dataset (default=128)
--approach bic
arxiv | CVPR 2019 | code
- `--lamb`: forgetting-intransigence trade-off (-1: original moving trade-off) (default=-1)
- `--T`: temperature scaling (default=2)
- `--val-exemplar-percentage`: percentage of exemplars that will be used for validation (default=0.1)
- `--num-bias-epochs`: number of epochs for training the bias correction (default=200); see the sketch below
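For context on the two bias-related options, here is a hedged sketch of the bias-correction layer described in the BiC paper: two scalar parameters per task, applied only to the logits of the newest classes and trained on the held-out exemplar validation split (which is what `--val-exemplar-percentage` and `--num-bias-epochs` control). The details below are illustrative, not the repository's exact code:

```python
import torch

class BiasLayer(torch.nn.Module):
    """Hedged sketch of BiC-style bias correction: a learned scale and shift
    applied only to the logits of the most recently learned classes."""
    def __init__(self):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.ones(1))
        self.beta = torch.nn.Parameter(torch.zeros(1))

    def forward(self, logits, new_class_indices):
        # Old-class logits pass through unchanged; new-class logits are rescaled and
        # shifted to compensate for the bias towards recently seen classes.
        corrected = logits.clone()
        corrected[:, new_class_indices] = self.alpha * logits[:, new_class_indices] + self.beta
        return corrected
```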
--approach lucir
CVPR 2019 | code
- `--lamb`: trade-off for distillation loss (default=5)
- `--lamb-mr`: trade-off for the MR loss (default=1)
- `--dist`: margin threshold for the MR loss (default=0.5)
- `--K`: number of new-class embeddings chosen as hard negatives for the MR loss (default=2)