learning without forgetting

problem definition

how to use new data to train a network on a new task while preserving its original capabilities

comparable methods

  • feature extraction: the parameters trained on the old tasks are left unchanged, and the features they produce are used to train new branches for the new tasks. however, because the shared parameters are never adapted, they may fail to represent features that are discriminative for the new task.

  • fine-tuning FC: the earlier shared layers are fixed and only the fc layers are fine-tuned on the new task. however, the fine-tuned parameters that are shared with the old tasks degrade performance on previous tasks, because they change without any new guidance from the original datasets.

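  • joint training: the network is trained on the old and new tasks together, so it serves as an upper bound on the performance of lwf, but it needs both the original datasets and the new datasets.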
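
The differences between these baselines can be summarised by which parameter groups are updated and whether the original data is required. The sketch below is only an illustration of the list above: the group names (`shared_conv`, `shared_fc`, `old_head`, `new_head`) and the `Strategy` and `frozen_groups` helpers are hypothetical, and the split of the shared layers into a convolutional part and an fc part is an assumption made to express the fine-tuning FC case.

```python
from dataclasses import dataclass

# Hypothetical parameter groups (assumed split, not taken from the notes).
ALL_GROUPS = frozenset({"shared_conv", "shared_fc", "old_head", "new_head"})


@dataclass(frozen=True)
class Strategy:
    trainable: frozenset   # groups updated while training on the new task
    needs_old_data: bool   # whether original images and labels are required


STRATEGIES = {
    # Only the new branch is trained; everything learned on old tasks is fixed.
    "feature_extraction": Strategy(frozenset({"new_head"}), False),
    # The fc layers and the new branch are trained; earlier layers are fixed.
    "fine_tuning_fc": Strategy(frozenset({"shared_fc", "new_head"}), False),
    # Everything is trained, using both the original and the new datasets.
    "joint_training": Strategy(ALL_GROUPS, True),
}


def frozen_groups(name):
    """Return the sorted list of parameter groups left unchanged by a strategy."""
    return sorted(ALL_GROUPS - STRATEGIES[name].trainable)
```

For example, `frozen_groups("feature_extraction")` lists every group except the new branch, which is why that baseline cannot adapt the shared features to the new task.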

learning without forgetting

  • similar to joint training, but it does not need the original images and labels; only the new data is used.
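
One way to train without the original data is to record the original network's old-task outputs on the new images and penalise the updated network for drifting away from them. The following is a minimal sketch of such an objective for a single image, assuming a distillation-style term with softened probabilities. The function names, the weight `lambda_old`, and the temperature value are illustrative assumptions, not something stated in these notes.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; temperature > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy H(target, pred) between two probability vectors."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))


def lwf_loss(new_logits, new_label, old_logits, recorded_old_logits,
             lambda_old=1.0, temperature=2.0):
    """Combined loss for one new-task image.

    new_logits          : new-task outputs of the network being trained
    new_label           : ground-truth class index for the new task
    old_logits          : old-task outputs of the network being trained
    recorded_old_logits : old-task outputs of the ORIGINAL network on the
                          same image, computed once before training starts
    """
    # New-task term: ordinary cross-entropy against the ground-truth label.
    new_probs = softmax(new_logits)
    loss_new = -math.log(new_probs[new_label] + 1e-12)

    # Old-task term: keep the softened old-task outputs close to the
    # recorded ones, so no original images or labels are required.
    target = softmax(recorded_old_logits, temperature)
    pred = softmax(old_logits, temperature)
    loss_old = cross_entropy(target, pred)

    return loss_new + lambda_old * loss_old
```

With the new-task outputs held fixed, the loss is smallest when `old_logits` reproduce `recorded_old_logits` and grows as the old-task outputs drift, which is how the old behaviour is preserved using only new data.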