Link to our paper
Authors: Naitik Khandelwal, Xiao Liu, Mengmi Zhang
Our paper has been accepted at NeurIPS 2024.
Scene graph generation (SGG) analyzes images to extract meaningful information about objects and their relationships. In the dynamic visual world, it is crucial for AI systems to continuously detect new objects and establish their relationships with existing ones. Recently, numerous studies have focused on continual learning within the domains of object detection and image recognition. However, only a limited amount of research addresses the more challenging continual learning problem in SGG. The increased difficulty arises from the intricate interactions and dynamic relationships among objects and their associated contexts. Thus, in continual learning, SGG models are often required to expand, modify, retain, and reason about scene graphs as part of adaptive visual scene understanding. To systematically explore Continual Scene Graph Generation (CSEGG), we present a comprehensive benchmark comprising three learning regimes: relationship incremental, scene incremental, and relationship generalization. Moreover, we introduce a "Replays via Analysis by Synthesis" method named RAS. This approach leverages the scene graphs, decomposes and re-composes them to represent different scenes, and replays the synthesized scenes based on these compositional scene graphs. The replayed synthesized scenes act as a means to practice and refine proficiency in SGG in known and unknown environments. Our experimental results not only highlight the challenges of directly combining existing continual learning methods with SGG backbones but also demonstrate the effectiveness of our proposed approach, which enhances CSEGG efficiency while preserving privacy and keeping memory usage low.
Below is an illustration of all the learning scenarios in CSEGG:
CSEGG Learning Scenarios.
From left to right, they are S1. relationship (Rel.) incremental learning (Incre.); S2. relationship and object (Rel. + Obj.) Incre.; and S3. relationship generalization (Rel. Gen.) in Object Incre. In S1 and S2, example triplets in the training sets (solid line) and test sets (dashed line) from each task are presented. The training and test sets from the same task are color-coded. The new objects or relationships in each task are bold and underlined. In S3, one single test set (dashed gray box) is used for benchmarking the relationship generalization ability of object incremental learning models across all the tasks.
Check INSTALL.md for installation instructions.
Check DATASET.md for instructions on dataset preprocessing.
See HOW_T0_USE.md for the commands that run the train and eval scripts (especially if you are using multiple GPUs).
Check HOW_T0_USE.md for the instructions.
After running the evaluation code, you can load "Final_Results.csv" in the Jupyter notebook "results_figure.ipynb".
Training:
- --num-gpus : Number of GPUs used for training.
- --start_task : Task from which to resume training.
- --sgg : Activates Stage 2 for Learning Scenarios S2 and S3. (This argument does not exist in Learning Scenario S1.)
- --continual : Selects which CSEGG model to train.
  - Learning Scenario S1: "replay_10", "ewc", "replay_100", "packnet", "ras" (our new model; see RAS.md for details and the ras folder for the code). To train "naive", omit this argument from the training command.
  - Learning Scenario S2: "replay_10", "ewc", "replay_20", "packnet", "ras". To train "naive", omit this argument from the training command.
  - Learning Scenario S3: "replay_10", "ras". To train "naive", omit this argument from the training command.
Evaluation:
- --num-gpus : Number of GPUs used for testing.
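Because the accepted `--continual` values differ per scenario, it can help to check a scenario/method pair before launching a long run. The following is a minimal sketch, not part of the repository: `methods_for` and `build_train_cmd` are hypothetical helper names, and the function only prints a training command rather than running it. The method lists are copied from the argument list above; the `--sgg` flag for Stage 2 is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CSEGG): print a training command
# for a scenario/method pair instead of executing it.

# Methods accepted by --continual in each learning scenario,
# as listed in this README.
methods_for() {
    case "$1" in
        S1) echo "replay_10 ewc replay_100 packnet ras" ;;
        S2) echo "replay_10 ewc replay_20 packnet ras" ;;
        S3) echo "replay_10 ras" ;;
        *)  return 1 ;;
    esac
}

# Usage: build_train_cmd SCENARIO METHOD NUM_GPUS
# METHOD "naive" omits --continual, as described in the argument list.
build_train_cmd() {
    scenario=$1
    method=$2
    gpus=$3
    allowed=$(methods_for "$scenario") || {
        echo "unknown scenario: $scenario" >&2
        return 1
    }
    if [ "$method" = "naive" ]; then
        echo "pods_train_$scenario --num-gpus $gpus"
        return 0
    fi
    for m in $allowed; do
        if [ "$m" = "$method" ]; then
            echo "pods_train_$scenario --num-gpus $gpus --continual \"$method\""
            return 0
        fi
    done
    echo "method '$method' is not listed for $scenario" >&2
    return 1
}

build_train_cmd S1 replay_10 4
build_train_cmd S3 naive 4
```

A pair that is not listed, for example `build_train_cmd S3 ewc 4`, prints an error to stderr and returns a non-zero status, so the helper can guard a launch script.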
There is only Stage 2 training for Learning Scenario S1. To train the model, run the following in the command window:
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
pods_train_S1 --num-gpus 4 --continual "replay_10"
To evaluate:
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
pods_test_S1 --num-gpus 1
To train the model for Learning Scenario S2, run the following in the command window:
#Stage 1
cd ~/CSEGG/playground/sgg/detr.res101.c5.multiscale.150e.bs16
pods_train_S2 --num-gpus 4 --continual "replay_10"
#Stage 2
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
pods_train_S2 --num-gpus 4 --continual "replay_10" --sgg "sgg"
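The two S2 training stages above run in different playground directories and must run in order, so they can be chained in a small driver script. The sketch below is an illustration under stated assumptions: `DRY_RUN`, `CSEGG_ROOT`, `METHOD`, `NUM_GPUS`, `run`, `run_stage`, and `train_s2` are hypothetical names, and the repository is assumed to be checked out at `~/CSEGG` as in the commands above. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of a two-stage training driver for Learning Scenario S2.
# All variable and function names here are hypothetical.
DRY_RUN=${DRY_RUN:-1}                    # 1 = print only, 0 = execute
CSEGG_ROOT=${CSEGG_ROOT:-"$HOME/CSEGG"}  # assumed checkout location
METHOD=${METHOD:-replay_10}              # value passed to --continual
NUM_GPUS=${NUM_GPUS:-4}

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Run one training stage from inside its playground directory.
# Usage: run_stage DIR [extra pods_train_S2 arguments...]
run_stage() {
    dir=$1
    shift
    echo "+ cd $CSEGG_ROOT/playground/sgg/$dir"
    if [ "$DRY_RUN" = "0" ]; then
        cd "$CSEGG_ROOT/playground/sgg/$dir" || return 1
    fi
    run pods_train_S2 --num-gpus "$NUM_GPUS" --continual "$METHOD" "$@"
}

# Stage 1 (object detection) first, then Stage 2 (SGG);
# Stage 2 is skipped if Stage 1 fails.
train_s2() {
    run_stage detr.res101.c5.multiscale.150e.bs16 &&
    run_stage detr.res101.c5.one_stage_rel_tfmer --sgg sgg
}

train_s2
```

The same pattern would apply to Learning Scenario S3 by swapping `pods_train_S2` for `pods_train_S3`, since both scenarios use the same two directories and the same `--sgg "sgg"` flag for Stage 2.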
To evaluate:
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
#Evaluation of Object Detection (Stage 1) and SGG (Stage 2) is combined
pods_test_S2 --num-gpus 1
To train the model for Learning Scenario S3, run the following in the command window:
#Stage 1
cd ~/CSEGG/playground/sgg/detr.res101.c5.multiscale.150e.bs16
pods_train_S3 --num-gpus 4 --continual "replay_10"
#Stage 2
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
pods_train_S3 --num-gpus 4 --continual "replay_10" --sgg "sgg"
To evaluate:
#evaluation of R_bbox and R@k_relation_gen
cd ~/CSEGG/playground/sgg/detr.res101.c5.one_stage_rel_tfmer
pods_test_S3 --num-gpus 1
This repository borrows code from the following scene graph benchmarking frameworks: Scene Graph Benchmark, developed by KaihuaTang, and PySGG and SGTR, developed by Rongjie Li.
- Importing ipdb anywhere in your code causes a multi-process initialization error; use pdb instead when debugging in multi-process mode.