Instance-based-RN

We designed an instance-based compositional relation network (I-cRN) for machine graphical perception tasks, aligned with the actual human perception procedure. It includes an IRN_m network for multi-object ratio tasks and an IRN_p network for pair-ratio estimation tasks.

If you run into any problem or bug, please let me know ([email protected]). Thanks! (^,^)

Usage.

1. Environments:

  • Recommended environment: Python 3.6, TensorFlow 1.14.0, Keras 2.2.4.

  • Necessary libraries: numpy, opencv-python, argparse, scikit-learn, openpyxl, pickle, matplotlib (see the install command below).
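The third-party packages can typically be installed with pip; note that argparse and pickle ship with the Python standard library, so they need no separate install. Use tensorflow-gpu==1.14.0 instead if you want GPU support.

pip install tensorflow==1.14.0 keras==2.2.4 numpy opencv-python scikit-learn openpyxl matplotlib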

2. Example:

For quick experiments, we provide Code_PureExperiments, which automatically runs each network 5 times by default and computes the average and SD of the MSE and MLAE. In PureExperiments, the dataset is automatically regenerated every time.

  • I'll take Task1.1 Pie3_6 (codes) as an example of how to run the code.
python Net_VGG.py    --gpu 0 (--times 5)   
python Net_RN.py     --gpu 1 (--times 5)
python Net_RN_seg.py --gpu 2 (--times 5)
python Net_IRNm.py   --gpu 3 (--times 5)  # or `python Net_IRNp.py   --gpu 3 (--times 5)` for pair tasks.
  • If you want to inspect some sample images of the dataset locally, you can run the following script to generate a small dataset under './datasets/'. This small dataset contains 600/200/200 training/val/test images, including the original images, segmented sub-images, and ground truths.
python Dataset_generator.py

3. Output files:

For most tasks, we store the best model, i.e., the one that obtains the lowest loss on the validation set (strictly speaking, this applies when the training and test sets share the same features). Here is an example output of an RN network.

  • RN_0.p ~ RN_4.p: the pickle files for each individual experiment. Each contains the MSE and MLAE of the train/val/test sets and the loss history of the train/val sets.

  • RN_avg.p: the pickle file that summarizes all experiments. It contains the average and SD of the MSE and MLAE on the train/test sets.

  • Folders RN_0 ~ RN_4: each contains the model and weights of the trained network, along with the predicted results and ground truths.

  • Note: I also compute the ERROR RATE for the POINT CLOUD tasks; it can be found in their pickle files (see the loading sketch below).
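A minimal sketch of how to read these result files; the exact keys stored in the pickles are not documented here, so print what you load and inspect it:

```python
import pickle

# Load one experiment's results (e.g., RN_0.p) and inspect its contents.
# The exact structure is not documented here: expect the MSE/MLAE of the
# train/val/test sets, the loss histories, and (for the point-cloud
# tasks) the error rate.
with open('RN_0.p', 'rb') as f:
    results = pickle.load(f)

print(type(results))
print(results)
```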

However, some generalization tasks have training and test sets with different features, e.g., PieNumber, PieLineWidth, and PieColor. Under these conditions, we store not only the best results obtained at the lowest loss on the validation set, but also the best results on the training set. This is because VGG and RN can only fit the training set well; they cannot fit the validation and test sets. Looking at the loss curve of VGG in PieColor_fixedTrain or of RN in PieLineWidth, the validation loss may reach its lowest value while the training loss has not yet converged: the network is optimized only on the training set, and the validation set is entirely different from it, so the validation loss bears no relation to the training loss. Therefore, I store both results, showing the best performance the network can achieve on the training set and on the validation set, respectively.

The following figure shows an example file, VGG_0.p, from the PieLineWidth task.

4. Our network structure:

To strengthen the generalization ability of our network, we redesigned the IRN_m network, as shown in the following figure. It brings large improvements when (1) the training and test sets differ, e.g., Task 1.2 PieNumber, or (2) the object number is large, e.g., Task 1.1 Pie3_12.
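The figure is not reproduced here. As a rough illustration of the instance-based idea (a CNN encoder shared across the segmented sub-images, whose pairwise features feed a relation module), here is a schematic Keras sketch; the layer sizes, pairwise aggregation, and output head are assumptions for illustration, not the exact IRN_m structure:

```python
from keras import layers, models

# Schematic sketch only: NOT the exact IRN_m architecture.
N_OBJECTS, H, W, C = 6, 100, 100, 3

def build_encoder():
    # Per-instance feature extractor, shared across all sub-images.
    inp = layers.Input(shape=(H, W, C))
    x = layers.Conv2D(32, 3, activation='relu')(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation='relu')(x)
    x = layers.GlobalAveragePooling2D()(x)
    return models.Model(inp, x)

encoder = build_encoder()
inputs = [layers.Input(shape=(H, W, C)) for _ in range(N_OBJECTS)]
feats = [encoder(i) for i in inputs]  # one feature vector per instance

# Relation module: shared dense layers applied to every pair of
# instance features (self-pairs included for simplicity).
d1 = layers.Dense(128, activation='relu')
d2 = layers.Dense(128, activation='relu')
pairs = [d2(d1(layers.concatenate([fi, fj])))
         for fi in feats for fj in feats]

agg = layers.average(pairs)  # aggregate the pairwise relations
out = layers.Dense(N_OBJECTS, activation='softmax')(agg)  # ratio vector
model = models.Model(inputs, out)
model.summary()
```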

Details:

(1) The train/val/test sets contain 60000/20000/20000 charts, respectively, and we use the Adam optimizer (lr = 0.0001) to train the network.

(2) During training, we shuffle the dataset before each epoch and save the best model, i.e., the one with the lowest MSE loss on the validation set (see the sketch below).

(3) Noise is added directly during dataset generation. Position-Length and Point-Cloud deviate most from the results reported in Daniel's paper when using the Adam optimizer.
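A runnable Keras sketch of details (1) and (2), using a placeholder model and random data in place of the real IRN_m and chart datasets:

```python
import numpy as np
from keras import layers, models
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam

# Placeholder model and random data; substitute the real network and
# the generated chart datasets.
model = models.Sequential([
    layers.Flatten(input_shape=(100, 100, 3)),
    layers.Dense(64, activation='relu'),
    layers.Dense(6, activation='softmax'),
])
model.compile(optimizer=Adam(lr=0.0001), loss='mse')  # detail (1)

x_train, y_train = np.random.rand(64, 100, 100, 3), np.random.rand(64, 6)
x_val, y_val = np.random.rand(16, 100, 100, 3), np.random.rand(16, 6)

# Detail (2): shuffle before each epoch and keep only the weights with
# the lowest MSE loss on the validation set. For the generalization
# tasks, a second ModelCheckpoint monitoring 'loss' would also keep the
# best train-set model.
val_best = ModelCheckpoint('best_val.h5', monitor='val_loss',
                           save_best_only=True, mode='min')
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=5, batch_size=32, shuffle=True, callbacks=[val_best])
```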

Experiments1: Our new tasks.

(These experiments focus on verifying the generalization ability of the networks.)

Note: for most tasks, we use the model that performs best on the validation set to compute the final MSE, MLAE, etc. However, VGG, VGG_seg, and RN lack strong generalization ability and cannot handle the validation/test sets in PieNumber and PieLineWidth: the network may reach its lowest loss on the validation set while it still has not converged on the training set. Therefore, to evaluate them more fairly, only for VGG and RN in the PieNumber and PieLineWidth tasks, we use the best model on the training set (instead of on the validation set) to compute the MSE on the training set, while we still use the best model on the validation set to compute the MSE on the test sets.

Task1.1: Pie3_6 and Pie3_12

[Pie3_6 Codes] [Pie3_12 Codes]

  • This task tests performance when the maximum object number is large and the number varies greatly. The object number in both the training and test sets ranges from 3 to 6 for Pie3_6 and from 3 to 12 for Pie3_12. I consider 12 large enough, since with more objects the chart would look messy.
  • The results below show that VGG and RN perform worse as the number of objects increases, whereas our IRN_m still performs very well. The MLAE of VGG appears to increase more significantly than its MSE.
| MSE (MLAE) | VGG | RN | IRN_p | IRN_m (!!!) |
| --- | --- | --- | --- | --- |
| Pie3_6: Train set | 0.00036 (0.67) | 0.00435 (2.26) | 0.00016 (-0.25) | 0.00012 (-0.29) |
| Pie3_6: Test set | 0.00038 (0.70) | 0.00438 (2.26) | 0.00017 (-0.22) | 0.00012 (-0.28) |
| Pie3_12: Train set | 0.00089 (1.24) | 0.00705 (2.54) | 0.00033 (0.12) | 0.00023 (0.00) |
| Pie3_12: Test set | 0.00098 (1.29) | 0.00727 (2.56) | 0.00041 (0.25) | 0.00024 (0.02) |

Task1.4: PieColor

[FixedTrain Codes] [RandomColor Codes]

  • We design two tasks in PieColor. (1) FixedTrain: the training set uses only 6 colors, while the test set uses random colors. (2) RandomColor: both the training and test sets use random colors.
| FixedTrain | VGG | RN | IRN_m (!!!) |
| --- | --- | --- | --- |
| Train set | 0.00040 (0.76) | 0.00443 (2.28) | 0.00014 (-0.24) |
| Test set | 0.06982 (3.90) | 0.08715 (4.38) | 0.00480 (1.45) |

| RandomColor | VGG | RN | IRN_m (!!!) |
| --- | --- | --- | --- |
| Train set | 0.00051 (0.86) | 0.00492 (2.36) | 0.00015 (-0.22) |
| Test set | 0.00095 (0.95) | 0.00599 (2.41) | 0.00015 (-0.21) |

[Figure: loss curves]

Task1.2: PieNumber.

[Codes]

  • The range of object numbers differs between the training and test sets. By default, pie charts in the training set contain 3 to 6 sectors, while those in the test set contain 7 to 9 sectors. For VGG, RN, and IRN_m, all outputs are 9-dimensional vectors.
  • Only our IRN_m and IRN_p achieve good results on the test set. (1) Our network can handle training and test sets with different object numbers. (2) Our network also seems to converge faster than VGG and RN. The validation loss of IRN_m appears to fluctuate more strongly than VGG's, but this is misleading, because their validation losses differ by orders of magnitude (see the plotting sketch below).
| MSE (MLAE) | VGG | VGG_seg | RN | IRN_p | IRN_m (!!!) |
| --- | --- | --- | --- | --- | --- |
| Train set | 0.00023 (0.18) | 0.00022 (0.11) | 0.00289 (1.70) | 0.00015 (-0.56) | 0.00010 (-0.57) |
| Test set | 0.13354 (4.56) | 0.14972 (4.79) | 0.15874 (4.84) | 0.00087 (0.97) | 0.00058 (0.81) |

[Figure: PieNumber loss curves]
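To compare loss curves whose magnitudes differ this much, a log-scale y-axis is the fair view. A small matplotlib sketch with placeholder curves:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder loss histories standing in for those stored in the *.p
# files; on a log scale, curves of very different orders of magnitude
# become directly comparable.
epochs = np.arange(1, 101)
vgg_val = 0.1 * np.exp(-epochs / 40) + 0.01 * np.random.rand(100)
irn_val = 0.001 * np.exp(-epochs / 30) + 1e-4 * np.random.rand(100)

plt.semilogy(epochs, vgg_val, label='VGG val loss')
plt.semilogy(epochs, irn_val, label='IRN_m val loss')
plt.xlabel('epoch')
plt.ylabel('MSE loss (log scale)')
plt.legend()
plt.show()
```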

Task1.3: PieLineWidth.

[Codes]

  • The line width differs between the training and test sets in this task. By default, the line width of the pie charts in the training set is 1, while in the test set it is 2 or 3. The network output is a 6-dimensional vector, and each chart contains 3 to 6 sectors.
  • Due to the different line widths, PieLineWidth differs from PieNumber, whose training and test sets share the same appearance domain. The result is nevertheless surprising: both IRN_p and IRN_m achieve good results on the test set. This suggests that segmenting objects in advance and directly using a CNN to extract their individual features really does help.

  • For our IRN_m network, the validation loss declines with the training loss in the early stage and stays low for many epochs, so IRN_m can perform as well on the validation set as on the training set. However, since we always optimize the network on the training set, it is normal for the validation loss to eventually worsen as the network squeezes out better results on the training set.

| MSE (MLAE) | VGG | RN | IRN_p | IRN_m (!!!) |
| --- | --- | --- | --- | --- |
| Train set | 0.00036 (0.69) | 0.00429 (2.26) | 0.00065 (0.59) | 0.00018 (0.01) |
| Test set | 0.06459 (4.26) | 0.05459 (4.08) | 0.00160 (1.27) | 0.00032 (0.33) |

Experiments2: ClevelandMcGill

(These experiments are the same as those in Daniel's paper.)

For the following experiments, I only show the MSE and MLAE on the test sets (see the metric sketch below).
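For reference, a small sketch of both metrics, assuming MLAE follows the definition in Daniel's paper (log base 2 of the mean absolute error in percent, plus 1/8); whether this repo scales errors the same way is an assumption:

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_pred - y_true) ** 2)

def mlae(y_true, y_pred):
    # MLAE = log2(MAE + 1/8), with errors expressed in percent
    # (assumed to match this repo's usage).
    return np.log2(np.mean(np.abs(y_pred - y_true)) * 100 + 0.125)

# Example: ratio predictions within a couple of percent of the truth.
y_true = np.array([0.30, 0.50, 0.20])
y_pred = np.array([0.32, 0.49, 0.19])
print(mse(y_true, y_pred), mlae(y_true, y_pred))
```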

Task2.1: Position-Angle.

Codes: [Bar charts] [Pie charts]

| MSE (MLAE) | VGG | RN | IRN_m (!!!) |
| --- | --- | --- | --- |
| Bar chart | 0.00016 (0.21) | 0.00394 (2.34) | 0.00014 (-0.31) |
| Pie chart | 0.00028 (0.57) | 0.00390 (2.34) | 0.00021 (0.11) |

Task2.2: Position-Length.

Codes: [MULTI] [Type1] [Type2] [Type3] [Type4] [Type5]

| MSE (MLAE) | VGG | RN | IRN_p (!!!) |
| --- | --- | --- | --- |
| Type1 | 0.000004 (-1.77) | 0.000546 (0.80) | 0.000008 (-1.49) |
| Type2 | 0.000005 (-1.66) | 0.000485 (0.72) | 0.000007 (-1.57) |
| Type3 | 0.000006 (-1.63) | 0.000524 (0.78) | 0.000007 (-1.54) |
| Type4 | 0.000004 (-1.80) | 0.000494 (0.74) | 0.000010 (-1.34) |
| Type5 | 0.000004 (-1.77) | 0.000509 (0.77) | 0.000009 (-1.42) |
| Multi | 0.000011 (-1.41) | 0.000507 (0.76) | 0.000008 (-1.49) |

Task2.3: Point-Cloud.

Codes: [Num10] [Num100] [Num1000]

| MSE (MLAE) | VGG | RN | IRN_p (!!!) |
| --- | --- | --- | --- |
| Base10 | 0.000099 (-0.17) | 0.002772 (2.06) | 0.000016 (-1.26) |
| Base100 | 0.099914 (4.77) | 0.005228 (2.56) | 0.000045 (-0.65) |
| Base1000 | 0.101107 (4.79) | 0.022654 (3.58) | 0.000894 (1.29) |

Experiments3: Supplement

Task3.1: The effect of Non_local_block.

Task3.2: Can the original RN be improved by changing its structure?

Task3.3: How does RN perform when the objects are segmented directly rather than extracted by a CNN?
