Encountered Problems/Error #1

Open
mosecedr opened this issue Nov 18, 2020 · 1 comment

Comments

mosecedr commented Nov 18, 2020

Hi,

First, I want to preface this post by saying that I'm quite new to Python and programming in general, so my descriptions of the problems and the terms I use are probably lacking.

System used:
Linux 4.15.0-123-generic x86_64 with Ubuntu 18.04.5

My questions would be:

  1. Are the template pipelines (segmentation.py etc.) complete (apart from changing the paths) and usable for the full process, from training with annotated data to actually running the trained neural network on a new video, or is the user expected to make further changes to the code? From what I understood, the "segmentation.py", "behavior.py" etc. modules only train the networks and require annotated data. So which module would be used to analyze new videos with already trained networks? Is it the "full_inference.py" module?

  2. I tried to run the "segmentation.py" module with the command
    python ./SwissKnife/segmentation.py --operation train_mouse
    after changing the following things in the code/system:

  • adjusted paths
  • temporarily (and probably stupidly) duplicated resnet50.py and renamed resnet50 to resnet101 in one of the copies, in both the keras/application and the keras-application folders of miniconda, because the pip-installed package contained no resnet101.py
  • line 519 in segmentation.py:
    model.init_training(model_path=model_path, init_with="last")
    to
    model.init_training(model_path=model_path, init_with="coco")
    The console gave the following output/error:

Using TensorFlow backend.
WARNING: Logging before flag parsing goes to stderr.
W1118 18:19:00.544893 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/utils.py:200: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

W1118 18:19:00.559130 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/utils.py:201: The name tf.random.set_random_seed is deprecated. Please use tf.compat.v1.random.set_random_seed instead.

load data
data loaded
W1118 18:19:00.560931 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W1118 18:19:00.565707 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

W1118 18:19:00.588845 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W1118 18:19:00.613820 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:1919: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

W1118 18:19:00.615761 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3976: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

W1118 18:19:02.903520 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

W1118 18:19:03.479370 140560456329024 deprecation.py:323] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:1354: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W1118 18:19:03.597378 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/mrcnn/model.py:557: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

W1118 18:19:03.681768 140560456329024 deprecation.py:506] From /tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/mrcnn/model.py:604: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead

Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 1
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.9
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 1.0
IMAGES_PER_GPU 1
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 1024
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 4
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME mouse
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 100
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 128
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001

2020-11-18 18:19:10.843771: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2020-11-18 18:19:10.873395: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3092635000 Hz
2020-11-18 18:19:10.877795: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cb72c91a10 executing computations on platform Host. Devices:
2020-11-18 18:19:10.877838: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2020-11-18 18:19:10.880047: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-11-18 18:19:11.295720: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.296426: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:05:00.0
2020-11-18 18:19:11.296997: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2020-11-18 18:19:11.299417: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
2020-11-18 18:19:11.301091: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10
2020-11-18 18:19:11.301581: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10
2020-11-18 18:19:11.303551: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10
2020-11-18 18:19:11.304743: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10
2020-11-18 18:19:11.309008: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-11-18 18:19:11.309173: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.309775: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.310259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-11-18 18:19:11.310315: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2020-11-18 18:19:11.423073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-18 18:19:11.423123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-11-18 18:19:11.423132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-11-18 18:19:11.423378: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.424004: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.424569: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-18 18:19:11.425110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7612 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:05:00.0, compute capability: 6.1)
2020-11-18 18:19:11.427620: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cb737d3c30 executing computations on platform CUDA. Devices:
2020-11-18 18:19:11.427647: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
training on #NUM images : 0

Starting at epoch 0. LR=0.001

Checkpoint Path: /tungstenfs/scratch/gkeller/mosecedr/SIPEC_results/mouse_/mouse20201118T1819/mask_rcnn_mouse_{epoch:04d}.h5
Selecting layers to train
fpn_c5p5 (Conv2D)
fpn_c4p4 (Conv2D)
fpn_c3p3 (Conv2D)
fpn_c2p2 (Conv2D)
fpn_p5 (Conv2D)
fpn_p2 (Conv2D)
fpn_p3 (Conv2D)
fpn_p4 (Conv2D)
In model: rpn_model
rpn_conv_shared (Conv2D)
rpn_class_raw (Conv2D)
rpn_bbox_pred (Conv2D)
mrcnn_mask_conv1 (TimeDistributed)
mrcnn_mask_bn1 (TimeDistributed)
mrcnn_mask_conv2 (TimeDistributed)
mrcnn_mask_bn2 (TimeDistributed)
mrcnn_class_conv1 (TimeDistributed)
mrcnn_class_bn1 (TimeDistributed)
mrcnn_mask_conv3 (TimeDistributed)
mrcnn_mask_bn3 (TimeDistributed)
mrcnn_class_conv2 (TimeDistributed)
mrcnn_class_bn2 (TimeDistributed)
mrcnn_mask_conv4 (TimeDistributed)
mrcnn_mask_bn4 (TimeDistributed)
mrcnn_bbox_fc (TimeDistributed)
mrcnn_mask_deconv (TimeDistributed)
mrcnn_class_logits (TimeDistributed)
mrcnn_mask (TimeDistributed)
W1118 18:19:16.837014 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/engine/training_generator.py:47: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class.
UserWarning('Using a generator with `use_multiprocessing=True`'
W1118 18:19:21.265680 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/callbacks.py:850: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

W1118 18:19:21.266057 140560456329024 deprecation_wrapper.py:119] From /tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/callbacks.py:853: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

Epoch 1/3
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/mrcnn/model.py", line 1696, in data_generator
image_index = (image_index + 1) % len(image_ids)
ZeroDivisionError: integer division or modulo by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/utils/data_utils.py", line 626, in next_sample
return six.next(_SHARED_SEQUENCES[uid])
File "/tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/mrcnn/model.py", line 1814, in data_generator
dataset.image_info[image_id]))
UnboundLocalError: local variable 'image_id' referenced before assignment
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "./SwissKnife/segmentation.py", line 775, in <module>
main()
File "./SwissKnife/segmentation.py", line 688, in main
fraction=fraction,
File "./SwissKnife/segmentation.py", line 617, in train_on_data
model_path, species, cv_folds=cv_folds, fold=0, fraction=fraction
File "./SwissKnife/segmentation.py", line 523, in train_on_data_once
model.train(dataset_train, dataset_val)
File "./SwissKnife/segmentation.py", line 261, in train
augmentation=self.augmentation,
File "/tungstenfs/scratch/gkeller/mosecedr/SIPEC/SwissKnife/mrcnn/model.py", line 2383, in train
use_multiprocessing=True,
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
generator_output = next(output_generator)
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/utils/data_utils.py", line 709, in get
six.reraise(*sys.exc_info())
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/site-packages/keras/utils/data_utils.py", line 685, in get
inputs = self.queue.get(block=True).get()
File "/tungstenfs/scratch/gkeller/mosecedr/miniconda3/envs/SIPEC/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
UnboundLocalError: local variable 'image_id' referenced before assignment

  3. Do you have any idea what the problem could be?
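
For readers hitting the same traceback: the two chained errors suggest that the list of image IDs was empty. A minimal sketch of that failure mode follows. It is a simplified, hypothetical stand-in for the loop in mrcnn/model.py, not the actual code; the function name `data_generator_sketch` is made up for illustration.

```python
def data_generator_sketch(image_ids):
    """Simplified stand-in for the generator loop (hypothetical, not the real code)."""
    image_index = -1
    try:
        # With an empty dataset, len(image_ids) is 0 and the modulo fails
        # with ZeroDivisionError, as in the first traceback.
        image_index = (image_index + 1) % len(image_ids)
        image_id = image_ids[image_index]
        return image_id
    except Exception:
        # The handler tries to report the failing image, but image_id was
        # never assigned, so an UnboundLocalError replaces the first error.
        raise RuntimeError("Error processing image {}".format(image_id))


if __name__ == "__main__":
    try:
        data_generator_sketch([])
    except UnboundLocalError as err:
        # The original cause is still attached as the exception context.
        print(type(err.__context__).__name__)
```

So the UnboundLocalError at the bottom of the log is only a follow-up error; the ZeroDivisionError further up, together with the line "training on #NUM images : 0", points to the real cause.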

Edit: I managed to solve the problem. I hadn't read the code properly and did not store the frames in a "train" or "val" folder, which led to an empty frame variable.
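
A check like the following, run before training, might have caught this earlier. It is only a sketch under assumptions: the subfolder names "train" and "val" come from the edit above, while the helper names `count_frames` and `check_frames` and the list of image suffixes are hypothetical and not part of SIPEC.

```python
from pathlib import Path

# Assumed frame formats; adjust to whatever the annotated frames actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def count_frames(frames_dir, subsets=("train", "val")):
    """Return {subset: number of image files} for each expected subfolder."""
    root = Path(frames_dir)
    counts = {}
    for subset in subsets:
        folder = root / subset
        if folder.is_dir():
            counts[subset] = sum(
                1
                for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            # A missing subfolder counts as zero frames.
            counts[subset] = 0
    return counts


def check_frames(frames_dir):
    """Raise a readable error if any expected subfolder has no frames."""
    counts = count_frames(frames_dir)
    empty = [name for name, n in counts.items() if n == 0]
    if empty:
        raise FileNotFoundError(
            "No frames found in subfolder(s) {} of {}".format(empty, frames_dir)
        )
    return counts
```

Calling `check_frames` on the frames directory before starting the training would then fail with a message naming the empty subfolder instead of the ZeroDivisionError inside the data generator.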



Additional remarks on parts of the installation process that might need to be changed:

  • Installing PyYAML 5.1 from requirements.txt throws an error because it first tries to uninstall the newer version, which is not possible because "it is a distutils installed project".

  • After installing from requirements.txt, Anaconda reported that the dependencies of some of the installed packages are not satisfied. For example, tensorflow 1.14.0 needs tensorboard >=1.14.0,<1.15.0, but requirements.txt specifies tensorboard==1.10.0. There were more dependency conflicts, but I'm not sure which ones.

  • As mentioned above, neither the keras/application nor the keras_application folder installed through pip contained a "resnet101.py" file, and the respective "__init__.py" files have no reference to it. I also tried to pip install the newer keras-application package, but it didn't help.
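
To make the tensorboard conflict concrete, here is a small sketch of the range check implied by ">=1.14.0,<1.15.0". The helpers `parse_version` and `satisfies` are hypothetical and only handle plain numeric versions such as "1.14.0"; real requirement specifiers are richer than this.

```python
def parse_version(text):
    """Turn '1.14.0' into (1, 14, 0); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(pinned, lower, upper):
    """Return True if lower <= pinned < upper, comparing numeric version tuples."""
    return parse_version(lower) <= parse_version(pinned) < parse_version(upper)


if __name__ == "__main__":
    # tensorboard==1.10.0 against the range required by tensorflow 1.14.0
    print(satisfies("1.10.0", "1.14.0", "1.15.0"))
```

The pinned 1.10.0 falls below the lower bound, which matches the reported message. Running `pip check` in the environment should list the remaining conflicts among the installed packages.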


I'm sorry for the wall of text. I hope that some of the experiences I described can help ease the installation process and the use of this software for beginners like me.

Kind Regards
Cedric

Member

damaggu commented Nov 24, 2020

Hi Cedric,

Thank you so much for your feedback, it's greatly appreciated. We're currently trying to reproduce your issues and incorporate your feedback into the next version of SIPEC, and we will add more detailed documentation and example data. I will also document how those issues have been addressed.
I think we will release a new version soon (within the next 1-2 weeks), and it'd be amazing to hear more about your experience with the new version then.

Thanks a ton, and I hope to hear from you soon,

Markus

damaggu added a commit that referenced this issue Jun 24, 2021
chadhat added a commit that referenced this issue Jun 28, 2021