I tried to train a new model by running `train.py`, but I got this:
```
[2023-06-28 10:32:08,821::train::INFO] Namespace(config='./configs/train.yml', device='cuda', logdir='./logs')
[2023-06-28 10:32:08,821::train::INFO] {'model': {'vn': 'vn', 'hidden_channels': 256, 'hidden_channels_vec': 64, 'encoder': {'name': 'cftfm', 'hidden_channels': 256, 'hidden_channels_vec': 64, 'edge_channels': 64, 'key_channels': 128, 'num_heads': 4, 'num_interactions': 6, 'cutoff': 10.0, 'knn': 48}, 'field': {'name': 'classifier', 'num_filters': 128, 'num_filters_vec': 32, 'edge_channels': 64, 'num_heads': 4, 'cutoff': 10.0, 'knn': 32}, 'position': {'num_filters': 128, 'n_component': 3}}, 'train': {'seed': 2023, 'use_apex': False, 'batch_size': 8, 'num_workers': 8, 'pin_memory': True, 'max_iters': 500000, 'val_freq': 5000, 'pos_noise_std': 0.1, 'max_grad_norm': 100.0, 'optimizer': {'type': 'adam', 'lr': 0.0002, 'weight_decay': 0, 'beta1': 0.99, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 8, 'min_lr': 1e-05}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.1, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.15, 'p_bfs': 0.6, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 20, 'num_fake': 20, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}, 'edgesampler': {'k': 8}}}, 'dataset': {'name': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2023-06-28 10:32:08,823::train::INFO] Loading dataset...
[2023-06-28 10:32:09,280::train::INFO] Building model...
Num of parameters is 3711167
/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[the UserWarning above is repeated 8 times in total]
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 49, in _pin_memory_loop
    do_one_step()
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 26, in do_one_step
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 305, in rebuild_storage_fd
    fd = df.detach()
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/data/sdb/opt/miniconda3/envs/aidd/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata
[2023-06-28 10:32:12,068::train::INFO] [Train] Iter 1 | Loss 10.276641 | Loss(Fron) 0.631725 | Loss(Pos) 3.812413 | Loss(Cls) 1.901050 | Loss(Edge) 1.675482 | Loss(Real) 0.126777 | Loss(Fake) 2.129193 | Loss(Surf) 0.000000
[2023-06-28 10:32:12,073::train::ERROR] Runtime Error Pin memory thread exited unexpectedly
Traceback (most recent call last):
  File "train.py", line 227, in <module>
    train(it)
  File "train.py", line 108, in train
    batch = next(train_iterator).to(args.device)
StopIteration
```
Try adding the following at the top of `train.py`:

```python
torch.multiprocessing.set_sharing_strategy('file_system')
```
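For context: `RuntimeError: received 0 items of ancdata` in the pin-memory thread is the usual symptom of DataLoader worker processes hitting the open-file-descriptor limit under PyTorch's default `file_descriptor` sharing strategy; the `StopIteration` in `train.py` is just the training iterator dying as a consequence. `set_sharing_strategy` is the standard `torch.multiprocessing` API, though placing the call at module top, before any DataLoader is created, is an assumption about this repo's layout. A minimal sketch:

```python
# Hypothetical top of train.py: switch the tensor-sharing strategy
# before any DataLoader (and its worker processes) is constructed.
import torch.multiprocessing

# 'file_system' exchanges tensors between worker processes via files
# in shared memory rather than file descriptors, so the per-process
# fd limit is no longer exhausted when many batches are in flight.
torch.multiprocessing.set_sharing_strategy('file_system')
```

If the error still recurs, raising the fd limit in the shell before launching (e.g. `ulimit -n 4096`) or lowering `num_workers` in `configs/train.yml` are common fallbacks.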