BiopythonParserWarning: 'HEADER' line not found; can't determine PDB ID. #54

Open
johnnytam100 opened this issue Jul 8, 2024 · 4 comments

Comments

@johnnytam100

Hi DeepFRI!
Do you have any idea how to troubleshoot the following BiopythonParserWarning: 'HEADER' line not found error?

(DeepFRI) johnnytam100@DESKTOP-BDBH5VJ:/mnt/e/test/test_DeepFRI/DeepFRI$ python predict.py --pdb_dir ./allergen_esmfold_domain_pdb -ont mf --saliency
2024-07-08 12:16:36.392951: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2024-07-08 12:16:37.742941: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2024-07-08 12:16:37.904566: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:968] could not open file to read NUMA node: /sys/bus/pci/devices/0000:04:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-07-08 12:16:37.906981: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:04:00.0 name: NVIDIA GeForce RTX 3080 Ti computeCapability: 8.6
coreClock: 1.695GHz coreCount: 80 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 849.46GiB/s
2024-07-08 12:16:37.907029: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2024-07-08 12:16:37.940296: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2024-07-08 12:16:37.975221: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2024-07-08 12:16:37.978857: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2024-07-08 12:16:38.033696: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2024-07-08 12:16:38.046140: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2024-07-08 12:16:38.047257: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:
2024-07-08 12:16:38.047293: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2024-07-08 12:16:38.047688: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-08 12:16:38.056416: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3700035000 Hz
2024-07-08 12:16:38.058111: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x481ed10 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-07-08 12:16:38.058134: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2024-07-08 12:16:38.059240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-07-08 12:16:38.059264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
### Computing predictions from directory with PDB files...
/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/Bio/SeqIO/PdbIO.py:292: BiopythonParserWarning: 'HEADER' line not found; can't determine PDB ID.
  BiopythonParserWarning,
Traceback (most recent call last):
  File "predict.py", line 47, in <module>
    predictor.predict_from_PDB_dir(args.pdb_dir)
  File "/mnt/e/test/test_DeepFRI/DeepFRI/deepfrier/Predictor.py", line 144, in predict_from_PDB_dir
    y = self.model([A, S], training=False).numpy()[:, :, 0].reshape(-1)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py", line 386, in call
    inputs, training=training, mask=mask)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py", line 508, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py", line 386, in call
    inputs, training=training, mask=mask)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py", line 508, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/layers/recurrent.py", line 659, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 110, in call
    output, states = self._process_batch(inputs, initial_state)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 507, in _process_batch
    outputs, h, c, _, _ = gen_cudnn_rnn_ops.cudnn_rnnv2(**args)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 1740, in cudnn_rnnv2
    ctx=_ctx)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 1817, in cudnn_rnnv2_eager_fallback
    attrs=_attrs, ctx=ctx, name=name)
  File "/home/johnnytam100/anaconda3/envs/DeepFRI/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.NotFoundError: Could not find device for node: {{node CudnnRNNV2}} = CudnnRNNV2[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="lstm", seed=0, seed2=0]
All kernels registered for op CudnnRNNV2:
  device='GPU'; T in [DT_DOUBLE]
  device='GPU'; T in [DT_FLOAT]
  device='GPU'; T in [DT_HALF]
 [Op:CudnnRNNV2]
@Rohit-Satyam

Hi, this is more of a TensorFlow error than a Biopython one. Try the installation I recommended here and let me know if it resolves the issue.
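
Reading the log above: the line `Could not load dynamic library 'libcudnn.so.7'` is followed by `Skipping registering GPU devices...`, and the final `NotFoundError` lists only GPU kernels for `CudnnRNNV2`. That is consistent with the model's cuDNN LSTM layer having no device to run on. Below is a minimal sketch for checking this from inside the same conda environment, using only the standard library. The helper names `missing_libraries` and `can_load` are illustrative assumptions, not part of DeepFRI or TensorFlow.

```python
import ctypes
import re

# Matches TensorFlow loader warnings such as:
#   Could not load dynamic library 'libcudnn.so.7'; dlerror: ...
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names the log reports as not loadable (order kept, no repeats)."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name in this environment."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # One warning line copied from the log in this issue.
    log = (
        "W tensorflow/stream_executor/platform/default/dso_loader.cc:59] "
        "Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: "
        "cannot open shared object file: No such file or directory"
    )
    for lib in missing_libraries(log):
        status = "loadable" if can_load(lib) else "NOT loadable"
        print(f"{lib}: {status}")
```

If a library is reported as not loadable, the usual suspects are a missing cuDNN install or an `LD_LIBRARY_PATH` that points at a different CUDA version than the one this TensorFlow build expects. The log shows `/usr/local/cuda-11.7/lib64` on the path while the libraries that do load are version 10.x.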

@Rohit-Satyam

Rohit-Satyam commented Aug 1, 2024

To work around Biopython's inability to parse the PDB header, I am using the following trick:

# Run every ontology on one PDB file at a time, then rename the outputs.
for p in /data/foldseek/af/*.pdb.gz; do
    # Protein name = second '-'-delimited field of the file name
    name=$(basename "$p" | cut -f 2 -d '-')
    for ont in bp mf cc ec; do
        python predict.py -pdb "$p" -ont "$ont" -v
        ONT=$(echo "$ont" | tr '[:lower:]' '[:upper:]')
        # Replace the placeholder protein name, then give the file a per-protein name
        sed -i "s/query_prot/${name}/g" "DeepFRI_${ONT}_predictions.csv"
        mv "DeepFRI_${ONT}_predictions.csv" "${name}.${ONT}.csv"
    done
done
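
A different angle on the warning itself, not the approach used in the loop above: structure predictors often write PDB files without a `HEADER` record, which is what the Biopython message complains about. Below is a minimal sketch that prepends one, assuming the parser reads the ID from columns 63-66 as the PDB format defines. The function name `ensure_header`, the default classification text, and truncating the name to four characters are all illustrative assumptions. Whether this silences the warning may depend on the installed Biopython version, and the traceback in this issue ends in a TensorFlow error, so it would not fix the crash.

```python
def ensure_header(pdb_text, pdb_id, classification="PREDICTED MODEL"):
    """Prepend a minimal HEADER record if the PDB text has none.

    Column layout follows the PDB format: 'HEADER' in columns 1-6,
    classification in 11-50, deposition date in 51-59 (left blank here),
    and the 4-character ID code in columns 63-66.
    """
    if any(line.startswith("HEADER") for line in pdb_text.splitlines()):
        return pdb_text  # already has a HEADER record; leave untouched
    # The ID field holds exactly 4 characters, so longer names are truncated.
    id_code = pdb_id[:4].upper().ljust(4)
    header = (
        "HEADER"
        + " " * 4                          # columns 7-10
        + classification[:40].ljust(40)    # columns 11-50
        + " " * 9                          # columns 51-59, date left blank
        + " " * 3                          # columns 60-62
        + id_code                          # columns 63-66
    )
    return header + "\n" + pdb_text


if __name__ == "__main__":
    # Hypothetical one-atom input, only to show the call.
    atom_line = "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
    print(ensure_header(atom_line, "dom1").splitlines()[0])
```

Applied to a directory, this would mean reading each file, passing its text through `ensure_header`, and writing the result to a copy rather than overwriting the originals.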

@johnnytam100
Author

Hi @Rohit-Satyam! Thank you for helping out!
May I ask where I should run this trick?


@johnnytam100
Author

> Hi, this is more of a TensorFlow error than a Biopython one. Try the installation I recommended here and let me know if it resolves the issue.

By the way, this solution didn't work.

2 participants