DeepEMhancer is a python package designed to perform post-processing of
cryo-EM maps as described in "DeepEMhancer: a deep learning solution for cryo-EM volume post-processing", by Sanchez-Garcia et al., 2021.
DeepEMhancer is a deep learning model trained on pairs of experimental volumes and atomic model-corrected volumes. It produces post-processed maps from raw input volumes, preferably half maps. Please note that post-translational modifications and ligands were not included in the training set; consequently, results for these features could be inaccurate.
Simply speaking, DeepEMhancer performs a non-linear post-processing of cryo-EM maps that produces two main effects:
- Local sharpening-like post-processing.
- Automatic masking/denoising of cryo-EM maps.
- INSTALLATION
- USAGE GUIDE
- EXAMPLES
- TROUBLESHOOTING
To get a complete description of usage, execute deepemhancer -h
- Requirements
- Install from source option
- Install from Anaconda cloud
- Alternative installation for CUDA 10.0
- No conda installation
- Tensorflow 2 installation
DeepEMhancer has been tested on Linux systems. It employs Tensorflow version 1.14, which requires CUDA 10. Our installation recipe will automatically install, among other packages, Tensorflow and CUDA 10.1, so you will need NVIDIA GPU drivers >= 418.39. If your drivers are not compatible and you cannot update them, you can try to compile tensorflow-gpu=1.14 against your own libraries instead of installing it with conda. If your drivers are old but still compatible with CUDA 10.0, see "Alternative installation for CUDA 10.0" or "No conda installation".
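The driver thresholds mentioned in this README (418.39 for the default recipe, 410.48 for the CUDA 10.0 alternative) can be turned into a small helper. The following is only a sketch: the function names are hypothetical and not part of deepEMhancer, and the driver version string is assumed to be the one reported by nvidia-smi.

```python
def parse_version(text):
    """Turn a driver version string such as '418.39' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def suggest_installation(driver_version):
    """Map an NVIDIA driver version to the installation routes named in this README.

    Hypothetical helper: the thresholds are the ones quoted in the README text.
    """
    version = parse_version(driver_version)
    if version >= parse_version("418.39"):
        return "default recipe (Tensorflow 1.14 with CUDA 10.1)"
    if version >= parse_version("410.48"):
        return "Alternative installation for CUDA 10.0, or No conda installation"
    return "update the drivers, or compile tensorflow-gpu=1.14 against your own libraries"


if __name__ == "__main__":
    for candidate in ("450.80.02", "418.39", "410.48", "390.25"):
        print(candidate, "->", suggest_installation(candidate))
```

Versions are compared as tuples of integers so that, for instance, 418.100 would rank above 418.39, which a plain string or float comparison would get wrong.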
This is the best option for staying up to date with the latest changes.
Requires anaconda/miniconda, which can be obtained from https://www.anaconda.com/products/individual
Steps:
- Clone this repository and cd inside
git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer
- Create a conda environment with the required dependencies
conda env create -f deepEMhancer_env.yml -n deepEMhancer_env
- Activate the environment. You always need to activate the environment before executing deepEMhancer
conda activate deepEMhancer_env
- Install deepEMhancer
python -m pip install . --no-deps
- Download our deep learning models
deepemhancer --download
- Ready! Do not forget to activate the environment for future use. For complete help, use:
deepemhancer -h
- Optionally, you can remove the folder, since deepemhancer will be available anywhere once you activate the environment
Requires anaconda/miniconda, which can be obtained from https://www.anaconda.com/products/individual
- Create a fresh conda environment
conda create -n deepEMhancer_env python=3.6
- Activate the environment. You always need to activate the environment before executing deepEMhancer
conda activate deepEMhancer_env
- Install deepEMhancer
conda install deepEMhancer -c rsanchez1369 -c anaconda -c conda-forge
- Download our deep learning models
deepemhancer --download
- Ready! Do not forget to activate the environment for future use. For complete help, use:
deepemhancer -h
This option is only recommended for people with old NVIDIA drivers that are still able to work with CUDA 10.0.
The steps for this option are the same as for "Install from source option", except for step 2. Instead of creating the environment from "deepEMhancer_env.yml",
conda env create -f deepEMhancer_env.yml -n deepEMhancer_env
use "alternative_installation/deepEMhancer_cud10.0.env.yml":
conda env create -f alternative_installation/deepEMhancer_cud10.0.env.yml -n deepEMhancer_env
Problems with cudnn have been reported when using this installation option. Please see TROUBLESHOOTING error 2 for a proposed solution.
Only works with Python 3. Virtualenv is recommended to isolate packages.
- Clone this repository and cd inside
git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer
1.1. Optionally, create a virtual environment and activate it
pip install virtualenv
virtualenv --system-site-packages -p python3 ./deepEMhancer_env
source ./deepEMhancer_env/bin/activate
- Install deepEMhancer (using Tensorflow 1.14)
- For CPU only use (expect running times ~ 24h)
python -m pip install .
- With GPU support
- Install CUDA 10.0 and cudnn >=7.6. Make sure that they are in the LD_LIBRARY_PATH
- install python packages
DEEPEMHANCER_INSTALL_GPU=True pip install .
- Check if GPUs are successfully detected.
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
You will see errors like Could not dlopen library 'libcudart.so.10.0'; dlerror if CUDA and/or cudnn (libcudnn.so.7) are not correctly installed or detected. Conversely, if you see the message
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device
Tensorflow has been able to recognize the GPUs.
- Download our deep learning models
deepemhancer --download
- Ready! Do not forget to activate the environment, if used (step 1.1), for future use. For complete help, use:
deepemhancer -h
- Clone this repository and cd inside
git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer
- Switch to Tensorflow 2 branch
git checkout tf2
- Create a conda environment. You may want to specify an environment name using -n envName
conda env create -f alternative_installation/deepEMhancer_tf2.env.yml
- Activate the environment
conda activate envName
- Install deepEMhancer as a command line tool
pip install --no-deps .
- Download deep learning models
deepemhancer --download
- Modify the original models to be used with tf2 (this only affects the models in the default location)
python alternative_installation/convert_models_to_tf2.py
or, if you want to specify the directory where you downloaded the models,
python alternative_installation/convert_models_to_tf2.py path/where/models/are
- Ready! Do not forget to activate the environment for future use. For complete help, use:
deepemhancer -h
DeepEMhancer was trained using half-maps. Thus, both half-maps are the preferred input (deepemhancer -i half1.mrc -i2 half2.mrc).
Full maps obtained from a refinement process (RELION auto-refine, cryoSPARC heterogeneous refinement...) are equally valid.
However, deepEMhancer will not work correctly if post-processed (masked, sharpened...) maps are provided as input
(e.g. RELION postprocessing maps).
We provide 3 different deep learning models. The default one is the tightTarget model, which was trained using
tightly masked volumes. All the statistics reported in the publication were obtained
using this model. Additionally, we provide a wideTarget model that was trained using less tightly masked maps. Finally,
we have also trained a model (highRes) using a subset of the maps with resolutions <4 Å and fewer empty cubes.
We recommend that users try the different options and choose the one that looks best to them. As guidance,
we suggest employing the highRes model for maps with an overall resolution better than 4 Å and a moderate amount of
low-resolution regions. HighRes solutions tend to be noisier than the others, but also more enhanced.
If the overall resolution is worse, or the number of low-resolution regions is high, the tightTarget
model should do a good job. For cases in which both tightTarget and highRes produce too tightly masked solutions, possibly removing
some parts of the protein as if they were noise, we recommend employing the wideTarget model.
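The rule of thumb above can be written down as a tiny decision function. This is just one reading of the guidance, not code from deepEMhancer: the function and its boolean arguments are hypothetical, and the final choice should still be made by inspecting the resulting maps.

```python
def suggest_model(resolution_angstrom, many_low_res_regions=False, over_masked=False):
    """Return one of the three model names following the guidance in this README.

    resolution_angstrom: overall map resolution (smaller means better).
    many_low_res_regions: True if a large part of the map is poorly resolved.
    over_masked: True if tightTarget and highRes both removed parts of the protein.
    """
    if over_masked:
        return "wideTarget"
    if resolution_angstrom < 4.0 and not many_low_res_regions:
        return "highRes"
    return "tightTarget"


if __name__ == "__main__":
    print(suggest_model(3.2))                             # well-resolved map
    print(suggest_model(3.2, many_low_res_regions=True))  # good overall, patchy locally
    print(suggest_model(6.5))                             # lower overall resolution
    print(suggest_model(3.2, over_masked=True))           # previous runs were over-masked
```

The returned name is the value that would be passed to the -p flag shown in the EXAMPLES section.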
One of the key aspects of successfully employing DeepEMhancer is the normalization of the input volumes.
The default normalization mode, mode 1, normalizes the data such that the statistics of the noise regions
are forced to adopt a mean value of 0 and a standard deviation of 0.1.
If no flag is provided, deepEMhancer will try to automatically determine a spherical shell of noise from
which the statistics of the noise are estimated. This automatic normalization tends to work well, although it
may fail in some cases. For example, hollow proteins or fiber proteins could cause problems.
Alternatively, the user can manually determine the statistics of the noise and provide them to the program using
the flag --noiseStats mean_noise std_noise. One easy way to determine the noise statistics is to employ UCSF Chimera
to crop a noise-only region of the map (Volume Viewer>Features>Region bounds) and then compute the statistics (Volume Viewer>Tools>Volume Mean, SD, RSD).
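To illustrate what forcing the noise to a mean of 0 and a standard deviation of 0.1 means, here is a minimal sketch on a flat list of voxel values. It is not the package's implementation: it assumes a simple linear shift and scale, uses the population standard deviation, and works on made-up numbers rather than a real 3D map.

```python
import statistics

TARGET_NOISE_STD = 0.1  # value stated in this README for normalization mode 1


def noise_stats(noise_voxels):
    """Mean and (population) standard deviation of voxels from a noise-only region."""
    return statistics.mean(noise_voxels), statistics.pstdev(noise_voxels)


def normalize(voxels, mean_noise, std_noise):
    """Shift and scale voxels so the noise has mean 0 and std TARGET_NOISE_STD.

    Assumption: a plain linear transform; the real program may differ in detail.
    """
    if std_noise <= 0:
        raise ValueError("std_noise must be positive")
    scale = TARGET_NOISE_STD / std_noise
    return [(value - mean_noise) * scale for value in voxels]


if __name__ == "__main__":
    noise = [0.10, 0.14, 0.12, 0.09, 0.15]  # made-up noise-only voxel values
    signal = [0.80, 1.10, 0.95]             # made-up protein voxel values
    mean_noise, std_noise = noise_stats(noise)
    # These two numbers play the role of the arguments to --noiseStats
    print("--noiseStats", round(mean_noise, 4), round(std_noise, 4))
    print(normalize(noise + signal, mean_noise, std_noise))
```

After the transform the noise voxels are centered on 0 with spread 0.1, while protein voxels end up several noise standard deviations above that.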
Finally, as an alternative normalization, mode 2 normalizes the input using a binary mask (1 protein, 0 not protein).
This option was introduced to deal with masked maps (which are not suitable for default DeepEMhancer) and is not recommended
when it is possible to employ normalization mode 1.
DeepEMhancer processes input maps by chunking them into smaller cubes that are sent to the GPUs. The batch size parameter represents
the number of cubes that are processed simultaneously by the GPUs. A typical value for an 8 GB GPU could be --batch_size 6.
If an OUT OF MEMORY error happens, try lowering batch_size; if low GPU usage is observed (via nvidia-smi), try
increasing it. Setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH='true' prior to execution could also help to fix some GPU memory errors. When using multiple GPUs, a reported bug affecting batch_size may occur for certain box sizes; please see TROUBLESHOOTING error 3.
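When deepemhancer is launched from a script, the batch size and the TF_FORCE_GPU_ALLOW_GROWTH variable can be set programmatically. The sketch below only assembles the command and environment; the helper names are hypothetical, and it uses only flags that appear in this README (-i, -i2, -p, --batch_size, -o).

```python
import os
import shlex


def build_command(output, input_map, half2=None, batch_size=None, model=None):
    """Assemble a deepemhancer argument list from flags that appear in this README."""
    cmd = ["deepemhancer", "-i", input_map]
    if half2 is not None:
        cmd += ["-i2", half2]
    if model is not None:
        cmd += ["-p", model]
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    cmd += ["-o", output]
    return cmd


def build_env(allow_growth=True):
    """Copy the current environment, optionally enabling dynamic GPU memory growth."""
    env = dict(os.environ)
    if allow_growth:
        env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env


if __name__ == "__main__":
    cmd = build_command("path/to/outputVol.mrc", "half1.mrc",
                        half2="half2.mrc", batch_size=6)
    env = build_env()
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # To actually run it (requires deepemhancer in the active environment):
    # import subprocess
    # subprocess.run(cmd, env=env, check=True)
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces. If an out-of-memory error occurs, the same helper can be called again with a smaller batch_size.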
- Download deep learning models
deepemhancer --download
- Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using default deep model (tightTarget)
deepemhancer -i path/to/inputVol.mrc -o path/to/outputVol.mrc
- Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using softer deep model (wideTarget)
deepemhancer -p wideTarget -i path/to/inputVol.mrc -o path/to/outputVol.mrc
- Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep model
deepemhancer -p highRes -i path/to/inputVol.mrc -o path/to/outputVol.mrc
- Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep learning model located in path/to/deep/learningModel
deepemhancer -p highRes --deepLearningModelDir path/to/deep/learningModel -i path/to/inputVol.mrc -o path/to/outputVol.mrc
- Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep model and providing normalization information (mean and standard deviation of the noise)
deepemhancer -p highRes -i path/to/inputVol.mrc -o path/to/outputVol.mrc --noiseStats 0.12 0.03
- Error:
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
- Explanation: The drivers of your NVIDIA GPU are too old.
- Solution: Update your drivers to version >= 418.39. Alternatively, for driver versions 410.48 to 418.39, you could try the "Alternative installation for CUDA 10.0" or the "No conda installation", or install Tensorflow yourself using your CUDA setup. Although we have not tested it, deepEMhancer will probably also work with older Tensorflow versions that require CUDA 9, so they could also be considered.
- Error:
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv3d_1/convolution}}]]
0 successful operations.
0 derived errors ignored.
or
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
- Explanation: This is a reported Tensorflow issue that often occurs when the GPU runs out of memory. In other cases, it is related to an incompatibility between the CUDA and cudnn versions.
- Solution:
- If it is caused by memory constraints, set dynamic GPU allocation using the environment variable TF_FORCE_GPU_ALLOW_GROWTH='true'. E.g.
TF_FORCE_GPU_ALLOW_GROWTH='true' deepemhancer -i ~/tmp/useCase/EMD-0193.mrc -o ~/tmp/outVolDeepEMhancer/out.mrc
- If it is caused by an incompatibility between CUDA and cudnn, try reinstalling them, ensuring that the CUDA and cudnn versions match and that they are compatible with the Tensorflow version. We are using Tensorflow version 1.14, but we think that older versions compatible with CUDA 9 could also work.
- Error:
F ./tensorflow/core/kernels/conv_2d_gpu.h:935] Non-OK-status: CudaLaunchKernel( SwapDimension1And2InTensor3UsingTiles<T, kNumThreads, kTileSize, kTileSize, conjugate>, total_tiles_count, kNumThreads, 0, d.stream(), input, input_dims, output) status: Internal: invalid configuration argument
Aborted (core dumped)
- Explanation: This is a reported issue for Tensorflow when using multiple GPUs and the number of subcubes or the batch size is not divisible by the number of GPUs.
- Solution: Use only one GPU (-g 1) and/or batch size 1 (-b 1).
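The condition described in the explanation above can be checked before launching a multi-GPU run. This is a hedged sketch: the function is hypothetical, and the number of subcubes is assumed to be known or estimated by the user, since this README does not state how the program chunks a given map.

```python
def multi_gpu_risk(n_subcubes, batch_size, n_gpus):
    """Return True when the setup matches the failure condition described above.

    The reported bug appears with several GPUs when the number of subcubes
    or the batch size is not divisible by the number of GPUs.
    """
    if n_gpus <= 1:
        return False
    return n_subcubes % n_gpus != 0 or batch_size % n_gpus != 0


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only
    for n_subcubes, batch_size, n_gpus in [(27, 6, 2), (64, 6, 2), (27, 5, 1)]:
        risky = multi_gpu_risk(n_subcubes, batch_size, n_gpus)
        print(n_subcubes, batch_size, n_gpus, "->", "risky" if risky else "ok")
```

When the check reports a risky configuration, the fallback is the one given in the solution: a single GPU and/or a batch size of 1.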