Deep cryo-EM Map Enhancer (DeepEMhancer)

DeepEMhancer is a python package designed to perform post-processing of cryo-EM maps as described in "DeepEMhancer: a deep learning solution for cryo-EM volume post-processing", by Sanchez-Garcia et al, 2021.
DeepEMhancer is a deep learning model trained on pairs of experimental volumes and atomic model-corrected volumes that is able to obtain post-processed maps using as input raw volumes, preferably half maps. Please notice that post-translational modifications and ligands were not included in the traning set and consequently, results for these features could be inaccurate.
Simply speaking, DeepEMhancer performs a non-linear post-processing of cryo-EM maps that produces two main effects:

Local sharpening-like post-processing.
Automatic masking/denoising of cryo-EM maps.

Requirements

DeepEMhancer has been tested on Linux systems. It employs Tensorflow version 1.14 that requires CUDA 10. Our installation recipe will automatically install, among other packages, Tensorflow and CUDA 10.1, so you will need NVIDA GPU drivers >= 418.39. If your drivers are not compatible and you cannot update them, you can try to compile tensorflow-gpu=1.14 using your library settings instead of installing it using conda. For those having old drivers but still compatible with CUDA 10.0, see "Alternative installation for CUDA 10.0 compatible systems" or "No conda installation".

Install from source option:

The best option to keep you updated.
Requires anaconda/miniconda, that can be obtained from https://www.anaconda.com/products/individual

Steps:

Clone this repository and cd inside

git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer

Create a conda environment with the required dependencies

conda env create -f deepEMhancer_env.yml  -n deepEMhancer_env

Activate the environment. You always need to activate the environment before executing deepEMhancer

conda activate deepEMhancer_env

Install deepEMhancer

python -m pip install . --no-deps

Download our deep learning models

deepemhancer --download

Ready! Do not forget to activate the environment for future usages. For a complete help use:

deepemhancer -h

Optionally, you can remove the folder, since deepemhancer will be available anywhere once you activate the environment

Install from Anaconda cloud:

Requires anaconda/miniconda, that can be obtained from https://www.anaconda.com/products/individual

Create a fresh conda environment

conda create -n deepEMhancer_env python=3.6

Activate the environment. You always need to activate the environment before executing deepEMhancer

conda activate deepEMhancer_env

Install deepEMhancer

conda install deepEMhancer -c rsanchez1369 -c anaconda -c conda-forge

Download our deep learning models

deepemhancer --download

Ready! Do not forget to activate the environment for future usages. For a complete help use:

deepemhancer -h

Alternative installation for CUDA 10.0 compatible systems

This option is only recommended for people with old NVIDIA drivers that are still able to work with CUDA 10.0.

The steps for this option are exactly the same that for option "Install from source option" with the exception of step 2. Thus, instead of using "deepEMhancer_env.yml" when creating the environment,

conda env create -f deepEMhancer_env.yml  -n deepEMhancer_env

"alternative_installation/deepEMhancer_cud10.0.env.yml" should be used.

conda env create -f alternative_installation/deepEMhancer_cud10.0.env.yml  -n deepEMhancer_env

It has been reported that some problems with cudnn may occur when using this installation option. Please, see TROUBLESHOOTING section 2 for a proposed solution.

No conda installation

Only works for python3. Virtualenv is recommended to isolate packages.

Clone this repository and cd inside

git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer

1.1. Optionally, create a virtual environment and activate it

pip install virtualenv
virtualenv --system-site-packages -p python3 ./deepEMhancer_env
source ./deepEMhancer_env/bin/activate

Install deepEMhancer (using Tensorflow 1.14)

For CPU only use (expect running times ~ 24h)

python -m pip install .

With GPU support
- Install CUDA 10.0 and cudnn >=7.6. Make sure that they are in the LD_LIBRARY_PATH
- install python packages

DEEPEMHANCER_INSTALL_GPU=True pip install .

Check if GPUs are successfully detected.

python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"

You will see errors like Could not dlopen library 'libcudart.so.10.0'; dlerror if CUDA and/or cudnn (libcudnn.so.7) are not correctly installed or detected. On the contrary, if you see the message I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device, Tensorflow has been able to recognize the GPUs.

Download our deep learning models

deepemhancer --download

Ready! Do not forget to activate the environment, if used (step 1.1), for future usages. For a complete help use:

deepemhancer -h

Tensorflow 2 installation

Clone this repository and cd inside

git clone https://github.com/rsanchezgarc/deepEMhancer
cd deepEMhancer

Switch to Tensorflow 2 branch git checkout tf2
Create a conda environment conda env create -f alternative_installation/deepEMhancer_tf2.env.yml. You may want to specificity an environment name using -n envName.
Activate the environment conda activate envName
Install deepEmhancer as a command line tool pip install --no-deps .
Download deep learning models deepemhancer --download
Modify the original models to be used with tf2, python alternative_installation/convert_models_to_tf2.py (only affects the models in the default location) or python alternative_installation/convert_models_to_tf2.py path/where/models/are if you want to specify the directory where you downloaded the models.
Ready! Do not forget to activate the environment, for future usages. For a complete help use:

deepemhancer -h

Usage guide:

About the input

DeepEMhancer was trained using half-maps. Thus, as input, both half-maps are the preferred option (deepemhancer -i half1.mrc -i2 half2.mrc).
Full maps obtained from refinement process (RELION auto-refine, cryoSPARC heterogenus refinement...) are equally valid.
However, deepEMhancer will not work correctly if post-processed (masked, sharpened...) maps are provided as input (e.g. RELION postprocessing maps).

About the deep learning models (-p option)

We provide 3 different deep learning models. The default one is the tightTarget model, that was trained using tightly masked volumes. This is the default option and all the statistics reported in the publication were obtained using this model. Additionally, we provide a wideTarget model that was trained using less tightly masked maps. Finally, we have also trained a model (highRes) using a subset of the maps with resolutions <4 Å and fewer empty cubes.
We recommend our users to try the different options and choose the one that looks nicer to them. As a guidance, we suggest to employ the highRes model for maps with overall resolution better than 4 Å and a moderate amount of bad resolution regions. HighRes solutions tend to be noisier than others, but also more enhanced. If the overall resolution is worse, or the number of low resolution regions is high, the tightTarget model should do a good job. For cases in which both tightTarget and highRes produce too tightly masked solutions, possibly removing some parts of the protein as if they were noise, we recommend to employ the wideTarget model.

About the normaliztion

One of the key aspects to succesfully employ DeepEMhancer is the normalization of the input volumes. The default normalization mode, mode 1, normalizes the data such that the statistics of the noise regions are forced to adopt a mean value of 0 and a standard deviation of 0.1.
If no flag is provided, deepEMhancer will try to automatically determine a spherical shell of noise from which the statistics of the noise are estimated. This automatic normalization tends to work well, although it may fail in some cases. For example, hollow proteins or fiber proteins could cause problems.
Alternatively, the user can manually determine the statistics of the noise and provide them to the program using the flag --noiseStats mean_noise std_noise. One easy way to determine the noise statistics is to employ UCSF Chimera to crop a noise-only region of the map (Volume Viewer>Features>Region bounds) and then compute the statistics (Volume Viewer>Tools>Volume Mean, SD, RSD).
Finally, as an alternative normalization, mode 2 normalizes the input using a binary mask (1 protein, 0 not protein). This option was introduced to deal with masked maps (which are not suitable for default DeepEMhancer) and is not recommended when it is possible to employ normalization mode 1.

About the batch size

DeepEMhancer processes input maps by chunking them into smaller cubes that are sent to GPUs. Batch size parameter represent the number of smaller cubes that are simultaneously processed by the GPUs. A typical value for an 8 GB GPU could be
--batch_size 6. If OUT OF MEMORY error happens, try to lower batch_size, and if low GPU usage is observed (via nvidia-smi), try to increase it. Setting the environmental variable TF_FORCE_GPU_ALLOW_GROWTH='true' prior execution could also help to fix some GPU memory errors. When using multiple GPUs, for certain box sizes, there might happen a reported bug affecting the batch_size, please see TROUBLESHOOTING error 3.

Examples

Download deep learning models

deepemhancer --download

Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using default deep model (tightTarget)

deepemhancer  -i path/to/inputVol.mrc -o  path/to/outputVol.mrc

Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using softer deep model (wideTarget)

deepemhancer -p wideTarget -i path/to/inputVol.mrc -o  path/to/outputVol.mrc

Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep model

deepemhancer -p highRes -i path/to/inputVol.mrc -o  path/to/outputVol.mrc

Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep learning model located in path/to/deep/learningModel

deepemhancer -p highRes  --deepLearningModelDir path/to/deep/learningModel -i path/to/inputVol.mrc -o  path/to/outputVol.mrc

Post-process input map path/to/inputVol.mrc and save it at path/to/outputVol.mrc using high resolution deep model and providing normalization information (mean and standard deviation of the noise)

deepemhancer -p highRes -i path/to/inputVol.mrc -o  path/to/outputVol.mrc --noiseStats 0.12 0.03

TROUBLESHOOTING

Error:

tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

Explanation: The drivers of your NVIDA GPU are too old.
Solution: Update your drivers to version >= 418.39. Alternatively, for driver versions 410.48 to 418.39 you could try the "Alternative installation for Nvida-Driver 410", the "No conda installation" or install yourself Tensorflow using your CUDA setup. Although we have not tested it, deepEMhancer will probably also work with older Tensorflow versions that require CUDA 9, so they could be also considered.

Error:

  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node conv3d_1/convolution}}]]
0 successful operations.
0 derived errors ignored.

or

E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Explanation: This is a reported issue for Tensorflow, and many times occurs when getting out of GPU memory. In other cases is related with an incompatibility between CUDA and cudnn versions.
Solution:
- If it is caused by memory constraints, set dynamic GPU allocation using the environment variable TF_FORCE_GPU_ALLOW_GROWTH='true'. E.g. TF_FORCE_GPU_ALLOW_GROWTH='true' deepemhancer -i ~/tmp/useCase/EMD-0193.mrc -o ~/tmp/outVolDeepEMhancer/out.mrc
- If it is caused by incompatibility between CUDA and cudnn, you should try to reinstall it ensuring that CUDA and cudnn versions match and they are compatible with the Tensorflow version. We are using Tensorflow version 14, but we think that older versions, compatible with CUDA 9 could also work.

Error:

F ./tensorflow/core/kernels/conv_2d_gpu.h:935] Non-OK-status: CudaLaunchKernel( SwapDimension1And2InTensor3UsingTiles<T, kNumThreads, kTileSize, kTileSize, conjugate>, total_tiles_count, kNumThreads, 0, d.stream(), input, input_dims, output) status: Internal: invalid configuration argument

Aborted (core dumped)

Explanation: This is a reported issue for Tensorflow when using multiple GPUS and the number of subcubes or the batch size is not divisible by the number of GPUs
Solution: Use only one GPU (-g 1) and/or batch size 1 (-b 1)

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
alternative_installation		alternative_installation
condaDeepEMHancer		condaDeepEMHancer
deepEMhancer		deepEMhancer
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
deepEMhancer_env.yml		deepEMhancer_env.yml
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Deep cryo-EM Map Enhancer (DeepEMhancer)

Table of Contents

INSTALLATION:

Requirements

Install from source option:

Install from Anaconda cloud:

Alternative installation for CUDA 10.0 compatible systems

No conda installation

Tensorflow 2 installation

Usage guide:

About the input

About the deep learning models (-p option)

About the normaliztion

About the batch size

Examples

TROUBLESHOOTING

About

Releases

Packages

Languages

License

sparrow2015/deepEMhancer

Folders and files

Latest commit

History

Repository files navigation

Deep cryo-EM Map Enhancer (DeepEMhancer)

Table of Contents

INSTALLATION:

Requirements

Install from source option:

Install from Anaconda cloud:

Alternative installation for CUDA 10.0 compatible systems

No conda installation

Tensorflow 2 installation

Usage guide:

About the input

About the deep learning models (-p option)

About the normaliztion

About the batch size

Examples

TROUBLESHOOTING

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages