Implementation of the antibody CDR conformation classification method for "Conformational analysis of antibody CDR loops upon binding"
The unbound CDR conformation clusters from this publication, both canonical clusters and affinity propagation (AP) clusters grouped as LRCs, are placed in dirs/classifier.
- Each classifier is a scikit-learn AffinityPropagation object, packed as a joblib file; for details, refer to the scikit-learn documentation.
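As a rough illustration of what loading one of these classifiers might look like, the sketch below reads a joblib file and reports a few attributes of the fitted AffinityPropagation object. The helper name `describe_classifier` and the toy model are assumptions made for this example; the actual file names inside dirs/classifier are not listed here, so the sketch builds a stand-in classifier rather than guessing a path.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.cluster import AffinityPropagation


def describe_classifier(path):
    """Load a joblib-serialized AffinityPropagation model and summarize it.

    Hypothetical helper, not part of the repository.
    """
    model = joblib.load(path)
    if not isinstance(model, AffinityPropagation):
        raise TypeError(f"expected AffinityPropagation, got {type(model).__name__}")
    return {
        # indices of the exemplars chosen during fitting
        "n_clusters": len(model.cluster_centers_indices_),
        # one label per sample seen during fitting
        "n_samples": len(model.labels_),
        "affinity": model.affinity,
    }


# Stand-in classifier so the sketch runs without the repository files.
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
toy_model = AffinityPropagation(random_state=0).fit(points)

with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy_classifier.joblib"
    joblib.dump(toy_model, toy_path)
    summary = describe_classifier(toy_path)

print(summary)
```

To inspect a real classifier, pass the path of a joblib file from dirs/classifier to `describe_classifier`. Loading is most reliable with the scikit-learn and joblib versions pinned in the dependency list.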
A JSON file summarizing the LRC groups, Canonical clusters, and AP clusters is placed in dirs/LRC_AP_cluster.json.
To obtain a snapshot of AbDb version 20220926, the version used in this publication:
# install gdown if not installed
$ pip install gdown
$ gdown "https://drive.google.com/uc?id=1kAgSOjYBqb02IIEsc9yhJNiaoWRhpoaL" -O AbDb_20220926.tar.gz
$ tar -zxf AbDb_20220926.tar.gz
- This downloads a tarball of AbDb version 20220926 (size: 2.4G) and extracts it into the current folder as ./abdb_newdata_20220926; you can rename the folder, e.g. to AbDb, if you like.
- You can place this folder anywhere on your machine, but make sure to include a softlink to it in the ./dirs folder (see Dependencies for details).
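The extract-and-link step can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `extract_and_link` is hypothetical, and the demo builds a tiny stand-in tarball instead of downloading the real 2.4G archive.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_and_link(tarball, extract_to, link_path):
    """Extract a .tar.gz archive and symlink its single top-level folder.

    Hypothetical helper, not part of the repository.
    """
    extract_to = Path(extract_to)
    extract_to.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as tar:
        # Collect the top-level folder names found in the archive.
        roots = {Path(n).parts[0] for n in tar.getnames() if Path(n).parts}
        if len(roots) != 1:
            raise ValueError(f"expected one top-level folder, found {sorted(roots)}")
        tar.extractall(extract_to)
    target = (extract_to / roots.pop()).resolve()

    link_path = Path(link_path)
    link_path.parent.mkdir(parents=True, exist_ok=True)
    if link_path.is_symlink():
        link_path.unlink()  # replace a stale link
    link_path.symlink_to(target, target_is_directory=True)
    return target


# Demo on a stand-in archive (the real tarball is AbDb_20220926.tar.gz).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    src = tmp / "src" / "abdb_newdata_20220926"
    src.mkdir(parents=True)
    (src / "placeholder.txt").write_text("stand-in for AbDb files\n")
    tarball = tmp / "AbDb_20220926.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(src, arcname="abdb_newdata_20220926")

    target = extract_and_link(tarball, tmp / "data", tmp / "dirs" / "ABDB")
    link = tmp / "dirs" / "ABDB"
    link_ok = link.is_symlink() and link.resolve() == target
    linked_files = sorted(p.name for p in link.iterdir())

print(link_ok, linked_files)
```

For the real archive, the call would be along the lines of `extract_and_link("AbDb_20220926.tar.gz", "/path/to/data", "./dirs/ABDB")`, with the paths adjusted to your machine.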
- python == 3.9
- numpy==1.23.5
- pandas==2.0.3
- scipy==1.1.0
- scikit-learn==1.3.0
- PyYAML==6.0.1
- joblib==1.3.1
- biopython==1.81
- clustal-omega: the executable path defaults to /usr/local/bin/clustalo in config/classify_general_abdb_entry.yaml; change it to the correct path if necessary.
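Before running the classifier, it can help to confirm that the environment matches the pinned versions and that the clustalo executable can be found. The sketch below is one way to do that with the standard library; `check_pins` and `find_clustalo` are hypothetical helper names, and the script does not read the YAML config, it only checks the default path and then falls back to searching PATH.

```python
import shutil
from importlib import metadata
from pathlib import Path

# Versions copied from the dependency list above.
PINNED = {
    "numpy": "1.23.5",
    "pandas": "2.0.3",
    "scipy": "1.1.0",
    "scikit-learn": "1.3.0",
    "PyYAML": "6.0.1",
    "joblib": "1.3.1",
    "biopython": "1.81",
}


def check_pins(pins):
    """Return {package: (pinned, installed_or_None, matches)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


def find_clustalo(configured="/usr/local/bin/clustalo"):
    """Return the configured path if it exists, else search PATH."""
    if Path(configured).is_file():
        return configured
    return shutil.which("clustalo")


for name, (wanted, installed, ok) in check_pins(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name:<14} pinned={wanted:<8} installed={installed} [{status}]")
print("clustalo:", find_clustalo() or "not found")
```

If clustalo is reported at a different location, that path is the value to set in config/classify_general_abdb_entry.yaml.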
The script classify_general_abdb_entry.py (see Usage) for CDR conformation classification takes an AbDb antibody file as input. You can obtain AbDb from abYbank; the version used in the publication is 20220926.
After downloading AbDb, place it inside ./dirs or simply create a softlink inside ./dirs pointing to it, for example:
cd ./dirs
ln -s /path/to/ABDB ./ABDB
pip install git+https://github.com/biochunan/CDRConformationClassification.git
This installs a command line tool, cdrclu, for CDR conformation classification; see Usage for details.
$ cdrclu --cdr all \
--outdir path/to/folder/output \
--abdb path/to/folder/AbDb/ \
1ikf_0P
Create a python 3.9 environment and install dependencies
# create an environment named cdrclass
$ conda create -n cdrclass python=3.9
$ conda activate cdrclass
# install dependencies
$ cd /path/to/CDRConformationClassification
$ pip install -e .
Run classification on a single AbDb structure, for example 1ikf_0P
$ python classify_general_abdb_entry.py \
--cdr all \
--outdir path/to/folder/output \
--abdb path/to/folder/AbDb \
1ikf_0P
Or use the command line tool cdrclu (see above).
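To classify several AbDb entries in one go, the cdrclu call can be wrapped in a small loop. This is a minimal sketch that uses only the flags shown above (`--cdr`, `--outdir`, `--abdb`, and the positional entry ID); the function names and the `dry_run` switch are assumptions made for this example, not part of the tool.

```python
import shlex
import subprocess


def build_cdrclu_command(entry_id, abdb_dir, outdir, cdr="all"):
    """Assemble the cdrclu argument list for one AbDb entry."""
    return [
        "cdrclu",
        "--cdr", cdr,
        "--outdir", str(outdir),
        "--abdb", str(abdb_dir),
        entry_id,
    ]


def classify_entries(entry_ids, abdb_dir, outdir, dry_run=True):
    """Print (dry run) or execute one cdrclu call per entry ID."""
    commands = [build_cdrclu_command(e, abdb_dir, outdir) for e in entry_ids]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError if cdrclu exits non-zero
            subprocess.run(cmd, check=True)
    return commands


# Extend the list with further AbDb entry IDs as needed.
commands = classify_entries(
    ["1ikf_0P"],
    abdb_dir="path/to/folder/AbDb",
    outdir="path/to/folder/output",
)
```

The dry run prints each command so it can be checked before setting `dry_run=False`, which requires cdrclu to be installed and on PATH.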
This outputs a JSON file in the ./results directory. The file is named 1ikf_0P.json and has the following content:
[
{
"H1": {
"closest_lrc": "H1-10-allT",
"closest_AP_cluster_label": 35,
"closest_AP_cluster_exemplar_id": "4z95_0",
"closest_AP_cluster_size": 105,
"closest_can_cluster_index": 1,
"merged_AP_cluster_label": null,
"merged_AP_cluster_exemplar_id": null,
"merged_AP_cluster_size": null,
"merged_can_cluster_index": null,
"merge_with_closest_exemplar_torsional": true,
"merge_with_any_exemplar_cartesian": null,
"merged": true
}
},
]
Taking the CDR-H1 loop as an example: the classification results are stored in a list of dictionaries, and each dictionary contains the classification results for one CDR loop. The fields are:
- closest_lrc: the closest LRC group in torsional space
- closest_AP_cluster_label: the closest AP cluster label, a unique integer assigned to each AP cluster within an LRC group
- closest_AP_cluster_exemplar_id: the closest AP cluster exemplar ID, i.e. the ID of the structure used as the exemplar for the AP cluster
- closest_AP_cluster_size: the size of the closest AP cluster, i.e. the number of structures in the AP cluster
- closest_can_cluster_index: the closest canonical cluster index, i.e. the index of the canonical cluster that the closest AP cluster belongs to; a unique integer assigned to each canonical cluster within an LRC group
- merged_AP_cluster_label: the merged AP cluster label, a unique integer assigned to each AP cluster within an LRC group
- merged_AP_cluster_exemplar_id: the merged AP cluster exemplar ID, i.e. the ID of the structure used as the exemplar for the AP cluster
- merged_AP_cluster_size: the size of the merged AP cluster, i.e. the number of structures in the AP cluster
- merged_can_cluster_index: the merged canonical cluster index, i.e. the index of the canonical cluster that the merged AP cluster belongs to; a unique integer assigned to each canonical cluster within an LRC group
- merge_with_closest_exemplar_torsional: True if the query CDR conformation is merged with the closest AP cluster measured in torsional space
- merge_with_any_exemplar_cartesian: True if the query CDR conformation is merged with an AP cluster measured in Cartesian space, False otherwise, null if searching in Cartesian space was not carried out, i.e. the query CDR conformation is merged with the closest AP cluster measured in torsional space
- merged: True if the query CDR conformation is merged with an AP cluster, False otherwise
In this case, the query CDR-H1 loop conformation (the CDR-H1 loop in 1ikf_0P) is merged with the closest AP cluster in torsional space, whose exemplar is the CDR-H1 loop in 6azk_0.
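The fields described above can be read back with a few lines of Python. The sketch below parses a result in the documented layout (a list of dictionaries keyed by CDR name) and reports, for each loop, whether it was merged and in which space. The `summarize` helper and its output keys are assumptions made for this example; the sample values are copied from the output shown earlier.

```python
import json

# Sample in the documented layout; values copied from the example output.
SAMPLE = """
[
  {
    "H1": {
      "closest_lrc": "H1-10-allT",
      "closest_AP_cluster_label": 35,
      "closest_AP_cluster_exemplar_id": "4z95_0",
      "closest_AP_cluster_size": 105,
      "closest_can_cluster_index": 1,
      "merged_AP_cluster_label": null,
      "merged_AP_cluster_exemplar_id": null,
      "merged_AP_cluster_size": null,
      "merged_can_cluster_index": null,
      "merge_with_closest_exemplar_torsional": true,
      "merge_with_any_exemplar_cartesian": null,
      "merged": true
    }
  }
]
"""


def summarize(results):
    """Return one summary row per CDR loop found in the results list."""
    rows = []
    for entry in results:
        for cdr, fields in entry.items():
            if fields["merge_with_closest_exemplar_torsional"]:
                space = "torsional"
            elif fields["merge_with_any_exemplar_cartesian"]:
                space = "cartesian"
            else:
                space = None  # not merged with any AP cluster
            rows.append({
                "cdr": cdr,
                "lrc": fields["closest_lrc"],
                "merged": fields["merged"],
                "merge_space": space,
            })
    return rows


rows = summarize(json.loads(SAMPLE))
for row in rows:
    print(row)
```

To summarize a real run, replace `json.loads(SAMPLE)` with `json.load(open(path))` on the JSON file written by the classifier, for example 1ikf_0P.json.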
See the following gif for a visualization of the CDR-H1 loop conformation of 1ikf_0P (light blue) superimposed onto the CDR-H1 loop of the closest AP cluster exemplar 6azk_0 (blue).