diff --git a/README.md b/README.md
index 178ab074..475db58c 100644
--- a/README.md
+++ b/README.md
@@ -10,12 +10,14 @@ You can retrieve the repository state for the published KinFragLib paper in rele
 ## Table of contents
-- [Description](#description)
 - [Repository content](#repository-content)
+- [Description](#description)
 - [Quick start](#quick-start)
 - [Contact](#contact)
 - [License](#license)
 - [Citation](#citation)
+- [List of publications](#list-of-publications)
+
 ## Repository content
@@ -25,9 +27,9 @@ This repository holds the following resources:
 2. *Quick start* notebook explaining how to load and use the fragment library.
 3. Notebooks
- 3.1. Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
+ 3.1. *KinFragLib*: Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
 the corresponding paper.
- 3.2. Notebooks providing a custom filtering framework to reduce the fragment library size.
+ 3.2. *CustomKinFragLib*: Notebooks providing a custom filtering framework to reduce the fragment library size.
 Please find detailed descriptions of files in `data/` and `notebooks/` in the folders' `README` files.
@@ -52,7 +54,7 @@ Following this approach, a fragment library is created with respective subpocket
 an in-depth analysis of the chemical space of known kinase inhibitors and can be used to enumerate recombined fragments in order to generate novel potential inhibitors.
-We have added an extension with *CustomKinFragLib* which provides a pipeline to filter the fragments in KinFragLib checking for unwanted substructures (PAINS and Brenk et al.), lead-/drug-likeness (Rule of Three and QED), synthesizability (similarity to buyable building blocks and SYBA) and pairwise retrosynthesizability. Each filter can be (de-)activated and the parameters can be modified by the user to create a customized filtered fragment library.
+We have added an extension with *CustomKinFragLib* which provides a pipeline to filter the fragments in KinFragLib checking for unwanted substructures (PAINS and Brenk et al.), drug-likeness (Rule of Three and QED), synthesizability (similarity to buyable building blocks and SYBA) and pairwise retrosynthesizability. Each filter can be (de-)activated and the parameters can be modified by the user to create a customized filtered fragment library.
 ## Quick start
diff --git a/data/filters/retrosynthesizability/README.md b/data/filters/retrosynthesizability/README.md
index 0c7e487c..304cba9d 100644
--- a/data/filters/retrosynthesizability/README.md
+++ b/data/filters/retrosynthesizability/README.md
@@ -2,4 +2,4 @@
 - `retro.txt`: File containing the results that were already assembled in previous queries for fragment pairs or those that will be queried in future searches. It contains \[pair SMILES\]; \[child(ren) 1 SMILES\]; \[child(ren) 2 SMILES\]; \[plausibility/ies\] for every requested fragment pair.
-**Note:** The ASKCOS results for the given KinFragLib data are already precomputed, thus for that ASKCOS does not need to be installed to successfully run this notebook. However, if the notebook is executed on new data, the ASKCOS needs to be installed beforehand. To install ASKCOS, please follow the installation given at [https://askcos-docs.mit.edu/](https://askcos-docs.mit.edu/guide/1-Introduction/1.1-Introduction.html).
\ No newline at end of file
+**Note:** The ASKCOS results for the given KinFragLib data are already precomputed, thus ASKCOS does not need to be installed to successfully run this notebook. However, if the notebook is executed on new data, the ASKCOS needs to be installed beforehand. To install ASKCOS, please follow the installation given at [https://askcos-docs.mit.edu/](https://askcos-docs.mit.edu/guide/1-Introduction/1.1-Introduction.html).
\ No newline at end of file
diff --git a/data/fragment_library/README.md b/data/fragment_library/README.md
index 54fe774f..62c04e2a 100644
--- a/data/fragment_library/README.md
+++ b/data/fragment_library/README.md
@@ -1,6 +1,6 @@
 # KinFragLib: Full fragment library
-The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of about 3,000 fragments,
+The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises about 3,000 fragments,
 which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.
 ## Fragment library
diff --git a/data/fragment_library_custom_filtered/README.md b/data/fragment_library_custom_filtered/README.md
index 56dd4302..0c5c6185 100644
--- a/data/fragment_library_custom_filtered/README.md
+++ b/data/fragment_library_custom_filtered/README.md
@@ -1,16 +1,16 @@
 # Custom filtered fragment library
-The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 7486 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
+The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
-To reduce the fragment library size and enable the recombination avoiding combinatorial explosion and to increase the chance of synthesizability of the newly created molecules, the fragment library can now be filtered by customizable filtering steps, namely:
+To reduce the fragment library size and enable the recombination avoiding combinatorial explosion and increase the chance of synthesizability of the newly created molecules, the fragment library can now be filtered by customizable filtering steps, namely:
 1. Pre-filtering (Remove pool X, deduplicate, remove unfragmented fragments, remove fragments only connecting to pool X and fragments in pool X) \[mandatory\]
 2. Filter for unwanted substructures (PAINS and Brenk et al.) \[optional\]
-3. Filter for drug likeness (Ro3 and QED) \[optional\]
+3. Filter for drug-likeness (Ro3 and QED) \[optional\]
 4. Filter for synthesizability (Buyable building blocks and SYBA) \[optional\]
 5. Filter for pairwise retrosynthesizability (using ASKCOS) \[optional\]
 - `custom_filter_results.csv`: File containing the filtering results, including per fragment, from the pre-filtered library, the SMILES and subpocket as indices, the calculated scores and boolean columns, if a fragment passes a specific filter, generated by the filtering steps.
-- `AP.sdf`, `FP.sdf`, `GA.sdf`, `SE.sdf`, `B1.sdf`, and `B2.sdf`: custom filtered fragment library organized by subpocket (as decribed in `data/fragment_library`)
+- `AP.sdf`, `FP.sdf`, `GA.sdf`, `SE.sdf`, and `B1.sdf`: custom filtered fragment library organized by subpocket (as described in `data/fragment_library`)
-Please refer to the notebook `notebooks/custom_kinfraglib/2_1_custom_filters_pipeline.ipynb` to check how the data was generated and/or to generate your own custom fragment library (de-)activating filters and modifying the filtering parameters.
+Please refer to the notebook `notebooks/custom_kinfraglib/2_1_custom_filters_pipeline.ipynb` to check how the data was generated and/or to generate your own filtered fragment library (de-)activating filters and modifying the filtering parameters.
diff --git a/data/fragment_library_filtered/README.md b/data/fragment_library_filtered/README.md
index 73d44d08..ef663f0b 100644
--- a/data/fragment_library_filtered/README.md
+++ b/data/fragment_library_filtered/README.md
@@ -1,8 +1,8 @@
 # Filtered fragment library
-The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
+The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
-In order to prepare a library with fragments tailored for recombination, we offer heare a filtered fragment library (2447 fragments) based on the following filters:
+In order to prepare a library with fragments tailored for recombination, we offer here a filtered fragment library (2447 fragments) based on the following filters:
 1. Remove pool X
 2. Deduplicate fragment library (per subpocket)
diff --git a/data/fragment_library_old/README.md b/data/fragment_library_old/README.md
index 843f7456..62ff466e 100644
--- a/data/fragment_library_old/README.md
+++ b/data/fragment_library_old/README.md
@@ -16,13 +16,13 @@ Please download the previous KinFragLib version from zenodo ([https://zenodo.org
 Fragments are organized by the subpockets they occupy. Each fragment subpocket pool is stored in an SDF file:
- `AP.sdf`
- `FP.sdf`
- `GA.sdf`
- `SE.sdf`
- `B1.sdf`
- `B2.sdf`
- `X.sdf`
+ AP.sdf
+ FP.sdf
+ GA.sdf
+ SE.sdf
+ B1.sdf
+ B2.sdf
+ X.sdf
 Each fragment contains the following information:
@@ -36,13 +36,13 @@ co-crystallized with.
 - `atom.prop.subpocket`: Subpocket assignment for each of the fragment's atoms.
 - `atom.prop.environment`: BRICS environment IDs for each of the fragment's atoms.
-Please refer to `notebooks/1_1_quick_start.ipynb` on how to load and work with this dataset.
+Please refer to `notebooks/kinfraglib/1_1_quick_start.ipynb` on how to load and work with this dataset.
 ## Original ligands
 Original ligands that are composed of the fragments in the full fragment library are stored as a CSV file:
- `original_ligands.csv`
+ original_ligands.csv
 Each ligand contains the following information:
diff --git a/data/fragment_library_reduced/README.md b/data/fragment_library_reduced/README.md
index 02adcb85..31e46bbd 100644
--- a/data/fragment_library_reduced/README.md
+++ b/data/fragment_library_reduced/README.md
@@ -1,6 +1,6 @@
 # Reduced fragment library
-The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
+The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
 In order to demonstrate how this library can be used for recombining ligands, we offer here a reduced fragment library (727 fragments) based on the following filters:
diff --git a/notebooks/custom_kinfraglib/README.md b/notebooks/custom_kinfraglib/README.md
index fbf92c7d..611ad9ea 100644
--- a/notebooks/custom_kinfraglib/README.md
+++ b/notebooks/custom_kinfraglib/README.md
@@ -8,7 +8,7 @@ This notebook filters out fragments not fulfilling the Rule of Three and the Qua
 of Druglikeness (QED), which both reflect the molecular properties of the fragments.
 ### `1_3_custom_filters_synthesizability.ipynb`
 This notebook filters the fragments for synthesizability using a buyable building block
-filter and the SYnthetic Bayesian Accessibility (SYBA).
+filter and the SYnthetic Bayesian Accessibility (SYBA) score.
 ### `1_4_custom_filters_pairwise_retrosynthesizability.ipynb`
 This notebook builds fragment pairs using only those fragments that passed all custom filtering steps. Next, it uses the ASKCOS API to check if a one-step retrosynthetic route for this pair can be found and children, building this fragment pair, are returned from ASKCOS.
diff --git a/notebooks/kinfraglib/README.md b/notebooks/kinfraglib/README.md
index 9185b8e4..29763428 100644
--- a/notebooks/kinfraglib/README.md
+++ b/notebooks/kinfraglib/README.md
@@ -1,6 +1,5 @@
 # KinFragLib notebooks
-
-Overview on notebook content.
+Overview of notebook content.
 ## 1. Quick start
@@ -50,15 +49,15 @@ The aim of this notebook is to extract information from the combinatorial librar
 ### `4_2_combinatorial_library_properties.ipynb`
-In this notebook, we want to analyze properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.
+In this notebook, we want to analyze the properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.
 ### `4_3_combinatorial_library_comparison_klifs.ipynb`
-In this notebook, we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates from. We consider exact and substructure matches.
+In this notebook, we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates. We consider exact and substructure matches.
 ### `4_4_combinatorial_library_comparison_chembl.ipynb`
 In this notebook, we want to compare the combinatorial library to the ChEMBL 33 dataset in order to find exact matches and the most similar ChEMBL molecule per recombined ligand.
 ### `4_5_combinatorial_library_consrtuct_ligand.ipynb`
-In this notebook, we showcase how the molecules described via fragment and bond indices the combinatorial library can be build into `rdkit` molecule objects.
\ No newline at end of file
+In this notebook, we showcase how the molecules described via fragment and bond indices from the combinatorial library can be built into `rdkit` molecule objects.
\ No newline at end of file