checking readmes
PaulaKramer committed Aug 5, 2024
1 parent cc51094 commit ba2b845
Showing 9 changed files with 30 additions and 29 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -10,12 +10,14 @@ You can retrieve the repository state for the published KinFragLib paper in rele

## Table of contents

- [Description](#description)
- [Repository content](#repository-content)
- [Description](#description)
- [Quick start](#quick-start)
- [Contact](#contact)
- [License](#license)
- [Citation](#citation)
- [List of publications](#list-of-publications)


## Repository content

@@ -25,9 +27,9 @@ This repository holds the following resources:
2. *Quick start* notebook explaining how to load and use the fragment library.
3. Notebooks

3.1. Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
3.1. *KinFragLib*: Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
the corresponding paper.
3.2. Notebooks providing a custom filtering framework to reduce the fragment library size.
3.2. *CustomKinFragLib*: Notebooks providing a custom filtering framework to reduce the fragment library size.

Please find detailed descriptions of files in `data/` and `notebooks/` in the folders' `README` files.

@@ -52,7 +54,7 @@ Following this approach, a fragment library is created with respective subpocket
an in-depth analysis of the chemical space of known kinase inhibitors and can be used to enumerate recombined
fragments in order to generate novel potential inhibitors.

We have added an extension with *CustomKinFragLib* which provides a pipeline to filter the fragments in KinFragLib checking for unwanted substructures (PAINS and Brenk et al.), lead-/drug-likeness (Rule of Three and QED), synthesizability (similarity to buyable building blocks and SYBA) and pairwise retrosynthesizability. Each filter can be (de-)activated and the parameters can be modified by the user to create a customized filtered fragment library.
We have added an extension with *CustomKinFragLib* which provides a pipeline to filter the fragments in KinFragLib checking for unwanted substructures (PAINS and Brenk et al.), drug-likeness (Rule of Three and QED), synthesizability (similarity to buyable building blocks and SYBA) and pairwise retrosynthesizability. Each filter can be (de-)activated and the parameters can be modified by the user to create a customized filtered fragment library.

## Quick start

2 changes: 1 addition & 1 deletion data/filters/retrosynthesizability/README.md
@@ -2,4 +2,4 @@

- `retro.txt`: File containing the results that were already assembled in previous queries for fragment pairs or those that will be queried in future searches. It contains \[pair SMILES\]; \[child(ren) 1 SMILES\]; \[child(ren) 2 SMILES\]; \[plausibility/ies\] for every requested fragment pair.

**Note:** The ASKCOS results for the given KinFragLib data are already precomputed, thus for that ASKCOS does not need to be installed to successfully run this notebook. However, if the notebook is executed on new data, the ASKCOS needs to be installed beforehand. To install ASKCOS, please follow the installation given at [https://askcos-docs.mit.edu/](https://askcos-docs.mit.edu/guide/1-Introduction/1.1-Introduction.html).
**Note:** The ASKCOS results for the given KinFragLib data are already precomputed, thus ASKCOS does not need to be installed to successfully run this notebook. However, if the notebook is executed on new data, the ASKCOS needs to be installed beforehand. To install ASKCOS, please follow the installation given at [https://askcos-docs.mit.edu/](https://askcos-docs.mit.edu/guide/1-Introduction/1.1-Introduction.html).
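
The `retro.txt` layout described in this README (four `;`-separated fields per fragment pair) could be read along the following lines. This is only a sketch: the exact delimiter inside multi-valued fields is not stated in the README, so a comma is assumed here, and `RetroRecord`, `parse_retro_line` and the sample line are hypothetical illustrations rather than part of KinFragLib.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetroRecord:
    """One requested fragment pair (field names are illustrative)."""
    pair_smiles: str
    children_1: List[str]
    children_2: List[str]
    plausibilities: List[float]


def _split_values(field: str) -> List[str]:
    # Assumption: several children/plausibilities are comma-separated.
    return [item.strip() for item in field.split(",") if item.strip()]


def parse_retro_line(line: str) -> RetroRecord:
    """Parse '[pair]; [child(ren) 1]; [child(ren) 2]; [plausibility/ies]'."""
    parts = [part.strip() for part in line.split(";")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 ';'-separated fields, got {len(parts)}")
    pair, child_1, child_2, plausibility = parts
    return RetroRecord(
        pair_smiles=pair,
        children_1=_split_values(child_1),
        children_2=_split_values(child_2),
        plausibilities=[float(value) for value in _split_values(plausibility)],
    )


# Invented sample line, only to exercise the parser.
sample = "CCOc1ccccc1; CCO, CCBr; Oc1ccccc1; 0.98, 0.75"
record = parse_retro_line(sample)
```

Splitting on `;` first keeps bracketed SMILES atoms such as `[NH3+]` intact, because nothing is stripped from the individual values beyond surrounding whitespace.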
2 changes: 1 addition & 1 deletion data/fragment_library/README.md
@@ -1,6 +1,6 @@
# KinFragLib: Full fragment library

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of about 3,000 fragments,
The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises about 3,000 fragments,
which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.

## Fragment library
10 changes: 5 additions & 5 deletions data/fragment_library_custom_filtered/README.md
@@ -1,16 +1,16 @@
# Custom filtered fragment library

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 7486 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).

To reduce the fragment library size and enable the recombination avoiding combinatorial explosion and to increase the chance of synthesizability of the newly created molecules, the fragment library can now be filtered by customizable filtering steps, namely:
To reduce the fragment library size and enable the recombination avoiding combinatorial explosion and increase the chance of synthesizability of the newly created molecules, the fragment library can now be filtered by customizable filtering steps, namely:

1. Pre-filtering (Remove pool X, deduplicate, remove unfragmented fragments, remove fragments only connecting to pool X and fragments in pool X) \[mandatory\]
2. Filter for unwanted substructures (PAINS and Brenk et al.) \[optional\]
3. Filter for drug likeness (Ro3 and QED) \[optional\]
3. Filter for drug-likeness (Ro3 and QED) \[optional\]
4. Filter for synthesizability (Buyable building blocks and SYBA) \[optional\]
5. Filter for pairwise retrosynthesizability (using ASKCOS) \[optional\]

- `custom_filter_results.csv`: File containing the filtering results, including per fragment, from the pre-filtered library, the SMILES and subpocket as indices, the calculated scores and boolean columns, if a fragment passes a specific filter, generated by the filtering steps.
- `AP.sdf`, `FP.sdf`, `GA.sdf`, `SE.sdf`, `B1.sdf`, and `B2.sdf`: custom filtered fragment library organized by subpocket (as decribed in `data/fragment_library`)
- `AP.sdf`, `FP.sdf`, `GA.sdf`, `SE.sdf`, and `B1.sdf`: custom filtered fragment library organized by subpocket (as described in `data/fragment_library`)

Please refer to the notebook `notebooks/custom_kinfraglib/2_1_custom_filters_pipeline.ipynb` to check how the data was generated and/or to generate your own custom fragment library (de-)activating filters and modifying the filtering parameters.
Please refer to the notebook `notebooks/custom_kinfraglib/2_1_custom_filters_pipeline.ipynb` to check how the data was generated and/or to generate your own filtered fragment library (de-)activating filters and modifying the filtering parameters.
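
The idea of optional filters that can be switched on or off, each producing a boolean column per fragment, could be sketched as follows. This is not the pipeline from `2_1_custom_filters_pipeline.ipynb`: the descriptor keys, the `bool_` column prefix, `run_filters`, and the QED cut-off of 0.5 are assumptions made for illustration, and the Rule of Three thresholds are the commonly cited ones, which may differ from the project's parameters.

```python
from typing import Callable, Dict, List

# Hypothetical representation: one dict of precomputed descriptors per fragment.
Fragment = Dict[str, float]
Filter = Callable[[Fragment], bool]


def passes_rule_of_three(frag: Fragment) -> bool:
    """Commonly cited Ro3 limits (MW, logP, H-bond donors/acceptors)."""
    return (
        frag["mol_weight"] <= 300
        and frag["logp"] <= 3
        and frag["h_bond_donors"] <= 3
        and frag["h_bond_acceptors"] <= 3
    )


def make_qed_filter(min_qed: float) -> Filter:
    """Build a QED filter with a user-chosen cut-off."""
    return lambda frag: frag["qed"] >= min_qed


def run_filters(
    fragments: List[Fragment],
    filters: Dict[str, Filter],
    enabled: Dict[str, bool],
) -> List[Dict[str, bool]]:
    """Apply only the enabled filters; record one boolean per filter."""
    rows = []
    for frag in fragments:
        row = {
            f"bool_{name}": check(frag)
            for name, check in filters.items()
            if enabled.get(name, True)  # filters are active unless switched off
        }
        row["passes_all"] = all(row.values())
        rows.append(row)
    return rows


fragments = [
    {"mol_weight": 250.0, "logp": 2.1, "h_bond_donors": 1, "h_bond_acceptors": 3, "qed": 0.6},
    {"mol_weight": 340.0, "logp": 2.8, "h_bond_donors": 2, "h_bond_acceptors": 3, "qed": 0.7},
]
filters = {"ro3": passes_rule_of_three, "qed": make_qed_filter(0.5)}  # 0.5 is arbitrary

all_on = run_filters(fragments, filters, enabled={})
qed_off = run_filters(fragments, filters, enabled={"qed": False})
```

Keeping every filter as a plain function of one fragment means a deactivated filter simply contributes no column, so downstream code can tell "not run" apart from "failed".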
4 changes: 2 additions & 2 deletions data/fragment_library_filtered/README.md
@@ -1,8 +1,8 @@
# Filtered fragment library

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).

In order to prepare a library with fragments tailored for recombination, we offer heare a filtered fragment library (2447 fragments) based on the following filters:
In order to prepare a library with fragments tailored for recombination, we offer here a filtered fragment library (2447 fragments) based on the following filters:

1. Remove pool X
2. Deduplicate fragment library (per subpocket)
18 changes: 9 additions & 9 deletions data/fragment_library_old/README.md
@@ -16,13 +16,13 @@ Please download the previous KinFragLib version from zenodo ([https://zenodo.org

Fragments are organized by the subpockets they occupy. Each fragment subpocket pool is stored in an SDF file:

`AP.sdf`
`FP.sdf`
`GA.sdf`
`SE.sdf`
`B1.sdf`
`B2.sdf`
`X.sdf`
AP.sdf
FP.sdf
GA.sdf
SE.sdf
B1.sdf
B2.sdf
X.sdf

Each fragment contains the following information:

@@ -36,13 +36,13 @@ co-crystallized with.
- `atom.prop.subpocket`: Subpocket assignment for each of the fragment's atoms.
- `atom.prop.environment`: BRICS environment IDs for each of the fragment's atoms.

Please refer to `notebooks/1_1_quick_start.ipynb` on how to load and work with this dataset.
Please refer to `notebooks/kinfraglib/1_1_quick_start.ipynb` on how to load and work with this dataset.
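
For a quick look at a subpocket pool without the notebook environment, the SDF files can also be inspected with the standard library alone. The sketch below relies only on two facts about the SDF format: records end with a `$$$$` line, and data items start with a `> <name>` header followed by value lines and a blank line. The helper names and the toy records are made up for illustration, and the value layout shown for `atom.prop.subpocket` is an assumption; the quick-start notebook remains the reference for loading the real data.

```python
from typing import Dict, Iterator, List


def iter_sdf_records(text: str) -> Iterator[str]:
    """Yield one text block per molecule; each record ends with '$$$$'."""
    record: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield "\n".join(record)
            record = []
        else:
            record.append(line)


def read_properties(record: str) -> Dict[str, str]:
    """Collect the '> <name>' data items of one SDF record into a dict."""
    props: Dict[str, str] = {}
    lines = record.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            name = line[line.index("<") + 1 : line.rindex(">")]
            values: List[str] = []
            i += 1
            # Value lines run until the next blank line (or end of record).
            while i < len(lines) and lines[i].strip():
                values.append(lines[i])
                i += 1
            props[name] = "\n".join(values)
        i += 1
    return props


# Toy two-record SDF text with empty molblocks (invented, not real fragments).
TOY_SDF = """\
frag_1
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <atom.prop.subpocket>
AP AP

$$$$
frag_2
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <atom.prop.subpocket>
FP

$$$$
"""

records = list(iter_sdf_records(TOY_SDF))
subpockets = [read_properties(r)["atom.prop.subpocket"] for r in records]
```

With a real file, the same functions would be called on `Path("AP.sdf").read_text()`; building actual molecule objects still requires a cheminformatics toolkit such as `rdkit`.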

## Original ligands

Original ligands that are composed of the fragments in the full fragment library are stored as a CSV file:

`original_ligands.csv`
original_ligands.csv

Each ligand contains the following information:

2 changes: 1 addition & 1 deletion data/fragment_library_reduced/README.md
@@ -1,6 +1,6 @@
# Reduced fragment library

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 9505 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).

In order to demonstrate how this library can be used for recombining ligands, we offer here a reduced fragment library (727 fragments) based on the following filters:

2 changes: 1 addition & 1 deletion notebooks/custom_kinfraglib/README.md
@@ -8,7 +8,7 @@ This notebook filters out fragments not fulfilling the Rule of Three and the Qua
of Druglikeness (QED), which both reflect the molecular properties of the fragments.
### `1_3_custom_filters_synthesizability.ipynb`
This notebook filters the fragments for synthesizability using a buyable building block
filter and the SYnthetic Bayesian Accessibility (SYBA).
filter and the SYnthetic Bayesian Accessibility (SYBA) score.
### `1_4_custom_filters_pairwise_retrosynthesizability.ipynb`
This notebook builds fragment pairs using only those fragments that passed all custom filtering steps.
Next, it uses the ASKCOS API to check if a one-step retrosynthetic route for this pair can be found and children, building this fragment pair, are returned from ASKCOS.
9 changes: 4 additions & 5 deletions notebooks/kinfraglib/README.md
@@ -1,6 +1,5 @@
# KinFragLib notebooks

Overview on notebook content.
Overview of notebook content.

## 1. Quick start

@@ -50,15 +49,15 @@ The aim of this notebook is to extract information from the combinatorial librar

### `4_2_combinatorial_library_properties.ipynb`

In this notebook, we want to analyze properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.
In this notebook, we want to analyze the properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.

### `4_3_combinatorial_library_comparison_klifs.ipynb`

In this notebook, we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates from. We consider exact and substructure matches.
In this notebook, we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates. We consider exact and substructure matches.

### `4_4_combinatorial_library_comparison_chembl.ipynb`

In this notebook, we want to compare the combinatorial library to the ChEMBL 33 dataset in order to find exact matches and the most similar ChEMBL molecule per recombined ligand.

### `4_5_combinatorial_library_consrtuct_ligand.ipynb`
In this notebook, we showcase how the molecules described via fragment and bond indices the combinatorial library can be build into `rdkit` molecule objects.
In this notebook, we showcase how the molecules described via fragment and bond indices from the combinatorial library can be built into `rdkit` molecule objects.
