Releases: openkinome/kinodata
Add ChEMBL 33
- Update to ChEMBL 33
- Removal of kinase.com as a data source
- Updated UniProt API calls
- Both non-curated and curated CSVs are provided as well as a sample of the first 100 data points for testing
- EGFR kinase bioactivities based on the curated CSV as well as a sample of the first 100 data points for testing
Datapoints in each dataset:
Dataset | Non-curated | Curated | Sample | EGFR Kinase | EGFR Sample |
---|---|---|---|---|---|
ChEMBL 27 | 217 612 | 174 238 | - | - | - |
ChEMBL 28 | 237 336 | 186 972 | - | - | - |
ChEMBL 29 | 242 609 | 190 634 | 100 | 6 509 | 100 |
ChEMBL 30 | 252 191 | 197 073 | 100 | 6 733 | 100 |
ChEMBL 33 | 272 922 | 211 607 | 100 | 7 287 | 100 |
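The counts in the table use a space as the thousands separator and `-` for missing entries. As a rough sketch (not part of the kinodata code; `parse_count` is a hypothetical helper), the cells can be parsed and the share of datapoints that survive curation computed like this:

```python
from typing import Optional


def parse_count(cell: str) -> Optional[int]:
    """Convert a table cell such as '272 922' to an int; '-' means no data."""
    cell = cell.strip()
    if cell == "-":
        return None
    # Drop the space used as a thousands separator before converting.
    return int(cell.replace(" ", ""))


# Counts copied from the ChEMBL 33 row of the table above.
non_curated = parse_count("272 922")
curated = parse_count("211 607")

# Fraction of non-curated datapoints that remain after curation.
retained = curated / non_curated
print(f"Curated fraction for ChEMBL 33: {retained:.1%}")
```

The same helper applies to any other row; rows with `-` return `None` and need to be skipped before dividing.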
Add ChEMBL 30
- Update to ChEMBL 30
- Both non-curated and curated CSVs are provided as well as a sample of the first 100 data points for testing
- EGFR kinase bioactivities based on the curated CSV as well as a sample of the first 100 data points for testing
Datapoints in each dataset:
Dataset | Non-curated | Curated | Sample | EGFR Kinase | EGFR Sample |
---|---|---|---|---|---|
ChEMBL 27 | 217 612 | 174 238 | - | - | - |
ChEMBL 28 | 237 336 | 186 972 | - | - | - |
ChEMBL 29 | 242 609 | 190 634 | 100 | 6 509 | 100 |
ChEMBL 30 | 252 191 | 197 073 | 100 | 6 733 | 100 |
Add ChEMBL 29
- Update to ChEMBL 29
- Both non-curated and curated CSVs are provided as well as a sample of the first 100 data points for testing
- EGFR kinase bioactivities based on the curated CSV as well as a sample of the first 100 data points for testing
Datapoints in each dataset:
Dataset | Non-curated | Curated | Sample | EGFR Kinase | EGFR Sample |
---|---|---|---|---|---|
ChEMBL 27 | 217 612 | 174 238 | - | - | - |
ChEMBL 28 | 237 336 | 186 972 | - | - | - |
ChEMBL 29 | 242 609 | 190 634 | 100 | 6 509 | 100 |
Add ChEMBL 28 and curation pipeline
- Fixed UniProt queries for the human kinome
- Add support for ChEMBL 28
- Add curation pipeline. Both non-curated and curated CSVs are provided
Datapoints in each dataset:
Dataset | Non-curated | Curated |
---|---|---|
ChEMBL 27 | 182 223 | 148 836 |
ChEMBL 28 | 199 238 | 159 978 |
The *sample100* file is a subset of 100 datapoints for testing purposes.
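For illustration only, a sample file of this kind could be produced from a full CSV as sketched below. This assumes the sample is simply the first 100 rows, as the newer releases describe it; `take_sample` and the demo column names are hypothetical and not taken from the kinodata scripts.

```python
import csv
import io
from itertools import islice


def take_sample(src, dst, n=100):
    """Copy the header and the first n data rows from src to dst.

    src and dst are text file-like objects. Returns the number of
    data rows written (fewer than n if the input is shorter).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return 0
    writer.writerow(header)
    count = 0
    for row in islice(reader, n):
        writer.writerow(row)
        count += 1
    return count


# Demo on an in-memory CSV with made-up columns and 250 rows.
full = io.StringIO("id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(250)))
sample = io.StringIO()
written = take_sample(full, sample)
print(written, "rows written")
```

With real files, pass handles opened with `open(path, newline="")` in place of the `StringIO` objects.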
Initial release
I will start cutting releases for datascripts so we can easily guarantee data provenance in other projects.
This release adds the data for ChEMBL 27, attached below.