Auto-QChem stores molecular descriptors in a MongoDB database. A small web-based user interface has been
created to facilitate extraction of descriptors from the database into .xlsx
files for further analysis.
Navigate to the landing page
The query form has two fields, both optional:
- Select tags (multiple choice) - each molecule in the DB has an associated tag (or a list of tags), used to mark specific collections of molecules. If you select multiple tags, molecules from all selected tags are displayed. If left blank, all molecules in the DB are queried.
- SMARTS substructure - searches the molecules for a substructure using a SMARTS query (SMILES strings are a subset of SMARTS). A quick reference to the SMARTS query language is available at https://www.daylight.com/dayhtml_tutorials/languages/smarts/index.html
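The tag selection behaviour described above can be illustrated with a minimal sketch. It assumes that selecting several tags returns the union of the tagged collections and that each record carries a list of tags. The function name `select_by_tags` and the record layout are hypothetical and are not taken from the Auto-QChem code or database schema.

```python
def select_by_tags(molecules, selected_tags):
    """Return the molecules that carry at least one of the selected tags.

    An empty selection returns every molecule, mirroring a blank tag field.
    """
    if not selected_tags:
        return list(molecules)
    wanted = set(selected_tags)
    return [mol for mol in molecules if wanted & set(mol["tags"])]


# Hypothetical records; the real database schema may differ.
molecules = [
    {"smiles": "CC(=O)OC(C)=O", "tags": ["anhydrides"]},
    {"smiles": "c1ccccc1Br", "tags": ["aryl_halides"]},
    {"smiles": "CCO", "tags": ["solvents", "alcohols"]},
]

# Print the SMILES of every molecule tagged "anhydrides" or "solvents".
for mol in select_by_tags(molecules, ["anhydrides", "solvents"]):
    print(mol["smiles"])
```

In the web interface the SMARTS filter would then be applied on top of this selection; that step needs a cheminformatics toolkit and is not shown here.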
There are two buttons, Query and Export:
- Query - queries the DB and displays the table of queried molecules
- Export - downloads the displayed table as an .xlsx file; use it after running Query
Result of an example query on a single tag with 1166 molecules, combined with a SMARTS query for anhydrides:
For each entry in the table, a link to a descriptor lookup is available in the rightmost column. It displays the QChem descriptors for the given molecule. If the molecule has multiple conformers, the Boltzmann average of all descriptors is shown.
Once molecules have been queried, their descriptors can be extracted into an .xlsx file by toggling the Download descriptors bar and filling in the form.
All fields are required:
- Descriptor Presets (multiple choice) - the following presets are available; choose as many as needed:
  - Global - molecule-level descriptors, e.g. HOMO energy, dipole moment, molecular weight.
  - Min Max Atomic - atom-level descriptors, reported as the minimum and maximum over the atoms within the molecule, e.g. buried volume, Mulliken charge, NMR shift.
  - Substructure Core - atom-level descriptors for the common core of atoms within the dataset. The common core is determined using the MCS (maximum common substructure) procedure from RDKit. If a substructure has been used for filtering, the common core will include that substructure and potentially more atoms.
  - Substructure Labeled - atom-level descriptors for labeled molecules. The labels must be consistent, i.e. each molecule must have exactly the same labels, for example 1, 2, 3, 4. The labeled elements can differ; only the numbering scheme must be consistent.
  - Transitions - top 10 excited-state transitions ordered by their oscillator strength.
- Conformer option (single choice) - choose one of the following:
  - Boltzmann - Boltzmann-weighted average
  - Max - lowest-energy conformer (maximum-weight conformer)
  - Min - highest-energy conformer (minimum-weight conformer)
  - Mean - arithmetic average
  - Std - standard deviation over the conformers
  - Any - randomly chosen conformer
- Download - downloads the descriptors to an .xlsx file. Note: when extracting descriptors for hundreds of molecules, this operation can take up to a few minutes, depending on the server load.
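To make the conformer options concrete, the following sketch shows one way a single descriptor could be aggregated over the conformers of a molecule. It is an illustration under stated assumptions, not the Auto-QChem implementation: conformer energies are assumed to be in Hartree, the temperature is assumed to be 298.15 K, the standard deviation is assumed to be the population form, and the names `boltzmann_weights` and `aggregate` are hypothetical.

```python
import math
import random
import statistics

K_B_HARTREE = 3.166811563e-6  # Boltzmann constant in Hartree per kelvin


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann weights from conformer energies (assumed Hartree)."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (K_B_HARTREE * temperature)) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def aggregate(values, weights, option):
    """Reduce per-conformer descriptor values to one number."""
    if option == "boltzmann":
        return sum(w * v for w, v in zip(weights, values))
    if option == "max":  # maximum-weight conformer, i.e. lowest energy
        return values[weights.index(max(weights))]
    if option == "min":  # minimum-weight conformer, i.e. highest energy
        return values[weights.index(min(weights))]
    if option == "mean":
        return statistics.fmean(values)
    if option == "std":  # population standard deviation (an assumption)
        return statistics.pstdev(values)
    if option == "any":
        return random.choice(values)
    raise ValueError(f"unknown conformer option: {option}")


# Hypothetical data: three conformers with energies and a dipole moment each.
energies = [-154.0012, -154.0005, -153.9990]
dipoles = [1.61, 1.72, 1.95]
weights = boltzmann_weights(energies)
for option in ("boltzmann", "max", "min", "mean", "std"):
    print(option, aggregate(dipoles, weights, option))
```

Because the weights decay exponentially with energy, the Boltzmann average is usually dominated by the few lowest-energy conformers, while Mean treats all conformers equally.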