Atom Corrections and Bond additivity Correction for DLPNO-CCSD(T) #1935
…-tzvp normalPNO and wb97xd/def2tzvp were added.
We should also update the ReferenceSpecies database with these calculations. @dranasinghe do you have the Arkane YAML files from the calculations?
Codecov Report
@@           Coverage Diff           @@
##           master    #1935   +/-   ##
=======================================
  Coverage   45.05%   45.05%
=======================================
  Files          86       86
  Lines       22213    22213
  Branches     5788     5788
=======================================
  Hits        10008    10008
  Misses      11099    11099
  Partials     1106     1106
Continue to review full report at Codecov.
Thanks!!
Could you also address Mark's comment?
Also, the commit message is too long. Could you make a short header and move the remaining details to the commit message body?
arkane/encorr/data.py
Outdated
SOC = {'H': 0.0, 'N': 0.0, 'O': -0.000355, 'C': -0.000135, 'S': -0.000893, 'P': 0.0, 'I': -0.011547226}
# Spin orbit correction for F, Si, Cl, Br and B taken form https://cccbdb.nist.gov/elecspin.asp
SOC = {'H': 0.0, 'N': 0.0, 'O': -0.000355, 'C': -0.000135, 'S': -0.000893, 'P': 0.0, 'I': -0.011547226,
 'F': -0.000614, 'Si': -0.000682 , 'Cl': -0.001338, 'Br': -0.005597 , 'B': -0.000046 }
Please correct the whitespace: there's an extra space before 'F', and an extra one after 'B': -0.000046.
Fixed
arkane/encorr/data.py
Outdated
'O-O': 0.09663, 'C=S': 0.80888, 'H-S': 1.23648, 'C=O': 0.88066, 'C=N': 0.03998,
'H-N': -0.49564, 'H-O': -0.41183, 'H-H': 0.6263, 'N#N': 3.71325, 'N-N': 0.74915,
'N-O': -0.62156, 'C=C': -0.63901, 'O=S': -1.39626, 'O-S': -1.37002, 'S-S': 0.1515,
'F-S': -0.68693, 'F-O': 0.09202, 'F-H': -1.68214, 'F-F': 0.95483, 'O=O': -2.64949
Please add a trailing comma (,) at the end of the line (and also for the block below).
Were these corrections obtained from calculations that use dlpno-ccsd(t)/def2-tzvp for both the energy calculations and the geometry optimizations? If the geometry optimization was done with a different level of theory, it should be included in the model chemistry.
I am not sure how to include it in the current structure, so I left a comment.
'C=O': 1.00501, 'C=N': -0.68305, 'H-N': -0.52232, 'H-O': -1.18129, 'H-H': -1.76862, 'N#N': -3.84259,
'N-N': 2.66325, 'N-O': 1.69619, 'C=C': -0.11192, 'O=S': -1.33397, 'O-S': -1.71863, 'S-S': 0.5224,
'F-S': -1.28933, 'F-O': -0.03756, 'F-H': -3.71018, 'F-F': -1.71494, 'O=O': -6.70857
},
Please add two lines to the Model Chemistry table in the documentation (https://github.com/ReactionMechanismGenerator/RMG-Py/blob/master/documentation/source/users/arkane/input.rst).
@aelong and @amarkpayne Thank you for the comments. I made the changes. I have all the YAML files. I will update the reference database.
Thanks! I added a couple of minor comments to the docs commit.
@@ -119,12 +119,16 @@ Model Chemistry AEC BC SOC Freq Scale Supp
``'B3LYP/6-31G(d,p)'`` v v v (0.961) H, C, O, S
``'MRCI+Davidson/aug-cc-pV(T+d)Z'`` v v H, C, N, O, S
``'wb97x-d/aug-cc-pvtz'`` v v H, C, N, O
``DLPNO-CCSD(T)/def2-tzvp`` v v v H, C, N, O, S, F, Cl
Do you mind placing DLPNO after B-CCSD(T)-F12/aug-cc-pVnZ so all CCSD and all wb97 methods are adjacent?
================================================ ===== ==== ==== ========== ====================

Notes:

- The ``'CBS-QB3-Paraskevas'`` model chemistry is identical to ``'CBS-QB3'`` except for BCs for C/H/O bonds, which are from Paraskevas et al. (DOI: 10.1002/chem.201301381) instead of Petersson et al. (DOI: 10.1063/1.477794). Beware, combining BCs from different sources may lead to unforeseen results.
- In ``'M08SO/MG3S*'`` the grid size used in the [QChem] electronic structure calculation utilizes 75 radial points and 434 angular points.
- In ``DLPNO-CCSD(T)/def2-tzvp`` run using Orca 4.2.1 with NormalPNO
- In ``wb97xd/def2tzvp`` run using Gaussian16. Do n't use the AEC and BC for wb97xd calculated with Qchem.
These are excellent tips!
In Arkane, which wb97xd were calculated by QChem? Is it 'wb97x-d/aug-cc-pvtz'? If so, Arkane won't use it due to the basis set (and also the dash in the functional name). If I misunderstood, could you elaborate?
I'm not sure we should be using the notation without hyphens. We could consider using wb97x-d/def2-tzvp to be consistent with notation in the literature rather than the notation Gaussian uses. I doubt any wb97x-d calculations were done using Q-Chem because you would usually use wb97x-d3 in Q-Chem.
Couple of additional comments but overall it looks good, thanks for running these calcs!
When you update this branch next, drop the merge from master commit and rebase.
arkane/encorr/data.py
Outdated
# Calculated atomic energies fitted orca 4.2.1 dlpno-ccsd(t)/def2-tzvp NormalPNO
# fitted using Colin's BAC algorithm and SOC are included in the correction
# AEs are fitted to neutral molecules with RMSE and MAE 7.99 and 5.96 kJ/mol respectively.
'dlpno-ccsd(t)/def2-tzvp': {
'H': -0.49641082, 'C': -37.77274125, 'N': -54.50388932, 'O': -74.96760414,
'F': -99.62044819, 'S': -397.63480236, 'Cl': -459.65784960
},
# wb97xd/def2tzvp conducted using G16.
# fitted using Colin's BAC algorithm and SOC are included in the correction
# AEs are fitted to neutral molecules with RMSE and MAE 9.16 and 6.66 kJ/mol respectively.
'wb97xd/def2tzvp': {
'H': -0.50224721, 'C': -37.84235323, 'N': -54.58757237, 'O': -75.06982154,
'F': -99.74180094, 'S': -398.10765642, 'Cl': -460.14650064
},
Instead of manually adding in the SOC terms, write the energy and explicitly write in the SOC correction. For example, the correction for oxygen for dlpno-ccsd(t)/def2-tzvp would be 'O': -74.96724914 + SOC['O'] (which will get evaluated to -74.96760414). Do this for all atoms, even if the SOC is zero. This is a very minor thing, but it is better just in case we ever need to update the SOC values, and it makes it very clear that the correct SOC values were used.
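A minimal sketch of the suggestion above, using only values quoted in this thread: the SOC entries come from the SOC dictionary in the diff, and the pre-SOC oxygen energy is the one given in the comment. H and N are shown because their SOC is zero, so their listed energies are unchanged. The name `atom_energies` is a placeholder for illustration and is not necessarily the name used in arkane/encorr/data.py.

```python
# Spin-orbit corrections in Hartree (subset of the SOC dictionary quoted in the diff).
SOC = {'H': 0.0, 'N': 0.0, 'O': -0.000355}

# Hypothetical container name; energies in Hartree for 'dlpno-ccsd(t)/def2-tzvp'.
# Each entry is written as (energy without SOC) + SOC[element], even when SOC is zero,
# so a later change to SOC propagates without editing the energies by hand.
atom_energies = {
    'dlpno-ccsd(t)/def2-tzvp': {
        'H': -0.49641082 + SOC['H'],
        'N': -54.50388932 + SOC['N'],
        'O': -74.96724914 + SOC['O'],
    },
}

oxygen = atom_energies['dlpno-ccsd(t)/def2-tzvp']['O']
print(f"{oxygen:.8f}")
```

Printed to eight decimals, the oxygen entry matches the -74.96760414 currently hard-coded in the diff.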
Figured I would chime in here. So these were fitted and not just calculated?
correct
- In ``DLPNO-CCSD(T)/def2-tzvp`` run using Orca 4.2.1 with NormalPNO
Did you use anything like TightSCF? It can be added here if so. I agree with @alongd that it is a great idea to include this here.
- In ``wb97xd/def2tzvp`` run using Gaussian16. Do n't use the AEC and BC for wb97xd calculated with Qchem.
Typo in don't
Just saw this PR and wanted to make some comments. In addition to the individual comments I made, I wanted to address some bigger issues.
What exactly is the issue with ionic/radical species? We have the full dataset including such species, so I think they should be included in the fit. I am able to obtain very good fits including ions and radicals using DFT methods, so I'd be surprised if it didn't work for DLPNO. Additionally, if we are providing Petersson-type corrections, we should also be providing Melius-type corrections.
Were the corrections fit with or without substructure weighting?
Note that a recent RMG-database PR corrected some species in the reference database and ReactionMechanismGenerator/RMG-database#404 corrects some more. The corrections in this PR should be refit after the database PR has been merged.
I'll let Duminda respond to the comments as well, but my understanding is that the hyphens are omitted in certain places to indicate that this model chemistry is run using Gaussian, which does not use hyphens for this in its input files. This should probably be part of a separate discussion on the future of model chemistry. In ARC, they have separate variables for "level of theory" and "model chemistry". I am not certain what they envisioned the difference between these two to be, but I think that we need to distinguish between the verbatim text that must be supplied to the QM software to specify the level of theory (and associated items like grid size and tolerances) and a string, dictionary, or similar object that uniquely identifies the model chemistry to the user in a standardized way and that can ideally be used as keys in the reference species database. @alongd, @oscarwumit, @dranasinghe, @cgrambow, it might be worth having a discussion over Zoom soon about how to standardize model chemistry/level of theory between ARC, Arkane, and the reference species database. Thoughts?
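One way to picture the distinction raised above is a small record that keeps the verbatim software keywords while also exposing a normalized identifier. The sketch below is purely illustrative: the class name `LevelOfTheory`, its fields, the `standard_key` method, and the normalization rule (lowercase, hyphens removed, `@software` suffix) are all assumptions made for this example, not the design adopted in ARC, Arkane, or the reference species database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LevelOfTheory:
    """Hypothetical record separating software keywords from a standardized key."""
    method: str                     # verbatim method keyword, e.g. 'wb97xd'
    basis: str                      # verbatim basis keyword, e.g. 'def2tzvp'
    software: Optional[str] = None  # e.g. 'gaussian', 'orca', 'qchem'

    def standard_key(self) -> str:
        """Return a normalized identifier (assumed rule: lowercase, no hyphens)."""
        def norm(text: str) -> str:
            return text.lower().replace('-', '')

        key = f"{norm(self.method)}/{norm(self.basis)}"
        return f"{key}@{self.software.lower()}" if self.software else key


gaussian_style = LevelOfTheory('wb97xd', 'def2tzvp', software='gaussian')
literature_style = LevelOfTheory('wB97X-D', 'def2-TZVP', software='gaussian')

# Both spellings map to the same key, while the verbatim keywords stay available.
print(gaussian_style.standard_key())
print(literature_style.standard_key())
```

Because the dataclass is frozen, instances are hashable and could serve directly as dictionary keys. Including the software in the key reflects the concern in this thread that corrections fitted with one program should not be silently applied to results from another.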
Sorry for the late reply. I made the changes you requested. I added Melius-type bond additivity correction parameters and pushed the changes.
Sorry, I realized the BAC data has moved to RMG-database. I will make a new PR to RMG-database.
Everything in this PR with the exception of the documentation update has been moved to ReactionMechanismGenerator/RMG-database#404. I'll close out this PR, but we'll add the documentation update in a later PR along with the other new model chemistries once the level of theory standardization PR #1940 has been merged in.
Atom corrections and bond additivity corrections for DLPNO-CCSD(T)/def2-tzvp NormalPNO and wb97xd/def2tzvp were added.
Motivation or Problem
The DLPNO-CCSD(T) method is useful for calculating thermochemical parameters of large molecules. The BACs and atom corrections are important for accurate thermochemical estimation. The BACs for DLPNO-CCSD(T)/def2-tzvp NormalPNO and wb97xd/def2tzvp were calculated using the https://github.com/cgrambow/bac algorithm. I only considered neutral molecules for the fitting. I noticed that cations produced large deviations, so cations were left out of the fitting. Further, rotors were not considered.
The model chemistry BACs can be used for:
DLPNO-CCSD(T)/def2-tzvp//wb97xd/def2tzvp NormalPNO
wb97xd/def2tzvp//wb97xd/def2tzvp
Description of Changes
Atom corrections and Petersson-type bond corrections were added.
Testing
Reran the test set using ARC.
Reviewer Tips
It is easy to check H298 by running H2, N2, O2, F2, and Cl2.
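The diatomics named in the reviewer tip each contain a single bond, so a Petersson-type correction for them reduces to one bond parameter. The sketch below illustrates that sum using the first bond-correction block quoted in the review (the one containing 'H-H': 0.6263). The excerpt does not say which model chemistry that block belongs to, the unit (kcal/mol) is an assumption, and the function name `petersson_bac` is a placeholder. Cl2 is left out because no 'Cl-Cl' value appears in the quoted lines.

```python
# Bond parameters copied from the first bond-correction block quoted in the review.
# Units are assumed to be kcal/mol.
bond_corrections = {'H-H': 0.6263, 'N#N': 3.71325, 'O=O': -2.64949, 'F-F': 0.95483}

# Bond counts for the diatomics; Cl2 is omitted because no 'Cl-Cl' value is quoted.
diatomics = {
    'H2': {'H-H': 1},
    'N2': {'N#N': 1},
    'O2': {'O=O': 1},
    'F2': {'F-F': 1},
}


def petersson_bac(bonds, params):
    """Sum count * parameter over every bond type in the molecule."""
    return sum(count * params[bond] for bond, count in bonds.items())


for name, bonds in diatomics.items():
    print(name, petersson_bac(bonds, bond_corrections))
```

For a single-bond molecule the result is just the tabulated parameter, which is why these species make a quick sanity check: any mismatch in the corrected H298 points directly at one entry in the table.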