Atom Corrections and Bond additivity Correction for DLPNO-CCSD(T) #1935

Closed
wants to merge 6 commits into from

Conversation

Contributor

@dranasinghe dranasinghe commented Apr 27, 2020

Added atom corrections and bond additivity corrections (BACs) for DLPNO-CCSD(T)/def2-tzvp NormalPNO and wb97xd/def2tzvp.

Motivation or Problem

The DLPNO-CCSD(T) method is useful for calculating thermochemical parameters of large molecules, and BACs and atom corrections are important for accurate thermochemical estimates. The BACs for DLPNO-CCSD(T)/def2-tzvp NormalPNO and wb97xd/def2tzvp were calculated using the https://github.com/cgrambow/bac algorithm. I only considered neutral molecules for the fitting. I noticed that cations produced large deviations, so cations were left out of the fitting. Further, rotors were not considered.

The BACs can be used for the following model chemistries:
DLPNO-CCSD(T)/def2-tzvp//wb97xd/def2tzvp NormalPNO
wb97xd/def2tzvp//wb97xd/def2tzvp

Description of Changes

Atom corrections and Petersson-type bond corrections were added.
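
As a rough illustration of how a Petersson-type correction is evaluated, the sketch below sums one fitted parameter per bond type, weighted by the number of bonds of that type. The parameter values are a subset copied from the first bond-correction block quoted in the review further down; this conversation does not state which of the two model chemistries that block belongs to, and the units are assumed to be kcal/mol. The function name `petersson_bac` and the bond-count input format are hypothetical and are not Arkane's API.

```python
from typing import Dict

# Subset of the fitted bond parameters quoted in this PR.
# Assumption: values are in kcal/mol.
BOND_PARAMETERS: Dict[str, float] = {
    'H-O': -0.41183,
    'H-N': -0.49564,
    'H-H': 0.6263,
    'O=O': -2.64949,
    'N#N': 3.71325,
}


def petersson_bac(bond_counts: Dict[str, int],
                  parameters: Dict[str, float] = BOND_PARAMETERS) -> float:
    """Return the sum of (number of bonds) * (bond parameter) over all bond types."""
    missing = sorted(set(bond_counts) - set(parameters))
    if missing:
        # Refuse to silently treat an unfitted bond type as zero.
        raise KeyError(f'No bond parameter available for: {missing}')
    return sum(count * parameters[bond] for bond, count in bond_counts.items())


# Water has two O-H bonds, written 'H-O' in the parameter table.
water_bac = petersson_bac({'H-O': 2})
```

Raising on an unknown bond type, rather than defaulting to zero, is a design choice of this sketch: a missing parameter usually means the molecule falls outside the fitted set. How the resulting value is combined with the computed enthalpy is left to Arkane and is not shown here.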

Testing

Reran the test set using ARC.

Reviewer Tips

It is easy to check H298 by running H2, N2, O2, F2, and Cl2.

…-tzvp normalPNO and wb97xd/def2tzvp were added.
@amarkpayne
Member

We should also update the ReferenceSpecies database with these calculations. @dranasinghe do you have the Arkane YAML files from the calculations?

codecov bot commented Apr 27, 2020

Codecov Report

Merging #1935 into master will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1935   +/-   ##
=======================================
  Coverage   45.05%   45.05%           
=======================================
  Files          86       86           
  Lines       22213    22213           
  Branches     5788     5788           
=======================================
  Hits        10008    10008           
  Misses      11099    11099           
  Partials     1106     1106           
Impacted Files Coverage Δ
arkane/encorr/data.py 100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35d114a...630fa34. Read the comment docs.

@alongd alongd changed the title Atom Corrections ad Bond additivity Correction for DLPNO-CCSD(T)/def2… Atom Corrections and Bond additivity Correction for DLPNO-CCSD(T) Apr 27, 2020
Member

@alongd alongd left a comment

Thanks!!
Could you also address Mark's comment?
Also, the commit message is too long, could you make a short header and add the remaining details to the commit message body?

SOC = {'H': 0.0, 'N': 0.0, 'O': -0.000355, 'C': -0.000135, 'S': -0.000893, 'P': 0.0, 'I': -0.011547226}
# Spin orbit correction for F, Si, Cl, Br and B taken form https://cccbdb.nist.gov/elecspin.asp
SOC = {'H': 0.0, 'N': 0.0, 'O': -0.000355, 'C': -0.000135, 'S': -0.000893, 'P': 0.0, 'I': -0.011547226,
'F': -0.000614, 'Si': -0.000682 , 'Cl': -0.001338, 'Br': -0.005597 , 'B': -0.000046 }
Member

Please correct the whitespace: there's an extra space before 'F', and an extra one after 'B': -0.000046.

Contributor Author

Fixed

'O-O': 0.09663, 'C=S': 0.80888, 'H-S': 1.23648, 'C=O': 0.88066, 'C=N': 0.03998,
'H-N': -0.49564, 'H-O': -0.41183, 'H-H': 0.6263, 'N#N': 3.71325, 'N-N': 0.74915,
'N-O': -0.62156, 'C=C': -0.63901, 'O=S': -1.39626, 'O-S': -1.37002, 'S-S': 0.1515,
'F-S': -0.68693, 'F-O': 0.09202, 'F-H': -1.68214, 'F-F': 0.95483, 'O=O': -2.64949
Member

please add a , at the end of the line (and also for the block below)

@cgrambow

These corrections are obtained from calculations that use dlpno-ccsd(t)/def2-tzvp for both energy calculations and geometry optimizations? If the geometry optimization was done with a different level of theory, it should be included in the model chemistry.

Contributor Author

I am not sure how to include it in the current structure, so I left a comment.

'C=O': 1.00501, 'C=N': -0.68305, 'H-N': -0.52232, 'H-O': -1.18129, 'H-H': -1.76862, 'N#N': -3.84259,
'N-N': 2.66325, 'N-O': 1.69619, 'C=C': -0.11192, 'O=S': -1.33397, 'O-S': -1.71863, 'S-S': 0.5224,
'F-S': -1.28933, 'F-O': -0.03756, 'F-H': -3.71018, 'F-F': -1.71494, 'O=O': -6.70857
},
Member

Please add two lines to the Model Chemistry table in the documentation (https://github.com/ReactionMechanismGenerator/RMG-Py/blob/master/documentation/source/users/arkane/input.rst).

@dranasinghe
Contributor Author

@aelong and @amarkpayne Thank you for the comments. I made the changes. I have all the YAML files. I will update the reference database.

Member

@alongd alongd left a comment

Thanks! I added a couple of minor comments to the docs commit.

@@ -119,12 +119,16 @@ Model Chemistry AEC BC SOC Freq Scale Supp
``'B3LYP/6-31G(d,p)'`` v v v (0.961) H, C, O, S
``'MRCI+Davidson/aug-cc-pV(T+d)Z'`` v v H, C, N, O, S
``'wb97x-d/aug-cc-pvtz'`` v v H, C, N, O
``DLPNO-CCSD(T)/def2-tzvp`` v v v H, C, N, O, S, F, Cl
Member

Do you mind placing DLPNO after B-CCSD(T)-F12/aug-cc-pVnZ so all CCSD and all wb97 methods are adjacent?

================================================ ===== ==== ==== ========== ====================

Notes:

- The ``'CBS-QB3-Paraskevas'`` model chemistry is identical to ``'CBS-QB3'`` except for BCs for C/H/O bonds, which are from Paraskevas et al. (DOI: 10.1002/chem.201301381) instead of Petersson et al. (DOI: 10.1063/1.477794). Beware, combining BCs from different sources may lead to unforeseen results.
- In ``'M08SO/MG3S*'`` the grid size used in the [QChem] electronic structure calculation utilizes 75 radial points and 434 angular points.
- In ``DLPNO-CCSD(T)/def2-tzvp`` run using Orca 4.2.1 with NormalPNO
- In ``wb97xd/def2tzvp`` run using Gaussian16. Do n't use the AEC and BC for wb97xd calculated with Qchem.
Member

These are excellent tips!
In Arkane, which wb97xd were calculated by QChem? Is it 'wb97x-d/aug-cc-pvtz'? If so, Arkane won't use it due to the basis set (and also the dash in the functional name). If I misunderstood, could you elaborate?

@cgrambow

I'm not sure we should be using the notation without hyphens. We could consider using wb97x-d/def2-tzvp to be consistent with notation in the literature rather than notation Gaussian uses. I doubt any wb97x-d calculations were done using Q-Chem because you would usually use wb97x-d3 in Q-Chem.

Member

@amarkpayne amarkpayne left a comment

Couple of additional comments but overall it looks good, thanks for running these calcs!

When you update this branch next, drop the merge from master commit and rebase.

Comment on lines 329 to 342
# Calculated atomic energies fitted orca 4.2.1 dlpno-ccsd(t)/def2-tzvp NormalPNO
# fitted using Colin's BAC algorithm and SOC are included in the correction
# AEs are fitted to neutral molecules with RMSE and MAE 7.99 and 5.96 kJ/mol respectively.
'dlpno-ccsd(t)/def2-tzvp': {
'H': -0.49641082, 'C': -37.77274125, 'N': -54.50388932, 'O': -74.96760414,
'F': -99.62044819, 'S': -397.63480236, 'Cl': -459.65784960
},
# wb97xd/def2tzvp conducted using G16.
# fitted using Colin's BAC algorithm and SOC are included in the correction
# AEs are fitted to neutral molecules with RMSE and MAE 9.16 and 6.66 kJ/mol respectively.
'wb97xd/def2tzvp': {
'H': -0.50224721, 'C': -37.84235323, 'N': -54.58757237, 'O': -75.06982154,
'F': -99.74180094, 'S': -398.10765642, 'Cl': -460.14650064
},
Member

Instead of manually adding in the SOC terms, write the energy and explicitly write in the SOC correction. For example, the correction for oxygen for dlpno-ccsd(t)/def2-tzvp would be 'O': -74.96724914 + SOC['O'] (which will get evaluated to -74.96760414). Do this for all atoms, even if the SOC is zero. This is a very minor thing, but it is better just in case we ever need to update the SOC values, and it makes it very clear that the correct SOC values were used.
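
A minimal sketch of the suggested pattern, assuming the fitted values quoted above already include the spin-orbit term: the bare energy is recovered by subtracting `SOC[element]`, and the dictionary entry is then written as bare energy plus `SOC[element]`. The names `FITTED_WITH_SOC`, `BARE` and `ATOM_ENERGIES` are illustrative only and do not appear in `arkane/encorr/data.py`.

```python
# Spin-orbit corrections as quoted in this PR (only the elements needed here).
SOC = {'H': 0.0, 'C': -0.000135, 'N': 0.0, 'O': -0.000355,
       'F': -0.000614, 'S': -0.000893, 'Cl': -0.001338}

# dlpno-ccsd(t)/def2-tzvp atom energies as submitted, with SOC already folded in.
FITTED_WITH_SOC = {
    'H': -0.49641082, 'C': -37.77274125, 'N': -54.50388932, 'O': -74.96760414,
    'F': -99.62044819, 'S': -397.63480236, 'Cl': -459.65784960,
}

# Recover the bare fitted energies so the SOC term can be written explicitly.
BARE = {element: total - SOC[element] for element, total in FITTED_WITH_SOC.items()}

# Store every entry as "bare energy + SOC[element]", even when the SOC is zero,
# so a later change to SOC propagates to the atom energies automatically.
ATOM_ENERGIES = {element: BARE[element] + SOC[element] for element in BARE}
```

In the data file itself the bare numbers would be written as literals, for example `'O': -74.96724914 + SOC['O']`; the dictionary comprehension here only avoids retyping values that are not quoted in this thread.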

@cgrambow

Figured I would chime in here. So these were fitted and not just calculated?

Contributor Author

correct

================================================ ===== ==== ==== ========== ====================

Notes:

- The ``'CBS-QB3-Paraskevas'`` model chemistry is identical to ``'CBS-QB3'`` except for BCs for C/H/O bonds, which are from Paraskevas et al. (DOI: 10.1002/chem.201301381) instead of Petersson et al. (DOI: 10.1063/1.477794). Beware, combining BCs from different sources may lead to unforeseen results.
- In ``'M08SO/MG3S*'`` the grid size used in the [QChem] electronic structure calculation utilizes 75 radial points and 434 angular points.
- In ``DLPNO-CCSD(T)/def2-tzvp`` run using Orca 4.2.1 with NormalPNO
Member

Did you use anything like TightSCF? It can be added here if so. I agree with @alongd it is a great idea to include this here

================================================ ===== ==== ==== ========== ====================

Notes:

- The ``'CBS-QB3-Paraskevas'`` model chemistry is identical to ``'CBS-QB3'`` except for BCs for C/H/O bonds, which are from Paraskevas et al. (DOI: 10.1002/chem.201301381) instead of Petersson et al. (DOI: 10.1063/1.477794). Beware, combining BCs from different sources may lead to unforeseen results.
- In ``'M08SO/MG3S*'`` the grid size used in the [QChem] electronic structure calculation utilizes 75 radial points and 434 angular points.
- In ``DLPNO-CCSD(T)/def2-tzvp`` run using Orca 4.2.1 with NormalPNO
- In ``wb97xd/def2tzvp`` run using Gaussian16. Do n't use the AEC and BC for wb97xd calculated with Qchem.
Member

Typo in don't

@cgrambow cgrambow left a comment

Just saw this PR and wanted to make some comments. In addition to the individual comments I made, I wanted to address some bigger issues.

What exactly is the issue with ionic/radical species? We have the full dataset including such species, so I think they should be included in the fit. I am able to obtain very good fits including ions and radicals using DFT methods, so I'd be surprised if it didn't work for DLPNO. Additionally, if we are providing Petersson-type corrections, we should also be providing Melius-type corrections.

Were the corrections fit with or without substructure weighting?

Note that a recent RMG-database PR corrected some species in the reference database and ReactionMechanismGenerator/RMG-database#404 corrects some more. The corrections in this PR should be refit after the database PR has been merged.

@amarkpayne
Member

I'll let Duminda respond to the comments as well, but my understanding is that the hyphens are omitted in certain places to indicate that this model chemistry is run using Gaussian, which does not use hyphens for this in its input files.

This should probably be part of a separate discussion on the future of model chemistry. In ARC, they have separate variables for "level of theory" and "model chemistry". I am not certain on what they envisioned the difference between these two to be, but I think that we need to make a distinction between the verbatim text that must be supplied to QM software to specify the level of theory and associated items like grid size and tolerances, and on the other hand have a string/dictionary/something that uniquely identifies the model chemistry to the user in a standardized way, that ideally can be used as keys in the reference species database.

@alongd , @oscarwumit , @dranasinghe , @cgrambow it might be worth having a discussion over Zoom soon to discuss how to standardize model chemistry/level of theory between ARC, Arkane, and the reference species database. Thoughts?
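
One possible shape for such an identifier, sketched only to make the idea concrete: a frozen dataclass is hashable, so it can serve as a dictionary key, and it keeps the standardized description separate from whatever text a given QM program needs. The class `ModelChemistry`, its fields, and the `label` method are all hypothetical; they are not the design used by ARC or Arkane, nor what PR #1940 implements.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelChemistry:
    """Hypothetical standardized identifier for a model chemistry."""
    energy_method: str
    energy_basis: str
    geometry_method: Optional[str] = None   # None: same level as the energy
    geometry_basis: Optional[str] = None
    software: Optional[str] = None
    settings: Tuple[str, ...] = ()           # e.g. PNO thresholds, grid size

    def label(self) -> str:
        """Return 'energy' or 'energy//geometry' in the usual notation."""
        energy = f'{self.energy_method}/{self.energy_basis}'
        if self.geometry_method is None:
            return energy
        return f'{energy}//{self.geometry_method}/{self.geometry_basis}'


# Two independently constructed but identical descriptions compare equal and
# hash equal, so either one can be used to look up reference data.
key = ModelChemistry('dlpno-ccsd(t)', 'def2-tzvp', 'wb97x-d', 'def2-tzvp',
                     software='orca', settings=('normalpno',))
same_key = ModelChemistry('dlpno-ccsd(t)', 'def2-tzvp', 'wb97x-d', 'def2-tzvp',
                          software='orca', settings=('normalpno',))
reference_data = {key: 'placeholder for reference species entries'}
```

Including `software` and `settings` in the key means that the same functional run with different programs or thresholds maps to different entries, which is the distinction raised above for the Gaussian and Q-Chem variants.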

Contributor Author

@dranasinghe dranasinghe left a comment

Sorry for the late reply. I made the changes you requested, added Melius-type bond additivity correction parameters, and pushed the changes.

@dranasinghe
Contributor Author

Sorry, I realized that the BAC code has moved to RMG-database. I will make a new PR to RMG-database.

@amarkpayne
Member

Everything in this PR with the exception of the documentation update has been moved to ReactionMechanismGenerator/RMG-database#404. I'll close out this PR, but we'll add the documentation update in a later PR along with the other new model chemistries once the level of theory standardization PR #1940 has been merged in.

@amarkpayne amarkpayne closed this May 12, 2020