Need more details on the jobs #1

Open
rwest opened this issue Dec 5, 2018 · 22 comments

@rwest
Member

rwest commented Dec 5, 2018

@ehermes just posted via the Sandia GitLab thing:

Can you specify more details about the jobs? What level of theory, what basis set, etc...

@nateharms
Member

nateharms commented Dec 5, 2018

M06-2X/6-311+G(2df,2p) or better, please

@nateharms
Member

Or the 6-311+G** basis set if 6-311+G(2df,2p) is unavailable

@ehermes
Collaborator

ehermes commented Dec 5, 2018

I'll try out a few of these calculations to benchmark, but compared to the 6480 saddle point searches I'm already running, your tests are using a more expensive level of theory (I'm using B3LYP/6-31G) and have some larger molecules (my biggest molecules have 8 heavy atoms). There's a chance that your tests will scale better on KNL though, since they're bigger.

Are all of these tests closed-shell? If not, how can I tell which are open-shell and which are closed-shell?

@rwest
Member Author

rwest commented Dec 5, 2018

I believe they're all H-abstraction, so will all have a radical involved.

@ehermes
Collaborator

ehermes commented Dec 5, 2018

Well, I found at least one geometry where NWChem said a multiplicity of 2 was invalid. I suppose I can just count the number of hydrogens and determine the multiplicity that way.

@ehermes
Collaborator

ehermes commented Dec 5, 2018

Since all of the reactions are H-abstractions, does that mean systems with an even number of electrons should have a multiplicity of 3?

@rwest
Member Author

rwest commented Dec 5, 2018

@nateharms are some of them abstracting H from a radical, and thus have an even number of electrons overall?

(And did you include abstraction by triplet O₂?)

@nateharms
Member

  • all reactions are H-abstractions
  • some reactions are H-abstractions involving two radicals (e.g. OOH abstracting from [CH2]CC or something)
  • some reactions involve abstraction by triplet O2
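
The multiplicity question raised above can be narrowed down, though not settled, by electron-count parity: an odd number of electrons only allows even multiplicities (2, 4, ...), and an even number only allows odd ones (1, 3, ...). Here is a minimal sketch, assuming neutral species and plain XYZ files containing only H, C, N and O. The function names are hypothetical and are not part of this repository. Parity alone cannot distinguish a singlet from a triplet, which is why the radical-radical and triplet O2 cases still need an explicit multiplicity.

```python
# Atomic numbers for the elements assumed to appear in these systems.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}


def count_electrons(xyz_text, charge=0):
    """Sum atomic numbers over the atoms of an XYZ-format string."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])
    # Line 0 is the atom count and line 1 is a comment; atoms start at line 2.
    symbols = [line.split()[0] for line in lines[2:2 + n_atoms]]
    return sum(ATOMIC_NUMBER[s] for s in symbols) - charge


def allowed_multiplicities(n_electrons, max_mult=4):
    """Multiplicities 2S+1 that are consistent with the electron-count parity."""
    # Even electron count -> odd multiplicity; odd count -> even multiplicity.
    first = 1 if n_electrons % 2 == 0 else 2
    return list(range(first, max_mult + 1, 2))
```

For example, OH + CH4 has 19 electrons, so `allowed_multiplicities(19)` gives `[2, 4]`, while a 34-electron radical-radical system such as OOH + ethyl gives `[1, 3]` and has to be resolved by hand.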

@ehermes
Collaborator

ehermes commented Dec 5, 2018

Can you give me some advice on how to determine which multiplicity to use for which geometries?

@nateharms
Member

Would it be easier if I provided a text file containing a dictionary of files and their corresponding multiplicity?

@ehermes
Collaborator

ehermes commented Dec 5, 2018

If you can provide the multiplicities for each geometry, that would certainly be helpful.

@nateharms
Member

Okay, a text file has been added with corresponding file names and their multiplicities

@ehermes
Collaborator

ehermes commented Dec 5, 2018

These calculations are going to take a lot of time. Unfortunately, NWChem's DFT routines (LCAO, not plane waves) are not at all optimized for many-core systems like Theta. Since the calculations in my test set are all fairly inexpensive, this wasn't a big problem, but testing M06-2X/6-311++G** I am seeing calculations take 10 minutes per single point on 2 nodes (128 cores). It will be difficult to do a meaningful amount of work in the span of a single job, given the very low time limits for jobs on Theta.

My priority currently is getting my own test set completed. Once they are done, I can start working on these systems. I think it will be a lot easier to first optimize the saddle points using a less expensive level of theory, then re-optimize at M06-2X/6-311++G** -- possibly on a different cluster.

@nateharms
Member

Gotcha, how about running M06-2X/6-31G? Will that be less expensive? Either that, or run it with whatever level you think is suitable; we can re-optimize them later. Thanks for the help!

@ehermes
Collaborator

ehermes commented Dec 5, 2018

I'll test a couple of different things to see if I can find something reasonably fast that is close to your desired settings. M06-2X being a meta-GGA also adds a nontrivial cost, particularly in terms of SCF convergence (Truhlar's functionals have notorious convergence difficulties).

@ehermes
Collaborator

ehermes commented Jan 22, 2019

I've started running these calculations now. I'm going to start by optimizing the saddle points with B3LYP/6-31G, then refining with M06-2X/6-311+G**.
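
The two-stage plan above could be expressed as a pair of NWChem input decks that differ only in functional and basis set. This is a rough sketch, not the driver actually used for these runs (the thread does not show it). The function name `nwchem_saddle_input` and the `STAGES` list are hypothetical, and the second stage is assumed to start from the geometry optimized in the first.

```python
def nwchem_saddle_input(name, atoms, mult, xc, basis):
    """Build an NWChem input deck for a DFT saddle-point search.

    atoms is a list of (symbol, x, y, z) tuples in angstroms.
    """
    geom = "\n".join(f"  {s} {x:.6f} {y:.6f} {z:.6f}" for s, x, y, z in atoms)
    return (
        f"start {name}\n"
        f"geometry units angstroms\n{geom}\nend\n"
        f"basis\n  * library {basis}\nend\n"
        f"dft\n  xc {xc}\n  mult {mult}\nend\n"
        "task dft saddle\n"
    )


# Stage 1 is the cheap level, stage 2 the target level named in the thread.
STAGES = [("b3lyp", "6-31G"), ("m06-2x", "6-311+G**")]
```

Calling the function once per entry in `STAGES`, with the multiplicity taken from the multiplicity file, gives one deck per level of theory. Convergence settings, which matter for M06-2X, are left out here.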

I noticed there is a med directory as well, but these structure multiplicities are not in mults.txt. Do you want me to run these structures as well? If so, can you provide their multiplicities?

@nateharms
Member

@ehermes an updated mults.py file exists containing the multiplicities for all the files

@ehermes
Collaborator

ehermes commented Feb 18, 2019

I'm having trouble parsing this file automatically. It seems like there's a strange character on line 2859.
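
One way to pin down a stray character like this is to scan the raw bytes and report anything that is non-ASCII or a control character. A minimal sketch follows; the function name is hypothetical, and what counts as "strange" (anything outside printable ASCII plus tab) is an assumption rather than a diagnosis of the actual file.

```python
def find_suspicious_characters(byte_lines):
    """Yield (line_number, column, repr(char)) for non-ASCII or control characters.

    byte_lines is any iterable of bytes objects, e.g. a file opened in 'rb' mode.
    """
    for line_number, raw in enumerate(byte_lines, start=1):
        # Decode leniently so an invalid byte does not stop the scan.
        text = raw.decode("utf-8", errors="replace")
        for column, char in enumerate(text.rstrip("\r\n"), start=1):
            code = ord(char)
            if code > 127 or (code < 32 and char != "\t"):
                yield line_number, column, repr(char)
```

Running `list(find_suspicious_characters(open(path, "rb")))` on the multiplicity file would list every offending position, so a problem on a specific line shows up with its column.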

@nateharms
Member

Hmmm... I'm not sure why it isn't working... I'm able to read in the file easily. But I'll try writing a new file and seeing if that helps

@nateharms
Member

@ehermes a file called mults.csv was added. I didn't have any trouble reading it in with pandas. Hope this helps!
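
Reading the CSV into a file-name-to-multiplicity lookup might look like the sketch below. The `file_name` column appears elsewhere in this thread. The `multiplicity` column name, the function name, and the placeholder file names in the example are assumptions.

```python
import io

import pandas


def load_multiplicities(csv_source, name_col="file_name", mult_col="multiplicity"):
    """Return {file name: multiplicity}, refusing duplicated file names."""
    df = pandas.read_csv(csv_source)
    duplicated = df[name_col][df[name_col].duplicated()]
    if not duplicated.empty:
        raise ValueError(f"duplicate entries: {sorted(set(duplicated))}")
    return {name: int(mult) for name, mult in zip(df[name_col], df[mult_col])}


# Placeholder data standing in for mults.csv; real rows would come from the repo.
example = io.StringIO("file_name,multiplicity\nlow/a.xyz,2\nlow/b.xyz,3\n")
mults = load_multiplicities(example)
```

Rejecting duplicates up front avoids silently picking one of two conflicting multiplicities for the same geometry.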

@rwest
Member Author

rwest commented Feb 20, 2019

Some seem to be missing:

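import os
import pandas

# Load the table of file names and multiplicities.
df = pandas.read_csv('mults.csv')
for dirpath, dirnames, filenames in os.walk('.'):
    # Skip the top-level directory itself and anything under .git.
    if './' not in dirpath or '.git' in dirpath:
        continue
    # Directory relative to the repo root (e.g. 'high', 'low', 'med').
    d = os.path.relpath(dirpath, '.')
    for f in filenames:
        if not f.endswith('.xyz'):
            continue
        p = os.path.join(d, f)
        # Count the rows in mults.csv whose file_name matches this path.
        found = (df.file_name == p).sum()
        if found != 1:
            print(p, found)

high/C=CC=C+[O]O_C=[C]C=C+OO.xyz 0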
high/[O]O+[CH2]CCC_CCCC+[O][O].xyz 0
low/C=CCC+[CH]=C_C=[C]CC+C=C.xyz 0
low/[O]OC=O+CC(C)(C)O_O=COO+CC(C)(C)[O].xyz 0
low/CO+CCCCO[O]_[CH2]O+CCCCOO.xyz 0
low/CCCO[O]+C=CC_CCCOO+[CH]=CC.xyz 0
low/CC(C)O[O]+CC=O_CC(C)OO+C[C]=O.xyz 0
low/C[C]=O+CCC(C)O_CC=O+C[CH]C(C)O.xyz 0
low/C#CC+[O][O]_C#C[CH2]+[O]O.xyz 0
low/[CH2]O+CCCC_CO+C[CH]CC.xyz 0
low/[CH3]+C=C=O_C+[CH]=C=O.xyz 0
med/[CH3]+CC(C)CC(=O)CC(C)C_[CH2]C(C)CC(=O)CC(C)C+C.xyz 0
med/[OH]+CCCC(C)CO_CCCC(C)[CH]O+O.xyz 0
med/[H]+CCCC=CC(C)C_C[CH]CC=CC(C)C+[H][H].xyz 0
med/[H]+CCC(C)CC(C)C_[H][H]+CCC(C)[CH]C(C)C.xyz 0
med/[H]+CCCCCC1CCC(C)O1_CCCCCC1C[CH]C(C)O1+[H][H].xyz 0

@nateharms
Member

@rwest thanks for the catch, should be good now 👍


3 participants