-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Database test #36
Merged
Merged
Database test #36
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Found using checkWellFormed -Cs_H_out_Cs2 adjList in rules changed to match group.py. group.py adjList was more consistent with siblings. -in R3H_SS_S, some atoms changed from R to R!H. The atoms had two bonds or a radical, so they had to be by definition -Cs_H_out_H/OneDe LogicOr was replaced with an AdjList. This was a strange case where the parent and children were both Groups (Usually I would expect a given LogicOr group to have LogicOr parent. With the new flexibility that we have we have programmed into AdjList, I'm not even sure we need LogicOr anymore). I'm not sure if we should never have this weird sandwiched LogicOr, but in this case I found I could come up with an AdjList that fulfills all the correct relationships.
Previously was calculating Q using natural log, but I want to use log base 10 so that I get orders of magnitude for my comparisons. Also I was using an averaged temperature of the rule's range for calculating k. This was when I wanted to calculated the Leave-one-out value vs the rule's value. Now, I'm pretty sure I want to compare node to node for future things, so I've just simplified it so that I calculate everything at 1000K.
A few children errors: -Changed C to Cs in Cs_H and all its (numerous) descendants -Changed C to Cd in Cd_H and all its descedents
Additional lines of output given for checkWellFormed family. One error we check for is if a label in rules.py is misnamed, i.e. if any part of the label is not a group in group.py. There a decent amount of these errors due to the recent Java/Py database merge. To make my life database debugging life easier, I added a new feature which tells me if there is a group which matches the misnamed label. For example, let's say we have a rule label, A;B. Somebody changed the name of A to C in groups.py, but left it unchanged in rules.py. The new feature would suggest C as the correct name for A, by comparing adj lists of the rule and those in groups.py Unfortunately, there is a bug I can't figure out. If somebody changed A to C AND B to D, it suggest C for A and C for B. The manual work around is to visually verify which one adjList C matches, make the fix, and run checkWellFormed again. It will then suggest D for B. I feel like this isn't worth fixing because it will be unable to suggest groups with the upgrade to universal database (once all adjlists are removed from rules.py)
These were found by the checkWellFormed function. Changes to rule labels rules.py. For template A-->B (x occurences). 'A' did not exist in groups.py, so the name was changed to the group 'B' with a matching adjList. 'x' is the number of rules which used 'A'. -O_sec_rad --> OJ_sec (1 occurance) -H_rad does --> HJ (5 occurrences) -N3t_Ct/H --> N3t_Ct-H (1 occurrence) -O_pri_rad --> OJ_pri (6 occurrences) -O_rad/NonDeO --> OJ-Os (13 occurrences) -Ct/H_N3t --> Ct-H_N3t (2 occurrences) -N3d/H_N3d/H --> N3d-H_N3d-H (1 occurrence) -N3d/H_Cds/H2 --> N3d-H_Cds-HH (1 occurrence) -Cds/H2_N3d --> Cds-HH_N3d (1 occurrence) -N3t_CtH --> N3t_Ct-H (1 occurrence) Change to adj lists in rules.py -Changed radical from 3 to 3Q in "N3t_N3t;CH_quartet" to match group definition. (CH_quartet only appears in one rule by bbuesser). -For the top node R_R;YJ, the definitions of R_R and YJ? were both updated to the more inclusive definition in groups.py (effects top node) -Ct-Ct_Ct-Cs was found to be wrongly defined in groups.py and rules.py, so changed all occurrences to the correct definition (19 occurrences) -Ct-Ct_Ct-H was found to be wrongly defined in groups.py and rules.py, so changed all occurrences to the correct definition (19 occurrences) Change to adjlist in groups.py -CH2_triplet changed to only allow 2T instead of 2S in N3t_N3t;CH2_triplet (effects 1 rule by bbuesser) -Ct-Ct_Ct-Cs adj list was definitely wrong. Found because it was a duplicate of Ct-Cd_Ct-Cs. Changed the Cd in the adj list to a Ct. (appears in 19 rules) -Ct-Ct_Ct-H adj list was definitely wrong. Found because it was a duplicate of Ct-Cd_Ct-H. Changed the Cd in the adj list to a Ct. (appears in 19 rules) -Changed Ss atom to more general S atom in SJ. It had the same adj list as it's only child SsJ. (no rules use Ss)
1) Path was hard-coded to C:\RMG-database. It now runs on whatever folder the EvaluateKinetics.py script itself is in. 2) We weren't specifying which kineticsFamilies to load, which is now a requirement of the 'load' method (perhaps there should be a default?)
These were found using the checkWellFormed function. -changed CtJ's had an atom C, that I've changed to so that it's children can include nitrogen (1 occurrence in rules.py, The original paper looks like it only consider C, so I changed the rule from Ct-H_Ct-H;CtJ to Ct-H_Ct-H;CtJ_Ct -changed OdR had an atom {C,S} changed to Od-R!H, to include nitrogen in children (no rules affected) -reordered atoms in CsJ-OneDeNsCs. Doesn't change graph, but makes Adjlist more obviously a subgroup of parent -added nitrogen atoms N3s and N5s into set of possible atoms for CsJ-OneDe -changed O to Os in OJ-Os. Adj list restricts the atom to Os (and it's in the name) (also changed in 13 rules) -changed some C atoms to Cd or Cs according to definition in Cds-HH_Cds-Cs\H3/H. Also reordered pretty much the entire adj list so it matched up with it's parent's ordering. (also changed in 3 rules) -O_atom_triplet allows 2S and 2T. Changed it to only 2T. -CH_quartet changed to from C 3Q --> Cs 3Q. There were two groups named CH_quartet, I clarified one of them and deleted the other. -Two definitions of Y_1centerbirad. Deleted one of them (kept one that allows nitrogen) -Removed group N3_atom_quartet which was a duplicate from merge (same as N_atom_quartet) -Deleted one instance of OJ-OneDe (they were identical) -O_rad/OneDe was previously both renamed to OJ-OneDe and the adjlist was changed to include nitrogen. However, the rule that uses it was created using AGvandeputte's GAV value without nitrogen. As such, I've re-created a group O_rad/OneDe with no nitrogen and made it a child of the more general OJ-OneDe. Name is kinda non-descriptive though...Also gave it a random index, which are all screwed up anyway. - Same history and solution as above for O_rad/NonDe to OJ-NonDe. Also had to move some of children around to preserve correct relationships. Changes to tree: -L6: Cds-NonDe2_N3d changed to L5. Clearly a typo based on positioning, name, and comparing adj list with relatives. -Added O_atom_singlet, NH_singlet, and CH2_singlet under Y_1centerbirad. They had definitions but did not appear in the tree -Changed level of CtJ_Ct and CtJ_N3t from L3: to L4. They actually are the child of one of their siblings CtJ! That sounds like the plot of a crappy novel.
After discussing with bbuesser and agvandeputte, we concluded a singlet biradical would react much faster as a 1,2_cycloaddition than a regular addition. There were no rules which used singlet biradicals (probably because they do not react through this channel). I am only restricting singlet biradicals for *3 atoms, I still allow them in R_R tree. Groups were found by searching for the regex '\*3 .*2', which returned all lines with *3 atoms that were birad (there were only 9 of them). Then I manually searched them for biradical singlets. Removed the following groups: O_atom_singlet CH2_singlet NH_singlet Changed *3 atom from 2 to 2T in the following groups: Y_1centerbirad CO_birad
There has been some confusion as to whether Cdd is a specific case of Cd. In the current builds, they are exclusive. Thus many groups have had atoms changed from Cd to {Cd, Cdd}. Here, I have copied over all the changes which appeared in groups.py to the proper adj lists in rules.py. The groups that changed were: R5_SS_D (168 changes in rules) R4_S_D (168 changes in rules) R6_SMS_D (168 changes in rules) R5_SD_D (1 change in rules) R6_SSS_D (168 changes in rules) R3_D (1 change in rules) A few of the more general groups had the above change and nitrogen added to their definition: R3_D (1 change in rules) top node, multiplebond_intra changed to include N and Cdd
-R6_RSR was missing a label, *6, on one of it's atoms. Its parent and all it's children had the *6 label, so I'm fairly sure it was just a typo. (0 rules affected) -carbonyl_intra atom changed from O {1,D} to Od {1,D}. Same fix for all its children: carbonyl_intra_H, carbonyl_intra_Nd, carbonyl_intra_De (0 rules for all groups affected)
-For three rules: changed atom from C to Ct for group Ct_rad/Ct. -For rule Ct_rad;O_Csrad, the original rule was for C2H + CH2OH --> C2H2 + CH2O, but the group has been changed to be inclusive for nitrogen. Since we have an original reaction, I decided to make a training reaction and remove this rule. I found the last error serendipitously because the rule's adjList was mismatched with the groups. I don't have a systematic check for when rules may not be applicable to expanding rules.
The following groups had their *1 and *2 atom changed from C to Cd. This is always true based on their adjList. Unless stated, the group appeared in 0 rules: Cd/Nd2_Cd/Nd/De Cd/Nd2_Cd/H/De Cd/De2_Cd/H/Nd Cd/H/Nd_Cd/H/Nd Cd/H/De_Cd/Nd2 Cd/Nd/De_Cd/H/Nd Cd/H/De_Cd/H/De Cd/H/Nd_Cd/H/De Cd/Nd2_Cd/Nd/De Cd/H/De_Cd/H2 Cd/NdDe_Cd/H2 Cd/Nd2_Cd/De2 Cd/De2_Cd/H/De Cd/H/Nd_Cd/Nd2 Cd/Nd/De_Cd/H/De Cd/Nd/De_Cd/Nd2 Cd/H2_Cd/H/Nd (3 rules) Cd/H2_Cd/H/De Cd/De2_Cd/Nd2 Cd/Nd2_Cd/H/Nd Cd/H2_Cd/De2 Cd/H/De_Cd/Nd/De Cd/H2_Cd/Nd/De Cd/Nd/De_Cd/De2 Cd/H/Nd_Cd/Nd/De Cd/De2_Cd/De2 Cd/H2_Cd/Nd2 Cd/De2_Cd/H2 Cd/Nd2_Cd/H2 Cd/Nd2_Cd/Nd2 Cd/De2_Cd/Nd/De Cd/H/De_Cd/H/Nd Cd/H2_Cd/Cs2 (3 rules) Cd/H/Nd_Cd/H/Os (1 rule) Cd/H/Nd_Cd/H2 (2 rules)
C atoms changed to Cs in CSm;CH2CH3 rule to match the CH2CH3 group
-Moved a comment at the top to the longDesc of the nodes it affects. They were originally referenced by index, but who knows when that might get reshuffled/changed -replaced C with Cs in all instances of 'carbene' -Many groups had their *2 atom (therefore always in group2 for group1;group2) changed from C to Cs or Cd to match the definition in groups.py. Listed below is all the rules affected: CO_birad;C_pri/NonDeC carbene;C_pri/Ct carbene;C_pri/Cd carbene;Cd/H/OneDe carbene;Cd_pri carbene;C_pri/Ct carbene;ethene (2 atoms changed from C to Cd) CO_birad;C/H2/NonDeC carbene;Cd/H/NonDeC CO_birad;C_methane
First of many changes to H-Abstraction family found by the databaseTester. -Group definition (groups.py) of C/H/CtCt somehow got mixed up to: '1 *3 R!H 2'. Changed back to correct definition based on its rules and java's definition -"Cd_pri" was previously changed to include N. A child underneath it "Cd/H2/NonDeC", is it's previous definition without N. 48 rules had their label changed from Cd_pri;X, to Cd/H2/NonDeC;X. I checked the source for all the rules and they shouldn't have N. -"Ct_pri" was previously changed to include N. A child underneath it "Ct_rad/Ct"", is it's previous definition without N. 7 rules had their label changed from X;Ct_pri, to X;Ct_rad/Ct. I checked the source for all the rules, and they shouldn't have N. -Also in the rules with Ct_rad/Ct, I changed C to Ct, which is true by definition -Changed definition of "N3d/H/NonDeC" in the rules to match the group's definition. The groups.py definition was more specific and corresponded better with the label. (affected 5 rules) -Created new group O_rad/OneDeC, underneath O_rad/OneDe because O_rad/OneDe was expanded to include N, but there are rules that only use C. Changed the name on the relevant rules, and moved some children (O_rad/Cd and all it's children) underneath the new O_rad/OneDeC node -Because of above change, O_rad/Cd and all it's children had C atoms changed to Cd or {Cd,Cdd}
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
-Continuation of fixes to databases
-Minor improvements to checkWellFormed: be sure to pull with corresponding pull request in RMG-Py (or this won't work)