Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tree averaging issue #712 #713

Merged
merged 5 commits into from
Jul 26, 2016
Merged

Fix tree averaging issue #712 #713

merged 5 commits into from
Jul 26, 2016

Conversation

KEHANG
Copy link
Member

@KEHANG KEHANG commented Jul 14, 2016

This PR tries to fix the issue #712 by

  • generating all combinations of cross-level rules
  • when calculating the estimated rules, the averaging scheme is only choose distance-1 rules

@mention-bot
Copy link

@KEHANG, thanks for your PR! By analyzing the annotation information on this pull request, we identified @jwallen, @connie and @rwest to be potential reviewers

@rwest
Copy link
Member

rwest commented Jul 14, 2016

No time to look into it further myself now, but I think this is being killed because it's run out of memory on travis, not because it's run out of time.

@connie connie force-pushed the fix_tree_averaging branch 2 times, most recently from 37ea75c to 79966f6 Compare July 21, 2016 19:38
@connie
Copy link
Member

connie commented Jul 21, 2016

In addition to kehang's notes on this pull request, I have also updated:

  • The storage of kinetics comments for averaged rules is streamlined. Previous storage of fully recursive kinetics comments into the rule entries that are averaged during fillRulesByAveragingUp spent more than 5 GB memory during RMG runs. Therefore, I have simplified it to listing the distance 1 children that contribute to kinetics only. In order to find the actual sources of the original kinetics, one would need to look into the code further. This means that verbose comments are no longer as verbose as before. To turn back on the truly recursive comments, you can go into the code to uncomment a few lines.
  • Euclidean distance algorithm implemented for the selection of the best kinetics to use in estimateKinetics. Previously, we went with the Manhattan distance, which meant an entry that was a distance of (3,0) away was equally favorable as an entry of distance (2,1) away. Now, switching to a euclidean distance norm we will choose the (2,1) entry over the (3,0) entry. This is because we saw in the models that higher top level nodes were becoming increasingly incorporated. I think we want the algorithm to prevent top level nodes from appearing as infrequently as possible.

image + image = image + image

C2H3(13) + C2H3(13) ⇔ C2H4(8) + C#C(25)

Before:

! Template reaction: Disproportionation
! Flux pairs: C2H3(13), C2H4(8); C2H3(13), C#C(25); 
! Estimated using average of templates (Y_rad_birad_trirad_quadrad;Cd_Cdrad) + (Y_rad;CH_d_Rrad) + (Cd_pri_rad;XH_Rrad) for rate rule
! (Cd_pri_rad;Cd_Cdrad)
! Multiplied by reaction path degeneracy 2
C2H3(13)+C2H3(13)=C2H4(8)+C#C(25)                   6.005e+11 0.167     1.127    

After:

! Template reaction: Disproportionation
! Flux pairs: C2H3(13), C2H4(8); C2H3(13), C#C(25); 
! Estimated using template (Y_rad;CH_d_Rrad) for rate rule (Cd_pri_rad;Cd_Cdrad)
! Multiplied by reaction path degeneracy 2
C2H3(13)+C2H3(13)=C#C(25)+C2H4(8)                   6.447e+06 1.902     -1.131   

Although this pull request increases the time required to run the minimal example, we believe that the kinetics estimation is in fact better than it was previously. Running the RMG-database/testing/evaluateKinetics.py may help elucidate how the averaging works as well: ReactionMechanismGenerator/RMG-database#117

connie added 5 commits July 26, 2016 16:16
…mplate

As agreed upon by Prof. Green, we will no
longer average strict children, instead we will
average together manhattan distance 1 nodes and traverse
the tree completely before stopping
…t when there's just one entry

Deep copy the kinetics even if one entry is used
in the average, because we are going to be modifying the comments
and we don't want to modify the original object.

Use just the children's template labels for an averaged rule's comments.
This modifies the commenting that we used to do, which
traces the entire averaging history by recursively
appending to the kinetics comments in order to
view the actual original templates that factor into
the average.  Instead this new method of writing
the kinetics comments just lists the distance 1 children that
in fact has kinetics.  In order to do more detailed analysis, one
would have to go inside to uncomment the debug lines
that use the former kinetics comment style that retains
full information.  However, this is too memory intensive
and difficult to read for normal RMG jobs, and wil print
out too much to verbose chemkin output using the original method.
Previously, we were using a distance algorithm that choose
the manhattan distance.  Now we move to use the Euclidian norm
distance instead in hopes of decreasing the usage of very general nodes

Speed up norm distance calculation by storing the node's level
in the Entry() object when parsing group tree
@connie connie force-pushed the fix_tree_averaging branch from 79966f6 to b21bc48 Compare July 26, 2016 20:29
@connie connie merged commit 3211415 into master Jul 26, 2016
@connie connie deleted the fix_tree_averaging branch July 27, 2016 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants