New closure methods #131
Conversation
* Note that the last sentence in the `Initialization` section was removed entirely. It was unclear what that sentence was meant to say, and it did not make sense in the context of the surrounding paragraph.
* Some additional explanation is necessary in this section. Specifically, does the winner of a tournament undergo exactly one of these operations, or potentially all of them, with the probability of each operation being performed given by its respective parameter?
Integers are not inexact, and thus do not have a machine epsilon. If we always cast to float64, e.g. `abs_x2 = np.abs(x2, dtype=np.float64)`, then we could define a single `_EPS` for all input dtypes.
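A minimal sketch of that cast-first idea (the function name and the `np.copysign` handling are my assumptions, not code from this PR):

```python
import numpy as np

_EPS = np.finfo(np.float64).eps

def protected_div(x1, x2):
    # np.abs is a ufunc, so its dtype argument forces the computation
    # into float64 even for integer inputs; a single float64 _EPS is
    # then always meaningful.
    abs_x2 = np.abs(x2, dtype=np.float64)
    # Shift the denominator away from zero while keeping its sign;
    # at x2 == 0 this yields +_EPS, so division returns a large but
    # finite value instead of inf/nan.
    denom = np.copysign(abs_x2 + _EPS, x2)
    return np.divide(x1, denom)
```

Under this sketch, `protected_div(1, 0)` returns roughly 4.5e15 rather than `inf` or a constant.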
@trevorstephens, any input on this?
I'll be going through the issue tracker tomorrow! Sorry, I've been quite busy the past couple weeks and haven't had a chance to go through the recent input. Always thankful for the community involvement, and I wish I had more time!
```diff
@@ -14,6 +14,7 @@

 __all__ = ['make_function']

+_EPS = np.finfo(np.float64).eps  # Used by protected functions
```
This would mean that results may differ from machine to machine I guess. Don't really like this!
Suggested change:

```diff
-_EPS = np.finfo(np.float64).eps  # Used by protected functions
+_EPS = 1e-6  # or any other value
+# Consider printing a warning if the default value of _EPS is smaller than
+# the machine epsilon of whatever dtype is being used (e.g. np.float16)
```
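A sketch of the warning suggested in that comment (the helper name `_check_eps` is hypothetical):

```python
import warnings
import numpy as np

_EPS = 1e-6  # fixed default, per the suggestion above

def _check_eps(dtype):
    # A fixed _EPS below the machine epsilon of the working dtype would
    # be rounded away when added to values of that dtype (np.float16 has
    # an eps of about 9.8e-4, well above 1e-6).
    if np.issubdtype(dtype, np.floating) and _EPS < np.finfo(dtype).eps:
        warnings.warn(
            "_EPS (%g) is smaller than the machine epsilon of %s (%g); "
            "protected functions may still produce non-finite values."
            % (_EPS, np.dtype(dtype), np.finfo(dtype).eps))
```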
This solution still provides the benefit of returning more reasonable values as we approach singularities. I could then also remove the explicit conversion to `np.float64` in the definitions of the protected functions, since inputs would be implicitly converted upon addition of `_EPS`. Testing of the protected functions would need to be updated with a lower tolerance.
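To illustrate, a minimal sketch of a protected function with the explicit cast removed (the function name and the sign-preserving shift are my assumptions; only the fixed `_EPS` comes from the suggestion above):

```python
import numpy as np

_EPS = 1e-6  # fixed value from the suggestion above

def protected_div(x1, x2):
    # No explicit cast to np.float64: adding the float _EPS promotes
    # integer inputs to floating point implicitly.
    # Shift the denominator away from zero, preserving its sign.
    denom = np.where(x2 >= 0, x2 + _EPS, x2 - _EPS)
    return np.divide(x1, denom)
```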
Hey @pricebenjamin, appreciate the input. The solution here has the immediate red flag that the results could depend on the dtype of the incoming data, and thus differ from machine to machine. But perhaps we should back up a little and discuss why this solution is better. Is a more mathematically exact formulation around singularities desirable? It feels like close to infinity is still about as bad as actual infinity when it comes to fitness in most cases.
Also, can we remove the documentation changes from this PR and put them in #129, @pricebenjamin? I am very hesitant about this change due to the potential machine dependencies as it stands, but each PR should also address a single issue.
All of the documentation changes are actually in #129. I made the mistake of not creating the documentation branch separately, so those changes automatically appear in this branch. Should I pull the original documentation into this branch? Or would it be possible to complete #129 first?
Ah yeah, no worries then.
Understood.
These new protected functions have a much smaller range over which they're degenerate (with 64-bit precision, these regions are nearly non-existent), which consequently decreases the likelihood that programs will contain degenerate nodes. However, it may be the case that…

Edit: Perhaps more importantly, see the answer to your last question.
Having formulas which match expectations is desirable, in my opinion. If it doesn't impact performance (which I could try and test more thoroughly), is there a downside to making them more exact?
Part of the issue with the old implementation is that the functions were not producing large results when they should have been: near a singularity, protected division returned the constant 1 instead of a large quotient.
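A hedged sketch of that behaviour, assuming the closure form used in released versions (`np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.)`):

```python
import numpy as np

def old_protected_div(x1, x2):
    # Released-style closure: constant 1 inside the |x2| <= 0.001 band.
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.)

print(old_protected_div(1.0, 1e-2))  # 100.0, the true quotient
print(old_protected_div(1.0, 1e-4))  # 1.0, though the true quotient is 10000
```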
Also, I greatly appreciate the input @trevorstephens!
Though I welcome the input and discussion, I am very reluctant to merge this PR. I'm a bit unconvinced by this "reasonable value" point of view. What is reasonable? Why is the current implementation unreasonable? Does a more pure division mean that the results will be more reasonable, or better? Is there any research on this front? (If so, I'm happy to read it.)

While I understand that the change is at first glance minor, it completely breaks compatibility with all prior releases, so that needs a bit of pause. Since the advantages of the change are unknown, the disadvantages equally unknown, and the change can easily be implemented by the user via the existing custom-function functionality, I think I'm leaning towards a no on this one. I do appreciate the input, but I'm not convinced that the benefits are proven here, and the negatives are non-trivial.
I totally understand the reluctance 👍 I'll see if I can find some resources that discuss the issue of closure.
Could you elaborate on this? I understand that code written for prior versions would, most likely, produce different results, but I would expect that this method produces fewer degenerate nodes and ultimately better programs (specifically because it correctly penalizes parameters near singularities). If you're interested, I could work on thoroughly comparing this implementation to the previous one. It may take a few weeks, though.
@pricebenjamin -- exactly. Code written to run on older versions of the package will produce different results in the new version. This makes people less likely to upgrade, and thus to adopt new features, or leaves them surprised by different results. I like to avoid these things wherever possible. You're welcome to make any comparisons you wish; I can't promise that I will merge this, though, due to the compatibility issue.
What might at least be worth considering is the way the closure is performed. I personally don't care so much about how large the EPS is, but about how it is applied.
I fail to see why the GP system needs to generate a constant 1 in the first place; it is also not explained in the field guide.

Example A: in this case the current GP system would never be able to learn the original function (a hedged illustration follows below).

Example B: …

As Examples A and B highlight, both implementations have their drawbacks.
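A hypothetical illustration of Example A's point (not the original example), assuming the released closure (constant 1 when |x2| <= 0.001) and a target of y = 1/x:

```python
import numpy as np

x = np.array([1e-4, 1e-2, 1e-1])
target = 1.0 / x  # the function the GP system should learn

# What the exact program div(1, x) evaluates to under the constant-1 closure:
predicted = np.where(np.abs(x) > 0.001, 1.0 / x, 1.0)

print(target)     # -> 10000, 100, 10
print(predicted)  # -> 1, 100, 10: a huge error at x = 1e-4
```

Even the exact solution div(1, x) is heavily penalized on the first point, so a fitness-driven search has no reason to prefer it.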
That's a fairly persuasive case, @wulfihm... Will think about it.
This PR is intended to solve issue #130. Performance tests should be conducted. There's probably a better solution for determining `eps`, but this method determines the smallest `eps` available for the inferred `dtype`. It may fail if `x1.dtype != x2.dtype`. (How often would that be the case?)
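A sketch of one way to make the inferred-`eps` idea robust to mixed dtypes (the helper name and the `np.result_type` promotion are my assumptions, not this PR's code):

```python
import numpy as np

def _infer_eps(x1, x2):
    # Promote both inputs to a common dtype first, so mixed-precision
    # arguments (x1.dtype != x2.dtype) still yield one well-defined eps.
    dtype = np.result_type(x1, x2)
    if not np.issubdtype(dtype, np.floating):
        dtype = np.float64  # integer dtypes have no machine epsilon
    return np.finfo(dtype).eps
```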