Performance Comparison to nupic #792
htm.core is a rework of NuPIC so that it actually works. I think you will find that nearly everything you want to do with NuPIC is covered by htm.core; NuPIC, on the other hand, is basically broken. @breznak will have to address the performance of htm.core, but I expect it to be comparable to (or perhaps even better than) NuPIC.
@steinroe glad for your interest!
In terms of feature-fullness and active support, htm.core has now much surpassed its parent, numenta's nupic(.core).
Yes, this is an open problem. Theoretically htm.core should be better than, the same as, or slightly worse than Numenta's nupic. We made a couple of improvements as well as a couple of "regressions" in the name of biological plausibility. You could start by looking at the Changelog.md; the git diff is already too wild. Yet, so far htm.core's performance on NAB is much worse. (Note, this regression is not observed on the "sine" or "hotgym" data.) My gut feeling is it's just "some parameter that is off".
You wouldn't have to dive deep, and we'd be here to help you. So my recommendation would be: give htm.core a trial period (say 1-5 weeks) and try to find where the culprit is. Doing so would help the community and would be a significant result for your thesis. I could help you in narrowing down the process in NAB so we can locate the error.
@breznak and @dkeeney thanks for the quick replies! Regarding the Spatial Pooler, htm.core differentiates from nupic by three parameters: The second thing I did was a direct comparison of the anomaly scores yielded by nupic and by htm.core on the same datasets. You can find the output in the attached pdf. There are two things that aren’t right:
What would you say: is it rather because of the params of the encoders, or the params of the algorithm itself? Also, there are differences in the parameters of the RDSE encoder. Only the resolution parameter is also used in nupic. Or did I miss something here?
Very nice analysis @steinroe !! 👍
first, let me correct my mistake, the file I wanted to point you to is About the API (and implementation differences):
Cool. If you don't have your own framework for optimization, I suggest looking at
This is tricky, but can be computed rather precisely. I'll have to look at this deeper again...
So this depends on the expected average number of ON bits from the encoder, the number of synapses, the range of the input field each dendrite covers, and how noisy the problem is.
depends on
Seems your optimizer preferred that variant. (Note, it might thus also have just gotten stuck in a local optimum.) I'd try "global" and "25% of the input field" as reasonable defaults.
This is lower-bounded by SP's size aka numColumns. (density * numCols >= some meaningful min value).
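As a quick numeric illustration of that lower bound (the numbers below are made up for the example, not taken from any detector config):

```python
def min_local_area_density(num_columns, min_active_columns=20):
    """Smallest localAreaDensity that still yields at least
    `min_active_columns` winning columns under global inhibition.
    `min_active_columns=20` is a hypothetical floor, not an htm.core default."""
    return min_active_columns / num_columns

# With a 2048-column SP and a floor of 20 active columns,
# any density below ~0.0098 starves the SDR of active bits.
floor = min_local_area_density(2048)
```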
Great graphs! Some notes on that later. This leads me to a decomposition. The chain is:
actually, there's a small drop at that time.
The RDSE encoder is rather well tested, so I wouldn't expect a hidden bug there. Maybe unusable default params. It seems to me that the HTM didn't learn at all in most cases (except the "flatmiddle").
I think it'd be in the params. But I cannot tell whether enc/sp/tm; it's all tied together. Great job investigating so far. I'll be looking at it tonight as well. Please let us know if you find something else or if we can help explain something.
@steinroe Update:
PS: I'd also suggest using the following branch; it has some nice prints and comments in it.
CC @Zbysekz as the author of HTMpandaVis: do you think you could help us debug the issue? I'd really appreciate that! TL;DR: minor parameters and changes were surely done to htm.core. Compared to Numenta's Nupic, our results on NAB really suck now. I'm guessing it should be a matter of incorrect params. A look into the representations with the visualizer would be really helpful.
@breznak Thanks for the detailed review!
Sorry, I forgot to mention that I turned that off for both detectors - HTMCore and Nupic.
I saw that right after I was done with my optimization; going to try yours too very soon! Thanks for the information on the parameters. I guess from your explanation this is very likely a matter of some parameters being off. I will look into the default parameters of both to get a complete comparison of what could be different. I set up a repo with the stuff I used to plot both detectors against each other, as I didn't want to bother with all the NAB stuff for that. For the HTMCore detector I actually used yours from htm-community/NAB#15. For nupic, I set up a little server with a (very very) simple API to use the original nupic detector. I just removed the base class, so I put the min/max handling directly into the detector, and removed the spatial anomaly detector in both.
That appears weird to me... I am going to look into the outputs of the scores in detail, thanks for that!
True! I guess we should be able to find the differences in the param settings better.
I've confirmed that on the "nojump" data, the error still persists: HTMcore does not detect anything, while numenta does have a peak. This could be 2 things:
This looks good! I might be interested in that for community/NAB, to provide the (old) numenta detectors. Numenta/NAB switched to docker for the old py2 support, and this seems a good way to interface that! I'll try your repo, thanks 👍
Sorry for the late reply. I am currently in the process of doing an in-depth comparison between the parameters and there are definitely some differences. You can find the table in the Readme here: While most differences could be easily resolved, I need your input on the params of the RDSE encoder. While HTMCore has size and sparsity, nupic has w, n and offset set with default parameters. The descriptions seem similar, however I am not sure if e.g. size and w (which is probably short for width) mean the same. Do you have an idea here @breznak? This is the relevant section of the table, sorry for its size. The value columns show the values which are set by the respective detectors in NAB. If a cell is empty, the default value is used.
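For what it's worth, my reading of the two docstrings is that nupic's `n` corresponds to htm.core's `size` and `w` to `activeBits` (with `sparsity = w / n`); this mapping is an assumption to be verified against both sources, sketched here:

```python
def nupic_to_htmcore_rdse(w, n, resolution):
    """Translate nupic RandomDistributedScalarEncoder params into the
    htm.core RDSE vocabulary.  The mapping itself is my assumption,
    not an official conversion table."""
    return {
        "size": n,                 # total bits in the output SDR (nupic's n)
        "activeBits": w,           # ON bits per encoding (nupic's w, "width")
        "sparsity": w / n,         # htm.core alternatively takes a fraction
        "resolution": resolution,  # same meaning in both implementations
    }

params = nupic_to_htmcore_rdse(w=21, n=400, resolution=0.9)
```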
Nice analysis!! Sounds like a good idea to try that out. I will do that after I am done with the parameter comparison.
That's a good idea to write such a comparison of params/API. I'd like the result to be published as a part of the repo here 👍
I can help on those:
I'm not 100% sure about offset without looking, but it'd be used in resolution, IMHO.
RDSE:
Thanks!
Let me try with the new parameter settings first. If that does not help, that might be another way to check whether it's the encoder.
Yes I saw that and I am calculating the resolution for htmcore encoder the same way to have a fair comparison. The second parameter that is new for htm.core is the
Why was this removed from htm.core?
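For reference, the numenta detector in NAB derives `resolution` from the observed input range roughly like this (paraphrased from memory; the `numBuckets = 130` and the 0.001 floor should be double-checked against the detector source):

```python
def nab_resolution(input_min, input_max, num_buckets=130.0, min_resolution=0.001):
    """Resolution as (I believe) the NAB numenta detector computes it:
    the observed input range split into ~130 buckets, floored at 0.001
    so a constant signal still gets a nonzero resolution."""
    return max(min_resolution, abs(input_max - input_min) / num_buckets)
```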
I got the htmcore detector running with numenta/NAB.
...
I will create a PR once I am done :) Is the table format readable or should I rather make it textual?
The table is good! We might decide to drop the unimportant ones (verbosity, name) for clarity, but that's just a detail.
Yes, I proposed the removal. The reasons were nice, but not crucial, and now I'm suspecting this could be a lead... The motivation for On the other hand, "a 'strong' column's receptive field grows" is also a good biological concept.
This would be a significant result, if one can be proven "better" (dominating) over the other.
As the original SP also has localAreaDensity, I would propose to first try out the original nupic detector with localAreaDensity instead of numActiveColumnsPerInhArea, to see whether it has such an impact. I guess that would be faster than bringing it back.
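Under global inhibition the two parameters are directly convertible, since the inhibition area is the whole SP; a small sanity-check sketch (the classic nupic defaults are assumed for the example):

```python
def equivalent_local_area_density(num_active_per_inh_area, num_columns):
    """With GLOBAL inhibition, a fixed winner count k per step is the
    same as a density of k / numColumns, so the two settings can be
    compared directly."""
    return num_active_per_inh_area / num_columns

# nupic's classic 40 winners out of 2048 columns:
density = equivalent_local_area_density(40, 2048)  # ~0.0195, i.e. ~2 %
```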
Good news! Using the Numenta parameters with a localAreaDensity of 0.1, I achieve a score of 49.9 on NAB. At least some improvement. Going to try out the swarm algorithm to optimise localAreaDensity now.
These are the params:
Another thing that I wondered about: in the detector code there is a fixed param 999999999 defined when setting the info for the TM and SP. Shouldn't this be encodingWidth? Or is it the potentialRadius?
No, this is unimportant. The metric is only used for our info; it does not affect the computation.
Alright, it seems to have a significant impact. Running the Numenta detectors with The only question remaining now is whether tuning Here is the code: https://github.com/steinroe/NAB/tree/test_numenta_localAreaDensity
Used Bayesian optimization, but nevertheless these are the results:
for the following params:
I created a PR htm-community/NAB/pull/25 for the updated params. These are the logs for the seed fixed to 5, where I achieved a standard score of 60 as a maximum. @breznak how would you suggest we proceed from here?
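The optimization loop itself doesn't need much machinery; here is a stdlib stand-in for the Bayesian optimizer (the real objective is a full NAB run, represented below by a placeholder `score_fn` of my own invention):

```python
import random

def optimize_density(score_fn, low=0.01, high=0.20, n_trials=200, seed=5):
    """Toy 1-D random search over localAreaDensity.  `score_fn` is a
    hypothetical stand-in for launching NAB with the candidate density
    and reading back the standard score."""
    rng = random.Random(seed)
    best_d, best_s = None, float("-inf")
    for _ in range(n_trials):
        d = rng.uniform(low, high)
        s = score_fn(d)
        if s > best_s:
            best_d, best_s = d, s
    return best_d, best_s
```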
Wow, these are very nice results! I'm going to merge NAB.
Interestingly, this is what's claimed by the HTM theory (2%) as observed in the cortex.
compared to the Numenta results:
I'd suggest:
Q: do we want to keep comparable params & scores, or just aim for the best score? Or both, separately?
Yes, but they still win with
For this test I just optimised the localAreaDensity, keeping the others constant, but it can be multi-parametric.
Alright, I will work on that. Do you have a feeling about which params may be important / optimizable?
Perfect, thanks for your work!
The "problem" is that your framework calls the optimization function in parallel, so my current setup with Bayesian optimisation, where I simply write to and read from a params.json file, won't work. I will think about how to set that up and come back to you as soon as I have a solution that works.
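One simple way around the race on a single params.json is to give every worker its own file keyed by process id; a sketch (the file-naming scheme is my invention, not the framework's protocol):

```python
import json
import os
import tempfile

def write_params(params, directory=None):
    """Write one worker's parameter set to a per-process file so that
    parallel NAB runs never read a half-written shared params.json."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, "params_%d.json" % os.getpid())
    with open(path, "w") as f:
        json.dump(params, f)
    return path

def read_params(path):
    """Read a parameter set previously written by write_params()."""
    with open(path) as f:
        return json.load(f)
```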
My gut says this may be the problem, as the scoring results seem fine. We should debug this.
I would say we just aim for the best score, as even the bug fixes in this fork may influence the best param setting. I don't know if it's useful to keep a second set of params, as a comparison between the two would only be fair if both use the best possible params.
I like this, this is a really nice trick to get it done with NAB.
Good, because I thought it didn't run in parallel. We could do something like writing to
Great. I was thinking you were developing from scratch; it's better that you can use our existing tools!
even better! this looks really convenient.
I'll try working on constraining the params, and along with that we can come up with a subset and its ranges.
We would probably still get interference when the detector reads the params file. I think locking the parallel runs of NAB into a container is the cleanest and safest method. The only downside is probably performance, as we have to give the docker host a fair share of our host resources.
Bayesian optimization does that automatically, by randomly choosing some parameters in the beginning and from time to time. Additionally, we can probe some settings (let's say 1024, 2048, ... 8192) so the algorithm knows the effect these settings have. It's quite good at avoiding local maxima out of the box (at least in my experience).
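Probing hand-picked values before random exploration can be sketched like this (a stdlib stand-in for the probe facility that Bayesian-optimization libraries offer; the score function is again a placeholder for a NAB run):

```python
import random

def search_with_probes(score_fn, probes, low, high, n_random=20, seed=0):
    """Score a few hand-picked candidates first (e.g. 1024, 2048, ...,
    8192), then explore uniformly at random; return (best_score, best_x).
    A toy version of "probing" known-good settings before exploring."""
    rng = random.Random(seed)
    candidates = list(probes) + [rng.uniform(low, high) for _ in range(n_random)]
    return max((score_fn(x), x) for x in candidates)
```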
Sure, will do.
I think we could put the image on Docker Hub so the user does not have to build the image. With a minor change to the script we could pull the image from Docker Hub. Then the workflow would be as follows. For using the htm.core optimization framework, requirements: Docker Desktop, htmcore, requirements of the script such as docker.
For using the pure Python script with Bayesian opt, requirements: Docker Desktop, requirements of the script such as docker and bayesian opt.
EDIT: Where to put the scripts? Another repo? Or just a branch?
OK, I agree. We need to get there with minimal man-hours, so this works as intended.
interesting. nice if we can "hint" to try these datapoints.
Maybe not necessarily Docker Hub, but a docker file in the repo (NAB) would be great 👍 htmcore is already a dependency of (our) NAB, so that's no problem.
I'd make this part of community/NAB. So submit as PR/branch, and we'll merge directly to master (this enhancement is unrelated to "fixing htmcore scores") |
Sounds good to me! Do we want to enable optimisation on NAB in general or only for htmcore?
I'd say minimal changes first; we need to solve the topic of this issue. So just HTMcore for now. Because I'm still unsure how to proceed with
but
Alright, I will update the optimisation PR with a proposal on how to make it as user friendly as possible.
Hello, sorry for the delay. I am not sure if that is the script that you guys are using to run it.
hi Zbysekz!
that'd be a great step! EDIT:
Ok, in this PR is the modified htm.core detector.
About the Jupyter notebook "Plot Result - numenta.ipynb": like you guys, I also don't get the FP/TP... score. It is something written by https://github.com/pasindubawantha. Just remove this piece of code and that's all. It gets the TP, TN, FP, FN from the resulting file htm_core_standard_scores.csv (why calculate that again?)
True that, the code seems really hacky, and as you say, the scores are there already. Off with it.
I'm really happy how this comes together: fixes, research, optimization, visualization, ... Thanks a lot guys!! 🍾
For the third point of #792 (comment)
I agree, the expected anomaly window (aka label) has wrong values. EDIT: also, I am a little bit afraid that this whole evaluation/optimization also counts the anomaly scores at the beginning... it seems that it does. IMHO, evaluating whether the HTM system gives an anomaly on data that it never saw is fundamentally wrong. Is there some learning period at the beginning? Do you guys know something about this?
NAB has some probationary period which is used by the anomaly likelihood. It basically returns a score of 0.5 until this probationary period is over. Is that what you mean?
We'll have to do it manually. First, we should see if it's a concern for other datasets (realWorld anomalies etc) and/or if we can distinguish a "proper window".
But a guess by hand is feasible.
As Phillip says, the anomaly score is set to some predefined value (0.5) for the
But to be fair, this might not be such an issue:
The probationary percent is hardcoded in runner.py as a parameter of value 0.15.
And where does that weird constant 5000 come from? Probably due to a limit calculation...
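If I recall the NAB source correctly, the 5000 shows up as a cap on the probationary period, roughly like this (paraphrased from memory; verify against the NAB helper code):

```python
import math

def get_probation_period(probation_percent, file_length):
    """Paraphrase of NAB's probationary-period helper: a fraction
    (15 % by default) of the file, but capped for files longer than
    5000 rows so the warm-up doesn't grow without bound."""
    return min(math.floor(probation_percent * file_length),
               probation_percent * 5000.0)

# a 10000-row file gets 0.15 * 5000 = 750 probationary rows, not 1500
period = get_probation_period(0.15, 10000)
```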
thanks for pointing to the code in question.
It shouldn't be. The one in the detector is basically cosmetic; it switches when we think the anomaly scores start to make sense. We should set it to the 15% in NAB. @steinroe I recall in some example you use that "probationary period" as a param to be optimized. Basically that should be unimportant, and could be removed from the (optimized) params. Off-topic: @Zbysekz, seems you've tapped into the NAB code; I'd welcome another review/opinion on htm-community/NAB#21 (NAB from numenta vs community).
Hi everyone, sorry for my inactivity. I had / have to work on other stuff to get going with my thesis. What are the open tasks at the moment? |
@steinroe no problem,
Other relevant tasks (where any help is welcome, hoping to write papers):
FYI: We now have working PyPI releases, https://github.com/htm-community/htm.core/releases/tag/v2.1.15 Is there any progress on the NAB results? @steinroe
Hey @breznak, @psteinroe, @dkeeney and @Zbysekz, I hope you're doing well, and thanks so much for your work on this thread!! It has helped me validate my 'htm_streamer' module (that I showed you, @dkeeney and @breznak, earlier this year). I'm curious: is anyone still curious why htm.core scores lower on NAB than the other HTM implementations (htm.java and Numenta)?? My friend and I would be extremely curious for anyone's intuitions on this, and I'd gladly arrange a quick call if anyone's game.
@gotham29 you should use TM, which worked well for us over 1 year ago
Hi @Thanh-Binh, thanks!! When you say 'TM' do you mean the NumentaTM detector? Thanks again for your thoughts!!
Hi @gotham29
Gotcha @Thanh-Binh, I'm very glad to hear it worked well for you!
@gotham29 as far as I know htm.core should be the best now. Maybe we need to find an optimal parameter set for its TM!
@Thanh-Binh yes, I agree! Though I wonder why htm.core's TM would need a different param set than NuPIC's TM?
Hi everyone,
thanks for the great work on htm.core! I am doing some research on leveraging HTM for anomaly detection and I am wondering whether I should use htm.core or nupic. Is there any comparison in terms of performance?
@breznak You described some issues in PR #15. What would you say: continue with htm.core, or rather get nupic running? It's just a master's thesis, so I won't be able to dive deep into HTM to improve the implementation...