-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update dense implementation to "New NNUE architecture and net" #201
base: master
Are you sure you want to change the base?
Conversation
@Sopel97 |
probably |
I managed to get sparse implementation working. Now no path for AVX2 and AVX512 though as there's no need (AVX2 could probably work like AVX512 did previously but idk). That means that the one big thing left is getting dense to work on >=AVX2 |
@Sopel97 nnue.c: In function 'init_weights': warnings with sparse=yes nnue.c: In function 'verify_net': |
@Sopel97 nnue.c:226:34: error: conflicting types for 'clipped_t' |
4c71e9d
to
71e7d1f
Compare
@Sopel97 |
seems that there are issues with >=AVX2 both for sparse=no and sparse=yes. However, sparse=yes doesn't use AVX2 so the error is not in inference; probably some ifdef I missed. |
More patches applied on top here https://github.com/Sopel97/Cfish/tree/sf_2021-07-23, but the issues outlined in this PR still persist. |
@Sopel97 I tried compiling your latest code with both GCC and Clang, but the bench doesn't match each time. Also the engine has become really slow. |
@MagicianofRiga Incorrect bench only for AVX2/BMI2 and AVX512 builds. |
Is this bench correct?Total time (ms) : 5801 Command - make build ARCH=x86-64-sse41-popcnt COMP=gcc sparse=yes |
no, this patch mirrors official-stockfish/Stockfish@e8d64af |
@MagicianofRiga Don't save excluded move eval in TT |
official-stockfish/Stockfish@e8d64af
Overall I've had quite a bit of trouble with this as there are many hidden assumptions in the code, as well as magic constants sprinkled everywhere. But I've got the dense nnue part working at least. The 16->32 layer is just interpreted and evaluated as 32->32 as it's small and makes it easier.
What still needs to be done
tested archs (make build sparse=no ARCH=...) (can't really test others):
@syzygy1 I'd love you to look at this and we can maybe figure something out for the sparse case, as I really want cfish to get up to date. I tried removing AVX2 and AVX512 paths for sparse, hardcoding the output count to 16 for it, and using dense layers layer, but got a lot of trouble with it, due to different types being used for different archs and whether sparse is used, and different permutations that I don't understand and so on... Right now the nnue-sparse might not be that much of a speedup, but if everything goes well on my end we might end up with similar codepath in stockfish and 1024x2-64-64-1 nets or something like this, so I'd like this path to get preserved in some form.