Parallel automatic differentiation engine written in Thrust C++
This is a passion project I made to feel like I understand the internal structure of deep learning frameworks. It's minimal, since I made it as a learning exercise and not for production. It's also a bit lacking and buggy: I don't have a GPU that supports atomic operations or a basic computations API, so it's not as optimized as it should be. There is also the issue of all the code being in one file, but tbf I don't care, I fucking hate header files.