-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct adjoint for truncated SVD #15
Conversation
…dd small test script
I think this is ready to merge from my side. The I scrapped the |
I updated to the newest KrylovKit version that uses VectorInterface.jl to simplify the coupled linear problem that arises in the SVD adjoint. This allows us to just use a |
Jutho/KrylovKit.jl#83 |
I was thinking about this again, so let me list a few TODOs that should still be implemented before merging:
|
My take on this: Symmetric tensors +
|
I think your take is pretty much spot on. For the case of symmetric tensor + I'm currently writing such a SVD interface with a wrapper function. I think that makes the most sense because that would make user-specified custom SVD functions easy to implement (and that's what I frequently need at the moment). Regarding the input arguments, I would try to keep it quite general. I will push something next week and maybe try to finish the PR then. |
Great! I think the KrylovKit AD PR is getting its final touches, so by then this should be merged and AD might work out of the box. Let me know if I can do something! |
… SVD adjoint script to test
I added some wrapper structs and a wrapper function for the SVD. I thought it would be a useful thing to have extra structs for the SVD algorithm since you can store the different algorithm parameters and still have a single I'm not sure yet how to implement the adjoint for the Also, both the Note that I also included the I could use some help or input on how to reuse the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple comments in the code, but in general I quite like the idea.
Some more generic remarks:
I think, if possible, we should decouple the forwards and backwards implementations as much as possible. This means using something like hook_pullback
, but also that we can move the Lorentzian broadening question into the configuration of the backwards algorithm instead of the forwards. (Am I correct in assuming that the broadening only makes sense when performing AD, and you would otherwise rather have the broadening turned off?) This would also mean that we automatically have the broadening only for the final iteration which is used in the fixed point differentiation, while the bulk of the CTMRG iterations just uses a full decomposition.
I dont like the term svdwrap
, but I guess that can be changed however we like. I wouldnt mind reusing either svdsolve
or tsvd
without importing it, i.e. TensorKit.tsvd would not be the same as PEPSKit.tsvd, which would only matter within this package, and not change it outwards, and then afterwards we could merge the two?
src/utility/svd.jl
Outdated
alg::KrylovKit.GKL = KrylovKit.GKL(; tol=1e-14, krylovdim=25) | ||
howmany::Int = 20 | ||
lorentz_broad::Float64 = 0.0 | ||
alg_rrule::Union{GMRES,BiCGStab,Arnoldi} = GMRES(; tol=1e-14) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep this alg_rrule
separate, maybe re-using the hook_pullback
interface, to decouple the implementation of the forward and backward rules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in terms of usability, I think the SVD algorithm struct would be the best way to set the SVD adjoint parameters. So if I were to use hook_pullback
in this case, I would need to store the alg_rrule
and lorentz_broad
settings in the CTMRG
struct, which I also don't like that much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, that's a very good point.
How about a wrapper then?
struct SVDrrule{F,T,R,E<:Real}
svd_alg::F
truncation_alg::T
rrule_alg::R
lorentz_broadening::E
end
In this case, we still at least semantically separate the rrule from the forward pass (which I really like), while also grouping them together, which I also quite like. It also looks to me to be quite flexible.
I think we should probably investigate (not just here, but throughout PEPSKit), if we want to include the truncation algorithm here or as a separate field. My guess would be that it's actually not a bad idea to keep them separate, considering we presumably might want to have varying truncspace
settings throughout the unit cell, without changing the SVD algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now added such a wrapper. It leads to the cool thing that you can easily choose the rrule with the corresponding algorithm struct and this decouples the forward SVD algorithm and its reverse rule. I decided to hide the lorentz_broadening
away inside the rrule_alg
since in general Lorentzian broadening might not be needed in every SVD implementation.
The only thing I'm unsure about is the naming; maybe something like SVDWithRrule
, SVDWithAdjoint
, DiffSVD
or DifferentiableSVD
would be more descriptive?
Codecov ReportAttention: Patch coverage is
|
as a small heads-up, the latest version of tensorkit now is less strict with the |
Thanks for the heads-up, that is good timing! I will work on the PR tomorrow, so the TensorKit update comes in handy. |
What I just realized: We can get rid of |
It might conflict with the MPSKit compat entries, but I'll try and get a release out for that one asap |
…te svdwrap wrapper and instead directly use tsvd, add KrylovKit compat entry
… SVD test, add docs
Will add a |
Alright, I think I would be fine with merging the PR like this (once the tests work). Maybe we could first merge #49 and get the tests to run there, and then merge master into this branch (such that the tests still hopefully work). |
@lkdvos I'm not sure how to make the PR compatible with Julia 1.8 since I have to use |
I think in 1.8, these extensions are just loaded as regular modules, so you can just import them accordingly. For reference, I did a similar thing here |
Gave it a try and I'm not so sure, why I can't access the extensions from the main module. I don't quite understand why the TensorOperations example works. I might need some help here |
I'm super confused myself now. I have no problem getting it to work from the REPL on 1.8, and apparently it's also fine from a package extension, but somehow the construction in the main package of a module seems to not work. Technically, that construction is also just very illegal, Extensions should not define new functions, so accessing them from a different package is just going against all guidelines. I have an idea on how to avoid the entire problem, but the simplest solution is to just restrict to julia 1.9 😁 I am definitely okay with this restriction, it seems like we will have to do this for TensorKit in the near future as well, as some multithreaded implementation there requires this. |
Oh well, what a mess. Thanks for digging deep :-) I would also be very okay with restricting to v1.9. |
So, in principle, we could support 1.8 by computing the rrule block-per-block in the forwards pass using the normal chainrules interface, and then in the pullback applying these rules block per block. I don't think this makes any difference performance-wise, but I am definitely a little lazy right now. @leburgel , do you have any strong feelings against restricting to 1.9? |
Alright that would be possible, but it seems like quite a big hassle for supporting a non-LTS version... I'd still go with restricting to v1.9 :) |
I have no objections whatsoever to restricting to 1.9, go for it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tests are green, good to go for me
Just noticed a duplicate line, so I removed it. If the tests run, this is also good to go for me! |
This PR implements the correct adjoint for a truncated SVD, as explained in this pre-print. To that end, we use KrylovKit's iterative
svdsolve
which is wrapped in the functionitersvd
, so that a custom rrule can defined locally.I will also add a small test script that compares against the previous adjoint on some gauge-invariant loss function.