
Correct adjoint for truncated SVD #15

Merged · 29 commits merged into master from svd_adjoint on Jul 10, 2024

Conversation

@pbrehmer (Collaborator) commented Mar 4, 2024

This PR implements the correct adjoint for a truncated SVD, as explained in this pre-print. To that end, we use KrylovKit's iterative svdsolve, which is wrapped in the function itersvd so that a custom rrule can be defined locally.

I will also add a small test script that compares against the previous adjoint on some gauge-invariant loss function.
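Roughly, the idea looks like this on a plain dense matrix (only a sketch; the actual itersvd wraps TensorKit tensors and additionally carries the custom rrule from the pre-print):

using KrylovKit, LinearAlgebra

# Sketch only: truncated SVD of a dense matrix via KrylovKit's iterative svdsolve.
function itersvd_sketch(A::AbstractMatrix, howmany::Int; tol=1e-14, krylovdim=25)
    vals, lvecs, rvecs, info = svdsolve(A, howmany, :LR; tol, krylovdim)
    U = stack(lvecs[1:howmany])          # left singular vectors as columns
    S = Diagonal(vals[1:howmany])        # truncated singular values
    Vᴴ = copy(stack(rvecs[1:howmany])')  # rows are the adjoint right singular vectors
    return U, S, Vᴴ
end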

@pbrehmer marked this pull request as draft March 4, 2024 18:12
@pbrehmer marked this pull request as ready for review March 5, 2024 15:11
@pbrehmer (Collaborator, Author) commented Mar 5, 2024

I think this is ready to merge from my side. The itersvd adjoint agrees with the TensorKit.tsvd adjoint on the gauge-invariant loss function I tested it on, so this seems to work.

I scrapped the test_rrule bit for now since I thought we could add it later once we have unit tests running. Then I'll also add the rrule for AbstractMatrix.

@pbrehmer (Collaborator, Author)

I updated to the newest KrylovKit version that uses VectorInterface.jl to simplify the coupled linear problem that arises in the SVD adjoint. This allows us to just use a Tuple and to get rid of all the reshapes, so this is much nicer now.
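As a toy illustration of what the Tuple-based formulation buys us (not the actual adjoint system; it only assumes that linsolve accepts a Tuple of arrays as the vector, via VectorInterface):

using KrylovKit

# The coupled unknowns live in a Tuple of arrays; no reshaping into one long vector.
b = (randn(10, 3), randn(10, 3))
x, info = linsolve(b; tol=1e-12) do (x1, x2)
    # some hypothetical coupled linear map acting block-wise on the Tuple
    (2 .* x1 .+ x2, x1 .+ 2 .* x2)
end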

@lkdvos (Member) commented Apr 4, 2024

Just wanted to link these PRs: Jutho/KrylovKit.jl#83

@pbrehmer (Collaborator, Author)

I was thinking about this again, so let me list a few TODOs that should still be implemented before merging:

  • Make itersvd compatible with symmetric tensors
  • Write some interface which makes it possible to choose different SVD functions during CTMRG. It could be as easy as overloading the alg kwarg in tsvd
  • Rewrite test_svd_adjoint.jl as an actual test

@lkdvos (Member) commented May 29, 2024

My take on this:

Symmetric tensors + KrylovKit.svdsolve

In the case of an explicit input tensor, the mapping is straightforward, as this can be achieved simply block-by-block:

for c in blocksectors(t)
    b = block(t, c)
    x₀ = randn(eltype(b), size(b, 1))                    # start vector in the codomain of the block
    Ss, Us, Vs, info = svdsolve(b, x₀, num, which, alg)  # KrylovKit returns the values first
    block(U, c)[:, 1:num] .= stack(Us[1:num])
    block(S, c)[1:num, 1:num] .= Diagonal(Ss[1:num])
    block(V, c)[1:num, :] .= stack(Vs[1:num])'
end

The annoying thing is to determine how you want to specify num. As far as I can tell, the easiest way of achieving this is just through a truncspace(V)-like specification. This then immediately fixes the structure of U, S, V, so filling in the blocks is also straightforward (I am skipping over all details here).
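For reference, the dense path already takes exactly this kind of specification, e.g. (a tiny trivial-symmetry example):

using TensorKit

t = TensorMap(randn, ComplexF64, ℂ^4 ⊗ ℂ^4, ℂ^4 ⊗ ℂ^4)
U, S, V, ϵ = tsvd(t; trunc=truncspace(ℂ^6))  # truncation specified as a space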

Symmetric tensors + KrylovKit.svdsolve + arbitrary function

In principle, it should also be possible to supply an implicit tensormap (e.g. as the contraction of a network instead of the dense tensormap). This still works, but now we need:

  1. the "row spaces" or "codomain" of the initial guess, or an explicit initial guess
  2. the number of singular values per sector, which directly translates to the inner spaces between U, S, and V

The question here is whether or not we want to allow the initial guess to be an arbitrary tensormap, or restrict it to a tensor. In the latter case it is quite straightforward to build U and V, as this is just catdomain and catcodomain, but in the former there is an additional permutation involved. I guess this is still possible, but there are definitely some subtleties to consider.
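For reference, KrylovKit already accepts such an implicit map as a two-argument function plus an explicit initial guess; a dense toy sketch (the Val flag selects the map or its adjoint, and the guess lives in the codomain as mentioned above):

using KrylovKit

A = randn(40, 30)                        # stand-in for a network contraction
apply(x, ::Val{false}) = A * x           # action of the implicit map
apply(x, ::Val{true}) = A' * x           # action of its adjoint
x₀ = randn(40)                           # initial guess in the codomain / "row space"
vals, lvecs, rvecs, info = svdsolve(apply, x₀, 5, :LR)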

Writing an interface

I guess the easiest is to figure out the implementation of the above first and then check what input arguments we need. Then we just define a wrapper function in PEPSKit that dispatches to either KrylovKit or TensorKit, depending on whether alg isa SVD/SDD or alg isa GKL.


In principle, I would be very happy to have this wrapper in a package extension of TensorKit, as a very similar thing exists for eigsolve and expsolve, but I think this is a good place for a first implementation, which we could consider migrating afterwards.
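A minimal sketch of what such a dispatch could look like (names and signatures are not final, and itersvd here stands for the yet-to-be-written KrylovKit-backed routine):

using TensorKit, KrylovKit

function svdwrap(t, alg::Union{TensorKit.SVD,TensorKit.SDD}; trunc=notrunc(), kwargs...)
    return tsvd(t; alg, trunc, kwargs...)        # dense path via TensorKit
end

function svdwrap(t, alg::KrylovKit.GKL; trunc=notrunc(), kwargs...)
    return itersvd(t, alg; trunc, kwargs...)     # iterative path (hypothetical routine)
end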

@pbrehmer (Collaborator, Author)

I think your take is pretty much spot on. For the case of symmetric tensors + KrylovKit.svdsolve, that's what I would have implemented. The case of an arbitrary function is not yet super relevant for PEPSKit, I think (apparently it doesn't make too much of a performance difference unless you go to really large dimensions), but of course it should be implemented eventually.

I'm currently writing such an SVD interface with a wrapper function. I think that makes the most sense, since it would make user-specified custom SVD functions easy to implement (and that's what I frequently need at the moment). Regarding the input arguments, I would try to keep them quite general. I will push something next week and maybe try to finish the PR then.

@lkdvos (Member) commented May 31, 2024

Great! I think the KrylovKit AD PR is getting its final touches, so by then this should be merged and AD might work out of the box. Let me know if I can do something!

@pbrehmer (Collaborator, Author) commented Jun 20, 2024

I added some wrapper structs and a wrapper function for the SVD. I thought it would be useful to have separate structs for the SVD algorithms, since they can store the different algorithm parameters while still exposing a single svdwrap interface that gets called inside the CTMRG routine.

I'm not sure yet how to implement the adjoint for the IterSVD which uses KrylovKit.GKL. The problem with the adjoint inside KrylovKit is that it uses vectors of vectors under the hood, instead of matrices, to perform all operations. I wonder what the best way is to convert back and forth between Matrix{ComplexF64} and Vector{Vector{ComplexF64}}.
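For the conversion itself, something along these lines should do (a sketch, assuming the columns are the vectors KrylovKit works with):

M = randn(ComplexF64, 6, 4)
vs = [collect(c) for c in eachcol(M)]  # Matrix -> Vector{Vector{ComplexF64}} of columns
M′ = stack(vs)                         # and back to a 6×4 Matrix (stack needs Julia ≥ 1.9)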

Also, both the tsvd! and KrylovKit.svdsolve adjoints lack the possibility of applying Lorentzian broadening to avoid divergences in case of degenerate singular values, which will be necessary in some PEPS simulations. Here I'm also not sure how best to proceed: either we implement the broadening directly in TensorKit and KrylovKit, or we need a custom adjoint inside PEPSKit.
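(For reference, one common form of such a regularization - an assumption about what is intended here, not something specified in this PR - replaces the divergent factors in the SVD pullback by $F_{ij} = \frac{s_j^2 - s_i^2}{(s_j^2 - s_i^2)^2 + \varepsilon^2}$, so that degenerate singular values $s_i \approx s_j$ no longer produce division by zero; $\varepsilon$ is the broadening width.)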

Note that I also included the OldSVD with the previously used (wrong) adjoint. For now it is useful for testing purposes, I think, but we can still throw it out later.

I could use some help or input on how to reuse the KrylovKit.svdsolve adjoint and also on how to realize the Lorentzian broadening. Of course we could also go for a different interface altogether; I'd be happy about suggestions! @lkdvos

@lkdvos (Member) left a comment

I left a couple comments in the code, but in general I quite like the idea.

Some more generic remarks:
I think we should decouple the forward and backward implementations as much as possible. This means using something like hook_pullback, but also that we can move the Lorentzian broadening question into the configuration of the backward algorithm instead of the forward one. (Am I correct in assuming that the broadening only makes sense when performing AD, and that you would otherwise rather have it turned off?) This would also mean that we automatically apply the broadening only in the final iteration used for the fixed-point differentiation, while the bulk of the CTMRG iterations just uses a full decomposition.

I don't like the term svdwrap, but I guess that can be changed however we like. I wouldn't mind reusing either svdsolve or tsvd without importing it, i.e. PEPSKit.tsvd would not be the same as TensorKit.tsvd; this would only matter within this package and not change anything outwards, and afterwards we could merge the two?

alg::KrylovKit.GKL = KrylovKit.GKL(; tol=1e-14, krylovdim=25)
howmany::Int = 20
lorentz_broad::Float64 = 0.0
alg_rrule::Union{GMRES,BiCGStab,Arnoldi} = GMRES(; tol=1e-14)
Member

I would keep this alg_rrule separate, maybe re-using the hook_pullback interface, to decouple the implementation of the forward and backward rules

Collaborator Author

But in terms of usability, I think the SVD algorithm struct would be the best way to set the SVD adjoint parameters. So if I were to use hook_pullback in this case, I would need to store the alg_rrule and lorentz_broad settings in the CTMRG struct, which I also don't like that much.

Member

Ah I see, that's a very good point.

How about a wrapper then?

struct SVDrrule{F,T,R,E<:Real}
    svd_alg::F
    truncation_alg::T
    rrule_alg::R
    lorentz_broadening::E
end

In this case, we still at least semantically separate the rrule from the forward pass while also grouping the two together, both of which I quite like. It also looks quite flexible to me.

I think we should probably investigate (not just here, but throughout PEPSKit) whether we want to include the truncation algorithm here or as a separate field. My guess is that it's actually not a bad idea to keep them separate, considering that we might want varying truncspace settings throughout the unit cell without changing the SVD algorithm.
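For concreteness, constructing such a wrapper might then look like this (purely illustrative; all settings are placeholders):

using TensorKit, KrylovKit

alg = SVDrrule(TensorKit.SDD(), truncdim(20), GMRES(; tol=1e-14), 0.0)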

Collaborator Author

I have now added such a wrapper. A nice consequence is that you can easily choose the rrule via the corresponding algorithm struct, and it decouples the forward SVD algorithm from its reverse rule. I decided to hide the lorentz_broadening inside the rrule_alg, since in general Lorentzian broadening might not be needed in every SVD implementation.

The only thing I'm unsure about is the naming; maybe something like SVDWithRrule, SVDWithAdjoint, DiffSVD or DifferentiableSVD would be more descriptive?

codecov bot commented Jun 28, 2024

Codecov Report

Attention: Patch coverage is 89.14729% with 14 lines in your changes missing coverage. Please review.

Files                      Coverage             Δ
src/PEPSKit.jl             100.00% <ø>          (ø)
src/algorithms/ctmrg.jl    94.81% <100.00%>     (+0.13%) ⬆️
src/utility/svd.jl         87.93% <87.93%>      (ø)

... and 1 file with indirect coverage changes

@lkdvos (Member) commented Jul 2, 2024

As a small heads-up, the latest TensorKit version is now less strict with the tsvd types: Jutho/TensorKit.jl#134 (comment)

@pbrehmer (Collaborator, Author) commented Jul 2, 2024

Thanks for the heads-up, that is good timing! I will work on the PR tomorrow, so the TensorKit update comes in handy.

@pbrehmer (Collaborator, Author) commented Jul 4, 2024

What I just realized: We can get rid of eigsolve.jl since that is now implemented in the new KrylovKit versions. I'll take care of that in this PR. (And of course we should adjust the compat entry for KrylovKit then.)

@lkdvos (Member) commented Jul 4, 2024

It might conflict with the MPSKit compat entries, but I'll try to get a release out for that one ASAP.

@pbrehmer (Collaborator, Author) commented Jul 8, 2024

I will add an IterSVD test for symmetric tensors tomorrow and try to fix the other tests. Then the PR should finally be done (hopefully).

@pbrehmer (Collaborator, Author) commented Jul 9, 2024

Alright, I think I would be fine with merging the PR like this (once the tests work). Maybe we could first merge #49 and get the tests to run there, and then merge master into this branch (such that the tests still hopefully work).

@pbrehmer (Collaborator, Author) commented Jul 9, 2024

@lkdvos I'm not sure how to make the PR compatible with Julia 1.8, since I have to use Base.get_extension for the sparse SVD adjoint in order to access functions from the KrylovKit AD package extension. AFAIK package extensions were only added in v1.9 - so do you think there is a way around this without losing backwards compatibility? Other than that, the tests should run!
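For context, the access pattern in question looks roughly like this (the extension module name is an assumption for illustration; it only works on Julia ≥ 1.9 and returns nothing unless ChainRulesCore is also loaded):

using KrylovKit

const KrylovKitCRCExt = Base.get_extension(KrylovKit, :KrylovKitChainRulesCoreExt)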

@lkdvos (Member) commented Jul 9, 2024

> @lkdvos I'm not sure how to make the PR compatible with Julia 1.8, since I have to use Base.get_extension for the sparse SVD adjoint in order to access functions from the KrylovKit AD package extension. AFAIK package extensions were only added in v1.9 - so do you think there is a way around this without losing backwards compatibility? Other than that, the tests should run!

I think in 1.8, these extensions are just loaded as regular modules, so you can just import them accordingly. For reference, I did a similar thing here

@pbrehmer (Collaborator, Author) commented Jul 9, 2024

> I think in 1.8, these extensions are just loaded as regular modules, so you can just import them accordingly

I gave it a try, but I'm not sure why I can't access the extensions from the main module. I don't quite understand why the TensorOperations example works. I might need some help here.

@lkdvos (Member) commented Jul 9, 2024

I'm super confused myself now. I have no problem getting it to work from the REPL on 1.8, and apparently it's also fine from a package extension, but somehow that construction does not seem to work in the main module of a package.
I am doing some further digging, but it seems like this really is a limitation of how package extensions and "Requires" work; in particular, from what I gather this code is only included after the main module is loaded, so there really is no way to access it.

Technically, that construction is also just very illegal: extensions should not define new functions, so accessing them from a different package goes against all guidelines. I have an idea on how to avoid the entire problem, but the simplest solution is to just restrict to Julia 1.9 😁 I am definitely okay with this restriction; it seems like we will have to do this for TensorKit in the near future as well, as some multithreaded implementation there requires it.

@pbrehmer (Collaborator, Author) commented Jul 9, 2024

Oh well, what a mess. Thanks for digging deep :-) I would also be very okay with restricting to v1.9.

@lkdvos (Member) commented Jul 9, 2024

So, in principle, we could support 1.8 by computing the rrule block-by-block in the forward pass using the normal ChainRules interface, and then applying those rules block-by-block in the pullback. I don't think this makes any difference performance-wise, but I am definitely a little lazy right now. @leburgel, do you have any strong feelings against restricting to 1.9?

@pbrehmer (Collaborator, Author) commented Jul 9, 2024

Alright, that would be possible, but it seems like quite a big hassle for supporting a non-LTS version... I'd still go with restricting to v1.9 :)

@leburgel (Collaborator)

> So, in principle, we could support 1.8 by computing the rrule block-by-block in the forward pass using the normal ChainRules interface, and then applying those rules block-by-block in the pullback. I don't think this makes any difference performance-wise, but I am definitely a little lazy right now. @leburgel, do you have any strong feelings against restricting to 1.9?

I have no objections whatsoever to restricting to 1.9, go for it.

@lkdvos previously approved these changes Jul 10, 2024
@lkdvos (Member) left a comment

If the tests are green, good to go for me

@pbrehmer (Collaborator, Author)

Just noticed a duplicate line, so I removed it. If the tests run, this is also good to go for me!

@lkdvos merged commit 5cd0304 into master on Jul 10, 2024
7 of 9 checks passed
@lkdvos deleted the svd_adjoint branch July 10, 2024 10:12