Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements for AD-VUMPS support #82

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

XingyuZhang2018
Copy link

@XingyuZhang2018 XingyuZhang2018 commented Oct 30, 2024

Changes Made:

  • Added AD-VUMPS Support: Implemented AD-VUMPS support independently, eliminating the dependency on MPSKit.jl.
  • Removed Redundant Files: Deleted multiple files that previously relied on MPSKit.jl, streamlining the codebase.
  • Reorganized Defaults: Moved default settings to a utility module for better organization and maintenance.

Notable Feature:
iPEPS Optimization: The codebase continues to support iPEPS optimization. Refer to the example script test/heisenberg_vumps.jl for usage.

Copy link
Member

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some very impressive work, I'll try and go over this in more detail asap.

Some general remarks:

While I do feel like it is probably best to have an independent VUMPS implementation for making things work with AD, I am not so keen on completely removing MPSKit support. There are many functions that are still shared, and I would like to keep things somewhat consistent between the two packages. Additionally, I do think that if you want to just contract, without optimizing, there is quite a bit of functionality in MPSKit that would now no longer be there. Is there any way you could see these things co-existing instead?
Additionally, I think you now removed the PEPO support, while this was actually working before :)

From a coherence point of view, it would be nice if you could more closely resemble the style of naming things we had before. In the same regards, I think it is a lot more convenient if there is only a single InfiniteMPS type, or at least if there is a way of converting from and to that. Maybe BoundaryMPSEnvs seems a bit more in line with CTMRGEnvs?

Am I correct in seeing that the current AD implementation you have just calls the CTMRG one, and thus probably does not work?

As mentioned, I'll try and go over the code in a lot more detail asap, but since it is quite a large amount of code this might take me a while.

@XingyuZhang2018
Copy link
Author

I agree that we can keep some functions from MPSKit. The reason I removed them is that I found the complexity overwhelming (feel free to reintroduce PEPO 😄), and I was frustrated with the repetition of function names like leading_boundary.

Regarding naming conventions, I’m flexible and open to modifications if you think it’s necessary.

I believe the current implementation calls VUMPS in a backward manner instead of using CTMRG.

Additionally, I want to clarify that for the gradient test, I didn’t conduct it in the proper way, but I ensured that every function is differentiable. Ultimately, I tested the ground energy using AD-VUMPS with the Heisenberg model.

@Yue-Zhengyuan
Copy link
Contributor

@XingyuZhang2018 A somewhat off-topic question: will this implementation be the successor of your TeneT.jl (used in PRB.108.085103)?

Copy link
Member

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a lot of comments, keep in mind that this is not critique, some of these really are just questions about things I honestly do not know.

I would really prefer to not fully drop the interplay with MPSKit for now though, it seems like this is removing and duplicating quite a bit of the code that can be shared between the two. This being said, I would be open in the future for a different hierarchy between the two, where we might have a separate core package that defines the types, and MPSKit and PEPSKit would not depend on each other, but still share the relevant functionality, and work together through extensions.

@@ -7,12 +7,14 @@ version = "0.2.1"
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
Compat = "34da2185-b29b-5c13-b0c7-acf172513d20"
InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken, this is not being used right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize I added it. Maybe it can be removed if other packages don't rely on it.

KrylovKit = "0b1a1467-8014-51b9-945f-bf0ae24f4b77"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LoggingExtras = "e6f89c97-d47a-5376-807f-9c37f3926c36"
MPSKit = "bb1c41ca-d63c-52ed-829e-0820dda26502"
MPSKitModels = "ca635005-6f8c-4cd1-b51d-8491250ef2ab"
OptimKit = "77e91f04-9b3b-57a6-a776-40b61faaebe0"
Parameters = "d96e819e-fc66-5662-9728-84c9c7592b0a"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so familiar with this package, as far as I know this was mostly used for the @kwdef, which is now a part of Base julia.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have employed the @unpack macro to manage constructors, a practice I find convenient and concise.

└── ALdᵢᵣⱼ ── └──
```
"""
function Lmap(Li::Vector{<:AbstractTensorMap},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not the same as ρmap defined above?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I created ρmap, but later I realized that the index of Ad might vary in certain scenarios. As a result, I will eliminate ρmap and substitute it with Lmap.

@@ -92,15 +92,15 @@ const SequentialCTMRG = CTMRG{:sequential}
const SimultaneousCTMRG = CTMRG{:simultaneous}

"""
MPSKit.leading_boundary([envinit], state, alg::CTMRG)
leading_boundary([envinit], state, alg::CTMRG)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could obtain the same behavior by adding import MPSKit: leading_boundary, if you prefer to not have to type that :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that. Here I just think we should use an independent function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you want them to be independent? They both mean the same thing, and otherwise you would create name clashes if someone imports both MPSKit and PEPSKit. Given how closely related these things are, I think it is valid to give them the same function, but I am open to suggestions

@@ -110,22 +110,22 @@ function MPSKit.leading_boundary(envinit, state, alg::CTMRG)
log = ignore_derivatives(() -> MPSKit.IterLog("CTMRG"))

return LoggingExtras.withlevel(; alg.verbosity) do
ctmrg_loginit!(log, η, N)
contr_loginit!(log, η, N)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is really a necessary change, I am happy to discuss renaming things, but this should probably happen in a separate PR (and open discussion)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another open discussion is fine.

return AL, Le, λ
end

function getLsped(Le::Matrix{<:AbstractTensorMap}, A::Matrix{<:AbstractTensorMap}, AL::Matrix{<:AbstractTensorMap}; verbosity = Defaults.verbosity, kwargs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does sped stand for? Maybe it is clearer to just write out the function names

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLsped implies obtaining L quickly. Maybe it should be renamed.

Comment on lines 296 to 297
λLs, FL1s, info = eigsolve(FLi -> FLmap(FLi, ALu[i,:], ALd[ir,:], ipeps.A[i,:], adjoint.(ipeps.A[i,:])),
FL[i,:], 1, :LM; maxiter=100, ishermitian = false, kwargs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is different from what we have in the MPSKit implementation. There, we transfer through an entire unitcell, and ask that this converges, instead of transfering over a single site and asking that all tensors converge. I would be quite interested in investigating if there is any difference, both in speed and convergence

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously examined the speed and convergence. From my observations, they appear to be identical. However, I prefer this implementation because it processes the entire environment simultaneously.

λLs, FL1s, info = eigsolve(FLi -> FLmap(FLi, ALu[i,:], ALd[ir,:], ipeps.A[i,:], adjoint.(ipeps.A[i,:])),
FL[i,:], 1, :LM; maxiter=100, ishermitian = false, kwargs...)
verbosity >= 1 && info.converged == 0 && @warn "leftenv not converged"
λL[i], FL′[i,:] = selectpos(λLs, FL1s, Nj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you selectpos here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixes some gauges for large unit cells. But I have to admit it doesn't always work.

Comment on lines +138 to +139
Base.:+(ipeps::InfinitePEPS, x::NamedTuple) = InfinitePEPS(ipeps.A .+ x.A)
Base.:+(x::NamedTuple, ipeps::InfinitePEPS) = InfinitePEPS(ipeps.A .+ x.A)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that is required because of Zygote? I think if you are a bit careful with also defining rrules for constructors and getproperty, you can avoid the case where it generates weird tangent objects.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. This implementation is unsatisfactory for me. If you have any ideas on how to improve this, please provide an implementation. Thank you.

Comment on lines +72 to +77
function ChainRulesCore.rrule(::typeof(_fit_spaces), y::AbstractTensorMap, x::AbstractTensorMap)
function pullback(Δ)
return NoTangent(), _fit_spaces(Δ, y), NoTangent()
end
return _fit_spaces(y, x), pullback
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely convinced that this is correct, could you test something like this using a finitedifference approach?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tested it.

@XingyuZhang2018
Copy link
Author

@XingyuZhang2018 A somewhat off-topic question: will this implementation be the successor of your TeneT.jl (used in PRB.108.085103)?

If the GPU support surpasses that of TeneT.jl, I would anticipate it to be an improvement. Otherwise, it might focus on symmetry implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants