-
Yes, indeed, it seems the learning rate used in the tutorial was much too large. Would you consider doing a PR to modify it?
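For context, the fix being discussed amounts to a one-line change to the optimizer setup in the tutorial; a minimal sketch (the 0.01 value comes from the report quoted below):
# Proposed change: reduce the SGD learning rate from 0.1 to 0.01
op = nk.optimizer.Sgd(learning_rate=0.01)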
On Mon, Aug 5, 2024, 16:24 wttai004 wrote:
I observed issues with the optimization while going through the NetKet tutorial
<https://netket.readthedocs.io/en/latest/tutorials/gs-heisenberg.html>
on the 1D Heisenberg model.
For convenience, I'm copying the code here:
import os
os.environ["JAX_PLATFORM_NAME"] = "cpu"
# Import netket library
import netket as nk
# Import Json, this will be needed to load log files
import json
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import time
# Define a 1d chain
L = 22
g = nk.graph.Hypercube(length=L, n_dim=1, pbc=True)
# Define the Hilbert space based on this graph
# We impose to have a fixed total magnetization of zero
hi = nk.hilbert.Spin(s=0.5, total_sz=0, N=g.n_nodes)
# calling the Heisenberg Hamiltonian
ha = nk.operator.Heisenberg(hilbert=hi, graph=g)
import flax.linen as nn
import jax.numpy as jnp
import jax
class Jastrow(nn.Module):
    @nn.compact
    def __call__(self, x):
        # sometimes we call this function with a 1D input, sometimes with a 2D.
        # We promote all inputs to 2D to make the following code simpler.
        x = jnp.atleast_2d(x)
        # We vmap along the 0-th axis of the input
        # This will automatically convert a function working on vectors to one working
        # on matrices.
        return jax.vmap(self.evaluate_single, in_axes=(0))(x)

    def evaluate_single(self, x):
        # We create the parameter v, which is a vector of length N_sites
        v_bias = self.param(
            "visible_bias", nn.initializers.normal(), (x.shape[-1],), complex
        )
        # The Jastrow matrix is a N_sites x N_sites complex-valued matrix
        J = self.param(
            "kernel", nn.initializers.normal(), (x.shape[-1], x.shape[-1]), complex
        )
        # In Python, @ denotes matrix multiplication
        return x @ J @ x + jnp.dot(x, v_bias)
ma = Jastrow()
# Build the sampler
sa = nk.sampler.MetropolisExchange(hilbert=hi, graph=g)
# Optimizer
op = nk.optimizer.Sgd(learning_rate=0.1)
# Stochastic Reconfiguration
sr = nk.optimizer.SR(diag_shift=0.1)
# The variational state
vs = nk.vqs.MCState(sa, ma, n_samples=1000)
# The ground-state optimization loop
gs = nk.VMC(
    hamiltonian=ha,
    optimizer=op,
    preconditioner=sr,
    variational_state=vs)
start = time.time()
gs.run(300, out='Jastrow')
end = time.time()
print('### Jastrow calculation')
print('Has',nk.jax.tree_size(vs.parameters),'parameters')
print('The Jastrow calculation took',end-start,'seconds')
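The json and matplotlib imports above are meant for inspecting the optimization history that gs.run writes out; a minimal sketch of how one might do that, assuming the run above produced a Jastrow.log file (the exact JSON layout can differ between NetKet versions):
# Load the optimization history written by gs.run(..., out='Jastrow')
data = json.load(open("Jastrow.log"))
iters = data["Energy"]["iters"]
energy = data["Energy"]["Mean"]
# For complex-parameter models the mean may be stored as {'real': ..., 'imag': ...}
if isinstance(energy, dict):
    energy = energy["real"]
plt.plot(iters, energy)
plt.xlabel("Iteration")
plt.ylabel("Energy")
plt.show()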
Running this example, the Jastrow ansatz's energy gets stuck in a local minimum of around -4 to -7 (it varies from run to run), far above the exact-diagonalization energy of -39.1.
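For comparison, the exact reference energy can be computed directly in NetKet; a minimal sketch, assuming the ha operator defined above (Lanczos diagonalization is still feasible for L = 22 in the total_sz = 0 sector):
# Exact ground-state energy via Lanczos diagonalization
E_exact = nk.exact.lanczos_ed(ha, compute_eigenvectors=False)
print("Exact ground-state energy:", E_exact[0])  # should be close to the -39.1 quoted above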
A very simple fix is to change the learning rate from 0.1 to 0.01. With this change, the Jastrow ansatz's energy also converges to -39.1.
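One way to verify the fix is to query the variational energy after re-running the optimization with the smaller learning rate; a minimal sketch, assuming the vs and ha objects defined above:
# After re-running gs.run with nk.optimizer.Sgd(learning_rate=0.01)
E_final = vs.expect(ha)
print("Final variational energy:", E_final.mean.real)  # should now be close to -39.1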
I am sharing this in case future learners encounter similar issues while running the tutorial.