Best practices of the run_batch method #28

Open
maolinml opened this issue Jul 8, 2024 · 0 comments
Labels
documentation Improvements or additions to documentation

maolinml commented Jul 8, 2024

What did you find confusing? Please describe.

When I use the run_batch method to run a series of GHZ circuits in parallel, the runtime is longer than when I use the run method to run them sequentially. I would like to understand the best practices for the run_batch method. The behavior appears with both simulator-v1 and simulator-v2.

Describe how documentation can be improved

From the benchmark below, it is unclear whether the run_batch method provides a speedup for certain types of circuits.

Additional context

The benchmark below covers:

  1. Two types of circuits: a) a 16-qubit GHZ circuit and b) a 16x16 square circuit with only single-qubit gates. For each type, a list of 30 identical circuits is run, and the measurement is repeated 10 times.
  2. Both simulator_v1 and simulator_v2
  3. Both the run and run_batch methods

Takeaways from the benchmark:

  1. For a fixed circuit type and a fixed method (either run or run_batch), simulator-v2 has a shorter runtime than simulator-v1, which is good.
  2. For the square circuit, on both simulator-v1 and v2, run_batch has a shorter runtime than run, which is good.
  3. For the GHZ circuit, on both simulator-v1 and v2, run_batch has a longer runtime than run, which is unexpected.
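One way to read the second and third takeaways together is a toy cost model in which a batch pays a fixed overhead and then divides the sequential work by some effective parallelism: batch = overhead + sequential / parallelism. This model is an assumption for illustration only; nothing in this issue confirms that either simulator behaves this way, and `fit_batch_cost_model` is a hypothetical helper. The sketch fits the two parameters to the simulator-v1 timings reported further down in this issue.

```python
def fit_batch_cost_model(seq_cheap, batch_cheap, seq_costly, batch_costly):
    """Fit batch = overhead + sequential / parallelism to two measurements.

    Hypothetical helper: the model is an assumption, not a description of
    how either simulator implements run_batch.
    """
    parallelism = (seq_costly - seq_cheap) / (batch_costly - batch_cheap)
    overhead = batch_cheap - seq_cheap / parallelism
    return overhead, parallelism


# simulator-v1 timings (seconds) copied from the results in this issue.
overhead, parallelism = fit_batch_cost_model(
    seq_cheap=0.8401916821797689,     # ghz, run
    batch_cheap=1.3532552984025743,   # ghz, run_batch
    seq_costly=6.863542768690321,     # sqc, run
    batch_costly=1.5802200370364718,  # sqc, run_batch
)

# Under this model, run_batch only wins once the sequential time exceeds
# overhead / (1 - 1 / parallelism).
break_even = overhead / (1 - 1 / parallelism)

print(f"fitted overhead:       {overhead:.3f} s")
print(f"fitted parallelism:    {parallelism:.1f}")
print(f"break-even sequential: {break_even:.3f} s")
```

If the model holds, the GHZ workload sits below the break-even point and the square-circuit workload sits above it, which would match the observed ordering. Two data points determine two parameters exactly, so this is a consistency check rather than evidence for the model.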

Below is the script used for the benchmark:

from braket.circuits import Circuit
import numpy as np


def ghz_circuit(n_qubits):
    circuit = Circuit()
    circuit.h(0)
    for ii in range(0, n_qubits-1):
        circuit.cnot(control=ii, target=ii+1)
    return circuit

def square_circuit(n_qubits):
    circuit = Circuit()

    # n_qubits layers, each applying a random U gate to every qubit
    for _ in range(n_qubits):
        for qubit in range(n_qubits):
            circuit.u(qubit, np.random.rand(), np.random.rand(), np.random.rand())
    return circuit

from braket.devices import LocalSimulator
simulator_v1 = LocalSimulator()
simulator_v2 = LocalSimulator("braket_sv_v2")

import time

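def run_circ_with_simulator(simulator, circs, num_repeats, mode, shots=100):
    """Return the mean wall-clock time (seconds) to run all `circs` once.

    `mode` is "run" (one `simulator.run` call per circuit); any other value
    issues a single `simulator.run_batch` call for the whole list.
    """
    ts = []
    for _ in range(num_repeats):
        t1 = time.perf_counter()
        if mode == "run":
            tasks = [simulator.run(circ, shots=shots) for circ in circs]
        else:
            tasks = simulator.run_batch(circs, shots=shots)
        ts.append(time.perf_counter() - t1)

    # Drop the first repeat to exclude possible precompilation,
    # then average the remaining repeats.
    ts = ts[1:]
    return sum(ts) / len(ts)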

num_qubits = 16  # 20 qubits freezes the notebook instance
num_circuits = 30

import os
assert num_circuits < os.cpu_count()

ghz = ghz_circuit(num_qubits)
sqc = square_circuit(num_qubits)

ghzs = [ghz for _ in range(num_circuits)]
sqcs = [sqc for _ in range(num_circuits)]

num_repeats = 10

print("Test for `run` and `run_batch` for simulator v1 and v2")
print(f"We use ghz circuit, with 2-qubit gates, and square circuits (sqc) with only 1-qubit gates.")

print(f"Either ghz or sqc has {num_circuits} circuits, and each circuit has {num_qubits} qubits")
print(f"The runtime is average over {num_repeats-1} runs.")
print()

# Run ghz with simulator_v1 with "run"
t_ghz_v1_run = run_circ_with_simulator(simulator_v1, ghzs, num_repeats, "run")
print(f"Run ghz with simulator_v1 with `run`.       {t_ghz_v1_run}", flush=True)

# Run ghz with simulator_v1 with "run_batch"
t_ghz_v1_run_batch = run_circ_with_simulator(simulator_v1, ghzs, num_repeats, "run_batch")
print(f"Run ghz with simulator_v1 with `run_batch`. {t_ghz_v1_run_batch}", flush=True)

# Run sqc with simulator_v1 with "run"
t_sqc_v1_run = run_circ_with_simulator(simulator_v1, sqcs, num_repeats, "run")
print(f"Run sqc with simulator_v1 with `run`.       {t_sqc_v1_run}", flush=True)

# Run sqc with simulator_v1 with "run_batch"
t_sqc_v1_run_batch = run_circ_with_simulator(simulator_v1, sqcs, num_repeats, "run_batch")
print(f"Run sqc with simulator_v1 with `run_batch`. {t_sqc_v1_run_batch}", flush=True)

print()
# Run ghz with simulator_v2 with "run"
t_ghz_v2_run = run_circ_with_simulator(simulator_v2, ghzs, num_repeats, "run")
print(f"Run ghz with simulator_v2 with `run`.       {t_ghz_v2_run}", flush=True)

# Run ghz with simulator_v2 with "run_batch"
t_ghz_v2_run_batch = run_circ_with_simulator(simulator_v2, ghzs, num_repeats, "run_batch")
print(f"Run ghz with simulator_v2 with `run_batch`. {t_ghz_v2_run_batch}", flush=True)

# Run sqc with simulator_v2 with "run"
t_sqc_v2_run = run_circ_with_simulator(simulator_v2, sqcs, num_repeats, "run")
print(f"Run sqc with simulator_v2 with `run`.       {t_sqc_v2_run}", flush=True)

# Run sqc with simulator_v2 with "run_batch"
t_sqc_v2_run_batch = run_circ_with_simulator(simulator_v2, sqcs, num_repeats, "run_batch")
print(f"Run sqc with simulator_v2 with `run_batch`. {t_sqc_v2_run_batch}", flush=True)
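The warm-up-and-average timing used in run_circ_with_simulator could also be factored into a small standalone helper that reports the spread of the timings, which helps when judging whether a difference between run and run_batch is larger than the run-to-run noise. The sketch below is a minimal version under that assumption; `time_repeated` and its `warmup` parameter are hypothetical names, and a stand-in workload replaces the simulator so the block runs on its own.

```python
import statistics
import time


def time_repeated(fn, num_repeats, warmup=1):
    """Call fn() num_repeats times and return (mean, stdev) of the timings
    in seconds, after discarding the first `warmup` calls."""
    if num_repeats - warmup < 2:
        raise ValueError("need at least two timed repeats after warm-up")
    timings = []
    for _ in range(num_repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    kept = timings[warmup:]
    return statistics.mean(kept), statistics.stdev(kept)


# Stand-in workload so the sketch runs without a simulator installed.
mean_s, stdev_s = time_repeated(lambda: sum(range(100_000)), num_repeats=10)
print(f"{mean_s:.6f} s +/- {stdev_s:.6f} s")
```

With the names from the script above, a call might look like `time_repeated(lambda: simulator_v1.run_batch(ghzs, shots=100), num_repeats)`.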

This is the result

Test for `run` and `run_batch` for simulator v1 and v2
We use ghz circuit, with 2-qubit gates, and square circuits (sqc) with only 1-qubit gates.
Either ghz or sqc has 30 circuits, and each circuit has 16 qubits
The runtime is average over 9 runs.

Run ghz with simulator_v1 with `run`.       0.8401916821797689
Run ghz with simulator_v1 with `run_batch`. 1.3532552984025743
Run sqc with simulator_v1 with `run`.       6.863542768690321
Run sqc with simulator_v1 with `run_batch`. 1.5802200370364718

Run ghz with simulator_v2 with `run`.       0.3054380946689182
Run ghz with simulator_v2 with `run_batch`. 0.4409244855244954
Run sqc with simulator_v2 with `run`.       1.4414687421586778
Run sqc with simulator_v2 with `run_batch`. 0.4409244855244954

The results above were obtained on a notebook instance of type "ml.m5.12xlarge".
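To make the comparison easier to read, the ratio run / run_batch can be computed for each configuration directly from the reported means; a ratio above 1 means run_batch was faster. This is a small sketch that only re-uses the numbers printed above. Note that the reported simulator_v2 sqc run_batch figure is identical, digit for digit, to the simulator_v2 ghz run_batch figure, so the last ratio should be treated with caution.

```python
# Mean runtimes in seconds, copied from the results above:
# (simulator, circuit) -> (run, run_batch)
timings = {
    ("v1", "ghz"): (0.8401916821797689, 1.3532552984025743),
    ("v1", "sqc"): (6.863542768690321, 1.5802200370364718),
    ("v2", "ghz"): (0.3054380946689182, 0.4409244855244954),
    ("v2", "sqc"): (1.4414687421586778, 0.4409244855244954),
}

# Ratio above 1: run_batch was faster than sequential run calls.
speedups = {key: run / batch for key, (run, batch) in timings.items()}

for (sim, circ), ratio in speedups.items():
    verdict = "run_batch faster" if ratio > 1 else "run faster"
    print(f"simulator_{sim} {circ}: run / run_batch = {ratio:.2f} ({verdict})")
```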

@maolinml maolinml added the documentation Improvements or additions to documentation label Jul 8, 2024