
[feat: Benchmarking Workflow] add stuff for a benchmarking workflow #5839

Merged
118 commits from feat/ci-benchmarking merged into main on Dec 12, 2023
Commits
118 commits
4c185f0
add poc for benchmarking workflow.
sayakpaul Nov 17, 2023
945ab17
import
sayakpaul Nov 17, 2023
b4debda
fix argument
sayakpaul Nov 17, 2023
22966a1
fix: argument
sayakpaul Nov 17, 2023
12424a3
fix: path
sayakpaul Nov 17, 2023
122d5d9
fix
sayakpaul Nov 17, 2023
c20d254
fix
sayakpaul Nov 17, 2023
43544ed
path
sayakpaul Nov 17, 2023
ada65b3
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Nov 27, 2023
f59e315
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Nov 28, 2023
3c05e41
output csv files.
sayakpaul Nov 28, 2023
24b68fd
workflow cleanup
sayakpaul Nov 28, 2023
8e2088e
append token
sayakpaul Nov 28, 2023
01584c7
add utility to push to hf dataset
sayakpaul Nov 28, 2023
853035b
fix: kw arg
sayakpaul Nov 28, 2023
46aaf96
better reporting
sayakpaul Nov 28, 2023
d626eef
fix: headers
sayakpaul Nov 28, 2023
ab12fe6
better formatting of the numbers.
sayakpaul Nov 28, 2023
1bb531e
better type annotation
sayakpaul Nov 28, 2023
2df4aba
fix: formatting
sayakpaul Nov 28, 2023
939fe5c
moentarily disable check
sayakpaul Nov 28, 2023
3a18e29
push results.
sayakpaul Nov 28, 2023
71279b6
remove disable check
sayakpaul Nov 28, 2023
ea1e57e
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Nov 29, 2023
3c8cc38
introduce base classes.
sayakpaul Nov 29, 2023
9683cd7
img2img class
sayakpaul Nov 29, 2023
274b9e1
add inpainting pipeline
sayakpaul Nov 29, 2023
2b5b8ae
intoduce base benchmark class.
sayakpaul Nov 29, 2023
0b54a6a
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Nov 30, 2023
66b159a
add img2img and inpainting
sayakpaul Nov 30, 2023
01addbd
feat: utility to compare changes
sayakpaul Dec 1, 2023
c30cab6
fix
sayakpaul Dec 1, 2023
689b9f7
fix import
sayakpaul Dec 1, 2023
d046a25
add args
sayakpaul Dec 1, 2023
71f6bd9
basepath
sayakpaul Dec 1, 2023
295cf30
better exception handling
sayakpaul Dec 1, 2023
b5e2371
better path handling
sayakpaul Dec 1, 2023
e7aed9e
fix
sayakpaul Dec 1, 2023
8eb8baf
fix
sayakpaul Dec 1, 2023
aac35c3
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 1, 2023
3cb02f8
remove
sayakpaul Dec 1, 2023
60c980c
ifx
sayakpaul Dec 1, 2023
1e7db92
Merge branch 'main' into feat/ci-benchmarking
patrickvonplaten Dec 1, 2023
cd91b62
fix
sayakpaul Dec 1, 2023
38b8708
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 2, 2023
aeefb55
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 4, 2023
1782d5a
add: support for controlnet.
sayakpaul Dec 4, 2023
df5dead
image_url -> url
sayakpaul Dec 4, 2023
c6c545c
move images to huggingface hub
sayakpaul Dec 4, 2023
b358c87
correct urls.
sayakpaul Dec 4, 2023
93b491b
root_ckpt
sayakpaul Dec 4, 2023
131bfce
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 4, 2023
13a86dc
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 4, 2023
748f6dc
flush before benchmarking
sayakpaul Dec 4, 2023
5d5d5fd
don't install accelerate from source
sayakpaul Dec 4, 2023
4651082
add runner
sayakpaul Dec 4, 2023
8e80579
simplify Diffusers Benchmarking step
sayakpaul Dec 4, 2023
d49ad65
change runner
sayakpaul Dec 4, 2023
7c7846b
fix: subprocess call.
sayakpaul Dec 4, 2023
5dbcbf5
filter percentage values
sayakpaul Dec 4, 2023
cb8572a
fix controlnet benchmark
sayakpaul Dec 4, 2023
6dec96c
add t2i adapters.
sayakpaul Dec 4, 2023
86d597f
fix filter columns
sayakpaul Dec 4, 2023
fa7bfe1
fix t2i adapter benchmark
sayakpaul Dec 4, 2023
59df524
fix init.
sayakpaul Dec 4, 2023
3cd0f59
fix
sayakpaul Dec 4, 2023
8583db8
remove safetensors flag
sayakpaul Dec 4, 2023
6b9bf4a
fix args print
sayakpaul Dec 4, 2023
38160f1
fix
sayakpaul Dec 4, 2023
e6116b0
feat: run_command
sayakpaul Dec 4, 2023
d98fbe1
add adapter resolution mapping
sayakpaul Dec 4, 2023
c93278d
benchmark t2i adapter fix.
sayakpaul Dec 4, 2023
924096f
fix adapter input
sayakpaul Dec 4, 2023
628591d
fix
sayakpaul Dec 4, 2023
0f4ae4e
convert to L.
sayakpaul Dec 4, 2023
de739fa
add flush() add appropriate places
sayakpaul Dec 4, 2023
cb9f9c6
better filtering
sayakpaul Dec 4, 2023
d7aee28
okay
sayakpaul Dec 4, 2023
385ffbb
get env for torch
sayakpaul Dec 4, 2023
611ae13
convert to float
sayakpaul Dec 4, 2023
b3a91d8
fix
sayakpaul Dec 4, 2023
e55913e
filter out nans.
sayakpaul Dec 5, 2023
dc3063a
better coment
sayakpaul Dec 5, 2023
63aee79
sdxl
sayakpaul Dec 5, 2023
9a9d5ea
sdxl for other benchmarks.
sayakpaul Dec 5, 2023
3d66747
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 5, 2023
c8f6eef
fix: condition
sayakpaul Dec 5, 2023
4a67437
fix: condition for inpainting
sayakpaul Dec 5, 2023
e94b895
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 5, 2023
eedf218
fix: mapping for resolution
sayakpaul Dec 5, 2023
e300038
fix
sayakpaul Dec 5, 2023
60614f5
include kandinsky and wuerstchen
sayakpaul Dec 5, 2023
b394168
fix: Wuerstchen
sayakpaul Dec 5, 2023
70f3556
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 5, 2023
b7eb3fb
Empty-Commit
sayakpaul Dec 5, 2023
63a61bd
Merge branch 'main' into feat/ci-benchmarking
DN6 Dec 7, 2023
821726d
[Community] AnimateDiff + Controlnet Pipeline (#5928)
a-r-r-o-w Dec 7, 2023
3dc2362
EulerDiscreteScheduler add `rescale_betas_zero_snr` (#6024)
Beinsezii Dec 7, 2023
26a8c00
Revert "[Community] AnimateDiff + Controlnet Pipeline (#5928)"
sayakpaul Dec 7, 2023
8db59d7
Revert "EulerDiscreteScheduler add `rescale_betas_zero_snr` (#6024)"
sayakpaul Dec 7, 2023
f76ba5b
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 7, 2023
4e7fb4d
add SDXL turbo
sayakpaul Dec 7, 2023
e2df761
add lcm lora to the mix as well.
sayakpaul Dec 7, 2023
2588853
fix
sayakpaul Dec 7, 2023
81d56de
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 7, 2023
a7fd2c3
increase steps to 2 when running turbo i2i
sayakpaul Dec 7, 2023
191ebf6
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 8, 2023
b878a29
debug
sayakpaul Dec 8, 2023
1389d0e
debug
sayakpaul Dec 8, 2023
b2d35be
debug
sayakpaul Dec 8, 2023
d78609d
fix for good
sayakpaul Dec 8, 2023
b3897f8
fix and isolate better
sayakpaul Dec 8, 2023
8289baa
fuse lora so that torch compile works with peft
sayakpaul Dec 8, 2023
dd54366
fix: LCMLoRA
sayakpaul Dec 8, 2023
d6966b4
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 8, 2023
51acace
better identification for LCM
sayakpaul Dec 8, 2023
65b97e8
Merge branch 'main' into feat/ci-benchmarking
sayakpaul Dec 9, 2023
80e8311
change to cron job
sayakpaul Dec 9, 2023
61 changes: 61 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,61 @@
name: Benchmarking tests

on:
  pull_request:
    branches:
      - main
  push:
    branches:
      - ci-*
sayakpaul (Member Author):
Should ideally be done weekly or bi-weekly and only on main. For testing, I am running it on PRs.

Collaborator:
I think bi-weekly (twice a month) works here. Or even monthly is a good cadence.

sayakpaul (Member Author), Nov 28, 2023:
Yup. Will keep it to bi-weekly. Will incorporate that change once we're done with how we want to report the benchmarks and the pipelines we want to benchmark.

Collaborator:
Should change this to a cron job.

sayakpaul (Member Author):
Yes, this will be changed.

Collaborator:
Change to cron job.
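The thread above settles on moving the trigger to a schedule. A minimal sketch of what the cron trigger could look like — the bi-weekly cadence and the exact times below are assumptions based on the discussion, not the PR's final configuration:

```yaml
on:
  schedule:
    # Hypothetical cadence: 05:00 UTC on the 1st and 15th of each month
    # (roughly bi-weekly, as discussed in the thread above).
    - cron: "0 5 1,15 * *"
  workflow_dispatch: {}
```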


env:
  DIFFUSERS_IS_CI: yes
  HF_HOME: /mnt/cache
  OMP_NUM_THREADS: 8
  MKL_NUM_THREADS: 8
  PYTEST_TIMEOUT: 600
  RUN_SLOW: yes
  PIPELINE_USAGE_CUTOFF: 50000

jobs:
  torch_pipelines_cuda_benchmark_tests:
    name: Torch Core Pipelines CUDA Benchmarking Tests
    strategy:
      fail-fast: false
      max-parallel: 1
    runs-on: docker-gpu
    container:
      image: diffusers/diffusers-pytorch-cuda
      options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
    steps:
      - name: Checkout diffusers
        uses: actions/checkout@v3
        with:
          fetch-depth: 2
      - name: NVIDIA-SMI
        run: |
          nvidia-smi
      - name: Install dependencies
        run: |
          apt-get update && apt-get install libsndfile1-dev libgl1 -y
          python -m pip install -e .[quality,test]
          python -m pip install git+https://github.com/huggingface/accelerate.git
          mkdir benchmark_outputs
      - name: Environment
        run: |
          python utils/print_env.py
      - name: Stable Diffusion Benchmarking Tests
        env:
          HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
        run: |
          cd benchmarks && python benchmark_sd.py && \
          python benchmark_sd.py --batch_size 4 && \
          python benchmark_sd.py --run_compile && \
          python benchmark_sd.py --batch_size 4 --run_compile

      - name: Test suite reports artifacts
        if: ${{ always() }}
        uses: actions/upload-artifact@v2
        with:
          name: benchmark_test_reports
          path: benchmark_outputs
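The Stable Diffusion step above runs the script over a small matrix: batch size 1 and 4, each with and without torch.compile. A hedged sketch of generating that command matrix programmatically — the `benchmark_commands` helper is hypothetical, not part of this PR:

```python
from itertools import product


def benchmark_commands(script="benchmark_sd.py"):
    """Build the four benchmark invocations used in the workflow step."""
    cmds = []
    for batch_size, run_compile in product([1, 4], [False, True]):
        cmd = f"python {script}"
        if batch_size != 1:
            cmd += f" --batch_size {batch_size}"
        if run_compile:
            cmd += " --run_compile"
        cmds.append(cmd)
    return cmds


for cmd in benchmark_commands():
    print(cmd)
```

This mirrors how later commits in the PR ("feat: run_command", "fix: subprocess call.") move toward driving the scripts from a single runner.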
60 changes: 60 additions & 0 deletions benchmarks/benchmark_sd.py
@@ -0,0 +1,60 @@
import argparse
import os

import torch
from diffusers import DiffusionPipeline

# Absolute import: the script is run directly (`python benchmark_sd.py`),
# so a relative import would fail here.
from benchmark_utils import BenchmarkInfo, benchmark_fn, bytes_to_giga_bytes, generate_markdown_table

CKPT = "CompVis/stable-diffusion-v1-4"
PROMPT = "ghibli style, a fantasy landscape with castles"
BASE_PATH = "benchmark_outputs"


def load_pipeline(run_compile=False):
    pipe = DiffusionPipeline.from_pretrained(CKPT, torch_dtype=torch.float16, use_safetensors=True)
    pipe = pipe.to("cuda")

    if run_compile:
        pipe.unet.to(memory_format=torch.channels_last)
        print("Run torch compile")
        pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

    pipe.set_progress_bar_config(disable=True)
    return pipe


def run_inference(pipe, args):
    _ = pipe(
        prompt=PROMPT,
        num_inference_steps=args.num_inference_steps,
        num_images_per_prompt=args.batch_size,
    )


def main(args):
    pipeline = load_pipeline(run_compile=args.run_compile)

    time = benchmark_fn(run_inference, pipeline, args)  # in seconds.
    memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated())  # in GBs.
    benchmark_info = BenchmarkInfo(time=time, memory=memory)

    return generate_markdown_table(pipeline_name=CKPT, args=args, benchmark_info=benchmark_info)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=1)
    parser.add_argument("--num_inference_steps", type=int, default=50)
    parser.add_argument("--run_compile", action="store_true")
    args = parser.parse_args()
    markdown_report = main(args)

    # CKPT contains a "/", which would otherwise create a nested path.
    name = (
        CKPT.replace("/", "_")
        + f"-batch_size@{args.batch_size}-num_inference_steps@{args.num_inference_steps}-run_compile@{args.run_compile}"
    )
    filepath = os.path.join(BASE_PATH, name)
    with open(filepath, "w") as f:
        f.write(markdown_report)


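Note that CKPT contains a slash, so joining it into the report filename as-is would point into a non-existent subdirectory. A small self-contained sketch of a sanitized naming scheme — the `replace`-based sanitization and the `report_filepath` helper are my assumptions for illustration, not part of the original diff:

```python
import os

CKPT = "CompVis/stable-diffusion-v1-4"
BASE_PATH = "benchmark_outputs"


def report_filepath(batch_size: int, num_inference_steps: int, run_compile: bool) -> str:
    # Replace "/" so the checkpoint id stays a single path component.
    name = (
        CKPT.replace("/", "_")
        + f"-batch_size@{batch_size}-num_inference_steps@{num_inference_steps}-run_compile@{run_compile}"
    )
    return os.path.join(BASE_PATH, name)


print(report_filepath(4, 50, True))
```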
46 changes: 46 additions & 0 deletions benchmarks/benchmark_utils.py
@@ -0,0 +1,46 @@
import argparse
import gc
from dataclasses import dataclass

import torch
import torch.utils.benchmark as benchmark


@dataclass
class BenchmarkInfo:
    time: float
    memory: float


def flush():
    gc.collect()
    torch.cuda.empty_cache()


def bytes_to_giga_bytes(bytes):
    return bytes / 1024 / 1024 / 1024


# Adapted from
# https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html
def benchmark_fn(f, *args, **kwargs):
    t0 = benchmark.Timer(
        stmt="f(*args, **kwargs)", globals={"args": args, "kwargs": kwargs, "f": f}
    )
    # Return a float rounded to 3 decimals so it matches BenchmarkInfo.time.
    return float(f"{t0.blocked_autorange().mean:.3f}")


def generate_markdown_table(pipeline_name: str, args: argparse.Namespace, benchmark_info: BenchmarkInfo) -> str:
    headers = ["**Parameter**", "**Value**"]
    data = [
        ["Batch Size", args.batch_size],
        ["Number of Inference Steps", args.num_inference_steps],
        ["Run Compile", args.run_compile],
        ["Time (seconds)", benchmark_info.time],
        ["Memory (GBs)", benchmark_info.memory],
    ]

    # Format the table.
    markdown_table = f"## {pipeline_name}\n\n"
    markdown_table += "| " + " | ".join(headers) + " |\n"
    markdown_table += "|" + "|".join("---" for _ in headers) + "|\n"
    for row in data:
        markdown_table += "| " + " | ".join(str(item) for item in row) + " |\n"

    return markdown_table
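For a quick sanity check of the report format, here is a self-contained rerun of the table generation with made-up numbers — the function body mirrors generate_markdown_table in benchmark_utils.py, and the sample time/memory values are fabricated for illustration:

```python
import argparse
from dataclasses import dataclass


@dataclass
class BenchmarkInfo:
    time: float
    memory: float


def generate_markdown_table(pipeline_name, args, benchmark_info):
    headers = ["**Parameter**", "**Value**"]
    data = [
        ["Batch Size", args.batch_size],
        ["Number of Inference Steps", args.num_inference_steps],
        ["Run Compile", args.run_compile],
        ["Time (seconds)", benchmark_info.time],
        ["Memory (GBs)", benchmark_info.memory],
    ]
    table = f"## {pipeline_name}\n\n"
    table += "| " + " | ".join(headers) + " |\n"
    table += "|" + "|".join("---" for _ in headers) + "|\n"
    for row in data:
        table += "| " + " | ".join(str(item) for item in row) + " |\n"
    return table


args = argparse.Namespace(batch_size=4, num_inference_steps=50, run_compile=True)
info = BenchmarkInfo(time=2.345, memory=6.789)  # fabricated sample numbers
report = generate_markdown_table("CompVis/stable-diffusion-v1-4", args, info)
print(report)
```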