From aa4ca74487a7a578b8d19a3460a5553d415ff648 Mon Sep 17 00:00:00 2001
From: Tom Stesco
Date: Fri, 20 Dec 2024 23:44:23 -0500
Subject: [PATCH] update llama 3.1 70b v0 tt-metal and vllm commit refs in docs (#16246)

### What's changed
- update Llama 3.1 70b v0 release tt-metal commit to https://github.com/tenstorrent/tt-metal/tree/v0.54.0-rc2
- update Tenstorrent vLLM repo commit for Llama 3.1 70b v0 release to https://github.com/tenstorrent/vllm/tree/953161188c50f10da95a88ab305e23977ebd3750

### Checklist
- [ ] Post commit CI passes
- [ ] New/Existing tests provide coverage for changes
---
 README.md                                    | 2 +-
 models/demos/t3000/llama3_70b/README.md      | 2 +-
 models/demos/t3000/llama3_70b/setup_llama.sh | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index a4e687fb6d0..5ad22f0b7c5 100644
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@
 | [Llama 3.2 1B](./models/demos/llama3) | 32 | [n150](https://tenstorrent.com/hardware/wormhole) | 56 | 55.1 | 160 | 1763.2 | [v0.53.1-rc23](https://github.com/tenstorrent/tt-metal/tree/v0.53.1-rc23) | |
 | [Llama 3.2 3B](./models/demos/llama3) | 32 | [n150](https://tenstorrent.com/hardware/wormhole) | 96 | 35.0 | 60 | 1120.0 | [v0.53.1-rc23](https://github.com/tenstorrent/tt-metal/tree/v0.53.1-rc23) | |
 | [Falcon 7B (DP=8)](./models/demos/t3000/falcon7b) | 256 | [QuietBox](https://tenstorrent.com/hardware/tt-quietbox) | 97 | 14.6 | 26 | 3737.6 | [v0.53.0-rc44](https://github.com/tenstorrent/tt-metal/tree/v0.53.0-rc44) | |
-| [Llama 3.1 70B (TP=8)](./models/demos/t3000/llama3_70b) | 32 | [QuietBox](https://tenstorrent.com/hardware/tt-quietbox) | 190 | 15.1 | 20 | 483.2 | [v0.53.0-rc36](https://github.com/tenstorrent/tt-metal/tree/v0.53.0-rc36) | [384f179](https://github.com/tenstorrent/vllm/tree/384f1790c3be16e1d1b10de07252be2e66d00935) |
+| [Llama 3.1 70B (TP=8)](./models/demos/t3000/llama3_70b) | 32 | [QuietBox](https://tenstorrent.com/hardware/tt-quietbox) | 190 | 15.1 | 20 | 483.2 | [v0.54.0-rc2](https://github.com/tenstorrent/tt-metal/tree/v0.54.0-rc2) | [9531611](https://github.com/tenstorrent/vllm/tree/953161188c50f10da95a88ab305e23977ebd3750) |
 | [Falcon 40B (TP=8)](./models/demos/t3000/falcon40b) | 32 | [QuietBox](https://tenstorrent.com/hardware/tt-quietbox) | | 5.3 | 36 | 169.6 | [v0.53.1-rc23](https://github.com/tenstorrent/tt-metal/tree/v0.53.1-rc23) | |
 | [Mixtral 8x7B (TP=8)](./models/demos/t3000/mixtral8x7b) | 32 | [QuietBox](https://tenstorrent.com/hardware/tt-quietbox) | 230 | 14.6 | 33 | 467.2 | [v0.53.0-rc44](https://github.com/tenstorrent/tt-metal/tree/v0.53.0-rc44) | |
 | [Falcon 7B (DP=32)](./models/demos/tg/falcon7b) | 1024 | [Galaxy](https://tenstorrent.com/hardware/galaxy) | 242 | 4.4 | 26 | 4505.6 | [v0.53.0-rc33](https://github.com/tenstorrent/tt-metal/tree/v0.53.0-rc33) | |
diff --git a/models/demos/t3000/llama3_70b/README.md b/models/demos/t3000/llama3_70b/README.md
index 80f344040d4..0ca8239208e 100644
--- a/models/demos/t3000/llama3_70b/README.md
+++ b/models/demos/t3000/llama3_70b/README.md
@@ -18,7 +18,7 @@ Where, `TT_METAL_COMMIT_SHA_OR_TAG` and `TT_VLLM_COMMIT_SHA_OR_TAG` are found in
 Example:
 ```bash
-./models/demos/t3000/llama3_70b/setup_llama.sh llama-3.1-70b-instruct v0.53.0-rc36 384f1790c3be16e1d1b10de07252be2e66d00935
+./models/demos/t3000/llama3_70b/setup_llama.sh llama-3.1-70b-instruct v0.54.0-rc2 953161188c50f10da95a88ab305e23977ebd3750
 ```
 Follow prompts as they come up in CLI to select appropriate weights for Llama 3.1 70B Instruct.
diff --git a/models/demos/t3000/llama3_70b/setup_llama.sh b/models/demos/t3000/llama3_70b/setup_llama.sh
index 636ce070b2b..0c6a7b52375 100644
--- a/models/demos/t3000/llama3_70b/setup_llama.sh
+++ b/models/demos/t3000/llama3_70b/setup_llama.sh
@@ -36,7 +36,7 @@ Examples:
  $0 llama-3.1-70b-instruct main dev
  # Deploy with specific commit SHAs
- $0 llama-3.1-70b-instruct v0.53.0-rc36 384f1790c3be16e1d1b10de07252be2e66d00935
+ $0 llama-3.1-70b-instruct v0.54.0-rc2 953161188c50f10da95a88ab305e23977ebd3750
 EOF
 exit 0