Actions: tenstorrent/vllm

All workflows

Showing runs from all workflows
25 workflow runs

Minor update to note about llama70b implementations in tt-metal readme
PR Reminder Comment Bot #20: Pull request #52 opened by skhorasganiTT
January 10, 2025 19:37 13s
Update tt-metal and vllm commits in readme and note extra llama requirements
PR Reminder Comment Bot #19: Pull request #51 opened by skhorasganiTT
January 9, 2025 23:06 11s
Fix num kv heads per device calculation, update readme to include TG and old Llama70b instructions
PR Reminder Comment Bot #18: Pull request #50 opened by skhorasganiTT
January 3, 2025 20:28 9s
Update vLLM and tt-metal commits in README.md
PR Reminder Comment Bot #17: Pull request #49 opened by skhorasganiTT
December 24, 2024 23:27 10s
Add support for TT Llama3 text models (1B,3B,8B,70B-new)
PR Reminder Comment Bot #16: Pull request #48 opened by skhorasganiTT
December 24, 2024 01:34 9s
Update tt-metal commit in README.md
PR Reminder Comment Bot #15: Pull request #47 opened by skhorasganiTT
December 20, 2024 18:04 10s
Update tt-metal commit in README.md
PR Reminder Comment Bot #14: Pull request #46 opened by skhorasganiTT
December 18, 2024 22:18 12s
[TT] Add support for multi-modal llama, different TT devices, paged cross attention
PR Reminder Comment Bot #13: Pull request #45 opened by skhorasganiTT
December 16, 2024 22:41 12s
Update vLLM and tt-metal commits in tt-metal README.md, add VLLM_RPC_TIMEOUT to server command
PR Reminder Comment Bot #12: Pull request #43 opened by skhorasganiTT
December 11, 2024 20:05 11s
Update dispatch_core_config() API in tt_worker.py post tt-metal commit 9b09061
PR Reminder Comment Bot #10: Pull request #38 opened by milank94
December 2, 2024 14:50 15s
Update vLLM commit in tt-metal readme
PR Reminder Comment Bot #9: Pull request #34 opened by skhorasganiTT
November 6, 2024 14:45 12s
Update TTModelRunner due to decode rope changes for llama70b
PR Reminder Comment Bot #8: Pull request #33 opened by skhorasganiTT
November 6, 2024 14:42 11s
[Bugfix] #31 _make_sampler_output return expected SequenceOutput output_token: int
PR Reminder Comment Bot #7: Pull request #32 opened by tstescoTT
October 31, 2024 03:55 14s
Import tt-metal model via pythonpath instead of symlink
PR Reminder Comment Bot #6: Pull request #30 opened by skhorasganiTT
October 29, 2024 17:21 11s
Update vLLM commit in tt_metal README.md
PR Reminder Comment Bot #5: Pull request #28 opened by skhorasganiTT
October 28, 2024 20:28 13s
[Hardware][Tenstorrent] Modify offline_inference_tt.py to include max_tokens arg
PR Reminder Comment Bot #2: Pull request #25 opened by milank94
October 21, 2024 11:17 10s
Update vLLM commit in README.md
PR Reminder Comment Bot #1: Pull request #24 opened by skhorasganiTT
October 17, 2024 22:44 10s
[Bugfix] Print warnings related to mistral_common tokenizer only on…
Lint GitHub Actions workflows #1: Commit d615b5c pushed by skhorasganiTT
October 17, 2024 22:36 16s main
October 17, 2024 22:36 26s
October 17, 2024 22:36 39s
October 17, 2024 22:36 2m 27s
[Bugfix] Print warnings related to mistral_common tokenizer only on…
clang-format #1: Commit d615b5c pushed by skhorasganiTT
October 17, 2024 22:36 17s main