clean up model_runner.py
jiazhan-msft committed Aug 25, 2024
1 parent cc5cbe8 commit 4be6309
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions vllm/worker/model_runner.py
@@ -1218,8 +1218,7 @@ def capture_model(self, kv_caches: List[List[torch.Tensor]]) -> None:
 # Prepare dummy inputs. These will be reused for all batch sizes.
 max_batch_size = max(_BATCH_SIZES_TO_CAPTURE)
 input_tokens = torch.zeros(max_batch_size, dtype=torch.long).cuda()
-input_positions = torch.zeros(max_batch_size, dtype=torch.long).cuda()
-
+input_positions = torch.zeros(max_batch_size, dtype=torch.long).cuda()
 # Prepare dummy previous_hidden_states only if needed by the model.
 # This is used by draft models such as EAGLE.
 previous_hidden_states = None