-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add Lora config to arg list in Neo sharding script& its integ test change #2552
Conversation
enforce_eager: bool = str( | ||
self.properties.get("option.enforce_eager", | ||
False)).lower() == "true" | ||
"False")).lower() == "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're changing to the string "False", we don't need to wrap this in a str() conversion.
Let's use the following:
enforce_eager = self.properties.get("option.enforce_eager", "true").lower() == "true"
I believe the default value should be true for this to keep in line with our existing behavior (i should have caught that in the original PR)
max_rolling_batch_size = int( | ||
self.properties.get("option.max_rolling_batch_size", 256)) | ||
max_model_len = self.properties.get("option.max_model_len", None) | ||
if max_model_len is not None: | ||
max_model_len = int(max_model_len) | ||
|
||
# LoraConfigs | ||
enable_lora: bool = str( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only be passing lora configs to the engine if the user specifies them. I'd prefer we rewrite this section as
lora_kwargs = {}
if self.properties.get("option.enable_lora"):
lora_kwargs["enable_lora"] = self.properties.get("option.enable_lora").lower() == "true"
if self.properties.get("option.fully_sharded_loras"):
lora_kwargs["fully_shareded_loras"] = option.get("option.fully_sharded_loras").lower() == "true"
engine_args = VllmEngineArgs(
<prior configs>
**lora_kwargs
)
...
and so on. This way we're only providing the lora configs to the engine if they have been specified
Description