Current Situation
When loading a model in LMS, users must manually specify the context length using the --context-length flag. If this flag is omitted, the context length defaults to the model's configuration value. However, there is no direct way to specify that the model should be loaded with its maximum supported context length.
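For illustration, a minimal sketch of the current behaviour when driving the CLI from a script, assuming the lms load subcommand; the model identifier and context length are placeholders, not taken from this request:

```python
import subprocess

# Today: the context length has to be supplied explicitly
# ("my-model" and 8192 are placeholders for illustration only)...
subprocess.run(["lms", "load", "my-model", "--context-length", "8192"], check=True)

# ...otherwise the value from the model's own configuration is used.
subprocess.run(["lms", "load", "my-model"], check=True)
```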
This functionality is already available in the LMStudio GUI, where users can select the maximum supported token limit when loading a model. However, there is no equivalent option for command-line usage.
Proposed Solution
Introduce a --context-length max option, similar to the existing --gpu flag. This option would automatically set the context length to the maximum value supported by the model, based on its internal configuration.
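A rough sketch of what the proposed invocation could look like, again assuming the lms load subcommand and a placeholder model identifier:

```python
import subprocess

# Proposed: "max" would resolve to the model's maximum supported context
# length, by analogy with the existing --gpu flag mentioned above.
# ("my-model" is a placeholder.)
subprocess.run(["lms", "load", "my-model", "--context-length", "max"], check=True)
```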
Application and Benefits
This feature would be particularly useful in setups where models are loaded and unloaded iteratively for automated evaluations (e.g., driven from Python via the subprocess module). Currently, users must manually determine and pass the maximum context length for each model, which is both redundant (LMS already has this information) and cumbersome (it requires digging through the model config for every new model).
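A hedged sketch of such a pipeline, assuming lms load and lms unload accept a model identifier; the model names and context lengths below are purely illustrative:

```python
import subprocess

# Illustrative models with their maximum context lengths, looked up by hand
# from each model's configuration (values are made up for this sketch).
MODELS = {
    "model-a": 8192,
    "model-b": 32768,
}

for name, max_ctx in MODELS.items():
    # Today: the maximum has to be determined per model and passed explicitly.
    subprocess.run(["lms", "load", name, "--context-length", str(max_ctx)], check=True)

    # ... run the automated evaluation against the loaded model here ...

    subprocess.run(["lms", "unload", name], check=True)

# With the proposed option, the hand-maintained lookup table would go away:
#     subprocess.run(["lms", "load", name, "--context-length", "max"], check=True)
```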
By implementing a max option, users could streamline workflows, reducing configuration overhead and enhancing usability. It would save time and simplify testing and evaluation pipelines involving multiple models.