
[lmi] log warnings for unused generation parameters across all rolling batch backends instead of throwing errors #1686

Merged Mar 28, 2024

Conversation

siddvenk (Contributor)

Description

This PR standardizes behavior across all rolling batch backends: unused generation parameters now produce a logged warning instead of an error. Previously, some backends would throw an error on an unknown parameter while others would silently succeed.

The warning logs both the unused parameters and the set of supported parameters. The supported set does include some internal parameters that users should not set directly, so it is not perfect. I'm open to removing that portion for now and logging only the unused parameters in the warning.
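The pattern described above can be sketched roughly as follows. This is a minimal illustration of the warn-and-filter approach, not the actual DJL Serving implementation; the function name and signature are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def filter_unused_params(requested: dict, supported: set, backend: str) -> dict:
    """Drop generation parameters the backend does not support.

    Instead of raising an error on unknown parameters, log a warning
    naming the unused parameters and the supported set, then pass only
    the supported parameters through (hypothetical sketch).
    """
    unused = set(requested) - supported
    if unused:
        logger.warning(
            "The following parameters are not supported by %s: %s. "
            "The supported parameters are %s",
            backend, unused, supported,
        )
    return {k: v for k, v in requested.items() if k in supported}
```

For example, `filter_unused_params({"temperature": 0.7, "badkwarg": 4}, {"temperature", "top_p"}, "vllm")` would log a warning about `badkwarg` and return only `{"temperature": 0.7}`.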

Testing Samples

Using the following request:

curl -X POST http://localhost:8080/invocations -H 'Content-Type: application/json' -d '{"inputs": "Hello there", "parameters": {"badkwarg": 4}, "stream": false}'

vllm (previously failed with an error)

response: {"generated_text": ", I’m a 20 year old guy from the UK. I’ve been playing guitar for about 10 years now and"}
logs: The following parameters are not supported by vllm: {'badkwarg'}. The supported parameters are {'include_stop_str_in_output', 'length_penalty', 'min_p', 'presence_penalty', 'temperature', 'prompt_logprobs', 'best_of', 'skip_special_tokens', 'early_stopping', 'stop_token_ids', 'repetition_penalty', 'use_beam_search', 'logprobs', 'logits_processors', 'top_k', 'stop', 'top_p', 'ignore_eos', 'n', 'frequency_penalty', 'seed', 'max_tokens', 'spaces_between_special_tokens'}

lmi-dist (previously failed with an error)

response: {"generated_text": ", I’m a 20 year old guy from the UK. I’ve been playing guitar for about 10 years now and"}
logs: The following parameters are not supported by lmi-dist: {'badkwarg'}. The supported parameters are {'typical_p', 'max_tokens', 'truncate', 'ignore_eos', 'length_penalty', 'top_p', 'repetition_penalty', 'best_of', 'include_stop_str_in_output', 'stop_token_ids', 'top_k', 'prompt_logprobs', 'use_beam_search', 'spaces_between_special_tokens', 'logprobs', 'stop', 'skip_special_tokens', 'seed', 'frequency_penalty', 'temperature', 'min_p', 'logits_processors', 'sampling_params', 'presence_penalty', 'n', 'early_stopping'}

hf-accelerate with scheduler

response: {"generated_text": ", I’m a 20 year old guy from the UK. I’ve been playing guitar for about 10 years now and"}
logs: The following parameters are not supported by huggingface accelerate rolling batch: {'seed', 'badkwarg'}. The supported parameters are {'topp', 'eos_token_id', 'use_lru_kv_cache', 'sampling', 'topk', 'alpha', '_max_seqlen', 'beam', 'max_new_seqlen', 'temperature', 'pad_token_id'}

deepspeed

response: {"generated_text": ", I’m a newbie here. I’m a 20 year old guy from the UK. I’ve been a fan"}
logs: The following parameters are not supported by deepspeed: {'badkwarg'}. The supported parameters are {'watermark', 'repetition_penalty', 'temperature', 'max_new_tokens', 'top_p', 'do_sample', 'seed', 'typical_p', 'stop_sequences', 'top_k', 'ignore_eos_token'}

tnx

response: {"generated_text": ", I’m a newbie here. I’m a 20 year old guy from the UK. I’ve been a fan"}
logs: The following parameters are not supported by neuron: {'badkwarg'}. The supported parameters are {'epsilon_cutoff', 'bos_token_id', 'begin_suppress_tokens', 'num_assistant_tokens_schedule', 'forced_eos_token_id', 'do_sample', '_commit_hash', 'max_new_tokens', 'output_attentions', 'num_assistant_tokens', 'encoder_no_repeat_ngram_size', 'use_cache', 'length_penalty', 'low_memory', 'min_length', 'encoder_repetition_penalty', 'output_scores', 'exponential_decay_length_penalty', 'return_dict_in_generate', '_from_model_config', 'num_beam_groups', 'decoder_start_token_id', 'pad_token_id', 'top_k', 'repetition_penalty', 'top_p', 'num_beams', 'force_words_ids', 'forced_decoder_ids', 'num_return_sequences', 'diversity_penalty', 'sequence_bias', 'min_new_tokens', 'renormalize_logits', 'guidance_scale', 'temperature', 'remove_invalid_values', 'eos_token_id', 'early_stopping', 'no_repeat_ngram_size', 'generation_kwargs', 'constraints', 'bad_words_ids', 'output_hidden_states', 'eta_cutoff', 'transformers_version', 'penalty_alpha', 'max_time', 'max_length', 'forced_bos_token_id', 'suppress_tokens', 'typical_p'}

optimum

response: {"generated_text": ", I’m a newbie here. I’m a 20 year old guy from the UK. I’ve been a fan"}
logs: The following parameters are not supported by neuron: {'badkwarg'}. The supported parameters are {'forced_eos_token_id', 'penalty_alpha', 'num_beams', 'forced_decoder_ids', 'sequence_bias', 'output_hidden_states', 'typical_p', 'pad_token_id', 'temperature', '_commit_hash', 'bad_words_ids', 'top_p', 'suppress_tokens', 'generation_kwargs', 'constraints', 'exponential_decay_length_penalty', 'begin_suppress_tokens', 'length_penalty', 'num_return_sequences', 'min_length', 'encoder_no_repeat_ngram_size', 'force_words_ids', 'max_time', 'num_beam_groups', 'low_memory', 'no_repeat_ngram_size', 'top_k', 'guidance_scale', 'max_new_tokens', 'diversity_penalty', 'encoder_repetition_penalty', 'do_sample', 'max_length', 'decoder_start_token_id', 'output_attentions', 'min_new_tokens', 'early_stopping', 'output_scores', 'num_assistant_tokens_schedule', 'repetition_penalty', 'forced_bos_token_id', 'bos_token_id', 'eos_token_id', 'use_cache', 'renormalize_logits', 'remove_invalid_values', 'return_dict_in_generate', 'num_assistant_tokens', '_from_model_config', 'eta_cutoff', 'transformers_version', 'epsilon_cutoff'}

@siddvenk siddvenk requested review from zachgk, frankfliu and a team as code owners March 28, 2024 00:03
@siddvenk siddvenk merged commit c0d6e23 into deepjavalibrary:master Mar 28, 2024
8 checks passed
@siddvenk siddvenk deleted the generation-params branch March 28, 2024 15:38