Add script for doing bulk generation against an endpoint #765

Merged
aspfohl merged 14 commits into main from anna/endpoint-generate
Nov 29, 2023

@aspfohl aspfohl commented Nov 28, 2023

As titled. Example usage:
[Screenshot of example usage, 2023-11-28]

Current limitations:

  • Could hit OOM if too much data is loaded
  • Supports remote data too (no longer assumes data is pre-loaded locally)
  • Only supports text completion for now (TODO: use SDK?)
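
One way the OOM limitation above could be eased is to stream prompts in bounded batches instead of loading everything up front. This is a minimal sketch, not what the script in this PR does: `iter_prompt_batches` and the batch size are hypothetical, and it assumes prompts arrive as an iterable of lines (for example, an open text file with one prompt per line).

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_prompt_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Lazily yield lists of at most `batch_size` non-empty prompts.

    Hypothetical helper: only one batch is held in memory at a time.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    # Skip blank lines and strip the trailing newline from each prompt.
    prompts = (line.rstrip("\n") for line in lines if line.strip())
    while True:
        batch = list(islice(prompts, batch_size))
        if not batch:
            return
        yield batch
```

Passing an open file handle as `lines` keeps memory use proportional to `batch_size` rather than to the size of the file.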

Here's an example mcli yaml that works:

name: mpt7b-generate
image: mosaicml/pytorch:latest
# compute:
#   gpus: 1
#   cluster: TODO
command: |-
  for i in {1..20}; do curl http://0.0.0.0:8080/v2/ping && echo 'endpoint is up' && break || echo 'sleeping' && sleep 10; done
  python llm-foundry/scripts/inference/endpoint_generate.py --prompts "The lazy dog jumped over" "The best banana bread recipe is"

integrations:
- integration_type: git_repo
  git_repo: mosaicml/llm-foundry
  pip_install: .
- integration_type: pip_packages
  packages:
    - aiohttp
    - ratelimit

dependent_deployment:
  model:
    download_parameters:
      hf_path: mosaicml/mpt-7b-instruct
  env_variables:
    - key: MODEL_BACKEND_OVERRIDE
      value: "vllm"

env_variables:
  - key: ENDPOINT_URL
    value: "http://0.0.0.0:8080/v2/completions"
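
For readers who want a feel for what bulk generation against `ENDPOINT_URL` might look like, here is a rough sketch of sending many prompts with bounded concurrency. It is not the code from `endpoint_generate.py`: the payload fields (`prompt`, `max_tokens`), the `send` callable, and the names `build_payload` and `generate_all` are all assumptions made for illustration.

```python
import asyncio
import os
from typing import Any, Awaitable, Callable, Dict, List

# Same variable the YAML sets; the default is copied from the YAML above.
ENDPOINT_URL = os.environ.get("ENDPOINT_URL", "http://0.0.0.0:8080/v2/completions")

# Hypothetical transport signature: takes (url, json_payload), returns the response.
Sender = Callable[[str, Dict[str, Any]], Awaitable[Any]]


def build_payload(prompt: str, max_tokens: int = 100) -> Dict[str, Any]:
    # Assumed request schema for a text-completion endpoint; check the real API.
    return {"prompt": prompt, "max_tokens": max_tokens}


async def generate_all(prompts: List[str], send: Sender, max_concurrency: int = 4) -> List[Any]:
    """Send one request per prompt, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt: str) -> Any:
        async with semaphore:
            return await send(ENDPOINT_URL, build_payload(prompt))

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))
```

In a real run, `send` would wrap an HTTP client call such as an `aiohttp` POST; it is injected here so the sketch can be exercised without a live endpoint.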

@aspfohl aspfohl marked this pull request as ready for review November 28, 2023 21:06
@aspfohl aspfohl requested a review from alextrott16 November 28, 2023 21:06
@linden-li linden-li left a comment

LGTM, thanks for adding this. This should hopefully be less painful once we add batched inference support.

Four review comments on scripts/inference/endpoint_generate.py (outdated, resolved)
@aspfohl aspfohl requested a review from dakinggg November 29, 2023 00:32
@aspfohl aspfohl merged commit 3a96b69 into main Nov 29, 2023
10 checks passed
@aspfohl aspfohl deleted the anna/endpoint-generate branch November 29, 2023 18:29