v0.0.0beta20
What's Changed
- Patch `post_file` client method by @song-william in #323
- Add pod disruption budget to all endpoints by @yunfeng-scale in #328
- Create celery worker with inference worker profile by @saiatmakuri in #327
- Bump http forwarder request CPU by @yunfeng-scale in #330
- [Docs] Clarify get-events API usage by @seanshi-scale in #320
- Enable additional Datadog tagging for jobs by @song-william in #324
- fix celery worker profile for s3 access by @saiatmakuri in #333
- Hardcode number of forwarder workers by @yunfeng-scale in #334
- Standardize logging initialization by @song-william in #337
- Fix up the mammoth max length issue by @sam-scale in #335
- Add docs for `Model.create`, update default values, and fix `per_worker` concurrency by @yunfeng-scale in #332
- Update docs to add codellama models by @ian-scale in #343
- Add PodDisruptionBudget to model engine by @yunfeng-scale in #342
- Allow auth to accept API keys by @saiatmakuri in #326
- Add job_name in build logs for easier debugging by @song-william in #340
- Make PDB optional by @yunfeng-scale in #344
- Revert "fix celery worker profile for s3 access" by @yixu34 in #345
- Revert "Revert "fix celery worker profile for s3 access"" by @saiatmakuri in #346
- Pass file ID to fine-tuning script by @squeakymouse in #347
- Llama should have None max length by @sam-scale in #348
- Remove codellama 13b and 34b by @ian-scale in #349
- Change DATADOG_TRACE_ENABLED to DD_TRACE_ENABLED by @edwardpark97 in #350
- Allow fine-tuning hyperparameter to be Dict by @squeakymouse in #353 (see the sketch after this list)
- Add real auth to integration tests by @ian-scale in #352
- Add new llm-jp models to llm-engine by @ian-scale in #354
- Generalize SQS region by @jaisanliang in #355
- Track LLM Metrics by @saiatmakuri in #356
- Remove extra trace facet "launch.resource_name" by @saiatmakuri in #359
- Add codellama instruct and refactor codellama models by @ian-scale in #360
- Various changes/bugfixes to chart/code to streamline deployment on different forms of infra by @seanshi-scale in #339
- Add PR template by @song-william in #341
- Unmount aws config from root by @song-william in #361
- Implement automated code coverage for CI by @tiffzhao5 in #362
- Download only known files by @squeakymouse in #364
- Documentation fix by @squeakymouse in #365
- Change more AWS config mount paths by @squeakymouse in #367
- Validating inference framework image tags by @tiffzhao5 in #357
- Add codellama 34b by @ian-scale in #369
- Better error when model is not ready for predictions by @tiffzhao5 in #368
- Improve metrics route team tags by @saiatmakuri in #371
- Enable custom istio metric tags with Telemetry API by @song-william in #373
- Use Variable name for Telemetry Helm Resources by @song-william in #374
- Forward HTTP status code for sync requests by @yunfeng-scale in #375
- Integrate TensorRT-LLM by @yunfeng-scale in #358
- Fine-tuning e2e integration test by @tiffzhao5 in #372
- Found a bug in the codellama vllm model_len logic. by @sam-scale in #380
- Fix sample.yaml by @yunfeng-scale in #381
- Count prompt tokens by @saiatmakuri in #366
- Fix integration test by @yunfeng-scale in #383
- Emit metrics on token counts by @saiatmakuri in #382
- Increase llama-2 max_input_tokens by @sam-scale in #384
- Revert "Found a bug in the codellama vllm model_len logic." by @yunfeng-scale in #386
- Some updates to integration tests by @yunfeng-scale in #385
- Celery autoscaler by @squeakymouse in #378
- Don't install Celery autoscaler for test deployments by @squeakymouse in #388
- LLM update API route by @squeakymouse in #387
- Add zephyr 7b by @ian-scale in #389
- Update TensorRT-LLM in enum by @ian-scale in #390
- PyPI version bump by @ian-scale in #391
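
For reference on #353: fine-tuning hyperparameters can now take dictionary values rather than only scalars. Below is a minimal sketch using the `llmengine` Python client; the hyperparameter keys and the nested `peft_config` shape are illustrative assumptions, not a documented schema.

```python
from llmengine import FineTune

# Hyperparameters may now be dict-valued (#353). The key names and nested
# structure below are assumptions for illustration, not a definitive schema.
response = FineTune.create(
    model="llama-2-7b",
    training_file="file-abc123",  # a file ID, passed through to the fine-tuning script per #347
    hyperparameters={
        "epochs": 1,
        "peft_config": {"r": 8, "lora_alpha": 16},  # nested dict value
    },
)
print(response.id)
```
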
New Contributors
- @edwardpark97 made their first contribution in #350
- @jaisanliang made their first contribution in #355
- @tiffzhao5 made their first contribution in #362
Full Changelog: v0.0.0beta19...v0.0.0beta20