Update new layer diagram (#371)
Signed-off-by: Dan Sun <[email protected]>
yuzisun authored Jun 9, 2024
1 parent f41252e commit fb8dbde
Showing 5 changed files with 12 additions and 14 deletions.
4 changes: 2 additions & 2 deletions docs/modelserving/v1beta1/llm/huggingface/README.md
@@ -62,7 +62,7 @@ curl -H "content-type:application/json" -H "Host: ${SERVICE_HOSTNAME}" -v http:/
```
!!! success "Expected Output"

- ```{ .bash .no-copy }
+ ```{ .json .no-copy }
{"id":"cmpl-7c654258ab4d4f18b31f47b553439d96","choices":[{"finish_reason":"length","index":0,"logprobs":null,"text":"<generated_text>"}],"created":1715353182,"model":"llama3","system_fingerprint":null,"object":"text_completion","usage":{"completion_tokens":26,"prompt_tokens":4,"total_tokens":30}}
```

@@ -74,7 +74,7 @@ curl -H "content-type:application/json" -H "Host: ${SERVICE_HOSTNAME}" -v http:/
```
!!! success "Expected Output"

- ```{ .bash .no-copy }
+ ```{ .json .no-copy }
{"id":"cmpl-87ee252062934e2f8f918dce011e8484","choices":[{"finish_reason":"length","index":0,"message":{"content":"<generated_response>","tool_calls":null,"role":"assistant","function_call":null},"logprobs":null}],"created":1715353461,"model":"llama3","system_fingerprint":null,"object":"chat.completion","usage":{"completion_tokens":30,"prompt_tokens":3,"total_tokens":33}}
```
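
The two expected outputs in this hunk have different shapes: the `text_completion` object carries the generated text in `choices[0].text`, while the `chat.completion` object nests it under `choices[0].message.content`. A minimal sketch of reading both, using the sample payloads from the diff; the helper names `extract_text` and `usage_is_consistent` are hypothetical and not part of KServe:

```python
import json

# Sample payloads copied from the expected outputs shown in this diff.
COMPLETION_JSON = '{"id":"cmpl-7c654258ab4d4f18b31f47b553439d96","choices":[{"finish_reason":"length","index":0,"logprobs":null,"text":"<generated_text>"}],"created":1715353182,"model":"llama3","system_fingerprint":null,"object":"text_completion","usage":{"completion_tokens":26,"prompt_tokens":4,"total_tokens":30}}'
CHAT_JSON = '{"id":"cmpl-87ee252062934e2f8f918dce011e8484","choices":[{"finish_reason":"length","index":0,"message":{"content":"<generated_response>","tool_calls":null,"role":"assistant","function_call":null},"logprobs":null}],"created":1715353461,"model":"llama3","system_fingerprint":null,"object":"chat.completion","usage":{"completion_tokens":30,"prompt_tokens":3,"total_tokens":33}}'


def extract_text(response: dict) -> str:
    """Return the generated text from the first choice of either response shape."""
    choice = response["choices"][0]
    if response["object"] == "chat.completion":
        return choice["message"]["content"]
    return choice["text"]


def usage_is_consistent(response: dict) -> bool:
    """Check that prompt and completion token counts add up to the reported total."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


completion = json.loads(COMPLETION_JSON)
chat = json.loads(CHAT_JSON)

print(extract_text(completion), usage_is_consistent(completion))
print(extract_text(chat), usage_is_consistent(chat))
```

Both samples report `"finish_reason":"length"`, so a real client would probably also want to check that field before treating the text as complete.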

8 changes: 4 additions & 4 deletions docs/modelserving/v1beta1/llm/huggingface/python_runtime.yaml
@@ -1,17 +1,17 @@
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
- name: huggingface-llama2
+ name: huggingface-llama3
spec:
predictor:
model:
modelFormat:
name: huggingface
args:
- - --model_name=llama2
- - --model_id=meta-llama/Llama-2-7b-chat-hf
+ - --model_name=llama3
+ - --model_id=meta-llama/meta-llama-3-8b-instruct
- --tensor_input_names=input_ids
- - --disable_vllm
+ - --backend=huggingface
resources:
limits:
cpu: "6"
6 changes: 3 additions & 3 deletions docs/modelserving/v1beta1/llm/huggingface/vllm_runtime.yaml
@@ -1,15 +1,15 @@
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
- name: huggingface-llama2
+ name: huggingface-llama3
spec:
predictor:
model:
modelFormat:
name: huggingface
args:
- - --model_name=llama2
- - --model_id=meta-llama/Llama-2-7b-chat-hf
+ - --model_name=llama3
+ - --model_id=meta-llama/meta-llama-3-8b-instruct
resources:
limits:
cpu: "6"
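
After this commit the two manifests differ only in their `args`: `python_runtime.yaml` adds `--tensor_input_names=input_ids` and `--backend=huggingface`, while `vllm_runtime.yaml` passes just the model name and ID. A rough sketch of that difference, assuming the flags visible in this diff are the only ones involved; `predictor_args` and its `use_hf_backend` parameter are hypothetical names for illustration, not a KServe API:

```python
from typing import List


def predictor_args(model_name: str, model_id: str, use_hf_backend: bool = False) -> List[str]:
    """Build the container args list as it appears in the two manifests of this diff."""
    args = [f"--model_name={model_name}", f"--model_id={model_id}"]
    if use_hf_backend:
        # Extra flags present only in python_runtime.yaml after this commit.
        args += ["--tensor_input_names=input_ids", "--backend=huggingface"]
    return args


MODEL_ID = "meta-llama/meta-llama-3-8b-instruct"

vllm_args = predictor_args("llama3", MODEL_ID)
hf_args = predictor_args("llama3", MODEL_ID, use_hf_backend=True)

for arg in hf_args:
    print(f"- {arg}")
```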
6 changes: 2 additions & 4 deletions mkdocs.yml
@@ -42,10 +42,8 @@ nav:
- Hugging Face: modelserving/v1beta1/triton/huggingface/README.md
- AMD: modelserving/v1beta1/amd/README.md
- LLM Runtime:
- - TorchServe LLM:
- - Bloom7b1: modelserving/v1beta1/llm/torchserve/accelerate/README.md
- - Hugging Face LLM:
- - Llama2: modelserving/v1beta1/llm/huggingface/README.md
+ - Hugging Face LLM: modelserving/v1beta1/llm/huggingface/README.md
+ - TorchServe LLM: modelserving/v1beta1/llm/torchserve/accelerate/README.md
- How to write a custom predictor: modelserving/v1beta1/custom/custom_model/README.md
- Multi Model Serving:
- Overview:
2 changes: 1 addition & 1 deletion overrides/home.html
@@ -22,7 +22,7 @@ <h2>Highly scalable and standards based
</p>
</div>
<div>
- <img src="./images/kserve_layer.png" />
+ <img src="./images/kserve_new.png" />
</div>
</div>
</section>