diff --git a/website/docs/installation/modal/index.md b/website/docs/installation/modal/index.md
index a120065d436b..1fff6396b9db 100644
--- a/website/docs/installation/modal/index.md
+++ b/website/docs/installation/modal/index.md
@@ -130,36 +130,6 @@ def app():
 
 Once we deploy this model with `modal serve app.py`, it will output the url of the web endpoint, in a form of `https://<USERNAME>--tabby-server-starcoder-1b-app-dev.modal.run`.
 
-To test if the server is working, you can send a post request to the web endpoint.
-
-```shell
-curl --location 'https://<USERNAME>--tabby-server-starcoder-1b-app-dev.modal.run/v1/completions' \
---header 'Content-Type: application/json' \
---data '{
-  "language": "python",
-  "segments": {
-    "prefix": "def fib(n):\n    ",
-    "suffix": "\n        return fib(n - 1) + fib(n - 2)"
-  }
-}'
-```
-
-If you can get json response like in the following case, the app server is up and have fun!
-
-```json
-{
-    "id": "cmpl-4196b0c7-f417-4c48-9329-4a56aa86baea",
-    "choices": [
-        {
-            "index": 0,
-            "text": "if n == 0:\n        return 0\n    elif n == 1:\n        return 1\n    else:"
-        }
-    ]
-}
-```
-
-
-
 ![App Running](./app-running.png)
 
 Now it can be used as tabby server url in tabby editor extensions!
diff --git a/website/docs/installation/skypilot/index.md b/website/docs/installation/skypilot/index.md
index c3e9a55c8cf1..d596b84c6c2d 100644
--- a/website/docs/installation/skypilot/index.md
+++ b/website/docs/installation/skypilot/index.md
@@ -21,11 +21,11 @@ resources:
 
 Skypilot supports GPU from various cloud vendors. Please refer to the official [Skypilot documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html) for detailed installation instructions.
 
-As Tabby exposes its health check at `/v1/health`, we can define the following service configuration:
+Tabby exposes its health check at the `/metrics` endpoint, which also serves as a Prometheus metrics endpoint. Therefore, we can define the following readiness probe:
 
 ```yaml
 service:
-  readiness_probe: /v1/health
+  readiness_probe: /metrics
   replicas: 1
 ```
 
@@ -52,7 +52,7 @@ This finishes launching SkyServe's control VM which runs a load balancer for thi
 When you execute the following command, you'll encounter a message indicating that the replica is not ready:
 
 ```bash
-$ curl -L 'http://44.203.34.65:30001/v1/health'
+$ curl -L 'http://44.203.34.65:30001/metrics'
 
 {"detail":"No available replicas. Use \"sky serve status [SERVICE_NAME]\" to check the replica status."}%
 ```
@@ -68,22 +68,4 @@ Once the service is ready, you will see something like the following:
 
 ![tabby ready](./tabby-ready.png)
 
-SkyServe uses a redirect load balancer at its front, so the `-L` command is necessary if you would like to test the completion api with `curl`.
-
-```bash
-$ curl -L -X 'POST' \
-  'http://44.203.34.65:30001/v1/completions' \
-  -H 'accept: application/json' \
-  -H 'Content-Type: application/json' \
-  -d '{
-  "language": "python",
-  "segments": {
-    "prefix": "def fib(n):\n    ",
-    "suffix": "\n        return fib(n - 1) + fib(n - 2)"
-  }
-}'
-
-{"id":"cmpl-ba9aae81-ed9c-419b-9616-fceb92cdbe79","choices":[{"index":0,"text":"    if n <= 1:\n            return n"}]}
-```
-
 Now, you can utilize the load balancer URL (`http://44.203.34.65:30001` in this case) within Tabby editor extensions. Please refer to [`tabby.yaml`](https://github.com/TabbyML/tabby/blob/main/website/docs/installation/skypilot/tabby.yaml) for the full configuration used in this tutorial.
diff --git a/website/docs/installation/skypilot/tabby.yaml b/website/docs/installation/skypilot/tabby.yaml
index 3540754b773b..630bb5ec6fc1 100644
--- a/website/docs/installation/skypilot/tabby.yaml
+++ b/website/docs/installation/skypilot/tabby.yaml
@@ -6,7 +6,7 @@ resources:
   # accelerators: {T4:1, L4:1, A100:1, A10G:1}
 
 service:
-  readiness_probe: /v1/health
+  readiness_probe: /metrics
   replicas: 1
 
 run: |