diff --git a/website/docs/installation/modal/index.md b/website/docs/installation/modal/index.md
index a120065d436b..1fff6396b9db 100644
--- a/website/docs/installation/modal/index.md
+++ b/website/docs/installation/modal/index.md
@@ -130,36 +130,6 @@ def app():
 
 Once we deploy this model with `modal serve app.py`, it will output the url of the web endpoint, in a form of `https://--tabby-server-starcoder-1b-app-dev.modal.run`.
 
-To test if the server is working, you can send a post request to the web endpoint.
-
-```shell
-curl --location 'https://--tabby-server-starcoder-1b-app-dev.modal.run/v1/completions' \
---header 'Content-Type: application/json' \
---data '{
-  "language": "python",
-  "segments": {
-    "prefix": "def fib(n):\n ",
-    "suffix": "\n return fib(n - 1) + fib(n - 2)"
-  }
-}'
-```
-
-If you can get json response like in the following case, the app server is up and have fun!
-
-```json
-{
-  "id": "cmpl-4196b0c7-f417-4c48-9329-4a56aa86baea",
-  "choices": [
-    {
-      "index": 0,
-      "text": "if n == 0:\n return 0\n elif n == 1:\n return 1\n else:"
-    }
-  ]
-}
-```
-
-
-
 ![App Running](./app-running.png)
 
 Now it can be used as tabby server url in tabby editor extensions!
diff --git a/website/docs/installation/skypilot/index.md b/website/docs/installation/skypilot/index.md
index c3e9a55c8cf1..d596b84c6c2d 100644
--- a/website/docs/installation/skypilot/index.md
+++ b/website/docs/installation/skypilot/index.md
@@ -21,11 +21,11 @@ resources:
 
 Skypilot supports GPU from various cloud vendors. Please refer to the official [Skypilot documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html) for detailed installation instructions.
 
-As Tabby exposes its health check at `/v1/health`, we can define the following service configuration:
+Tabby exposes its health check at the `/metrics` endpoint, which also serves as a Prometheus metrics endpoint. Therefore, we can define the following readiness probe:
 
 ```yaml
 service:
-  readiness_probe: /v1/health
+  readiness_probe: /metrics
   replicas: 1
 ```
 
@@ -52,7 +52,7 @@ This finishes launching SkyServe's control VM which runs a load balancer for thi
 When you execute the following command, you'll encounter a message indicating that the replica is not ready:
 
 ```bash
-$ curl -L 'http://44.203.34.65:30001/v1/health'
+$ curl -L 'http://44.203.34.65:30001/metrics'
 {"detail":"No available replicas. Use \"sky serve status [SERVICE_NAME]\" to check the replica status."}%
 ```
 
@@ -68,22 +68,4 @@ Once the service is ready, you will see something like the following:
 
 ![tabby ready](./tabby-ready.png)
 
-SkyServe uses a redirect load balancer at its front, so the `-L` command is necessary if you would like to test the completion api with `curl`.
-
-```bash
-$ curl -L -X 'POST' \
-  'http://44.203.34.65:30001/v1/completions' \
-  -H 'accept: application/json' \
-  -H 'Content-Type: application/json' \
-  -d '{
-  "language": "python",
-  "segments": {
-    "prefix": "def fib(n):\n ",
-    "suffix": "\n return fib(n - 1) + fib(n - 2)"
-  }
-}'
-
-{"id":"cmpl-ba9aae81-ed9c-419b-9616-fceb92cdbe79","choices":[{"index":0,"text":" if n <= 1:\n return n"}]}
-```
-
 Now, you can utilize the load balancer URL (`http://44.203.34.65:30001` in this case) within Tabby editor extensions. Please refer to [`tabby.yaml`](https://github.com/TabbyML/tabby/blob/main/website/docs/installation/skypilot/tabby.yaml) for the full configuration used in this tutorial.
diff --git a/website/docs/installation/skypilot/tabby.yaml b/website/docs/installation/skypilot/tabby.yaml
index 3540754b773b..630bb5ec6fc1 100644
--- a/website/docs/installation/skypilot/tabby.yaml
+++ b/website/docs/installation/skypilot/tabby.yaml
@@ -6,7 +6,7 @@ resources:
   # accelerators: {T4:1, L4:1, A100:1, A10G:1}
 
 service:
-  readiness_probe: /v1/health
+  readiness_probe: /metrics
   replicas: 1
 
 run: |