diff --git a/notebooks/model_deployment.ipynb b/notebooks/model_deployment.ipynb new file mode 100644 index 0000000..0c4c76d --- /dev/null +++ b/notebooks/model_deployment.ipynb @@ -0,0 +1,265 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Deploying a model on Arcee Cloud\n", + "\n", + "In this notebook, you will learn how to deploy a model on Arcee Cloud. This could be a pretrained model available off-the-shelf, or a model you have tailored to your needs with a combination of merging, continuous pretraining and alignment.\n", + "\n", + "You can run this demo for free thanks to the Arcee free tier. Your endpoint will be shut down automatically after 2 hours.\n", + "\n", + "The Arcee documentation is available at [docs.arcee.ai](https://docs.arcee.ai/deployment/start-deployment)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "\n", + "Please [sign up](https://app.arcee.ai/account/signup) for Arcee Cloud and create an [API key](https://docs.arcee.ai/getting-arcee-api-key/getting-arcee-api-key).\n", + "\n", + "Then, please update the cell below with your API key. Remember to keep this key safe, and **DON'T COMMIT IT to one of your repositories**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%env ARCEE_API_KEY=YOUR_API_KEY" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Create a new Python environment (optional but recommended) and install the [arcee-python](https://github.com/arcee-ai/arcee-python) package."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Uncomment the next three lines to create a virtual environment\n", + "#!pip install -q virtualenv\n", + "#!virtualenv -q arcee-cloud\n", + "#!source arcee-cloud/bin/activate\n", + "\n", + "%pip install -q arcee-py" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import arcee" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Deploying a model\n", + "\n", + "Let's pick the model we'd like to deploy, and set the name of this deployment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model_name = \"Llama-3-8B-Instruct\"\n", + "deployment_name = \"My Llama-3-8B-Instruct deployment\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We're now ready to deploy the model. We'll use the `start_deployment()` API and simply pass the model and deployment names.\n", + "\n", + "Here, we deploy an off-the-shelf model. For a pretrained or a merged model, we would respectively use the `pretraining` or the `merging` parameter instead of the `alignment` parameter." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "help(arcee.start_deployment)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = arcee.start_deployment(deployment_name=deployment_name, alignment=model_name)\n", + "print(response)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's wait for the endpoint to be provisioned. It should only take a few minutes.\n", + "\n", + "The `deployment_status` API lets us query the current state of the endpoint."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "help(arcee.deployment_status)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from time import sleep\n", + "\n", + "while True:\n", + "    response = arcee.deployment_status(deployment_name)\n", + "    if response[\"deployment_processing_state\"] == \"pending\":\n", + "        print(\"Deployment is in progress. Waiting 30 seconds before checking again.\")\n", + "        sleep(30)\n", + "    else:\n", + "        print(response)\n", + "        break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Generating text with our model\n", + "\n", + "Now, let's test the endpoint with a simple prompt.\n", + "\n", + "The `generate()` API requires the deployment name and the prompt." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "help(arcee.generate)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = arcee.generate(deployment_name=deployment_name, query=\"How did Alan Turing break the Enigma code?\")\n", + "print(response[\"text\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"Please write a marketing pitch for a new SaaS AI platform called Arcee Cloud. \\\n", + " Arcee Cloud makes it simple for enterprise users to tailor open-source small language models to their own domain knowledge, \\\n", + " in order to build high-quality, cost-effective and secure AI solutions. Focus on facts, don't make up numbers. \\\n", + " We will send this pitch by email to business and technical decision-makers, so make it sound exciting and convincing. \\\n", + " The contact email is sales@arcee.ai. 
Feel free to use emojis as appropriate.\"\n", + "\n", + "response = arcee.generate(deployment_name=deployment_name, query=query)\n", + "print(response[\"text\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Stopping our deployment\n", + "\n", + "Once we're done working with our model, we should stop the deployment to avoid unwanted charges.\n", + "\n", + "The `stop_deployment()` API only requires the deployment name." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "help(arcee.stop_deployment)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "arcee.stop_deployment(deployment_name=deployment_name)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "arcee.deployment_status(deployment_name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This concludes the model deployment demonstration. Thank you for your time!\n", + "\n", + "If you'd like to know more about using Arcee Cloud in your organization, please visit the [Arcee website](https://www.arcee.ai), or contact [sales@arcee.ai](mailto:sales@arcee.ai).\n", + "\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.4" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}