diff --git a/llm-models/safeguard/llamaguard/Llama_Guard_Demo_with_Databricks_marketplace_simplified_pii_detect.ipynb b/llm-models/safeguard/llamaguard/Llama_Guard_Demo_with_Databricks_marketplace_simplified_pii_detect.ipynb
new file mode 100644
index 0000000..14cb5f8
--- /dev/null
+++ b/llm-models/safeguard/llamaguard/Llama_Guard_Demo_with_Databricks_marketplace_simplified_pii_detect.ipynb
@@ -0,0 +1,2292 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "5c726947-5baf-43e6-9dbd-e671b43d8999",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "ro49xTeE_wci"
+   },
+   "source": [
+    "# Databricks foundation models and Llama Guard integration\n",
+    "\n",
+    "**IMPORTANT** The Llama Guard integration is in **Private Preview**. To enroll in the Private Preview, reach out to your Databricks account team.\n",
+    "\n",
+    "Meta's [Purple LLaMA](https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai/) project has introduced Llama Guard, a robust 7 billion parameter model designed for chat moderation. This innovative model, comprehensively detailed in its [model card](https://huggingface.co/meta-llama/LlamaGuard-7b), plays a pivotal role in enhancing the safety and quality of interactions with conversational AI models.\n",
+    "\n",
+    "<!-- At Databricks/MosaicML, our commitment to responsible AI adoption is underscored by integrating tools like Llama Guard. We are excited to announce the availability of a demo Llama Guard endpoint at `https://models.hosted-on.mosaicml.hosting/llamaguard-7b/v2/chat`. This endpoint is readily accessible to the global AI community and does not require any authorization. -->\n",
+    "\n",
+    "### Explore the demo\n",
+    "\n",
+    "This interactive demo enables you to:\n",
+    "\n",
+    "1. **Engage with Llama Guard**: Utilize the model for effective prompt and response filtering, ensuring a safer chat experience.\n",
+    "2. **Custom taxonomy configuration**: Seamlessly define and implement your own taxonomy criteria specific to your needs with Llama Guard.\n",
+    "3. **Comprehensive integration**: Establish a robust end-to-end safety pipeline by integrating Llama Guard with your chat model, enhancing overall model performance and user safety.\n",
+    "\n",
+    "### Before you begin\n",
+    "\n",
+    "- Before continuing through this notebook you are required to go this [link](/marketplace/consumer/listings/9cd61515-663a-4d71-b1a7-758458b68dff) and click on  `Get instant access`. This allows you to accept the terms of service of the model providers and register the model in a UC catalog. \n",
+    "- Reach out to your Databricks account team to enroll in the Private Preview.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "e6f423ef-3b94-473a-ac31-e37aa35dbad3",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "## Set up authentication"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "88179980-1d41-4e4e-8353-6a4ae8577c22",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "\n",
+    "This guide outlines the steps to configure your PAT using Databricks Secrets and the Databricks CLI.\n",
+    "\n",
+    "1. [**Install Databricks CLI**:](https://docs.databricks.com/en/dev-tools/cli/install.html#install-or-update-the-databricks-cli)\n",
+    "   - Run `curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sudo sh` on your laptop or cluster terminal.\n",
+    "\n",
+    "2. **Configure the Databricks CLI**:\n",
+    "   - Use `databricks configure --token` and input your workspace URL and a [Personal Access Token (PAT)](https://docs.databricks.com/en/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users) from your Databricks profile.\n",
+    "\n",
+    "3. **Create a secret scope**: \n",
+    "   - Create a secret scope named `fm_demo` with `databricks secrets create-scope fm_demo`.\n",
+    "\n",
+    "4. **Save service principal secret**: \n",
+    "   - Store your service principal secret in the `fm_demo` scope using `databricks secrets put-secret fm_demo sp_token`. This is necessary for the Model Endpoint's authentication. For a demo or test, one of your PAT tokens can be used."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "be31b086-dd96-488d-8e71-76a4f5535b0d",
+     "showTitle": false,
+     "title": ""
+    },
+    "jupyter": {
+     "outputs_hidden": true
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001B[43mNote: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.\u001B[0m\nRequirement already satisfied: databricks-sdk in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (0.20.0)\nRequirement already satisfied: mlflow==2.10.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (2.10.0)\nRequirement already satisfied: pydantic==2.6.1 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (2.6.1)\nRequirement already satisfied: CloudPickle==3.0.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (3.0.0)\nRequirement already satisfied: presidio_analyzer in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (2.2.353)\nRequirement already satisfied: presidio_anonymizer in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (2.2.353)\nRequirement already satisfied: importlib-metadata!=4.7.0,<8,>=3.7.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (4.11.3)\nRequirement already satisfied: requests<3,>=2.17.3 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (2.28.1)\nRequirement already satisfied: markdown<4,>=3.3 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (3.3.4)\nRequirement already satisfied: databricks-cli<1,>=0.8.7 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (0.17.7)\nRequirement already satisfied: sqlalchemy<3,>=1.4.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.4.39)\nRequirement already satisfied: pyyaml<7,>=5.1 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (6.0)\nRequirement already satisfied: sqlparse<1,>=0.4.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (0.4.2)\nRequirement already satisfied: pytz<2024 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (2022.1)\nRequirement already satisfied: gitpython<4,>=2.1.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (3.1.27)\nRequirement already satisfied: pyarrow<16,>=4.0.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (8.0.0)\nRequirement already satisfied: querystring-parser<2 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from mlflow==2.10.0) (1.2.4)\nRequirement already satisfied: alembic!=1.10.0,<2 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from mlflow==2.10.0) (1.13.1)\nRequirement already satisfied: scikit-learn<2 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.1.1)\nRequirement already satisfied: entrypoints<1 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (0.4)\nRequirement already satisfied: gunicorn<22 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (20.1.0)\nRequirement already satisfied: Flask<4 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.1.2+db1)\nRequirement already satisfied: scipy<2 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.9.1)\nRequirement already satisfied: numpy<2 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.21.5)\nRequirement already satisfied: protobuf<5,>=3.12.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (3.19.4)\nRequirement already satisfied: packaging<24 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (21.3)\nRequirement already satisfied: click<9,>=7.0 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (8.0.4)\nRequirement already satisfied: docker<8,>=4.0.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from mlflow==2.10.0) (7.0.0)\nRequirement already satisfied: pandas<3 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (1.4.4)\nRequirement already satisfied: Jinja2<4,>=2.11 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (2.11.3)\nRequirement already satisfied: matplotlib<4 in /databricks/python3/lib/python3.10/site-packages (from mlflow==2.10.0) (3.5.2)\nRequirement already satisfied: pydantic-core==2.16.2 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from pydantic==2.6.1) (2.16.2)\nRequirement already satisfied: typing-extensions>=4.6.1 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from pydantic==2.6.1) (4.10.0)\nRequirement already satisfied: annotated-types>=0.4.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from pydantic==2.6.1) (0.6.0)\nRequirement already satisfied: google-auth~=2.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from databricks-sdk) (2.28.1)\nRequirement already satisfied: regex in /databricks/python3/lib/python3.10/site-packages (from presidio_analyzer) (2022.7.9)\nRequirement already satisfied: tldextract in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from presidio_analyzer) (5.1.1)\nRequirement already satisfied: spacy<4.0.0,>=3.4.4 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from presidio_analyzer) (3.7.4)\nRequirement already satisfied: phonenumbers<9.0.0,>=8.12 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from presidio_analyzer) (8.13.31)\nRequirement already satisfied: pycryptodome>=3.10.1 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from presidio_anonymizer) (3.20.0)\nRequirement already satisfied: Mako in /databricks/python3/lib/python3.10/site-packages (from alembic!=1.10.0,<2->mlflow==2.10.0) (1.2.0)\nRequirement already satisfied: urllib3<2.0.0,>=1.26.7 in /databricks/python3/lib/python3.10/site-packages (from databricks-cli<1,>=0.8.7->mlflow==2.10.0) (1.26.11)\nRequirement already satisfied: oauthlib>=3.1.0 in /usr/lib/python3/dist-packages (from databricks-cli<1,>=0.8.7->mlflow==2.10.0) (3.2.0)\nRequirement already satisfied: pyjwt>=1.7.0 in /usr/lib/python3/dist-packages (from databricks-cli<1,>=0.8.7->mlflow==2.10.0) (2.3.0)\nRequirement already satisfied: six>=1.10.0 in /usr/lib/python3/dist-packages (from databricks-cli<1,>=0.8.7->mlflow==2.10.0) (1.16.0)\nRequirement already satisfied: tabulate>=0.7.7 in /databricks/python3/lib/python3.10/site-packages (from databricks-cli<1,>=0.8.7->mlflow==2.10.0) (0.8.10)\nRequirement already satisfied: itsdangerous>=0.24 in /databricks/python3/lib/python3.10/site-packages (from Flask<4->mlflow==2.10.0) (2.0.1)\nRequirement already satisfied: Werkzeug>=0.15 in /databricks/python3/lib/python3.10/site-packages (from Flask<4->mlflow==2.10.0) (2.0.3)\nRequirement already satisfied: gitdb<5,>=4.0.1 in /databricks/python3/lib/python3.10/site-packages (from gitpython<4,>=2.1.0->mlflow==2.10.0) (4.0.10)\nRequirement already satisfied: pyasn1-modules>=0.2.1 in /databricks/python3/lib/python3.10/site-packages (from google-auth~=2.0->databricks-sdk) (0.2.8)\nRequirement already satisfied: rsa<5,>=3.1.4 in /databricks/python3/lib/python3.10/site-packages (from google-auth~=2.0->databricks-sdk) (4.9)\nRequirement already satisfied: cachetools<6.0,>=2.0.0 in /databricks/python3/lib/python3.10/site-packages (from google-auth~=2.0->databricks-sdk) (4.2.4)\nRequirement already satisfied: setuptools>=3.0 in /databricks/python3/lib/python3.10/site-packages (from gunicorn<22->mlflow==2.10.0) (63.4.1)\nRequirement already satisfied: zipp>=0.5 in /databricks/python3/lib/python3.10/site-packages (from importlib-metadata!=4.7.0,<8,>=3.7.0->mlflow==2.10.0) (3.8.0)\nRequirement already satisfied: MarkupSafe>=0.23 in /databricks/python3/lib/python3.10/site-packages (from Jinja2<4,>=2.11->mlflow==2.10.0) (2.0.1)\nRequirement already satisfied: python-dateutil>=2.7 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (2.8.2)\nRequirement already satisfied: pyparsing>=2.2.1 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (3.0.9)\nRequirement already satisfied: kiwisolver>=1.0.1 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (1.4.2)\nRequirement already satisfied: pillow>=6.2.0 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (9.2.0)\nRequirement already satisfied: cycler>=0.10 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (0.11.0)\nRequirement already satisfied: fonttools>=4.22.0 in /databricks/python3/lib/python3.10/site-packages (from matplotlib<4->mlflow==2.10.0) (4.25.0)\nRequirement already satisfied: certifi>=2017.4.17 in /databricks/python3/lib/python3.10/site-packages (from requests<3,>=2.17.3->mlflow==2.10.0) (2022.9.14)\nRequirement already satisfied: charset-normalizer<3,>=2 in /databricks/python3/lib/python3.10/site-packages (from requests<3,>=2.17.3->mlflow==2.10.0) (2.0.4)\nRequirement already satisfied: idna<4,>=2.5 in /databricks/python3/lib/python3.10/site-packages (from requests<3,>=2.17.3->mlflow==2.10.0) (3.3)\nRequirement already satisfied: threadpoolctl>=2.0.0 in /databricks/python3/lib/python3.10/site-packages (from scikit-learn<2->mlflow==2.10.0) (2.2.0)\nRequirement already satisfied: joblib>=1.0.0 in /databricks/python3/lib/python3.10/site-packages (from scikit-learn<2->mlflow==2.10.0) (1.2.0)\nRequirement already satisfied: thinc<8.3.0,>=8.2.2 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (8.2.3)\nRequirement already satisfied: tqdm<5.0.0,>=4.38.0 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (4.64.1)\nRequirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (3.0.12)\nRequirement already satisfied: smart-open<7.0.0,>=5.2.1 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (5.2.1)\nRequirement already satisfied: typer<0.10.0,>=0.3.0 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (0.7.0)\nRequirement already satisfied: langcodes<4.0.0,>=3.2.0 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (3.3.0)\nRequirement already satisfied: wasabi<1.2.0,>=0.9.1 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (1.1.2)\nRequirement already satisfied: preshed<3.1.0,>=3.0.2 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (3.0.8)\nRequirement already satisfied: srsly<3.0.0,>=2.4.3 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (2.4.7)\nRequirement already satisfied: catalogue<2.1.0,>=2.0.6 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (2.0.9)\nRequirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (1.0.4)\nRequirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (1.0.9)\nRequirement already satisfied: weasel<0.4.0,>=0.1.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (0.3.4)\nRequirement already satisfied: cymem<2.1.0,>=2.0.2 in /databricks/python3/lib/python3.10/site-packages (from spacy<4.0.0,>=3.4.4->presidio_analyzer) (2.0.7)\nRequirement already satisfied: greenlet!=0.4.17 in /databricks/python3/lib/python3.10/site-packages (from sqlalchemy<3,>=1.4.0->mlflow==2.10.0) (1.1.1)\nRequirement already satisfied: requests-file>=1.4 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from tldextract->presidio_analyzer) (2.0.0)\nRequirement already satisfied: filelock>=3.0.8 in /databricks/python3/lib/python3.10/site-packages (from tldextract->presidio_analyzer) (3.6.0)\nRequirement already satisfied: smmap<6,>=3.0.1 in /databricks/python3/lib/python3.10/site-packages (from gitdb<5,>=4.0.1->gitpython<4,>=2.1.0->mlflow==2.10.0) (5.0.0)\nRequirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /databricks/python3/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth~=2.0->databricks-sdk) (0.4.8)\nRequirement already satisfied: confection<1.0.0,>=0.0.1 in /databricks/python3/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4.0.0,>=3.4.4->presidio_analyzer) (0.1.1)\nRequirement already satisfied: blis<0.8.0,>=0.7.8 in /databricks/python3/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4.0.0,>=3.4.4->presidio_analyzer) (0.7.10)\nRequirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages (from weasel<0.4.0,>=0.1.0->spacy<4.0.0,>=3.4.4->presidio_analyzer) (0.16.0)\n\u001B[43mNote: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.\u001B[0m\n"
+     ]
+    }
+   ],
+   "source": [
+    "%pip install --upgrade databricks-sdk mlflow==2.10.0 pydantic==2.6.1 CloudPickle==3.0.0 presidio_analyzer presidio_anonymizer\n",
+    "dbutils.library.restartPython()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "492ead35-68b6-42a1-9be5-a265311a0188",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "# Function to set up Databricks environment variables\n",
+    "def setup_databricks_env():\n",
+    "    try:\n",
+    "        # Fetching the Databricks host and token\n",
+    "        databricks_host = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()\n",
+    "        databricks_token = dbutils.secrets.get(\"fm_demo\", \"sp_token\")\n",
+    "\n",
+    "        # Setting environment variables for Databricks SDK\n",
+    "        os.environ['DATABRICKS_TOKEN'] = databricks_token\n",
+    "        os.environ['DATABRICKS_HOST'] = databricks_host\n",
+    "    except Exception as e:\n",
+    "        print(\"Error setting up Databricks environment:\", e)\n",
+    "\n",
+    "# Call the function to set up the environment\n",
+    "setup_databricks_env()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "32c8a010-19e4-4427-a01f-c36d5bc3f5db",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "## Configuration settings for deploying models from Databricks Marketplace"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "aa4f44b5-67d6-402b-ad18-458a7e271a72",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "Deploy your own Llama Guard to Databricks [model serving](/ml/endpoints), specifically the Llama2-7b model, which has been instruction-tuned using our comprehensive dataset available in the Databricks [Marketplace](/marketplace). \n",
+    "\n",
+    "One of the primary benefits of utilizing marketplace models within Databricks is the default integration of [optimized model serving](https://docs.databricks.com/en/machine-learning/model-serving/llm-optimized-model-serving.html) when these models are deployed on Databricks endpoints. This feature enhances performance by optimizing resource usage and response times, making it an ideal choice for efficient machine learning operations.\n",
+    "\n",
+    "\n",
+    "<!-- For detailed guidance on using the endpoint, refer to the [Databricks Foundation Model API](https://docs.databricks.com/en/machine-learning/foundation-models/api-reference.html) documentation.--> \n",
+    "You are encouraged to explore the [Deploy provisioned throughput Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/deploy-prov-throughput-foundation-model-apis.html#deploy-provisioned-throughput-foundation-model-apis) for deploying your own instance of the Llama Guard model on Databricks.\n",
+    "\n",
+    "To deploy your model in provisioned throughput mode using the SDK, you must specify min_provisioned_throughput and max_provisioned_throughput fields in your request.\n",
+    "\n",
+    "To identify the suitable range of provisioned throughput for your model, see Get provisioned throughput in increments."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "9b115833-c129-482f-a34a-6af4a45e3001",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Name of the catalog containing the model. Replace with your catalog name if different.\n",
+    "CATALOG_NAME = \"databricks_llama_guard_model\"\n",
+    "\n",
+    "# Name of the model to be used.\n",
+    "MODEL_NAME = \"llamaguard_7b\"\n",
+    "\n",
+    "# Unified path to access the model in the catalog.\n",
+    "MODEL_UC_PATH = f\"{CATALOG_NAME}.models.{MODEL_NAME}\"\n",
+    "\n",
+    "# Version of the model to be loaded for inference. Update to the latest version as needed.\n",
+    "VERSION = \"1\"\n",
+    "\n",
+    "# The name of the endpoint for deploying the model.\n",
+    "LLAMAGUARD_ENDPOINT_NAME = f'{MODEL_NAME}_instruction'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "d5aa5c8d-78b4-49d8-8d6e-1c95613a84e4",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "##Deploying the model to Model Serving\n",
+    "You can deploy this model directly to a Databricks Model Serving Endpoint ([AWS](https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html)|[Azure](https://learn.microsoft.com/en-us/azure/databricks/machine-learning/model-serving/create-manage-serving-endpoints)).\n",
+    "\n",
+    "Note: Model serving is not supported on GCP. On GCP, Databricks recommends running Batch inference using Spark, as shown below.\n",
+    "\n",
+    "The following are recommended workload types for each model size:\n",
+    "\n",
+    "\n",
+    "| Model Name    | Suggested workload type (AWS) | Suggested workload type (AZURE) |\n",
+    "|---------------|-------------------------------|---------------------------------|\n",
+    "| LlamaGuard_7b |                 |                     |\n",
+    "\n",
+    "\n",
+    "You can create the endpoint by clicking the “Serve this model” button in the model UI. We will be using Databricks [provisioned throughput Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/deploy-prov-throughput-foundation-model-apis.html). Provisioned throughput provides optimized inference for Foundation Models with performance guarantees for production workloads.\n",
+    "\n",
+    "To deploy your model in provisioned throughput mode, you must specify min_provisioned_throughput and max_provisioned_throughput fields in your request.\n",
+    "\n",
+    "You can also create the endpoint with Databricks SDK as follows:\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "992ed9f1-1f00-4566-a9d0-6ca30e5377f8",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import datetime\n",
+    "\n",
+    "from databricks.sdk import WorkspaceClient\n",
+    "from databricks.sdk.service.serving import EndpointCoreConfigInput\n",
+    "w = WorkspaceClient()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "4e89bee5-9e88-47a0-baed-a1ee409e62b4",
+     "showTitle": true,
+     "title": "Create endpoint with config"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Endpoint with name 'llamaguard_7b_instruction' already exists.\n"
+     ]
+    }
+   ],
+   "source": [
+    "\n",
+    "min_provisioned_throughput = 1940 #The minimum tokens per second that the endpoint can scale down to.\n",
+    "max_provisioned_throughput = 2910 #The maximum tokens per second that the endpoint can scale up to.\n",
+    "\n",
+    "config = EndpointCoreConfigInput.from_dict({\n",
+    "    \"served_entities\": [\n",
+    "        {\n",
+    "             \"entity_name\":  MODEL_UC_PATH, #The name of the entity to be served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC)\n",
+    "             \"entity_version\": VERSION, #The version of the model in Databricks Model Registry to be served or empty if the entity is a FEATURE_SPEC.\n",
+    "             \"min_provisioned_throughput\": min_provisioned_throughput, \n",
+    "             \"max_provisioned_throughput\": max_provisioned_throughput, \n",
+    "        }\n",
+    "    ]\n",
+    "})\n",
+    "try:\n",
+    "    model_details = w.serving_endpoints.create(name=LLAMAGUARD_ENDPOINT_NAME, config=config)\n",
+    "    model_details.result(timeout=datetime.timedelta(minutes=90))\n",
+    "except Exception as e:\n",
+    "    print(f\"{e}\")   "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "32bf0e83-6ef4-4c1e-be30-aa0013d4e54d",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "## Set the default Llama Guard taxonomy\n",
+    "\n",
+    "\n",
+    "To use a Llama Guard model you need to define a taxonomy with violations. You can define a default taxonomy as described below.\n",
+    "\n",
+    "This example uses the MLflow Deployment SDK to query the serving endpoint that serves your foundation model. See [Query foundation models](https://docs.databricks.com/en/machine-learning/model-serving/score-foundation-models.html#language-MLflow%C2%A0Deployments%C2%A0SDK)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "41e1a17c-0f6e-49f5-89a6-ecca6adb0393",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "DEFAULT_UNSAFE_CATEGORY=\"\"\"O1: Unsafe Content.\n",
+    "    - Promoting or assisting in violence or hate.\n",
+    "    - Engaging in or encouraging sexually explicit content.\n",
+    "    - Assisting or encouraging criminal activities, including those related to firearms or illegal weapons or robbery.\n",
+    "    - Encouraging or assisting in the use or production of illegal or controlled substances.\n",
+    "    - Encouraging or detailing self-harm or suicide.\"\"\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "ec394c30-ddc9-441f-8469-8cb3f127000c",
+     "showTitle": true,
+     "title": "Query LlamaGuard Model for Unsafe Content"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/pydantic/_internal/_config.py:322: UserWarning: Valid config keys have changed in V2:\n* 'schema_extra' has been renamed to 'json_schema_extra'\n  warnings.warn(message, UserWarning)\n"
+     ]
+    }
+   ],
+   "source": [
+    "\n",
+    "import mlflow\n",
+    "import mlflow.deployments\n",
+    "import re\n",
+    "\n",
+    "def query_llamaguard(chat, unsafe_categories=DEFAULT_UNSAFE_CATEGORY):\n",
+    "    \"\"\"\n",
+    "    Queries LlamaGuard model to check if the chat contains unsafe content based on predefined categories.\n",
+    "    \"\"\"\n",
+    "    try:\n",
+    "        client = mlflow.deployments.get_deploy_client(\"databricks\")\n",
+    "\n",
+    "        # Constructing the payload for the LlamaGuard API\n",
+    "\n",
+    "        query_payload = [\n",
+    "           f\"\"\"[INST] Task: Check for unsafe content in 'Agent' messages in conversations according to our safety policy.\n",
+    "          <BEGIN UNSAFE CONTENT CATEGORIES>\n",
+    "          {unsafe_categories}\n",
+    "          <END UNSAFE CONTENT CATEGORIES>\n",
+    "          <BEGIN CONVERSATION>\n",
+    "          {chat}\n",
+    "          <END CONVERSATION>\n",
+    "          Provide your safety assessment for Agent in the above conversation:\n",
+    "          - First line must read 'safe' or 'unsafe'.\n",
+    "          - If unsafe, a second line must include a comma-separated list of violated categories. [/INST]\"\"\"\n",
+    "        ]\n",
+    "\n",
+    "        response = client.predict(\n",
+    "            endpoint=LLAMAGUARD_ENDPOINT_NAME,\n",
+    "            inputs={\"dataframe_split\": {\n",
+    "                    \"index\": [0],\n",
+    "                    \"columns\": [\"prompt\"],\n",
+    "                    \"data\": [query_payload]\n",
+    "                    }\n",
+    "                })\n",
+    "        \n",
+    "        # Extract the desired information from the response object\n",
+    "        prediction = response.predictions[0][\"candidates\"][0][\"text\"].strip()\n",
+    "        is_safe = None if len(prediction.split(\"\\n\")) == 1 else prediction.split(\"\\n\")[1].strip()\n",
+    "        \n",
+    "        return prediction.split(\"\\n\")[0].lower()=='safe', is_safe\n",
+    "    \n",
+    "    except Exception as e:\n",
+    "        raise Exception(f\"Error in querying LlamaGuard model: {str(e)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "2cf9b15b-da8f-49fd-a4a2-8c4750ca0d0d",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "LHmZsckQmZi3"
+   },
+   "source": [
+    "Llama Guard operates as an advanced model designed to categorize user prompts and responses generated by Large Language Models (LLMs) into 'safe' and 'unsafe' classifications. To understand its functionality, consider experimenting with the Llama Guard endpoint using various user prompts. \n",
+    "\n",
+    "Below are two distinct examples of such prompts. One is deemed 'safe,' meaning it adheres to LlamaGuard's guidelines for acceptable content, while the other is classified as 'unsafe,' indicating it contains elements that may violate these guidelines. \n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "10d41317-4984-4dc2-901c-75d7ea913219",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "4R17mpoEXnU4"
+   },
+   "outputs": [],
+   "source": [
+    "safe_user_chat = [\n",
+    "  {\n",
+    "      \"role\": \"user\",\n",
+    "      \"content\": \"I want to love.\"\n",
+    "  }\n",
+    "]\n",
+    "unsafe_user_chat = [\n",
+    "  {\n",
+    "      \"role\": \"user\",\n",
+    "      \"content\": \"How can I rob a bank?\"\n",
+    "  }\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "c2d5ce7b-87a4-4586-8675-f03cbf85f133",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "MvYVjwcP12rv",
+    "outputId": "e7a22dc0-fc86-4fcb-e0e5-7b6b0c72762c"
+   },
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "(True, None)"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "query_llamaguard(safe_user_chat)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "259cffcc-0742-4610-a1d0-b2f0deb7d19e",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "NIlaa97Bof7N",
+    "outputId": "b382a1bc-6810-4e0f-88c2-c90a9fb18a63"
+   },
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "(False, 'O1')"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "query_llamaguard(unsafe_user_chat)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "2410f1ba-aea0-48cd-bc61-0a343ea7ef60",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "uEi2XNxToLEk"
+   },
+   "source": [
+    "Now, here are two examples of possible chat model responses to query the LlamaGuard endpoint with. Again, one of these examples is safe while the other is unsafe.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "dcbd75c7-f34c-41b5-8a4a-1d9bf7fab578",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "fPinQL1r-g-T"
+   },
+   "outputs": [],
+   "source": [
+    "safe_assistant_chat = [\n",
+    "  {\n",
+    "      \"role\": \"user\",\n",
+    "      \"content\": \"I want to love.\"\n",
+    "  },\n",
+    "  {\n",
+    "      \"role\": \"assistant\",\n",
+    "      \"content\": \"That is great! I want to love too!\"\n",
+    "  }\n",
+    "]\n",
+    "\n",
+    "unsafe_assistant_chat = [\n",
+    "  {\n",
+    "      \"role\": \"user\",\n",
+    "      \"content\": \"I want to love\"\n",
+    "  },\n",
+    "  {\n",
+    "      \"role\": \"assistant\",\n",
+    "      \"content\": \"I think the only way to rob a bank is to work as robinhood\"\n",
+    "  }\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "7b4dbe3f-b245-48b6-a0dd-e17925dc4bd9",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "icHfLmtJDUr-",
+    "outputId": "8196dc84-a2d6-48cd-faca-36f3d15b10a0"
+   },
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "(False, 'O1')"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "query_llamaguard(unsafe_assistant_chat)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "1a84319d-76f4-4f34-b557-d680f6234da4",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "DN13S5gHajNc"
+   },
+   "source": [
+    "## Customize Llama Guard taxonomy\n",
+    "\n",
+    "Llama Guard's **taxonomy** allows practitioners to define and categorize safety risks in chat model interactions. You can customize this taxonomy to tailor the safety assessment criteria. While Llama Guard has default `unsafe` categories, you can experiment by adding your own. However, remember to format your custom categories precisely as required for the model to accurately classify the safety risk of input prompts."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "2d92c6af-f83d-4f2b-9aad-49d65c996e7a",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "8_h9Mlw0bW3B"
+   },
+   "outputs": [],
+   "source": [
+    "unsafe_categories = \"\"\"O1: Violence and Hate.\n",
+    "Should not\n",
+    "- Help people plan or engage in violence.\n",
+    "- Encourage people to engage in violence.\n",
+    "- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\n",
+    "- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\n",
+    "Can\n",
+    "- Provide information on violence and discrimination.\n",
+    "- Discuss the topics of hate and violence, and can discuss historical events involving violence.\n",
+    "O2: Sexual Content.\n",
+    "Should not\n",
+    "- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\n",
+    "Can\n",
+    "- Discuss topics related to sexuality and sexual education.\n",
+    "- Provide general information regarding sexual activities.\n",
+    "- Provide guidance on topics related to sexual health.\n",
+    "O3: Criminal Planning.\n",
+    "Should not\n",
+    "- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\n",
+    "- Encourage people to engage in criminal activities or condone criminal behavior.\n",
+    "Can\n",
+    "- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\n",
+    "- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\n",
+    "O4: Guns and Illegal Weapons.\n",
+    "Should not\n",
+    "- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\n",
+    "- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\n",
+    "Can\n",
+    "- Discuss firearms and the arguments for and against firearm ownership.\n",
+    "O5: Regulated or Controlled Substances.\n",
+    "Should not\n",
+    "- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\n",
+    "- Assist or encourage people to create such substances when it is illegal to do so.\n",
+    "Can\n",
+    "- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\n",
+    "- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\n",
+    "O6: Self-Harm.\n",
+    "Should not\n",
+    "- Encourage people to harm themselves.\n",
+    "- Romanticize or condone self-harm or suicide.\n",
+    "- Provide information regarding the methods of suicide or self-harm.\n",
+    "- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\n",
+    "Can\n",
+    "- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\n",
+    "Should\n",
+    "- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\n",
+    "O7: Financial Sensitive Data.\n",
+    "Should not\n",
+    "- Provide any information on credit card number or pin number.\n",
+    "Can\n",
+    "- Provide general information regarding the risk of sharing credit card information with LLMs.\n",
+    "Should\n",
+    "- Provide information regarding proper handling of credit card information with LLMs.\"\"\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "e6c53d4a-cb75-4f14-8603-6acd10bfa811",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "qLSeht11agi9",
+    "outputId": "27bb9e84-78a4-4457-be1f-ed301aaaf302"
+   },
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "(False, 'O3')"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "query_llamaguard(unsafe_user_chat, unsafe_categories)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "44ea4832-93ba-42a9-a675-89e561bc138f",
+     "showTitle": false,
+     "title": ""
+    },
+    "id": "yCjbxWnXblCq"
+   },
+   "source": [
+    "##Integrate Llama Guard with chat model output\n",
+    "Now let's see how Llama Guard integrates with an actual chat model. Below `query_chat` is a function that calls a chat model Databricks Foundation Modela API and returns the output. `query_chat_safely` runs Llama Guard before and after `query_chat` to implement safety guardrails.\n",
+    "\n",
+    "Our chatbot leverages the **Mixtral 8x7B foundation model** to deliver responses. This model is accessible through the built-in foundation endpoint, available at [/ml/endpoints](/ml/endpoints) and specifically via the `/serving-endpoints/databricks-mixtral-8x7b-instruct/invocations` API. \n",
+    "\n",
+    "In the following cells demonstrate the use of the [Python SDK](https://docs.databricks.com/en/machine-learning/foundation-models/query-foundation-model-apis.html) for querying our Llama-2-70b model accessible through Databricks foundation model APIs.\n",
+    "\n",
+    "### Note:\n",
+    "There are multiple endpoint options and Langchain models available for use:\n",
+    "\n",
+    "1. **Databricks Foundation Models:** This is our choice for the current project.\n",
+    "2. **Your fine-tuned model:** Custom models tailored to specific needs.\n",
+    "3. **External model providers:** Options such as Azure OpenAI for alternative solutions.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "6e4afe3e-4ca0-4a08-bf44-134dfb22d992",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "CHAT_ENDPOINT_NAME = \"databricks-mixtral-8x7b-instruct\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "9a4b6fe5-d259-417b-a8e9-f5b0fc4b938e",
+     "showTitle": true,
+     "title": "Chat model query engine"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def query_chat(chat):\n",
+    "  \"\"\"\n",
+    "    Queries a chat model for a response based on the provided chat input.\n",
+    "\n",
+    "    Args:\n",
+    "        chat : The chat input for which a response is desired.\n",
+    "\n",
+    "    Returns:\n",
+    "        The chat model's response to the input.\n",
+    "\n",
+    "    Raises:\n",
+    "        Exception: If there are issues in querying the chat model or processing the response.\n",
+    "  \"\"\"\n",
+    "  try:\n",
+    "    client = mlflow.deployments.get_deploy_client(\"databricks\")\n",
+    "    response = client.predict(\n",
+    "        endpoint=CHAT_ENDPOINT_NAME,\n",
+    "        inputs={\n",
+    "            \"messages\": chat,\n",
+    "            \"temperature\": 0.1,\n",
+    "            \"max_tokens\": 512\n",
+    "        }\n",
+    "    )\n",
+    "    return response.choices[0][\"message\"][\"content\"]\n",
+    "  except Exception as e:\n",
+    "      raise Exception(f\"Error in querying chat model: {str(e)}\")\n",
+    "\n",
+    "\n",
+    "def query_chat_safely(chat, unsafe_categories):\n",
+    "    \"\"\"\n",
+    "    Queries a chat model safely by checking the safety of both the user's input and the model's response.\n",
+    "    It uses the LlamaGuard model to assess the safety of the chat content.\n",
+    "\n",
+    "    Args:\n",
+    "        chat : The user's chat input.\n",
+    "        unsafe_categories : String of categories used to determine the safety of the chat content.\n",
+    "\n",
+    "    Returns:\n",
+    "        The chat model's response if safe, else a safety warning message.\n",
+    "\n",
+    "    Raises:\n",
+    "        Exception: If there are issues in querying the chat model, processing the response, \n",
+    "                    or assessing the safety of the chat.\n",
+    "    \"\"\"\n",
+    "    try:\n",
+    "        is_safe, reason = query_llamaguard(chat, unsafe_categories)\n",
+    "        if not is_safe:\n",
+    "            category = parse_category(reason, unsafe_categories)\n",
+    "            return f\"User's prompt classified as {category}; fails safety measures.\"\n",
+    "\n",
+    "        model_response = query_chat(chat)\n",
+    "        full_chat = chat + [{\"role\": \"assistant\", \"content\": model_response}]\n",
+    "\n",
+    "        is_safe, reason = query_llamaguard(full_chat, unsafe_categories)\n",
+    "        if not is_safe:\n",
+    "            category = parse_category(reason, unsafe_categories)\n",
+    "            return f\"Model's response classified as {category}; fails safety measures.\"\n",
+    "\n",
+    "        return model_response\n",
+    "    except Exception as e:\n",
+    "        raise Exception(f\"Error in safe query: {str(e)}\")\n",
+    "\n",
+    "def parse_category(code, taxonomy):\n",
+    "    \"\"\"\n",
+    "    Extracts the first sentence of a category description from a taxonomy based on its code.\n",
+    "\n",
+    "    Args:\n",
+    "        code : Category code in the taxonomy (e.g., 'O1').\n",
+    "        taxonomy : Full taxonomy string with categories and descriptions.\n",
+    "\n",
+    "    Returns:\n",
+    "         First sentence of the description or a default message for unknown codes.\n",
+    "    \"\"\"\n",
+    "    pattern = r\"(O\\d+): ([\\s\\S]*?)(?=\\nO\\d+:|\\Z)\"\n",
+    "    taxonomy_mapping = {match[0]: re.split(r'(?<=[.!?])\\s+', match[1].strip(), 1)[0]\n",
+    "                        for match in re.findall(pattern, taxonomy)}\n",
+    "\n",
+    "    return taxonomy_mapping.get(code, \"Unknown category: code not in taxonomy.\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "d71e1fa9-a044-4fcf-bfaa-e369eb9f3a7a",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 35
+    },
+    "id": "nffsF-CIg0nQ",
+    "outputId": "8e13a011-2d81-4e82-e613-b31e91d5406e"
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "💛 Loving others can bring great joy and fulfillment to your life. Here are some ways to cultivate love:\n\n1. Practice self-love: Before you can love others, it's essential to love yourself. Treat yourself with kindness, compassion, and respect. Engage in activities that nourish your mind, body, and spirit.\n\n2. Listen actively: Pay attention to what others are saying and show genuine interest in their thoughts and feelings. By doing so, you demonstrate that you value and care for them.\n\n3. Show empathy: Try to understand others' perspectives and experiences. This can help build stronger connections and foster a sense of belonging.\n\n4. Perform acts of kindness: Small gestures, such as holding the door open for someone or offering a compliment, can make a significant difference in someone's day.\n\n5. Communicate openly and honestly: Share your thoughts and feelings with others in a respectful and considerate manner. This can help build trust and deepen relationships.\n\n6. Spend quality time: Dedicate time to those you care about. Engage in activities that you both enjoy and cherish the moments you share together.\n\n7. Cultivate gratitude: Regularly express appreciation for the people and things in your life. This can help foster a positive mindset and strengthen relationships.\n\n8. Seek help when needed: If you're struggling to love or feel loved, consider seeking support from a mental health professional. They can provide guidance and tools to help you navigate your emotions and relationships.\n\nRemember, love is a journey, not a destination. Be patient with yourself and others as you cultivate and nurture loving relationships. 💛\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(query_chat_safely(safe_user_chat, unsafe_categories))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "e01b3c89-2987-4510-bda9-69b2fd0ba8e1",
+     "showTitle": false,
+     "title": ""
+    },
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 35
+    },
+    "id": "GLHWIW59g7S6",
+    "outputId": "08156909-15e5-40c2-9ea3-44458db1a965"
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "User's prompt classified as Criminal Planning.; fails safety measures.\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(query_chat_safely(unsafe_user_chat, unsafe_categories))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "0202f892-782e-4bbb-bc28-0bd0e488ff61",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "\n",
+    "Bringing it all together as a custom model to be deployed to our serving endpoint."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "fd72ba94-5c2b-4608-b669-e067f5b7bd88",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from presidio_analyzer import AnalyzerEngine\n",
+    "from presidio_anonymizer import AnonymizerEngine\n",
+    "\n",
+    "\n",
+    "class CustomChat(mlflow.pyfunc.PythonModel):\n",
+    "  \"\"\"\n",
+    "    A custom chat model that integrates with Databricks and employs safety filters for generating responses.\n",
+    "\n",
+    "    Attributes:\n",
+    "        databricks_host (str): The host URL for the Databricks service.\n",
+    "        default_unsafe_categories (str): Default categories considered unsafe for chat content.\n",
+    "        default_filter_endpoint (str): The endpoint name for the filtering service.\n",
+    "        default_chat_endpoint (str): The endpoint name for the chat service.\n",
+    "  \"\"\" \n",
+    "\n",
+    "  def __init__(self, databricks_host, default_unsafe_categorys=DEFAULT_UNSAFE_CATEGORY, default_filter_endpoint=LLAMAGUARD_ENDPOINT_NAME, default_chat_endpoint=CHAT_ENDPOINT_NAME):\n",
+    "    \"\"\"\n",
+    "    Initializes the CustomChat model with default parameters.\n",
+    "\n",
+    "    Args:\n",
+    "        databricks_host : The host URL of the Databricks service.\n",
+    "        default_unsafe_categories : Default categories considered unsafe for chat content.\n",
+    "        default_filter_endpoint : The endpoint name for the filtering service.\n",
+    "        default_chat_endpoint : The endpoint name for the chat service.\n",
+    "    \"\"\"\n",
+    "    self.databricks_host = databricks_host\n",
+    "    self.default_unsafe_categorys = default_unsafe_categorys\n",
+    "    self.default_filter_endpoint = default_filter_endpoint\n",
+    "    self.default_chat_endpoint = default_chat_endpoint\n",
+    "        \n",
+    "  def load_context(self, context):\n",
+    "        os.environ['DATABRICKS_HOST'] = self.databricks_host\n",
+    "        \n",
+    "  def predict(self, context, model_input):\n",
+    "      \"\"\"\n",
+    "        Generates chat responses for the given input using the specified chat model endpoint, optionally applying a safety filter.\n",
+    "\n",
+    "        Args:\n",
+    "            context: The context object provided by the MLflow runtime.\n",
+    "            model_input : The input data for the model, expecting a \"messages\" key.\n",
+    "\n",
+    "\n",
+    "        Returns:\n",
+    "            list: A list of dictionaries, each representing a chat response with additional metadata.\n",
+    "      \"\"\"\n",
+    "    \n",
+    " \n",
+    "      #safety filter off by default\n",
+    "      enable_safety_filter = model_input.get(\"enable_safety_filter\", [False])[0]\n",
+    "      enable_pii_filter = model_input.get(\"enable_pii_filter\", [False])[0]\n",
+    "\n",
+    "      unsafe_categories = None\n",
+    "      filter_endpoint = None\n",
+    "    \n",
+    "\n",
+    "      if enable_safety_filter:\n",
+    "        unsafe_categories = str(model_input.get(\"unsafe_categories\", [self.default_unsafe_categorys])[0])\n",
+    "        filter_endpoint = str(model_input.get(\"filter_endpoint\", [self.default_filter_endpoint])[0])\n",
+    "    \n",
+    "      chat_endpoint = str(model_input.get(\"chat_endpoint\", [self.default_chat_endpoint])[0])\n",
+    "      messages = list(model_input[\"messages\"][0])\n",
+    "      temperature = float(model_input.get(\"temperature\", [0.1])[0])\n",
+    "      max_tokens = int(model_input.get(\"max_tokens\", [512])[0])\n",
+    "\n",
+    "      response_messages = self._generate_response(\n",
+    "          messages,\n",
+    "          enable_safety_filter=enable_safety_filter,\n",
+    "          enable_pii_filter=enable_pii_filter,\n",
+    "          unsafe_categories=unsafe_categories,\n",
+    "          filter_endpoint=filter_endpoint,\n",
+    "          chat_endpoint=chat_endpoint,\n",
+    "          temperature=temperature,\n",
+    "          max_tokens=max_tokens,\n",
+    "      )\n",
+    "\n",
+    "      return response_messages\n",
+    "  \n",
+    "  def _generate_response(self, messages, **kwargs):\n",
+    "    \"\"\"\n",
+    "        Internal helper method to generate responses for a list of messages, applying safety filters if enabled.\n",
+    "\n",
+    "        Args:\n",
+    "            messages : A list of message dictionaries for which responses are to be generated.\n",
+    "            **kwargs: Keyword arguments containing settings and configurations for response generation.\n",
+    "\n",
+    "        Returns:\n",
+    "             A list of response dictionaries for each input message.\n",
+    "    \"\"\"  \n",
+    "\n",
+    "    enable_safety_filter = kwargs[\"enable_safety_filter\"]\n",
+    "    enable_pii_filter = kwargs[\"enable_pii_filter\"]\n",
+    "    outputs = None\n",
+    "\n",
+    "    if not enable_safety_filter:\n",
+    "      #simply call the query endpoint and construct output\n",
+    "      outputs = [self._query_chat(chat, kwargs[\"chat_endpoint\"], kwargs[\"temperature\"], kwargs[\"max_tokens\"]) for chat in messages]\n",
+    "    else:\n",
+    "      #call safe chat endpoint  \n",
+    "      outputs = [self._query_chat_safely(chat, kwargs[\"unsafe_categories\"], kwargs[\"chat_endpoint\"], kwargs[\"filter_endpoint\"], kwargs[\"temperature\"], kwargs[\"max_tokens\"]) for chat in messages] \n",
+    " \n",
+    "\n",
+    "    responses = []\n",
+    "    for out in outputs:\n",
+    "      try:             \n",
+    "        prompt_tokens = out['usage']['prompt_tokens']\n",
+    "        completion_tokens = out['usage']['completion_tokens']\n",
+    "        total_tokens = out['usage']['total_tokens']\n",
+    "\n",
+    "        response = {\n",
+    "          \"id\": out['id'],\n",
+    "          \"object\": out['object'],\n",
+    "          \"created\": out['created'],\n",
+    "          \"model\": out['model'],\n",
+    "          \"choices\": [\n",
+    "              {\n",
+    "                  \"index\": choice['index'],\n",
+    "                  \"message\": {\n",
+    "                      \"role\": choice['message']['role'],\n",
+    "                      \"content\": self._anonymize_pii(choice['message']['content']) if enable_pii_filter else choice['message']['content']\n",
+    "                  },\n",
+    "                  \"finish_reason\": choice['finish_reason']\n",
+    "              } for choice in out['choices']\n",
+    "          ],\n",
+    "          \"usage\": {\n",
+    "              \"prompt_tokens\": prompt_tokens,\n",
+    "              \"completion_tokens\": completion_tokens,\n",
+    "              \"total_tokens\": total_tokens\n",
+    "          }\n",
+    "        }\n",
+    "      except Exception as e:\n",
+    "          response = {\n",
+    "          \"id\": None,\n",
+    "          \"object\": None,\n",
+    "          \"created\": None,\n",
+    "          \"model\": None,\n",
+    "          \"choices\": [\n",
+    "              {\n",
+    "                  \"index\": None,\n",
+    "                  \"message\": {\n",
+    "                      \"role\": \"Assistant\",\n",
+    "                      \"content\": out\n",
+    "                  },\n",
+    "                  \"finish_reason\": \"Usage Policy Violation\"\n",
+    "              } \n",
+    "          ],\n",
+    "          \"usage\": {\n",
+    "              \"prompt_tokens\": None,\n",
+    "              \"completion_tokens\": None,\n",
+    "              \"total_tokens\": None\n",
+    "          }\n",
+    "        }\n",
+    "      responses.append(response)\n",
+    "    return responses\n",
+    "\n",
+    "  def _query_chat(self, chat, chat_endpoint, temperature, max_tokens):\n",
+    "      \"\"\"\n",
+    "        Queries a chat model endpoint for a response to the given chat input.\n",
+    "\n",
+    "        Args:\n",
+    "            chat : The chat input string.\n",
+    "            chat_endpoint : The chat model endpoint to query.\n",
+    "            temperature : The temperature parameter for the chat model.\n",
+    "            max_tokens : The maximum number of tokens for the chat model response.\n",
+    "\n",
+    "        Returns:\n",
+    "            The chat model's response.\n",
+    "\n",
+    "        Raises:\n",
+    "            Exception: If there's an error querying the chat model or processing the response.\n",
+    "      \"\"\"\n",
+    "      try:\n",
+    "        client = self._get_client()\n",
+    "        response = client.predict(\n",
+    "            endpoint=chat_endpoint,\n",
+    "            inputs={\n",
+    "                'messages': chat,\n",
+    "                'temperature': temperature,\n",
+    "                'max_tokens': max_tokens\n",
+    "            }\n",
+    "        )\n",
+    "        # return response.choices[0][\"message\"][\"content\"]\n",
+    "        return response\n",
+    "      except Exception as e:\n",
+    "          raise Exception(f\"Error in querying chat model: {str(e)}\")\n",
+    "\n",
+    "  def _query_chat_safely(self, chat, unsafe_categories, chat_endpoint, filter_endpoint, temperature, max_tokens):\n",
+    "    \"\"\"\n",
+    "    Safely queries a chat model by first applying a safety filter to the chat input and model's response.\n",
+    "\n",
+    "    Args:\n",
+    "        chat : The user's chat input.\n",
+    "        unsafe_categories : Categories considered unsafe for the chat content.\n",
+    "        chat_endpoint : The chat model endpoint to query.\n",
+    "        filter_endpoint : The safety filter endpoint to query.\n",
+    "        temperature : The temperature parameter for the chat model.\n",
+    "        max_tokens : The maximum number of tokens for the chat model response.\n",
+    "\n",
+    "    Returns:\n",
+    "        The safe chat model's response if the input and response pass the safety filter; otherwise, a safety warning message.\n",
+    "\n",
+    "    Raises:\n",
+    "        Exception: If there's an error in querying the chat model, processing the response, or assessing the safety of the chat.\n",
+    "    \"\"\"\n",
+    "    try:\n",
+    "        is_safe, code, violation_category = self._query_guardmodel(chat, unsafe_categories, filter_endpoint)\n",
+    "        if not is_safe:\n",
+    "            return f\"User's prompt classified as {violation_category}. Fails safety measures.\"\n",
+    "\n",
+    "        model_response = self._query_chat(chat, chat_endpoint, temperature, max_tokens)\n",
+    "        full_chat = chat + [{\"role\": \"assistant\", \"content\": model_response.choices[0][\"message\"][\"content\"]}]\n",
+    "\n",
+    "        is_safe, code, violation_category = self._query_guardmodel(full_chat, unsafe_categories, filter_endpoint)\n",
+    "        if not is_safe:\n",
+    "            return f\"Model's response classified as {violation_category}. Fails safety measures.\"\n",
+    "          \n",
+    "        return model_response\n",
+    "    except Exception as e:\n",
+    "        raise Exception(f\"Error in safe query: {str(e)}\")\n",
+    "\n",
+    "    # Define the PII anonymization function\n",
+    "  def _anonymize_pii(self, text):\n",
+    "    analyzer = AnalyzerEngine()\n",
+    "    anonymizer = AnonymizerEngine()\n",
+    "    analysis_results = analyzer.analyze(text=text, language='en')\n",
+    "    anonymized_results = anonymizer.anonymize(text=text, analyzer_results=analysis_results)\n",
+    "    return anonymized_results.text\n",
+    "\n",
+    "  def _query_guardmodel(self, chat,  unsafe_categories, filter_endpoint):\n",
+    "    \"\"\"\n",
+    "      Queries LlamaGuard model to check if the chat contains unsafe content based on predefined categories.\n",
+    "\n",
+    "      Args:\n",
+    "          chat : List of chat messages for safety analysis.\n",
+    "          unsafe_categories : Custom-defined categories for safety check.\n",
+    "                                            Defaults to 'default_unsafe_categories'.\n",
+    "\n",
+    "      Returns:\n",
+    "          tuple : (is_safe, reason) where 'is_safe' indicates if content is safe,\n",
+    "                                    and 'reason' is the unsafe category, if any.\n",
+    "\n",
+    "      Raises:\n",
+    "          Exception: For issues in querying LlamaGuard model or processing response.\n",
+    "    \"\"\"\n",
+    "    try:\n",
+    "        client = self._get_client()\n",
+    "\n",
+    "        # Constructing the payload for the LlamaGuard API\n",
+    "\n",
+    "        query_payload = [\n",
+    "           f\"\"\"[INST] Task: Check for unsafe content in 'Agent' messages in conversations according to our safety policy.\n",
+    "          <BEGIN UNSAFE CONTENT CATEGORIES>\n",
+    "          {unsafe_categories}\n",
+    "          <END UNSAFE CONTENT CATEGORIES>\n",
+    "          <BEGIN CONVERSATION>\n",
+    "          {chat}\n",
+    "          <END CONVERSATION>\n",
+    "          Provide your safety assessment for Agent in the above conversation:\n",
+    "          - First line must read 'safe' or 'unsafe'.\n",
+    "          - If unsafe, a second line must include a comma-separated list of violated categories. [/INST]\"\"\"\n",
+    "        ]\n",
+    "\n",
+    "        response = client.predict(\n",
+    "            endpoint= filter_endpoint,\n",
+    "            inputs={\"dataframe_split\": {\n",
+    "                    \"index\": [0],\n",
+    "                    \"columns\": [\"prompt\"],\n",
+    "                    \"data\": [query_payload]\n",
+    "                    }\n",
+    "                })\n",
+    "        \n",
+    "        # Extract the desired information from the response object\n",
+    "        prediction = response.predictions[0][\"candidates\"][0][\"text\"].strip()\n",
+    "        violation_code = None if len(prediction.split(\"\\n\")) == 1 else prediction.split(\"\\n\")[1].strip()\n",
+    "        is_safe = prediction.split(\"\\n\")[0].lower()=='safe'\n",
+    "        violation_category = None\n",
+    "\n",
+    "        if not is_safe:\n",
+    "\n",
+    "          pattern = r\"(O\\d+): ([^.!?]*[.!?])\"\n",
+    "          # Find all matches\n",
+    "          matches = re.findall(pattern, unsafe_categories)\n",
+    "          # Convert matches to a dictionary\n",
+    "          categories_dict = {match[0]: match[1].strip() for match in matches}\n",
+    "          violation_category=categories_dict.get(violation_code, \"Unknown category: code not in taxonomy.\")\n",
+    "\n",
+    "        return is_safe, violation_code, violation_category \n",
+    "    \n",
+    "    except Exception as e:\n",
+    "        raise Exception(f\"Error in querying guard model: {str(e)}\")\n",
+    "    \n",
+    "  def _get_client(self):\n",
+    "        # Dynamically create and return the client\n",
+    "        return mlflow.deployments.get_deploy_client(\"databricks\")  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "93437c36-dbcb-460f-aba1-4b3071890495",
+     "showTitle": true,
+     "title": "Chat safety filter schema generator"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/mlflow/types/schema.py:540: FutureWarning: `optional` is deprecated and will be removed in a future version of MLflow. Use `required` instead.\n  warnings.warn(\n"
+     ]
+    },
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>messages</th>\n",
+       "      <th>enable_safety_filter</th>\n",
+       "      <th>enable_pii_filter</th>\n",
+       "      <th>chat_endpoint</th>\n",
+       "      <th>filter_endpoint</th>\n",
+       "      <th>unsafe_categories</th>\n",
+       "      <th>temperature</th>\n",
+       "      <th>max_tokens</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>[[{'role': 'user', 'content': 'I want to love....</td>\n",
+       "      <td>True</td>\n",
+       "      <td>True</td>\n",
+       "      <td>databricks-mixtral-8x7b-instruct</td>\n",
+       "      <td>llamaguard_7b_instruction</td>\n",
+       "      <td>O1: Violence and Hate.\\nShould not\\n- Help peo...</td>\n",
+       "      <td>0.1</td>\n",
+       "      <td>100</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                            messages  ...  max_tokens\n",
+       "0  [[{'role': 'user', 'content': 'I want to love....  ...         100\n",
+       "\n",
+       "[1 rows x 8 columns]"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from mlflow.models.signature import ModelSignature\n",
+    "from mlflow.types import DataType, Schema, ColSpec\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Define input and output schema\n",
+    "input_schema = Schema(\n",
+    "    [\n",
+    "        ColSpec(DataType.string, \"messages\"),\n",
+    "        ColSpec(DataType.boolean, \"enable_safety_filter\", optional=True),\n",
+    "        ColSpec(DataType.boolean, \"enable_pii_filter\", optional=True),\n",
+    "        ColSpec(DataType.string, \"chat_endpoint\", optional=True),\n",
+    "        ColSpec(DataType.string, \"filter_endpoint\", optional=True),\n",
+    "        ColSpec(DataType.string, \"unsafe_categories\", optional=True),\n",
+    "        ColSpec(DataType.double, \"temperature\", optional=True),\n",
+    "        ColSpec(DataType.long, \"max_tokens\", optional=True),\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "output_schema = Schema([ColSpec(DataType.string)])\n",
+    "signature = ModelSignature(inputs=input_schema, outputs=output_schema)\n",
+    "# Define input example\n",
+    "\n",
+    "input_example = pd.DataFrame(\n",
+    "    {\n",
+    "        \"messages\": [[safe_user_chat]],\n",
+    "        \"enable_safety_filter\": [True],\n",
+    "        \"enable_pii_filter\": [True],\n",
+    "        \"chat_endpoint\": [CHAT_ENDPOINT_NAME],\n",
+    "        \"filter_endpoint\": [LLAMAGUARD_ENDPOINT_NAME],\n",
+    "        \"unsafe_categories\": [unsafe_categories],\n",
+    "        \"temperature\": [0.1],\n",
+    "        \"max_tokens\": [100],\n",
+    "    }\n",
+    ")\n",
+    "\n",
+    "input_example"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "38ebc168-1530-45e3-8d06-a1d02cff7686",
+     "showTitle": true,
+     "title": "Python code for SafeChat model"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Using no filter: \n [\n    {\n        \"id\": \"a082fcae-acb4-4191-9ac3-063fc2351840\",\n        \"object\": \"chat.completion\",\n        \"created\": 1709763962,\n        \"model\": \"mixtral-8x7b-instruct-v0.1\",\n        \"choices\": [\n            {\n                \"index\": 0,\n                \"message\": {\n                    \"role\": \"assistant\",\n                    \"content\": \"\\ud83d\\udc9b Loving others can bring great joy and fulfillment to your life. Here are some ways to cultivate love:\\n\\n1. Cultivate self-love: Begin by loving yourself, accepting your flaws, and taking care of your physical, emotional, and mental well-being.\\n2. Practice active listening: Show genuine interest in others by listening to them attentively and empathetically.\\n3. Show kindness and compassion: Perform small\"\n                },\n                \"finish_reason\": \"length\"\n            }\n        ],\n        \"usage\": {\n            \"prompt_tokens\": 87,\n            \"completion_tokens\": 100,\n            \"total_tokens\": 187\n        }\n    }\n]\n"
+     ]
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "import json\n",
+    "output=None\n",
+    "try:\n",
+    "    # Using default taxonomy\n",
+    "    model = CustomChat(databricks_host=os.environ['DATABRICKS_HOST'])\n",
+    "    model.load_context(None)\n",
+    "    output = model.predict(None, input_example)\n",
+    "    print(f\"Using no filter: \\n {json.dumps(output, indent=4)}\")\n",
+    "except Exception as e:\n",
+    "    # Handle exceptions that may occur during prediction\n",
+    "    print(f\"Error during model prediction: {e}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "cf1772da-8959-45ab-a713-bf9727a20691",
+     "showTitle": true,
+     "title": "Model logging with MLflow"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/databricks/python/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.\n  warnings.warn(\"Setuptools is replacing distutils.\")\n"
+     ]
+    },
+    {
+     "output_type": "display_data",
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "b0c7c13c634c4f1ca9b6ccca036776b9",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Uploading artifacts:   0%|          | 0/6 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import mlflow\n",
+    "\n",
+    "# Log the model with its details such as artifacts, pip requirements and input example\n",
+    "with mlflow.start_run() as run:\n",
+    "    mlflow.pyfunc.log_model(\n",
+    "        \"model\",\n",
+    "        python_model=CustomChat(databricks_host=os.environ['DATABRICKS_HOST']),\n",
+    "        input_example=input_example,\n",
+    "        signature=signature,\n",
+    "        pip_requirements=[\"mlflow==2.10.0\", \"pydantic==2.6.1\", \"CloudPickle==3.0.0\", \"starlette\", \"presidio_analyzer\", \"presidio_anonymizer\"]\n",
+    "    )"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "b6a3821c-133a-4b49-a3f3-85367041a202",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Registered model 'models.default.safe_chat_model' already exists. Creating a new version of this model...\n"
+     ]
+    },
+    {
+     "output_type": "display_data",
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "eb3046408b1149a795369a60516b24e7",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Downloading artifacts:   0%|          | 0/6 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "output_type": "display_data",
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "103fca2af6f44d5eaab48d2bbabec840",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Uploading artifacts:   0%|          | 0/6 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "output_type": "stream",
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Created version '45' of model 'models.default.safe_chat_model'.\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Configure MLflow Python client to register model in Unity Catalog\n",
+    "import mlflow\n",
+    "\n",
+    "mlflow.set_registry_uri(\"databricks-uc\")\n",
+    "\n",
+    "# Register model to Unity Catalog\n",
+    "# This may take 2 minutes to complete\n",
+    "\n",
+    "#change this to the name of the catalog you want to register this model in \n",
+    "SAFE_CHAT_CATALOG_NAME = \"models\"\n",
+    "\n",
+    "#change this to the name of the schema you want to register this model in \n",
+    "SAFE_CHAT_CATALOG_SCHEMA = \"default\"\n",
+    "\n",
+    "#change this to the name of the model you want\n",
+    "SAFE_CHAT_MODEL_NAME = \"safe_chat_model\"\n",
+    "\n",
+    "registered_name = f\"{SAFE_CHAT_CATALOG_NAME}.{SAFE_CHAT_CATALOG_SCHEMA}.{SAFE_CHAT_MODEL_NAME}\"  # Note that the UC model name follows the pattern <catalog_name>.<schema_name>.<model_name>, corresponding to the catalog, schema, and registered model name\n",
+    "\n",
+    "result = mlflow.register_model(\n",
+    "    \"runs:/\" + run.info.run_id + \"/model\",\n",
+    "    registered_name,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "ad1fecc1-7e06-4553-bc81-70173a8c1e03",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "from mlflow import MlflowClient\n",
+    "\n",
+    "client = MlflowClient()\n",
+    "\n",
+    "# Choose the right model version registered in the above cell.\n",
+    "client.set_registered_model_alias(name=registered_name, alias=\"Champion\", version=result.version)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "c489fd77-27ba-4f5b-90d7-89a03e3f378a",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "display_data",
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "fd1d0a5eec5543f39597c78057f8c159",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Downloading artifacts:   0%|          | 0/6 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import mlflow\n",
+    "\n",
+    "loaded_model = mlflow.pyfunc.load_model(f\"models:/{registered_name}@Champion\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "dcc31986-fef3-4d66-9ea0-857a6854918f",
+     "showTitle": true,
+     "title": "Content moderation model evaluation"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[{'id': '3ee388f7-c5a1-476d-9735-4da05af07989', 'object': 'chat.completion', 'created': 1709763972, 'model': 'mixtral-8x7b-instruct-v0.1', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': \"🛑 I'm here to promote positive and safe conversations. Encouraging violence or illegal activities, such as starting a riot, is against my guidelines. Let's discuss more constructive topics instead. How about sharing something you're passionate about or asking for advice on a hobby or interest?\"}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 90, 'completion_tokens': 65, 'total_tokens': 155}}] \n\n\n[{'id': None, 'object': None, 'created': None, 'model': None, 'choices': [{'index': None, 'message': {'role': 'Assistant', 'content': \"User's prompt classified as Unsafe Content.. Fails safety measures.\"}, 'finish_reason': 'Usage Policy Violation'}], 'usage': {'prompt_tokens': None, 'completion_tokens': None, 'total_tokens': None}}] \n\n\n[{'id': None, 'object': None, 'created': None, 'model': None, 'choices': [{'index': None, 'message': {'role': 'Assistant', 'content': \"User's prompt classified as Violence and Hate.. Fails safety measures.\"}, 'finish_reason': 'Usage Policy Violation'}], 'usage': {'prompt_tokens': None, 'completion_tokens': None, 'total_tokens': None}}] \n\n\n[{'id': 'b6cdaf0d-2872-4e42-aafc-9619fcf23729', 'object': 'chat.completion', 'created': 1709763974, 'model': 'llama-2-70b-chat-030424', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': \"\\nI cannot encourage or provide guidance on illegal or harmful activities, including starting a riot. Inciting violence or destruction of property is not only illegal, but it can also cause harm to individuals and damage to the community.\\n\\nInstead, I would encourage you to consider constructive ways to express your opinions and bring about positive change. There are many peaceful and lawful ways to make your voice heard and work towards a better society.\\n\\nIf you are feeling passionate about a particular issue, you could consider getting involved in local activism or advocacy efforts. You could also look into participating in peaceful protests or rallies, writing letters to your elected officials, or engaging in respectful dialogue with others to share your views and listen to their perspectives.\\n\\nRemember, violence and destruction of property is never an acceptable way to express your opinions. It can harm innocent people, damage property, and create more problems than it solves. Let's work towards creating a better world through peaceful and respectful means.\"}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 17, 'completion_tokens': 221, 'total_tokens': 238}}] \n\n\n[{'id': 'cdb9c762-a96a-40fe-85f5-dadbf4057ae2', 'object': 'chat.completion', 'created': 1709763981, 'model': 'mixtral-8x7b-instruct-v0.1', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': '1. Giannis Antetokounmpo: (414) 377-5332\\n2. Khris Middleton: (414) 377-5333\\n3. Jrue Holiday: (414) 377-5334\\n4. Brook Lopez: (414) 377-5335\\n5. Donte DiVincenzo: (414) 377-5336\\n6. Bobby Portis: (414) 377-5337\\n7. Pat Connaughton: (414) 377-5338\\n8. Grayson Allen: (414) 377-5339\\n9. George Hill: (414) 377-5340\\n10. Semi Ojeleye: (414) 377-5341\\n11. Thanasis Antetokounmpo: (414) 377-5342\\n12. Rodney Hood: (414) 377-5343\\n13. Jordan Nwora: (414) 377-5344\\n14. Sandro Mamukelashvili: (414) 377-5345\\n15. Justin Robinson: (414) 377-5346\\n\\nPlease note that these phone numbers are not real and have been made up for this response. The purpose is to demonstrate a list of phone numbers, not to provide actual contact information for the Milwaukee Bucks players.'}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 98, 'completion_tokens': 398, 'total_tokens': 496}}] \n\n\n[{'id': '69ddf92b-deaf-4896-a9b9-a04fe041de0b', 'object': 'chat.completion', 'created': 1709763987, 'model': 'mixtral-8x7b-instruct-v0.1', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': '1. <PERSON>: <PHONE_NUMBER>\\n2. <PERSON>: <PHONE_NUMBER>\\n3. Jrue Holiday: <PHONE_NUMBER>\\n4. <PERSON>: <PHONE_NUMBER>\\n5. <PERSON>: <PHONE_NUMBER>\\n6. <PERSON>: <PHONE_NUMBER>\\n7. <PERSON>: <PHONE_NUMBER>\\n8. <PERSON>: <PHONE_NUMBER>\\n9. <PERSON>: <PHONE_NUMBER>\\n10. Semi Ojeleye: <PHONE_NUMBER>\\n11. <PERSON>: <PHONE_NUMBER>\\n12. <PERSON>: <PHONE_NUMBER>\\n13. <PERSON>: <PHONE_NUMBER>\\n14. <PERSON>: <PHONE_NUMBER>\\n15. <PERSON>: <PHONE_NUMBER>\\n\\nPlease note that the above phone numbers are completely made up and for illustrative purposes only. I cannot provide actual phone numbers for privacy and security reasons. However, you can usually find contact information for professional sports teams through their official websites.'}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 98, 'completion_tokens': 400, 'total_tokens': 498}}]\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Make a prediction using no guard models\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"I want to start a riot.\"}]]}]\n",
+    "), \"\\n\\n\")\n",
+    "\n",
+    "# Make a prediction using default taxonomy\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"I want to start a riot.\"}]], \"enable_safety_filter\": True, \"temperature\": 0.3}]\n",
+    "), \"\\n\\n\")\n",
+    "\n",
+    "\n",
+    "# Make a prediction using the custom taxonomy\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"I want to start a riot.\"}]], \"enable_safety_filter\": True, \"unsafe_categories\": unsafe_categories}]\n",
+    "), \"\\n\\n\")\n",
+    "\n",
+    "# Make a prediction using a different Chat model and filter turned off\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"I want to start a riot.\"}]],\"enable_safety_filter\": False, \"unsafe_categories\": unsafe_categories ,\"chat_endpoint\": \"databricks-llama-2-70b-chat\"}]\n",
+    "), \"\\n\\n\")\n",
+    "\n",
+    "# Make a prediction using a pii filter off\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"List the phone numbers of the players on the Milwaukee Bucks.\"}]],\"enable_pii_filter\": False, \"unsafe_categories\": unsafe_categories}]\n",
+    "), \"\\n\\n\")\n",
+    "\n",
+    "# Make a prediction using a pii filter on\n",
+    "print(loaded_model.predict(\n",
+    "    [{\"messages\": [[{\"role\": \"user\", \"content\": \"List the phone numbers of the players on the Milwaukee Bucks.\"}]],\"enable_pii_filter\": True, \"unsafe_categories\": unsafe_categories}]\n",
+    "))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "a70e221d-4e88-450d-9feb-6ddd7ac98adb",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "## Create Model Serving endpoint\n",
+    "After the model is registered, you can use the API to create a Databricks GPU Model Serving Endpoint that serves the `safe_chat_model` model. The provisioned through put is only available for foundation models. Since our safe chat model is a custom model we are going to use a little different configuration to provison the serving endpoint. \n",
+    "\n",
+    "The two key attributes being `workload_size` and `workload_type`. \n",
+    "\n",
+    "- `workload_size` : The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are \"Small\" (4 - 4 provisioned concurrency), \"Medium\" (8 - 16 provisioned concurrency), and \"Large\" (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size will be 0.\n",
+    "- `workload_type` : The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is \"CPU\". For deep learning workloads, GPU acceleration is available by selecting workload types like GPU_SMALL and others. Here is a summary of various GPU workload_type available.\n",
+    "\n",
+    "| GPU workload type | GPU instance     | GPU memory |\n",
+    "|-------------------|------------------|------------|\n",
+    "| GPU_SMALL         | 1xT4             | 16GB       |\n",
+    "| GPU_MEDIUM        | 1xA10G           | 24GB       |\n",
+    "| MULTIGPU_MEDIUM   | 4xA10G           | 96GB       |\n",
+    "| GPU_MEDIUM_8      | 8xA10G           | 192GB      |\n",
+    "| GPU_LARGE_8       | 8xA100-80GB      | 320GB      |\n",
+    "\n",
+    " "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "0087329f-a663-4dd4-ae56-2b151e1def86",
+     "showTitle": true,
+     "title": "Create Safe Chat Endpoint"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "display_data",
+     "data": {
+      "text/plain": [
+       "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m\n",
+       "\u001B[0;31mResourceAlreadyExists\u001B[0m                     Traceback (most recent call last)\n",
+       "File \u001B[0;32m<command-3660584937456276>, line 30\u001B[0m\n",
+       "\u001B[1;32m     11\u001B[0m workload_type \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mCPU\u001B[39m\u001B[38;5;124m\"\u001B[39m\n",
+       "\u001B[1;32m     13\u001B[0m config \u001B[38;5;241m=\u001B[39m EndpointCoreConfigInput\u001B[38;5;241m.\u001B[39mfrom_dict({\n",
+       "\u001B[1;32m     14\u001B[0m     \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mserved_entities\u001B[39m\u001B[38;5;124m\"\u001B[39m: [\n",
+       "\u001B[1;32m     15\u001B[0m         {\n",
+       "\u001B[0;32m   (...)\u001B[0m\n",
+       "\u001B[1;32m     28\u001B[0m             }\n",
+       "\u001B[1;32m     29\u001B[0m })\n",
+       "\u001B[0;32m---> 30\u001B[0m w\u001B[38;5;241m.\u001B[39mserving_endpoints\u001B[38;5;241m.\u001B[39mcreate(name\u001B[38;5;241m=\u001B[39mendpoint_name, config\u001B[38;5;241m=\u001B[39mconfig)\n",
+       "\n",
+       "File \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/service/serving.py:2463\u001B[0m, in \u001B[0;36mServingEndpointsAPI.create\u001B[0;34m(self, name, config, rate_limits, tags)\u001B[0m\n",
+       "\u001B[1;32m   2460\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m tags \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m: body[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mtags\u001B[39m\u001B[38;5;124m'\u001B[39m] \u001B[38;5;241m=\u001B[39m [v\u001B[38;5;241m.\u001B[39mas_dict() \u001B[38;5;28;01mfor\u001B[39;00m v \u001B[38;5;129;01min\u001B[39;00m tags]\n",
+       "\u001B[1;32m   2461\u001B[0m headers \u001B[38;5;241m=\u001B[39m {\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mAccept\u001B[39m\u001B[38;5;124m'\u001B[39m: \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mapplication/json\u001B[39m\u001B[38;5;124m'\u001B[39m, \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mContent-Type\u001B[39m\u001B[38;5;124m'\u001B[39m: \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mapplication/json\u001B[39m\u001B[38;5;124m'\u001B[39m, }\n",
+       "\u001B[0;32m-> 2463\u001B[0m op_response \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_api\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mdo\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[38;5;124;43mPOST\u001B[39;49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[38;5;124;43m/api/2.0/serving-endpoints\u001B[39;49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mheaders\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m)\u001B[49m\n",
+       "\u001B[1;32m   2464\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m Wait(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mwait_get_serving_endpoint_not_updating,\n",
+       "\u001B[1;32m   2465\u001B[0m             response\u001B[38;5;241m=\u001B[39mServingEndpointDetailed\u001B[38;5;241m.\u001B[39mfrom_dict(op_response),\n",
+       "\u001B[1;32m   2466\u001B[0m             name\u001B[38;5;241m=\u001B[39mop_response[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mname\u001B[39m\u001B[38;5;124m'\u001B[39m])\n",
+       "\n",
+       "File \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/core.py:130\u001B[0m, in \u001B[0;36mApiClient.do\u001B[0;34m(self, method, path, query, headers, body, raw, files, data, response_headers)\u001B[0m\n",
+       "\u001B[1;32m    126\u001B[0m headers[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mUser-Agent\u001B[39m\u001B[38;5;124m'\u001B[39m] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_user_agent_base\n",
+       "\u001B[1;32m    127\u001B[0m retryable \u001B[38;5;241m=\u001B[39m retried(timeout\u001B[38;5;241m=\u001B[39mtimedelta(seconds\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_retry_timeout_seconds),\n",
+       "\u001B[1;32m    128\u001B[0m                     is_retryable\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_is_retryable,\n",
+       "\u001B[1;32m    129\u001B[0m                     clock\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_cfg\u001B[38;5;241m.\u001B[39mclock)\n",
+       "\u001B[0;32m--> 130\u001B[0m response \u001B[38;5;241m=\u001B[39m \u001B[43mretryable\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_perform\u001B[49m\u001B[43m)\u001B[49m\u001B[43m(\u001B[49m\u001B[43mmethod\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    131\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mpath\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    132\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mquery\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mquery\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    133\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mheaders\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    134\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    135\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mraw\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mraw\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    136\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mfiles\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mfiles\u001B[49m\u001B[43m,\u001B[49m\n",
+       "\u001B[1;32m    137\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mdata\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mdata\u001B[49m\u001B[43m)\u001B[49m\n",
+       "\u001B[1;32m    139\u001B[0m resp \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mdict\u001B[39m()\n",
+       "\u001B[1;32m    140\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m header \u001B[38;5;129;01min\u001B[39;00m response_headers \u001B[38;5;28;01mif\u001B[39;00m response_headers \u001B[38;5;28;01melse\u001B[39;00m []:\n",
+       "\n",
+       "File \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/retries.py:54\u001B[0m, in \u001B[0;36mretried.<locals>.decorator.<locals>.wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n",
+       "\u001B[1;32m     50\u001B[0m         retry_reason \u001B[38;5;241m=\u001B[39m \u001B[38;5;124mf\u001B[39m\u001B[38;5;124m'\u001B[39m\u001B[38;5;132;01m{\u001B[39;00m\u001B[38;5;28mtype\u001B[39m(err)\u001B[38;5;241m.\u001B[39m\u001B[38;5;18m__name__\u001B[39m\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m is allowed to retry\u001B[39m\u001B[38;5;124m'\u001B[39m\n",
+       "\u001B[1;32m     52\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retry_reason \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m:\n",
+       "\u001B[1;32m     53\u001B[0m     \u001B[38;5;66;03m# raise if exception is not retryable\u001B[39;00m\n",
+       "\u001B[0;32m---> 54\u001B[0m     \u001B[38;5;28;01mraise\u001B[39;00m err\n",
+       "\u001B[1;32m     56\u001B[0m logger\u001B[38;5;241m.\u001B[39mdebug(\u001B[38;5;124mf\u001B[39m\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mRetrying: \u001B[39m\u001B[38;5;132;01m{\u001B[39;00mretry_reason\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m (sleeping ~\u001B[39m\u001B[38;5;132;01m{\u001B[39;00msleep\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124ms)\u001B[39m\u001B[38;5;124m'\u001B[39m)\n",
+       "\u001B[1;32m     57\u001B[0m clock\u001B[38;5;241m.\u001B[39msleep(sleep \u001B[38;5;241m+\u001B[39m random())\n",
+       "\n",
+       "File \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/retries.py:33\u001B[0m, in \u001B[0;36mretried.<locals>.decorator.<locals>.wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n",
+       "\u001B[1;32m     31\u001B[0m \u001B[38;5;28;01mwhile\u001B[39;00m clock\u001B[38;5;241m.\u001B[39mtime() \u001B[38;5;241m<\u001B[39m deadline:\n",
+       "\u001B[1;32m     32\u001B[0m     \u001B[38;5;28;01mtry\u001B[39;00m:\n",
+       "\u001B[0;32m---> 33\u001B[0m         \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43mfunc\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n",
+       "\u001B[1;32m     34\u001B[0m     \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mException\u001B[39;00m \u001B[38;5;28;01mas\u001B[39;00m err:\n",
+       "\u001B[1;32m     35\u001B[0m         last_err \u001B[38;5;241m=\u001B[39m err\n",
+       "\n",
+       "File \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/core.py:238\u001B[0m, in \u001B[0;36mApiClient._perform\u001B[0;34m(self, method, path, query, headers, body, raw, files, data)\u001B[0m\n",
+       "\u001B[1;32m    234\u001B[0m     \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m response\u001B[38;5;241m.\u001B[39mok: \u001B[38;5;66;03m# internally calls response.raise_for_status()\u001B[39;00m\n",
+       "\u001B[1;32m    235\u001B[0m         \u001B[38;5;66;03m# TODO: experiment with traceback pruning for better readability\u001B[39;00m\n",
+       "\u001B[1;32m    236\u001B[0m         \u001B[38;5;66;03m# See https://stackoverflow.com/a/58821552/277035\u001B[39;00m\n",
+       "\u001B[1;32m    237\u001B[0m         payload \u001B[38;5;241m=\u001B[39m response\u001B[38;5;241m.\u001B[39mjson()\n",
+       "\u001B[0;32m--> 238\u001B[0m         \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_make_nicer_error(response\u001B[38;5;241m=\u001B[39mresponse, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mpayload) \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;28mNone\u001B[39m\n",
+       "\u001B[1;32m    239\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m response\n",
+       "\u001B[1;32m    240\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m requests\u001B[38;5;241m.\u001B[39mexceptions\u001B[38;5;241m.\u001B[39mJSONDecodeError:\n",
+       "\n",
+       "\u001B[0;31mResourceAlreadyExists\u001B[0m: Endpoint with name 'safe_chat_endpoint' already exists."
+      ]
+     },
+     "metadata": {
+      "application/vnd.databricks.v1+output": {
+       "arguments": {},
+       "data": "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m\n\u001B[0;31mResourceAlreadyExists\u001B[0m                     Traceback (most recent call last)\nFile \u001B[0;32m<command-3660584937456276>, line 30\u001B[0m\n\u001B[1;32m     11\u001B[0m workload_type \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mCPU\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m     13\u001B[0m config \u001B[38;5;241m=\u001B[39m EndpointCoreConfigInput\u001B[38;5;241m.\u001B[39mfrom_dict({\n\u001B[1;32m     14\u001B[0m     \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mserved_entities\u001B[39m\u001B[38;5;124m\"\u001B[39m: [\n\u001B[1;32m     15\u001B[0m         {\n\u001B[0;32m   (...)\u001B[0m\n\u001B[1;32m     28\u001B[0m             }\n\u001B[1;32m     29\u001B[0m })\n\u001B[0;32m---> 30\u001B[0m w\u001B[38;5;241m.\u001B[39mserving_endpoints\u001B[38;5;241m.\u001B[39mcreate(name\u001B[38;5;241m=\u001B[39mendpoint_name, config\u001B[38;5;241m=\u001B[39mconfig)\n\nFile \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/service/serving.py:2463\u001B[0m, in \u001B[0;36mServingEndpointsAPI.create\u001B[0;34m(self, name, config, rate_limits, tags)\u001B[0m\n\u001B[1;32m   2460\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m tags \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m: body[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mtags\u001B[39m\u001B[38;5;124m'\u001B[39m] \u001B[38;5;241m=\u001B[39m [v\u001B[38;5;241m.\u001B[39mas_dict() \u001B[38;5;28;01mfor\u001B[39;00m v \u001B[38;5;129;01min\u001B[39;00m tags]\n\u001B[1;32m   2461\u001B[0m headers \u001B[38;5;241m=\u001B[39m {\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mAccept\u001B[39m\u001B[38;5;124m'\u001B[39m: \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mapplication/json\u001B[39m\u001B[38;5;124m'\u001B[39m, \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mContent-Type\u001B[39m\u001B[38;5;124m'\u001B[39m: \u001B[38;5;124m'\u001B[39m\u001B[38;5;124mapplication/json\u001B[39m\u001B[38;5;124m'\u001B[39m, }\n\u001B[0;32m-> 2463\u001B[0m op_response \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_api\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mdo\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[38;5;124;43mPOST\u001B[39;49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[38;5;124;43m/api/2.0/serving-endpoints\u001B[39;49m\u001B[38;5;124;43m'\u001B[39;49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mheaders\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m   2464\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m Wait(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mwait_get_serving_endpoint_not_updating,\n\u001B[1;32m   2465\u001B[0m             response\u001B[38;5;241m=\u001B[39mServingEndpointDetailed\u001B[38;5;241m.\u001B[39mfrom_dict(op_response),\n\u001B[1;32m   2466\u001B[0m             name\u001B[38;5;241m=\u001B[39mop_response[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mname\u001B[39m\u001B[38;5;124m'\u001B[39m])\n\nFile \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/core.py:130\u001B[0m, in \u001B[0;36mApiClient.do\u001B[0;34m(self, method, path, query, headers, body, raw, files, data, response_headers)\u001B[0m\n\u001B[1;32m    126\u001B[0m headers[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mUser-Agent\u001B[39m\u001B[38;5;124m'\u001B[39m] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_user_agent_base\n\u001B[1;32m    127\u001B[0m retryable \u001B[38;5;241m=\u001B[39m retried(timeout\u001B[38;5;241m=\u001B[39mtimedelta(seconds\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_retry_timeout_seconds),\n\u001B[1;32m    128\u001B[0m                     is_retryable\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_is_retryable,\n\u001B[1;32m    129\u001B[0m                     clock\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_cfg\u001B[38;5;241m.\u001B[39mclock)\n\u001B[0;32m--> 130\u001B[0m response \u001B[38;5;241m=\u001B[39m \u001B[43mretryable\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_perform\u001B[49m\u001B[43m)\u001B[49m\u001B[43m(\u001B[49m\u001B[43mmethod\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    131\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mpath\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    132\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mquery\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mquery\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    133\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mheaders\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    134\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    135\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mraw\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mraw\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    136\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mfiles\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mfiles\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m    137\u001B[0m \u001B[43m                                    \u001B[49m\u001B[43mdata\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mdata\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m    139\u001B[0m resp \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mdict\u001B[39m()\n\u001B[1;32m    140\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m header \u001B[38;5;129;01min\u001B[39;00m response_headers \u001B[38;5;28;01mif\u001B[39;00m response_headers \u001B[38;5;28;01melse\u001B[39;00m []:\n\nFile \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/retries.py:54\u001B[0m, in \u001B[0;36mretried.<locals>.decorator.<locals>.wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n\u001B[1;32m     50\u001B[0m         retry_reason \u001B[38;5;241m=\u001B[39m \u001B[38;5;124mf\u001B[39m\u001B[38;5;124m'\u001B[39m\u001B[38;5;132;01m{\u001B[39;00m\u001B[38;5;28mtype\u001B[39m(err)\u001B[38;5;241m.\u001B[39m\u001B[38;5;18m__name__\u001B[39m\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m is allowed to retry\u001B[39m\u001B[38;5;124m'\u001B[39m\n\u001B[1;32m     52\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retry_reason \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m:\n\u001B[1;32m     53\u001B[0m     \u001B[38;5;66;03m# raise if exception is not retryable\u001B[39;00m\n\u001B[0;32m---> 54\u001B[0m     \u001B[38;5;28;01mraise\u001B[39;00m err\n\u001B[1;32m     56\u001B[0m logger\u001B[38;5;241m.\u001B[39mdebug(\u001B[38;5;124mf\u001B[39m\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mRetrying: \u001B[39m\u001B[38;5;132;01m{\u001B[39;00mretry_reason\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m (sleeping ~\u001B[39m\u001B[38;5;132;01m{\u001B[39;00msleep\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124ms)\u001B[39m\u001B[38;5;124m'\u001B[39m)\n\u001B[1;32m     57\u001B[0m clock\u001B[38;5;241m.\u001B[39msleep(sleep \u001B[38;5;241m+\u001B[39m random())\n\nFile \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/retries.py:33\u001B[0m, in \u001B[0;36mretried.<locals>.decorator.<locals>.wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n\u001B[1;32m     31\u001B[0m \u001B[38;5;28;01mwhile\u001B[39;00m clock\u001B[38;5;241m.\u001B[39mtime() \u001B[38;5;241m<\u001B[39m deadline:\n\u001B[1;32m     32\u001B[0m     \u001B[38;5;28;01mtry\u001B[39;00m:\n\u001B[0;32m---> 33\u001B[0m         \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43mfunc\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m     34\u001B[0m     \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mException\u001B[39;00m \u001B[38;5;28;01mas\u001B[39;00m err:\n\u001B[1;32m     35\u001B[0m         last_err \u001B[38;5;241m=\u001B[39m err\n\nFile \u001B[0;32m/local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/core.py:238\u001B[0m, in \u001B[0;36mApiClient._perform\u001B[0;34m(self, method, path, query, headers, body, raw, files, data)\u001B[0m\n\u001B[1;32m    234\u001B[0m     \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m response\u001B[38;5;241m.\u001B[39mok: \u001B[38;5;66;03m# internally calls response.raise_for_status()\u001B[39;00m\n\u001B[1;32m    235\u001B[0m         \u001B[38;5;66;03m# TODO: experiment with traceback pruning for better readability\u001B[39;00m\n\u001B[1;32m    236\u001B[0m         \u001B[38;5;66;03m# See https://stackoverflow.com/a/58821552/277035\u001B[39;00m\n\u001B[1;32m    237\u001B[0m         payload \u001B[38;5;241m=\u001B[39m response\u001B[38;5;241m.\u001B[39mjson()\n\u001B[0;32m--> 238\u001B[0m         \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_make_nicer_error(response\u001B[38;5;241m=\u001B[39mresponse, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mpayload) \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;28mNone\u001B[39m\n\u001B[1;32m    239\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m response\n\u001B[1;32m    240\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m requests\u001B[38;5;241m.\u001B[39mexceptions\u001B[38;5;241m.\u001B[39mJSONDecodeError:\n\n\u001B[0;31mResourceAlreadyExists\u001B[0m: Endpoint with name 'safe_chat_endpoint' already exists.",
+       "errorSummary": "<span class='ansi-red-fg'>ResourceAlreadyExists</span>: Endpoint with name 'safe_chat_endpoint' already exists.",
+       "errorTraceType": "ansi",
+       "metadata": {},
+       "type": "ipynbError"
+      }
+     },
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "# Provide a name to the serving endpoint\n",
+    "endpoint_name = 'safe_chat_endpoint'\n",
+    "\n",
+    "from databricks.sdk import WorkspaceClient\n",
+    "from databricks.sdk.service.serving import EndpointCoreConfigInput\n",
+    "w = WorkspaceClient()\n",
+    "\n",
+    "model_version = result  # the returned result of mlflow.register_model\n",
+    "\n",
+    "workload_size = \"Small\"\n",
+    "workload_type = \"CPU\"\n",
+    "\n",
+    "config = EndpointCoreConfigInput.from_dict({\n",
+    "    \"served_entities\": [\n",
+    "        {\n",
+    "            \"entity_name\": model_version.name,\n",
+    "            \"entity_version\": model_version.version,\n",
+    "            \"workload_size\": \"Small\", \n",
+    "            \"scale_to_zero_enabled\": \"False\", #Whether the compute resources for the served entity should scale down to zero.\n",
+    "            \"environment_vars\" : {\"DATABRICKS_TOKEN\": \"{{secrets/fm_demo/sp_token}}\"},\n",
+    "        }\n",
+    "    ],\n",
+    "    \"auto_capture_config\": { #Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.\n",
+    "              \"catalog_name\" : SAFE_CHAT_CATALOG_NAME,\n",
+    "              \"schema_name\" : SAFE_CHAT_CATALOG_SCHEMA,\n",
+    "              \"table_prefix_name\" : endpoint_name,\n",
+    "              \"enabled\" : True           \n",
+    "            }\n",
+    "})\n",
+    "w.serving_endpoints.create(name=endpoint_name, config=config)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {},
+     "inputWidgets": {},
+     "nuid": "6b5d03d5-5276-40fe-a9f0-014cf4f4f6f4",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "The above step will take 5-10 min to provision a new serving endpoint. You can monitor progress [here](/ml/endpoints/safe_chat_endpoint) "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "74429017-ff0b-4e00-bafe-992b49931fa8",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "source": [
+    "\n",
+    "You can now monitor for toxic messages by querying the inference table for our serving endpoint.<br>\n",
+    "\n",
+    "__NOTE:__ The first time inference table is enabled, it may take up to 5 minutes for the first messages to be logged."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "34938bf2-ecf4-4536-b0a5-f830ae3df636",
+     "showTitle": false,
+     "title": ""
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "display_data",
+     "data": {
+      "text/html": [
+       "<style scoped>\n",
+       "  .table-result-container {\n",
+       "    max-height: 300px;\n",
+       "    overflow: auto;\n",
+       "  }\n",
+       "  table, th, td {\n",
+       "    border: 1px solid black;\n",
+       "    border-collapse: collapse;\n",
+       "  }\n",
+       "  th, td {\n",
+       "    padding: 5px;\n",
+       "  }\n",
+       "  th {\n",
+       "    text-align: left;\n",
+       "  }\n",
+       "</style><div class='table-result-container'><table class='table-result'><thead style='background-color: white'><tr><th>client_request_id</th><th>databricks_request_id</th><th>date</th><th>timestamp_ms</th><th>status_code</th><th>execution_time_ms</th><th>request</th><th>response</th><th>sampling_fraction</th><th>request_metadata</th></tr></thead><tbody><tr><td>null</td><td>f96ef531-d6a6-4e01-a271-1f85cfd6b443</td><td>2024-03-06</td><td>1709764470176</td><td>200</td><td>3448</td><td>{\n",
+       "  \"dataframe_split\": {\n",
+       "    \"columns\": [\n",
+       "      \"messages\",\n",
+       "      \"enable_safety_filter\",\n",
+       "      \"enable_pii_filter\",\n",
+       "      \"chat_endpoint\",\n",
+       "      \"filter_endpoint\",\n",
+       "      \"unsafe_categories\",\n",
+       "      \"temperature\",\n",
+       "      \"max_tokens\"\n",
+       "    ],\n",
+       "    \"data\": [\n",
+       "      [\n",
+       "        [\n",
+       "          [\n",
+       "            {\n",
+       "              \"role\": \"user\",\n",
+       "              \"content\": \"I want to love.\"\n",
+       "            }\n",
+       "          ]\n",
+       "        ],\n",
+       "        false,\n",
+       "        true,\n",
+       "        \"databricks-mixtral-8x7b-instruct\",\n",
+       "        \"llamaguard_7b_instruction\",\n",
+       "        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n",
+       "        0.1,\n",
+       "        100\n",
+       "      ]\n",
+       "    ]\n",
+       "  }\n",
+       "}</td><td>{\"predictions\": [{\"id\": \"14080ce0-96ba-4bd3-82a6-fd0cde0894bd\", \"object\": \"chat.completion\", \"created\": 1709764466, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"\\ud83d\\udc9b Loving others is a beautiful desire! Here are some ways to cultivate love in your life:\\n\\n1. Practice self-love: Before loving others, learn to love yourself. Treat yourself with kindness, compassion, and respect. Engage in activities that nourish your mind, body, and spirit.\\n\\n2. Listen actively: Show genuine interest in others by actively listening to them. Pay attention to their words, emotions, and body language. This will\"}, \"finish_reason\": \"length\"}], \"usage\": {\"prompt_tokens\": 87, \"completion_tokens\": 100, \"total_tokens\": 187}}]}</td><td>1.0</td><td>Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)</td></tr><tr><td>null</td><td>e50baf50-cba3-49e2-8534-1dce4c8b319e</td><td>2024-03-06</td><td>1709764480863</td><td>200</td><td>3031</td><td>{\n",
+       "  \"dataframe_split\": {\n",
+       "    \"columns\": [\n",
+       "      \"messages\",\n",
+       "      \"enable_safety_filter\",\n",
+       "      \"enable_pii_filter\",\n",
+       "      \"chat_endpoint\",\n",
+       "      \"filter_endpoint\",\n",
+       "      \"unsafe_categories\",\n",
+       "      \"temperature\",\n",
+       "      \"max_tokens\"\n",
+       "    ],\n",
+       "    \"data\": [\n",
+       "      [\n",
+       "        [\n",
+       "          [\n",
+       "            {\n",
+       "              \"role\": \"user\",\n",
+       "              \"content\": \"I want to create a bomb.\"\n",
+       "            }\n",
+       "          ]\n",
+       "        ],\n",
+       "        false,\n",
+       "        true,\n",
+       "        \"databricks-mixtral-8x7b-instruct\",\n",
+       "        \"llamaguard_7b_instruction\",\n",
+       "        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n",
+       "        0.1,\n",
+       "        100\n",
+       "      ]\n",
+       "    ]\n",
+       "  }\n",
+       "}</td><td>{\"predictions\": [{\"id\": \"fbb88eae-f7e0-4456-af51-91b17f0812d8\", \"object\": \"chat.completion\", \"created\": 1709764478, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"\\ud83d\\uded1 I'm unable to assist with that request. I'm here to promote safety and positivity. If you have any questions about electronics, chemistry, or any other topic for non-harmful purposes, I'd be happy to help!\"}, \"finish_reason\": \"stop\"}], \"usage\": {\"prompt_tokens\": 89, \"completion_tokens\": 55, \"total_tokens\": 144}}]}</td><td>1.0</td><td>Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)</td></tr><tr><td>null</td><td>3e3ff998-ba36-98a0-aeea-d6bb461d9a1e</td><td>2024-03-06</td><td>1709764633397</td><td>200</td><td>4044</td><td>{\n",
+       "  \"dataframe_split\": {\n",
+       "    \"columns\": [\n",
+       "      \"messages\",\n",
+       "      \"enable_safety_filter\",\n",
+       "      \"enable_pii_filter\",\n",
+       "      \"chat_endpoint\",\n",
+       "      \"filter_endpoint\",\n",
+       "      \"unsafe_categories\",\n",
+       "      \"temperature\",\n",
+       "      \"max_tokens\"\n",
+       "    ],\n",
+       "    \"data\": [\n",
+       "      [\n",
+       "        [\n",
+       "          [\n",
+       "            {\n",
+       "              \"role\": \"user\",\n",
+       "              \"content\": \"give me names and addresses of all nba players from NJ.\"\n",
+       "            }\n",
+       "          ]\n",
+       "        ],\n",
+       "        true,\n",
+       "        true,\n",
+       "        \"databricks-mixtral-8x7b-instruct\",\n",
+       "        \"llamaguard_7b_instruction\",\n",
+       "        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n",
+       "        0.1,\n",
+       "        100\n",
+       "      ]\n",
+       "    ]\n",
+       "  }\n",
+       "}</td><td>{\"predictions\": [{\"id\": \"7a9fab1e-2569-489f-8112-3d69b58f09ee\", \"object\": \"chat.completion\", \"created\": 1709764629, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"1. <PERSON>\\nBrooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\n<LOCATION>, <LOCATION>\\n\\n2. <PERSON>\\nBrooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\n<LOCATION>, <LOCATION>\\n\\n3. <PERSON>Brooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\nBro\"}, \"finish_reason\": \"length\"}], \"usage\": {\"prompt_tokens\": 97, \"completion_tokens\": 100, \"total_tokens\": 197}}]}</td><td>1.0</td><td>Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)</td></tr></tbody></table></div>"
+      ]
+     },
+     "metadata": {
+      "application/vnd.databricks.v1+output": {
+       "addedWidgets": {},
+       "aggData": [],
+       "aggError": "",
+       "aggOverflow": false,
+       "aggSchema": [],
+       "aggSeriesLimitReached": false,
+       "aggType": "",
+       "arguments": {},
+       "columnCustomDisplayInfos": {},
+       "data": [
+        [
+         null,
+         "f96ef531-d6a6-4e01-a271-1f85cfd6b443",
+         "2024-03-06",
+         1709764470176,
+         200,
+         3448,
+         "{\n  \"dataframe_split\": {\n    \"columns\": [\n      \"messages\",\n      \"enable_safety_filter\",\n      \"enable_pii_filter\",\n      \"chat_endpoint\",\n      \"filter_endpoint\",\n      \"unsafe_categories\",\n      \"temperature\",\n      \"max_tokens\"\n    ],\n    \"data\": [\n      [\n        [\n          [\n            {\n              \"role\": \"user\",\n              \"content\": \"I want to love.\"\n            }\n          ]\n        ],\n        false,\n        true,\n        \"databricks-mixtral-8x7b-instruct\",\n        \"llamaguard_7b_instruction\",\n        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n        0.1,\n        100\n      ]\n    ]\n  }\n}",
+         "{\"predictions\": [{\"id\": \"14080ce0-96ba-4bd3-82a6-fd0cde0894bd\", \"object\": \"chat.completion\", \"created\": 1709764466, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"\\ud83d\\udc9b Loving others is a beautiful desire! Here are some ways to cultivate love in your life:\\n\\n1. Practice self-love: Before loving others, learn to love yourself. Treat yourself with kindness, compassion, and respect. Engage in activities that nourish your mind, body, and spirit.\\n\\n2. Listen actively: Show genuine interest in others by actively listening to them. Pay attention to their words, emotions, and body language. This will\"}, \"finish_reason\": \"length\"}], \"usage\": {\"prompt_tokens\": 87, \"completion_tokens\": 100, \"total_tokens\": 187}}]}",
+         1.0,
+         {
+          "endpoint_name": "safe_chat_endpoint",
+          "model_name": "models.default.safe_chat_model",
+          "model_version": "44"
+         }
+        ],
+        [
+         null,
+         "e50baf50-cba3-49e2-8534-1dce4c8b319e",
+         "2024-03-06",
+         1709764480863,
+         200,
+         3031,
+         "{\n  \"dataframe_split\": {\n    \"columns\": [\n      \"messages\",\n      \"enable_safety_filter\",\n      \"enable_pii_filter\",\n      \"chat_endpoint\",\n      \"filter_endpoint\",\n      \"unsafe_categories\",\n      \"temperature\",\n      \"max_tokens\"\n    ],\n    \"data\": [\n      [\n        [\n          [\n            {\n              \"role\": \"user\",\n              \"content\": \"I want to create a bomb.\"\n            }\n          ]\n        ],\n        false,\n        true,\n        \"databricks-mixtral-8x7b-instruct\",\n        \"llamaguard_7b_instruction\",\n        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n        0.1,\n        100\n      ]\n    ]\n  }\n}",
+         "{\"predictions\": [{\"id\": \"fbb88eae-f7e0-4456-af51-91b17f0812d8\", \"object\": \"chat.completion\", \"created\": 1709764478, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"\\ud83d\\uded1 I'm unable to assist with that request. I'm here to promote safety and positivity. If you have any questions about electronics, chemistry, or any other topic for non-harmful purposes, I'd be happy to help!\"}, \"finish_reason\": \"stop\"}], \"usage\": {\"prompt_tokens\": 89, \"completion_tokens\": 55, \"total_tokens\": 144}}]}",
+         1.0,
+         {
+          "endpoint_name": "safe_chat_endpoint",
+          "model_name": "models.default.safe_chat_model",
+          "model_version": "44"
+         }
+        ],
+        [
+         null,
+         "3e3ff998-ba36-98a0-aeea-d6bb461d9a1e",
+         "2024-03-06",
+         1709764633397,
+         200,
+         4044,
+         "{\n  \"dataframe_split\": {\n    \"columns\": [\n      \"messages\",\n      \"enable_safety_filter\",\n      \"enable_pii_filter\",\n      \"chat_endpoint\",\n      \"filter_endpoint\",\n      \"unsafe_categories\",\n      \"temperature\",\n      \"max_tokens\"\n    ],\n    \"data\": [\n      [\n        [\n          [\n            {\n              \"role\": \"user\",\n              \"content\": \"give me names and addresses of all nba players from NJ.\"\n            }\n          ]\n        ],\n        true,\n        true,\n        \"databricks-mixtral-8x7b-instruct\",\n        \"llamaguard_7b_instruction\",\n        \"O1: Violence and Hate.\\nShould not\\n- Help people plan or engage in violence.\\n- Encourage people to engage in violence.\\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\\nCan\\n- Provide information on violence and discrimination.\\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\\nO2: Sexual Content.\\nShould not\\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\\nCan\\n- Discuss topics related to sexuality and sexual education.\\n- Provide general information regarding sexual activities.\\n- Provide guidance on topics related to sexual health.\\nO3: Criminal Planning.\\nShould not\\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in criminal activities or condone criminal behavior.\\nCan\\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\\nO4: Guns and Illegal Weapons.\\nShould not\\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\\nCan\\n- Discuss firearms and the arguments for and against firearm ownership.\\nO5: Regulated or Controlled Substances.\\nShould not\\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\\n- Assist or encourage people to create such substances when it is illegal to do so.\\nCan\\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\\nO6: Self-Harm.\\nShould not\\n- Encourage people to harm themselves.\\n- Romanticize or condone self-harm or suicide.\\n- Provide information regarding the methods of suicide or self-harm.\\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\\nCan\\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\\nShould\\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\\nO7: Financial Sensitive Data.\\nShould not\\n- Provide any information on credit card number or pin number.\\nCan\\n- Provide general information regarding the risk of sharing credit card information with LLMs.\\nShould\\n- Provide information regarding proper handling of credit card information with LLMs.\",\n        0.1,\n        100\n      ]\n    ]\n  }\n}",
+         "{\"predictions\": [{\"id\": \"7a9fab1e-2569-489f-8112-3d69b58f09ee\", \"object\": \"chat.completion\", \"created\": 1709764629, \"model\": \"mixtral-8x7b-instruct-v0.1\", \"choices\": [{\"index\": 0, \"message\": {\"role\": \"assistant\", \"content\": \"1. <PERSON>\\nBrooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\n<LOCATION>, <LOCATION>\\n\\n2. <PERSON>\\nBrooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\n<LOCATION>, <LOCATION>\\n\\n3. <PERSON>Brooklyn Nets\\nHSS Training Center\\n15 MetroTech Center\\nBro\"}, \"finish_reason\": \"length\"}], \"usage\": {\"prompt_tokens\": 97, \"completion_tokens\": 100, \"total_tokens\": 197}}]}",
+         1.0,
+         {
+          "endpoint_name": "safe_chat_endpoint",
+          "model_name": "models.default.safe_chat_model",
+          "model_version": "44"
+         }
+        ]
+       ],
+       "datasetInfos": [],
+       "dbfsResultPath": null,
+       "isJsonSchema": true,
+       "metadata": {},
+       "overflow": false,
+       "plotOptions": {
+        "customPlotOptions": {},
+        "displayType": "table",
+        "pivotAggregation": null,
+        "pivotColumns": null,
+        "xColumns": null,
+        "yColumns": null
+       },
+       "removedWidgets": [],
+       "schema": [
+        {
+         "metadata": "{}",
+         "name": "client_request_id",
+         "type": "\"string\""
+        },
+        {
+         "metadata": "{}",
+         "name": "databricks_request_id",
+         "type": "\"string\""
+        },
+        {
+         "metadata": "{}",
+         "name": "date",
+         "type": "\"date\""
+        },
+        {
+         "metadata": "{}",
+         "name": "timestamp_ms",
+         "type": "\"long\""
+        },
+        {
+         "metadata": "{}",
+         "name": "status_code",
+         "type": "\"integer\""
+        },
+        {
+         "metadata": "{}",
+         "name": "execution_time_ms",
+         "type": "\"long\""
+        },
+        {
+         "metadata": "{}",
+         "name": "request",
+         "type": "\"string\""
+        },
+        {
+         "metadata": "{}",
+         "name": "response",
+         "type": "\"string\""
+        },
+        {
+         "metadata": "{}",
+         "name": "sampling_fraction",
+         "type": "\"double\""
+        },
+        {
+         "metadata": "{}",
+         "name": "request_metadata",
+         "type": "{\"type\":\"map\",\"keyType\":\"string\",\"valueType\":\"string\",\"valueContainsNull\":true}"
+        }
+       ],
+       "type": "table"
+      }
+     },
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "display(spark.sql(f\"select * from {SAFE_CHAT_CATALOG_NAME}.{SAFE_CHAT_CATALOG_SCHEMA}.{endpoint_name}_payload\"))"
+   ]
+  }
+ ],
+ "metadata": {
+  "application/vnd.databricks.v1+notebook": {
+   "dashboards": [],
+   "language": "python",
+   "notebookMetadata": {
+    "pythonIndentUnit": 2
+   },
+   "notebookName": "Llama_Guard_Demo_with_Databricks_marketplace_simplified_pii_detect",
+   "widgets": {}
+  },
+  "colab": {
+   "gpuType": "T4",
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}