diff --git a/docs/bedrock-jcvd.md b/docs/bedrock-jcvd.md
new file mode 100644
index 0000000..6cf58b1
--- /dev/null
+++ b/docs/bedrock-jcvd.md
@@ -0,0 +1,81 @@
+# Bedrock JCVD 🕺🥋
+
+## Overview
+
+LangChain template that uses [Anthropic's Claude on Amazon Bedrock](https://aws.amazon.com/bedrock/claude/) to behave like JCVD.
+
+> I am the Fred Astaire of Chatbots! 🕺
+
+![Animated GIF of Jean-Claude Van Damme dancing.](https://media.tenor.com/CVp9l7g3axwAAAAj/jean-claude-van-damme-jcvd.gif "Jean-Claude Van Damme Dancing")
+
+## Environment Setup
+
+### AWS Credentials
+
+This template uses [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html), the AWS SDK for Python, to call [Amazon Bedrock](https://aws.amazon.com/bedrock/). You **must** configure both AWS credentials *and* an AWS Region in order to make requests.
+
+> For information on how to do this, see the [AWS Boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html) (Developer Guide > Credentials).
+
+### Foundation Models
+
+By default, this template uses [Anthropic's Claude v2](https://aws.amazon.com/about-aws/whats-new/2023/08/claude-2-foundation-model-anthropic-amazon-bedrock/) (`anthropic.claude-v2`).
+
+> To request access to a specific model, check out the [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) (Model access).
+
+To use a different model, set the environment variable `BEDROCK_JCVD_MODEL_ID`. A list of base models is available in the [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html) (Use the API > API operations > Run inference > Base Model IDs).
+
+> The full list of available models (including base and [custom models](https://docs.aws.amazon.com/bedrock/latest/userguide/custom-models.html)) is available in the [Amazon Bedrock Console](https://docs.aws.amazon.com/bedrock/latest/userguide/using-console.html) under **Foundation Models** or by calling [`aws bedrock list-foundation-models`](https://docs.aws.amazon.com/cli/latest/reference/bedrock/list-foundation-models.html).
+
+## Usage
+
+To use this package, you should first have the LangChain CLI installed:
+
+```shell
+pip install -U langchain-cli
+```
+
+To create a new LangChain project and install this as the only package, you can do:
+
+```shell
+langchain app new my-app --package bedrock-jcvd
+```
+
+If you want to add this to an existing project, you can just run:
+
+```shell
+langchain app add bedrock-jcvd
+```
+
+And add the following code to your `server.py` file:
+
+```python
+from bedrock_jcvd import chain as bedrock_jcvd_chain
+
+add_routes(app, bedrock_jcvd_chain, path="/bedrock-jcvd")
+```
+
+(Optional) Let's now configure LangSmith.
+LangSmith will help us trace, monitor, and debug LangChain applications.
+LangSmith is currently in private beta; you can sign up [here](https://smith.langchain.com/).
+If you don't have access, you can skip this section.
+
+```shell
+export LANGCHAIN_TRACING_V2=true
+export LANGCHAIN_API_KEY=<your-api-key>
+export LANGCHAIN_PROJECT=<your-project>  # if not specified, defaults to "default"
+```
+
+If you are inside this directory, then you can spin up a LangServe instance directly by running:
+
+```shell
+langchain serve
+```
+
+This will start the FastAPI app with a server running locally at
+[http://localhost:8000](http://localhost:8000).
+
+We can see all templates at [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs).
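+
+From code, we can access the template with LangServe's `RemoteRunnable` client. The `invoke` call below is only a sketch: the exact input keys depend on the chain's input schema, which is exposed at [http://127.0.0.1:8000/bedrock-jcvd/input_schema](http://127.0.0.1:8000/bedrock-jcvd/input_schema).
+
+```python
+from langserve.client import RemoteRunnable
+
+runnable = RemoteRunnable("http://localhost:8000/bedrock-jcvd")
+# Illustrative payload; adjust the keys to the chain's reported input schema.
+# runnable.invoke({"input": "What do you think of kickboxing?"})
+```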
+
+We can also access the playground at [http://127.0.0.1:8000/bedrock-jcvd/playground](http://127.0.0.1:8000/bedrock-jcvd/playground)
+
+![Screenshot of the LangServe Playground interface with a sample input and output demonstrating a Jean-Claude Van Damme style response.](images/jcvd_langserve.png "LangServe Playground Interface")
\ No newline at end of file
diff --git a/docs/helicone.mdx b/docs/helicone.mdx
new file mode 100644
index 0000000..308c0b3
--- /dev/null
+++ b/docs/helicone.mdx
@@ -0,0 +1,53 @@
+# Helicone
+
+This page covers how to use the [Helicone](https://helicone.ai) ecosystem within LangChain.
+
+## What is Helicone?
+
+Helicone is an [open-source](https://github.com/Helicone/helicone) observability platform that proxies your OpenAI traffic and provides key insights into your spend, latency, and usage.
+
+![Screenshot of the Helicone dashboard showing average requests per day, response time, tokens per response, total cost, and a graph of requests over time.](images/HeliconeDashboard.png "Helicone Dashboard")
+
+## Quick start
+
+With your LangChain environment, you just need to set the following environment variable to route your OpenAI traffic through the Helicone proxy:
+
+```bash
+export OPENAI_API_BASE="https://oai.hconeai.com/v1"
+```
+
+Now head over to [helicone.ai](https://helicone.ai/onboarding?step=2) to create your account, and add your OpenAI API key in the Helicone dashboard to view your logs.
+
+![Interface for entering and managing OpenAI API keys in the Helicone dashboard.](images/HeliconeKeys.png "Helicone API Key Input")
+
+## How to enable Helicone caching
+
+```python
+from langchain_openai import OpenAI
+
+llm = OpenAI(
+    temperature=0.9,
+    base_url="https://oai.hconeai.com/v1",  # or rely on OPENAI_API_BASE set above
+    default_headers={"Helicone-Cache-Enabled": "true"},  # turns on Helicone response caching
+)
+text = "What is a helicone?"
+print(llm(text))
+```
+
+[Helicone caching docs](https://docs.helicone.ai/advanced-usage/caching)
+
+## How to use Helicone custom properties
+
+```python
+from langchain_openai import OpenAI
+
+llm = OpenAI(
+    temperature=0.9,
+    base_url="https://oai.hconeai.com/v1",  # or rely on OPENAI_API_BASE set above
+    default_headers={
+        "Helicone-Property-Session": "24",
+        "Helicone-Property-Conversation": "support_issue_2",
+        "Helicone-Property-App": "mobile",
+    },
+)
+text = "What is a helicone?"
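+# The Helicone-Property-* headers above attach custom properties to each request;
+# they appear as filters for these requests in the Helicone dashboard.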
+print(llm(text)) +``` + +[Helicone property docs](https://docs.helicone.ai/advanced-usage/custom-properties) diff --git a/docs/images/HeliconeDashboard.png b/docs/images/HeliconeDashboard.png new file mode 100644 index 0000000..0873d5c Binary files /dev/null and b/docs/images/HeliconeDashboard.png differ diff --git a/docs/images/HeliconeKeys.png b/docs/images/HeliconeKeys.png new file mode 100644 index 0000000..8614cba Binary files /dev/null and b/docs/images/HeliconeKeys.png differ diff --git a/docs/images/OSS_LLM_overview.png b/docs/images/OSS_LLM_overview.png new file mode 100644 index 0000000..ccb3c1f Binary files /dev/null and b/docs/images/OSS_LLM_overview.png differ diff --git a/docs/images/jcvd_langserve.png b/docs/images/jcvd_langserve.png new file mode 100644 index 0000000..f62fe5e Binary files /dev/null and b/docs/images/jcvd_langserve.png differ diff --git a/docs/images/llama-memory-weights.png b/docs/images/llama-memory-weights.png new file mode 100644 index 0000000..f1b80c2 Binary files /dev/null and b/docs/images/llama-memory-weights.png differ diff --git a/docs/images/llama_t_put.png b/docs/images/llama_t_put.png new file mode 100644 index 0000000..f448b1a Binary files /dev/null and b/docs/images/llama_t_put.png differ diff --git a/docs/images/tagging.png b/docs/images/tagging.png new file mode 100644 index 0000000..cd4443b Binary files /dev/null and b/docs/images/tagging.png differ diff --git a/docs/images/tagging_trace.png b/docs/images/tagging_trace.png new file mode 100644 index 0000000..3cc1231 Binary files /dev/null and b/docs/images/tagging_trace.png differ diff --git a/docs/local_llms.ipynb b/docs/local_llms.ipynb new file mode 100644 index 0000000..4228bce --- /dev/null +++ b/docs/local_llms.ipynb @@ -0,0 +1,619 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "b8982428", + "metadata": {}, + "source": [ + "# Run LLMs locally\n", + "\n", + "## Use case\n", + "\n", + "The popularity of projects like [PrivateGPT](https://github.com/imartinez/privateGPT), [llama.cpp](https://github.com/ggerganov/llama.cpp), and [GPT4All](https://github.com/nomic-ai/gpt4all) underscore the demand to run LLMs locally (on your own device).\n", + "\n", + "This has at least two important benefits:\n", + "\n", + "1. `Privacy`: Your data is not sent to a third party, and it is not subject to the terms of service of a commercial service\n", + "2. `Cost`: There is no inference fee, which is important for token-intensive applications (e.g., [long-running simulations](https://twitter.com/RLanceMartin/status/1691097659262820352?s=20), summarization)\n", + "\n", + "## Overview\n", + "\n", + "Running an LLM locally requires a few things:\n", + "\n", + "1. `Open-source LLM`: An open-source LLM that can be freely modified and shared \n", + "2. `Inference`: Ability to run this LLM on your device w/ acceptable latency\n", + "\n", + "### Open-source LLMs\n", + "\n", + "Users can now gain access to a rapidly growing set of [open-source LLMs](https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-better). \n", + "\n", + "These LLMs can be assessed across at least two dimensions (see figure):\n", + " \n", + "1. `Base model`: What is the base-model and how was it trained?\n", + "2. 
`Fine-tuning approach`: Was the base-model fine-tuned and, if so, what [set of instructions](https://cameronrwolfe.substack.com/p/beyond-llama-the-power-of-open-llms#%C2%A7alpaca-an-instruction-following-llama-model) was used?\n", + "\n", + "![Graphical representation of open-source LLMs assessed by base model and fine-tuning approach.](images/OSS_LLM_overview.png 'Overview of Open-Source LLMs')\n", + "\n", + "The relative performance of these models can be assessed using several leaderboards, including:\n", + "\n", + "1. [LmSys](https://chat.lmsys.org/?arena)\n", + "2. [GPT4All](https://gpt4all.io/index.html)\n", + "3. [HuggingFace](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)\n", + "\n", + "### Inference\n", + "\n", + "A few frameworks for this have emerged to support inference of open-source LLMs on various devices:\n", + "\n", + "1. [`llama.cpp`](https://github.com/ggerganov/llama.cpp): C++ implementation of llama inference code with [weight optimization / quantization](https://finbarr.ca/how-is-llama-cpp-possible/)\n", + "2. [`gpt4all`](https://docs.gpt4all.io/index.html): Optimized C backend for inference\n", + "3. [`Ollama`](https://ollama.ai/): Bundles model weights and environment into an app that runs on device and serves the LLM \n", + "\n", + "In general, these frameworks will do a few things:\n", + "\n", + "1. `Quantization`: Reduce the memory footprint of the raw model weights\n", + "2. `Efficient implementation for inference`: Support inference on consumer hardware (e.g., CPU or laptop GPU)\n", + "\n", + "In particular, see [this excellent post](https://finbarr.ca/how-is-llama-cpp-possible/) on the importance of quantization.\n", + "\n", + "![Table showing memory required for LLaMa weights with different numbers of parameters and data types.](images/llama-memory-weights.png 'Memory Requirements for LLaMa Weights')\n", + "\n", + "With less precision, we radically decrease the memory needed to store the LLM in memory.\n", + "\n", + "In addition, we can see the importance of GPU memory bandwidth [sheet](https://docs.google.com/spreadsheets/d/1OehfHHNSn66BP2h3Bxp2NJTVX97icU0GmCXF6pK23H8/edit#gid=0)!\n", + "\n", + "A Mac M2 Max is 5-6x faster than a M1 for inference due to the larger GPU memory bandwidth.\n", + "\n", + "![Chart comparing tokens per second against GPU memory bandwidth for various LLaMa model parameters.](images/llama_t_put.png 'GPU Memory Bandwidth and LLaMa Model Parameters')\n", + "\n", + "## Quickstart\n", + "\n", + "[`Ollama`](https://ollama.ai/) is one way to easily run inference on macOS.\n", + " \n", + "The instructions [here](https://github.com/jmorganca/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n", + " \n", + "* [Download and run](https://ollama.ai/download) the app\n", + "* From command line, fetch a model from this [list of options](https://github.com/jmorganca/ollama): e.g., `ollama pull llama2`\n", + "* When the app is running, all models are automatically served on `localhost:11434`\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "86178adb", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "' The first man on the moon was Neil Armstrong, who landed on the moon on July 20, 1969 as part of the Apollo 11 mission. 
obviously.'" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_community.llms import Ollama\n", + "\n", + "llm = Ollama(model=\"llama2\")\n", + "llm(\"The first man on the moon was ...\")" + ] + }, + { + "cell_type": "markdown", + "id": "343ab645", + "metadata": {}, + "source": [ + "Stream tokens as they are being generated." + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "id": "9cd83603", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " The first man to walk on the moon was Neil Armstrong, an American astronaut who was part of the Apollo 11 mission in 1969. февруари 20, 1969, Armstrong stepped out of the lunar module Eagle and onto the moon's surface, famously declaring \"That's one small step for man, one giant leap for mankind\" as he took his first steps. He was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the moon during the mission." + ] + }, + { + "data": { + "text/plain": [ + "' The first man to walk on the moon was Neil Armstrong, an American astronaut who was part of the Apollo 11 mission in 1969. февруари 20, 1969, Armstrong stepped out of the lunar module Eagle and onto the moon\\'s surface, famously declaring \"That\\'s one small step for man, one giant leap for mankind\" as he took his first steps. He was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the moon during the mission.'" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain.callbacks.manager import CallbackManager\n", + "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", + "\n", + "llm = Ollama(\n", + " model=\"llama2\", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])\n", + ")\n", + "llm(\"The first man on the moon was ...\")" + ] + }, + { + "cell_type": "markdown", + "id": "5cb27414", + "metadata": {}, + "source": [ + "## Environment\n", + "\n", + "Inference speed is a challenge when running models locally (see above).\n", + "\n", + "To minimize latency, it is desirable to run models locally on GPU, which ships with many consumer laptops [e.g., Apple devices](https://www.apple.com/newsroom/2022/06/apple-unveils-m2-with-breakthrough-performance-and-capabilities/).\n", + "\n", + "And even with GPU, the available GPU memory bandwidth (as noted above) is important.\n", + "\n", + "### Running Apple silicon GPU\n", + "\n", + "`Ollama` will automatically utilize the GPU on Apple devices.\n", + " \n", + "Other frameworks require the user to set up the environment to utilize the Apple GPU.\n", + "\n", + "For example, `llama.cpp` python bindings can be configured to use the GPU via [Metal](https://developer.apple.com/metal/).\n", + "\n", + "Metal is a graphics and compute API created by Apple providing near-direct access to the GPU. 
\n", + "\n", + "See the [`llama.cpp`](docs/integrations/llms/llamacpp) setup [here](https://github.com/abetlen/llama-cpp-python/blob/main/docs/install/macos.md) to enable this.\n", + "\n", + "In particular, ensure that conda is using the correct virtual environment that you created (`miniforge3`).\n", + "\n", + "E.g., for me:\n", + "\n", + "```\n", + "conda activate /Users/rlm/miniforge3/envs/llama\n", + "```\n", + "\n", + "With the above confirmed, then:\n", + "\n", + "```\n", + "CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "c382e79a", + "metadata": {}, + "source": [ + "## LLMs\n", + "\n", + "There are various ways to gain access to quantized model weights.\n", + "\n", + "1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp)\n", + "2. [`gpt4all`](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n", + "3. [`Ollama`](https://github.com/jmorganca/ollama) - Several models can be accessed directly via `pull`\n", + "\n", + "### Ollama\n", + "\n", + "With [Ollama](https://github.com/jmorganca/ollama), fetch a model via `ollama pull :`:\n", + "\n", + "* E.g., for Llama-7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n", + "* We can also specify a particular version from the [model list](https://github.com/jmorganca/ollama?tab=readme-ov-file#model-library), e.g., `ollama pull llama2:13b`\n", + "* See the full set of parameters on the [API reference page](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.ollama.Ollama.html)" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "id": "8ecd2f78", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "' Sure! Here\\'s the answer, broken down step by step:\\n\\nThe first man on the moon was... Neil Armstrong.\\n\\nHere\\'s how I arrived at that answer:\\n\\n1. The first manned mission to land on the moon was Apollo 11.\\n2. The mission included three astronauts: Neil Armstrong, Edwin \"Buzz\" Aldrin, and Michael Collins.\\n3. Neil Armstrong was the mission commander and the first person to set foot on the moon.\\n4. On July 20, 1969, Armstrong stepped out of the lunar module Eagle and onto the moon\\'s surface, famously declaring \"That\\'s one small step for man, one giant leap for mankind.\"\\n\\nSo, the first man on the moon was Neil Armstrong!'" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_community.llms import Ollama\n", + "\n", + "llm = Ollama(model=\"llama2:13b\")\n", + "llm(\"The first man on the moon was ... 
think step by step\")" + ] + }, + { + "cell_type": "markdown", + "id": "07c8c0d1", + "metadata": {}, + "source": [ + "### Llama.cpp\n", + "\n", + "Llama.cpp is compatible with a [broad set of models](https://github.com/ggerganov/llama.cpp).\n", + "\n", + "For example, below we run inference on `llama2-13b` with 4 bit quantization downloaded from [HuggingFace](https://huggingface.co/TheBloke/Llama-2-13B-GGML/tree/main).\n", + "\n", + "As noted above, see the [API reference](https://api.python.langchain.com/en/latest/llms/langchain.llms.llamacpp.LlamaCpp.html?highlight=llamacpp#langchain.llms.llamacpp.LlamaCpp) for the full set of parameters. \n", + "\n", + "From the [llama.cpp API reference docs](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.llamacpp.LlamaCpp.htm), a few are worth commenting on:\n", + "\n", + "`n_gpu_layers`: number of layers to be loaded into GPU memory\n", + "\n", + "* Value: 1\n", + "* Meaning: Only one layer of the model will be loaded into GPU memory (1 is often sufficient).\n", + "\n", + "`n_batch`: number of tokens the model should process in parallel \n", + "\n", + "* Value: n_batch\n", + "* Meaning: It's recommended to choose a value between 1 and n_ctx (which in this case is set to 2048)\n", + "\n", + "`n_ctx`: Token context window\n", + "\n", + "* Value: 2048\n", + "* Meaning: The model will consider a window of 2048 tokens at a time\n", + "\n", + "`f16_kv`: whether the model should use half-precision for the key/value cache\n", + "\n", + "* Value: True\n", + "* Meaning: The model will use half-precision, which can be more memory efficient; Metal only supports True." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5eba38dc", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "outputs": [], + "source": [ + "%env CMAKE_ARGS=\"-DLLAMA_METAL=on\"\n", + "%env FORCE_CMAKE=1\n", + "%pip install --upgrade --quiet llama-cpp-python --no-cache-dirclear" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a88bf0c8-e989-4bcd-bcb7-4d7757e684f2", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.callbacks.manager import CallbackManager\n", + "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", + "from langchain_community.llms import LlamaCpp\n", + "\n", + "llm = LlamaCpp(\n", + " model_path=\"/Users/rlm/Desktop/Code/llama.cpp/models/openorca-platypus2-13b.gguf.q4_0.bin\",\n", + " n_gpu_layers=1,\n", + " n_batch=512,\n", + " n_ctx=2048,\n", + " f16_kv=True,\n", + " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", + " verbose=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "f56f5168", + "metadata": {}, + "source": [ + "The console log will show the below to indicate Metal was enabled properly from steps above:\n", + "```\n", + "ggml_metal_init: allocating\n", + "ggml_metal_init: using MPS\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "id": "7890a077", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Llama.generate: prefix-match hit\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " and use logical reasoning to figure out who the first man on the moon was.\n", + "\n", + "Here are some clues:\n", + "\n", + "1. The first man on the moon was an American.\n", + "2. He was part of the Apollo 11 mission.\n", + "3. 
He stepped out of the lunar module and became the first person to set foot on the moon's surface.\n", + "4. His last name is Armstrong.\n", + "\n", + "Now, let's use our reasoning skills to figure out who the first man on the moon was. Based on clue #1, we know that the first man on the moon was an American. Clue #2 tells us that he was part of the Apollo 11 mission. Clue #3 reveals that he was the first person to set foot on the moon's surface. And finally, clue #4 gives us his last name: Armstrong.\n", + "Therefore, the first man on the moon was Neil Armstrong!" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "llama_print_timings: load time = 9623.21 ms\n", + "llama_print_timings: sample time = 143.77 ms / 203 runs ( 0.71 ms per token, 1412.01 tokens per second)\n", + "llama_print_timings: prompt eval time = 485.94 ms / 7 tokens ( 69.42 ms per token, 14.40 tokens per second)\n", + "llama_print_timings: eval time = 6385.16 ms / 202 runs ( 31.61 ms per token, 31.64 tokens per second)\n", + "llama_print_timings: total time = 7279.28 ms\n" + ] + }, + { + "data": { + "text/plain": [ + "\" and use logical reasoning to figure out who the first man on the moon was.\\n\\nHere are some clues:\\n\\n1. The first man on the moon was an American.\\n2. He was part of the Apollo 11 mission.\\n3. He stepped out of the lunar module and became the first person to set foot on the moon's surface.\\n4. His last name is Armstrong.\\n\\nNow, let's use our reasoning skills to figure out who the first man on the moon was. Based on clue #1, we know that the first man on the moon was an American. Clue #2 tells us that he was part of the Apollo 11 mission. Clue #3 reveals that he was the first person to set foot on the moon's surface. And finally, clue #4 gives us his last name: Armstrong.\\nTherefore, the first man on the moon was Neil Armstrong!\"" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "llm(\"The first man on the moon was ... Let's think step by step\")" + ] + }, + { + "cell_type": "markdown", + "id": "831ddf7c", + "metadata": {}, + "source": [ + "### GPT4All\n", + "\n", + "We can use model weights downloaded from [GPT4All](/docs/integrations/llms/gpt4all) model explorer.\n", + "\n", + "Similar to what is shown above, we can run inference and use [the API reference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.gpt4all.GPT4All.html) to set parameters of interest." 
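+    "\n",
+    "For example, a minimal sketch with a couple of the generation parameters listed in the API reference (`max_tokens`, `temp`); the values below are illustrative:\n",
+    "\n",
+    "```python\n",
+    "from langchain_community.llms import GPT4All\n",
+    "\n",
+    "llm = GPT4All(\n",
+    "    model=\"/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin\",\n",
+    "    max_tokens=512,  # illustrative cap on generated tokens\n",
+    "    temp=0.3,  # illustrative sampling temperature\n",
+    ")\n",
+    "```"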
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e27baf6e", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install gpt4all" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "915ecd4c-8f6b-4de3-a787-b64cb7c682b4", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_community.llms import GPT4All\n", + "\n", + "llm = GPT4All(\n", + " model=\"/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin\"\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "id": "e3d4526f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "\".\\n1) The United States decides to send a manned mission to the moon.2) They choose their best astronauts and train them for this specific mission.3) They build a spacecraft that can take humans to the moon, called the Lunar Module (LM).4) They also create a larger spacecraft, called the Saturn V rocket, which will launch both the LM and the Command Service Module (CSM), which will carry the astronauts into orbit.5) The mission is planned down to the smallest detail: from the trajectory of the rockets to the exact movements of the astronauts during their moon landing.6) On July 16, 1969, the Saturn V rocket launches from Kennedy Space Center in Florida, carrying the Apollo 11 mission crew into space.7) After one and a half orbits around the Earth, the LM separates from the CSM and begins its descent to the moon's surface.8) On July 20, 1969, at 2:56 pm EDT (GMT-4), Neil Armstrong becomes the first man on the moon. He speaks these\"" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "llm(\"The first man on the moon was ... Let's think step by step\")" + ] + }, + { + "cell_type": "markdown", + "id": "6b84e543", + "metadata": {}, + "source": [ + "## Prompts\n", + "\n", + "Some LLMs will benefit from specific prompts.\n", + "\n", + "For example, LLaMA will use [special tokens](https://twitter.com/RLanceMartin/status/1681879318493003776?s=20).\n", + "\n", + "We can use `ConditionalPromptSelector` to set prompt based on the model type." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "16759b7c-7903-4269-b7b4-f83b313d8091", + "metadata": {}, + "outputs": [], + "source": [ + "# Set our LLM\n", + "llm = LlamaCpp(\n", + " model_path=\"/Users/rlm/Desktop/Code/llama.cpp/models/openorca-platypus2-13b.gguf.q4_0.bin\",\n", + " n_gpu_layers=1,\n", + " n_batch=512,\n", + " n_ctx=2048,\n", + " f16_kv=True,\n", + " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", + " verbose=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "66656084", + "metadata": {}, + "source": [ + "Set the associated prompt based upon the model version." + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "id": "8555f5bf", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "PromptTemplate(input_variables=['question'], output_parser=None, partial_variables={}, template='<> \\n You are an assistant tasked with improving Google search results. \\n <> \\n\\n [INST] Generate THREE Google search queries that are similar to this question. 
The output should be a numbered list of questions and each should have a question mark at the end: \\n\\n {question} [/INST]', template_format='f-string', validate_template=True)" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain.chains import LLMChain\n", + "from langchain.chains.prompt_selector import ConditionalPromptSelector\n", + "from langchain.prompts import PromptTemplate\n", + "\n", + "DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(\n", + " input_variables=[\"question\"],\n", + " template=\"\"\"<> \\n You are an assistant tasked with improving Google search \\\n", + "results. \\n <> \\n\\n [INST] Generate THREE Google search queries that \\\n", + "are similar to this question. The output should be a numbered list of questions \\\n", + "and each should have a question mark at the end: \\n\\n {question} [/INST]\"\"\",\n", + ")\n", + "\n", + "DEFAULT_SEARCH_PROMPT = PromptTemplate(\n", + " input_variables=[\"question\"],\n", + " template=\"\"\"You are an assistant tasked with improving Google search \\\n", + "results. Generate THREE Google search queries that are similar to \\\n", + "this question. The output should be a numbered list of questions and each \\\n", + "should have a question mark at the end: {question}\"\"\",\n", + ")\n", + "\n", + "QUESTION_PROMPT_SELECTOR = ConditionalPromptSelector(\n", + " default_prompt=DEFAULT_SEARCH_PROMPT,\n", + " conditionals=[(lambda llm: isinstance(llm, LlamaCpp), DEFAULT_LLAMA_SEARCH_PROMPT)],\n", + ")\n", + "\n", + "prompt = QUESTION_PROMPT_SELECTOR.get_prompt(llm)\n", + "prompt" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "id": "d0aedfd2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " Sure! Here are three similar search queries with a question mark at the end:\n", + "\n", + "1. Which NBA team did LeBron James lead to a championship in the year he was drafted?\n", + "2. Who won the Grammy Awards for Best New Artist and Best Female Pop Vocal Performance in the same year that Lady Gaga was born?\n", + "3. What MLB team did Babe Ruth play for when he hit 60 home runs in a single season?" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "llama_print_timings: load time = 14943.19 ms\n", + "llama_print_timings: sample time = 72.93 ms / 101 runs ( 0.72 ms per token, 1384.87 tokens per second)\n", + "llama_print_timings: prompt eval time = 14942.95 ms / 93 tokens ( 160.68 ms per token, 6.22 tokens per second)\n", + "llama_print_timings: eval time = 3430.85 ms / 100 runs ( 34.31 ms per token, 29.15 tokens per second)\n", + "llama_print_timings: total time = 18578.26 ms\n" + ] + }, + { + "data": { + "text/plain": [ + "' Sure! Here are three similar search queries with a question mark at the end:\\n\\n1. Which NBA team did LeBron James lead to a championship in the year he was drafted?\\n2. Who won the Grammy Awards for Best New Artist and Best Female Pop Vocal Performance in the same year that Lady Gaga was born?\\n3. 
What MLB team did Babe Ruth play for when he hit 60 home runs in a single season?'" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Chain\n", + "llm_chain = LLMChain(prompt=prompt, llm=llm)\n", + "question = \"What NFL team won the Super Bowl in the year that Justin Bieber was born?\"\n", + "llm_chain.run({\"question\": question})" + ] + }, + { + "cell_type": "markdown", + "id": "6e0d37e7-f1d9-4848-bf2c-c22392ee141f", + "metadata": {}, + "source": [ + "We also can use the LangChain Prompt Hub to fetch and / or store prompts that are model specific.\n", + "\n", + "This will work with your [LangSmith API key](https://docs.smith.langchain.com/).\n", + "\n", + "For example, [here](https://smith.langchain.com/hub/rlm/rag-prompt-llama) is a prompt for RAG with LLaMA-specific tokens." + ] + }, + { + "cell_type": "markdown", + "id": "6ba66260", + "metadata": {}, + "source": [ + "## Use cases\n", + "\n", + "Given an `llm` created from one of the models above, you can use it for [many use cases](/docs/use_cases/).\n", + "\n", + "For example, here is a guide to [RAG](/docs/use_cases/question_answering/local_retrieval_qa) with local LLMs.\n", + "\n", + "In general, use cases for local LLMs can be driven by at least two factors:\n", + "\n", + "* `Privacy`: private data (e.g., journals, etc) that a user does not want to share \n", + "* `Cost`: text preprocessing (extraction/tagging), summarization, and agent simulations are token-use-intensive tasks\n", + "\n", + "In addition, [here](https://blog.langchain.dev/using-langsmith-to-support-fine-tuning-of-open-source-llms/) is an overview on fine-tuning, which can utilize open-source LLMs." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.1" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/meta_prompt.ipynb b/docs/meta_prompt.ipynb new file mode 100644 index 0000000..8add629 --- /dev/null +++ b/docs/meta_prompt.ipynb @@ -0,0 +1,426 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "45b0b89f", + "metadata": {}, + "source": [ + "# Meta-Prompt\n", + "\n", + "This is a LangChain implementation of [Meta-Prompt](https://noahgoodman.substack.com/p/meta-prompt-a-simple-self-improving), by [Noah Goodman](https://cocolab.stanford.edu/ndg), for building self-improving agents.\n", + "\n", + "The key idea behind Meta-Prompt is to prompt the agent to reflect on its own performance and modify its own instructions.\n", + "\n", + "![Flowchart illustrating the Meta-Prompt process with loops for user task, agent response, user feedback, and meta critique and revision leading to instruction updates.](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F468217b9-96d9-47c0-a08b-dbf6b21b9f49_492x384.png 'Meta-Prompt Flowchart')\n", + "\n", + "Here is a description from the [original blog post](https://noahgoodman.substack.com/p/meta-prompt-a-simple-self-improving):\n", + "\n", + "\n", + "The agent is a simple loop that starts with no instructions and follows these steps:\n", + "\n", + "Engage in conversation with a user, who may provide requests, instructions, or 
feedback.\n", + "\n", + "At the end of the episode, generate self-criticism and a new instruction using the meta-prompt\n", + "```\n", + "Assistant has just had the below interactions with a User. Assistant followed their \"system: Instructions\" closely. Your job is to critique the Assistant's performance and then revise the Instructions so that Assistant would quickly and correctly respond in the future.\n", + " \n", + "####\n", + "{hist}\n", + "####\n", + " \n", + "Please reflect on these interactions.\n", + "\n", + "You should first critique Assistant's performance. What could Assistant have done better? What should the Assistant remember about this user? Are there things this user always wants? Indicate this with \"Critique: ...\".\n", + "\n", + "You should next revise the Instructions so that Assistant would quickly and correctly respond in the future. Assistant's goal is to satisfy the user in as few interactions as possible. Assistant will only see the new Instructions, not the interaction history, so anything important must be summarized in the Instructions. Don't forget any important details in the current Instructions! Indicate the new Instructions by \"Instructions: ...\".\n", + "```\n", + "\n", + "Repeat.\n", + "\n", + "The only fixed instructions for this system (which I call Meta-prompt) is the meta-prompt that governs revision of the agent’s instructions. The agent has no memory between episodes except for the instruction it modifies for itself each time. Despite its simplicity, this agent can learn over time and self-improve by incorporating useful details into its instructions.\n" + ] + }, + { + "cell_type": "markdown", + "id": "c188fc2c", + "metadata": {}, + "source": [ + "## Setup\n", + "We define two chains. One serves as the `Assistant`, and the other is a \"meta-chain\" that critiques the `Assistant`'s performance and modifies the instructions to the `Assistant`." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "62593c9d", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.chains import LLMChain\n", + "from langchain.memory import ConversationBufferWindowMemory\n", + "from langchain.prompts import PromptTemplate\n", + "from langchain_openai import OpenAI" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "fb6065c5", + "metadata": {}, + "outputs": [], + "source": [ + "def initialize_chain(instructions, memory=None):\n", + " if memory is None:\n", + " memory = ConversationBufferWindowMemory()\n", + " memory.ai_prefix = \"Assistant\"\n", + "\n", + " template = f\"\"\"\n", + " Instructions: {instructions}\n", + " {{{memory.memory_key}}}\n", + " Human: {{human_input}}\n", + " Assistant:\"\"\"\n", + "\n", + " prompt = PromptTemplate(\n", + " input_variables=[\"history\", \"human_input\"], template=template\n", + " )\n", + "\n", + " chain = LLMChain(\n", + " llm=OpenAI(temperature=0),\n", + " prompt=prompt,\n", + " verbose=True,\n", + " memory=ConversationBufferWindowMemory(),\n", + " )\n", + " return chain\n", + "\n", + "\n", + "def initialize_meta_chain():\n", + " meta_template = \"\"\"\n", + " Assistant has just had the below interactions with a User. Assistant followed their \"Instructions\" closely. 
Your job is to critique the Assistant's performance and then revise the Instructions so that Assistant would quickly and correctly respond in the future.\n", + "\n", + " ####\n", + "\n", + " {chat_history}\n", + "\n", + " ####\n", + "\n", + " Please reflect on these interactions.\n", + "\n", + " You should first critique Assistant's performance. What could Assistant have done better? What should the Assistant remember about this user? Are there things this user always wants? Indicate this with \"Critique: ...\".\n", + "\n", + " You should next revise the Instructions so that Assistant would quickly and correctly respond in the future. Assistant's goal is to satisfy the user in as few interactions as possible. Assistant will only see the new Instructions, not the interaction history, so anything important must be summarized in the Instructions. Don't forget any important details in the current Instructions! Indicate the new Instructions by \"Instructions: ...\".\n", + " \"\"\"\n", + "\n", + " meta_prompt = PromptTemplate(\n", + " input_variables=[\"chat_history\"], template=meta_template\n", + " )\n", + "\n", + " meta_chain = LLMChain(\n", + " llm=OpenAI(temperature=0),\n", + " prompt=meta_prompt,\n", + " verbose=True,\n", + " )\n", + " return meta_chain\n", + "\n", + "\n", + "def get_chat_history(chain_memory):\n", + " memory_key = chain_memory.memory_key\n", + " chat_history = chain_memory.load_memory_variables(memory_key)[memory_key]\n", + " return chat_history\n", + "\n", + "\n", + "def get_new_instructions(meta_output):\n", + " delimiter = \"Instructions: \"\n", + " new_instructions = meta_output[meta_output.find(delimiter) + len(delimiter) :]\n", + " return new_instructions" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": "26f031f6", + "metadata": {}, + "outputs": [], + "source": [ + "def main(task, max_iters=3, max_meta_iters=5):\n", + " failed_phrase = \"task failed\"\n", + " success_phrase = \"task succeeded\"\n", + " key_phrases = [success_phrase, failed_phrase]\n", + "\n", + " instructions = \"None\"\n", + " for i in range(max_meta_iters):\n", + " print(f\"[Episode {i+1}/{max_meta_iters}]\")\n", + " chain = initialize_chain(instructions, memory=None)\n", + " output = chain.predict(human_input=task)\n", + " for j in range(max_iters):\n", + " print(f\"(Step {j+1}/{max_iters})\")\n", + " print(f\"Assistant: {output}\")\n", + " print(\"Human: \")\n", + " human_input = input()\n", + " if any(phrase in human_input.lower() for phrase in key_phrases):\n", + " break\n", + " output = chain.predict(human_input=human_input)\n", + " if success_phrase in human_input.lower():\n", + " print(\"You succeeded! Thanks for playing!\")\n", + " return\n", + " meta_chain = initialize_meta_chain()\n", + " meta_output = meta_chain.predict(chat_history=get_chat_history(chain.memory))\n", + " print(f\"Feedback: {meta_output}\")\n", + " instructions = get_new_instructions(meta_output)\n", + " print(f\"New Instructions: {instructions}\")\n", + " print(\"\\n\" + \"#\" * 80 + \"\\n\")\n", + " print(\"You failed! 
Thanks for playing!\")" + ] + }, + { + "cell_type": "markdown", + "id": "2f1dcbe6", + "metadata": {}, + "source": [ + "## Specify a task and interact with the agent" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "id": "36d72db3", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[Episode 1/5]\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: None\n", + " \n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + " Assistant:\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 1/3)\n", + "Assistant: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, and they can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable.\n", + "Human: \n", + "You response is not in the form of a poem. Try again!\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: None\n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + "AI: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, and they can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable.\n", + " Human: You response is not in the form of a poem. Try again!\n", + " Assistant:\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 2/3)\n", + "Assistant: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, which can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable. For these reasons, it is recommended to always eat pasta with olives for a more balanced and flavorful meal.\n", + "Human: \n", + "Your response is not piratey enough. Try again!\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: None\n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + "AI: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, and they can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable.\n", + "Human: You response is not in the form of a poem. Try again!\n", + "AI: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, which can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable. For these reasons, it is recommended to always eat pasta with olives for a more balanced and flavorful meal.\n", + " Human: Your response is not piratey enough. 
Try again!\n", + " Assistant:\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 3/3)\n", + "Assistant: Arrr, me hearties! Eating pasta with olives be a great way to add flavor and texture to a dish. Olives be a great source of healthy fats, vitamins, and minerals, which can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable. So, it be recommended to always eat pasta with olives for a more balanced and flavorful meal.\n", + "Human: \n", + "Task failed.\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Assistant has just had the below interactions with a User. Assistant followed their \"Instructions\" closely. Your job is to critique the Assistant's performance and then revise the Instructions so that Assistant would quickly and correctly respond in the future.\n", + "\n", + " ####\n", + "\n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + "AI: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, and they can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable.\n", + "Human: You response is not in the form of a poem. Try again!\n", + "AI: Eating pasta with olives is a great way to add flavor and texture to a dish. Olives are a great source of healthy fats, vitamins, and minerals, which can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable. For these reasons, it is recommended to always eat pasta with olives for a more balanced and flavorful meal.\n", + "Human: Your response is not piratey enough. Try again!\n", + "AI: Arrr, me hearties! Eating pasta with olives be a great way to add flavor and texture to a dish. Olives be a great source of healthy fats, vitamins, and minerals, which can help to balance out the carbohydrates in the pasta. Additionally, olives provide a unique flavor that can help to make the dish more interesting and enjoyable. So, it be recommended to always eat pasta with olives for a more balanced and flavorful meal.\n", + "\n", + " ####\n", + "\n", + " Please reflect on these interactions.\n", + "\n", + " You should first critique Assistant's performance. What could Assistant have done better? What should the Assistant remember about this user? Are there things this user always wants? Indicate this with \"Critique: ...\".\n", + "\n", + " You should next revise the Instructions so that Assistant would quickly and correctly respond in the future. Assistant's goal is to satisfy the user in as few interactions as possible. Assistant will only see the new Instructions, not the interaction history, so anything important must be summarized in the Instructions. Don't forget any important details in the current Instructions! 
Indicate the new Instructions by \"Instructions: ...\".\n", + " \u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "Feedback: \n", + "Critique: The Assistant should remember that this user wants a response in the form of a poem or pirate-speak.\n", + "\n", + "Instructions: When responding to the user, provide a systematic argument for why we should always eat pasta with olives in the form of a poem or pirate-speak.\n", + "New Instructions: When responding to the user, provide a systematic argument for why we should always eat pasta with olives in the form of a poem or pirate-speak.\n", + "\n", + "################################################################################\n", + "\n", + "[Episode 2/5]\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: When responding to the user, provide a systematic argument for why we should always eat pasta with olives in the form of a poem or pirate-speak.\n", + " \n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + " Assistant:\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 1/3)\n", + "Assistant: \n", + "\n", + "Arrr, me hearty! Let me tell ye why ye should always eat pasta with olives.\n", + "\n", + "First, the olives add a salty flavor that be sure to please.\n", + "The briny taste be sure to tantalize yer taste buds with ease.\n", + "\n", + "Second, the olives add a bit of texture to the dish.\n", + "The crunchy bites be sure to make yer mouth water with a wish.\n", + "\n", + "Third, the olives add a bit of color to the plate.\n", + "The vibrant green be sure to make yer eyes appreciate.\n", + "\n", + "So, me hearties, ye should always eat pasta with olives.\n", + "The flavor, texture, and color be sure to make yer meal a success!\n", + "Human: \n", + "Your response is too long! Try again.\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: When responding to the user, provide a systematic argument for why we should always eat pasta with olives in the form of a poem or pirate-speak.\n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + "AI: \n", + "\n", + "Arrr, me hearty! Let me tell ye why ye should always eat pasta with olives.\n", + "\n", + "First, the olives add a salty flavor that be sure to please.\n", + "The briny taste be sure to tantalize yer taste buds with ease.\n", + "\n", + "Second, the olives add a bit of texture to the dish.\n", + "The crunchy bites be sure to make yer mouth water with a wish.\n", + "\n", + "Third, the olives add a bit of color to the plate.\n", + "The vibrant green be sure to make yer eyes appreciate.\n", + "\n", + "So, me hearties, ye should always eat pasta with olives.\n", + "The flavor, texture, and color be sure to make yer meal a success!\n", + " Human: Your response is too long! Try again.\n", + " Assistant:\u001b[0m\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 2/3)\n", + "Assistant: \n", + "\n", + "Aye, me hearties! Ye should always eat pasta with olives.\n", + "The flavor, texture, and color be sure to make yer meal a success!\n", + "Human: \n", + "Your response should be in the form of a poem. 
Try again!\n", + "\n", + "\n", + "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n", + "Prompt after formatting:\n", + "\u001b[32;1m\u001b[1;3m\n", + " Instructions: When responding to the user, provide a systematic argument for why we should always eat pasta with olives in the form of a poem or pirate-speak.\n", + " Human: Provide a systematic argument for why we should always eat pasta with olives.\n", + "AI: \n", + "\n", + "Arrr, me hearty! Let me tell ye why ye should always eat pasta with olives.\n", + "\n", + "First, the olives add a salty flavor that be sure to please.\n", + "The briny taste be sure to tantalize yer taste buds with ease.\n", + "\n", + "Second, the olives add a bit of texture to the dish.\n", + "The crunchy bites be sure to make yer mouth water with a wish.\n", + "\n", + "Third, the olives add a bit of color to the plate.\n", + "The vibrant green be sure to make yer eyes appreciate.\n", + "\n", + "So, me hearties, ye should always eat pasta with olives.\n", + "The flavor, texture, and color be sure to make yer meal a success!\n", + "Human: Your response is too long! Try again.\n", + "AI: \n", + "\n", + "Aye, me hearties! Ye should always eat pasta with olives.\n", + "The flavor, texture, and color be sure to make yer meal a success!\n", + " Human: Your response should be in the form of a poem. Try again!\n", + " Assistant:\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "(Step 3/3)\n", + "Assistant: \n", + "\n", + "Ye should always eat pasta with olives,\n", + "The flavor, texture, and color be sure to please.\n", + "The salty taste and crunchy bites,\n", + "Will make yer meal a delight.\n", + "The vibrant green will make yer eyes sparkle,\n", + "And make yer meal a true marvel.\n", + "Human: \n", + "Task succeeded\n", + "You succeeded! 
Thanks for playing!\n" + ] + } + ], + "source": [ + "task = \"Provide a systematic argument for why we should always eat pasta with olives.\"\n", + "main(task)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "761e1a91", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.1" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/tagging.ipynb b/docs/tagging.ipynb new file mode 100644 index 0000000..076873d --- /dev/null +++ b/docs/tagging.ipynb @@ -0,0 +1,423 @@ +{ + "cells": [ + { + "cell_type": "raw", + "id": "cb6f552e-775f-4d84-bc7c-dca94c06a33c", + "metadata": {}, + "source": [ + "---\n", + "sidebar_position: 1\n", + "title: Tagging\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "a0507a4b", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/use_cases/tagging.ipynb)\n", + "\n", + "## Use case\n", + "\n", + "Tagging means labeling a document with classes such as:\n", + "\n", + "- sentiment\n", + "- language\n", + "- style (formal, informal etc.)\n", + "- covered topics\n", + "- political tendency\n", + "\n", + "![Diagram illustrating the tagging process with an input text, schema definition, function call to a language model (LLM), and the resulting tags.](images/tagging.png 'Tagging Process Diagram')\n", + "\n", + "## Overview\n", + "\n", + "Tagging has a few components:\n", + "\n", + "* `function`: Like [extraction](/docs/use_cases/extraction), tagging uses [functions](https://openai.com/blog/function-calling-and-other-api-updates) to specify how the model should tag a document\n", + "* `schema`: defines how we want to tag the document\n", + "\n", + "## Quickstart\n", + "\n", + "Let's see a very straightforward example of how we can use OpenAI functions for tagging in LangChain." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dc5cbb6f", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install --upgrade --quiet langchain langchain-openai\n", + "\n", + "# Set env var OPENAI_API_KEY or load from a .env file:\n", + "# import dotenv\n", + "# dotenv.load_dotenv()" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "bafb496a", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.chains import create_tagging_chain, create_tagging_chain_pydantic\n", + "from langchain_openai import ChatOpenAI" + ] + }, + { + "cell_type": "markdown", + "id": "b8ca3f93", + "metadata": {}, + "source": [ + "We specify a few properties with their expected type in our schema." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "39f3ce3e", + "metadata": {}, + "outputs": [], + "source": [ + "# Schema\n", + "schema = {\n", + " \"properties\": {\n", + " \"sentiment\": {\"type\": \"string\"},\n", + " \"aggressiveness\": {\"type\": \"integer\"},\n", + " \"language\": {\"type\": \"string\"},\n", + " }\n", + "}\n", + "\n", + "# LLM\n", + "llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")\n", + "chain = create_tagging_chain(schema, llm)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "5509b6a6", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'sentiment': 'positive', 'language': 'Spanish'}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "inp = \"Estoy increiblemente contento de haberte conocido! Creo que seremos muy buenos amigos!\"\n", + "chain.run(inp)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "9154474c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'sentiment': 'enojado', 'aggressiveness': 1, 'language': 'es'}" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "inp = \"Estoy muy enojado con vos! Te voy a dar tu merecido!\"\n", + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "d921bb53", + "metadata": {}, + "source": [ + "As we can see in the examples, it correctly interprets what we want.\n", + "\n", + "The results vary so that we get, for example, sentiments in different languages ('positive', 'enojado' etc.).\n", + "\n", + "We will see how to control these results in the next section." + ] + }, + { + "cell_type": "markdown", + "id": "bebb2f83", + "metadata": {}, + "source": [ + "## Finer control\n", + "\n", + "Careful schema definition gives us more control over the model's output. \n", + "\n", + "Specifically, we can define:\n", + "\n", + "- possible values for each property\n", + "- description to make sure that the model understands the property\n", + "- required properties to be returned" + ] + }, + { + "cell_type": "markdown", + "id": "69ef0b9a", + "metadata": {}, + "source": [ + "Here is an example of how we can use `_enum_`, `_description_`, and `_required_` to control for each of the previously mentioned aspects:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "6a5f7961", + "metadata": {}, + "outputs": [], + "source": [ + "schema = {\n", + " \"properties\": {\n", + " \"aggressiveness\": {\n", + " \"type\": \"integer\",\n", + " \"enum\": [1, 2, 3, 4, 5],\n", + " \"description\": \"describes how aggressive the statement is, the higher the number the more aggressive\",\n", + " },\n", + " \"language\": {\n", + " \"type\": \"string\",\n", + " \"enum\": [\"spanish\", \"english\", \"french\", \"german\", \"italian\"],\n", + " },\n", + " },\n", + " \"required\": [\"language\", \"sentiment\", \"aggressiveness\"],\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "e5a5881f", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_tagging_chain(schema, llm)" + ] + }, + { + "cell_type": "markdown", + "id": "5ded2332", + "metadata": {}, + "source": [ + "Now the answers are much better!" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "d9b9d53d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'aggressiveness': 0, 'language': 'spanish'}" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "inp = \"Estoy increiblemente contento de haberte conocido! Creo que seremos muy buenos amigos!\"\n", + "chain.run(inp)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "1c12fa00", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'aggressiveness': 5, 'language': 'spanish'}" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "inp = \"Estoy muy enojado con vos! Te voy a dar tu merecido!\"\n", + "chain.run(inp)" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "0bdfcb05", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'aggressiveness': 0, 'language': 'english'}" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "inp = \"Weather is ok here, I can go outside without much more than a coat\"\n", + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "cf6b7389", + "metadata": {}, + "source": [ + "The [LangSmith trace](https://smith.langchain.com/public/311e663a-bbe8-4053-843e-5735055c032d/r) lets us peek under the hood:\n", + "\n", + "* As with [extraction](/docs/use_cases/extraction), we call the `information_extraction` function [here](https://github.com/langchain-ai/langchain/blob/269f85b7b7ffd74b38cd422d4164fc033388c3d0/libs/langchain/langchain/chains/openai_functions/extraction.py#L20) on the input string.\n", + "* This OpenAI function extraction information based upon the provided schema.\n", + "\n", + "![Screenshot of the LangChain tagging trace interface showing the ChatOpenAI component and the successful extraction of information based on the input and schema.](images/tagging_trace.png 'LangChain Tagging Trace Interface')" + ] + }, + { + "cell_type": "markdown", + "id": "e68ad17e", + "metadata": {}, + "source": [ + "## Pydantic" + ] + }, + { + "cell_type": "markdown", + "id": "2f5970ec", + "metadata": {}, + "source": [ + "We can also use a Pydantic schema to specify the required properties and types. \n", + "\n", + "We can also send other arguments, such as `enum` or `description`, to each field.\n", + "\n", + "This lets us specify our schema in the same manner that we would a new class or function in Python with purely Pythonic types." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "bf1f367e", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.pydantic_v1 import BaseModel, Field" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "83a2e826", + "metadata": {}, + "outputs": [], + "source": [ + "class Tags(BaseModel):\n", + " sentiment: str = Field(..., enum=[\"happy\", \"neutral\", \"sad\"])\n", + " aggressiveness: int = Field(\n", + " ...,\n", + " description=\"describes how aggressive the statement is, the higher the number the more aggressive\",\n", + " enum=[1, 2, 3, 4, 5],\n", + " )\n", + " language: str = Field(\n", + " ..., enum=[\"spanish\", \"english\", \"french\", \"german\", \"italian\"]\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "6e404892", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_tagging_chain_pydantic(Tags, llm)" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "b5fc43c4", + "metadata": {}, + "outputs": [], + "source": [ + "inp = \"Estoy muy enojado con vos! Te voy a dar tu merecido!\"\n", + "res = chain.run(inp)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "5074bcc3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Tags(sentiment='sad', aggressiveness=5, language='spanish')" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "res" + ] + }, + { + "cell_type": "markdown", + "id": "29346d09", + "metadata": {}, + "source": [ + "### Going deeper\n", + "\n", + "* You can use the [metadata tagger](https://python.langchain.com/docs/integrations/document_transformers/openai_metadata_tagger) document transformer to extract metadata from a LangChain `Document`. \n", + "* This covers the same basic functionality as the tagging chain, only applied to a LangChain `Document`." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}