From 45ec0d88f935a090eec03903fae679269af4e687 Mon Sep 17 00:00:00 2001 From: kylebunting Date: Thu, 29 Aug 2024 15:43:30 -0600 Subject: [PATCH] Renaming exercise 2 --- .../02_add_chat_with_data.md | 25 -- .../0201.md | 218 +++++++-------- .../0202.md | 147 ++++++----- .../0203.md | 249 +++++++++--------- ...lement_vector_search_in_cosmos_db_nosql.md | 41 +++ 5 files changed, 354 insertions(+), 326 deletions(-) delete mode 100644 docs/02_add_chat_with_data/02_add_chat_with_data.md rename docs/{02_add_chat_with_data => 02_implement_vector_search_in_cosmos_db_nosql}/0201.md (98%) rename docs/{02_add_chat_with_data => 02_implement_vector_search_in_cosmos_db_nosql}/0202.md (92%) rename docs/{02_add_chat_with_data => 02_implement_vector_search_in_cosmos_db_nosql}/0203.md (96%) create mode 100644 docs/02_implement_vector_search_in_cosmos_db_nosql/02_implement_vector_search_in_cosmos_db_nosql.md diff --git a/docs/02_add_chat_with_data/02_add_chat_with_data.md b/docs/02_add_chat_with_data/02_add_chat_with_data.md deleted file mode 100644 index 3035bdf67..000000000 --- a/docs/02_add_chat_with_data/02_add_chat_with_data.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: 'Exercise 02: Add chat with your data' -layout: default -nav_order: 3 -has_children: true ---- - -# Exercise 02 - Add chat with your data - -## Lab Scenario - -One of the most natural ways to integrate Azure OpenAI in an existing solution is to incorporate chat into an existing system. For this solution to bring the most value to an organization, however, the chat service must have access to information that may be proprietary or otherwise confidential. In this exercise, we will add custom data to augment an existing Azure OpenAI chat deployment, allowing customer service agents to review customer data in a natural language format. 
- -## Objectives - -After you complete this lab, you will be able to: - -* Prepare a dataset in Azure Blob Storage for ingestion into Azure OpenAI -* Ingest data from Azure Blob Storage into Azure OpenAI via Azure AI Search -* Test chat completions using the Chat playground in Azure OpenAI -* Incorporate chat completions in a Streamlit application - -## Lab Duration - -* **Estimated Time:** 60 minutes diff --git a/docs/02_add_chat_with_data/0201.md b/docs/02_implement_vector_search_in_cosmos_db_nosql/0201.md similarity index 98% rename from docs/02_add_chat_with_data/0201.md rename to docs/02_implement_vector_search_in_cosmos_db_nosql/0201.md index 9b8836a8f..b86a1006d 100644 --- a/docs/02_add_chat_with_data/0201.md +++ b/docs/02_implement_vector_search_in_cosmos_db_nosql/0201.md @@ -1,109 +1,109 @@ ---- -title: '1. Configure vector search in Azure Cosmos DB NoSQL' -layout: default -nav_order: 1 -parent: 'Exercise 02: Implement contextual grounding using vector search in Azure Cosmos DB NoSQL' ---- - -# Task 01 - Configure vector search in Azure Cosmos DB NoSQL (15 minutes) - -## Introduction - -Vector search is a technique that allows items to be found based on their data characteristics instead of exact matches on a specific property field. Instead of requiring exact matches, vector search enables you to find items based on their vector representations. This technique is advantageous when performing similarity searches and is particularly valuable in applications that need to search for information within large blocks of text, such as Consoso Suites' applications. - -The Azure Cosmos DB for NoSQL API includes a vector search feature that provides a robust method for managing and querying high-dimensional vectors. This capability is essential for AI-driven applications requiring an integrated vector search capability. 
It allows you to store vectors alongside traditional schema-free data within your documents, streamlining data management and significantly enhancing the efficiency of vector operations. Keeping all relevant data in a single logical unit simplifies your data architecture, making it easy to understand and manage, making it easy to understand and manage. - -## Description - -In this task, you will enable the Vector Search feature in your Azure Cosmos DB for NoSQL database, define container vector policies, and specify indexing policies on Cosmos DB containers you will create for storing user reviews and property maintenance data. - -You will conclude by uploading data supplied by the Contoso Suites staff into the containers you created in Cosmos DB. These JSON data files contain user reviews and maintenance tasks for several hotels on their resorts. They offer an example of the types of data the company believes can benefit from the similarity search capabilities provided by Vector Search in Cosmos DB, so they would like you to incorporate this data into the proof of concept. - -The key tasks are as follows: - -1. Enable the Vector Search feature in Azure Cosmos DB for NoSQL. -2. Create containers named "UserReviews" and "MaintenanceTasks" in the `ContosoSuites` database, using the `hotel_id` field as the partition key. -3. During container creation, define a container vector policy using the `cosine` distance function and assign an appropriate vector index type. Name the vector field `vector_embeddings` for each. -4. Populate the containers with data from files in the [/src/data folder](https://github.com/microsoft/TechExcel-Integrating-Azure-PaaS-and-AI-Services-for-AI-Design-Wins/tree/main/src/data) of the repository. -   1. The `UserReviews` container should be populated from the `UserReviews.json` file. -   2. Populate the `MaintenanceTasks` container with data from the file named `PropertyMaintenance.json`. 
- -## Success Criteria - -- You have enabled the Vector Search for NoSQL API feature in your Azure Cosmos DB instance. -- You have defined a vector policy for the `UserReviews` and `MaintenanceTasks` containers in the `ContosoSuites` database in Azure Cosmos DB. -- You have set appropriate vector indexing policies on both containers to improve similarity search efficiency. -- You have populated both containers with data. - -## Learning Resources - -- [What is a vector database?](https://learn.microsoft.com/azure/cosmos-db/vector-database) -- [What are vector embeddings?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/vector-embeddings) -- [What is vector search?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/vector-search-overview) -- [Vector Search in Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search) -- [What are distance functions?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/distance-functions) - -## Solution - -
-Expand this section to view the solution - -- Enabling the Vector Search for NoSQL API feature in Azure Cosmos DB, can be done via the Azure portal or the Azure CLI. The steps for each technique are listed below. Note that enabling the registration may take several minutes to take effect. - - The steps for enabling the feature in the Azure portal are as follows: - 1. Navigate to your Azure Cosmos DB for NoSQL resource in the [Azure portal](https://portal.azure.com). - 2. Expand the **Settings** item in the left-hand menu, select **Features**, and on the **Features** page, select **Vector Search for NoSQL API**. - - ![The Features page for the Azure Cosmos DB NoSQL database is displayed, with the Vector Search for NoSQL API feature highlighted in the features list.](../../media/Solution/0201-azure-cosmosdb-features-vector-search.png) - - 3. In the **Vector Search for NoSQL API** dialog, review the feature description and select **Enable**. - - ![The Enable button is highlighted on the Vector Search for NoSQL API enrollment dialog.](../../media/Solution/0201-azure-cosmosdb-features-vector-search-enable.png) - - 4. Wait for the notification that the feature was successfully enabled. - - - To enable Vector Search via the Azure CLI: - 1. Execute the following command from the Azure Cloud Shell. Ensure you replace the `` and `` tokens with the values from your deployed resource group. - - ```azurecli - az cosmosdb update \ - --resource-group \ - --name \ - --capabilities EnableNoSQLVectorSearch - ``` - - 2. Wait for the command to run successfully before leaving the Azure Cloud Shell. - -- Container vector policies and vector indexing policies must be defined at the time of container creation. - - In the [Azure portal](https://portal.azure.com), navigate to your Cosmos DB resource. - - Select **Data Explorer** in the left-hand menu. 
- - On the **Data Explorer** page, select **New Container** - - In the **New Container** dialog: - - Select **Use existing** under **Database id** and select the **ContosoSuites** database from the dropdown list. - - Enter "UserReviews" into the **Container id** box. - - Enter "/hotel" into the **Partition key** box. - - Expand the **Container Vectory Policy** section of the dialog and select **Add vector embedding**. - - Path: Enter "/vector_embeddings" - - Data type: Select **float32**. - - Distance function: Select **cosine**. - - Dimensions: Enter **1536**. This is based on the number of dimensions generated by the `ada-text-embedding-002` model in Azure OpenAI. - - Index type: Select **quantizedFlat**. Given the number of dimensions being specified, 1536, the `flat` index type would not be appropriate, as it only supports a maximum of 505 dimensions for vectors. The `diskANN` index could also be used here, but is only available in a limited preview at this time. - - Select **OK** to create the container. - - Repeat the above steps to create a second container named "MaintenanceTasks." - -- The Azure Cosmos DB Data Explorer can be used to upload the data files provided by Contoso Suites. - - In the [Azure portal](https://portal.azure.com), navigate to your Cosmos DB resource and select **Data Explorer** in the left-hand menu. - - In the Data Explorer, expand the **ContosoSuites** database and the **UserReviews** container, then select **Items**. - - ![Data Explorer is highlighted in the left-hand menu. The expand icon is highlighted for the database and UserReviews containers. Items is highlighted.](../../media/Solution/0201-azure-cosmos-db-data-explorer-user-reviews-items.png) - - - Select **Upload Item** on the toolbar. 
- - ![The Upload Item button on the Azure Cosmos DB toolbar is highlighted.](../../media/Solution/0201-azure-cosmos-db-toolbar-upload-item.png) - - - In the **Upload Items** dialog, select the browse button and navigate to the `UserReviews.json` file in the `/src/data` directory in the location where cloned the repository, then select **Upload** to import the data in the file. - - ![The Upload Items dialog is displayed with the browse and Upload buttons highlighted. UserReviews.json appears in the Select JSON files box.](../../media/Solution/0201-upload-items-user-reviews.png) - - - Repeat the above steps, this time uploading data into the `MaintenanceTasks` container from the `PropertyMaintenance.json` file. - -
+--- +title: '1. Configure vector search in Azure Cosmos DB NoSQL' +layout: default +nav_order: 1 +parent: 'Exercise 02: Implement contextual grounding using vector search in Azure Cosmos DB NoSQL' +--- + +# Task 01 - Configure vector search in Azure Cosmos DB NoSQL (15 minutes) + +## Introduction + +Vector search is a technique that allows items to be found based on their data characteristics instead of exact matches on a specific property field. Instead of requiring exact matches, vector search enables you to find items based on their vector representations. This technique is advantageous when performing similarity searches and is particularly valuable in applications that need to search for information within large blocks of text, such as Contoso Suites' applications. + +The Azure Cosmos DB for NoSQL API includes a vector search feature that provides a robust method for managing and querying high-dimensional vectors. This capability is essential for AI-driven applications requiring an integrated vector search capability. It allows you to store vectors alongside traditional schema-free data within your documents, streamlining data management and significantly enhancing the efficiency of vector operations. Keeping all relevant data in a single logical unit simplifies your data architecture, making it easy to understand and manage. + +## Description + +In this task, you will enable the Vector Search feature in your Azure Cosmos DB for NoSQL database, define container vector policies, and specify indexing policies on Cosmos DB containers you will create for storing user reviews and property maintenance data. + +You will conclude by uploading data supplied by the Contoso Suites staff into the containers you created in Cosmos DB. These JSON data files contain user reviews and maintenance tasks for several hotels on their resorts. 
They offer an example of the types of data the company believes can benefit from the similarity search capabilities provided by Vector Search in Cosmos DB, so they would like you to incorporate this data into the proof of concept. + +The key tasks are as follows: + +1. Enable the Vector Search feature in Azure Cosmos DB for NoSQL. +2. Create containers named "UserReviews" and "MaintenanceTasks" in the `ContosoSuites` database, using the `hotel_id` field as the partition key. +3. During container creation, define a container vector policy using the `cosine` distance function and assign an appropriate vector index type. Name the vector field `vector_embeddings` for each. +4. Populate the containers with data from files in the [/src/data folder](https://github.com/microsoft/TechExcel-Integrating-Azure-PaaS-and-AI-Services-for-AI-Design-Wins/tree/main/src/data) of the repository. +   1. The `UserReviews` container should be populated from the `UserReviews.json` file. +   2. Populate the `MaintenanceTasks` container with data from the file named `PropertyMaintenance.json`. + +## Success Criteria + +- You have enabled the Vector Search for NoSQL API feature in your Azure Cosmos DB instance. +- You have defined a vector policy for the `UserReviews` and `MaintenanceTasks` containers in the `ContosoSuites` database in Azure Cosmos DB. +- You have set appropriate vector indexing policies on both containers to improve similarity search efficiency. +- You have populated both containers with data. 
+ +## Learning Resources + +- [What is a vector database?](https://learn.microsoft.com/azure/cosmos-db/vector-database) +- [What are vector embeddings?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/vector-embeddings) +- [What is vector search?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/vector-search-overview) +- [Vector Search in Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search) +- [What are distance functions?](https://learn.microsoft.com/azure/cosmos-db/gen-ai/distance-functions) + +## Solution + +
+Expand this section to view the solution + +- Enabling the Vector Search for NoSQL API feature in Azure Cosmos DB can be done via the Azure portal or the Azure CLI. The steps for each technique are listed below. Note that enabling the registration may take several minutes to take effect. + - The steps for enabling the feature in the Azure portal are as follows: + 1. Navigate to your Azure Cosmos DB for NoSQL resource in the [Azure portal](https://portal.azure.com). + 2. Expand the **Settings** item in the left-hand menu, select **Features**, and on the **Features** page, select **Vector Search for NoSQL API**. + + ![The Features page for the Azure Cosmos DB NoSQL database is displayed, with the Vector Search for NoSQL API feature highlighted in the features list.](../../media/Solution/0201-azure-cosmosdb-features-vector-search.png) + + 3. In the **Vector Search for NoSQL API** dialog, review the feature description and select **Enable**. + + ![The Enable button is highlighted on the Vector Search for NoSQL API enrollment dialog.](../../media/Solution/0201-azure-cosmosdb-features-vector-search-enable.png) + + 4. Wait for the notification that the feature was successfully enabled. + + - To enable Vector Search via the Azure CLI: + 1. Execute the following command from the Azure Cloud Shell. Ensure you replace the `` and `` tokens with the values from your deployed resource group. + + ```azurecli + az cosmosdb update \ + --resource-group \ + --name \ + --capabilities EnableNoSQLVectorSearch + ``` + + 2. Wait for the command to run successfully before leaving the Azure Cloud Shell. + +- Container vector policies and vector indexing policies must be defined at the time of container creation. + - In the [Azure portal](https://portal.azure.com), navigate to your Cosmos DB resource. + - Select **Data Explorer** in the left-hand menu. 
+ - On the **Data Explorer** page, select **New Container**. + - In the **New Container** dialog: + - Select **Use existing** under **Database id** and select the **ContosoSuites** database from the dropdown list. + - Enter "UserReviews" into the **Container id** box. + - Enter "/hotel_id" into the **Partition key** box. + - Expand the **Container Vector Policy** section of the dialog and select **Add vector embedding**. + - Path: Enter "/vector_embeddings" + - Data type: Select **float32**. + - Distance function: Select **cosine**. + - Dimensions: Enter **1536**. This is based on the number of dimensions generated by the `text-embedding-ada-002` model in Azure OpenAI. + - Index type: Select **quantizedFlat**. Given the number of dimensions being specified, 1536, the `flat` index type would not be appropriate, as it only supports a maximum of 505 dimensions for vectors. The `diskANN` index could also be used here, but is only available in a limited preview at this time. + - Select **OK** to create the container. + - Repeat the above steps to create a second container named "MaintenanceTasks." + +- The Azure Cosmos DB Data Explorer can be used to upload the data files provided by Contoso Suites. + - In the [Azure portal](https://portal.azure.com), navigate to your Cosmos DB resource and select **Data Explorer** in the left-hand menu. + - In the Data Explorer, expand the **ContosoSuites** database and the **UserReviews** container, then select **Items**. + + ![Data Explorer is highlighted in the left-hand menu. The expand icon is highlighted for the database and UserReviews containers. Items is highlighted.](../../media/Solution/0201-azure-cosmos-db-data-explorer-user-reviews-items.png) + + - Select **Upload Item** on the toolbar. 
+ + ![The Upload Item button on the Azure Cosmos DB toolbar is highlighted.](../../media/Solution/0201-azure-cosmos-db-toolbar-upload-item.png) + + - In the **Upload Items** dialog, select the browse button and navigate to the `UserReviews.json` file in the `/src/data` directory in the location where you cloned the repository, then select **Upload** to import the data in the file. + + ![The Upload Items dialog is displayed with the browse and Upload buttons highlighted. UserReviews.json appears in the Select JSON files box.](../../media/Solution/0201-upload-items-user-reviews.png) + + - Repeat the above steps, this time uploading data into the `MaintenanceTasks` container from the `PropertyMaintenance.json` file. + +
diff --git a/docs/02_add_chat_with_data/0202.md b/docs/02_implement_vector_search_in_cosmos_db_nosql/0202.md similarity index 92% rename from docs/02_add_chat_with_data/0202.md rename to docs/02_implement_vector_search_in_cosmos_db_nosql/0202.md index c7ff6c0be..59e82a0da 100644 --- a/docs/02_add_chat_with_data/0202.md +++ b/docs/02_implement_vector_search_in_cosmos_db_nosql/0202.md @@ -1,70 +1,77 @@ ---- -title: '2. Add data to the Chat playground' -layout: default -nav_order: 2 -parent: 'Exercise 02: Add chat with your data' ---- - -# Task 02 - Add data to the Chat playground (20 minutes) - -## Introduction - -Azure OpenAI supports adding data from a variety of Azure data sources, including Azure AI Search, Azure Blob Storage, Azure Cosmos DB for MongoDB vCore, data from a specific URL, or uploading local files. We can process data from these resources and make them available to an Azure OpenAI deployment, allowing an assistant to answer natural language user queries. - -## Description - -In the prior task, you made resort and hotel data available in Azure Blob Storage. In this task, you will show the Contoso Suites staff how to ask questions of a ChatGPT deployment based on the data you imported. - -The key tasks are as follows: - -1. Add a new data source for resort and hotel information in the Azure AI Studio Chat playground. Make sure that you have enabled vectorization using a text-embedding-ada-002 deployment. -2. Following is a sample customer request that a Contoso Suites customer service agent has received in the past. "I am looking for a sunny beachside resort on an island. There need to be diving opportunities nearby and I'd prefer it not to be too crowded an area. Which resorts would you recommend?" Enter this request into the chat session and note the response. - - {: .note } - > You may receive a message that your chat service is not able to complete the request based on its available information. 
If you do, please try rephrasing this as two separate requests. The first is, "Which resorts are on islands?" The second is, "Of those resorts, which have good diving opportunities?" - -3. Following is a sample customer request that a Contoso Suites customer service agent has received in the past. "Our family is celebrating my mother's 90th birthday and we want to have that celebration in Aruba. Do you have a hotel that can accommodate 19 room rentals? And are there any reception rooms at that hotel?" Enter this request into the chat session and note the response. - -## Success Criteria - -- You have created vectorized indexes in Azure AI Search for resorts and hotels. -- You have demonstrated how to use the Chat playground to allow ChatGPT to interact with custom data. - -## Learning Resources - -- [Azure OpenAI on your data](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data) -- [Data, privacy, and security for Azure OpenAI Service](https://learn.microsoft.com/legal/cognitive-services/openai/data-privacy) - -## Solution - -
-Expand this section to view the solution - -- All of this work can be done in the current Azure AI Studio (https://oai.azure.com). -- The steps to add a data source in the Chat playground are: - - Navigate to the Chat tab in Azure AI Studio. - - Select the "Add your data" tab from the Assistant setup page. - - Select the **Add a data source** button. - - ![Add a data source](../../media/Solution/0202_AddDataSource.png) - - - In the modal dialog, choose "Azure Blob Storage" from the data source drop-down list. - - CORS will need to be enabled for the storage account. You may do this from within the dialog, as long as you have appropriate permissions on the storage account. - - Select the storage account you created and the `contoso-suites` storage container. - - Choose the Azure AI Search resource you created. - - The index name can be something simple, such as "resorts" and the Indexer schedule can be set to Once. - - Select the "Add vector search to this search resource" option and choose your text-embedding-ada-002 deployment from the drop-down list. - - ![Select the data source](../../media/Solution/0202_SelectDataSource.png) - - Be sure that you select the "I acknowledge that connecting to an Azure AI Search account will incur usage to my account." checkbox once it appears. It will appear after you have selected your storage account. - - ![Acknowledge that connecting to an Azure AI Search account will incur usage to your account.](../../media/Solution/0202_Acknowledgement.png). - - - From the Search type menu, choose "Hybrid (vector + keyword)" and select the option acknowledging that this will incur usage to your account. - - ![Enable hybrid search via vector and keyword](../../media/Solution/0202_HybridSearch.png) - -- After ingestion and processing is complete, you can ask questions of the uploaded dataset. - -
+--- +title: '2. Add data to the Chat playground' +layout: default +nav_order: 2 +parent: 'Exercise 02: Add chat with your data' +--- + +# Task 02 - Add data to the Chat playground (20 minutes) + +## Introduction + +Azure OpenAI supports adding data from a variety of Azure data sources, including Azure AI Search, Azure Blob Storage, Azure Cosmos DB for MongoDB vCore, data from a specific URL, or uploading local files. We can process data from these resources and make them available to an Azure OpenAI deployment, allowing an assistant to answer natural language user queries. + +## Description + +Upload data and create vectors using Azure OpenAI (30 min) + 1. Upload data from blob storage (User reviews and property maintenance JSON files) + 2. Vectorize text fields + 3. Write vectors to field defined in container + 4. Review data in data explorer to see what vectors look like. + 5. Update application to handle vectorizing new records? + +In the prior task, you made resort and hotel data available in Azure Blob Storage. In this task, you will show the Contoso Suites staff how to ask questions of a ChatGPT deployment based on the data you imported. + +The key tasks are as follows: + +1. Add a new data source for resort and hotel information in the Azure AI Studio Chat playground. Make sure that you have enabled vectorization using a text-embedding-ada-002 deployment. +2. Following is a sample customer request that a Contoso Suites customer service agent has received in the past. "I am looking for a sunny beachside resort on an island. There need to be diving opportunities nearby and I'd prefer it not to be too crowded an area. Which resorts would you recommend?" Enter this request into the chat session and note the response. + + {: .note } + > You may receive a message that your chat service is not able to complete the request based on its available information. If you do, please try rephrasing this as two separate requests. 
The first is, "Which resorts are on islands?" The second is, "Of those resorts, which have good diving opportunities?" + +3. Following is a sample customer request that a Contoso Suites customer service agent has received in the past. "Our family is celebrating my mother's 90th birthday and we want to have that celebration in Aruba. Do you have a hotel that can accommodate 19 room rentals? And are there any reception rooms at that hotel?" Enter this request into the chat session and note the response. + +## Success Criteria + +- You have created vectorized indexes in Azure AI Search for resorts and hotels. +- You have demonstrated how to use the Chat playground to allow ChatGPT to interact with custom data. + +## Learning Resources + +- [Azure OpenAI on your data](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data) +- [Data, privacy, and security for Azure OpenAI Service](https://learn.microsoft.com/legal/cognitive-services/openai/data-privacy) + +## Solution + +
+Expand this section to view the solution + +- All of this work can be done in the current Azure AI Studio (https://oai.azure.com). +- The steps to add a data source in the Chat playground are: + - Navigate to the Chat tab in Azure AI Studio. + - Select the "Add your data" tab from the Assistant setup page. + - Select the **Add a data source** button. + + ![Add a data source](../../media/Solution/0202_AddDataSource.png) + + - In the modal dialog, choose "Azure Blob Storage" from the data source drop-down list. + - CORS will need to be enabled for the storage account. You may do this from within the dialog, as long as you have appropriate permissions on the storage account. + - Select the storage account you created and the `contoso-suites` storage container. + - Choose the Azure AI Search resource you created. + - The index name can be something simple, such as "resorts" and the Indexer schedule can be set to Once. + - Select the "Add vector search to this search resource" option and choose your text-embedding-ada-002 deployment from the drop-down list. + + ![Select the data source](../../media/Solution/0202_SelectDataSource.png) + + Be sure that you select the "I acknowledge that connecting to an Azure AI Search account will incur usage to my account." checkbox once it appears. It will appear after you have selected your storage account. + + ![Acknowledge that connecting to an Azure AI Search account will incur usage to your account.](../../media/Solution/0202_Acknowledgement.png) + + - From the Search type menu, choose "Hybrid (vector + keyword)" and select the option acknowledging that this will incur usage to your account. + + ![Enable hybrid search via vector and keyword](../../media/Solution/0202_HybridSearch.png) + +- After ingestion and processing are complete, you can ask questions of the uploaded dataset. + +
diff --git a/docs/02_add_chat_with_data/0203.md b/docs/02_implement_vector_search_in_cosmos_db_nosql/0203.md similarity index 96% rename from docs/02_add_chat_with_data/0203.md rename to docs/02_implement_vector_search_in_cosmos_db_nosql/0203.md index 04612b886..1a6098e02 100644 --- a/docs/02_add_chat_with_data/0203.md +++ b/docs/02_implement_vector_search_in_cosmos_db_nosql/0203.md @@ -1,122 +1,127 @@ ---- -title: '3. Add custom chat to a Streamlit dashboard' -layout: default -nav_order: 3 -parent: 'Exercise 02: Add chat with your data' ---- - -# Task 03 - Add custom chat to a Streamlit dashboard (30 minutes) - -## Introduction - -The Azure AI Studio Chat playground is a good place to try out functionality such as adding your own data and chatting with an assistant, but it is not the only way to enable this communication. It is also possible to integrate Azure OpenAI resources into existing code bases in a variety of languages, including C#, F#, Java, JavaScript, Python, and any language supporting interactions with REST APIs. - -## Description - -Now that you have demonstrated some of the capabilities around Azure OpenAI and using custom data to inform responses, the Contoso Suites development team would like to incorporate an Azure OpenAI ChatGPT model in their website. To simplify matters, they would like you to demonstrate in a Streamlit dashboard how we can incorporate chat capabilities. They are not particularly concerned about user interface niceties, as that is something they are capable of doing. Instead, they want you to demonstrate the integration process. - -The key tasks are as follows: - -1. In the `src\ContosoSuitesDashboard` folder, install all of the packages in **requirements.txt**. -2. Fill in the contents of `config.json` with relevant values from your Azure OpenAI service and from your Azure AI Search service. 
You will fill in the following variables: `AOAIEndpoint`, `AOAIKey`, `AOAIDeploymentName`, `SearchEndpoint`, `SearchKey`, and `SearchIndex`. Leave the remaining variables alone for now. You can find the Azure OpenAI endpoint, key, and deployment name in the Azure portal, specifically the **Keys and Endpoint** option under the **Resource Management** menu for your Azure OpenAI service. You can find the search endpoint search key in the Azure portal as well, specifically the **Keys** option under the **Settings** menu for your Azure AI Search service. The search index is the index that you created in Exercise 02, Task 02. - - {: .note } - > This `config.json` file is intended for demonstrating a Streamlit dashboard locally. Outside of a demonstration scenario, you would want to use [a combination of environment variables, Azure Key Vault, and Streamlit secrets](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/how-to-secure-azure-openai-keys-using-environment-variables/ba-p/3821162) to manage these details. - - {: .note } - > When filling in the **AOAIEndpoint** and **SearchEndpoint** configuration settings, be sure to include the `https://` part of the URL. - -3. The file named `Index.py` contains the skeleton of a Streamlit dashboard. It references a page called `pages\1_Chat_with_Data.py`. In the "Chat with Data" section of the latter file, complete the functions named `handle_chat_prompt()` and `create_chat_completion()`. -4. After loading the Streamlit page, ask the following question: "Our family is celebrating my mother's 90th birthday and we want to have that celebration in Aruba. Do you have a hotel that can accommodate 19 room rentals? And are there any reception rooms at that hotel?" - - {: .note } - > Use the following command in a terminal to run Streamlit: `streamlit run Index.py`. 
You must be in the `src\ContosoSuitesDashboard` directory in your terminal, must have Python installed, and must have installed requirements, including Streamlit. After running this command, you will be able to view the Streamlit app in your web browser by navigating to the URL displayed in the terminal. - - {: .note } - > If you get an error that Streamlit is not installed, you may instead need to run `python -m streamlit run Index.py`. - -5. Continue the chat conversation with this follow-up: "What other amenities does that hotel have?" - -## Success Criteria - -- Website users are able to enter their prompts into a textbox and submit the prompt to Azure OpenAI. -- The resulting response will appear on the webpage as a chat response. -- Session history is retained as long as the Streamlit dashboard is open but refreshing the page will reset session history. - -## Learning Resources - -- [Quickstart: Chat with Azure OpenAI models using your own data](https://learn.microsoft.com/azure/ai-services/openai/use-your-data-quickstart?tabs=powershell%2Cpython&pivots=programming-language-python) -- [Sample Chat App with AOAI](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main) -- [Sample 08 - Use your own data](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/openai/Azure.AI.OpenAI/tests/Samples/Sample08_UseYourOwnData.cs) -- [Build a conversational app with Streamlit](https://docs.streamlit.io/knowledge-base/tutorials/build-conversational-apps) - -## Solution - -
-Expand this section to view the solution - -- Be sure that all Python packages are installed before trying to run Streamlit. At the command line and inside the `src\ContosoSuitesDashboard\` folder, execute the following command: `pip install -r requirements.txt` -- Modify the `config.json` file to fill in values for each of the configuration settings. For **AOAIEndpoint** and **SearchEndpoint**, you will want to include this as a URL, starting with `https://`. -- The `main()` function acts as the control function for this page. It initializes the chat history and displays it on each page refresh. Then, it calls `handle_prompt()` to handle the user's prompt. `handle_prompt()` then calls `handle_chat_prompt()` to send the text of a message to the Azure OpenAI service. -- The `handle_chat_prompt()` function does two things: it echoes the user's prompt to the chat window, and then it sends the prompt to Azure OpenAI and writes the resulting message to the chat window. - - The code for the completed `handle_chat_prompt()` function is as follows: - - ```python - # Echo the user's prompt to the chat window - st.session_state.messages.append({"role": "user", "content": prompt}) - with st.chat_message("user"): - st.markdown(prompt) - - # Send the user's prompt to Azure OpenAI and display the response - # The call to Azure OpenAI is handled in create_chat_completion() - # This function loops through the responses and displays them as they come in. - # It also appends the full response to the chat history. 
- with st.chat_message("assistant"): - message_placeholder = st.empty() - full_response = "" - for response in create_chat_completion(deployment_name, st.session_state.messages, config["SearchEndpoint"], config["SearchKey"], config["SearchIndex"]): - full_response += (response.choices[0].delta.content or "") - message_placeholder.markdown(full_response + "▌") - message_placeholder.markdown(full_response) - st.session_state.messages.append({"role": "assistant", "content": full_response}) - ``` - - {: .note } - > After filling in the `create_chat_completion()` function, be sure to remove the line `raise NotImplementedError`. Otherwise, you will get a `NotImplementedError` error message when running the code. You should remove these raise blocks as you implement functions in the code. They exist in order to allow us to define stub functions, where we know the function name but do not yet have completely working code in place. - -- The `create_chat_completion()` function reaches out to Azure OpenAI and performs the chat completion, ensuring that we only include information from our Azure AI Search index. - - The code for the completed `create_chat_completion()` function is as follows: - - ```python - # Create an Azure OpenAI client. We create it in here because each exercise will - # require at a minimum different base URLs. 
- client = openai.AzureOpenAI( - base_url=f"{aoai_endpoint}/openai/deployments/{deployment_name}/extensions/", - api_key=aoai_api_key, - api_version="2023-12-01-preview" - ) - # Create and return a new chat completion request - # Be sure to include the "extra_body" parameter to use Azure AI Search as the data source - return client.chat.completions.create( - model=deployment_name, - messages=[ - {"role": m["role"], "content": m["content"]} - for m in messages - ], - stream=True, - extra_body={ - "dataSources": [ - { - "type": "AzureCognitiveSearch", - "parameters": { - "endpoint": endpoint, - "key": key, - "indexName": index_name, - } - } - ] - } - ) - ``` - -
+--- +title: '3. Add custom chat to a Streamlit dashboard' +layout: default +nav_order: 3 +parent: 'Exercise 02: Add chat with your data' +--- + +# Task 03 - Add custom chat to a Streamlit dashboard (30 minutes) + +## Introduction + +The Azure AI Studio Chat playground is a good place to try out functionality such as adding your own data and chatting with an assistant, but it is not the only way to enable this communication. It is also possible to integrate Azure OpenAI resources into existing code bases in a variety of languages, including C#, F#, Java, JavaScript, Python, and any language supporting interactions with REST APIs. + +## Description + +Execute vector distance queries (15 min) + 1. Execute various vector searches + 2. Compare against traditional search? + 3. Add new record and then search for it? This will showcase vectorization process running against new records and making them immediately available for searching. + +Now that you have demonstrated some of the capabilities around Azure OpenAI and using custom data to inform responses, the Contoso Suites development team would like to incorporate an Azure OpenAI ChatGPT model in their website. To simplify matters, they would like you to demonstrate in a Streamlit dashboard how we can incorporate chat capabilities. They are not particularly concerned about user interface niceties, as that is something they are capable of doing. Instead, they want you to demonstrate the integration process. + +The key tasks are as follows: + +1. In the `src\ContosoSuitesDashboard` folder, install all of the packages in **requirements.txt**. +2. Fill in the contents of `config.json` with relevant values from your Azure OpenAI service and from your Azure AI Search service. You will fill in the following variables: `AOAIEndpoint`, `AOAIKey`, `AOAIDeploymentName`, `SearchEndpoint`, `SearchKey`, and `SearchIndex`. Leave the remaining variables alone for now. 
You can find the Azure OpenAI endpoint, key, and deployment name in the Azure portal, specifically the **Keys and Endpoint** option under the **Resource Management** menu for your Azure OpenAI service. You can find the search endpoint and search key in the Azure portal as well, specifically the **Keys** option under the **Settings** menu for your Azure AI Search service. The search index is the index that you created in Exercise 02, Task 02. + + {: .note } + > This `config.json` file is intended for demonstrating a Streamlit dashboard locally. Outside of a demonstration scenario, you would want to use [a combination of environment variables, Azure Key Vault, and Streamlit secrets](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/how-to-secure-azure-openai-keys-using-environment-variables/ba-p/3821162) to manage these details. + + {: .note } + > When filling in the **AOAIEndpoint** and **SearchEndpoint** configuration settings, be sure to include the `https://` part of the URL. + +3. The file named `Index.py` contains the skeleton of a Streamlit dashboard. It references a page called `pages\1_Chat_with_Data.py`. In the "Chat with Data" section of the latter file, complete the functions named `handle_chat_prompt()` and `create_chat_completion()`. +4. After loading the Streamlit page, ask the following question: "Our family is celebrating my mother's 90th birthday and we want to have that celebration in Aruba. Do you have a hotel that can accommodate 19 room rentals? And are there any reception rooms at that hotel?" + + {: .note } + > Use the following command in a terminal to run Streamlit: `streamlit run Index.py`. You must be in the `src\ContosoSuitesDashboard` directory in your terminal, must have Python installed, and must have installed requirements, including Streamlit. After running this command, you will be able to view the Streamlit app in your web browser by navigating to the URL displayed in the terminal.
+ + {: .note } + > If you get an error that Streamlit is not installed, you may instead need to run `python -m streamlit run Index.py`. + +5. Continue the chat conversation with this follow-up: "What other amenities does that hotel have?" + +## Success Criteria + +- Website users are able to enter their prompts into a textbox and submit the prompt to Azure OpenAI. +- The resulting response will appear on the webpage as a chat response. +- Session history is retained as long as the Streamlit dashboard is open but refreshing the page will reset session history. + +## Learning Resources + +- [Quickstart: Chat with Azure OpenAI models using your own data](https://learn.microsoft.com/azure/ai-services/openai/use-your-data-quickstart?tabs=powershell%2Cpython&pivots=programming-language-python) +- [Sample Chat App with AOAI](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main) +- [Sample 08 - Use your own data](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/openai/Azure.AI.OpenAI/tests/Samples/Sample08_UseYourOwnData.cs) +- [Build a conversational app with Streamlit](https://docs.streamlit.io/knowledge-base/tutorials/build-conversational-apps) + +## Solution + +
+Expand this section to view the solution + +- Be sure that all Python packages are installed before trying to run Streamlit. At the command line and inside the `src\ContosoSuitesDashboard\` folder, execute the following command: `pip install -r requirements.txt` +- Modify the `config.json` file to fill in values for each of the configuration settings. For **AOAIEndpoint** and **SearchEndpoint**, you will want to include this as a URL, starting with `https://`. +- The `main()` function acts as the control function for this page. It initializes the chat history and displays it on each page refresh. Then, it calls `handle_prompt()` to handle the user's prompt. `handle_prompt()` then calls `handle_chat_prompt()` to send the text of a message to the Azure OpenAI service. +- The `handle_chat_prompt()` function does two things: it echoes the user's prompt to the chat window, and then it sends the prompt to Azure OpenAI and writes the resulting message to the chat window. + - The code for the completed `handle_chat_prompt()` function is as follows: + + ```python + # Echo the user's prompt to the chat window + st.session_state.messages.append({"role": "user", "content": prompt}) + with st.chat_message("user"): + st.markdown(prompt) + + # Send the user's prompt to Azure OpenAI and display the response + # The call to Azure OpenAI is handled in create_chat_completion() + # This function loops through the responses and displays them as they come in. + # It also appends the full response to the chat history. 
+ with st.chat_message("assistant"): + message_placeholder = st.empty() + full_response = "" + for response in create_chat_completion(deployment_name, st.session_state.messages, config["SearchEndpoint"], config["SearchKey"], config["SearchIndex"]): + full_response += (response.choices[0].delta.content or "") + message_placeholder.markdown(full_response + "▌") + message_placeholder.markdown(full_response) + st.session_state.messages.append({"role": "assistant", "content": full_response}) + ``` + + {: .note } + > After filling in the `create_chat_completion()` function, be sure to remove the line `raise NotImplementedError`. Otherwise, you will get a `NotImplementedError` error message when running the code. You should remove these raise blocks as you implement functions in the code. They exist in order to allow us to define stub functions, where we know the function name but do not yet have completely working code in place. + +- The `create_chat_completion()` function reaches out to Azure OpenAI and performs the chat completion, ensuring that we only include information from our Azure AI Search index. + - The code for the completed `create_chat_completion()` function is as follows: + + ```python + # Create an Azure OpenAI client. We create it in here because each exercise will + # require at a minimum different base URLs. 
+ client = openai.AzureOpenAI( + base_url=f"{aoai_endpoint}/openai/deployments/{deployment_name}/extensions/", + api_key=aoai_api_key, + api_version="2023-12-01-preview" + ) + # Create and return a new chat completion request + # Be sure to include the "extra_body" parameter to use Azure AI Search as the data source + return client.chat.completions.create( + model=deployment_name, + messages=[ + {"role": m["role"], "content": m["content"]} + for m in messages + ], + stream=True, + extra_body={ + "dataSources": [ + { + "type": "AzureCognitiveSearch", + "parameters": { + "endpoint": endpoint, + "key": key, + "indexName": index_name, + } + } + ] + } + ) + ``` + +
diff --git a/docs/02_implement_vector_search_in_cosmos_db_nosql/02_implement_vector_search_in_cosmos_db_nosql.md b/docs/02_implement_vector_search_in_cosmos_db_nosql/02_implement_vector_search_in_cosmos_db_nosql.md new file mode 100644 index 000000000..91523a1ce --- /dev/null +++ b/docs/02_implement_vector_search_in_cosmos_db_nosql/02_implement_vector_search_in_cosmos_db_nosql.md @@ -0,0 +1,41 @@ +--- +title: 'Exercise 02: Implement contextual grounding using vector search in Azure Cosmos DB NoSQL' +layout: default +nav_order: 3 +has_children: true +--- + +# Exercise 02 - Implement contextual grounding using vector search in Azure Cosmos DB NoSQL + +## Lab Scenario + +One of the most natural ways to integrate Azure OpenAI in an existing solution is to incorporate chat into an existing system. For this solution to bring the most value to an organization, however, the chat service must have access to information that may be proprietary or otherwise confidential. In this exercise, we will add custom data to augment an existing Azure OpenAI chat deployment, allowing customer service agents to review customer data in a natural language format. + +## Objectives + +After you complete this lab, you will be able to: + +- Enable the Vector Search feature in Azure Cosmos DB NoSQL +- Define container vector policies +- Create vector indexing policies +- Generate vector embeddings using Azure OpenAI +- Perform similarity search using the `VectorDistance()` function in Cosmos DB + +## Lab Duration + +- **Estimated Time:** 60 minutes + +1. Enable vector search (15 min) + 1. Enroll in feature + 2. Create containers (UserReviews and PropertyMaintenance) + 3. Define container vector policy and vector indexing policy. +2. Upload data and create vectors using Azure OpenAI (30 min) + 1. Upload data from blob storage (User reviews and property maintenance JSON files) + 2. Vectorize text fields + 3. Write vectors to field defined in container + 4.
Review data in data explorer to see what vectors look like. + 5. Update application to handle vectorizing new records? +3. Execute vector distance queries (15 min) + 1. Execute various vector searches + 2. Compare against traditional search? + 3. Add new record and then search for it? This will showcase vectorization process running against new records and making them immediately available for searching. \ No newline at end of file