From 66842d833bc11672c198235f39af5c17dcdad90a Mon Sep 17 00:00:00 2001
From: davidastart <37092569+davidastart@users.noreply.github.com>
Date: Sun, 1 Sep 2024 16:11:47 -0400
Subject: [PATCH] Update introduction.md

---
 ai-vector-image/introduction/introduction.md | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/ai-vector-image/introduction/introduction.md b/ai-vector-image/introduction/introduction.md
index 84add9c0..07f048af 100644
--- a/ai-vector-image/introduction/introduction.md
+++ b/ai-vector-image/introduction/introduction.md
@@ -4,16 +4,27 @@
 Large Language Models (LLMs) have transformed artificial intelligence by enabling computers to understand and generate human-like text. These models rely on vectors—mathematical representations of words, phrases, and sentences—to process and create language. Vectors allow LLMs to capture the meaning of words and the relationships between them, making it possible for the models to perform tasks like text generation, translation, and question-answering with impressive accuracy. However, as we push LLMs to handle more complex tasks, such as integrating text with other types of data like images, new challenges arise. Combining these different kinds of vectors—those representing text and those representing images—requires advanced techniques to ensure the model can effectively understand and generate multimodal information.

-In this workshop, we will look at one approach to solving this problem. We will leverage one model to generate descriptions for the image and then a second model for creating the vectors for the textual descriptions. This greatly simplifies the problem because having a single model that can both take images and text as inputs and generate compatible vectors is a challenge and very few models support this. By breaking the task up into two steps we get captions for the images which we can use in the application and we get access to many more LLMs as the tasks individually are much simplier. The diagram below shows the workflow that we will accomplish in this workshop.
+This workshop outlines a two-step approach to tackling this problem by leveraging two different models. The first model generates descriptions for images, while the second model creates vectors for these textual descriptions. The second model is loaded into the database, allowing for both vector generation and AI Vector Search without leaving the database. Separating the tasks reduces complexity and makes it easier to use existing models, as very few can handle both images and text simultaneously. This approach not only simplifies the problem but also broadens the range of available large language models (LLMs), since each task is more straightforward on its own.
+
+The workflow diagram below illustrates the following steps:
+
+- Image Input: Start with an image that needs to be described.
+- Description Generation: Use a model to generate a textual description or caption for the image.
+- Text Vectorization: Pass the generated description through a second model (embedded in the database) that creates vectors from the text.
+- APEX Application: Create a quick application leveraging the embedded text model and AI Vector Search.
+
+This method also makes the solution more versatile: because the text embeddings and similarity search occur within the database, any application can be developed on top of them.
 ![Image alt text](images/diagram1.png)

-Estimated Workshop Time: 70 Minutes
+ [](youtube:pu79sny1AzY)
+Estimated Workshop Time: 70 Minutes
+
 ### Objectives

 In this workshop, you will learn how to:
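
For readers reviewing this change, the sketch below illustrates the in-database portion of the workflow the new text describes, using Oracle Database 23ai AI Vector Search. It assumes an ONNX embedding model has already been exported to a file in a database directory; all object names (`DM_DUMP`, `ALL_MINILM_L12_V2`, `image_captions`, and the sample caption) are hypothetical placeholders, not the names used in the workshop, and the image-captioning step itself happens outside the database and is not shown.

```sql
-- A minimal sketch of the in-database steps, assuming Oracle Database 23ai.
-- All object names here are hypothetical placeholders for illustration.

-- 1. Load an exported ONNX embedding model into the database (done once).
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'DM_DUMP',                  -- database directory holding the model file
    'all_MiniLM_L12_v2.onnx',   -- exported ONNX embedding model
    'ALL_MINILM_L12_V2');       -- name the model will have inside the database
END;
/

-- 2. Store each image's generated caption together with its vector,
--    computed inside the database by the loaded model.
CREATE TABLE image_captions (
  img_id         NUMBER PRIMARY KEY,
  file_name      VARCHAR2(200),
  caption        VARCHAR2(4000),
  caption_vector VECTOR);

INSERT INTO image_captions (img_id, file_name, caption, caption_vector)
VALUES (1, 'beach.png',
        'A golden retriever running on a sunny beach',
        VECTOR_EMBEDDING(ALL_MINILM_L12_V2
          USING 'A golden retriever running on a sunny beach' AS data));

-- 3. AI Vector Search: find the images whose captions are closest in
--    meaning to a free-text search phrase, without leaving the database.
SELECT img_id, file_name, caption
FROM   image_captions
ORDER  BY VECTOR_DISTANCE(
            caption_vector,
            VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING :search_text AS data),
            COSINE)
FETCH FIRST 5 ROWS ONLY;
```

Because both the embedding call and the distance search run as plain SQL, an APEX page (or any other client) only needs to bind `:search_text` to get semantic image search, which is the versatility the added paragraph claims.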