
Large Language Models (LLMs) have transformed artificial intelligence by enabling computers to understand and generate human-like text. These models rely on vectors—mathematical representations of words, phrases, and sentences—to process and create language. Vectors allow LLMs to capture the meaning of words and the relationships between them, making it possible for the models to perform tasks like text generation, translation, and question-answering with impressive accuracy. However, as we push LLMs to handle more complex tasks, such as integrating text with other types of data like images, new challenges arise. Combining these different kinds of vectors—those representing text and those representing images—requires advanced techniques to ensure the model can effectively understand and generate multimodal information.
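To make the idea of vectors concrete, here is a minimal sketch (not part of the workshop materials) showing that semantically similar sentences map to nearby vectors. It assumes the `sentence-transformers` Python package, and `all-MiniLM-L6-v2` is just one example of a publicly available embedding model:

```python
# A minimal sketch of how text embeddings capture meaning.
# Assumes: pip install sentence-transformers numpy
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "A dog is playing in the park.",
    "A puppy runs across the grass.",
    "The stock market closed higher today.",
]
vectors = model.encode(sentences)  # one vector per sentence

def cosine(a, b):
    # Cosine similarity: close to 1.0 for similar meanings, near 0 for unrelated text.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[0], vectors[1]))  # high: related meanings
print(cosine(vectors[0], vectors[2]))  # low: unrelated meanings
```

Searching over vectors like these, rather than over raw keywords, is what lets an application find content by meaning.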

This workshop outlines a two-step approach to this problem that leverages two different models. The first model generates a textual description for each image, while the second model creates a vector for that description. The second model is loaded into the database itself, allowing both vector generation and AI Vector Search to run without leaving the database. Splitting the work this way reduces complexity and makes it easier to use existing models, since very few can handle both images and text simultaneously. It also broadens the range of usable large language models (LLMs), because each task is much simpler on its own.

The workflow diagram below illustrates the following steps:

- Image Input: Start with an image that needs to be described.
- Description Generation: Use a model to generate a textual description or caption for the image.
- Text Vectorization: Pass the generated description through a second model (embedded in the database) that creates a vector from the text.
- APEX Application: Build a quick application that leverages the embedded text model and AI Vector Search.

This method also makes the solution more versatile: because the text embeddings and search happen within the database, any application can be built on top of them. A rough sketch of the two-step flow follows the diagram below.


![Workflow diagram](images/diagram1.png)
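The sketch below walks through both steps. It is an illustration only, not the workshop's exact setup: the captioning model (`Salesforce/blip-image-captioning-base`), the connection details, the `images` table, and the in-database ONNX embedding model name (`doc_model`) are all hypothetical placeholders. The labs in this workshop walk through the real configuration.

```python
# A minimal sketch of the two-step workflow.
# Assumes: pip install transformers pillow torch oracledb
import oracledb
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Step 1: generate a textual description (caption) for the image.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
)
image = Image.open("sample.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")
caption = processor.decode(captioner.generate(**inputs)[0], skip_special_tokens=True)

# Step 2: let the database vectorize the caption with its embedded ONNX model
# and find the closest stored images using AI Vector Search.
# Connection details, table, and model name below are placeholders.
conn = oracledb.connect(user="vector", password="...", dsn="localhost/freepdb1")
cur = conn.cursor()
cur.execute(
    """
    SELECT file_name, description
    FROM images
    ORDER BY VECTOR_DISTANCE(
        embedding,
        VECTOR_EMBEDDING(doc_model USING :caption AS data),
        COSINE
    )
    FETCH FIRST 5 ROWS ONLY
    """,
    caption=caption,
)
for file_name, description in cur:
    print(file_name, "--", description)
```

Because the embedding and the distance computation both happen in SQL, any client that can issue a query, including an APEX application, can use the same search without shipping vectors around.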


Estimated Workshop Time: 70 Minutes


[](youtube:pu79sny1AzY)

### Objectives

In this workshop, you will learn how to:
