diff --git a/README.md b/README.md
index 8b3fbf0..c7c8f5c 100644
--- a/README.md
+++ b/README.md
@@ -93,7 +93,9 @@ LLM-based models:
 python -m pip install -U angle-emb
 ```

-### ⌛ Load BERT-based Model
+### ⌛ Infer BERT-based Model
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1QJcA2Mvive4pBxWweTpZz9OgwvE42eJZ?usp=sharing)
+

 1) **With Prompts**: You can specify a prompt with `prompt=YOUR_PROMPT` in `encode` method.
 If set a prompt, the inputs should be a list of dict or a single dict with key `text`, where `text` is the placeholder in the prompt for the input text. You can use other placeholder names. We provide a set of predefined prompts in `Prompts` class, you can check them via `Prompts.list_prompts()`.
@@ -137,27 +139,88 @@ for i, dv1 in enumerate(doc_vecs):
 ```

-### ⌛ Load LLM-based Models
+### ⌛ Infer LLM-based Models
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1QJcA2Mvive4pBxWweTpZz9OgwvE42eJZ?usp=sharing)

 If the pretrained weight is a LoRA-based model, you need to specify the backbone via `model_name_or_path` and specify the LoRA path via the `pretrained_lora_path` in `from_pretrained` method.

 ```python
 from angle_emb import AnglE, Prompts
+from angle_emb.utils import cosine_similarity

 angle = AnglE.from_pretrained('NousResearch/Llama-2-7b-hf',
                               pretrained_lora_path='SeanLee97/angle-llama-7b-nli-v2',
                               pooling_strategy='last',
                               is_llm=True,
-                              torch_dtype='float16')
+                              torch_dtype='float16').cuda()
 print('All predefined prompts:', Prompts.list_prompts())
-vec = angle.encode({'text': 'hello world'}, to_numpy=True, prompt=Prompts.A)
-print(vec)
-vecs = angle.encode([{'text': 'hello world1'}, {'text': 'hello world2'}], to_numpy=True, prompt=Prompts.A)
-print(vecs)
+doc_vecs = angle.encode([
+    'The weather is great!',
+    'The weather is very good!',
+    'i am going to bed'
+], prompt=Prompts.A)
+
+for i, dv1 in enumerate(doc_vecs):
+    for dv2 in doc_vecs[i+1:]:
+        print(cosine_similarity(dv1, dv2))
+```
+
+
+### ⌛ Infer BiLLM-based Models
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1QJcA2Mvive4pBxWweTpZz9OgwvE42eJZ?usp=sharing)
+
+Specify `apply_billm` and `billm_model_class` to load and infer BiLLM models.
+
+
+```python
+from angle_emb import AnglE, Prompts
+from angle_emb.utils import cosine_similarity
+
+# specify `apply_billm` and `billm_model_class` to load billm models
+angle = AnglE.from_pretrained('NousResearch/Llama-2-7b-hf',
+                              pretrained_lora_path='SeanLee97/bellm-llama-7b-nli',
+                              pooling_strategy='last',
+                              is_llm=True,
+                              apply_billm=True,
+                              billm_model_class='LlamaForCausalLM',
+                              torch_dtype='float16').cuda()
+
+doc_vecs = angle.encode([
+    'The weather is great!',
+    'The weather is very good!',
+    'i am going to bed'
+], prompt='The representative word for sentence {text} is:"')
+
+for i, dv1 in enumerate(doc_vecs):
+    for dv2 in doc_vecs[i+1:]:
+        print(cosine_similarity(dv1, dv2))
+```
+
+### ⌛ Infer Espresso/Matryoshka Models
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1QJcA2Mvive4pBxWweTpZz9OgwvE42eJZ?usp=sharing)
+
+Specify `layer_index` and `embedding_size` to truncate embeddings.
+
+
+```python
+from angle_emb import AnglE
+from angle_emb.utils import cosine_similarity
+
+
+angle = AnglE.from_pretrained('mixedbread-ai/mxbai-embed-2d-large-v1', pooling_strategy='cls').cuda()
+# specify layer_index and embedding size to truncate embeddings
+doc_vecs = angle.encode([
+    'The weather is great!',
+    'The weather is very good!',
+    'i am going to bed'
+], layer_index=22, embedding_size=768)
+
+for i, dv1 in enumerate(doc_vecs):
+    for dv2 in doc_vecs[i+1:]:
+        print(cosine_similarity(dv1, dv2))
 ```

-### ⌛ Load Third-party Models
+### ⌛ Infer Third-party Models

 You can load any transformer-based third-party models such as `mixedbread-ai/mxbai-embed-large-v1`, `sentence-transformers/all-MiniLM-L6-v2`, and `BAAI/bge-large-en-v1.5` using `angle_emb`.

diff --git a/docs/notes/pretrained_models.rst b/docs/notes/pretrained_models.rst
index a4dd39c..aa0ef4a 100644
--- a/docs/notes/pretrained_models.rst
+++ b/docs/notes/pretrained_models.rst
@@ -24,9 +24,9 @@ LLM-based models:
 +------------------------------------+-----------------------------+------------------+--------------------------+------------------+---------------------------------+
 | 🤗 HF (lora weight)                | Backbone                    | Max Tokens       | Prompts                  | Pooling Strategy | Scenario                        |
 +====================================+=============================+==================+==========================+==================+=================================+
-| `SeanLee97/angle-llama-13b-nli`_   | NousResearch/Llama-2-13b-hf | 4096             | ``Prompts.A``            | last token       | English, Similarity Measurement |
+| `SeanLee97/angle-llama-13b-nli`_   | NousResearch/Llama-2-13b-hf | 4096             | ``Prompts.A``            | last             | English, Similarity Measurement |
 +------------------------------------+-----------------------------+------------------+--------------------------+------------------+---------------------------------+
-| `SeanLee97/angle-llama-7b-nli-v2`_ | NousResearch/Llama-2-7b-hf  | 4096             | ``Prompts.A``            | last token       | English, Similarity Measurement |
+| `SeanLee97/angle-llama-7b-nli-v2`_ | NousResearch/Llama-2-7b-hf  | 4096             | ``Prompts.A``            | last             | English, Similarity Measurement |
 +------------------------------------+-----------------------------+------------------+--------------------------+------------------+---------------------------------+

 .. _SeanLee97/angle-llama-13b-nli: https://huggingface.co/SeanLee97/angle-llama-13b-nli
diff --git a/docs/notes/quickstart.rst b/docs/notes/quickstart.rst
index 6d1012a..348d9fe 100644
--- a/docs/notes/quickstart.rst
+++ b/docs/notes/quickstart.rst
@@ -14,7 +14,7 @@ A few steps to get started with AnglE:

 Other installation methods, please refer to the `Installation` section.

-⌛ Load BERT-based Model
+⌛ Infer BERT-based Model
 ------------------------------------

 1) **With Prompts**: You can specify a prompt with `prompt=YOUR_PROMPT` in `encode` method.
@@ -65,7 +65,7 @@ You can use other placeholder names. We provide a set of predefined prompts in `


-⌛ Load LLM-based Models
+⌛ Infer LLM-based Models
 ------------------------------------

 If the pretrained weight is a LoRA-based model, you need to specify the backbone via `model_name_or_path` and specify the LoRA path via the `pretrained_lora_path` in `from_pretrained` method.
@@ -78,7 +78,7 @@ If the pretrained weight is a LoRA-based model, you need to specify the backbone
                                   pretrained_lora_path='SeanLee97/angle-llama-7b-nli-v2',
                                   pooling_strategy='last',
                                   is_llm=True,
-                                  torch_dtype='float16')
+                                  torch_dtype='float16').cuda()

     print('All predefined prompts:', Prompts.list_prompts())
     vec = angle.encode({'text': 'hello world'}, to_numpy=True, prompt=Prompts.A)
@@ -86,3 +86,57 @@ If the pretrained weight is a LoRA-based model, you need to specify the backbone

     vecs = angle.encode([{'text': 'hello world1'}, {'text': 'hello world2'}], to_numpy=True, prompt=Prompts.A)
     print(vecs)
+
+⌛ Infer BiLLM-based Models
+------------------------------------
+
+Specify `apply_billm` and `billm_model_class` to load and infer BiLLM models.
+
+.. code-block:: python
+
+    from angle_emb import AnglE, Prompts
+    from angle_emb.utils import cosine_similarity
+
+    # specify `apply_billm` and `billm_model_class` to load billm models
+    angle = AnglE.from_pretrained('NousResearch/Llama-2-7b-hf',
+                                  pretrained_lora_path='SeanLee97/bellm-llama-7b-nli',
+                                  pooling_strategy='last',
+                                  is_llm=True,
+                                  apply_billm=True,
+                                  billm_model_class='LlamaForCausalLM',
+                                  torch_dtype='float16').cuda()
+
+    doc_vecs = angle.encode([
+        'The weather is great!',
+        'The weather is very good!',
+        'i am going to bed'
+    ], prompt='The representative word for sentence {text} is:"')
+
+    for i, dv1 in enumerate(doc_vecs):
+        for dv2 in doc_vecs[i+1:]:
+            print(cosine_similarity(dv1, dv2))
+
+
+
+⌛ Infer Espresso/Matryoshka Models
+------------------------------------
+
+Specify `layer_index` and `embedding_size` to truncate embeddings.
+
+.. code-block:: python
+
+    from angle_emb import AnglE
+    from angle_emb.utils import cosine_similarity
+
+
+    angle = AnglE.from_pretrained('mixedbread-ai/mxbai-embed-2d-large-v1', pooling_strategy='cls').cuda()
+    # specify layer_index and embedding_size to truncate embeddings
+    doc_vecs = angle.encode([
+        'The weather is great!',
+        'The weather is very good!',
+        'i am going to bed'
+    ], layer_index=22, embedding_size=768)
+
+    for i, dv1 in enumerate(doc_vecs):
+        for dv2 in doc_vecs[i+1:]:
+            print(cosine_similarity(dv1, dv2))
\ No newline at end of file
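
The README hunk above also renames the "Infer Third-party Models" section, but its usage example sits outside the diff's context lines. For reference only, a minimal sketch of that usage following the same `AnglE.from_pretrained`/`encode` pattern as the snippets in this patch; the model name and the `pooling_strategy='cls'` choice are illustrative assumptions, not part of the patch:

```python
from angle_emb import AnglE
from angle_emb.utils import cosine_similarity

# Assumed example: load a third-party embedding model by its Hugging Face name;
# 'cls' pooling mirrors the mxbai example shown earlier in the diff.
angle = AnglE.from_pretrained('mixedbread-ai/mxbai-embed-large-v1',
                              pooling_strategy='cls').cuda()

doc_vecs = angle.encode([
    'The weather is great!',
    'The weather is very good!',
    'i am going to bed'
])

for i, dv1 in enumerate(doc_vecs):
    for dv2 in doc_vecs[i+1:]:
        print(cosine_similarity(dv1, dv2))
```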