docs: fixes qdrant_hybrid.ipynb typos (run-llama#9729)
docs: qdrant_hybrid.ipynb typos
Anush008 authored Dec 28, 2023
1 parent e114f1f commit 1170539
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions docs/examples/vector_stores/qdrant_hybrid.ipynb
@@ -15,9 +15,9 @@
"\n",
"Qdrant supports hybrid search by combining search results from `sparse` and `dense` vectors.\n",
"\n",
- "`dense` vectors are the ones you have probably already been using -- embedding models from OpenAI, BGE, SentenceTransformers, etc. are typicaly `dense` embedding models. They create a numerical representation of a piece of text, represented as a long list of numbers. These `dense` vectors can capture rich semantics across the entire piece of text.\n",
+ "`dense` vectors are the ones you have probably already been using -- embedding models from OpenAI, BGE, SentenceTransformers, etc. are typically `dense` embedding models. They create a numerical representation of a piece of text, represented as a long list of numbers. These `dense` vectors can capture rich semantics across the entire piece of text.\n",
"\n",
- "`sparse` vectors are slightly different. They use a specialized approach or model (TF-IDF, BM25, SPLADE, etc.) for generating vectors. These vectors are typically mostly zeros, making them `sparse` vectors. These `sparse` vectors are great at capturing specific keywords and similar small-details.\n",
+ "`sparse` vectors are slightly different. They use a specialized approach or model (TF-IDF, BM25, SPLADE, etc.) for generating vectors. These vectors are typically mostly zeros, making them `sparse` vectors. These `sparse` vectors are great at capturing specific keywords and similar small details.\n",
"\n",
"This notebook walks through setting up and customizing hybrid search with Qdrant and `naver/efficient-splade-VI-BT-large` variants from Huggingface."
]
@@ -140,9 +140,9 @@
"source": [
"## Hybrid Queries\n",
"\n",
- "When querying with hybrid mode, we can set `similarity_top_k` and `sparse_top_k` seperately.\n",
+ "When querying with hybrid mode, we can set `similarity_top_k` and `sparse_top_k` separately.\n",
"\n",
- "`sparse_top_k` represents how many nodes will be retrieved from each dense and sparse query. For example, if `sparse_top_k=5` is set, that means I will retrieve 5 nodes using sparse vectors and 5 nodes using dense vectoors.\n",
+ "`sparse_top_k` represents how many nodes will be retrieved from each dense and sparse query. For example, if `sparse_top_k=5` is set, that means I will retrieve 5 nodes using sparse vectors and 5 nodes using dense vectors.\n",
"\n",
"`similarity_top_k` controls the final number of returned nodes. In the above setting, we end up with 10 nodes. A fusion algorithm is applied to rank and order the nodes from different vector spaces ([relative score fusion](https://weaviate.io/blog/hybrid-search-fusion-algorithms#relative-score-fusion) in this case). `similarity_top_k=2` means the top two nodes after fusion are returned."
]
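Reviewer note on the hunk above: the fusion step it describes can be sketched in plain Python. This is a minimal illustration of relative score fusion (min-max normalise each result list, then take a weighted sum), not the code LlamaIndex or Qdrant actually runs. The function names, the `alpha` weight, and the toy scores are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {node_id: 1.0 for node_id in scores}
    return {node_id: (s - lo) / (hi - lo) for node_id, s in scores.items()}


def relative_score_fusion(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight given to the dense scores
    top_k: int = 2,      # plays the role of `similarity_top_k`
) -> List[Tuple[str, float]]:
    """Fuse two score maps and return the top_k (node_id, score) pairs."""
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    fused = {
        # A node missing from one result list contributes 0 for that list.
        node_id: alpha * dense_n.get(node_id, 0.0)
        + (1 - alpha) * sparse_n.get(node_id, 0.0)
        for node_id in set(dense_n) | set(sparse_n)
    }
    # Sort by fused score (descending), breaking ties by node id.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy results: three nodes from the dense query, three from the sparse query.
dense_hits = {"a": 0.75, "b": 0.5, "c": 0.25}
sparse_hits = {"b": 12.0, "d": 9.0, "c": 6.0}
top = relative_score_fusion(dense_hits, sparse_hits, alpha=0.5, top_k=2)
print(top)
```

Here the two queries return six candidates between them (four distinct nodes), and only the two best fused scores survive, which mirrors how `sparse_top_k` sets the size of each candidate list while `similarity_top_k` sets the final cut.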
@@ -207,7 +207,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Lets compare to not using hyrbid search at all!"
+ "Lets compare to not using hybrid search at all!"
]
},
{
@@ -249,7 +249,7 @@
"source": [
"### Async Support\n",
"\n",
- "And of course, async queries are also supported (note that in-memory qdrant data is not shared between async and sync clients!)"
+ "And of course, async queries are also supported (note that in-memory Qdrant data is not shared between async and sync clients!)"
]
},
{
@@ -419,7 +419,7 @@
"source": [
"### Customizing `hybrid_fusion_fn()`\n",
"\n",
- "By default, when running hbyrid queries with qdrant, Relative Score Fusion is used to combine the nodes retrieved from both sparse and dense queries. \n",
+ "By default, when running hbyrid queries with Qdrant, Relative Score Fusion is used to combine the nodes retrieved from both sparse and dense queries. \n",
"\n",
"You can customize this function to be any other method (plain deduplication, Reciprocal Rank Fusion, etc.).\n",
"\n",
@@ -576,7 +576,7 @@
 },\n",
")\n",
"\n",
- "# enable hyrbid since we created a sparse collection\n",
+ "# enable hybrid since we created a sparse collection\n",
"vector_store = QdrantVectorStore(\n",
" collection_name=\"llama2_paper\", client=client, enable_hybrid=True\n",
")"
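Reviewer note on the `hybrid_fusion_fn()` hunk: the notebook text names Reciprocal Rank Fusion as one alternative to the default. The sketch below shows the usual RRF formula on plain lists of node ids. It is only an illustration: the function name, the constant `k=60`, and the toy rankings are assumptions, and the function does not follow the signature that `hybrid_fusion_fn` expects in LlamaIndex, so it would need adapting before being passed to `QdrantVectorStore`.

```python
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(
    rankings: List[List[str]],
    k: int = 60,     # smoothing constant; 60 is a commonly used value
    top_k: int = 2,  # number of fused results to keep
) -> List[Tuple[str, float]]:
    """Fuse several ranked id lists; each hit adds 1 / (k + rank) to its node."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list has rank 1.
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] = scores.get(node_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by node id.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy rankings, best hit first, standing in for dense and sparse results.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "d", "c"]
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking], k=60, top_k=2)
print(fused)
```

Because RRF uses only ranks, it sidesteps the problem that dense similarity scores and sparse scores live on different scales, at the cost of ignoring how large the score gaps are.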
