diff --git a/docs/examples/vector_stores/qdrant_hybrid.ipynb b/docs/examples/vector_stores/qdrant_hybrid.ipynb index 5c5cd089b66c0..52f72f87cf9e3 100644 --- a/docs/examples/vector_stores/qdrant_hybrid.ipynb +++ b/docs/examples/vector_stores/qdrant_hybrid.ipynb @@ -15,9 +15,9 @@ "\n", "Qdrant supports hybrid search by combining search results from `sparse` and `dense` vectors.\n", "\n", - "`dense` vectors are the ones you have probably already been using -- embedding models from OpenAI, BGE, SentenceTransformers, etc. are typicaly `dense` embedding models. They create a numerical representation of a piece of text, represented as a long list of numbers. These `dense` vectors can capture rich semantics across the entire piece of text.\n", + "`dense` vectors are the ones you have probably already been using -- embedding models from OpenAI, BGE, SentenceTransformers, etc. are typically `dense` embedding models. They create a numerical representation of a piece of text, represented as a long list of numbers. These `dense` vectors can capture rich semantics across the entire piece of text.\n", "\n", - "`sparse` vectors are slightly different. They use a specialized approach or model (TF-IDF, BM25, SPLADE, etc.) for generating vectors. These vectors are typically mostly zeros, making them `sparse` vectors. These `sparse` vectors are great at capturing specific keywords and similar small-details.\n", + "`sparse` vectors are slightly different. They use a specialized approach or model (TF-IDF, BM25, SPLADE, etc.) for generating vectors. These vectors are typically mostly zeros, making them `sparse` vectors. These `sparse` vectors are great at capturing specific keywords and similar small details.\n", "\n", "This notebook walks through setting up and customizing hybrid search with Qdrant and `naver/efficient-splade-VI-BT-large` variants from Huggingface." 
] @@ -140,9 +140,9 @@ "source": [ "## Hybrid Queries\n", "\n", - "When querying with hybrid mode, we can set `similarity_top_k` and `sparse_top_k` seperately.\n", + "When querying with hybrid mode, we can set `similarity_top_k` and `sparse_top_k` separately.\n", "\n", - "`sparse_top_k` represents how many nodes will be retrieved from each dense and sparse query. For example, if `sparse_top_k=5` is set, that means I will retrieve 5 nodes using sparse vectors and 5 nodes using dense vectoors.\n", + "`sparse_top_k` represents how many nodes will be retrieved from each dense and sparse query. For example, if `sparse_top_k=5` is set, that means I will retrieve 5 nodes using sparse vectors and 5 nodes using dense vectors.\n", "\n", "`similarity_top_k` controls the final number of returned nodes. In the above setting, we end up with 10 nodes. A fusion algorithm is applied to rank and order the nodes from different vector spaces ([relative score fusion](https://weaviate.io/blog/hybrid-search-fusion-algorithms#relative-score-fusion) in this case). `similarity_top_k=2` means the top two nodes after fusion are returned." ] @@ -207,7 +207,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Lets compare to not using hyrbid search at all!" + "Let's compare to not using hybrid search at all!" ] }, { @@ -249,7 +249,7 @@ "source": [ "### Async Support\n", "\n", - "And of course, async queries are also supported (note that in-memory qdrant data is not shared between async and sync clients!)" + "And of course, async queries are also supported (note that in-memory Qdrant data is not shared between async and sync clients!)" ] }, { @@ -419,7 +419,7 @@ "source": [ "### Customizing `hybrid_fusion_fn()`\n", "\n", - "By default, when running hbyrid queries with qdrant, Relative Score Fusion is used to combine the nodes retrieved from both sparse and dense queries. 
\n", + "By default, when running hybrid queries with Qdrant, Relative Score Fusion is used to combine the nodes retrieved from both sparse and dense queries. \n", "\n", "You can customize this function to be any other method (plain deduplication, Reciprocal Rank Fusion, etc.).\n", "\n", @@ -576,7 +576,7 @@ " },\n", ")\n", "\n", - "# enable hyrbid since we created a sparse collection\n", + "# enable hybrid since we created a sparse collection\n", "vector_store = QdrantVectorStore(\n", " collection_name=\"llama2_paper\", client=client, enable_hybrid=True\n", ")"
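
Reviewer note: the notebook text touched by this diff describes retrieving `sparse_top_k` nodes from each of the dense and sparse queries, fusing them, and returning the top `similarity_top_k`. A rough standalone sketch of that flow is given below. It assumes min-max normalization per result list and a weighting parameter `alpha`; the function name `relative_score_fusion`, the helper `_min_max`, and the `(node_id, score)` tuples are hypothetical illustrations and are not the LlamaIndex or Qdrant API, whose default implementation may differ in detail.

```python
from typing import Dict, List, Tuple

# Hypothetical result shape for illustration: (node_id, raw_score) pairs.
Scored = List[Tuple[str, float]]


def _min_max(results: Scored) -> Dict[str, float]:
    """Rescale one result list to [0, 1] so dense and sparse scores are comparable."""
    if not results:
        return {}
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every node as a full match.
        return {node_id: 1.0 for node_id, _ in results}
    return {node_id: (score - lo) / (hi - lo) for node_id, score in results}


def relative_score_fusion(
    dense: Scored, sparse: Scored, alpha: float = 0.5, top_k: int = 2
) -> Scored:
    """Fuse two ranked lists; alpha weights dense scores, (1 - alpha) sparse scores."""
    dense_norm = _min_max(dense)
    sparse_norm = _min_max(sparse)
    fused = {
        node_id: alpha * dense_norm.get(node_id, 0.0)
        + (1.0 - alpha) * sparse_norm.get(node_id, 0.0)
        for node_id in set(dense_norm) | set(sparse_norm)
    }
    # Sort by fused score (descending), breaking ties by node id for determinism.
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Toy inputs standing in for three nodes per query (i.e. sparse_top_k=3).
dense_hits = [("a", 4.0), ("b", 2.0), ("c", 0.0)]
sparse_hits = [("b", 12.0), ("d", 6.0), ("a", 3.0)]
top_nodes = relative_score_fusion(dense_hits, sparse_hits, alpha=0.5, top_k=2)
```

Here `top_k` plays the role of `similarity_top_k`: four distinct nodes enter the fusion step, and only the two with the highest combined score are kept. A node that appears in only one list contributes 0 from the other, which is one possible design choice among several.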