From 5d3a629894eb1e34c7eb9243cabe22e88ee9320f Mon Sep 17 00:00:00 2001
From: DarkLight1337
Date: Mon, 2 Dec 2024 16:44:26 +0000
Subject: [PATCH] Update

Signed-off-by: DarkLight1337
---
 docs/source/usage/compatibility_matrix.rst | 6 +++---
 docs/source/usage/pooling_models.rst       | 8 ++++++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/docs/source/usage/compatibility_matrix.rst b/docs/source/usage/compatibility_matrix.rst
index a93632ff36fb8..79ca27fb694eb 100644
--- a/docs/source/usage/compatibility_matrix.rst
+++ b/docs/source/usage/compatibility_matrix.rst
@@ -39,7 +39,7 @@ Feature x Feature
      - :abbr:`prmpt adptr (Prompt Adapter)`
      - :ref:`SD `
      - CUDA graph
-     - :abbr:`emd (Embedding Models)`
+     - :abbr:`pooling (Pooling Models)`
      - :abbr:`enc-dec (Encoder-Decoder Models)`
      - :abbr:`logP (Logprobs)`
      - :abbr:`prmpt logP (Prompt Logprobs)`
@@ -151,7 +151,7 @@ Feature x Feature
      -
      -
      -
-   * - :abbr:`emd (Embedding Models)`
+   * - :abbr:`pooling (Pooling Models)`
      - ✗
      - ✗
      - ✗
@@ -386,7 +386,7 @@ Feature x Hardware
      - ✅
      - ✗
      - ✅
-   * - :abbr:`emd (Embedding Models)`
+   * - :abbr:`pooling (Pooling Models)`
      - ✅
      - ✅
      - ✅
diff --git a/docs/source/usage/pooling_models.rst b/docs/source/usage/pooling_models.rst
index a2554d1b0eada..01b4e5fa5e353 100644
--- a/docs/source/usage/pooling_models.rst
+++ b/docs/source/usage/pooling_models.rst
@@ -3,7 +3,7 @@
 Using Pooling Models
 ====================
 
-vLLM provides second-class support for pooling models, including embedding, reranking and reward models.
+vLLM also supports pooling models, including embedding, reranking and reward models.
 
 In vLLM, pooling models implement the :class:`~vllm.model_executor.models.VllmModelForPooling` interface.
 These models use a :class:`~vllm.model_executor.layers.Pooler` to aggregate the final hidden states of the input
@@ -11,7 +11,11 @@ before returning them.
 
 Technically, any :ref:`generative model ` in vLLM can be converted into a pooling model
 by aggregating and returning the hidden states directly, skipping the generation step.
-Nevertheless, you should use those that are specifically trained as pooling models.
+Nevertheless, to get the best results, you should use pooling models that are specifically trained as such.
+
+We currently support pooling models primarily as a matter of convenience.
+As shown in the :code:`Compatibility Matrix `, most vLLM features are not applicable to
+pooling models as they only work on the generation or decode stage, so performance may not improve as much.
 
 Offline Inference
 -----------------
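
For context, below is a minimal sketch (not part of this patch) of the offline-inference flow that the updated "Offline Inference" section of ``pooling_models.rst`` goes on to describe. The model name, the ``task="embedding"`` argument, and the output field names are illustrative assumptions and may differ between vLLM versions.

.. code-block:: python

    from vllm import LLM

    # Load a model that was trained for embedding. The task argument asks vLLM
    # to run it as a pooling model rather than a generative one (the accepted
    # value may vary across vLLM versions).
    llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embedding")

    # encode() skips the generation step and returns the pooled hidden states,
    # one output per input prompt.
    outputs = llm.encode(["Hello, my name is", "The capital of France is"])

    for output in outputs:
        # Assumed output layout: a vector of floats per prompt.
        embedding = output.outputs.embedding
        print(len(embedding))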