diff --git a/docs/source/en/quantization/gptq.md b/docs/source/en/quantization/gptq.md
index ce6331fb01c412..5e53d643c07ce0 100644
--- a/docs/source/en/quantization/gptq.md
+++ b/docs/source/en/quantization/gptq.md
@@ -120,7 +120,22 @@ model = AutoModelForCausalLM.from_pretrained("{your_username}/opt-125m-gptq", de
 ```
 
 ## Marlin
 
-[Marlin]([https://github.com/turboderp/exllama](https://github.com/IST-DASLab/marlin)) is a CUDA gptq kernel, 4-bit only, that is highly optimized for the Nvidia A100 GPU (Ampere) architecture where the the loading, dequantization, and execution of post-dequantized weights are highly parallelized offering a substantial inference improvement versus the original CUDA gptq kernel. Marlin is only available for quantized inference and does support model quantization.
+[Marlin](https://github.com/IST-DASLab/marlin) is a 4-bit only CUDA GPTQ kernel, highly optimized for the NVIDIA A100 GPU (Ampere) architecture, where the loading, dequantization, and execution of post-dequantized weights are highly parallelized, offering a substantial inference improvement over the original CUDA GPTQ kernel. Marlin is only available for quantized inference and does not support model quantization.
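+
+For example, quantized inference with the Marlin kernel might look like the minimal sketch below, which assumes an AutoGPTQ version that exposes a `use_marlin` option (the checkpoint name reuses the placeholder from the example above):
+
+```python
+# Minimal sketch: load a pre-quantized 4-bit GPTQ checkpoint with the
+# Marlin kernel enabled; an Ampere GPU such as the A100 is required and
+# `use_marlin` is an assumed AutoGPTQ option, not a transformers API.
+from auto_gptq import AutoGPTQForCausalLM
+
+model = AutoGPTQForCausalLM.from_quantized(
+    "{your_username}/opt-125m-gptq",  # existing 4-bit GPTQ checkpoint
+    device="cuda:0",                  # Marlin is GPU-only
+    use_marlin=True,                  # assumed flag selecting the Marlin kernel
+)
+```
 
 ## ExLlama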