diff --git a/docs/chapter4.md b/docs/chapter4.md
index 0813b4a..768ac28 100644
--- a/docs/chapter4.md
+++ b/docs/chapter4.md
@@ -352,7 +352,7 @@ $$
 
 $$
 \begin{equation}
-|\Phi_\rho(x_1)-\Phi_\rho(x_2)|\leq|\Phi_\rho'(\xi)||x_1-x_2|
+|\Phi_\rho(x_1)-\Phi_\rho(x_2)| \leq |\Phi_\rho'(\xi)| |x_1-x_2|
 \end{equation}
 $$
 
diff --git a/docs/chapter5.md b/docs/chapter5.md
index b0eede4..340ca08 100644
--- a/docs/chapter5.md
+++ b/docs/chapter5.md
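@@ -98,7 +98,7 @@ $$
 
 ### Proof sketch
 
-First, we recall the notion of agnostic PAC learnability: for every distribution $\mathcal{D}$, if there exist a learning algorithm $\mathcal{L}$ and a polynomial function $poly(\cdot,\cdot,\cdot,\cdot)$ such that, for any $m\geq poly(1/\epsilon,1/\delta,size(\mathbf{x}),size(c))$, the hypothesis output by $\mathcal{L}$ satisfies:
+First, we recall the notion of agnostic PAC learnability: for every distribution $\mathcal{D}$, if there exist a learning algorithm $\mathfrak{L}$ and a polynomial function $poly(\cdot,\cdot,\cdot,\cdot)$ such that, for any $m\geq poly(1/\epsilon,1/\delta,size(\mathbf{x}),size(c))$, the hypothesis output by $\mathfrak{L}$ satisfies: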
 $$
 \begin{equation}
    	P\big(\mathbb{E}(h)-\min_{h'\in\mathcal{H}}\mathbb{E}(h')\leq\epsilon\big)\geq1-\delta
diff --git a/docs/notation.md b/docs/notation.md
new file mode 100644
index 0000000..62c2b4d
--- /dev/null
+++ b/docs/notation.md
@@ -0,0 +1,27 @@
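+# Main Notation
+
+$x$ scalar  
+$\mathbf{x}$ vector  
+$\mathbf{A}$ matrix  
+$\mathbf{I}$ identity matrix  
+$\mathcal{X}$ sample space or state space  
+$\mathcal{H}$ hypothesis space  
+$\mathcal{D}$ probability distribution  
+$D$ data sample (dataset)  
+$\mathbb{R}$ set of real numbers  
+$\mathbb{R}^+$ set of positive real numbers  
+$\mathfrak{L}$ learning algorithm  
+$(\cdot,\cdot,\cdot)$ row vector  
+$(\cdot;\cdot;\cdot)$ column vector  
+$(\cdot)^T$ transpose of a vector or matrix  
+$\{\cdots\}$ set  
+$[m]$ the set $\{1,\dots,m\}$  
+$|\{\cdots\}|$ number of elements in the set $\{\cdots\}$  
+$\|\cdot\|_p$ norm; the $L_2$ norm when $p$ is omitted  
+$P(\cdot)$, $P(\cdot|\cdot)$ probability mass function, conditional probability mass function  
+$p(\cdot)$, $p(\cdot|\cdot)$ probability density function, conditional probability density function  
+$E_{\cdot\sim\mathcal{D}}[f(\cdot)]$ expectation of the function $f(\cdot)$ with respect to $\cdot$ under the distribution $\mathcal{D}$; $\mathcal{D}$ and/or $\cdot$ are omitted when the meaning is clear  
+$\sup(\cdot)$ supremum  
+$\inf(\cdot)$ infimum  
+$\mathbb{I}(\cdot)$ indicator function, taking the values $1$ and $0$ when $\cdot$ is true and false, respectively  
+$\text{sign}(\cdot)$ sign function, taking the values $-1, 0, 1$ when $\cdot<0$, $=0$, $>0$, respectively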