forked from duty-machine/duty-machine
Showing 1 changed file with 15 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "Categorical Variables Can Have Correlations Too!" | ||
date: 2024-11-19T00:04:37Z | ||
draft: ["false"] | ||
tags: [ | ||
"fetched", | ||
"R语言生信医学统计与科研" | ||
] | ||
categories: ["Academic"] | ||
--- | ||
Categorical Variables Can Have Correlations Too! by R语言生信医学统计与科研 | ||
------ | ||
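<div><p>For categorical variables: if the variable is ordinal, the Wilcoxon rank-sum test can be used to infer differences in level between groups; if the variable is nominal (unordered), the chi-squared test or Fisher's exact test can be used to compare frequency distributions. These tests tell us whether the row and column variables of a contingency table are statistically associated (whether they significantly influence each other), but they do not give the strength or direction of the association.</p><p>Next, we go over the coefficients commonly used to measure the association between the row and column variables of a contingency table: (1) Phi coefficient: applies only to 2x2 tables and takes values in [-1, 1], where 1 indicates a strong positive association and -1 a strong negative association; (2) Pearson contingency coefficient: a correction and generalization of the phi coefficient that can be used for contingency tables larger than 2x2, with values in [0, 1) (its attainable maximum depends on the table dimensions); (3) Cramer's V coefficient: a correction of the Pearson contingency coefficient for tables whose numbers of rows and columns differ, <span>with values in [0, 1]</span><span>; this coefficient is not restricted by sample size.</span><br></p><p><span>In R, these can be computed with the vcd package. The lung dataset used below comes from the survival package.<br></span></p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="properties"><code><span><span>></span> <span>library(vcd)</span></span></code><code><span><span>></span> <span>library(survival)  # provides the lung dataset</span></span></code><code><span><span>></span> <span>tab <- table(lung$sex, lung$ph.ecog)</span></span></code><code><span><span>></span> <span>tab</span></span></code><code><span> </span></code><code><span> <span>0</span> <span>1 2 3</span></span></code><code><span> <span>1</span> <span>36 71 29 1</span></span></code><code><span> <span>2</span> <span>27 42 21 0</span></span></code><code><span><span>></span> <span>assocstats(tab)</span></span></code><code><span> <span>X^2</span> <span>df P(> X^2)</span></span></code><code><span><span>Likelihood</span> <span>Ratio 1.6863 3 0.63998</span></span></code><code><span><span>Pearson</span> <span>1.3341 3 0.72105</span></span></code><code><span><br></span></code><code><span><span>Phi-Coefficient</span> : <span>NA </span></span></code><code><span><span>Contingency</span> <span>Coeff.: 0.076 </span></span></code><code><span><span>Cramer's</span> <span>V : 0.077</span></span></code></pre></section><p>Interpretation:</p><ul><li><p><span>Likelihood Ratio: the likelihood-ratio chi-squared test</span></p></li><li><p><span>Pearson: the Pearson chi-squared test<br></span></p></li><li><p><span>Phi-Coefficient: applies only to 2x2 tables; it does not apply here, so the output is NA</span></p></li><li><p><span>Contingency Coeff.: the Pearson contingency coefficient</span></p></li><li><p><span>Cramer's V: the Cramer's V coefficient</span></p></li></ul><p><mp-style-type data-value="3"></mp-style-type></p></div> | ||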
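The post names the three coefficients without giving their formulas. As a rough sketch, the standard textbook definitions are written out below and checked against the printed output. The symbols are assumptions introduced here for illustration, not notation from the original post: $n$ is the table total, $n_{ij}$ the cell counts, $r$ and $c$ the numbers of rows and columns, and $\chi^2$ the Pearson statistic.

```latex
% Phi coefficient for a 2x2 table (signed form); its magnitude is sqrt(chi^2 / n)
\phi = \frac{n_{11} n_{22} - n_{12} n_{21}}
            {\sqrt{n_{1\cdot}\, n_{2\cdot}\, n_{\cdot 1}\, n_{\cdot 2}}},
\qquad |\phi| = \sqrt{\frac{\chi^2}{n}}

% Pearson contingency coefficient
C = \sqrt{\frac{\chi^2}{\chi^2 + n}}

% Cramer's V
V = \sqrt{\frac{\chi^2}{n \cdot \min(r - 1,\; c - 1)}}

% Check against the output above: n = 227, chi^2 = 1.3341, r = 2, c = 4
C = \sqrt{\frac{1.3341}{1.3341 + 227}} \approx 0.076,
\qquad
V = \sqrt{\frac{1.3341}{227 \cdot 1}} \approx 0.077
```

Both values agree with the `Contingency Coeff.` and `Cramer's V` lines that `assocstats(tab)` prints. They are close to 0, in line with the non-significant chi-squared tests.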
<hr> | ||
<a href="https://mp.weixin.qq.com/s/jbM1MO2LWmoxZCTEUPmvFQ" target="_blank" rel="noopener noreferrer">Original article</a> |