
Pearson's chi square test

rpietro edited this page Aug 10, 2012 · 1 revision


Input

Required variables to run this test

The most common use of this test is to check whether two categorical variables are independent of each other. For example, is there an association between gender (male/female) and having lung cancer (yes/no)? Data can be entered in two formats: as a table of cell frequencies, or as two individual-level categorical variables.

> numerators <- c(10, 30) # 10 and 30 are the number of women and men, respectively, with lung cancer in your sample
> denominators <- c(100, 110) # 100 and 110 are the number of women and men, respectively, without lung cancer in your sample (counts of non-cases, not group totals)
> table1 <- rbind(numerators, denominators) # combining numerators and denominators into a 2 x 2 table
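
The second input format, two individual-level variables, can be sketched as follows. The vectors `gender` and `lung_cancer` are hypothetical data, rebuilt here from the counts above purely for illustration; in practice they would be columns of your own dataset. `table2` is likewise an illustrative name.

```r
# Hypothetical individual-level data reconstructed from the counts above:
# 10 women and 30 men with lung cancer, 100 women and 110 men without.
gender <- c(rep("female", 10), rep("male", 30),
            rep("female", 100), rep("male", 110))
lung_cancer <- c(rep("yes", 40), rep("no", 210))

# Cross-tabulate the two variables into a 2 x 2 contingency table
table2 <- table(lung_cancer, gender)
print(table2)

# Passing the table or the two vectors should give the same test result
res_table <- chisq.test(table2)
res_vectors <- chisq.test(gender, lung_cancer)
```

Both forms describe the same 250 people, so either one can be passed to `chisq.test`.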

How to describe this test in your Methods section

"We used Pearson's chi-square test to measure the association between categorical variables, namely [add the variables]."

Output

Annotated output

> chisq.test(table1) #see previous section on how to format this table
> chisq.test(gender, lung_cancer) #alternative form if you have two categorical variables such as gender (male/female) and lung_cancer (yes/no)

	Pearson's Chi-squared test with Yates' continuity correction

data:  table1 
X-squared = 6.0889, df = 1, p-value = 0.0136 # the chi-square statistic, the number of degrees of freedom, and the p-value. A p-value this small is evidence against the null hypothesis that the two variables are independent, which suggests that they are associated.

In a manuscript, you would report that male gender and the presence of lung cancer were associated (p = 0.0136).
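
To see where the reported statistic comes from, the calculation can be reproduced by hand. This is a minimal sketch, assuming the same 2 x 2 counts as in the Input section; the names `observed`, `expected`, `x2_yates`, `x2_plain` and `p_yates` are illustrative only.

```r
# Same counts as table1: rows = with / without lung cancer, columns = women / men
observed <- rbind(c(10, 30), c(100, 110))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Yates' continuity correction subtracts 0.5 from each absolute difference
# (chisq.test applies it by default to 2 x 2 tables)
x2_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Uncorrected Pearson statistic, equivalent to chisq.test(..., correct = FALSE)
x2_plain <- sum((observed - expected)^2 / expected)

# p-value: upper tail of a chi-square distribution with 1 degree of freedom
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)

# Compare with the built-in test
res <- chisq.test(observed)
res_plain <- chisq.test(observed, correct = FALSE)
```

Here `x2_yates` and `p_yates` should match the X-squared and p-value printed above. The expected counts are also available as `res$expected`, which is useful for checking that none of them is very small.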

Annotated reference

  1. Full worked examples: http://goo.gl/qkYOs
  2. Brief explanation of the underlying theory: http://goo.gl/Avev5