Logistic Regression
Logistic regression is a type of regression used when the outcome variable is binary ("yes" or "no", "risk" or "no risk"); extensions of the method handle ordinal outcomes. It is commonly used to predict the probability that an event occurs, based on several predictor variables that may be numerical (continuous or discrete) or categorical (ordinal or nominal) (adapted from R-Bloggers http://goo.gl/FZAXA).
This analysis requires a dataset with a dichotomous outcome variable, stored as a factor or character vector, and one or more predictors (covariates). Predictors may be numerical (continuous or discrete) or categorical.
The original example refers to the Cedegren dataset reported by Manning (2007), available at http://goo.gl/KmJSD; the code below uses the anorexia dataset from the MASS package.
install.packages("MASS",repos=("http://cran.us.r-project.org"))
install.packages("verification",repos=("http://cran.us.r-project.org"))
library(MASS)
library(verification)
attach(anorexia)
str(anorexia)
head(anorexia)
tail(anorexia)
summary(anorexia)
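The anorexia data has no dichotomous variable of its own, so a logistic regression needs one to be derived first. The sketch below shows one possible way to do that. The variable name `gained` and the rule "post-treatment weight above pre-treatment weight" are assumptions made for illustration; they are not part of the original example.

```r
library(MASS)  # provides the anorexia data frame

# Work on a copy so the original data frame stays untouched.
dat <- anorexia

# Hypothetical binary outcome (an assumption for illustration):
# "yes" if the patient weighed more after treatment than before.
dat$gained <- factor(ifelse(dat$Postwt > dat$Prewt, "yes", "no"),
                     levels = c("no", "yes"))

str(dat)           # gained is a two-level factor; Prewt numeric; Treat factor
table(dat$gained)  # number of patients at each outcome level
```

With a factor outcome, glm treats the first level ("no") as failure and the other level as success.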
"We used the Bhapkar coefficient to measure concordance between raters."
Annotated output from function (example from http://goo.gl/NsR20)
#This function creates a glm object (here named 'anorex.1') that carries all the statistics for the regression analysis.
#Arguments are: outcome=Postwt; predictors=Prewt, Treat; family=type of model (see ?glm for more details); data=dataset used for the analysis.
#Note: Postwt is continuous, so this example fits a Gaussian model; a logistic regression needs a binary outcome and family = binomial.
anorex.1 <- glm(Postwt ~ Prewt + Treat,
                family = gaussian, data = anorexia)
#The summary function provides a summary of the regression analysis containing:
#1. model formula
#2. deviance residuals
#3. Coefficients with p-values for each predictor
#4. Deviances and degrees of freedom
#5. AIC parameter
summary(anorex.1)
#The anova function compares nested models; with a single model it tests each term as it is added sequentially, starting from the null model.
anova(anorex.1,test="Chisq")
#anova(anorex.1,anorex.2,test="Chisq") --- A variation with a second glm model (anorex.2)
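The commented variation above compares two models. A minimal sketch of that comparison for an actual logistic model follows, assuming a hypothetical 0/1 outcome `gained` derived from the anorexia data (weight after treatment greater than before). The names `fit0` and `fit1` are illustrative, not from the original.

```r
library(MASS)  # provides the anorexia data frame

dat <- anorexia
# Hypothetical binary outcome (assumption): 1 if the patient gained weight.
dat$gained <- as.integer(dat$Postwt > dat$Prewt)

# Two nested logistic models: family = binomial uses the logit link by default.
fit0 <- glm(gained ~ Prewt,         family = binomial, data = dat)
fit1 <- glm(gained ~ Prewt + Treat, family = binomial, data = dat)

summary(fit1)                      # coefficients are on the log-odds scale
anova(fit0, fit1, test = "Chisq")  # likelihood-ratio test for adding Treat
```

The chi-squared test here asks whether adding Treat reduces the deviance by more than chance would explain.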
#Analyse the model's adequacy and fit
pred_fig1 <- as.numeric(fitted(anorex.1))
#roc.area (package verification) expects a binary 0/1 outcome as its first argument, so it is only meaningful for a model with a binary outcome
roc.area(Postwt, pred_fig1)
#logistic.display (package epiDisplay) gives the OR coefficients as well as the 95% CI; it needs a glm fitted with family = binomial
#logistic.display(anorex.1)
residuals(anorex.1) # residuals
influence(anorex.1) # regression diagnostics
layout(matrix(c(1,2,3,4),2,2)) # splits the plotting area into a 2x2 grid (4 graphs/page)
plot(anorex.1) # generates the 4 diagnostic graphs
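Odds ratios and the area under the ROC curve can also be obtained with base R alone. The sketch below is one way to do so under stated assumptions: the binary outcome `gained` is hypothetical (weight after treatment greater than before), the confidence intervals are Wald intervals from `confint.default`, and the AUC is computed with the rank (Mann-Whitney) formula rather than with roc.area.

```r
library(MASS)  # provides the anorexia data frame

dat <- anorexia
# Hypothetical binary outcome (assumption): 1 if the patient gained weight.
dat$gained <- as.integer(dat$Postwt > dat$Prewt)
fit <- glm(gained ~ Prewt + Treat, family = binomial, data = dat)

# Odds ratios: exponentiate the log-odds coefficients and their Wald 95% CIs.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))

# AUC via the rank formula: the probability that a randomly chosen
# patient with gained == 1 has a higher fitted probability than one with 0.
p   <- fitted(fit)
n1  <- sum(dat$gained == 1)
n0  <- sum(dat$gained == 0)
auc <- (sum(rank(p)[dat$gained == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc
```

An odds ratio above 1 indicates higher odds of the event for a one-unit increase in the predictor (or relative to the reference treatment group); an AUC near 0.5 indicates little discrimination.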
- Original function description at http://rss.acs.unt.edu/Rdoc/library/irr/html/bhapkar.html
- Brief explanation at http://www.john-uebersax.com/stat/mcnemar.htm#bhapkar
- Original article: Bhapkar, V.P. (1966). A note on the equivalence of two test criteria for hypotheses in categorical data. Journal of the American Statistical Association, 61, 228-235. http://goo.gl/P21a3