
Multinomial logistic regression #189

Open
w4rner opened this issue Feb 14, 2018 · 3 comments

@w4rner

w4rner commented Feb 14, 2018

I noticed that in the notebook GLMest.ipynb you don't add a constant column to X. Are you not supposed to?

@khan1792

With sklearn's LogisticRegression, you don't need to add a constant: its default setting is `fit_intercept=True`. Only with the statsmodels package must you add it yourself.
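For instance (a minimal, self-contained sketch with toy data; the notebook's actual `X` and `y` would go in the same places):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

# Toy data standing in for the notebook's X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + rng.normal(size=100) > 0).astype(int)

# sklearn: fit_intercept=True by default, so no constant column is needed.
sk_fit = LogisticRegression().fit(X, y)
print(sk_fit.intercept_, sk_fit.coef_)

# statsmodels: the design matrix is taken as given, so the constant
# column must be added explicitly.
sm_fit = sm.Logit(y, sm.add_constant(X)).fit()
print(sm_fit.params)  # the first entry ("const") is the intercept
```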

@w4rner

w4rner commented Feb 23, 2018

Thanks Kanyao!
I manually added a constant column, and my result was beta 0 = -8.459980007874217e-06.
I guess the default fit_intercept=True explains why it's next to zero. Although I'm surprised the fit even worked, given the multicollinearity! Any thoughts?
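Something like this, roughly (a sketch; `X` and `y` stand in for the notebook's data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Prepend a column of ones to the existing feature matrix.
X_const = np.column_stack([np.ones(len(X)), X])

result = LogisticRegression().fit(X_const, y)
print(result.coef_[0, 0])  # coefficient on the constant column: ~ -8.46e-06
```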

@khan1792

In sklearn, you can use result.intercept_ to look at the real intercept (even when you add the constant column). I think that when you add a constant column manually, it becomes a variable whose value is constantly 1, and it provides no information if you already have an intercept. In logistic regression, this kind of variable serves as an offset that adjusts the intercept, but the coefficients stay the same whether or not you add it. I know that in oversampling methods this kind of variable can be used to correct the intercept (although there the exact constant value is calculated in a particular way).

By this logic, I speculate that the real intercept (from a model with no constant column) equals the biased intercept in your model plus beta 0. Since beta 0 is very small, the biased intercept is very close to the real intercept. You can check my speculation in your code (don't forget to use result.intercept_ to look at the intercept even when you add the constant column).
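A quick way to check this (a sketch; `X` and `y` again stand in for the notebook's data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fit without the constant column; fit_intercept=True supplies the intercept.
fit_plain = LogisticRegression().fit(X, y)

# Fit with a manually prepended column of ones.
X_const = np.column_stack([np.ones(len(X)), X])
fit_const = LogisticRegression().fit(X_const, y)

beta0 = fit_const.coef_[0, 0]  # coefficient on the constant column, near zero

# The effective intercept of the second model is intercept_ + beta0,
# so the two printed values should be nearly identical.
print(fit_plain.intercept_)
print(fit_const.intercept_ + beta0)
```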
