how to deal with the error in using jstable::TableSubgroupMultiGLM( ) for subgroup analysis of multiple logistic regression #6

Closed
agnesisyss opened this issue Sep 29, 2023 · 13 comments

Comments

@agnesisyss

Hi Kim, I am Agnes.
Thanks for your useful package.
I am new to R, and I am using jstable::TableSubgroupMultiGLM() from this package for subgroup analysis of multiple logistic regression.
Below is my code:
data.design <- svydesign(id = ~sdmvpsu,
                         weights = ~nhs_wt,
                         data = A)
TableSubgroupMultiGLM(Y ~ X, var_subgroups = c("sex", "eth", "maritalstatus"), # both Y and X are categorical variables
                      data = data.design,
                      family = "gaussian",
                      line = TRUE)

Then it reports this error:
Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, :
NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
2: In Ops.factor(eta, offset) : ‘-’ not meaningful for factors
3: In Ops.factor(y, mu) : ‘-’ not meaningful for factors

Would you please tell me how to deal with the error above? Thanks a lot.

@jinseob2kim
Owner

Can you convert the Y variable to integer or numeric (0/1)?

Please show me the Y variable or your dataset.
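
A minimal sketch of the conversion suggested above, assuming Y is stored as a two-level factor. The data frame `d`, its columns, and the level labels are made up for illustration and are not taken from the attached data.

```r
# Hypothetical toy data: Y stored as a two-level factor with text labels.
d <- data.frame(Y = factor(c("no", "yes", "yes", "no", "yes")))

# as.numeric() on a factor returns the internal level codes (1, 2), not 0/1,
# so compare against the "event" level instead.
d$Y01 <- as.integer(d$Y == "yes")

# If the factor labels are already "0" and "1", go through as.character().
d$Z <- factor(c("0", "1", "1", "0", "1"))
d$Z01 <- as.integer(as.character(d$Z))

# Cross-tabulate to confirm the recoding.
print(table(d$Y, d$Y01))
```

With a 0/1 outcome, family = "binomial" (rather than "gaussian") is the usual choice for logistic regression.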

@agnesisyss
Author

Hi Kim,
Thanks a lot for your reply. I have also written an e-mail ([email protected]) to you, but I am not sure whether you received it.
The attachment is my data for analysis.
Y is the dependent variable, and X, age, sex, race, education, and poverty are the independent variables.

Thanks a lot,
Sisi

recode.csv

@jinseob2kim
Owner

Sorry, my function currently does not support an X variable with 3 categories; only 2 categories work. The code below is an example with "Y ~ sex":

library(survival);library(jstable);library(survey);library(data.table);library(magrittr)

a <- fread("recode.csv")
for (v in c("sex", "race","education")){
  a[[v]] <- factor(a[[v]])
}

data.design <- svydesign(id = ~sdmvpsu, weights = ~nhs_wt, data = a)
TableSubgroupMultiGLM(Y~sex, var_subgroups = c("race","education"), data = data.design, family = "binomial") 
                      
                

@ltj-github

Hi, I'm using this code and I'm getting:
data.design <- svydesign(id = ~sdmvpsu, weights = ~nhs_wt, data = a)

TableSubgroupMultiGLM(Y~sex, var_subgroups = c("race","education"), data = data.design, family = "binomial")
Error in purrr::map():
ℹ In index: 1.
Caused by error in solve.default():
! The system is computationally singular: the inverse condition number=3.26124e-19
Run rlang::last_trace() to see where the error occurred.

@jinseob2kim
Owner

I think this is a convergence issue. Can you try each subgroup analysis without my package?

Ex) svyglm(Y~sex, design = subset(a, race==1), family = "quasibinomial")

@ltj-github

When I run svyglm(Y~sex, design = subset(a, race==1), family = "quasibinomial"), I get this error: Error in UseMethod("svyglm", design) : no applicable method for 'svyglm' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')". However, when I pass the weighted design object "data.design" instead of the data frame, I get a result. But "TableSubgroupMultiGLM" still does not work.
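
The UseMethod error above is what svyglm() gives when `design` is a plain data frame rather than a survey design object. A minimal sketch of subsetting the design itself, on simulated data: every name here (`d`, `des`, `psu`, `wt`) is illustrative and not a column from the thread's files, and the block only runs if the survey package is installed.

```r
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)

  # Simulated data; names and values are made up for illustration.
  set.seed(1)
  n <- 400
  d <- data.frame(
    psu  = rep(1:20, each = 20),            # 20 clusters of 20 rows each
    wt   = runif(n, 0.5, 2),                # sampling weights
    sex  = factor(sample(1:2, n, TRUE)),
    race = factor(sample(0:1, n, TRUE)),
    Y    = rbinom(n, 1, 0.3)                # 0/1 outcome
  )

  des <- svydesign(id = ~psu, weights = ~wt, data = d)

  # Subset the design object, not the data frame, so that svyglm()
  # receives an object that still carries the cluster and weight information.
  fit <- svyglm(Y ~ sex, design = subset(des, race == 1),
                family = quasibinomial())
  print(coef(fit))
}
```

Subsetting the design rather than the raw data also keeps the design information available for variance estimation in the subpopulation.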

@jinseob2kim
Owner

Can you get results for all subgroup variable/value combinations? If you share your data, I will run it.

@ltj-github

At present, my categorical variables are labeled in English, because I usually use the gtsummary package. I will convert the variables to 0/1 coding, which may be more convenient for you.

@jinseob2kim
Owner

Can you try the other subgroup analyses?

svyglm(Y~sex, design = subset(a, race==0), family = "quasibinomial")
svyglm(Y~sex, design = subset(a, sex==0), family = "quasibinomial")
svyglm(Y~sex, design = subset(a, sex==1), family = "quasibinomial")
svyglm(Y~sex, design = subset(a, education==0), family = "quasibinomial")
svyglm(Y~sex, design = subset(a, education==1), family = "quasibinomial")

@ltj-github

data.csv
Here's my data.

@ltj-github

These are the results of trying; only the last call returns a result.

a <- fread("recode.csv")
for (v in c("sex", "race", "education")) {
  a[[v]] <- factor(a[[v]])
}

a <- svydesign(id = ~sdmvpsu, weights = ~nhs_wt, data = a)
svyglm(Y~sex, design = subset(a, race==0), family = "quasibinomial")
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
svyglm(Y~sex, design = subset(a, sex==0), family = "quasibinomial")
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
svyglm(Y~sex, design = subset(a, sex==1), family = "quasibinomial")
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
svyglm(Y~sex, design = subset(a, education==0), family = "quasibinomial")
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
svyglm(Y~sex, design = subset(a, education==1), family = "quasibinomial")
1 - level Cluster Sampling design (with replacement)
With (3) clusters.
subset(a, education == 1)

Call: svyglm(formula = Y ~ sex, design = subset(a, education == 1),
family = "quasibinomial")

Coefficients:
(Intercept) sex2
-1.9718 -0.5729

Degrees of Freedom: 1823 Total (i.e. Null); 1 Residual
Null Deviance: 1143
Residual Deviance: 1131 AIC: NA
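
The "contrasts can be applied only to factors with 2 or more levels" error usually means that a factor in the formula has only one observed level in the subset being fitted. Subsetting on sex while modelling Y ~ sex is one such case. A quick check before fitting might look like the sketch below, where the toy data frame `d` and the helper `levels_in_subgroups()` are hypothetical and only illustrate the idea.

```r
# Hypothetical toy data; names mirror the thread but the values are made up.
d <- data.frame(
  sex       = factor(c(1, 1, 2, 2, 1, 2, 1, 1)),
  education = factor(c(0, 0, 0, 0, 1, 1, 1, 1)),
  race      = factor(c(1, 1, 1, 1, 1, 1, 2, 2))
)

# For each value of the subgroup variable `by`, count how many distinct
# values of the exposure `x` are actually observed in that subgroup.
levels_in_subgroups <- function(data, x, by) {
  sapply(split(data[[x]], data[[by]]), function(v) length(unique(v)))
}

print(levels_in_subgroups(d, "sex", "education"))  # both subgroups see 2 levels
print(levels_in_subgroups(d, "sex", "race"))       # race == 2 sees only one sex
print(levels_in_subgroups(d, "sex", "sex"))        # subsetting on sex leaves 1 level
```

Any subgroup reporting fewer than 2 observed levels cannot support a Y ~ sex fit. A subgroup value that does not occur in the data at all will not appear in the output.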

@jinseob2kim
Owner

I ran the code below with your data:

library(survey);library(data.table);library(magrittr)

a <- fread("data (8).csv")
for (v in c("Sex", "Race","education.attainment")){
  a[[v]] <- factor(a[[v]])
}

svyglm(Y~Sex, design = subset(data.design, Race == 2), family = quasibinomial()) %>% summary

The P values can't be calculated, so the interaction P value can't be calculated either.

Call:
svyglm(formula = Y ~ Sex, design = subset(data.design, Race == 
    2), family = quasibinomial())

Survey design:
subset(data.design, Race == 2)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.04358    0.13378 -15.275      NaN
Sex2         0.59439    0.08481   7.008      NaN

Zero or negative residual df; p-values not defined

(Dispersion parameter for quasibinomial family taken to be 1.002519)

Number of Fisher Scoring iterations: 4
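
The "Zero or negative residual df" message appears to follow from how few clusters remain in the subset. The sketch below assumes the residual degrees of freedom are computed as (number of PSUs - number of strata) + 1 - number of coefficients. That rule is an assumption about the survey package, but it reproduces the "1 Residual" reported earlier in the thread for the 3-cluster education == 1 subset. The helper name `svyglm_resid_df()` and the 2-cluster case are hypothetical.

```r
# Assumed rule, not confirmed against the survey package source:
# residual df = (number of PSUs - number of strata) + 1 - number of coefficients
svyglm_resid_df <- function(n_psu, n_strata = 1, n_coef = 2) {
  (n_psu - n_strata) + 1 - n_coef
}

# 3 clusters, unstratified, intercept + one slope: 1 residual df,
# matching the "1 Residual" line in the education == 1 output.
print(svyglm_resid_df(n_psu = 3))

# Hypothetical subset with only 2 clusters left: 0 residual df,
# so t-based p-values are undefined.
print(svyglm_resid_df(n_psu = 2))
```

On a real design object, survey::degf() reports the design degrees of freedom, which can be checked for each subset before fitting.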

@ltj-github

Thank you for your response. I saw a related article that seems to be about issues with the survey package. The author of the article contacted Professor Thomas, the author of the survey package, and he said that the anova.svyglm function needs to be rewritten in order to work properly. The professor mentioned that the next version may improve this.
