I am currently using the DoWhy library for a causal analysis in which my treatment is continuous and my outcome is binary. To estimate the effect I used two methods: logistic regression through GLM, and DML. The code snippets for both are included further down.
Could you elaborate on how to interpret the ATE for both of these methods? For example, if my ATE is -0.02, is it okay to say that 'Increasing treatment from 0.1 to 0.2 leads to a 3% decrease in the outcome'?
Is the ATE returned by the logistic regression model actually the coefficient of the treatment variable in the fitted model? Also, when I compare the coefficient returned by estimate.estimator.model for the GLM estimator with the mean estimate returned by estimate_effect(), they are drastically different. Is that expected behavior?
I wanted to create a sort of dose-response curve to see the effect of a change in treatment on my outcome. For example, how my outcome changes when I increase my treatment from 0.1-0.2, 0.2-0.3, 0.3-0.4, etc. To accomplish this, I changed the values of the control_value and treatment_value parameters in estimate_effect(). When I do so, I get a different ATE for each interval with logistic regression but the same ATE with DML. Why is that?
How does DoWhy work behind the scenes to calculate the ATE when control_value and treatment_value are provided? I am especially interested in the GLM and DML effect estimation methods.
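To make the dose-response sweep described above concrete, here is a minimal plain-Python sketch. It does not call DoWhy or EconML. It rests on two assumptions that the post does not confirm: (a) the GLM estimate is the average difference in predicted probability when every unit's treatment is set to treatment_value versus control_value, and (b) the DML final stage is linear in the treatment, so a contrast reduces to theta * (treatment_value - control_value). All function names, coefficients, and offsets below are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_contrast(t0, t1, beta_t, offsets):
    """Assumed GLM-style contrast: average change in predicted probability
    when treatment is set to t1 instead of t0 for every unit.
    `offsets` stands in for each unit's covariate contribution (hypothetical)."""
    diffs = [sigmoid(b + beta_t * t1) - sigmoid(b + beta_t * t0) for b in offsets]
    return sum(diffs) / len(diffs)

def linear_contrast(t0, t1, theta):
    """Assumed DML-style contrast with a final stage linear in treatment."""
    return theta * (t1 - t0)

def sweep(contrast_fn, grid):
    """Evaluate a contrast on each adjacent pair of grid points."""
    return [(t0, t1, contrast_fn(t0, t1)) for t0, t1 in zip(grid[:-1], grid[1:])]

grid = [0.1, 0.2, 0.3, 0.4]
offsets = [-1.0, 0.0, 0.5, 2.0]  # hypothetical per-unit covariate terms
beta_t = -3.0                    # hypothetical logistic treatment coefficient
theta = -0.2                     # hypothetical linear marginal effect

logit_curve = sweep(lambda a, b: logistic_contrast(a, b, beta_t, offsets), grid)
linear_curve = sweep(lambda a, b: linear_contrast(a, b, theta), grid)

for (t0, t1, g), (_, _, d) in zip(logit_curve, linear_curve):
    print(f"{t0:.1f} -> {t1:.1f}: logistic {g:+.4f}, linear {d:+.4f}")
```

Under these assumptions the linear contrast is identical for every equal-width interval, while the logistic contrast changes from interval to interval and is on a different scale from the raw coefficient beta_t. Whether this matches what the library actually computes is exactly what the questions above are asking.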
Logistic Regression
import statsmodels.api as sm

# `model` is the CausalModel and `est_ident` the identified estimand from earlier steps
estimate = model.estimate_effect(
    est_ident,
    method_name="backdoor.generalized_linear_model",
    test_significance=True,
    method_params={
        "num_null_simulations": 20,
        "num_simulations": 20,
        "num_quantiles_to_discretize_cont_cols": 10,
        "fit_method": "statsmodels",
        "glm_family": sm.families.Binomial(),  # logistic regression
        "need_conditional_estimates": False,
    },
    control_value=0.2,
    treatment_value=0.3,
)
print(estimate)
DML
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import PolynomialFeatures

dml_estimate = model.estimate_effect(
    est_ident,
    method_name="backdoor.econml.dml.DML",
    control_value=0.1,
    treatment_value=0.2,
    confidence_intervals=False,
    method_params={
        "init_params": {
            "model_y": GradientBoostingClassifier(random_state=101),
            "model_t": GradientBoostingRegressor(random_state=101),
            "model_final": LassoCV(random_state=101),
            "featurizer": PolynomialFeatures(degree=1, include_bias=True),
            "discrete_treatment": False,
            "random_state": 101,
        },
        "fit_params": {},
    },
)
print(dml_estimate)
Looking forward to hearing back on this. TIA!