Option to store nuisance parameter objects #1473

Open
MCKnaus opened this issue Nov 20, 2024 · 2 comments

MCKnaus commented Nov 20, 2024

This is not a bug report but a suggestion for a feature.

Currently, the fitted forest objects include the estimated nuisance parameters Y.hat, W.hat ... However, the underlying regression_forest objects are gone for good (at least as far as I can see).

For a new paper that extracts the outcome weights of the point estimates (https://arxiv.org/abs/2411.11559), I need the get_forest_weights output of the forest behind Y.hat. My current workaround is to estimate Y.hat externally so that I keep its forest object (see this notebook for an illustration of why and how).

I fully understand that storing nuisance parameter (NP) objects by default is unattractive because of the memory cost. However, a store_nuisance_parameters option would be quite useful for my purposes, ideally with control over which of the multiple NP objects are saved.

Just FYI that there would be a consumer of such a feature.

Thank you in any case for your great work!

Member

erikcs commented Nov 24, 2024

Hi @MCKnaus, thank you very much for the suggestion (and also for the interesting reference). For cases with custom outcome/propensity models, the intended design was to construct those models outside. If you need to carry those objects with you in downstream tasks, then maybe a very simple option could be to just store them in the returned causal forest object? Like my.forest = causal_forest(X,Y,W,Y.hat=...); my.forest$Y.model = Y.model, then whenever you need access to that causal forest's Y.model you'd just access my.forest$Y.model.
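The workaround described above could look roughly like the following. This is a minimal sketch on simulated data, not code from the thread: the data-generating lines and the variable name `S` are assumptions made for illustration, while `regression_forest`, `causal_forest`, `predict` and `get_forest_weights` are existing grf functions.

```r
library(grf)

# Simulated data, purely illustrative (not from the thread).
set.seed(42)
n <- 200
p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
Y <- pmax(X[, 1], 0) * W + X[, 2] + rnorm(n)

# Fit the outcome model outside causal_forest so the forest object is kept.
Y.model <- regression_forest(X, Y)
Y.hat <- predict(Y.model)$predictions  # out-of-bag predictions

# Pass the external estimates in, then attach the model to the returned object.
my.forest <- causal_forest(X, Y, W, Y.hat = Y.hat)
my.forest$Y.model <- Y.model

# Later, wherever the causal forest is available, recover the
# out-of-bag forest weights of the outcome model.
S <- get_forest_weights(my.forest$Y.model)
```

Attaching the model as an extra list element leaves the rest of the causal forest object unchanged, at the cost of keeping a second forest in memory.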

Author

MCKnaus commented Nov 25, 2024

Thank you for the suggestion. I will add this as an option when revising the OutcomeWeights package, so that users do not need to store the smoother matrix explicitly. It would also be immediately compatible if you decide to include this feature and use Y.model as the label. In the best case, it eventually boils down to running

my.cf = causal_forest(X,Y,W, store_nuisance_parameter ="Y")
omega = get_outcome_weights(my.cf)

where get_outcome_weights() calls my.cf$Y.model internally.
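Until such an option exists, the internal lookup could be sketched as below. `get_y_smoother` is a hypothetical helper invented for this illustration. It is not the OutcomeWeights implementation, and it assumes the outcome forest was attached manually under the label `Y.model`. The simulated data are likewise assumptions.

```r
library(grf)

# Hypothetical helper (not part of grf or OutcomeWeights): return the forest
# weights of the outcome model attached to a causal forest as `Y.model`.
get_y_smoother <- function(cf) {
  Y.model <- cf[["Y.model"]]  # exact-name lookup, no partial matching
  if (is.null(Y.model)) {
    stop("No Y.model attached; fit the outcome forest externally and attach it.")
  }
  get_forest_weights(Y.model)
}

# Simulated data, purely illustrative.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)
W <- rbinom(n, 1, 0.5)
Y <- X[, 1] * W + rnorm(n)

# External outcome model, passed in and then attached by hand.
Y.model <- regression_forest(X, Y)
my.cf <- causal_forest(X, Y, W, Y.hat = predict(Y.model)$predictions)
my.cf$Y.model <- Y.model

omega.Y <- get_y_smoother(my.cf)
```

A causal forest without an attached `Y.model` makes the helper stop with an error rather than silently refitting the outcome model.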

But feel free to ignore.
