stratify and stratify_s methods exclude units with propensity score equal to zero #4

Open
diego-mazon opened this issue Oct 20, 2018 · 1 comment


diego-mazon commented Oct 20, 2018

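Both stratify() and stratify_s() drop every unit whose estimated propensity score is exactly zero. This happens regardless of trimming. The script and output below reproduce it: 446 of the 453 controls have a fitted score of 0.0, and only the remaining 7 end up in a stratum.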

# Python 3 syntax. `causal` is an already-constructed model object;
# its setup is not shown in this report.
causal.reset()
print(causal.summary_stats)
print("There are 453 control units and 547 treatment units \n")

# Estimate propensity scores and count the exact zeros.
causal.est_propensity_s()
score = causal.propensity['fitted']
print("minimum propensity score = ", score.min())
print("number of units with null propensity score = ", len(score[score == 0]))
print("\n")

print("stratify using implemented algorithm")
causal.stratify_s()
print("blocks", causal.blocks)
print(causal.strata)
print("Why 7 controls only?!")
print("\n")

print("stratify using a single segment [0, 1]")
causal.stratify()
print("blocks", causal.blocks)
print(causal.strata)
print("Why 7 controls only?!")
print("\n")

print("Replace null scores with 1e-100")
# In-place edit: `score` is the array the model stores, so stratify_s() sees the change.
score[score == 0] = 1e-100
print("minimum propensity score = ", score.min())
causal.stratify_s()
print("blocks", causal.blocks)
print(causal.strata)
print("453 controls as should be")

Summary Statistics

                   Controls (N_c=453)         Treated (N_t=547)             
   Variable         Mean         S.d.         Mean         S.d.     Raw-diff

          Y        0.031        1.389        6.097        1.287        6.066

                   Controls (N_c=453)         Treated (N_t=547)             
   Variable         Mean         S.d.         Mean         S.d.     Nor-diff

         X0       -0.021        0.976        7.004        0.980        7.184
         X1       -0.059        1.038       -0.054        0.993        0.004
         X2        0.506        0.501        0.497        0.500       -0.017
         X3        0.552        9.884        0.592       10.317        0.004
         X4        1.119        1.019        0.956        1.023       -0.160

There are 453 control units and 547 treatment units

minimum propensity score = 0.0
number of units with null propensity score = 446

stratify using implemented algorithm
blocks [0, 1]

Stratification Summary

            Propensity Score         Sample Size     Ave. Propensity   Outcome
 Stratum      Min.      Max.  Controls   Treated  Controls   Treated  Raw-diff

       1     0.000     1.000         7       547     0.000     1.000     2.984

Why 7 controls only?!

stratify using a single segment [0, 1]
blocks [0, 1]

Stratification Summary

            Propensity Score         Sample Size     Ave. Propensity   Outcome
 Stratum      Min.      Max.  Controls   Treated  Controls   Treated  Raw-diff

       1     0.000     1.000         7       547     0.000     1.000     2.984

Why 7 controls only?!

Replace null scores with 1e-100
minimum propensity score = 1e-100
blocks [0, 1]

Stratification Summary

            Propensity Score         Sample Size     Ave. Propensity   Outcome
 Stratum      Min.      Max.  Controls   Treated  Controls   Treated  Raw-diff

       1     0.000     1.000       453       547     0.000     1.000     6.066

453 controls as should be
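
The counts in the output above are consistent with a membership test that excludes the lower bound: 453 controls minus 446 zero-score controls leaves 7. The following is a minimal sketch of that arithmetic on synthetic data, not the library's code. The score values (1e-8 for the remaining controls, exactly 1.0 for treated units) and the names `is_control`, `controls_kept` and `controls_after` are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for causal.propensity['fitted']; values are made up.
n_zero, n_small, n_treated = 446, 7, 547
pscore = np.concatenate([
    np.zeros(n_zero),        # controls whose fitted score is exactly 0.0
    np.full(n_small, 1e-8),  # remaining controls, tiny positive score
    np.ones(n_treated),      # treated units
])
is_control = np.arange(pscore.size) < n_zero + n_small

# Assumed membership test: lower bound exclusive, upper bound inclusive.
in_block = (0 < pscore) & (pscore <= 1)
controls_kept = int((in_block & is_control).sum())
print(controls_kept)  # 7

# Workaround from the report: nudge exact zeros to a tiny positive value.
pscore[pscore == 0] = 1e-100
in_block = (0 < pscore) & (pscore <= 1)
controls_after = int((in_block & is_control).sum())
print(controls_after)  # 453
```

The workaround works because 1e-100 is representable in float64 and compares strictly greater than zero, so the nudged units pass the exclusive lower bound.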

@diego-mazon
Author

I would fix it by modifying the subset function in causal.py:

def subset(p_low, p_high):
    # Treat the first non-single-point segment separately so that
    # units with a vanishing pscore are included in it: [0, p_high].
    if p_low == 0 and p_high != 0:
        return (p_low <= pscore) & (pscore <= p_high)
    else:
        # All other segments stay half-open: (p_low, p_high].
        return (p_low < pscore) & (pscore <= p_high)

I think that units with pscore = 0 should be included in this first segment of positive width, [0, p_high], rather than in a degenerate (0, 0] or [0, 0] segment.
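
To check the proposed behaviour in isolation, here is a small standalone sketch. It differs from the snippet above in one respect: `pscore` is passed as an explicit argument instead of being read from the enclosing scope. That signature, and the example scores, are assumptions for illustration only.

```python
import numpy as np

def subset(pscore, p_low, p_high):
    """Boolean mask of units in the segment bounded by p_low and p_high.

    Hypothetical standalone variant: pscore is an explicit argument here.
    """
    if p_low == 0 and p_high != 0:
        # First segment with positive width: closed on the left, [0, p_high].
        return (p_low <= pscore) & (pscore <= p_high)
    # Every other segment stays half-open, (p_low, p_high].
    return (p_low < pscore) & (pscore <= p_high)

pscore = np.array([0.0, 0.0, 0.2, 0.5, 0.5, 1.0])
first = subset(pscore, 0, 0.5)   # segment [0, 0.5]
second = subset(pscore, 0.5, 1)  # segment (0.5, 1]
print(first.tolist())   # [True, True, True, True, True, False]
print(second.tolist())  # [False, False, False, False, False, True]
```

With the closed lower bound on the first segment, the two zero-score units are assigned to it, and every unit still lands in exactly one segment.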
