Ch3 re write - combining research design and randomization #448

Merged 97 commits on Jul 8, 2020
Commits
872067c
[ch3] add luizas skeleton headers
kbjarkefur May 6, 2020
f046885
[ch3] new section and subsection titles
kbjarkefur May 6, 2020
8bef870
[ch3] move sampling/assignment intuition sections
kbjarkefur May 6, 2020
bbbf712
[research design] firs kulling of darlings
kbjarkefur May 15, 2020
1ea35fa
[reserach design] second kulling
kbjarkefur May 18, 2020
5c55ced
[ch3] intro to designs impact on data
kbjarkefur May 20, 2020
6e44869
[ch3] experimental
kbjarkefur May 20, 2020
7a40bd2
[ch3] quasi, RD, and IV
kbjarkefur May 20, 2020
d8323b8
[ch3] matching
kbjarkefur May 20, 2020
ade8b4b
[ch3] cross sectional vs longitudinal data
kbjarkefur May 20, 2020
9019c54
Merge pull request #446 from worldbank/join-design-framework
kbjarkefur May 20, 2020
694240e
[ch3] combine design and measurement framework
kbjarkefur May 20, 2020
274ccfc
[ch3] new titles
kbjarkefur May 20, 2020
b083c94
[ch3] data map first draft
kbjarkefur May 20, 2020
42d394f
[ch3] first draft new intro to samprand section
kbjarkefur May 22, 2020
3baf72f
[ch3] monitor data
kbjarkefur May 22, 2020
1ea7916
[ch3] masterdata in data map
kbjarkefur May 22, 2020
e8d7c98
[ch3] great info to go on the wiki
kbjarkefur May 22, 2020
21c72ec
[ch3] proof read edits
kbjarkefur May 22, 2020
b37a0e5
[ch3] BBD proof + edits
bbdaniels May 22, 2020
4f8561d
[CH3] - intro not ready
kbjarkefur May 22, 2020
98d0cd1
Update sampling-randomization-power.tex
mariaruth May 27, 2020
1bc431c
Update sampling-randomization-power.tex
mariaruth May 27, 2020
5e5e9fb
[ch3] name section referred to
kbjarkefur May 28, 2020
ae81ecc
remove master in master list, as we compare to master data below
kbjarkefur May 28, 2020
26cea36
delete sentence about stats consequence of ex post sample edit
kbjarkefur May 28, 2020
069a645
move stats implication of rand assign to exp design section
kbjarkefur May 28, 2020
92985f8
Break up sentence
kbjarkefur May 28, 2020
36406ce
make advanced topics one subsection
kbjarkefur May 28, 2020
8146f00
remove RCT from section of balance table
kbjarkefur Jun 8, 2020
151ec71
Moved what section is about
kbjarkefur Jun 8, 2020
51e9ec2
longitudional subsection title
kbjarkefur Jun 8, 2020
3d80583
Flipped to positive
kbjarkefur Jun 8, 2020
44fc3e7
section divisors
kbjarkefur Jun 8, 2020
60c5987
move master data discuss from sampling to data map
kbjarkefur Jun 8, 2020
d4fc081
Start bbd reorg
bbdaniels Jun 9, 2020
d60933d
Clusters and strata
bbdaniels Jun 9, 2020
69e2005
More reorganization
bbdaniels Jun 9, 2020
fef1a96
Darling massacre 1
bbdaniels Jun 9, 2020
3e586cf
Darling massacre 2
bbdaniels Jun 9, 2020
6a2582c
Some cleanup
bbdaniels Jun 9, 2020
4a45a04
[randomization] reproducible and not replicable randomization
kbjarkefur Jun 10, 2020
5db42d0
smaller edits to datamap/randomization chapter
kbjarkefur Jun 10, 2020
ab2ad81
master datasets and data maps
kbjarkefur Jun 10, 2020
5e40f86
no missing values in the master data
kbjarkefur Jun 10, 2020
2f42e98
move longitudional aspect to sub-section
kbjarkefur Jun 10, 2020
c1e856a
master data / data map in research design
kbjarkefur Jun 10, 2020
5b18df9
monitoring data
kbjarkefur Jun 11, 2020
88319f9
update time periods subsection
kbjarkefur Jun 11, 2020
e4dc4f5
data set -> dataset
kbjarkefur Jun 11, 2020
449862a
randomized assignment, not random assign. or randomization
kbjarkefur Jun 11, 2020
584fdd9
re-writing intro
kbjarkefur Jun 11, 2020
f776265
bring back iebaltab, plz plz
kbjarkefur Jun 11, 2020
5a09841
tie sampling and random to master data and map
kbjarkefur Jun 11, 2020
5c54a1a
in field real time random best practices
kbjarkefur Jun 11, 2020
5bd9811
tie cluster and strata to master data and map
kbjarkefur Jun 11, 2020
a777683
italizice terms first mention but not yet defined
kbjarkefur Jun 11, 2020
6ed198f
intro - language edits
kbjarkefur Jun 12, 2020
9a6f293
iv regression master dataset reqs
kbjarkefur Jun 12, 2020
9a62fec
Ben edits section: Translating research design to master data
kbjarkefur Jun 12, 2020
ba60ff3
ben edits: Implementing random sampling and treatment assignments
kbjarkefur Jun 12, 2020
8ad645c
ben edits to example code
kbjarkefur Jun 12, 2020
5570cb0
ben edits: example code - 2
kbjarkefur Jun 12, 2020
85343a4
remove that all instruments needs to go in master dataset
kbjarkefur Jun 17, 2020
f98b46c
strange sentence
kbjarkefur Jun 22, 2020
3cdba9c
in panel data only
kbjarkefur Jun 22, 2020
1e838c7
better way to introduce monitoring data
kbjarkefur Jun 22, 2020
cc27081
longitudinal fix grammar
kbjarkefur Jun 22, 2020
6760d00
remove that rand is not reproducible in other software
kbjarkefur Jun 22, 2020
3588d59
escape underscores in URLs
kbjarkefur Jun 22, 2020
3abb35b
Luiza's review of measurement framework
kbjarkefur Jun 22, 2020
349cdc7
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
1106c16
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
87d6fea
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
34f0b8e
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
f4f5b13
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
cae22d3
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
5136698
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
23157a3
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
f72d160
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
19e3f60
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
a34b851
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
7ba3188
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
5f7db8b
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
ba52881
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
07747b4
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
5f78b2a
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
07a2ce3
Update chapters/sampling-randomization-power.tex
mariaruth Jun 22, 2020
43d9dbf
explain iematch
kbjarkefur Jun 22, 2020
f671353
explain iebaltab
kbjarkefur Jun 22, 2020
71a92ad
whnen to create master data set and from wa
kbjarkefur Jun 23, 2020
7b1482f
remove methods related to attrition
kbjarkefur Jun 24, 2020
d2594f2
ATE definition
kbjarkefur Jun 24, 2020
7e2e8c1
new ATE definition
kbjarkefur Jun 24, 2020
dcf2801
update to intro
kbjarkefur Jun 24, 2020
fc3bf62
real time sampling - documentation
kbjarkefur Jun 24, 2020
0c76937
remove sentence "its too late"
kbjarkefur Jun 24, 2020

620 changes: 0 additions & 620 deletions chapters/research-design.tex

This file was deleted.

1,098 changes: 682 additions & 416 deletions chapters/sampling-randomization-power.tex

Large diffs are not rendered by default.

@@ -19,5 +19,5 @@
 gen check3 = rnormal() // Create a third random number after resetting seed

 * Visualize randomization results. See how check1 and check3 are identical,
-* but check2 is random relative check1 and check3
+* but check2 is random relative to check1 and check3
 graph matrix check1 check2 check3 , half
17 changes: 9 additions & 8 deletions code/simple-multi-arm-randomization.do
@@ -1,20 +1,21 @@
-* Set up reproducbilitiy - VERSIONING, SORTING and SEEDING
+* Set up reproducible randmomization - VERSIONING, SORTING and SEEDING
 ieboilstart , v(13.1) // Version
 `r(version)' // Version
 sysuse bpwide.dta, clear // Load data
 isid patient, sort // Sort
 set seed 654697 // Seed - drawn using https://bit.ly/stata-random

-* Generate a random number and use it to sort the observation. Then
-* the order the observations are sorted in is random.
+* Generate a random number and use it to sort the observation.
+* Then the order the observations are sorted in is random.
 gen treatment_rand = rnormal() // Generate a random number
 sort treatment_rand // Sort based on the random number

-* See simple-sample.do example for an explanation of "(_n <= _N * X)". The code
-* below randomly selects one third of the observations into group 0, one third into group 1 and
-* one third into group 2. Typically 0 represents the control group and 1 and
-* 2 represents two treatment arms
-generate treatment = 0 // Set all observations to 0
+* See simple-sample.do example for an explanation of "(_n <= _N * X)".
+* The code below randomly selects one third of the observations into group 0,
+* one third into group 1 and one third into group 2.
+* Typically 0 represents the control group
+* and 1 and 2 represents the two treatment arms
+generate treatment = 0 // Set all observations to 0 (control)
 replace treatment = 1 if (_n <= _N * (2/3)) // Set only the first two thirds to 1
 replace treatment = 2 if (_n <= _N * (1/3)) // Set only the first third to 2

26 changes: 0 additions & 26 deletions code/simple-sample.do

This file was deleted.

26 changes: 26 additions & 0 deletions code/simple-uniform-probability-sampling.do
@@ -0,0 +1,26 @@
+* Set up reproducible randmomization - VERSIONING, SORTING and SEEDING
+ieboilstart , v(13.1) // Version
+`r(version)' // Version
+sysuse bpwide.dta, clear // Load data
+isid patient, sort // Sort
+set seed 215597 // Seed - drawn using https://bit.ly/stata-random
+
+* Generate a random number and use it to sort the observations.
+* Then the order the observations are sorted in is random.
+gen sample_rand = rnormal() // Generate a random number
+sort sample_rand // Sort based on the random number
+
+* Use the sort order to sample 20% (0.20) of the observations.
+*_N in Stata is the number of observations in the active dataset,
+* and _n is the row number for each observation. The bpwide.dta has 120
+* observations and 120*0.20 = 24, so (_n <= _N * 0.20) is 1 for observations
+* with a row number equal to or less than 24, and 0 for all other
+* observations. Since the sort order is randomized, this means that we
+* have randomly sampled 20% of the observations.
+gen sample = (_n <= _N * 0.20)
+
+* Restore the original sort order
+isid patient, sort
+
+* Check your result
+tab sample
10 changes: 1 addition & 9 deletions manuscript.tex
@@ -60,16 +60,8 @@ \chapter{Chapter 2: Collaborating on code and data}
 % CHAPTER 3
 %----------------------------------------------------------------------------------------

-\chapter{Chapter 3: Evaluating impact through research design}
-\label{ch:3}
-
-\input{chapters/research-design.tex}
-
-%----------------------------------------------------------------------------------------
-% CHAPTER 4
-%----------------------------------------------------------------------------------------

-\chapter{Chapter 4: Sampling, randomization, and power}
+\chapter{Chapter 3: Establish a measurement framework}
 \label{ch:4}

 \input{chapters/sampling-randomization-power.tex}