This is a work-in-progress website consisting of R panel data and optimization examples for Statistics/Econometrics/Economic Analysis. Book version: bookdown site and bookdown pdf. The materials are gathered from various projects in which R code is used. Files are from Fan's R4Econ repository. This is not an R package, but a list of examples in PDF/HTML/Rmd formats. REconTools is an installable package containing tools used in projects involving R.

Bullet points show which base R, tidyverse, or other functions/commands are used to achieve various objectives. An effort is made to use only base R and tidyverse packages whenever possible to reduce dependencies. The goal of this repository is to make it easier to find and re-use code produced for various projects.

From Fan's other repositories: for dynamic borrowing and savings problems, see Dynamic Asset Repository; for code examples, see also Matlab Example Code and Stata Example Code; for intro econ with Matlab, see Intro Mathematics for Economists; and for intro stat with R, see Intro Statistics for Undergraduates. See here for all of Fan's public repositories.

Please contact FanWangEcon for issues or problems.

1 Array, Matrix, Dataframe

1.1 List

Multi-dimensional Named Lists: rmd | r | pdf | html
- Initiate Empty List. Named one and two dimensional lists.
- r: vector(mode = "list", length = it_N) + names(list) <- paste0('e',seq()) + dimnames(ls2d)[[1]] <- paste0('r',seq()) + dimnames(ls2d)[[2]] <- paste0('c',seq())
- tidyr: unnest()
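
A minimal base R sketch of the named one and two dimensional lists described above. The variable names (it_N, it_M, ls_1d, ls_2d) and the stored values are illustrative assumptions, not taken from the linked files.

```r
# Illustrative sizes (assumed values)
it_N <- 3
it_M <- 2

# One dimensional empty list, then name its elements e1, e2, e3
ls_1d <- vector(mode = "list", length = it_N)
names(ls_1d) <- paste0('e', seq(1, it_N))
ls_1d[['e2']] <- c(1.5, 2.5)

# Two dimensional list: an empty list with a dim attribute,
# with row names r1..r3 and column names c1..c2
ls_2d <- vector(mode = "list", length = it_N * it_M)
dim(ls_2d) <- c(it_N, it_M)
dimnames(ls_2d) <- list(paste0('r', seq(1, it_N)),
                        paste0('c', seq(1, it_M)))

# Store and retrieve an element by row and column name
ls_2d[['r1', 'c2']] <- "some text"
print(ls_2d[['r1', 'c2']])
```

Each cell of the two dimensional list can hold an object of any type, which is the main reason to use a list here rather than a matrix.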

1.2 Array

Array Operations in R: rmd | r | pdf | html
- Basic array operations in R.
- r: head() + tail()
- dplyr: na_if()
Generate Special Arrays: rmd | r | pdf | html
- Generate special arrays: log spaced array
- r: seq()
String Operations: rmd | r | pdf | html
- Split, concatenate, subset strings
- r: paste0() + sub() + gsub() + grepl() + sprintf() + tail() + strsplit() + basename() + dirname()
Array Combinations as Matrix: rmd | r | pdf | html
- Combinations of two arrays to matrix form (meshgrid)
- tidyr: expand_grid()
- r: expand.grid()
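
As a rough illustration of two of the array items above, the base R sketch below builds a log-spaced array with seq() and a meshgrid-style matrix with expand.grid(). The end points, grid sizes, and variable names are assumptions chosen for the example.

```r
# Log-spaced array: equally spaced in logs, then exponentiated
ar_log_spaced <- exp(seq(log(1), log(100), length.out = 5))
print(ar_log_spaced)

# Meshgrid: every combination of the elements of two arrays
ar_x <- c(1, 2, 3)
ar_y <- c(10, 20)
df_mesh <- expand.grid(x = ar_x, y = ar_y)

# Convert to matrix form, one row per (x, y) combination
mt_mesh <- as.matrix(df_mesh)
print(mt_mesh)
```

expand.grid() varies its first argument fastest; tidyr's expand_grid() varies the last argument fastest instead, so row order differs between the two.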

1.3 Matrix

Matrix Basics: rmd | r | pdf | html
- Generate and combine fixed and random matrices
- r: rbind() + matrix()
Linear Algebra Operations: rmd | r | pdf | html

1.4 Variables in Dataframes

Tibble Basics: rmd | r | pdf | html
- generate tibbles, rename tibble variables, tibble row and column names
- rename numeric sequential columns with string prefix and suffix
- dplyr: as_tibble(mt) + rename_all(~c(ar_names)) + rename_at(vars(starts_with("xx")), funs(str_replace(., "yy", "yyyy"))) + rename_at(vars(num_range('',ar_it)), funs(paste0(st,.))) + rowid_to_column() + colnames + rownames
Label and Combine Factor Variables: rmd | r | pdf | html
- Convert numeric variables to factor variables, generate joint factors, and label factors.
- Graph MPG and 1/4 Mile Time (qsec) from the mtcars dataset over joint shift-type (am) and engine-type (vs) categories.
- forcats: as_factor() + fct_recode() + fct_cross()
Examples of Random Draws in R: rmd | r | pdf | html
R Tibble Dataframe NA Values: rmd | r | pdf | html
R Tibble Dataframe String Manipulations: rmd | r | pdf | html

2 Summarize Data

2.1 Counting Observations

Counting Basics: rmd | r | pdf | html
- uncount to generate panel skeleton from years in survey
- tidyr: uncount(yr_n)
- dplyr: group_by() + mutate(yr = row_number() + start_yr)
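
The uncount step above can be sketched as follows. The data (three individuals with a start year and a number of survey years) and the names df_survey and df_panel are made up for illustration, and the example subtracts 1 so that the first panel year equals start_yr, which may differ from the convention in the linked file.

```r
library(dplyr)
library(tidyr)

# Hypothetical survey records: one row per individual
df_survey <- tibble(id = c(1, 2, 3),
                    start_yr = c(2000, 2005, 2010),
                    yr_n = c(2, 3, 1))

# Repeat each row yr_n times, then number the rows within individual
df_panel <- df_survey %>%
  uncount(yr_n) %>%
  group_by(id) %>%
  mutate(yr = row_number() + start_yr - 1) %>%
  ungroup()

print(df_panel)
```

By default uncount() drops the yr_n column once it has been used as the replication weight.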

2.2 Sorting, Indexing, Slicing

Sorted Index, Interval Index and Expand Value from One Row: rmd | r | pdf | html
- Sort and generate index for rows
- Generate negative and positive index based on deviations
- Populate Values from one row to other rows
- dplyr: arrange() + row_number() + mutate(lowest = min(Sepal.Length)) + case_when(row_number()==x ~ Sepal.Length) + mutate(Sepal.New = Sepal.Length[Sepal.Index == 1])
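
A short sketch of sorting, indexing, and populating a value from one row to all rows, using the built-in iris data. The column names Sepal.Index and Sepal.New follow the function list above; df_sorted is a name assumed for the example.

```r
library(dplyr)

df_sorted <- iris %>%
  as_tibble() %>%
  # Sort rows by sepal length, smallest first
  arrange(Sepal.Length) %>%
  # Index rows in sorted order
  mutate(Sepal.Index = row_number()) %>%
  # Copy the value from the row with index 1 to every row
  mutate(Sepal.New = Sepal.Length[Sepal.Index == 1])

print(head(df_sorted))
```

Because the logical subset Sepal.Index == 1 selects exactly one value, mutate() recycles it across all rows.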

2.3 Group Statistics

Count Unique Groups and Mean within Groups: rmd | r | pdf | html
- Unique groups defined by multiple values and count obs within group.
- Mean, sd, observation count for non-NA within unique groups.
- dplyr: group_by() + summarise(n()) + summarise_if(is.numeric, funs(mean = mean(., na.rm = TRUE), n = sum(is.na(.)==0)))
By Groups, One Variable All Statistics: rmd | r | pdf | html
- Pick stats, overall, and by multiple groups, stats as matrix or wide row with name=(ctsvar + catevar + catelabel).
- tidyr: group_by() + summarize_at(, funs()) + rename(!!var := !!sym(var)) + mutate(!!var := paste0(var,'str',!!!syms(vars))) + gather() + unite() + spread(varcates, value)
By within Individual Groups Variables, Averages: rmd | r | pdf | html
- By Multiple within Individual Groups Variables.
- Averages for all numeric variables within all groups of all group variables. Long to Wide to very Wide.
- tidyr: gather() + group_by() + summarise_if(is.numeric, funs(mean(., na.rm = TRUE))) + mutate(all_m_cate = paste0(variable, '_c', value)) + unite() + spread()
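
As a hedged sketch of the group statistics described in this section, the example below counts observations and computes the mean and standard deviation of mpg within the joint (am, vs) groups of mtcars. It names each statistic directly inside summarise() rather than using funs(), which is deprecated in current dplyr; this is a substitution made for the example, not the approach of the linked files.

```r
library(dplyr)

df_stats <- mtcars %>%
  group_by(am, vs) %>%
  summarise(n_obs = n(),                      # rows in the group
            n_nonmiss = sum(!is.na(mpg)),     # non-NA observations
            mpg_mean = mean(mpg, na.rm = TRUE),
            mpg_sd = sd(mpg, na.rm = TRUE),
            .groups = "drop")

print(df_stats)
```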

2.4 Distributional Statistics

Tibble Basics: rmd | r | pdf | html
- input multiple variables with comma separated text strings
- quantitative/continuous and categorical/discrete variables
- histogram and summary statistics
- tibble: ar_one <- c(107.72,101.28) + ar_two <- c(101.72,101.28) + mt_data <- cbind(ar_one, ar_two) + as_tibble(mt_data)

2.5 Summarize Multiple Variables

R Example Apply the Same Function Over Multiple Variables: rmd | r | pdf | html

3 Functions

3.1 Dataframe Mutate

Nonlinear Function over Rows: rmd | r | pdf | html
- Evaluate nonlinear function f(x_i, y_i, ar_x, ar_y, c, d), where c and d are constants, and ar_x and ar_y are arrays, both fixed. x_i and y_i vary over each row of matrix.
- dplyr: rowwise() + mutate(out = funct(inputs))
DPLYR Evaluate Functions at Many States and Choices Each State: rmd | r | pdf | html

3.2 Dataframe Do Anything

Evaluate Function Do Anything Group Stack Results: rmd | r | pdf | html
- Group dataframe by categories, compute category specific output scalar or arrays based on within category variable information.
- dplyr: group_by(ID) + do(inc = rnorm(.$N, mean=.$mn, sd=.$sd)) + unnest(c(inc)) + left_join(df, by="ID")
DPLYR Expand Dataframe with Function: rmd | r | pdf | html

3.3 Apply and pmap

Apply and Mutate over Rows: rmd | r | pdf | html
- Evaluate function f(x_i,y_i,c), where c is a constant and x and y vary over each row of a matrix, with index i indicating rows.
- Get same results using apply, sapply, and dplyr mutate.
- r: do.call() + apply(mt, 1, func) + sapply(ls_ar, func, ar1, ar2)
- purrr: rowwise() + unnest(out) + pmap(func) + unlist()
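
The idea of evaluating f(x_i, y_i, c) over rows in several equivalent ways can be sketched in base R as below. The function ffi_func and the input values are hypothetical and chosen only so the results are easy to check by hand.

```r
# Hypothetical function of row-specific x, y and a constant c
ffi_func <- function(fl_x, fl_y, fl_c) {
  fl_x^2 + fl_y * fl_c
}

mt_xy <- cbind(x = c(1, 2, 3), y = c(4, 5, 6))
fl_c <- 10

# 1. apply over rows of the matrix
ar_apply <- apply(mt_xy, 1, function(row) ffi_func(row[['x']], row[['y']], fl_c))

# 2. sapply over row indexes
ar_sapply <- sapply(seq_len(nrow(mt_xy)),
                    function(i) ffi_func(mt_xy[i, 'x'], mt_xy[i, 'y'], fl_c))

# 3. direct vectorized evaluation, possible because ffi_func is vectorized
ar_vec <- ffi_func(mt_xy[, 'x'], mt_xy[, 'y'], fl_c)

print(cbind(ar_apply, ar_sapply, ar_vec))
```

When the function is not vectorized, the apply and sapply forms still work, while the direct call does not.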

4 Panel

4.1 Generate and Join

TIDYVERSE Generate Panel Data Structures: rmd | r | pdf | html
- Build skeleton panel frame with N observations and T periods.
- tidyr: rowid_to_column() + uncount() + group_by() + row_number() + ungroup()
R DPLYR Join Multiple Dataframes Together: rmd | r | pdf | html
- Join dataframes together with one or multiple keys. Stack dataframes together.
- dplyr: filter() + rename(!!sym(vsta) := !!sym(vstb)) + mutate(var = rnorm(n())) + left_join(df, by=(c('id'='id', 'vt'='vt'))) + left_join(df, by=setNames(c('id', 'vt'), c('id', 'vt'))) + bind_rows()
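
A small sketch of joining on two keys and stacking dataframes. The two toy dataframes df_a and df_b, with keys id and vt, are invented for illustration.

```r
library(dplyr)

# Two hypothetical dataframes sharing the keys id and vt
df_a <- tibble(id = c(1, 1, 2, 2), vt = c(1, 2, 1, 2), x = c(10, 11, 20, 21))
df_b <- tibble(id = c(1, 1, 2), vt = c(1, 2, 1), y = c(0.1, 0.2, 0.3))

# Left join on both keys: rows of df_a without a match get NA in y
df_joined <- df_a %>%
  left_join(df_b, by = c('id' = 'id', 'vt' = 'vt'))

# Stack the dataframes: columns missing in one are filled with NA
df_stacked <- bind_rows(df_a, df_b)

print(df_joined)
print(df_stacked)
```

The named-vector form of `by` is useful when the key columns have different names in the two dataframes; here the names coincide, so by = c('id', 'vt') would give the same result.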

4.2 Wide and Long

TIDYR Pivot Wider and Pivot Longer Examples: rmd | r | pdf | html
- Long roster to wide roster and cumulative sum attendance by date.
- dplyr: mutate(var = case_when(rnorm(n()) < 0 ~ 1, TRUE ~ 0)) + rename_at(vars(num_range('', ar_it)), list(~paste0(st_prefix, . , ''))) + mutate_at(vars(contains(str)), list(~replace_na(., 0))) + mutate_at(vars(contains(str)), list(~cumsum(.)))
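
The long-to-wide roster and cumulative attendance steps can be sketched as follows. The attendance data and the column names (id, day, attend) are assumptions, and the sketch computes the cumulative sum in long form before widening, which may differ from the order of operations in the linked file.

```r
library(dplyr)
library(tidyr)

# Hypothetical long roster: one row per (student, day attended)
df_long <- tibble(id = c(1, 1, 1, 2, 2),
                  day = c(1, 2, 3, 1, 3),
                  attend = c(1, 1, 1, 1, 1))

# Long to wide: one column per day, 0 where a student was absent
df_wide <- df_long %>%
  pivot_wider(names_from = day, values_from = attend,
              names_prefix = "day", values_fill = 0)

# Cumulative attendance by day, then widen
df_cumu <- df_long %>%
  complete(id, day, fill = list(attend = 0)) %>%
  arrange(id, day) %>%
  group_by(id) %>%
  mutate(attend_cumu = cumsum(attend)) %>%
  ungroup() %>%
  select(-attend) %>%
  pivot_wider(names_from = day, values_from = attend_cumu,
              names_prefix = "cumu_day")

print(df_wide)
print(df_cumu)
```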

5 Linear Regression

5.1 OLS and IV

IV/OLS Regression: rmd | r | pdf | html
- R Instrumental Variables and Ordinary Least Square Regression store all Coefficients and Diagnostics as Dataframe Row.
- AER: library(AER) + ivreg(as.formula, diagnostics = TRUE)
M Outcomes and N RHS Alternatives: rmd | r | pdf | html
- There are M outcome variables and N alternative explanatory variables. Regress all M outcome variables on N endogenous/independent right hand side variables one by one, with controls and/or IVs, collect coefficients.
- dplyr: bind_rows(lapply(listx, function(x)(bind_rows(lapply(listy, regf.iv))))) + starts_with() + ends_with() + reduce(full_join)
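
The looping pattern for M outcomes and N right hand side alternatives can be sketched in base R as below. This is a simplified illustration that uses OLS via lm() on mtcars rather than IV, and the helper ffi_reg, the variable lists, and the control variable are all assumptions made for the example.

```r
# Hypothetical outcome and right hand side variable lists
ar_outcomes <- c('mpg', 'qsec')
ar_rhs <- c('hp', 'wt', 'disp')
st_control <- 'cyl'

# Regress one outcome on one RHS variable plus the control,
# return the RHS coefficient as a one-row dataframe
ffi_reg <- function(st_y, st_x) {
  fm <- as.formula(paste(st_y, '~', st_x, '+', st_control))
  rs <- lm(fm, data = mtcars)
  data.frame(outcome = st_y, rhs = st_x,
             coef = unname(coef(rs)[st_x]),
             stringsAsFactors = FALSE)
}

# Nested lapply over outcomes and RHS variables, stacked into one dataframe
df_coefs <- do.call(rbind, lapply(ar_outcomes, function(st_y) {
  do.call(rbind, lapply(ar_rhs, function(st_x) ffi_reg(st_y, st_x)))
}))

print(df_coefs)
```

The same structure carries over to IV by swapping lm() for an IV estimator inside the helper.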

5.2 Decomposition

Regression Decomposition: rmd | r | pdf | html
- Post multiple regressions, fraction of outcome variables' variances explained by multiple subsets of right hand side variables.
- dplyr: gather() + group_by(var) + mutate_at(vars, funs(mean = mean(.))) + rowSums(matmat) + mutate_if(is.numeric, funs(frac = (./value_var)))

6 Nonlinear Regression

6.1 Logit Regression

Logit Regression: rmd | r | pdf | html
- Logit regression testing and prediction.
- stats: glm(as.formula(), data, family='binomial') + predict(rs, newdata, type = "response")
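
A minimal sketch of the logit estimation and prediction calls listed above, using mtcars. The choice of vs as the outcome and mpg as the regressor, and the new data points, are assumptions for illustration only.

```r
# Logit regression of engine type (vs) on miles per gallon
rs_logit <- glm(as.formula('vs ~ mpg'), data = mtcars, family = 'binomial')
print(summary(rs_logit)$coefficients)

# Predicted probabilities at new values of mpg
df_new <- data.frame(mpg = c(15, 20, 25))
ar_prob <- predict(rs_logit, newdata = df_new, type = "response")
print(ar_prob)
```

With type = "response" the predictions are probabilities; the default type returns the linear index on the log-odds scale.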

7 Optimization

7.1 Bisection

Concurrent Bisection over Dataframe Rows: rmd | r | pdf | html
- Solve for the root of a function on every row of a dataframe at the same time, updating each row's bisection bounds at every iteration.
- tidyr: pivot_longer(cols = starts_with('abc'), names_to = c('a', 'b'), names_pattern = paste0('prefix', "(.)_(.)"), values_to = val) + pivot_wider(names_from = !!sym(name), values_from = val) + mutate(!!sym(abc) := case_when(efg < 0 ~ !!sym(opq), TRUE ~ iso))
- ggplot2: geom_line() + facet_wrap() + geom_hline()
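
A base R sketch of bisection run concurrently over dataframe rows. The function f(x; a) = exp(x) - a, the bounds, and the iteration count are assumptions picked so that the true root log(a) is known; the linked file uses tidyverse tools and a different problem.

```r
# Hypothetical increasing function with one root per row: x = log(a)
ffi_f <- function(ar_x, ar_a) {
  exp(ar_x) - ar_a
}

# One row-specific parameter per row
df_bisec <- data.frame(a = c(2, 5, 10))

# Row-specific lower and upper bounds that bracket the root
ar_lo <- rep(0, nrow(df_bisec))
ar_hi <- rep(5, nrow(df_bisec))

for (it in seq_len(60)) {
  # Evaluate all rows at their current midpoints at once
  ar_mid <- (ar_lo + ar_hi) / 2
  ar_f_mid <- ffi_f(ar_mid, df_bisec$a)
  # f is increasing: a positive value means the root is below the midpoint
  bl_pos <- ar_f_mid > 0
  ar_hi[bl_pos] <- ar_mid[bl_pos]
  ar_lo[!bl_pos] <- ar_mid[!bl_pos]
}

df_bisec$root <- (ar_lo + ar_hi) / 2
print(df_bisec)
```

Each iteration halves every row's interval, so all rows converge in the same number of steps without a loop over rows.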

8 Mathematics and Statistics

8.1 Distributions

Integrate Normal Shocks: rmd | r | pdf | html
- Random Sampling (Monte Carlo) integrate shocks.
- Trapezoidal rule (symmetric rectangles) integrate normal shock.
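
To illustrate the two integration approaches, the sketch below approximates E[exp(eps)] for a normal shock eps with mean zero, for which the exact value exp(sd^2/2) is known. The integrand, the standard deviation, the seed, the number of draws, and the grid are all assumptions chosen for this example.

```r
set.seed(123)
fl_sd <- 0.5

# Exact value of E[exp(eps)], eps ~ N(0, fl_sd^2)
fl_exact <- exp(fl_sd^2 / 2)

# 1. Monte Carlo: average the function over random draws
ar_draws <- rnorm(100000, mean = 0, sd = fl_sd)
fl_mc <- mean(exp(ar_draws))

# 2. Trapezoidal rule on an equally spaced grid over +/- 6 standard deviations
ar_grid <- seq(-6 * fl_sd, 6 * fl_sd, length.out = 1001)
ar_f <- exp(ar_grid) * dnorm(ar_grid, mean = 0, sd = fl_sd)
fl_trapz <- sum((ar_f[-1] + ar_f[-length(ar_f)]) / 2 * diff(ar_grid))

print(c(exact = fl_exact, monte_carlo = fl_mc, trapezoid = fl_trapz))
```

The Monte Carlo estimate carries sampling error that shrinks with the square root of the number of draws, while the grid-based rule is deterministic.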

8.2 Analytical Solutions

Linear Solve x with f(x) = 0: rmd | r | pdf | html
- Evaluate and solve statistically relevant problems with one equation and one unknown that permit analytical solutions.

8.3 Inequality Models

Gini for Discrete Samples: rmd | r | pdf | html
- Given a sample of discrete data points, compute the approximate Gini coefficient.
- r: sort() + cumsum() + sum()
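
One common sample Gini formula can be written with exactly sort(), cumsum(), and sum(). The sketch below uses G = (n + 1 - 2 * sum(cumsum(x)) / sum(x)) / n for sorted non-negative x; this is an assumption about which formula to show, and the linked file may use a different approximation. The function name ffi_gini is hypothetical.

```r
# Sample Gini for a vector of non-negative values
ffi_gini <- function(ar_x) {
  ar_x <- sort(ar_x)           # ascending order
  it_n <- length(ar_x)
  fl_cumsum_sum <- sum(cumsum(ar_x))
  (it_n + 1 - 2 * fl_cumsum_sum / sum(ar_x)) / it_n
}

# Perfect equality gives 0
print(ffi_gini(c(1, 1, 1, 1)))
# One observation holds everything: (n - 1) / n
print(ffi_gini(c(0, 0, 0, 1)))
# An intermediate case
print(ffi_gini(c(1, 2, 3, 4)))
```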
CES and Atkinson Utility: rmd | r | pdf | html
- Analyze how changing individual outcomes shift utility given inequality preference parameters.
- Draw Cobb-Douglas, Utilitarian and Leontief indifference curves.
- r: apply(mt, 1, function(x){}) + do.call(rbind, ls_mt)
- tidyr: expand_grid()
- ggplot2: geom_line() + facet_wrap()
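
A hedged sketch of how an inequality preference parameter moves a CES aggregator between the Utilitarian, Cobb-Douglas, and Leontief cases. The functional form (sum of alpha_i * y_i^lambda)^(1/lambda), the weights, and the outcomes are assumptions for illustration; the linked file may parameterize the problem differently.

```r
# CES aggregator over individual outcomes ar_y with weights ar_alpha
ffi_ces <- function(ar_y, ar_alpha, fl_lambda) {
  sum(ar_alpha * ar_y^fl_lambda)^(1 / fl_lambda)
}

ar_y <- c(1, 3)
ar_alpha <- c(0.5, 0.5)

# lambda = 1: Utilitarian, the weighted arithmetic mean
fl_util <- ffi_ces(ar_y, ar_alpha, 1)
# lambda near 0: approaches Cobb-Douglas, the weighted geometric mean
fl_cd <- ffi_ces(ar_y, ar_alpha, 1e-6)
# lambda very negative: approaches Leontief, the minimum outcome
fl_leon <- ffi_ces(ar_y, ar_alpha, -100)

print(c(utilitarian = fl_util, cobb_douglas = fl_cd, leontief = fl_leon))
```

Lower values of lambda put more weight on the worse-off individual, so the aggregate falls from the mean toward the minimum as lambda decreases.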
