diff --git a/articles/TQ00-introduction-to-tidyquant.html b/articles/TQ00-introduction-to-tidyquant.html
index 0e19e2c5..ded4436c 100644
--- a/articles/TQ00-introduction-to-tidyquant.html
+++ b/articles/TQ00-introduction-to-tidyquant.html
@@ -101,7 +101,7 @@
vignettes/TQ00-introduction-to-tidyquant.Rmd
TQ00-introduction-to-tidyquant.Rmd
xts
, quantmod
,
TTR
, and PerformanceAnalytics
-tidyverse
tools in R
+tidyverse
tools in R
for Data Science
ggplot2
functionality for beautiful and
@@ -222,16 +222,17 @@ PerformanceAnalytics
package consolidates many of the most
widely used performance metrics as functions that can be applied to
-stock or portfolio returns. tidquant
implements the
+stock or portfolio returns. tidyquant
implements the
functionality with two primary functions:
tq_performance
implements the performance analysis
+tq_performance()
implements the performance analysis
functions in a tidy way, enabling scaling analysis using the split,
apply, combine framework.
tq_portfolio
provides a useful toolset for aggregating
-a group of individual asset returns into one or many portfolios.tq_portfolio()
provides a useful toolset for
+aggregating a group of individual asset returns into one or many
+portfolios.
Performance is based on the statistical properties of returns, and as
a result both functions use returns as opposed to stock
diff --git a/articles/TQ01-core-functions-in-tidyquant.html b/articles/TQ01-core-functions-in-tidyquant.html
index f6b1caf2..39ac428c 100644
--- a/articles/TQ01-core-functions-in-tidyquant.html
+++ b/articles/TQ01-core-functions-in-tidyquant.html
@@ -101,7 +101,7 @@ The The Get a Stock Index, Yahoo Japan stock prices can be retrieved using a similar call,
The Daily log returns follows a similar approach. Normally I go with a
-transmute function, Daily log returns follow a similar approach. Normally I go with a
+transmute function, Excel Users
Matt
Dancho
- 2023-10-01
+ 2023-10-03
Source: vignettes/TQ01-core-functions-in-tidyquant.Rmd
TQ01-core-functions-in-tidyquant.Rmd
2023-10-01
Overview
-tidyquant
package has a core functions with
-a lot of power. Few functions means less of a learning curve
-for the user, which is why there are only a handful of functions the
-user needs to learn to perform the vast majority of financial analysis
-tasks. The main functions are:tidyquant
package has a few core functions
+with a lot of power. Few functions means less of a learning
+curve for the user, which is why there are only a handful of functions
+the user needs to learn to perform the vast majority of financial
+analysis tasks. The main functions are:
tq_index()
, or a Stock
Exchange, tq_exchange()
: Returns the stock symbols
@@ -147,7 +147,7 @@ Prerequisites
# Loads tidyquant, lubridate, xts, quantmod, TTR
-library(dplyr)
+library(tidyverse)
library(tidyquant)
2.1 Yahoo! Finance
aapl_prices <- tq_get("AAPL", get = "stock.prices", from = " 1990-01-01")
aapl_prices
## # A tibble: 8,502 × 8
+
+## # ℹ 8,493 more rows## # A tibble: 8,503 × 8
## symbol date open high low close volume adjusted
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 1990-01-02 0.315 0.335 0.312 0.333 183198400 0.264
@@ -232,7 +232,7 @@ 2.1 Yahoo! Finance
## 8 AAPL 1990-01-11 0.324 0.324 0.308 0.308 211052800 0.244
## 9 AAPL 1990-01-12 0.306 0.310 0.301 0.308 171897600 0.244
## 10 AAPL 1990-01-15 0.308 0.319 0.306 0.306 161739200 0.243
-## # ℹ 8,492 more rows
get = "stock.prices.japan"
.
@@ -459,7 +459,7 @@ 2.6 Bloomberg
Bloomberg
provides access to arguably the most comprehensive financial data and is
-actively used by most major financial instutions that work with
+actively used by most major financial institutions that work with
financial data. The
Rblpapi
package, an R interface to
Bloomberg, has been integrated into tidyquant
as follows.
The benefit of the integration is the scalability since we can
@@ -653,7 +653,7 @@ Mutate rolling regressions wi
diff --git a/articles/TQ02-quant-integrations-in-tidyquant.html b/articles/TQ02-quant-integrations-in-tidyquant.html
index 073fea6e..bf751f4e 100644
--- a/articles/TQ02-quant-integrations-in-tidyquant.html
+++ b/articles/TQ02-quant-integrations-in-tidyquant.html
@@ -101,7 +101,7 @@ tq_mutate(mutate_fun = rollapply)
Excel Users
Matt
Dancho
- 2023-10-01
+ 2023-10-03
Source: vignettes/TQ02-quant-integrations-in-tidyquant.Rmd
TQ02-quant-integrations-in-tidyquant.Rmd
Overview
xts,
zoo
, quantmod
,
TTR
, and PerformanceAnalytics
packages. This
vignette focuses on the following core functions to demonstrate
-how the integratation works with the quantitative finance packages:
+how the integration works with the quantitative finance packages:
tq_transmute()
: Returns a new tidy data
frame typically in a different periodicity than the input.
Prerequisites
-
# Loads tidyquant, lubridate, xts, quantmod, TTR
+
# Loads tidyquant, xts, quantmod, TTR
library(tidyquant)
-library(lubridate)
-library(dplyr)
-library(tidyr)
-library(ggplot2)
1.0 Function Compatibility
@@ -196,10 +193,10 @@ xts Functionality
## [1] "apply.daily" "apply.monthly" "apply.quarterly" "apply.weekly"
## [5] "apply.yearly" "diff.xts" "lag.xts" "period.apply"
## [9] "period.max" "period.min" "period.prod" "period.sum"
-## [13] "periodicity" "to_period" "to.daily" "to.hourly"
-## [17] "to.minutes" "to.minutes10" "to.minutes15" "to.minutes3"
-## [21] "to.minutes30" "to.minutes5" "to.monthly" "to.period"
-## [25] "to.quarterly" "to.weekly" "to.yearly"
+## [13] "periodicity" "to.daily" "to.hourly" "to.minutes"
+## [17] "to.minutes10" "to.minutes15" "to.minutes3" "to.minutes30"
+## [21] "to.minutes5" "to.monthly" "to.period" "to.quarterly"
+## [25] "to.weekly" "to.yearly" "to_period"
xts
functions that are compatible are listed above.
Generally speaking, these are the:
@@ -412,7 +409,7 @@ TTR Functionality
Example 1A: Getting and
Example 1B: Getting Daily Log Returns
-tq_transmute
, because the
+tq_transmute()
, because the
periodReturn
function accepts different periodicity
options, and anything other than daily will blow up a mutation. But, in
our situation the period returns periodicity is the same as the stock
@@ -560,13 +557,13 @@ Example 1B: Getting Daily Log Retu
mutate_fun = periodReturn,
period = "daily",
type = "log",
- col_rename = "monthly.returns")
FANG_daily_log_returns %>%
- ggplot(aes(x = monthly.returns, fill = symbol)) +
+ ggplot(aes(x = daily.returns, fill = symbol)) +
geom_density(alpha = 0.5) +
labs(title = "FANG: Charting the Daily Log Returns",
- x = "Monthly Returns", y = "Density") +
+ x = "Daily Returns", y = "Density") +
theme_tq() +
scale_fill_tq() +
facet_wrap(~ symbol, ncol = 2)
tq_transmute()
select = open:volume
. Looking at the
documentation for to.period
, we see that it accepts a
-period
argument that we can set to "weeks"
.
+period
argument that we can set to "months"
.
The result is the OHLCV data returned with the dates changed to one day
-per week.
+per month.
FANG %>%
group_by(symbol) %>%
@@ -610,7 +607,7 @@ ## 10 META 2013-10-31 47.2 52 46.5 50.2 248809000
## # ℹ 182 more rows
A common usage case is to reduce the number of points to smooth time
-series plots. Let’s check out difference between daily and monthly
+series plots. Let’s check out the difference between daily and monthly
plots.
The TTR::runCor
function can be used to evaluate rolling
correlations using the xy pattern. Looking at the documentation
@@ -882,7 +879,7 @@
Before we analyze a rolling regression, it’s helpful to view the
overall trend in returns. To do this, we use tq_get()
to
@@ -903,7 +900,7 @@
We can visualize the relationship between the returns of the stock pairs like so.
@@ -914,7 +911,7 @@ Example 6
    labs(title = "Visualizing Returns Relationship of Stock Pairs") +
    theme_tq()
-We can get statistcs on the relationship from the lm
+We can get statistics on the relationship from the lm
function. The model is highly correlated with a p-value of essential
zero. The coefficient estimate for V (Coefficient 1) is 0.8134
indicating a positive relationship, meaning as V increases MA also tends
@@ -928,12 +925,12 @@
While this characterizes the overall relationship, it’s missing the
-time aspect. Fortunately, we can use the rollapply
function
-from the zoo
package to plot a rolling regression, showing
-how the model coefficent varies on a rolling basis over time. We
-calculate rolling regressions with tq_mutate()
in two
-additional steps:
zoo::rollapply()
+function to plot a rolling regression, showing how the model coefficient
+varies on a rolling basis over time. We calculate rolling regressions
+with tq_mutate()
in two additional steps:
Return.clean
to clean outliers from the return data.
-The alpha
parameter is the percentage of oultiers to be
+The alpha
parameter is the percentage of outliers to be
cleaned. Finally, the excess returns are calculated using a risk-free
rate of 3% (divided by 252 for 252 trade days in one year).
diff --git a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-13-1.png b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-13-1.png
index 086cdeba..2570ef5f 100644
Binary files a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-13-1.png and b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-13-1.png differ
diff --git a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-20-1.png b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-20-1.png
index a345c1d4..c93c72a3 100644
Binary files a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-20-1.png and b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-20-1.png differ
diff --git a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-27-1.png b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-27-1.png
index 8ccfaeeb..505f94e9 100644
Binary files a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-27-1.png and b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-27-1.png differ
diff --git a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-31-1.png b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-31-1.png
index 5043a4d0..8c0ddf49 100644
Binary files a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-31-1.png and b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-31-1.png differ
diff --git a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-32-1.png b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-32-1.png
index 21649a33..d50bbcdd 100644
Binary files a/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-32-1.png and b/articles/TQ02-quant-integrations-in-tidyquant_files/figure-html/unnamed-chunk-32-1.png differ
diff --git a/articles/TQ03-scaling-and-modeling-with-tidyquant.html b/articles/TQ03-scaling-and-modeling-with-tidyquant.html
index 83447365..49463e27 100644
--- a/articles/TQ03-scaling-and-modeling-with-tidyquant.html
+++ b/articles/TQ03-scaling-and-modeling-with-tidyquant.html
@@ -101,7 +101,7 @@ Excel Users
Matt Dancho
-2023-10-01
+2023-10-03
Source: vignettes/TQ03-scaling-and-modeling-with-tidyquant.Rmd
@@ -118,7 +118,7 @@ TQ03-scaling-and-modeling-with-tidyquant.Rmd
Overview
The greatest benefit to
tidyquant
is the ability to apply the data science workflow to easily model and scale your financial
-analysis as described in R for
+analysis as described in R for
Data Science. Scaling is the process of creating an analysis for one asset and then extending it to multiple groups. This idea of scaling is incredibly useful to financial analysts because typically one
@@ -135,9 +135,10 @@ Overview
filter,
group_by
,nest
/unnest
,spread
/gather
, etc -Use purrr
: mapping functions withmap
+Use -purrr
: mapping functions with +map()
-Model financial analysis using the data science workflow in R for Data Science
+Model financial analysis using the data science workflow in R for Data Science
We’ll go through some useful techniques for getting and manipulating
@@ -148,12 +149,8 @@
Prerequisites
-# Loads tidyquant, lubridate, xts, quantmod, TTR, and PerformanceAnalytics
-library(lubridate)
-library(dplyr)
-library(purrr)
-library(ggplot2)
-library(tidyr)
+# Loads tidyquant, xts, quantmod, TTR, and PerformanceAnalytics
+library(tidyverse)
library(tidyquant)
The output is a single level tibble with all or the stock prices in
one tibble. The auto-generated column name is “symbol”, which can be
-pre-emptively renamed by giving the vector a name
+preemptively renamed by giving the vector a name
(e.g. stocks <- c("AAPL", "GOOG", "META")
) and then
piping to tq_get
.
## # A tibble: 31 × 8
## symbol company identifier sedol weight sector shares_held local_currency
## <chr> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
-## 1 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 2 GS GOLDMAN SAC… 38141G104 2407… 0.0635 - 5477820 USD
-## 3 MSFT MICROSOFT C… 594918104 2588… 0.0612 - 5477820 USD
-## 4 HD HOME DEPOT … 437076102 2434… 0.0592 - 5477820 USD
-## 5 CAT CATERPILLAR… 149123101 2180… 0.0539 - 5477820 USD
-## 6 AMGN AMGEN INC 031162100 2023… 0.0529 - 5477820 USD
-## 7 MCD MCDONALD S … 580135101 2550… 0.0518 - 5477820 USD
-## 8 V VISA INC CL… 92826C839 B2PZ… 0.0452 - 5477820 USD
-## 9 CRM SALESFORCE … 79466L302 2310… 0.0397 - 5477820 USD
-## 10 BA BOEING CO/T… 097023105 2108… 0.0372 - 5477820 USD
+## 1 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 2 MSFT MICROSOFT C… 594918104 2588… 0.0632 - 5477820 USD
+## 3 GS GOLDMAN SAC… 38141G104 2407… 0.0626 - 5477820 USD
+## 4 HD HOME DEPOT … 437076102 2434… 0.0589 - 5477820 USD
+## 5 CAT CATERPILLAR… 149123101 2180… 0.0534 - 5477820 USD
+## 6 AMGN AMGEN INC 031162100 2023… 0.0523 - 5477820 USD
+## 7 MCD MCDONALD S … 580135101 2550… 0.0506 - 5477820 USD
+## 8 V VISA INC CL… 92826C839 B2PZ… 0.0454 - 5477820 USD
+## 9 CRM SALESFORCE … 79466L302 2310… 0.0400 - 5477820 USD
+## 10 BA BOEING CO/T… 097023105 2108… 0.0369 - 5477820 USD
## # ℹ 21 more rows
…or, get an exchange.
@@ -259,20 +256,20 @@ Method 2B: Use index or exchange
tq_index("DOW") %>%
    slice(1:3) %>%
    tq_get(get = "stock.prices")
-## # A tibble: 8,115 × 15
+## # A tibble: 8,118 × 15
## symbol company identifier sedol weight sector shares_held local_currency
## <chr> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
-## 1 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 2 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 3 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 4 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 5 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 6 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 7 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 8 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 9 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## 10 UNH UNITEDHEALT… 91324P102 2917… 0.0996 - 5477820 USD
-## # ℹ 8,105 more rows
+## 1 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 2 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 3 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 4 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 5 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 6 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 7 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 8 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 9 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## 10 UNH UNITEDHEALT… 91324P102 2917… 0.101 - 5477820 USD
+## # ℹ 8,108 more rows
## # ℹ 7 more variables: date <date>, open <dbl>, high <dbl>, low <dbl>,
## # close <dbl>, volume <dbl>, adjusted <dbl>
You can use any applicable “getter” to get data for every
@@ -338,7 +335,7 @@ 3.0 Modeling Financial Data using p
Eventually you will want to begin modeling (or more generally
applying functions) at scale! One of the best features
of the tidyverse
is the ability to map functions to nested
-tibbles using purrr
. From the Many Models chapter of “R for Data Science”, we can apply the
+tibbles using purrr
. From the Many Models chapter of “R for Data Science”, we can apply the
same modeling workflow to financial analysis. Using a two step
workflow:
@@ -461,7 +458,7 @@ Analyze a Single Stock
}
Testing it out on a single stock. We can see that the “term” that contains the direction of the trend (the slope) is “year(date)”. The
-interpetation is that as year increases one unit, the annual returns
+interpretation is that as year increases one unit, the annual returns
decrease by 3%.
get_model(AAPL)
We can now apply our analysis function to the stocks using
-dplyr::mutate
and purrr::map
. The
+dplyr::mutate()
and purrr::map()
. The
mutate()
function adds a column to our tibble, and the
-map()
function maps our custom get_model
+map()
function maps our custom get_model
function to our tibble of stocks using the symbol
column.
-The tidyr::unnest
function unrolls the nested data frame so
-all of the model statistics are accessable in the top data frame level.
-The filter
, arrange
and select
-steps just manipulate the data frame to isolate and arrange the data for
-our viewing.
tidyr::unnest()
function unrolls the nested data frame
+so all of the model statistics are accessible in the top data frame
+level. The filter
, arrange
and
+select
steps just manipulate the data frame to isolate and
+arrange the data for our viewing.
stocks_model_stats <- stocks_tbl %>%
select(symbol, company) %>%
@@ -509,13 +506,13 @@ Scale to Many Stocks
    # Nest
group_by(symbol, company) %>%
- nest() %>%
+ nest() %>%
# Apply the get_model() function to the new "nested" data column
- mutate(model = map(data, get_model)) %>%
+ mutate(model = map(data, get_model)) %>%
# Unnest and collect slope
- unnest(model) %>%
+ unnest(model) %>%
filter(term == "year(date)") %>%
arrange(desc(estimate)) %>%
select(-term)
@@ -523,13 +520,13 @@ Scale to Many Stocks
stocks_model_stats
## # A tibble: 5 × 7
## # Groups: symbol, company [5]
-## symbol company data estimate std.error statistic p.value
-## <chr> <chr> <list> <dbl> <dbl> <dbl> <dbl>
-## 1 ATO ATMOS ENERGY CORP <tibble> 0.0291 0.0126 2.32 0.0490
-## 2 EMN EASTMAN CHEMICAL CO <tibble> 0.00566 0.0435 0.130 0.900
-## 3 TPR TAPESTRY INC <tibble> 0.000725 0.0382 0.0190 0.985
-## 4 FCX FREEPORT MCMORAN INC <tibble> -0.0460 0.0974 -0.472 0.649
-## 5 HWM HOWMET AEROSPACE INC <tibble> NA NA NA NA
+## symbol company data estimate std.error statistic p.value
+## <chr> <chr> <list> <dbl> <dbl> <dbl> <dbl>
+## 1 IEX IDEX CORP <tibble> 0.0178 0.0264 0.673 0.520
+## 2 FRT FEDERAL REALTY INVS TRUST <tibble> 0.0170 0.0165 1.03 0.334
+## 3 ALLE ALLEGION PLC <tibble> 0.0157 0.0850 0.185 0.870
+## 4 VRSN VERISIGN INC <tibble> 0.00669 0.0411 0.163 0.875
+## 5 PXD PIONEER NATURAL RESOURCE… <tibble> 0.00664 0.0686 0.0969 0.925
We’re done! We now have the coefficient of the linear regression that tracks the direction of the trend line. We can easily extend this type of analysis to larger lists or stock indexes. For example, the entire
@@ -566,7 +563,7 @@
Pros: Long running scripts are not interrupted because of one error
Cons: Errors can be inadvertently handled or
-flow downstream if the users does not read the warnings
-## # A tibble: 5,410 × 8
+## # A tibble: 5,412 × 8
## symbol date open high low close volume adjusted
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2013-01-02 19.8 19.8 19.3 19.6 560518000 16.8
@@ -601,7 +598,7 @@ Bad Apples Fail Gracefully, tq_get
## 8 AAPL 2013-01-11 18.6 18.8 18.5 18.6 350506800 15.9
## 9 AAPL 2013-01-14 18.0 18.1 17.8 17.9 734207600 15.3
## 10 AAPL 2013-01-15 17.8 17.8 17.3 17.4 876772400 14.9
-## # ℹ 5,400 more rows
+## # ℹ 5,402 more rows
Now switching complete_cases = FALSE
will retain any
errors as NA
values in a nested data frame. Notice that the
error message and output change. The error message now states that the
@@ -615,7 +612,7 @@
-## # A tibble: 5,411 × 8
+## # A tibble: 5,413 × 8
## symbol date open high low close volume adjusted
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2013-01-02 19.8 19.8 19.3 19.6 560518000 16.8
@@ -628,15 +625,14 @@ Bad Apples Fail Gracefully, tq_get
## 8 AAPL 2013-01-11 18.6 18.8 18.5 18.6 350506800 15.9
## 9 AAPL 2013-01-14 18.0 18.1 17.8 17.9 734207600 15.3
## 10 AAPL 2013-01-15 17.8 17.8 17.3 17.4 876772400 14.9
-## # ℹ 5,401 more rows
+## # ℹ 5,403 more rows
In both cases, the prudent user will review the warnings to determine
what happened and whether or not this is acceptable. In the
complete_cases = FALSE
example, if the user attempts to
perform downstream computations at scale, the computations will likely
-fail grinding the analysis to a hault. But, the advantage is that the
-user will more easily be able to filter to the problem childs to
-determine what happened and decide whether this is acceptable or
-not.