README
hrbrmstr committed Jul 25, 2019
1 parent c233760 commit 6e2d108
Showing 2 changed files with 45 additions and 36 deletions.
10 changes: 4 additions & 6 deletions README.Rmd
@@ -4,14 +4,12 @@ editor_options:
chunk_output_type: console
---
```{r pkg-knitr-opts, include=FALSE}
knitr::opts_chunk$set(collapse=TRUE, fig.retina=2, message=FALSE, warning=FALSE)
options(width=120)
hrbrpkghelpr::global_opts()
```

[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/tdigest.svg?branch=master)](https://travis-ci.org/hrbrmstr/tdigest)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/hrbrmstr/tdigest?branch=master&svg=true)](https://ci.appveyor.com/project/hrbrmstr/tdigest)
[![Coverage Status](https://codecov.io/gh/hrbrmstr/tdigest/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/tdigest)
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/tdigest)](https://cran.r-project.org/package=tdigest)
```{r badges, results='asis', echo=FALSE, cache=FALSE}
hrbrpkghelpr::stinking_badges()
```

# tdigest

71 changes: 41 additions & 30 deletions README.md
@@ -1,62 +1,73 @@

[![Travis-CI Build
[![Project Status: Active – The project has reached a stable, usable
state and is being actively
developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![Signed
by](https://img.shields.io/badge/Keybase-Verified-brightgreen.svg)](https://keybase.io/hrbrmstr)
![Signed commit
%](https://img.shields.io/badge/Signed_Commits-95.7%25-lightgrey.svg)
[![Linux build
Status](https://travis-ci.org/hrbrmstr/tdigest.svg?branch=master)](https://travis-ci.org/hrbrmstr/tdigest)
[![AppVeyor Build
Status](https://ci.appveyor.com/api/projects/status/github/hrbrmstr/tdigest?branch=master&svg=true)](https://ci.appveyor.com/project/hrbrmstr/tdigest)
[![Windows build
status](https://ci.appveyor.com/api/projects/status/github/hrbrmstr/tdigest?svg=true)](https://ci.appveyor.com/project/hrbrmstr/tdigest)
[![Coverage
Status](https://codecov.io/gh/hrbrmstr/tdigest/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/tdigest)
[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/tdigest)](https://cran.r-project.org/package=tdigest)
![Minimal R
Version](https://img.shields.io/badge/R%3E%3D-3.5.0-blue.svg)
![License](https://img.shields.io/badge/License-MIT-blue.svg)

# tdigest

Wicked Fast, Accurate Quantiles Using ‘t-Digests’

## Description

The t-digest construction algorithm uses a variant of 1-dimensional
The t-Digest construction algorithm uses a variant of 1-dimensional
k-means clustering to produce a very compact data structure that allows
accurate estimation of quantiles. This t-digest data structure can be
accurate estimation of quantiles. This t-Digest data structure can be
used to estimate quantiles, compute other rank statistics or even to
estimate related measures like trimmed means. The advantage of the
t-digest over previous digests for this purpose is that the t-digest
handles data with full floating point resolution. With small changes,
the t-digest can handle values from any ordered set for which we can
compute something akin to a mean. The accuracy of quantile estimates
produced by t-digests can be orders of magnitude more accurate than
those produced by previous digest algorithms.
t-Digest over previous digests for this purpose is that the t-Digest
handles data with full floating point resolution. The accuracy of
quantile estimates produced by t-Digests can be orders of magnitude more
accurate than those produced by previous digest algorithms. Methods are
provided to create and update t-Digests and retrieve quantiles from the
accumulated distributions.

See [the original paper by Ted
Dunning](https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf)
for more details on t-Digests.
See [the original paper by Ted Dunning & Otmar
Ertl](https://arxiv.org/abs/1902.04023) for more details on t-Digests.
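
To make the idea of a compact cluster summary more concrete, here is a toy base-R sketch. It is an illustration only and not the t-Digest algorithm: it sorts the data into equal-count clusters, keeps just each cluster's mean and count, and interpolates over cumulative counts to estimate quantiles. The names `toy_quantile`, `centroids`, and `counts` are hypothetical. The real t-Digest builds its clusters incrementally and keeps them smaller near the tails.

```r
# Toy illustration only: NOT the t-Digest algorithm.
set.seed(42)
x <- rnorm(100000)

# Summarise the data as k roughly equal-count clusters, keeping (mean, count) only
k <- 200
grp <- cut(rank(x, ties.method = "first"), breaks = k, labels = FALSE)
centroids <- as.numeric(tapply(x, grp, mean))
counts <- tabulate(grp, nbins = k)

# Estimate quantiles by interpolating between cluster means, treating each
# mean as sitting at the midpoint of its cluster's cumulative mass
toy_quantile <- function(centroids, counts, p) {
  mid <- (cumsum(counts) - counts / 2) / sum(counts)
  approx(mid, centroids, xout = p, rule = 2)$y
}

probs <- c(0.1, 0.5, 0.9)
est <- toy_quantile(centroids, counts, probs)
exact <- quantile(x, probs)
print(rbind(toy = est, exact = exact))
```

The summary stores 200 (mean, count) pairs instead of 100,000 values, which is the trade-off the description above refers to.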

## What’s Inside The Tin

The following functions are implemented:

- `td_add`: Add a value to the t-digest with the specified count
- `td_add`: Add a value to the t-Digest with the specified count
- `td_create`: Allocate a new histogram
- `td_merge`: Merge one t-digest into another
- `td_merge`: Merge one t-Digest into another
- `td_quantile_of`: Return the quantile of the value
- `td_total_count`: Total items contained in the t-digest
- `td_total_count`: Total items contained in the t-Digest
- `td_value_at`: Return the value at the specified quantile
- `tquantile`: Calculate sample quantiles from a t-digest
- `tquantile`: Calculate sample quantiles from a t-Digest
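
A minimal sketch of how the functions listed above might fit together. The function names come from the list; the argument order shown (`td_create(compression)`, `td_add(td, val, count)`, `td_merge(from, into)`, `td_value_at(td, q)`, `td_quantile_of(td, val)`, `tquantile(td, probs)`) is an assumption and should be checked against the package help pages.

```r
library(tdigest)

# Argument order below is assumed, not confirmed by this README excerpt.
td <- td_create(100)                      # new digest; 100 is the assumed compression value
for (v in c(1, 2, 3, 4, 5)) td_add(td, v, 1)  # add each value with count 1

td2 <- td_create(100)
for (v in c(6, 7, 8)) td_add(td2, v, 1)

td_merge(td2, td)                         # merge td2 into td (assumed order: from, into)

n <- td_total_count(td)                   # total number of items now in td
med <- td_value_at(td, 0.5)               # estimated value at the median
q_of_3 <- td_quantile_of(td, 3)           # estimated quantile of the value 3
qs <- tquantile(td, c(0.25, 0.5, 0.75))   # several quantiles at once
```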

## Installation

``` r
install.packages("tdigest", repos = "https://cinc.rud.is")
# or
devtools::install_git("https://git.rud.is/hrbrmstr/tdigest.git")
remotes::install_git("https://git.rud.is/hrbrmstr/tdigest.git")
# or
devtools::install_git("https://git.sr.ht/~hrbrmstr/tdigest")
remotes::install_git("https://git.sr.ht/~hrbrmstr/tdigest")
# or
devtools::install_gitlab("hrbrmstr/tdigest")
remotes::install_gitlab("hrbrmstr/tdigest")
# or
devtools::install_bitbucket("hrbrmstr/tdigest")
remotes::install_bitbucket("hrbrmstr/tdigest")
# or
devtools::install_github("hrbrmstr/tdigest")
remotes::install_github("hrbrmstr/tdigest")
```

NOTE: To use the ‘remotes’ install options you will need to have the
[{remotes} package](https://github.com/r-lib/remotes) installed.

## Usage

``` r
@@ -145,19 +156,19 @@ microbenchmark::microbenchmark(
r_quantile = quantile(x, c(0, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99, 1))
)
## Unit: microseconds
## expr min lq mean median uq max neval cld
## tdigest 7.943 9.4015 20.94626 11.957 32.9395 48.487 100 a
## r_quantile 52305.639 53309.4185 55386.25517 54038.227 56644.9055 94300.294 100 b
## expr min lq mean median uq max neval
## tdigest 5.324 6.712 19.18354 12.0475 26.941 84.919 100
## r_quantile 61442.143 64031.655 68172.17037 66155.0690 70321.910 132065.801 100
```

## tdigest Metrics

| Lang | \# Files | (%) | LoC | (%) | Blank lines | (%) | \# Lines | (%) |
| :----------- | -------: | ---: | --: | ---: | ----------: | ---: | -------: | ---: |
| C | 3 | 0.27 | 350 | 0.65 | 46 | 0.36 | 26 | 0.11 |
| R | 6 | 0.55 | 139 | 0.26 | 31 | 0.24 | 135 | 0.58 |
| Rmd | 1 | 0.09 | 36 | 0.07 | 40 | 0.31 | 45 | 0.19 |
| C/C++ Header | 1 | 0.09 | 10 | 0.02 | 10 | 0.08 | 28 | 0.12 |
| R | 6 | 0.55 | 140 | 0.26 | 31 | 0.24 | 139 | 0.57 |
| Rmd | 1 | 0.09 | 36 | 0.07 | 40 | 0.31 | 52 | 0.21 |
| C/C++ Header | 1 | 0.09 | 10 | 0.02 | 10 | 0.08 | 26 | 0.11 |

## Code of Conduct
