Commit
Merge pull request #13 from camillescott/patch-1
FIx typos
ryneches authored Apr 4, 2018
2 parents 3e7c02f + 5f26904 commit ed0a599
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/paper.md
@@ -32,13 +32,13 @@ Python has several packages for working with phylogenetic trees, each focused on
* The [`Bio.Phylo`](http://dx.doi.org/10.1186/1471-2105-13-209) subpackage in [biopython](http://biopython.org/) collects useful tools for working with common (and not so common) file formats in phylogenetics, along with utilities for analysis and visualization [@biophylo]
* The [`skbio.tree`](http://scikit-bio.org/docs/latest/tree.html) module in [`scikit-bio`](http://scikit-bio.org/) is a base class for phylogenetic trees providing analytical and file processing functions for working with phylogenetic trees [@skbio]

-Each of these packages allow trees to be manipulated, edited and reshaped. To make this possible, they must strike a balance between raw performance and flexibility, and most prioritize flexibility and a rich set of features. This is desireable for most use cases, but computational scaling challanges arise when using these packages to work with very large trees. Trees representing microbial communities may contain tens of thousands to tens of millions of taxa, depending on the community diversity and the survey methodology.
+Each of these packages allow trees to be manipulated, edited and reshaped. To make this possible, they must strike a balance between raw performance and flexibility, and most prioritize flexibility and a rich set of features. This is desireable for most use cases, but computational scaling challenges arise when using these packages to work with very large trees. Trees representing microbial communities may contain tens of thousands to tens of millions of taxa, depending on the community diversity and the survey methodology.

`SuchTree` is designed purely as a backend for analysis of large trees. Significant advantages in memory layout, parallelism and speed are achieved by sacrificing the ability to manipulate, edit or reshape trees (these capabilities exist in other packages). It scales to millions of taxa, and the key algorithms and data structures permit concurrent threads without locks.

![](nj_vs_ml.png)

-**Figure 1 :** Two phylogenetic trees of 54,327 taxa were constructed using different methods (approximate maximum likelihood using [`FastTree`](http://www.microbesonline.org/fasttree/) and the [`neighbor joining`](https://en.wikipedia.org/wiki/Neighbor_joining)) agglomerative clustering method). To explore the different topologies of the trees, pairs of taxa were chosen at random and the patristic distance between each pair was computed through each of the two trees. This plot shows 1,000,000 random pairs sampled from 1,475,684,301 possible pairs (0.07%). The two million distances calculations required about 12.5 seconds using a single thread.
+**Figure 1 :** Two phylogenetic trees of 54,327 taxa were constructed using different methods (approximate maximum likelihood using [`FastTree`](http://www.microbesonline.org/fasttree/) and the [`neighbor joining`](https://en.wikipedia.org/wiki/Neighbor_joining) agglomerative clustering method). To explore the different topologies of the trees, pairs of taxa were chosen at random and the patristic distance between each pair was computed through each of the two trees. This plot shows 1,000,000 random pairs sampled from 1,475,684,301 possible pairs (0.07%). The two million distances calculations required about 12.5 seconds using a single thread.

`SuchTree` supports co-phylogenies, with functions for efficiently extracting graphs and subgraphs for network analysis, and has native support for [`igraph`](http://igraph.org/) and [`networkx`](https://networkx.github.io/).
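
Reviewer note: the comparison described in the Figure 1 caption (sample random pairs of taxa, then compute the patristic distance for each pair through both trees) can be illustrated with a minimal pure-Python sketch. This is not `SuchTree`'s implementation or API. The toy trees `TREE_A` and `TREE_B`, their branch lengths, and the helpers `path_to_root`, `patristic_distance` and `sample_pairs` are all hypothetical names made up for illustration. The last line only checks the pair count quoted in the caption.

```python
import random
from math import comb

# Hypothetical toy trees: node -> (parent, branch length to parent).
# The root has parent None. Both trees share the same four leaves.
TREE_A = {"root": (None, 0.0), "x": ("root", 1.0), "y": ("root", 2.0),
          "A": ("x", 0.5), "B": ("x", 1.5), "C": ("y", 1.0), "D": ("y", 3.0)}
TREE_B = {"root": (None, 0.0), "x": ("root", 1.0), "y": ("root", 1.0),
          "A": ("x", 0.5), "C": ("x", 0.5), "B": ("y", 2.0), "D": ("y", 2.0)}
LEAVES = ["A", "B", "C", "D"]

def path_to_root(tree, node):
    """Map node and each of its ancestors to their distance from node."""
    dist, seen = 0.0, {}
    while node is not None:
        seen[node] = dist
        node, length = tree[node]
        dist += length
    return seen

def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between leaves a and b."""
    up_a = path_to_root(tree, a)
    node, dist = b, 0.0
    while node not in up_a:        # climb from b to the common ancestor
        node, length = tree[node][0], tree[node][1]
        dist += length
    return dist + up_a[node]

def sample_pairs(leaves, k, seed=None):
    """Draw k random pairs of distinct leaves (pairs may repeat)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(leaves, 2)) for _ in range(k)]

# One (distance in tree A, distance in tree B) point per sampled pair,
# which is the kind of data a scatter plot like Figure 1 would show.
points = [(patristic_distance(TREE_A, a, b), patristic_distance(TREE_B, a, b))
          for a, b in sample_pairs(LEAVES, 5, seed=0)]

# Number of unordered pairs among 54,327 taxa, as quoted in the caption.
n_pairs = comb(54327, 2)
```

Here `n_pairs` evaluates to 1,475,684,301, matching the caption, and 1,000,000 sampled pairs are about 0.07% of that total. The dictionary-based walk above is chosen for readability only and says nothing about how the package lays out its data.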
