Distance tree
The executables are in the directory $DT/phylogeny/.
Optimize an existing distance tree or create a distance tree.
Main parameters:
- -input_tree tree: tree file produced with the -output_tree parameter;
- dissimilarity data:
  - -data data: data file in Data Master format (see Data Master format), or an incremental distance tree directory ending with /;
  - -dissim_attr dissim_attr: dissimilarity attribute in the data file;
- dissimilarity transformations:
  - -dissim_coeff c: dissimilarities are multiplied by c;
  - -dissim_power p: dissimilarities are raised to the power of p;
- dissimilarity variance:
  - -variance { lin | sqr | pow | exp | linExp };
  - -variance_power p: non-negative power for -variance pow;
  - -variance_dissim: flag indicating that the variance function is applied to dissimilarities rather than to tree distances;
  - -variance_min m: minimum dissimilarity variance to be added to the computed dissimilarity variance;
- deletion of objects:
  - -delete obj_list: list of objects to delete from the tree;
  - -keep obj_list: list of objects to keep in the tree; all other objects are deleted;
- optimization:
  - -optimize: flag indicating that the tree must be optimized;
  - -subgraph_iter_max i: maximum number of iterations of subgraph optimizations;
  - -skip_len: flag indicating that arc length optimization should be skipped;
  - -reinsert: flag indicating the usage of optimization by reinsertion;
- fitness outliers:
  - -delete_criterion_outliers criterion_outlier_list: output file to save the list of criterion outliers;
  - -criterion_outlier_num_max n: maximum length of criterion_outlier_list;
  - -delete_deformation_outliers deformation_outlier_list: output file to save the list of deformation outliers;
  - -deformation_outlier_num_max n: maximum length of deformation_outlier_list;
- hybrid outliers:
  - -hybridness_min hybridness_min: minimum hybridness of hybrid triangles;
  - -dissim_boundary b: dissimilarity threshold at which two different dissimilarities are merged, causing a discontinuity. Hybrid triangles are not identified for dissimilarities close to this value;
  - -delete_hybrids hybrid_triangles: output file with hybrid triangles;
- -reroot_at obj1:obj2: make the middle of the arc of the least common ancestor of the objects named obj1 and obj2 the root of the tree;
- -output_tree tree: create a tree file in internal format;
- -threads n: use n processor threads.
Create a tree using the Data Master file $DT/phylogeny/data/Saccharomyces.dm:
$DT/phylogeny/makeDistTree -threads 3 -data $DT/phylogeny/data/Saccharomyces \
-variance linExp -optimize -subgraph_iter_max 2 \
-hybridness_min 1.2 -delete_hybrids Saccharomyces.hybrid -dissim_boundary 0.675 \
-output_tree Saccharomyces.tree
Remove from the tree in.tree all objects which are not in the list file list:
$DT/phylogeny/makeDistTree -input_tree in.tree -keep list -output_tree out.tree
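The list passed to -keep can be derived from the full object list of a tree. The sketch below is only an illustration: it assumes the list file holds one object name per line, and the object names, all.list, drop.list and keep.list are hypothetical. The makeDistTree call is the one shown above and runs only if the executable and in.tree are present.

```shell
# Hypothetical object names; in practice the full list would come from
#   $DT/phylogeny/tree2obj.sh in.tree > all.list
printf '%s\n' objA objB objC objD > all.list

# Hypothetical names of the objects to drop (assumed: one name per line)
printf '%s\n' objB objD > drop.list

# Keep every name that is not in drop.list:
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
grep -v -x -F -f drop.list all.list > keep.list
cat keep.list

# Same invocation as in the example above; skipped if the tool or tree is missing
if [ -n "${DT:-}" ] && [ -x "$DT/phylogeny/makeDistTree" ] && [ -f in.tree ]; then
  "$DT/phylogeny/makeDistTree" -input_tree in.tree -keep keep.list -output_tree out.tree
fi
```

Matching whole lines with fixed strings avoids dropping objects whose names merely contain one of the listed names.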
Find genogroups in a tree given a distance threshold.
Main parameters:
- input_tree: Input tree file;
- genogroup_dist: Max. distance between objects of the same genogroup;
- -genogroup_table table: Output file with lines: <object> <genogroup leader>;
- -genogroups genogroups: Output file with the names of the interior nodes which are genogroup roots;
- -genogroup_under_genogroup table: Output file with lines: <node1 LCA name> <node2 LCA name>, where nodes belong to different genogroups, but node1 is a child of node2.
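A file written with -genogroup_table can be summarized with standard tools. The sketch below counts the objects per genogroup leader. It assumes the two columns are separated by whitespace and that names contain no spaces; the file names and the sample rows are hypothetical, not real output.

```shell
# Hypothetical table with lines: <object> <genogroup leader>
printf '%s\t%s\n' s1 s1  s2 s1  s3 s3 > genogroup.tsv

# Count the objects per leader (column 2) and sort by leader name
awk '{ n[$2]++ } END { for (g in n) print g "\t" n[g] }' genogroup.tsv \
  | sort > genogroup_sizes.tsv
cat genogroup_sizes.tsv
```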
Print the list of objects of a distance tree.
Parameter: Input distance tree made by makeDistTree.
Optimize an existing tree using a subset of dissimilarities with a change of dissimilarity variance:
$DT/phylogeny/tree2obj.sh Saccharomyces.tree > Saccharomyces.list
$DT/dm/dm2subset $DT/phylogeny/data/Saccharomyces Saccharomyces.list > subset.dm
$DT/phylogeny/makeDistTree -threads 3 -input_tree Saccharomyces.tree -data subset \
-variance pow -variance_power 3 -optimize -subgraph_iter_max 2
Extract the list of hybrid objects from the file hybrid_triangles made by makeDistTree and print it.
Parameter: file hybrid_triangles.
Main parameters:
- Input tree file;
- -name_match name_match: File with lines: <name_old> <tab> <name_new>, to replace leaf names;
- -decimals decimals: Number of decimals in arc lengths, default = 6;
- -format { newick | itree (makeDistTree output) | ASNT (textual ASN.1) }: default = newick;
- -ext_name: Extended leaf names for newick;
- -order: Order subtrees by the number of leaves, descending.
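A -name_match file can be prepared and checked before it is used. The sketch below assumes each line is exactly <name_old>, one tab character, <name_new>, and that the parameters above belong to printDistTree as used in the examples of this section. The names, names.tsv and renamed.nw are hypothetical; the conversion runs only if the executable and the tree file exist.

```shell
# Hypothetical old/new leaf names, one pair per line, tab-separated
printf '%s\t%s\n' GCF_0001 strain_A  GCF_0002 strain_B > names.tsv

# Sanity check: every line must have exactly two tab-separated fields
bad_lines=$(awk -F '\t' 'NF != 2' names.tsv | wc -l | tr -d ' ')
echo "malformed lines: $bad_lines"

# Assumed invocation, modeled on the printDistTree examples in this section
if [ -n "${DT:-}" ] && [ -x "$DT/phylogeny/printDistTree" ] && [ -f Enterobacteriaceae.tree ]; then
  "$DT/phylogeny/printDistTree" Enterobacteriaceae.tree -name_match names.tsv > renamed.nw
fi
```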
Convert a tree from an internal format to Newick adding normalized object criterion to each leaf:
$DT/phylogeny/printDistTree -data data/Enterobacteriaceae -dissim_attr Conservation \
-variance linExp Enterobacteriaceae.tree \
-order -decimals 4 -ext_name > Enterobacteriaceae.nw
Convert a tree from an internal format to Newick without adding normalized object criterion to each leaf:
$DT/phylogeny/printDistTree Enterobacteriaceae.tree -order -decimals 4 \
> Enterobacteriaceae.nw
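A quick structural check of a Newick file can catch a truncated conversion. The sketch below only tests two general properties of the Newick format: parentheses are balanced and the tree ends with a semicolon. It assumes leaf names contain no parentheses; example.nw and its content are hypothetical, not output of printDistTree.

```shell
# Hypothetical Newick tree with arc lengths printed to 4 decimals
printf '%s\n' '((A:0.1000,B:0.2000):0.0500,C:0.3000);' > example.nw

# Count opening and closing parentheses
open=$(tr -cd '(' < example.nw | wc -c)
close=$(tr -cd ')' < example.nw | wc -c)

# Last non-whitespace character of the file
last=$(tr -d '[:space:]' < example.nw | tail -c 1)

if [ "$open" -eq "$close" ] && [ "$last" = ";" ]; then
  newick_status=ok
else
  newick_status=broken
fi
echo "$newick_status"
```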
Convert a newick tree to the makeDistTree tree format.
Parameter: Input newick tree.
PAUP* version used: Portable version 4.0b10 for Unix
$DT/phylogeny/attr2_2paup $DT/phylogeny/data/Saccharomyces cons map > Saccharomyces.nex
$ paup Saccharomyces.nex
paup> Set criterion=distance;
paup> dset objective=lsfit power=2;
paup> hsearch
...
Elapsed   Taxa    Rearr.   -- Number of trees --    Best
time      added   tried    saved   left-to-swap     tree(s)
--------------------------------------------------------------
0:01:00   -       247      1       1                3148.9391
...
1:00:07   -       14984    1       1                1334.1309
^C
$DT/phylogeny/makeDistTree -data $DT/phylogeny/data/Saccharomyces -variance sqr \
-variance_dissim -optimize
Takes 2 min.
Abs. criterion = 6.4861e+02.