Python 0.5.0
Major Feature Release
Breaking Changes
- The JSON metadata codec now interprets the empty string as an empty object. This means
  that applying a schema to an existing table will no longer necessitate modifying the
  existing rows. (@benjeffery, #2064, #2104)
- Remove the previously deprecated as_bytes argument to TreeSequence.variants.
  If you need genotypes in byte form this can be done following the code in the
  to_macs method on line 5573 of trees.py.
  This argument was initially deprecated more than 3 years ago when the code was part of
  msprime. (@benjeffery, #605, #2172)
- Arguments after ploidy in write_vcf are now keyword-only.
  (@jeromekelleher, #2329, #2315)
- When metadata equal to b'' is printed to text or HTML tables it will render as
  an empty string rather than "b''". (@hyanwong, #2349, #2351)
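
The first breaking change in the list above (an empty string decoding as an empty object) can be pictured with a small standalone sketch. This is not the library's codec: `decode_json_metadata` and the `legacy_rows` data are hypothetical names invented for illustration, and the sketch assumes that rows written before a schema was applied hold empty metadata bytes.

```python
import json


def decode_json_metadata(encoded: bytes) -> dict:
    """Hypothetical decode step: empty bytes are read as an empty object."""
    if len(encoded) == 0:
        # Without this special case, json.loads("") raises a JSONDecodeError.
        return {}
    return json.loads(encoded.decode("utf-8"))


# Assumed example data: two rows predate the schema and hold empty metadata.
legacy_rows = [b"", b'{"name": "n1"}', b""]
decoded = [decode_json_metadata(row) for row in legacy_rows]
print(decoded)
```

Under this reading, rows with empty metadata decode without being rewritten, which is the practical effect the release note describes.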
Changes
- A min_time parameter in draw_svg enables using the youngest node as the y-axis
  minimum value, allowing negative times.
  (@hyanwong, #2197, #2215)
- VcfWriter.write now prints the site ID of variants in the ID field of the
  output VCF files.
  (@roohy, #2103, #2107)
- Make dumping of tables and tree sequences to disk a zero-copy operation.
  (@benjeffery, #2111, #2124)
- Add copy argument to TreeSequence.variants which, if False, reuses the
  returned Variant object for improved performance. Defaults to True.
  (@benjeffery, #605, #2172)
- tree.mrca now takes 2 or more arguments and gives the common ancestor of them all.
  (@savitakartik, #1340, #2121)
- Add an edge attribute to the Mutation class that gives the ID of the
  edge that the mutation falls on.
  (@jeromekelleher, #685, #2279).
- Add the TreeSequence.split_edges operation which inserts nodes into
  edges at a specific time.
  (@jeromekelleher, #2276, #2296).
- Add the TreeSequence.decapitate (and closely related
  TableCollection.delete_older) operation to remove topology and mutations
  older than a given time.
  (@jeromekelleher, #2236, #2302, #2331).
- Add the TreeSequence.individuals_time and TreeSequence.individuals_population
  methods to return arrays of per-individual times and populations, respectively.
  (@petrelharp, #1481, #2298).
- Add the sample_mask and site_mask arguments to write_vcf to allow parts
  of an output VCF to be omitted or marked as missing data. Also add the
  as_vcf convenience function, to return VCF as a string.
  (@jeromekelleher, #2300).
- Add support for missing data to write_vcf, and add the isolated_as_missing
  argument. (@jeromekelleher, #2329, #447).
- Add Tree.num_children_array and Tree.num_children. These return the number of
  child nodes for every node in the tree and for a single node, respectively.
  (@GertjanBisschop, #2318, #2319, #2332)
- Add Tree.path_length.
  (@jeremyguez, #2249, #2259).
- Add B1 tree balance index.
  (@jeremyguez, @jeromekelleher, #2251, #2281, #2346).
- Add B2 tree balance index.
  (@jeremyguez, @jeromekelleher, #2252, #2353, #2354).
- Add Sackin tree imbalance index.
  (@jeremyguez, @jeromekelleher, #2246, #2258).
- Add Colless tree imbalance index.
  (@jeremyguez, @jeromekelleher, #2250, #2266, #2344).
- Add direction argument to TreeSequence.edge_diffs, allowing iteration
  over diffs in the reverse direction. NOTE: this comes with a ~10% performance
  regression as the implementation was moved from C to Python for simplicity
  and maintainability. Please open an issue if this affects your application.
  (@jeromekelleher, @benjeffery, #2120).
- Add Tree.edge_array and Tree.edge. These return the ID of the edge encoding
  the relationship between a node and its parent, for every node in the tree
  and for a single node, respectively.
  (@GertjanBisschop, #2361, #2357)
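
For readers unfamiliar with the Sackin and Colless imbalance indices listed above, the following pure-Python sketch computes both on two toy four-leaf trees. It is an illustration under stated assumptions, not the library's implementation: the child-to-parent dictionary layout and every function name here are invented for the example. The sketch uses the standard definitions: Sackin as the sum of leaf depths, and Colless (binary trees only) as the sum over internal nodes of the absolute difference in leaf counts between the two children.

```python
# Toy trees as child -> parent mappings; None marks the root.
# This layout is an illustrative assumption, not the library's data model.
BALANCED = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6, 6: None}  # ((0,1),(2,3))
COMB = {0: 4, 1: 4, 4: 5, 2: 5, 5: 6, 3: 6, 6: None}      # (((0,1),2),3)


def children_of(parent):
    """Invert the child -> parent mapping into parent -> list of children."""
    kids = {}
    for child, par in parent.items():
        if par is not None:
            kids.setdefault(par, []).append(child)
    return kids


def depth(parent, u):
    """Number of edges on the path from node u up to the root."""
    d = 0
    while parent[u] is not None:
        u = parent[u]
        d += 1
    return d


def sackin_index(parent):
    """Sum of the depths of all leaves; larger means less balanced."""
    kids = children_of(parent)
    return sum(depth(parent, u) for u in parent if u not in kids)


def num_leaves_below(kids, u):
    """Count the leaves in the subtree rooted at u."""
    if u not in kids:
        return 1
    return sum(num_leaves_below(kids, c) for c in kids[u])


def colless_index(parent):
    """Sum over internal nodes of |leaves(left) - leaves(right)|."""
    kids = children_of(parent)
    total = 0
    for u, cs in kids.items():
        if len(cs) != 2:
            raise ValueError("Colless index is only defined for binary trees")
        left, right = cs
        total += abs(num_leaves_below(kids, left) - num_leaves_below(kids, right))
    return total


for name, tree in [("balanced", BALANCED), ("comb", COMB)]:
    print(name, sackin_index(tree), colless_index(tree))
```

Both indices are lower on the balanced tree than on the comb tree, which is why they are described as imbalance measures.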