forked from pycogent/pycogent
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathChangeLog
executable file
·642 lines (563 loc) · 29 KB
/
ChangeLog
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
*********
Changelog
*********
Cogent 1.5.3 - 1.9
==================
NOTE: There have been so many changes since the 1.5.3 release that it is not
viable to list themn all! Suffice it to say, many new features added and many bugs
have been squashed. The best way to see them is using either git or mercurial
to examine the commit logs.
Cogent 1.5.2 - 1.5.3
====================
New Features
------------
* Added a withoutLostSpans() method to Feature objects in
cogent.core.annotation. Useful after projecting features from one aligned
sequence across to another. Implemented for ordinary Features and
SimpleVariables but not xxy_list Variables.
Changes
-------
* Make Span.remapWith() a little clearer in cogent.core.location.
* Tidy annotation remapping code in cogent.core.annotation and
cogent.core.sequence so that new positions are only calculated once when
slicing, projecting, or otherwise remapping parts of sequences. The old code
was needlessly doing it twice.
Bug Fixes
---------
* Fixed bug in BLAT application controller (cogent.app.blat) which would drop
some input sequences when running assign_dna_reads_to_protein_database.
* Prevent negative widths from arising in cogent.draw.compatibility when
alignment is too wide.
Cogent 1.5.1 - 1.5.2
====================
New Features
------------
* Added new mantel_test function to cogent.maths.stats.test that allows the
type of significance test to be specified. This function is meant to replace
the pre-existing mantel function.
* Added new correlation_test function to cogent.maths.stats.test that computes
the correlation (pearson or spearman) between two vectors, in addition to
parametric and nonparametric tests of significance and confidence intervals.
This function gives more control and information than the pre-existing
correlation function. The spearman function is also a new addition.
* Added new mc_t_two_sample function to cogent.maths.stats.test that performs a
two-sample t-test and uses Monte Carlo permutations to determine
nonparametric significance (similar to R's Deducer::perm.t.test).
* Added guppy 1.1, pplacer 1.1, ParsInsert 1.04, usearch 5.2.32, rtax 0.981,
raxml 7.3.0, BLAT 34, and BWA 0.6.2 application controllers.
* Added new functions to cogent.maths.stats.rarefaction that provide
alternative ways to perform rarefaction subsampling.
* Added convenience wrappers assign_dna_reads_to_database,
assign_dna_reads_to_protein_database, and assign_dna_reads_to_dna_database
for BLAT, BWA, and usearch with consistent interface across all three.
Changes
-------
* Minimum matplotlib version now set to 1.1.0.
* Minimum Vienna package version now set to 1.8.5.
* The pearson function in cogent.maths.stats.test has more robust
error-checking.
* The mantel and mantel_test functions in cogent.maths.stats.test now check for
symmetric, hollow distance matrices as input by default, with an option to
disable these checks.
* cogent.draw.distribution_plots now uses matplotlib proxy Artists for legend
creation (this simplifies the code a bit). Added ability to set the size of
plot figures through two new optional parameters to generate_box_plots and
generate_comparative_plots. More robust checks have been put in place in
case making room for labels fails (this now uses matplotlib 1.1.0's new
tight_layout() functionality, but this can still fail in some cases).
* cogent.app.raxml (version 7.0.3) is now deprecated and will be removed in
1.6.0. Please use cogent.app.raxml_v730 instead (version 7.3.0).
* cogent.app.muscle (version 3.6) is now deprecated and will be removed in
1.6.0. Please use cogent.app.muscle_v38 instead (version 3.8).
* Updated cogent.app.uclust to handle --stepwords and --w.
Bug Fixes
---------
* improve handling of reading frames from Ensembl
* actually included the test_ensembl/test_metazoa.py file that was
accidentally overlooked.
* fixed small diff in postcript output from RNAfold
* Deprecation and discontinued warnings are now not ignored by default.
cogent.util.warning was ignored in Python 2.7 because it uses
DeprecationWarnings. These warnings are temporarily forced to not be ignored.
* Included test_app/test_formatdb.py and test_app/test_mothur.py files in
alltests.py.
* Fixed test_safe_md5 in tests.test_util.test_misc to no longer run an MD5 over
a file in PyCogent (this caused the test to break when a new release went out
because the MD5 changes due to the new version string). The test now writes a
temporary file populated with fixed data and computes the MD5 from that.
* Fixed data_file_links.html in the PyCogent documentation to correctly point
to several data files that were previously unreachable.
Cogent 1.5 - 1.5.1
==================
New Features
------------
* Alignments can now add sequences that are pairwise aligned to a sequence
already present in the alignment.
* Alignment.addSeqs has more flexibility with the specific order of sequences
now controllable by the user. Thanks to Jan Kosinski for these two very
useful patches!
* Increased options for reading Table data from files: limit keyword; and,
line based (as distinct from column-based) type-casting of delimited files.
* Flexible parser for raw Greengenes 16S records
* Add fast pairwise distance estimation for DNA/RNA sequences. Currently only
Jukes-Cantor 1969 and Tamura-Nei 1993 distances are provided. A cookbook
entry was added to building_phylogenies.
* Added a PredefinedNucleotide substitution model class. This class uses
Cython implementations for nucleotide models where analytical solutions
are available. Substantial speedups are achieved. These implementations
do not support obtaining the rate matrix. Use the older style implementation
if you require that (toggled by the rate_matrix_required argument).
* Added fit_function function. This allows to fit any model to a x and y
dataset using simplex to reduce the error between the model and the data.
* Added parsers for bowtie and for BLAT's PSL format
* Table can now read/write gzipped files.
* GeneticCode class has a getStopIndices method. Returns the index positions
of stop codons in a sequence for a user specified translation frame.
* Added LogDet metric to cogent.evolve.pairwise_distance. With able assistance
from Yicheng Zhu. Thanks Yicheng!
* Added jackknife code to cogent.maths.stats.jackknife. This can be used to
measure confidence of an estimate from a single vector or a matrix. Thanks
to Anuj Pahwa for help implementing this!
* Added abundance-based Jaccard beta diversity index (Chao et. al., 2005)
Changes
-------
* python 2.6 is now the minimum required version
* We have removed code authored by Ziheng Yang as it is not available under
an open source license. We note a modest performance hit for nucleotide and
dinucleotide models. Codon models are not affected. The PredefinedNucleotide
models recently added are faster than the older approach that used Yang's
code.
* The PredefinedNucleotide models are now available via cogent.evolve.models.
The old-style (slower) nucleotide models can be obtained by setting
rate_matrix_required=True.
* RichGenbankParser can now return WGS blocks
Bug Fixes
---------
* fixed bug that crept into doing consensus trees from tree collections.
Thanks to Klara Verbyla for catching this one!
* fixed a bug (#3170464) affecting obtaining sequences from non-chromosome
level coordinate systems. Thanks to brandoninvergo for reporting and Hua
Ying for the patch!
* fixed a bug (#2987278) associated with missing unit tests for gbseq.py
* fixed a bug (#2987264) associated with missing unit tests for paml_matrix.py
* fixed a bug (#2987238) associated with missing unit tests for tinyseq.py
Cogent 1.4.1 - 1.5
==================
New Features
------------
* major additions to Cookbook. Thanks to the many contributors (too many to
list here)!
* added AlleleFreqs attribute to ensembl Variation objects.
* added getGeneByStableId method to genome objects.
* added Introns attribute to Transcript objects and an Intron class. (Thanks
to Hua Ying for this patch!)
* added Mann-Whitney test and a Monte-Carlo version
* exploratory and confirmatory period estimation techniques (suitable for
symbolic and continuous data)
* Information theoretic measures (AIC and BIC) added
* drawing of trees with collapsed nodes
* progress display indicator support for terminal and GUI apps
* added parser for illumina HiSeq2000 and GAiix sequence files as
cogent.parse.illumina_sequence.MinimalIlluminaSequenceParser.
* added parser to FASTQ files, one of the output options for illumina's
workflow, also added cookbook demo.
* added functionality for parsing of SFF files without the Roche tools in
cogent.parse.binary_sff
Changes
-------
* thousand fold performance improvement to nmds
* >10-fold performance improvements to some Table operations
Bug Fixes
---------
* Fixed a Bug in cogent.core.alphabet that resulted in 4 tests err'ing out
when using NumPy v1.4.1
* Sourceforge bugs 2987289, 2987277, 2987378, 2987272, 2987269 were addressed
and fixed
Cogent 1.4 - 1.4.1
==================
New Features
------------
* Simplified getting genetic variation from Ensembl and provide the protein
location of nonsynonymous variants.
* rate heterogeneity variants of pre-canned continuous time substitution
models easier to define.
* Added implementation of generalised neighbour joining.
* New capabilities for examining genetic variants using Ensembl.
* Phylogenetic methods that can return collections of trees do so as a
TreeCollections object, which has writeToFile and getConsensusTree methods.
* Added uclust application controller which currently supports uclust
v1.1.579.
Changes
-------
* Major additions to Cookbook documentation courtesy of Tom Elliot. Thanks
Tom!
* Improvements to parallelisation.
Bug Fixes
---------
Cogent 1.3 - 1.4
================
New Features
------------
* added support for manipulating and handling macromolecular structures.
This includes a PDB file format parser, a hierarchical data structure
to represent molecules. Various utilities to manipulate e.g. clean-up
molecules, efficient surface area and proximity-contact calculation via
cython. Expansion into unit-cells and crystal lattices is also possible.
* added a KD-Tree class for fast nearest neighbor look-ups.
Supports k-nearest neighbours and all neighbours within radius queries
currently only in 3D.
* added new tools for evaluating clustering stresses, goodness_of_fit.
In cogent.cluster .
* added a new clustering tool, procrustes analysis. In cogent.cluster .
* phylo.distance.EstimateDistances class has a new argument, modify_lf. This
allows the use the modify the likelihood function, possibly constraining
parameters, freeing them, setting bounds, pre-optimising etc..
* cogent Table mods; added transposed and normalized methods while the summed
method can now return column or row sums.
* Added a new context dependent model class. The conditional nucleotide
frequency weighted class has been demonstrated to be superior to the model
forms of Goldman and Yang (1994) and Muse and Gaut (1994). The publication
supporting this claim is In press at Mol Biol Evol, authored by Yap,
Lindsay, Easteal and Huttley.
* added new argument to LoadTable to facilitate speedier loading of large
files. The static_column_types argument auto-generates a separator format
parser with column conversion for numeric columns.
* added BLAST XML parser + tests.
* Add compatibility matrix module for determining reticulate evolution.
* Added 'start' and 'order' options to the WLS and ML tree finding method
.trex() These allow the search of tree-space to be constrained to start at a
particular subtree and to proceed in a specified order of leaf additions.
* Consensus tree of weighted trees from phylo.maximum_likelihood.
* Add Alignment.withGapsFrom() aka mirrorGaps, mirrors the gaps into a
provided alignment.
* added ANOVA to maths.stats.test
* LoadTable gets an optional argument (static_column_types) to simplify speedy
loading of big files.
Changes
-------
* Python 2.4 is no longer supported.
* NumPy 1.3 is now the minimum supported NumPy version.
* zlib is now a dependency.
* cogent.format.table.asReportlabTable is being discontinued in version 1.5.
This is the last dependency on reportlab and removal will simplify
installation.
* the conditional nucleotide model (Yap et al 2009) will be made the default
model form for context dependent models in version 1.5.
* Change required MPI library from PyxMPI to mpi4py.
* Move all of the cogent.draw.matplotlib.* modules up to cogent.draw.*
* Substitute matplotlib for reportlab throughout cogent.draw
* cogent.db.ensembl code updated to work with the latest Ensembl release (56)
* motif prob pseudocount option, used for initial values of optimisable mprobs
* The mlagan application controller has been removed.
Bug Fixes
---------
* Fix and test for two bugs in multiple alignment, One in the pairwise.py
Hirschberg code and the other in indel_positions.py where gaps at the end
were effectively taken to be deletions and never insertions, unlike gaps at
the start.
* fix #2811993 alignment getMotifProbs. allow_gap argument now has an effect.
Cogent 1.2 - 1.3
================
New Features
------------
* Python2.6 is now supported
* added cogent.cluster.nmds, code to perform nonmetric multidimensional
scaling. Not as fast as others (e.g.: R, MASS package, isoMDS)
* Documentation ported to using the Sphinx documentation generator.
* Major additions to documentation in doc/examples.
* Added partial support for querying the Ensembl MySQL databases. This
capacity has additional dependencies (MySQL-python and SQLAlchemy). This
module should be considered alpha level code, (although it has worked
reliably for some time in the hands of the developers).
* Introduced a new substitution model family. This family has the same form as
that originally described by Muse and Gaut (Mol Biol Evol, 11, 715-24).
These models were applied in the article by Lindsay et al. (2008, Biol
Direct, 3, 52). Model state defaults to the tuple weighted matrix (eg. the
Goldman and Yang codon models). Selecting the nucleotide weighted matrix is
done using the mprob_model argument.
* Likelihood functions now have a getStatistics method. This returns cogent
Table objects. Optional arguments are with_motif_probs and with_titles where
the latter refers to the Table.Title attribute being set.
* Added rna_struct formatter and rna_plot parser
* A fast unifrac method implementation.
* Added new methods on tree related objects: TreeNode.getNodesDict;
TreeNode.reassignNames; PhyloNode.tipsWithinDistance;
PhyloNode.totalDescendingBranchLength
* Adopted Sphinx documentation system, added many new use cases, improved
existing ones.
* added setTablesFormat to likelihood function. This allows setting the
spacing, display precision of the stats tables resulting from printing a
likelihood function.
* Added non-parametric multidimensional scaling (nmds) method.
* Added a seperate app controller for FastTree v1.0
* new protein MolType, PROTEIN_WITH_STOP, that supports the stop codon new
sequence objects, ProteinWithStopSequence and ModelProteinWithStopSequence
to support the new MolType.
* Support for Cython added.
Changes
-------
* reconstructAncestralSequences has been deprecated to
reconstructAncestralSeqs. It will be removed in version 1.4.
* updated parsers
* TreeNode.getNewick is now iterative. For recursive, use
TreeNode.getNewickRecursive. Both the iterative and recursive methods of
getNewick now support the keyword 'escape_name'. DndParser now supports the
keyword 'unescape_name'. DndParser unescape_name will now try to remove
underscores (like underscore_unmunge).
* Generalized MinimalRnaalifoldParser to parse structures and energies from
RNAfold as well.
* PhyloNode.tipToTipDistances can now work on a subset of tips by passing
either a list of tip names or a list of tip nodes using the endpoints param.
* deprecating reconstructAncestralSequences to deprecating
reconstructAncestralSeqs.
* updated app controller parameters for FastTree v1.1
* Allow and require a more recent version of Pyrex.
* LoadTree is now a method of cogent.__init__
.. warning:: Pyrex is no longer the accepted way to develop extensions. Use
`Cython <http://www.cython.org/>`_ instead.
Bug Fixes
---------
* the alignment sample methods and xsample (randint had the wrong max
argument)
* Fixes the tests that no longer work with NCBI's API changes, and sticks a
big warning for the unwary in the ncbi module pointing users to the
"official" list of reported rettypes. Note that the rettypes changed
recently and NCBI says they do not plan to support the old behavior.
* The TreeNode operators cmp, contains, and any operator that relies on those
methods will now only perform comparisons against the objects id. Prior
behavior first checked the TreeNode's Name attribute and then object id if
the Name was not present. This resulted in ambiguous behavior in certain
situations.
* Added type conversion to Mantel so it works on lists.
* kendall tau fix, bug 2794469
* Table now raises a RuntimeError if provided malformed data.
* Fixed silent type conversion in TreeNode, bug 2804431
* RangeNode is now properly passing kwargs to baseclass, bug 2804441
* DndParser was not producing correct trees in niche cases, bug 2798580
Cogent 1.1 - 1.2
================
New Features
------------
* Code for performing molecular coevolution/covariation analyses on multiple
sequence alignments, plus support files. (Described in J. Caporaso et al.
BMC Evol Biol, 8(1):327, 2008.)
* App controller for `CD-HIT <http://www.bioinformatics.org/cd-hit/>`_
* A ParameterEnumerator object is now available in cogent.app.util. This
method will iterate over a range of parameters, returning parameter dicts
that can be used with the relevant app controller.
* Sequence and alignment objects that inherit from Annotatable can now mask
regions of sequence, returning new objects where the observed sequence
characters in the regions spanned by the annotations are replaced by a mask
character (mask_char).
* Table.count method. Counts the number of rows satisfying some condition.
* Format writer for stockholm and clustal formats.
* App controllers for dotur, infernal, RNAplot, RNAalifold. Parsers for
infernal and dotur.
* Empirical nucleotide substitution matrix estimation code (Described in
M. Oscamou et al. BMC Bioinformatics, 9(1):511, 2008)
New Documentation
-----------------
Usage examples (see doc/) were added for the following:
* Querying NCBI
* The motif module.
* UPGMA clustering
* For using the ParameterCombinations object and generating command lines
* Coevolution modelling
* Sequence annotation handling
* Table manipulation
* Principal components analysis (PCoA)
* Genetic code objects
* How to construct profiles, consensus seqs etc ..
Changes
-------
* PyCogent no longer relies on the Python math module. All math functions are
now imported from numpy. The main motivator was to remove casting between
numpy and Python types. Such as, a 'numpy.float64' variable unknowingly
being converted to a Python 'float' type.
* Tables.getDistinctValues now handles multiple columns.
* Table.Header is now an immutable property of Tables. Use the withNewHeader
method modifying Header labels.
* The TreeNode comparison methods now only check against the objects ID
Bug Fixes
---------
* LoadTable was ignoring title argument for standard file read.
* Fixed bug in Table.joined. When a join produces no result, now returns a
Table with 0 rows.
* Improved consistency of LoadTable with previous behaviour of cogent.Table
* Added methods to detect large sequences/alphabets and handle counts from
sequence triples correctly.
* goldman_q_dna_pair() and goldman_q_rna_pair() now average the frequency
matrix used.
* reverse complement of annotations with disjoint spans correctly preserve
order.
* Fixed ambiguity in TreeNode comparison methods which resulted in the prune
method incorrectly removing entire subtrees.
Cogent 1.0.1 - 1.1
==================
New Features
------------
* Added functionality to cogent.util.unit_test.TestCase
assertSameObj - use in place of 'assert a is b'
assertNotSameObj - use in place of 'assert a is not b'
assertIsPermutation - checks if observed is a permutation of items
assertIsProb - checks whether a value(s) are probabilities
assertIsBetween - use in place of 'assert a < obs < b'
assertLessThan - use in place of 'assert obs < value'
assertGreaterThan - use in place of 'assert obs > value'
assertSimiliarFreqs - compares frequency distributions using a G-test
assertSimiliarMeans - compares samples using a t-test
_set_suite_pvalue - set a suite wide pvalue
.. note:: both the similiarity assertions can have a pvalue specified in the
testing module. This pvalue can be overwritten during alltests.py by
calling TestCase._set_suite_pvalue(pvalue)
.. note:: All of these new assert methods can take lists as well. For instance:
obs = [1,2,3,4]
value = 5
self.assertLessThan(obs, value)
* Alignment constructor now checks for iterators (e.g. results from
parsers) and lists() them -- this allows direct construction like
Alignment(MinimalFastaParser(open(myfile.fasta))). Applies to both
dense and sparse alignments, and SequenceCollections.
* Parameterized LoadTree underscore stripping in node names, and turned it off
by default.
* new Table features and refactor
Trivial edits of the code provided by Felix Schill for SQL-like table
joining. Principally a unification of the different types of table
joins (inner- and outer-join) between 2 tables, and porting of all
testing code into test_table.rest. The method Table.joined provides
the interface (see tests/test_table.rest for usage).
* added a Table.count method
simply counts the number of rows satisfying some condition. Method
has the same args as for Table.filtered.
* Functions for obtaining the rate matrix for 2 or 3 sequences using the
Goldman method. Support for RNA and DNA.
* Additions to clustalw and muscle app controllers
muscle.py
add_seqs_to_alignment
align_two_alignments
align_unaligned_seqs
align_and_build_tree
build_tree_from_alignment
clustalw.py
align_unaligned_seqs
bootstrap_tree_from_alignment
build_tree_from_alignment
align_and_build_tree
* App controllers for Clearcut, ClustalW, Mafft
* Added midpoint rooting
* Accept FloatingPointError as well as ZeroDivisionError to accommodate numpy.
* Trees can now compare themselves to other trees using a couple of methods.
subsets: compare based on fraction of subsets of labels defined by clades that
are the same in the two trees.
tip_to_tip: compare based on correlations of tip_to_tip distances.
Both of these are fairly badly behaved statistically, so should always be
compared to a distribution of values from random (e.g. label-permuted) trees
using Monte Carlo.
* Added ability to exclude non-shared taxa from subsets tree cmp method.
* Added Zongzhi's combination and permutation implementations to transform.py.
* Added some docs to UPGMA_cluster.
* Added median in cogent.maths.stats.test
Added because the numpy version does not support an axis
parameter. This function now works like numpy functions
(sum, mean, etc...) where you can specify axis. This function should
be safe in place of numpy.median.
Changes
-------
* Many changes to the core objects, mainly for compatibility. Major changes in
this update:
- ModelSequence now inherits from SequenceI and supports the various Sequence
methods (e.g. nucleic acids can reverse-complement, etc.). Type checking is
still performed using strings (e.g. for ambiguous characters, etc.) and
could be improved, but everything seems to work. Bug # 1851959.
- ModelProteinSequence added. Bug # 1851961.
- DenseAlignment and ModelSequence can now handle the '?' character, which is
added to the Alphabet during install. Bug # 1851483.
- Fixed a severe bug in moltype constructors that mutated the dict of ambiguous
states after construction of each of the standard moltypes (for example,
preventing re-instantiation of a similar moltype after the initial install:
bug # 1851482. This would have been very confusing for anyone trying to
experiment with custom MolTypes.
- DenseAlignment now implements many methods of Alignment (some of which have
actually been moved into SequenceCollection), e.g. getGappedSeq() as per
bug # 1816573.
* Added parameter to MageListFromString and MageGroupFromString. Can now handle
'on' as well as 'off'.
* SequenceCollection, Alignment, etc. now check for duplicate seq labels and
raise exception or strip duplicates as desired. Added unit test to cover this
case.
- SequenceCollection now also produces FASTA as default __str__ behavior
like the other objects.
- DenseAlignment now iterates over its mapped items, not the indices for
those items, by default. This allows API compatibility with Alignment
but is slow: it may be worth optimizing this for cases such as detecting
ambiguous chars, as I have already implemented for gaps.
* Updated std in cogent.maths.stats.test
- std now takes an axis parameter like numpy functions (sum, mean,etc...).
- also added in a docstring and tests.
.. note::
cogent.maths.stats.test import sqrt from numpy instead of math
in order to allow std to work on arrays.
* Tree now uses iterative implementation of traverse().
.. warning::
If you **do** modify the tree while using traverse(), you will get
undesired results. If you need to modify the tree, use
traverse_recursive() instead. This only applies to the tree topology
(e.g. if you are adding or deleting nodes, or moving nodes around;
doesn't apply if you are changing branch lengths, etc.). The only two
uses I found in Cogent where the tree is modified during iteration
are in rna2d (some of the structure tree operations) and the prune()
method. I have changed both to use traverse_recursive for now.
However, there might be issues with other code. It might be worth
figuring out how to make the iterative method do the right thing
when the tree is modified -- suggestions are welcome provided they do
not impose substantial performance penalties.
* Made compatible with Python 2.4
* Changed dev status in setup call
* Dropping comments indicating windows support
Bug Fixes
---------
* Fixed bug 1850981: UPGMA does not check diagonals.
This bug was caused because the UPGMA algorithm picks the smallest
distances between nodes at each step but should not ever pick something on
the diagonal. To prevent a diagonal choice we set it to a large number,
but sometimes, for very large matrices, the diagonal sometimes is chosen
becuase the number decreases in value as the distances are averaged during
node collapse. To prevent this error, the program now checks to make sure
that the selected smallest_index is not on the diagonal. If it is, it
reassigns the diagonal to the large number.
* Fixed bug in gff.py attribute string parsing.
If attribute string did not contain double quotes, find() returned -1, so
the last character of the string was inadvertently omitted.
* Fixed error in taxonomy convenience functions that failed to pass in the specified database.
This used to be masked by ncbi's automatic conversion
between protein and nucleotide ids, but apparently this conversion no
longer operates in the tested cases.
* fixed unittest methods
Zongzhi noticed that assertFloatEqual would compare two against when a
shape (4,0) array was compared against a shape (4,4) array.
I added tests for assertFloatEqual, assertFloatEqualAbs,
assertFloatEqualRel and assertEqual. The same bug was noticed in
assertFloatEqualRel. They are now fixed. These fixes resulted in errors
in maths.stats.test.std and correlation_matrix. The std function needed
a work over, but the correlation_matrix was a fault in the test case
itself.
* fixed bug in reading tab-delimited tables
failure when a record had a missing observation in the last field
has been fixed. Line stripping of only line-feed characters is now
done.
* Fixed important bug in metric_scaling.
numpy eig() produces eigenvector array that is the transpose of Numeric eig().
Therefore, any code that does not take this into account will produce results
that are TOTALLY INCORRECT when fed to downstream analyses.
Coordinates from this module prior to this patch are incorrect and are not
to be trusted.
* Fixed a typo in dialign test
* tree __repr__ now more robust to non-str Name entries
* seqsim.rangenode traverse now compatible /w base class.
* Fixed line color bug in PR2 bias plots.
* Added method to dump raw coords from dendrogram.
* Fixed called to eigenvector when no pyrex
* Fixed bug in nonrecursive postorder traversal if not root
Cogent 1.0 - (9/8/2007)
=======================
* Initial release