2.0.0.beta3.20241014
Pre-release
Discrete Colorscale: We have implemented an RGB 3-color bar with
programmable cutoffs, similar to the original SDT. The cutoff values set the
percent identity boundaries at which the different color scales start and end.
This option is selectable at the very bottom of the "Colorscale" dropdown in
the heatmap plot.
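As a rough illustration of how programmable cutoffs can split percent identity into three color bands, here is a minimal sketch. It is not SDT2's implementation: the function name `discrete_color`, the default cutoffs (75 and 90), the hex colors, and the choice to place a value equal to a cutoff in the higher band are all assumptions made for the example.

```python
from bisect import bisect_right

def discrete_color(identity, cutoffs=(75.0, 90.0),
                   colors=("#0000FF", "#00FF00", "#FF0000")):
    """Return the color of the band that a percent identity falls into.

    cutoffs and colors are hypothetical defaults; two cutoffs define
    three bands (below the first, between the two, at or above the second).
    """
    if len(colors) != len(cutoffs) + 1:
        raise ValueError("need exactly one more color than cutoffs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be in ascending order")
    # bisect_right counts how many cutoffs are <= identity,
    # which is the index of the band the value belongs to.
    return colors[bisect_right(cutoffs, identity)]

print(discrete_color(60.0))  # lowest band
print(discrete_color(75.0))  # a value on a cutoff goes to the higher band
print(discrete_color(95.0))  # highest band
```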
Error Reporting: Users should now be prompted by a dialog box that
displays the error associated with a crash, along with options to email it or
post it directly to the SDT2 GitHub repository. Please note that some antivirus
software may block the email option unless an exception is added.
Multiprocessing: SDT2 is capable of utilizing multi-core computing to
dramatically speed up computation times. Each new core effectively doubles
the memory requirements of an analysis. Macs and typical Windows systems
default to automatic management of the page file ("swap" on non-Windows
systems), which allows disk space to be used in place of memory should a
user exceed system memory. However, this should be approached with
caution. If configured incorrectly, the application, and possibly the system,
could crash. For systems with an inadequate or missing page file, we've
added error recovery during alignment that allows the user to restart
alignment with different settings. We have replaced the recommended cores
dropdown with a recommendation displayed directly under the slider.
The estimates should conservatively remain below available system memory
as we have implemented fairly accurate system memory calculations, which
account for available system memory and an estimate of page file use. It is
possible, with a high enough disk allocation, to align fairly large sequences
(>160k), but sequences of this length are almost always better run on a
single core, regardless of memory estimates.
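To make the idea of a conservative core recommendation concrete, the sketch below picks the largest worker count whose estimated memory fits within a budget. It is only an illustration under stated assumptions, not SDT2's actual calculation: the function `recommend_cores`, the `safety` margin, and the simplification that every worker needs about as much memory as a single-core run are all hypothetical.

```python
def recommend_cores(per_worker_bytes, available_bytes,
                    pagefile_bytes=0, max_cores=1, safety=0.8):
    """Suggest a worker count that keeps estimated memory under budget.

    Assumption: each worker needs roughly per_worker_bytes, so total
    use grows linearly with the number of workers. The safety factor
    (hypothetical) keeps the estimate below the memory actually free.
    """
    if per_worker_bytes <= 0:
        raise ValueError("per_worker_bytes must be positive")
    budget = (available_bytes + pagefile_bytes) * safety
    fit = int(budget // per_worker_bytes)
    # Always allow one core; never exceed the machine's core count.
    return max(1, min(max_cores, fit))

GIB = 2 ** 30
print(recommend_cores(2 * GIB, 8 * GIB, max_cores=8))   # 3
print(recommend_cores(16 * GIB, 8 * GIB, max_cores=8))  # 1
```

Falling back to one core when nothing fits mirrors the note above that very long sequences are usually better run on a single core.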
Graph Clustering: The export option offers a rudimentary graph clustering
output, and we are still debating its inclusion in the final build. It converts
the similarity output matrix to binary and then clusters by connected
components. Threshold 1 sets the initial clustering: all sequences that
share that level of identity are connected and treated as a distinct
cluster. Threshold 2 applies the same procedure within each established
cluster. This was originally intended to provide genus- and species-level
clustering driven by programmable input.
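The two-threshold procedure described above can be sketched as follows. This is a simplified illustration rather than the code SDT2 ships: the function names `cluster` and `two_level_clusters` are hypothetical, the matrix is assumed to be a square, symmetric list of percent identities, and treating a pair as connected when its identity is greater than or equal to the threshold is an assumption.

```python
def cluster(matrix, threshold, members=None):
    """Connected components over pairs with identity >= threshold.

    Comparing against the threshold is the "convert to binary" step;
    the depth-first search then collects each connected component.
    """
    if members is None:
        members = list(range(len(matrix)))
    unvisited = set(members)
    clusters = []
    for start in members:
        if start not in unvisited:
            continue
        unvisited.discard(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            neighbours = [j for j in unvisited if matrix[i][j] >= threshold]
            for j in neighbours:
                unvisited.discard(j)
            stack.extend(neighbours)
        clusters.append(sorted(component))
    return clusters

def two_level_clusters(matrix, threshold_1, threshold_2):
    """Cluster at threshold_1, then re-cluster each group at threshold_2."""
    return [cluster(matrix, threshold_2, group)
            for group in cluster(matrix, threshold_1)]

# Hypothetical identities for four sequences (indices 0 to 3).
identities = [
    [100, 95, 80, 50],
    [95, 100, 82, 50],
    [80, 82, 100, 50],
    [50, 50, 50, 100],
]
print(two_level_clusters(identities, 75, 90))
# [[[0, 1], [2]], [[3]]]
```

In this toy input, sequences 0, 1, and 2 form one cluster at the first threshold, and only 0 and 1 stay together at the stricter second threshold.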