forked from pghysels/STRUMPACK
-
Notifications
You must be signed in to change notification settings - Fork 0
/
ChangeLog.txt
181 lines (157 loc) · 7.23 KB
/
ChangeLog.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
Version 2.3.0 (in progress) -> 3.0.0 ??
=======================================
- The scalability of the sparse HSS preconditioner has been
drastically improved. Extraction of elements (for the diagonal
blocks and the B_ij generators) is now done for multiple blocks
concurrently.
- Add option to use Combinatorial BLAS approximate weight perfect
matching instead of MC64. Only works in parallel, requires a square
number of processes. This currently requires a special version of
CombBLAS.
- Fix integer overflow in the counting of nonzeros in the factors.
- Allow compilation without MPI (and ScaLAPACK, ParMetis..)
- Add experimental support for Block Low-Rank compression, for both
sparse (preconditioning) and dense. For now, this is only shared
memory.
- Improvements in the build system: In the CMake script, check
whether OpenMP task dependencies and OpenMP taskloops are
supported.
- Removed the CSCMatrix class for compressed sparse column storage,
this was not tested/used.
- Revamped website (thanks Lucy), and doxygen documentation/manual
- Added lots of doxygen documentation/comments
- Several bugfixes!
Version 2.2.0 Released March 32, 2017
=====================================
- Changes in the build system
- ParMETIS, (PT)Scotch are now optional
- Work to make STRUMPACK compatible with the xSDK policies
- Moved header files in subfolders
- Made interface const correct
- Support for multiple right-hand sides
- Improved threading in HSS code, improved performance for the
hybrid MPI+OpenMP code: many unnecessary matrix copies are
now avoided
- Flops are now counted/reported correctly when running with
MPI and or OpenMP.
- geometric nested dissection now supports wider stencils and
multiple degrees of freedom per node (TODO add in MPI code)
- Many performance and several bug fixes
Version 2.1.0 Released October 26, 2017
=======================================
- Cleanup in the C interface.
- Removed (set_)HSS_min_front_size and --sp_hss_min_front_size from
the manual, it is not supported (yet)
- In some of the examples, the PBLAS blocksize was set to 3 (for
debugging). This has been removes, and the default blocksize of 32
is now used, leading to much better performance for those examples.
- Small bugfix in the adaptive HSS compression stopping criterion.
- Disable replacement of tiny pivots by default. It can lead to
convergence problems for large matrices.
Version 2.0.1 Released October 7, 2017
======================================
- Critical bug fix in HSS Schur complement update.
- Some minor performance improvements.
- Valgrind fixes, compiler warning fixes (when not compiling with
OpenMP).
Version 2.0.0 Released October 1, 2017
======================================
This is a major revision. From now on STRUMPACK is released as a
single library, including both the sparse and the dense
components. Unfortunately, the dense code is not documented yet.
Additionally, the development version of the code is now available
from the public github repository
https://github.com/pghysels/STRUMPACK
The main changes since release 1.1.1 are:
- The options for StrumpackSparseSolver are now set through an object
of type SPOptions, stored in the StrumpackSparseSolver class.
- The template parameter for the real type has been removed. It is
now derived from the scalar type.
- The HSS code has been completely rewritten, performance is much
improved, memory leaks and valgrind errors have been fixed.
- A new, more robust adaptive HSS compression algorithm has been
implemented. This was developed in collaboration with Theo Mary
from Université Toulouse.
- We have automated testing of the code.
- Many bugs have been fixed. However, some bugs probably still
remain. If you happen to encounter any problems while running
STRUMPACK, please do not hesitate to contact us.
Version 1.1.1 Released July 16, 2017
====================================
- Add function STRUMPACK_set_from_options_no_warning_unrecognized(..)
suppress warning
Version 1.1: Released November 8, 2016
======================================
- Rewrite reordering code, can now use sequential METIS and SCOTCH
from distributed interface.
- Change default minimum HSS front size to 2500 (as used in ipdps17 paper).
- Performance improvements in HSS code, mainly HSS compression.
- Add RCM ordering.
- Adaptive HSS compression in MPI code. This changes the default
HSS preconditioning strategy to ADAPTIVE.
- Add option to choose between METIS_NodeND and METIS_NodeNDP.
- Add a program to read matrix market file and print out binary file.
- Several other bug fixes.
Version 1.0.4: Released August 4, 2016
======================================
- Moved examples to the exmples/ folder, deleted test folder.
- Add pde900.mtx test matrix market file.
- Add README to examples.
- Fixes for memory leaks!
- Fix bug in separator reordering and another fix in distributed
separator reordering.
- Print message if LU fails, tell user to try to enable MC64 if not
already enabled.
- Include missing stdlib.h.
- Small performance enhancements in extend-add.
- Small improvement in proportional mapping.
- Various performance improvements throughout the code:
Several HSS algorithms were done serially, first one child, then the other.
- Added some timers for profiling, enable with -DUSE_TASK_TIMER.
- Performance improvements in front_multiply_2d (for random sampling
of front).
- Add a faq in the manual.
- Describe all command line options in the manual.
- Avoid recursion in the e-tree.
- Change the default relative compression tolerance.
Version 1.0.3: Released June 6, 2016
====================================
- Fix for building a shared library (thanks to Barry Smith)
- Fix for complex numbers (thanks to Barry Smith)
- Add an example folder with an example simple Makefile
Version 1.0.2: Released June 2, 2016
====================================
- Explain how to tune the preconditioner in the manual.
- Suppress output from DenseMPI.
- Print warnings/errors to cerr iso cout.
- Allow compilation without OpenMP.
- Fix some compiler warnings.
- Fizes for (Apple)Clang.
- Improve CMake detection of ScaLAPACK.
Version 1.0.1: Released May 23, 2016
====================================
- Small fix for GCC 6.1
- Update TODO
- FIX: set min_rand_HSS equal to number of columns in R.
- Code to print some stats about front size and nr random vectors and ranks.
- Change to the default rank pattern strategy.
- Check fgets return code.
- Update build script, check for ScaLAPACK.
- Print clear warning if BLAS/LAPACK not found.
- Update STRUMPACK README.
- Print correct Metis error code.
- Micro optimizations.
- Fix MPI communicator bug.
- Fix memory leaks.
- Fix very slow separator reordering for MPIDist interface.
- Add description of block-distributed CSR to the manual.
- Use CSR iso triplets in prop dist sparse matrix.
- Missing return statements.
- Do not use __gnu_parallel.
- Fix bug in Redistribute (integer_t -> float).
- Add 64 bit MPIDist test/example.
- Add 64 integer test.
- Check PETSc input for NULL.
Version 1.0.0: Released May 4, 2016
===================================
- Initial release of the STRUMPACK-sparse code with support for MPI+OpenMP