forked from bkainz/fetalReconstruction
__Fast Volume Reconstruction from Motion Corrupted 2D Slices__
======================
This software was used to produce the results shown in
Bernhard Kainz, Markus Steinberger, Wolfgang Wein, Maria Kuklisova-Murgasova,
Christina Malamateniou, Kevin Keraudren, Thomas Torsney-Weir, Mary Rutherford,
Paul Aljabar, Joseph V. Hajnal, and Daniel Rueckert: Fast Volume Reconstruction
from Motion Corrupted Stacks of 2D Slices. IEEE Transactions on Medical Imaging,
in print, 2015. doi:10.1109/TMI.2015.2415453
Announcements
======================
- The --useCPU flag currently has a bug and may have caused some confusion. The CPU-only mode does not provide
the full functionality of the GPU version. For example, PSF sampling is different, and the result of the
CPU version will be worse. I am working on it.
- Hacked a fix for the CPU version. Please report any further bugs you encounter with it.
- There seems to be an issue with CUDA 7.0. Please use CUDA 6.5 until I have found the problem.
- The code requires refactoring to cleanly separate data from data access and CPU and GPU parts.
Abstract
======================
Moving objects cause motion artefacts when their enclosing volume is acquired as a stack of image slices.
In this paper we present a fast multi-GPU accelerated implementation of slice-to-volume registration based super-resolution reconstruction with automatic outlier rejection and intensity bias correction.
We introduce a novel fully automatic selection procedure for the image stack with the least motion, which will serve as an initial registration template. We fully evaluate our method and its high dimensional parameter space. Testing is done using artificially motion corrupted phantom datasets and using real world scenarios for the reconstruction of foetal organs from in-utero prenatal Magnetic Resonance Imaging and for the motion compensation of freehand compound Ultrasound data.
On average we achieve a speed-up of more than 40x compared to a single CPU system, and another 1.70x for each additional GPU, while maintaining the same image quality as if calculated on a CPU. Our framework is qualitatively more accurate and on average $10\times$ faster than currently available state-of-the-art multi-core methods.
The source code for this approach is open source and publicly available for download.
INSTALL
======================
The source code was successfully compiled on x86_64 with CUDA 6.5, VS2012 (std-c++11), VS2013 (boost), and gcc 4.8. Test system: Intel Xeon E5-2630 v2 @ 2.60GHz with 16 GB RAM, an Nvidia Tesla K40 with 12 GB RAM, and a GeForce 780 graphics card with 6 GB RAM. The code was also tested on an Nvidia Titan GPU.
#### Necessary Hardware requirements
* x86_64 CPU
#### Optional Hardware requirements
* Nvidia GPU compute capability >= 3.0
Without a GPU, only CPU reconstruction is available. Use the --useCPU flag [CUDA is still necessary to compile!]
HOWEVER: The CPU version only supports a simplified PSF evaluation (only a linearly low-sampled Gaussian PSF)!
ONLY THE GPU ACCELERATED VERSION PROVIDES ALL FEATURES DESCRIBED IN THE PAPER!
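
To decide between the GPU path and --useCPU, it can help to check the compute capability of the installed GPUs first. The following is a minimal sketch, not part of this repository: the helper name `cc_supported` is hypothetical, and it assumes an `nvidia-smi` recent enough to support the `compute_cap` query field (older drivers may not).

```shell
# Hypothetical helper: succeeds when the given compute capability is >= 3.0.
cc_supported() {
  awk -v cc="$1" 'BEGIN { exit !(cc + 0 >= 3.0) }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Assumption: this driver supports the compute_cap query field.
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
  while IFS=, read -r name cc; do
    if cc_supported "$cc"; then
      echo "$name:$cc (GPU reconstruction should be possible)"
    else
      echo "$name:$cc (below 3.0, use --useCPU)"
    fi
  done
else
  echo "nvidia-smi not found; only --useCPU reconstruction applies"
fi
```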
#### Necessary third party libraries
* Nvidia CUDA >= 6.0 https://developer.nvidia.com/cuda-downloads
* Boost (compiled libraries, we used 1.55.0) http://www.boost.org/users/download/
* Intel's TBB https://www.threadingbuildingblocks.org/
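
Since the announcements above recommend CUDA 6.5 over 7.0, it may be worth confirming which toolkit release is on the PATH before building. A small sketch under assumptions: the helper name `cuda_release` is hypothetical, and it relies on `nvcc --version` printing a line of the form `release X.Y,`.

```shell
# Hypothetical helper: extract the "X.Y" release number from `nvcc --version` output on stdin.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
  echo "nvcc not found on PATH"
fi
```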
#### Optional third party libraries
* CULA http://www.culatools.com/
* IRTK http://www.doc.ic.ac.uk/~dr/software/ (This is already included in the source code repository as minimal build lib. We only use image handling and CPU registration functionality from IRTK)
BUILD INSTRUCTIONS
======================
* go to the source directory
* make a new directory "build"
* run cmake .. in this directory (if necessary, give paths to TBB and other third party libs using cmake-gui .. or ccmake ..)
* run make, or open the generated Visual Studio solution
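
On Linux, the steps above can be sketched as the following script. It assumes it is started from the source directory (the one containing CMakeLists.txt); the `SRC_DIR` and `BUILD_DIR` variables are illustrative names, not something the build system requires. The script skips the build when cmake or CMakeLists.txt is missing.

```shell
# Assumption: SRC_DIR is the source directory; defaults to the current directory.
SRC_DIR="${SRC_DIR:-.}"
BUILD_DIR="$SRC_DIR/build"

# Out-of-source build directory, as described in the steps above.
mkdir -p "$BUILD_DIR"

if [ -f "$SRC_DIR/CMakeLists.txt" ] && command -v cmake >/dev/null 2>&1; then
  # Configure and build in a subshell so the caller's working directory is unchanged.
  (cd "$BUILD_DIR" && cmake .. && make)
else
  echo "skipping build: no CMakeLists.txt in $SRC_DIR or cmake not installed" >&2
fi
```

If cmake cannot locate TBB or Boost, running ccmake .. or cmake-gui .. inside the build directory lets you set the paths interactively.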
Execute
======================
* get two or more overlapping stacks of image slices
* [optional]: generate a region of interest and save as .nii or .nii.gz
* run the program __reconstruction_GPU2__
* Example: ./reconstruction_GPU2 -o 3T_GPUtest.nii -i ../../data/14_3T_nody_001.nii.gz ../../data/10_3T_nody_001.nii.gz ../../data/23_3T_nody_001.nii.gz ../../data/21_3T_nody_001.nii.gz -m ../../data/mask_10_3T_brain_smooth.nii.gz
* Options:
-h [ --help ] Print usage messages
-o [ --output ] arg Name for the reconstructed volume. Nifti
or Analyze format.
-m [ --mask ] arg Binary mask to define the region of
interest. Nifti or Analyze format.
-i [ --input ] arg [stack_1] .. [stack_N] The input stacks.
Nifti or Analyze format.
-t [ --transformation ] arg The transformations of the input stack to
template in 'dof' format used in IRTK.
Only rough alignment with correct
orientation and some overlap is needed. Use
'id' for an identity transformation for at
least one stack. The first stack with 'id'
transformation will be resampled as
template.
--thickness arg [th_1] .. [th_N] Give slice
thickness.[Default: twice voxel size in z
direction]
-p [ --packages ] arg Give number of packages used during
acquisition for each stack. The stacks
will be split into packages during
registration iteration 1 and then into odd
and even slices within each package during
registration iteration 2. The method will
then continue with slice to volume
approach. [Default: slice to volume
registration only]
--iterations arg (=4) Number of registration-reconstruction
iterations.
--sigma arg (=12) Stdev for bias field. [Default: 12mm]
--resolution arg (=0.75) Isotropic resolution of the volume.
[Default: 0.75mm]
--multires arg (=3) Multiresolution smoothing with given number
of levels. [Default: 3]
--average arg (=700) Average intensity value for stacks
[Default: 700]
--delta arg (=150) Parameter to define what is an edge.
[Default: 150]
--lambda arg (=0.02) Smoothing parameter. [Default: 0.02]
--lastIterLambda arg (=0.01) Smoothing parameter for last iteration.
[Default: 0.01]
--smooth_mask arg (=4) Smooth the mask to reduce artefacts of
manual segmentation. [Default: 4mm]
--global_bias_correction arg (=0) Correct the bias in reconstructed image
against previous estimation.
--low_intensity_cutoff arg (=0.01) Lower intensity threshold for inclusion of
voxels in global bias correction.
--force_exclude arg force_exclude [number of slices] [ind1]
... [indN] Force exclusion of slices with
these indices.
--no_intensity_matching arg Switch off intensity matching.
--log_prefix arg Prefix for the log file.
--debug arg (=0) Debug mode - save intermediate results.
--debug_gpu Debug only GPU results.
--rec_iterations_first arg (=4) Set number of superresolution iterations
--rec_iterations_last arg (=13) Set number of superresolution iterations
for the last iteration
--num_stacks_tuner arg (=0) Set number of input stacks that are
really used (for tuner evaluation, use
only first x)
--no_log arg (=0) Do not redirect cout and cerr to log
files.
-d [ --devices ] arg Select the CP > 3.0 GPUs on which the
reconstruction should be executed.
Default: all devices > CP 3.0
--tfolder arg [folder] Use existing slice-to-volume
transformations to initialize the
reconstruction.
--sfolder arg [folder] Use existing registered slices
and replace loaded ones (have to be
equally many as loaded from stacks).
--referenceVolume arg Name for an optional reference volume.
Will be used as initial reconstruction.
--T1PackageSize arg is a test if you can register T1 to T2
using NMI and only one iteration
--useCPU use CPU for reconstruction and
registration; performs superresolution and
robust statistics on CPU. Default is using
the GPU
--useCPUReg use CPU for more flexible CPU
registration; performs superresolution and
robust statistics on GPU. [default, best
result]
--useGPUReg use faster but less accurate and flexible
GPU registration; performs superresolution
and robust statistics on GPU.
--useAutoTemplate select 3D registration template stack
automatically with matrix rank method.
--disableBiasCorrection disable bias field correction for cases
with little or no bias field
inhomogeneities (makes it faster but less
reliable for strong intensity bias)
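
As an illustration of how the options above combine, here is a hedged sketch of a typical invocation. All file names (`stack1.nii.gz`, `mask.nii.gz`, `reconstructed.nii`) are placeholders, not files shipped with the repository. The values for --resolution and --iterations repeat the documented defaults and are spelled out only to show the syntax. The command is printed first and only executed if the binary exists in the current directory.

```shell
# Placeholder file names; substitute your own stacks and mask.
OUT=reconstructed.nii
MASK=mask.nii.gz

# Assemble the command line from options documented in the list above.
set -- ./reconstruction_GPU2 \
  -o "$OUT" \
  -i stack1.nii.gz stack2.nii.gz stack3.nii.gz \
  -m "$MASK" \
  --resolution 0.75 \
  --iterations 4 \
  --useAutoTemplate

CMD="$*"
echo "$CMD"

# Run only when the binary has been built and is present here.
if [ -x ./reconstruction_GPU2 ]; then
  "$@"
fi
```

On a machine without a suitable GPU, appending --useCPU to the same argument list selects the CPU path, with the PSF limitations noted in the INSTALL section.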