Data reduction
The data reduction step can be divided into the following processes:
- Mapping detector pixels
- Background estimation and peak finding
- Lattice parameter estimation
- Finding probable orientations
These processes generate the files needed for the EMC reconstruction. For the test dataset, this step can be skipped by using the reduced data we provide, as described in Skip data reduction.
We start by updating the experimental parameters in the [make-detector] section of config.ini:
[make-detector]
# pixel
num_row = 2527
num_col = 2463
cx = 1285.5
cy = 1262.0
Rstop = 115.0
# meter
detd = 0.45
px = 172e-6
# angstrom
wl = 1.03324
res_max = 1.95
# beam incidence direction
sx = 0.005
sy = -0.01
sz = -1
The parameters num_row and num_col specify the detector size in pixels.
The pixels are labeled by coordinates (x,y), with x = 0, 1,..., num_row−1 and y = 0, 1,..., num_col−1.
With this choice of coordinates, the upper-left detector pixel has coordinates (0,0), and the X-ray beam is incident in the -z direction.
We assume the X-ray polarization is in the y direction, because the main application of our program is the analysis of SMX data taken at storage ring synchrotron sources.
The parameters (cx, cy) label the beam incidence point on the detector, and Rstop is the beamstop radius in pixels.
The other parameters are detd, the sample-to-detector distance in meters; px, the edge length of the square detector pixels in meters; wl, the incident X-ray wavelength in Å; and res_max, the maximum resolution in Å of the pixels that will be considered in the reconstruction.
The vector (sx,sy,sz) indicates the beam incidence direction (it does not have to be normalized), and is typically set to (0, 0, −1).
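The role of these parameters can be illustrated with a minimal Python sketch that maps a pixel to a scattering vector. It is not the code in make-detector.c. It assumes the typical beam direction (0, 0, −1), a flat detector perpendicular to the beam, and the convention |q| = 1/d (no factor of 2π). The function names pixel_to_q and pixel_is_used are hypothetical.

```python
import math

def pixel_to_q(x, y, cx, cy, detd, px, wl):
    """Scattering vector (in 1/angstrom) for pixel (x, y).

    Assumptions of this sketch: the beam travels along (0, 0, -1), the
    detector is flat and perpendicular to the beam, and |q| = 1/d.
    """
    # Pixel position relative to the sample, in meters.
    r_vec = ((x - cx) * px, (y - cy) * px, -detd)
    r_len = math.sqrt(sum(c * c for c in r_vec))
    s_in = (0.0, 0.0, -1.0)
    # q = (outgoing unit vector - incident unit vector) / wavelength
    return tuple((r / r_len - s) / wl for r, s in zip(r_vec, s_in))

def pixel_is_used(x, y, cx, cy, detd, px, wl, res_max, rstop):
    """True if the pixel lies outside the beamstop and within res_max."""
    if math.hypot(x - cx, y - cy) <= rstop:
        return False
    q = pixel_to_q(x, y, cx, cy, detd, px, wl)
    q_len = math.sqrt(sum(c * c for c in q))
    return q_len <= 1.0 / res_max
```

With the values listed in config.ini, a pixel 1000 pixels away from (cx, cy) along x passes both tests, while a pixel within Rstop of the beam incidence point is rejected.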
After updating the parameters, we move to the directory make-detector and execute the command
python make-mask.py [path to frame] > run.log
to generate the file mask.dat in the directory aux, which excludes the detector gaps and the pixels shadowed by the beamstop holder.
Here [path to frame] is the path to one of the data frames in the cbf format.
You should expect to see a masked data frame:
[Figure: example of a masked data frame]
The beamstop region will be masked out in the files that record the mapping of the detector pixels to reciprocal space, which are obtained by executing the commands:
gcc make-detector.c -O3 -lm -o det
./det ../config.ini >> run.log
After moving to the directory make-background, we generate the lists of the filenames associated with each data frame using the command:
python make-filelists.py [raw-data-dir]
Here [raw-data-dir] is the path to the directory that contains the cbf files downloaded from CXIDB.
Then we update the parameters in the [make-background] section in config.ini:
[make-background]
num_raw_data = 79992
hot_pix_thres = 1e4
qlen = 500
Executing make-filelists.py automatically updates the value of num_raw_data, the total number of data frames.
The parameter hot_pix_thres is the threshold value above which a pixel is identified as defective and masked out.
In our analysis, we assume that the background scatter in each data frame is azimuthally symmetric about the incident X-ray beam, and qlen is the number of equally spaced bins that divide the spatial frequency magnitudes for the background estimation.
Finally, we execute the commands:
make
mpirun -np [nproc] ./ave_bg ../config.ini > run.log &
to estimate the pixel-wise background values and identify the outlier pixels in each frame, where [nproc] is the number of processors used in the parallel processing.
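To make the azimuthal-symmetry assumption concrete, the following Python sketch bins pixels by spatial frequency magnitude and takes one background value per bin. It is an illustration only: the statistic actually used by ave_bg is not stated here, so the plain mean is an assumption, and the function name radial_background is hypothetical.

```python
def radial_background(qmags, counts, qmax, qlen):
    """Mean photon count in each of qlen equally spaced |q| bins.

    qmags  : spatial frequency magnitude of each pixel
    counts : photon count of each pixel (same order as qmags)
    qmax   : upper edge of the last bin
    The use of a plain mean is an assumption of this sketch.
    """
    sums = [0.0] * qlen
    nums = [0] * qlen
    for q, c in zip(qmags, counts):
        # Clamp so that q == qmax falls into the last bin.
        b = min(int(q / qmax * qlen), qlen - 1)
        sums[b] += c
        nums[b] += 1
    # Empty bins are reported as 0.0.
    return [s / n if n else 0.0 for s, n in zip(sums, nums)]
```

Under this assumption, every pixel in a given bin shares the same background estimate, and pixels far above it are the candidates for outliers.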
Next, we move to the directory make-powder to estimate the lattice parameters.
The parameters in the [make-powder] section in config.ini are:
[make-powder]
min_patch_sz = 2
max_patch_sz = 10
min_num_peak = 3
max_num_peak = 20
A Bragg peak candidate is assumed to contain at least min_patch_sz but no more than max_patch_sz contiguous outlier pixels identified from the diffuse background scatter.
Only the data frames with at least min_num_peak but no more than max_num_peak candidate peaks are kept for the later analysis.
The enforcement of data sparsity can be removed by making max_num_peak a large integer.
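The selection rules above can be sketched in Python as follows. This is not the code in make-powder.c. It assumes that "contiguous" means 4-connected neighbors, and the function names find_patches and select_frame are hypothetical.

```python
from collections import deque

def find_patches(outliers):
    """Group outlier pixel coordinates into 4-connected patches.

    4-connectivity is an assumption of this sketch.
    """
    remaining = set(outliers)
    patches = []
    while remaining:
        seed = remaining.pop()
        patch, queue = [seed], deque([seed])
        while queue:
            x, y = queue.popleft()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    patch.append(nb)
                    queue.append(nb)
        patches.append(patch)
    return patches

def select_frame(outliers, min_patch_sz=2, max_patch_sz=10,
                 min_num_peak=3, max_num_peak=20):
    """Return (keep, peaks) for one frame, using the config.ini defaults."""
    peaks = [p for p in find_patches(outliers)
             if min_patch_sz <= len(p) <= max_patch_sz]
    keep = min_num_peak <= len(peaks) <= max_num_peak
    return keep, peaks
```

For example, a frame with three patches of two or three pixels and one isolated outlier pixel yields three candidate peaks and is kept with the default settings.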
By executing the commands
gcc make-powder.c -O3 -lm -o powder
./powder ../config.ini > run.log
we generate the files frame-peak-count.dat, patch-sz-count.dat, 1d-pseudo-powder.dat and 2d-pseudo-powder.dat.
The number of candidate peaks in each data frame is recorded in frame-peak-count.dat.
The file patch-sz-count.dat contains the histogram of the sizes of the contiguous outlier-pixel patches, from which we can check if the original choice of max_patch_sz is reasonable.
The file 1d-pseudo-powder.dat contains three columns: the spatial frequency magnitudes, the counts of inter-peak distances in reciprocal space in each frame, and the counts of spatial frequency magnitudes of the candidate peaks.
Finally, the file 2d-pseudo-powder.dat records the maximum photon count in each detector pixel.
In the analysis of the test dataset, we fit the lattice parameters a = 79.1 Å and c = 38.4 Å by assuming a primitive tetragonal lattice.
This choice can be assessed by executing the command
python plot-1d-powder.py
to plot the histograms of the inter-peak distances and the spatial frequency magnitudes of the candidate peaks. For general crystal lattices, the lattice parameters have to be estimated by fitting the histogram of the inter-peak distances. By executing the command
python plot-2d-powder.py
we plot the 2D pseudo-powder pattern to check if the original estimates of the parameters (cx, cy), the beam incidence point on the detector, and (sx, sy, sz), the beam incidence direction, are reasonable.
All data processing steps from Mapping detector pixels up to this point should be rerun if these parameters are changed.
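As a rough cross-check of the tetragonal fit, the spatial frequency magnitudes expected for a primitive tetragonal lattice follow from 1/d² = (h² + k²)/a² + l²/c². The Python sketch below lists them for comparison with the histogram peaks. It assumes the convention |q| = 1/d and ignores any systematic absences. The function names are hypothetical and are not part of plot-1d-powder.py.

```python
import math

def tetragonal_q(h, k, l, a, c):
    """|q| = 1/d (in 1/angstrom) for reflection (h, k, l)."""
    return math.sqrt((h * h + k * k) / (a * a) + (l * l) / (c * c))

def expected_q_values(a, c, q_max, index_max=5):
    """Sorted distinct |q| values up to q_max for indices 0..index_max."""
    values = set()
    for h in range(index_max + 1):
        for k in range(index_max + 1):
            for l in range(index_max + 1):
                if (h, k, l) == (0, 0, 0):
                    continue
                q = tetragonal_q(h, k, l, a, c)
                if q <= q_max:
                    # Round so that symmetry-equivalent reflections merge.
                    values.add(round(q, 6))
    return sorted(values)
```

Calling expected_q_values(79.1, 38.4, 0.03) gives the low-order positions, starting at 1/a for the (1, 0, 0) reflection.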
The values of the estimated lattice parameters are stored as a 3×3 matrix
u[0] v[0] w[0]
u[1] v[1] w[1]
u[2] v[2] w[2]
in the file basis-vec.dat in the directory aux, where u, v and w denote the basis vectors of the primitive unit cell in units of Å.
For general crystal lattices, this file should be created by the user.
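For illustration, the Python sketch below formats such a matrix for the tetragonal test case. It is a sketch under stated assumptions, not a description of how the program reads the file: the whitespace-separated layout, the number of decimals, and the alignment of a with x and y and of c with z are all assumptions. The function name basis_matrix_text is hypothetical.

```python
def basis_matrix_text(u, v, w):
    """Format basis vectors u, v, w (angstrom) as three text rows.

    Row i holds u[i], v[i], w[i], so the vectors form the columns.
    The whitespace-separated layout is an assumption of this sketch.
    """
    rows = []
    for i in range(3):
        rows.append(f"{u[i]:.6f} {v[i]:.6f} {w[i]:.6f}")
    return "\n".join(rows) + "\n"

# Primitive tetragonal cell with a = 79.1 angstrom and c = 38.4 angstrom,
# assuming a along x and y and c along z.
a, c = 79.1, 38.4
text = basis_matrix_text((a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, c))
```

The resulting string can then be written to basis-vec.dat in the directory aux once the expected layout has been confirmed.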