-
Notifications
You must be signed in to change notification settings - Fork 16
SHRIMP: Step 1
2016.02.25 (separated to new 'Step 1' page 2016.03.08)
[The following is intended as a commentary illuminating the data-reduction workflow, rather than a forensic description of the calculations: the latter detail is incorporated in the documented code.]
The first stage of data reduction visible to SQUID users is the transformation of the XML file into analysis-specific arrays of "total counts at peak" values. For each analysis in a SQUID-book, this array has Nscans rows, and comprises 5 columns per species:
- time_stamp_sec at peak (harvested directly from XML)
- total ion counts at peak (calculated below)
- +/-1sigma value for total ion counts at peak (calculated below)
- total SBM counts at peak (calculated below)
- trim_mass at peak (harvested directly from XML).
In a SQUID-book, these arrays are presented in a worksheet with a name very similar to that of the source XML file, and at the right-hand end of each array are analysis-specific parameters harvested from the XML file, some of which (e.g. deadtime, sbm_zero) are used in early-stage calculations.
In brief, moving from the parsed XML file to this worksheet involves determining (on a peak by peak basis) for each set of 10 constituent integrations:
- a robust mean value for SBM counts, and
- a mean (either robust or arithmetic, depending principally on count-rate) and +/- 1sigma value for species-specific counts, incorporating a correction for deadtime in the ion counter (i.e. electron multiplier).
For SBM data, the treatment is invariant: "total counts SBM" (as reported in SQUID-book)
TotCtsSBM = 10 * BV[SBM]
where BV[SBM] is the Tukey's biweight mean value of the 10 integrations, calculated with tuning factor = 6.
For species-specific data, calculate the median of the 10 integrations.
If median > 100.0, calculate a biweight value (BV) and biweight sigma (BS), with tuning factor = 9. Convert the BV to counts/second:
BVcps = BV * 10 / count_time_sec
Correct the measured BVcps for ion counter deadtime (i.e. BVcpsDT) via the classical equation:
BVcpsDT = BVcps / (1 – [BVcps * Deadtime])
where Deadtime is in seconds. Then "total counts at peak" (as reported in SQUID-book)
TotCtsPk = BVcpsDT * count_time_sec
There are two candidates for the "+/-1sigma" associated with the "total counts at peak": BS and sqrt(BV). The larger of the two (i.e. max[ BS,sqrt(BV )] ) is chosen. Then "+/-1sigma at mass-station" (as reported in SQUID-book):
TotCtsPkSigma = Max[ BS,sqrt(BV) ] / sqrt(10) * BVcps * count_time_sec / BV
If median <= 100.0, set NumInt = 10, and calculate:
sumX = sum of the 10 integrations
sumX2 = sum of the squares of the 10 integrations
Now examine the integrations, and consider the possibility of rejecting the single largest outlier (that with the largest residual relative to the median). Start by establishing LowerLim and UpperLim values, inside of which no integration can be considered an outlier, and no rejection will be attempted.
-
LowerLim = i-th element of the 101-element, zero-indexed vector array ArrL:
ArrL = Array(0, 0, 0, 0, 1, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 11, 12, 13, 13, 14, 15, 16, 17, 17, 18, 19, 20, 21, 21, 22, 23, 24, 25, 26, 26, 27, 28, 29, 30, 31, 31, 32, 33, 34, 35, 36, 37, 38, 38, 39, 40, 41, 42, 43, 44, 44, 45, 46, 47, 48, 49, 50, 51, 51, 52, 53, 54, 55, 56, 57, 58, 59, 59, 60, 61, 62, 63, 64, 65, 66, 67, 67, 68, 69, 70, 71, 72, 73, 74, 75, 75, 76, 77, 78, 79, 80, 81)
where i = median (and where non-integer medians are rounded to the nearest EVEN integer). -
UpperLim = i-th element of the 101-element, zero-indexed vector array ArrU:
ArrU = Array(6, 2, 4, 6, 7, 9, 10, 12, 13, 14, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27, 28, 29, 31, 32, 33, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 55, 56, 58, 59, 60, 61, 62, 63, 64, 66, 67, 68, 69, 70, 71, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90, 91, 93, 94, 95, 96, 97, 98, 99, 100, 101, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 114, 115, 116, 117, 118, 119)
where i = median (and where non-integer medians are rounded to the nearest EVEN integer).
[Vectors ArrL and ArrU represent the 95% confidence limits for the range of integers <= 100, based on a Monte Carlo simulation.]
Check each of the 10 integrations to see whether any value is either less than LowerLim or greater than UpperLim.
- If so, calculate each of the residuals from the median, find the (single) integration with the largest residual. (If the value of the largest residual is shared by multiple integrations, select as the single outlier the integration acquired first in the sequence.) Once the outlier is defined, remove it from further consideration by subtracting its value from sumX, subtracting its square from sumX2, and setting NumInt to 9.
- If not, retain sumX, sumX2 and NumInt as originally calculated.
Then PkMeanCts is simply the arithmetic mean of the surviving integrations:
PkMeanCts = sumX / NumInt
Convert PkMeanCts to counts/second:
PeakMeancps = PkMeanCts * 10 / count_time_sec
Correct the measured PeakMeancps for ion counter deadtime (i.e. PeakMeancpsDT) via the classical equation:
PeakMeancpsDT = PeakMeancps / (1 – [PeakMeancps * Deadtime])
where Deadtime is in seconds. Then "total counts at peak" (as reported in SQUID-book)
TotCtsPk = PeakMeancpsDT * count_time_sec
There are two candidates for the "+/-1sigma" associated with this value: SigmaPkCts and PoissonSig, defined respectively as:
SigmaPkCts = sqrt { [sumX2 – (sumX ^ 2 / NumInt)] / [NumInt – 1] }
PoissonSig = sqrt (PkMeanCts)
The larger of the two (i.e. max[ SigmaPkCts,PoissonSig ] ) is chosen. Then "+/-1sigma at peak" (as reported in SQUID-book)
TotCtsPkSigma = Max[ SigmaPkCts,PoissonSig ] / sqrt (NumInt) * PeakMeancps * count_time_sec / PkMeanCts
Next: Step 2: Background- and SBMzero-correction, and generation of 'total cps' columns