-
Notifications
You must be signed in to change notification settings - Fork 16
SHRIMP: Sub FindCoherentGroup
Using the median, stepwise reject points with greatest weighted residuals until the probability (Probfit) is greater than MinCoherentProb.
FindCoherentGroup(AllN, CoherentN, BadCt, BadOnes, AllX, AllSigmaX, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit, MinCoherentProb, MinCoherentFract, OkVal)
AllN: Integer specifying the number of data-rows passed to this subroutine.
AllX: Vector of length AllN, containing double-precision ages/values, for all the data-rows passed to this subroutine.
AllSigmaX: Vector of length AllN, containing double-precision uncertainties (1sigma ABSOLUTE) associated with the ages/values, for all the data-rows passed to this subroutine.
MinCoherentProb: Double-precision number defining the minimum (threshold) value for probability of equivalence required for a coherent group to be valid. In SQUID 2.50, the target value of MinCoherentProb is user-specified (via spinner) in the "Group Me" screen.
MinCoherentFract: Double-precision number defining the minimum (threshold) value for the minimum proportion of data-rows, relative to the total number of input data-rows (i.e. CoherentN / AllN), required for inclusion in a coherent group before that group is considered valid. In SQUID 2.50, the target value of MinCoherentFract (i.e. Coherent N / All N) is user-specified (via spinner) in the "Group Me" screen.
CoherentN: Integer count of the number of data-rows included in the coherent group, if such a group is found.
BadCt: Integer count of the number of analyses rejected from the coherent group during the calculation procedure, if such a group is found.
BadOnes: Vector of integers corresponding to the row-indexes of the analyses rejected from the coherent group during the calculation procedure, if such a group is found.
CoherentX: Vector containing double-precision numbers corresponding to the ages/values that have been included in the coherent group, if it is found.
CoherentSigmaX: Vector of double-precision uncertainties (1sigma ABSOLUTE) associated with the ages/values that have been included in the coherent group, if it is found.
MeanVal: Double-precision value of the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.
MeanErr: Double-precision 1SIGMA ABSOLUTE uncertainty on the value of the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.
MSWD: Double-precision value of the MSWD associated with the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.
Probfit: Double-precision value of the probability of equivalence associated with the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.
OkVal: Boolean specifying whether there exists a coherent group that satisfies BOTH of the user-defined constraints imposed by MinCoherentProb and MinCoherentFract.
Values of type Integer
CoherentCt, MaxResIndx, NumCoherent, SpotCt
Values of type Double
MaxRes, MedianVal, MinProb, WtdResid
Arrays of type Double
SCX, ScXe
This subroutine begins by assuming that there is NO coherent group (OkVal = FALSE):
CoherentN = 0
OkVal = FALSE
BadCt = 0
ReDim BadOnes[1 To AllN]
Firstly, eliminate all (non-zero) spot-values for which the 1sigma absolute uncertainty exceeds the value:
For SpotCt = 1 To AllN
If AllX[SpotCt] <> 0
If ABS( AllSigmaX[SpotCt] / AllX[SpotCt] ) < 1
CoherentN = 1 + CoherentN
--Lengthen vectors to accommodate values:
ReDim Preserve CoherentX[1 To CoherentN]
ReDim Preserve CoherentSigmaX[1 To CoherentN]
--Insert values:
CoherentX[CoherentN] = AllX[SpotCt]
CoherentSigmaX[CoherentN] = AllSigmaX[SpotCt]
End If --ABS( AllSigmaX[SpotCt] / AllX[SpotCt] ) < 1
End If --AllX[SpotCt] <> 0
Next SpotCt
Now check that any such exclusions are not terminal:
If (CoherentN < 2) OR ( (CoherentN / AllN) < MinCoherentFract )
Exit Sub
End If
12 December 2019: The next step is to check immediately to see whether the retained values satisfy the coherent group, by calculating a "simple weighted average". Ludwig defined a dedicated subroutine for this, called SimpleWtdAv, and In suspect it is very old code, written for SQUID-1 in 1999. I have documented SimpleWtdAv separately, in order to record unambiguously exactly what the intended functionality is (https://github.com/CIRDLES/ET_Redux/wiki/SHRIMP:-Sub-SimpleWtdAv). But I think it would be much sounder practice if Squid3 instead invoked the relevant variant of our existing Isoplot3.WtdAv-based function, and so that is what I have tried to do in the code-block below.
In broad terms, the "simple weighted average" ignores the possibility of an "external error", and makes no independent attempt to identify or reject outliers, because the outlier-rejection process is instead handled explicitly by the surrounding Sub FindCoherentGroup code that invokes the "simple weighted average" calculation. Essentially, its aim is to take input vectors of double-precision values (and their 1sigma absolute uncertainties), and perform the arithmetic necessary to determine a MeanVal, its 1sigma absolute MeanErr, its MSWD and its Probfit.
In the code-block below, I have preserved (commented-out) the SimpleWtdAv-based invocation written by Ludwig. Beneath it, I have written what I believe to be the equivalent invocation using our existing Isoplot3-based "Ludwig Library" function WtdAv (which Squid3 has already used before, as part of Sub WtdMeanAcalc):
--SimpleWtdAv CoherentN, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit
--Previous line is Ludwig original code; SimpleWtdAv subroutine documented
--separately, in case more clarity is needed regarding its exact function
--Re-expressing SimpleWtdAv in terms of Isoplot3.WtdAv, the sequenced arguments are:
--InpR = two-column array of length CoherentN, comprising CoherentX and CoherentSigmaX
--PercentErrorsOut = FALSE, because FindCoherentGroup relates exclusively to AGES
--PercentErrorsIn = FALSE, because FindCoherentGroup relates exclusively to AGES
--SigmaLevelIn = 1
--CanReject = FALSE, because subsequent Do-Loop handles this explicitly
--ConstantExternalErr = FALSE, because Unknown spots should have comprehensive errors
--SigmaLevelOut = 1
Isoplot3.WtdAv(InpR, FALSE, FALSE, 1, FALSE, FALSE, 1)
--Previous line is Bodorkos-derived equivalent of "SimpleWtdAv"
--This calculation will yield MeanVal, MeanErr (1sigma absolute), MSWD and Probfit
If Probfit > MinCoherentProb
OkVal = TRUE --Job done! No further culling needed
Exit Sub
End If
But in the vast majority of cases, the remaining set of analyses will be dispersed beyond the limit imposed by MinCoherentProb, and an iterative process is necessary.
Do --Reduce CoherentN until Probfit exceeds MinCoherentProb
If ( ( (CoherentN - 1) / AllN ) < MinCoherentFract ) OR (CoherentN < 2)
--Both conditions relate to CoherentN becoming too small for a coherent group
Exit Sub
End If
MaxRes = 0
MedianVal = MEDIAN( CoherentX ) --i.e. median of values in the vector
For CoherentCt = 1 To CoherentN
WtdResid = ABS( CoherentX[CoherentCt] - MedianVal ) / CoherentSigmaX[CoherentCt]
If WtdResid > MaxRes --i.e. if WtdResid is the largest value thus far
MaxRes = WtdResid
MaxResIndx = CoherentCt --row-index of data-point that will be rejected
End If --WtdResid > MaxRes
Next CoherentCt
BadCt = 1 + BadCt --in case Loop departure happens via Exit Sub above, I think
ReDim SCX[1 To CoherentN], ScXe[1 To CoherentN] --empties SCX and ScXe, and
--shortens each vector by 1, for each Loop
For CoherentCt = 1 To CoherentN --write CohX and CohSigmaX to SCX and ScXe
SCX[CoherentCt] = CoherentX[CoherentCt]
ScXe[CoherentCt] = CoherentSigmaX[CoherentCt]
Next CoherentCt
CoherentN = CoherentN - 1 --now decrease CoherentN
NumCoherent = 0
ReDim CoherentX[1 To CoherentN], CoherentSigmaX[1 To CoherentN]
--empties CoherentX and CoherentSigmaX (contents preserved in SCX and ScXe),
--and shortens each vector by 1, for each Loop
For CoherentCt = 1 To ( CoherentN + 1 ) --because now reading from SCX and ScXe,
--which at this point are vectors with 1 extra element than CohX and CohSigmaX
If CoherentCt <> MaxResIndx --i.e. if NOT the biggest outlier in SCX, then write
--SCX values back to CoherentX, and ScXe values back to CoherentSigmaX:
NumCoherent = 1 + NumCoherent
CoherentX[NumCoherent] = SCX[CoherentCt]
CoherentSigmaX[NumCoherent] = ScXe[CoherentCt]
End If --CoherentCt <> MaxResIndx
Next CoherentCt
--SimpleWtdAv CoherentN, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit
--Previous line is Ludwig original; next is Bodorkos equivalent (definitions above)
--which will yield MeanVal, MeanErr (1sigma absolute), MSWD and Probfit
Isoplot3.WtdAv(InpR, FALSE, FALSE, 1, FALSE, FALSE, 1)
Loop Until Probfit > MinCoherentProb --End of 12 December 2019 revisions
We only come out the end of this Loop if the MinCoherentProb condition has been satisfied, so now we just need one last check that MinCoherentFract is still satisfied:
If ( (CoherentN - 1) / AllN ) > MinCoherentFract
OkVal = TRUE
End If
BadCt = 0 --reset from Loop
Finally, reconcile the elements of the final CoherentX against the elements of the original AllX, and use value-differences to identify both the number of 'bad' data-rows (i.e. those eliminated by stepwise rejection as implemented in the Do... Loop), and the index-numbers of those bad rows:
For SpotCt = 1 To AllN
For CoherentCt = 1 To CoherentN
If CoherentX[CoherentCt] = AllX[SpotCt]
Exit For
End If --CoherentX[CoherentCt] = AllX[SpotCt]
Next CoherentCt
If CoherentCt > CoherentN --i.e. mismatch because AllX has an element missing from
--CoherentX because it was designated 'bad' by the Do... Loop
BadCt = 1 + BadCt
BadOnes[BadCt] = CoherentCt --i.e. index-number of 'bad' data-row
End If
Next SpotCt
So far, so good... but this subroutine finishes quite strangely, with an If that appears to completely circumvent the MinCoherentFract constraint!
If CoherentN > 1 --What is the point of the MinCoherentFract test, if OkVal will be
--set TRUE for *ANY* CoherentX vector with two or more elements? Don't understand
OkVal = TRUE
End If --CoherentN > 1
Lastly, tidy up the vector containing the index-numbers of rejected spots (which still has length AllN), preserving its existing contents:
If BadCt > 0
ReDim Preserve BadOnes[1 To BadCt]
End If
End Sub