Skip to content

SHRIMP: Sub FindCoherentGroup

sbodorkos edited this page Dec 11, 2019 · 5 revisions

SHRIMP: Sub FindCoherentGroup

Using the median, stepwise reject points with greatest weighted residuals until the probability (Probfit) is greater than MinCoherentProb.

Usage

FindCoherentGroup(AllN, CoherentN, BadCt, BadOnes, AllX, AllSigmaX, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit, MinCoherentProb, MinCoherentFract, OkVal)

Mandatory variables

AllN: Integer specifying the number of data-rows passed to this subroutine.

AllX: Vector of length AllN, containing double-precision ages/values, for all the data-rows passed to this subroutine.

AllSigmaX: Vector of length AllN, containing double-precision uncertainties (1sigma ABSOLUTE) associated with the ages/values, for all the data-rows passed to this subroutine.

MinCoherentProb: Double-precision number defining the minimum (threshold) value for probability of equivalence required for a coherent group to be valid. In SQUID 2.50, the target value of MinCoherentProb is user-specified (via spinner) in the "Group Me" screen.

MinCoherentFract: Double-precision number defining the minimum (threshold) value for the minimum proportion of data-rows, relative to the total number of input data-rows (i.e. CoherentN / AllN), required for inclusion in a coherent group before that group is considered valid. In SQUID 2.50, the target value of MinCoherentFract (i.e. Coherent N / All N) is user-specified (via spinner) in the "Group Me" screen.

Optional variables

CoherentN: Integer count of the number of data-rows included in the coherent group, if such a group is found.

BadCt: Integer count of the number of analyses rejected from the coherent group during the calculation procedure, if such a group is found.

BadOnes: Vector of integers corresponding to the row-indexes of the analyses rejected from the coherent group during the calculation procedure, if such a group is found.

CoherentX: Vector containing double-precision numbers corresponding to the ages/values that have been included in the coherent group, if it is found.

CoherentSigmaX: Vector of double-precision uncertainties (1sigma ABSOLUTE) associated with the ages/values that have been included in the coherent group, if it is found.

MeanVal: Double-precision value of the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.

MeanErr: Double-precision 1SIGMA ABSOLUTE uncertainty on the value of the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.

MSWD: Double-precision value of the MSWD associated with the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.

Probfit: Double-precision value of the probability of equivalence associated with the weighted mean of the coherent group defined by CoherentX and CoherentSigmaX, if it is found.

OkVal: Boolean specifying whether there exists a coherent group that satisfies BOTH of the user-defined constraints imposed by MinCoherentProb and MinCoherentFract.


Definition of variables

Values of type Integer
CoherentCt, MaxResIndx, NumCoherent, SpotCt

Values of type Double
MaxRes, MedianVal, MinProb, WtdResid

Arrays of type Double
SCX, ScXe


This subroutine begins by assuming that there is NO coherent group (OkVal = FALSE):

CoherentN = 0
OkVal = FALSE
BadCt = 0

ReDim BadOnes[1 To AllN]

Firstly, eliminate all (non-zero) spot-values for which the 1sigma absolute uncertainty exceeds the value:

For SpotCt = 1 To AllN
  If AllX[SpotCt] <> 0 

    If ABS( AllSigmaX[SpotCt] / AllX[SpotCt] ) < 1 

      CoherentN = 1 + CoherentN

      --Lengthen vectors to accommodate values:
      ReDim Preserve CoherentX[1 To CoherentN]
      ReDim Preserve CoherentSigmaX[1 To CoherentN]

      --Insert values:
      CoherentX[CoherentN] = AllX[SpotCt]
      CoherentSigmaX[CoherentN] = AllSigmaX[SpotCt]

    End If --ABS( AllSigmaX[SpotCt] / AllX[SpotCt] ) < 1

  End If --AllX[SpotCt] <> 0
Next SpotCt

Now check that any such exclusions are not terminal:

If (CoherentN < 2) OR ( (CoherentN / AllN) < MinCoherentFract )

  Exit Sub

End If

12 December 2019: The next step is to check immediately to see whether the retained values satisfy the coherent group, by calculating a "simple weighted average". Ludwig defined a dedicated subroutine for this, called SimpleWtdAv, and In suspect it is very old code, written for SQUID-1 in 1999. I have documented SimpleWtdAv separately, in order to record unambiguously exactly what the intended functionality is (https://github.com/CIRDLES/ET_Redux/wiki/SHRIMP:-Sub-SimpleWtdAv). But I think it would be much sounder practice if Squid3 instead invoked the relevant variant of our existing Isoplot3.WtdAv-based function, and so that is what I have tried to do in the code-block below.

In broad terms, the "simple weighted average" ignores the possibility of an "external error", and makes no independent attempt to identify or reject outliers, because the outlier-rejection process is instead handled explicitly by the surrounding Sub FindCoherentGroup code that invokes the "simple weighted average" calculation. Essentially, its aim is to take input vectors of double-precision values (and their 1sigma absolute uncertainties), and perform the arithmetic necessary to determine a MeanVal, its 1sigma absolute MeanErr, its MSWD and its Probfit.

In the code-block below, I have preserved (commented-out) the SimpleWtdAv-based invocation written by Ludwig. Beneath it, I have written what I believe to be the equivalent invocation using our existing Isoplot3-based "Ludwig Library" function WtdAv (which Squid3 has already used before, as part of Sub WtdMeanAcalc):

--SimpleWtdAv CoherentN, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit 
--Previous line is Ludwig original code; SimpleWtdAv subroutine documented
--separately, in case more clarity is needed regarding its exact function

--Re-expressing SimpleWtdAv in terms of Isoplot3.WtdAv, the sequenced arguments are:
--InpR = two-column array of length CoherentN, comprising CoherentX and CoherentSigmaX 
--PercentErrorsOut = FALSE, because FindCoherentGroup relates exclusively to AGES
--PercentErrorsIn = FALSE, because FindCoherentGroup relates exclusively to AGES
--SigmaLevelIn = 1
--CanReject = FALSE, because subsequent Do-Loop handles this explicitly
--ConstantExternalErr = FALSE, because Unknown spots should have comprehensive errors
--SigmaLevelOut = 1

Isoplot3.WtdAv(InpR, FALSE, FALSE, 1, FALSE, FALSE, 1) 
--Previous line is Bodorkos-derived equivalent of "SimpleWtdAv"
--This calculation will yield MeanVal, MeanErr (1sigma absolute), MSWD and Probfit

If Probfit > MinCoherentProb

  OkVal = TRUE --Job done! No further culling needed
  Exit Sub
  
End If

But in the vast majority of cases, the remaining set of analyses will be dispersed beyond the limit imposed by MinCoherentProb, and an iterative process is necessary.

Do --Reduce CoherentN until Probfit exceeds MinCoherentProb

  If ( ( (CoherentN - 1) / AllN ) < MinCoherentFract ) OR (CoherentN < 2)
  --Both conditions relate to CoherentN becoming too small for a coherent group

    Exit Sub
  
  End If
  
  MaxRes = 0
  MedianVal = MEDIAN( CoherentX ) --i.e. median of values in the vector

  For CoherentCt = 1 To CoherentN
    WtdResid = ABS( CoherentX[CoherentCt] - MedianVal ) / CoherentSigmaX[CoherentCt]

    If WtdResid > MaxRes --i.e. if WtdResid is the largest value thus far
  
      MaxRes = WtdResid
      MaxResIndx = CoherentCt --row-index of data-point that will be rejected
    
    End If --WtdResid > MaxRes
  Next CoherentCt

  BadCt = 1 + BadCt --in case Loop departure happens via Exit Sub above, I think

  ReDim SCX[1 To CoherentN], ScXe[1 To CoherentN] --empties SCX and ScXe, and 
  --shortens each vector by 1, for each Loop

  For CoherentCt = 1 To CoherentN --write CohX and CohSigmaX to SCX and ScXe

    SCX[CoherentCt] = CoherentX[CoherentCt]
    ScXe[CoherentCt] = CoherentSigmaX[CoherentCt]

  Next CoherentCt

  CoherentN = CoherentN - 1 --now decrease CoherentN
  NumCoherent = 0

  ReDim CoherentX[1 To CoherentN], CoherentSigmaX[1 To CoherentN]
  --empties CoherentX and CoherentSigmaX (contents preserved in SCX and ScXe),
  --and shortens each vector by 1, for each Loop

  For CoherentCt = 1 To ( CoherentN + 1 ) --because now reading from SCX and ScXe,
  --which at this point are vectors with 1 extra element than CohX and CohSigmaX

    If CoherentCt <> MaxResIndx --i.e. if NOT the biggest outlier in SCX, then write
    --SCX values back to CoherentX, and ScXe values back to CoherentSigmaX:
  
      NumCoherent = 1 + NumCoherent
      CoherentX[NumCoherent] = SCX[CoherentCt] 
      CoherentSigmaX[NumCoherent] = ScXe[CoherentCt]

    End If --CoherentCt <> MaxResIndx

  Next CoherentCt

  --SimpleWtdAv CoherentN, CoherentX, CoherentSigmaX, MeanVal, MeanErr, MSWD, Probfit
  --Previous line is Ludwig original; next is Bodorkos equivalent (definitions above)
  --which will yield MeanVal, MeanErr (1sigma absolute), MSWD and Probfit
  
  Isoplot3.WtdAv(InpR, FALSE, FALSE, 1, FALSE, FALSE, 1) 

Loop Until Probfit > MinCoherentProb --End of 12 December 2019 revisions

We only come out the end of this Loop if the MinCoherentProb condition has been satisfied, so now we just need one last check that MinCoherentFract is still satisfied:

If ( (CoherentN - 1) / AllN ) > MinCoherentFract 

  OkVal = TRUE
  
End If

BadCt = 0 --reset from Loop

Finally, reconcile the elements of the final CoherentX against the elements of the original AllX, and use value-differences to identify both the number of 'bad' data-rows (i.e. those eliminated by stepwise rejection as implemented in the Do... Loop), and the index-numbers of those bad rows:

For SpotCt = 1 To AllN

  For CoherentCt = 1 To CoherentN

    If CoherentX[CoherentCt] = AllX[SpotCt] 

      Exit For

    End If --CoherentX[CoherentCt] = AllX[SpotCt]

  Next CoherentCt

  If CoherentCt > CoherentN --i.e. mismatch because AllX has an element missing from
  --CoherentX because it was designated 'bad' by the Do... Loop

    BadCt = 1 + BadCt
    BadOnes[BadCt] = CoherentCt --i.e. index-number of 'bad' data-row

  End If

Next SpotCt

So far, so good... but this subroutine finishes quite strangely, with an If that appears to completely circumvent the MinCoherentFract constraint!

If CoherentN > 1 --What is the point of the MinCoherentFract test, if OkVal will be
--set TRUE for *ANY* CoherentX vector with two or more elements? Don't understand

  OkVal = TRUE
  
End If --CoherentN > 1

Lastly, tidy up the vector containing the index-numbers of rejected spots (which still has length AllN), preserving its existing contents:

If BadCt > 0 
  
  ReDim Preserve BadOnes[1 To BadCt]

End If

End Sub
Clone this wiki locally