DDMRA results are dissimilar to original paper's findings #17
Based on a conversation with Jonathan Power, here are some ideas about what could be causing the issues:
I've been trying to figure out how to identify outlier scans across the edges. Mahalanobis distance, which is probably the optimal choice for data with more samples than variables, won't work in this case because there are definitely more variables than samples. If no one is able to provide any recommendations, I'm leaning toward switching to a different distance measure (e.g., Euclidean) and flagging any subjects that are >2 SD from the mean distance. @handwerkerd does that make sense to you? The other idea, to look at median motion and identify dataset-specific thresholds, hasn't really borne out. I have plotted median and mean motion across datasets, and I don't see any clearly identifiable peaks on the right sides of the median FD distributions. Below is a figure with the KDE plots for the two measures, across the three datasets we use in Experiment 2.
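For concreteness, here is a minimal sketch of that Euclidean-distance idea, assuming a subjects-by-edges array of Fisher-z correlations (the array and function names are placeholders, not code from this repository):

```python
import numpy as np

def flag_distance_outliers(z_values, n_sd=2.0):
    """z_values: (n_subjects, n_edges) array of Fisher-z correlation values."""
    mean_vector = z_values.mean(axis=0)
    # Euclidean distance of each subject's edge vector from the group mean
    distances = np.linalg.norm(z_values - mean_vector, axis=1)
    threshold = distances.mean() + n_sd * distances.std()
    return distances > threshold  # boolean mask of flagged subjects
```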
Unfortunately, I never heard back from the person I emailed, so I'm thinking I'll use a combination of PCA and Mahalanobis distance to identify outliers. The idea is to use PCA to reduce the 700x35000 array to 700x5 (or something else), and then calculate distance on that, as in this example. Does that sound reasonable? EDIT: The problem I'm noticing is that the first 5-10 components only explain about 5-6% of the variance, but trying to use the 500 or so that explain 90-95% of the variance leads to a bunch of warnings during the Mahalanobis calculation. Also, even using a threshold of p < 0.001 tends to leave me with a lot of outliers (80-140).
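A rough sketch of that PCA + Mahalanobis approach with scikit-learn and scipy, where the 5 components and the p < 0.001 cutoff are just the values mentioned above and the variable names are hypothetical:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis
from scipy.stats import chi2
from sklearn.decomposition import PCA

def pca_mahalanobis_outliers(z_values, n_components=5, alpha=0.001):
    """z_values: (n_subjects, n_edges) array of Fisher-z correlation values."""
    scores = PCA(n_components=n_components).fit_transform(z_values)
    center = scores.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    # Squared Mahalanobis distances are ~chi-square(df=n_components) for normal data
    d2 = np.array([mahalanobis(row, center, cov_inv) ** 2 for row in scores])
    return d2 > chi2.ppf(1 - alpha, df=n_components)  # boolean mask of outliers
```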
Sorry for a bit of a delay in responding. That said, something like Mahalanobis distance would be good for quantifying the distance between each dataset and a normal distribution, which seems to be what you're planning to do. I don't completely follow what your samples and variables are in this specific analysis, but, instead of fitting to a model based on all your conditions, could you just calculate the distance between all (or many) pairs of distributions and cluster each dataset based on those distances? Then outliers might appear as clusters that are farther away from other datasets than expected. Hope this helps.
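One possible reading of that suggestion, sketched below: build a subject-by-subject distance matrix, cluster it hierarchically, and treat tiny or distant clusters as candidate outliers. This is only an illustrative interpretation with hypothetical names, not anything implemented in this project:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_subjects(z_values, cut_distance):
    """z_values: (n_subjects, n_edges); cut_distance: height at which to cut the tree."""
    pairwise = pdist(z_values, metric="euclidean")  # condensed subject-by-subject distances
    tree = linkage(pairwise, method="average")
    labels = fcluster(tree, t=cut_distance, criterion="distance")
    sizes = np.bincount(labels)
    # Subjects falling in singleton (or near-singleton) clusters are candidate outliers
    candidate_outliers = np.where(sizes[labels] <= 2)[0]
    return labels, candidate_outliers
```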
I should have said that the Fisher's z transform enforces a roughly normal distribution for random correlation coefficients, rather than saying that the transformed z values will be normal. Still, I think it's safe to work under the assumption that the ROI-to-ROI coefficients will be approximately normal after being transformed.
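For reference, the transform itself is just the inverse hyperbolic tangent of the correlation coefficient; under the null, the z values are approximately normal with variance 1 / (n - 3) for n time points. A one-liner in numpy:

```python
import numpy as np

def fisher_z(r):
    # Fisher's z transform: arctanh(r) == 0.5 * log((1 + r) / (1 - r))
    return np.arctanh(r)
```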
In this case, the samples are all of the subjects going into the DDMRA analyses, while the variables are the ROI-to-ROI correlation coefficients for all of the Power ROI pairs. The DDMRA analyses are performed across subjects on each ROI-to-ROI edge, and then a sliding window average is performed across those edges, according to the distance between the ROIs, to get the smoothing curves we see in the DDMRA figures, like this one. I'm not sure if finding outliers using subsets of the ROI pairs will work, unless I have some way of aggregating those findings across all of the pairs (e.g., subjects who are outliers in X of the 35000 edges would be removed from the analysis). I think using dimensionality reduction via PCA makes sense, and has some precedent, though I'm less sure about the thresholds and number of components to use.
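A toy version of that sliding-window smoothing, assuming one value and one inter-ROI distance per edge; the window size and names here are arbitrary, and the real implementation lives in the ddmra package:

```python
import numpy as np

def smoothing_curve(edge_values, edge_distances, window=1000):
    """edge_values, edge_distances: 1D arrays with one entry per ROI pair."""
    order = np.argsort(edge_distances)
    sorted_values = edge_values[order]
    sorted_distances = edge_distances[order]
    kernel = np.ones(window) / window
    smoothed = np.convolve(sorted_values, kernel, mode="valid")  # moving average over edges
    # Distances corresponding to each window, for plotting against the curve
    centers = sorted_distances[window // 2 : window // 2 + smoothed.size]
    return centers, smoothed
```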
It might not be of direct relevance, but Javier & I used PCA to reduce connectivity matrices in https://www.pnas.org/content/112/28/8762#F11, and Sup Fig 5 shows how much cognitive-state-relevant info survives for different levels of variance retained by the PCA. I haven't personally calculated DDMRA measures, so this might be my own confusion, but sometimes it's better to identify outliers a step or two away from your measure of interest. I'm sticking with the idea that you want to identify weird distributions compared to other subjects. One possible way to reduce that space would be to calculate the average connectivity matrix for each dataset. Then, for each subject, you'd calculate the distance from that average. That wouldn't let you cluster groups of subjects with similar patterns, but it would highlight any subjects that are particularly far from the mean.
I agree. In this case, the measure of interest is generally going to be a correlation of correlation coefficients against mean framewise displacement, a difference of correlation coefficients between high- and low-motion subjects, or a difference of correlation coefficients between the raw time series and the censored version. What I tried to reject outliers from was the set of correlation coefficients going into each of those.
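As a minimal illustration of the first of those measures (the QC:RSFC correlation), computed edge-wise against mean framewise displacement, with hypothetical input arrays:

```python
import numpy as np

def qcrsfc(z_values, mean_fd):
    """z_values: (n_subjects, n_edges); mean_fd: (n_subjects,) mean framewise displacement."""
    z_centered = z_values - z_values.mean(axis=0)
    fd_centered = mean_fd - mean_fd.mean()
    numerator = z_centered.T @ fd_centered
    denominator = np.sqrt((z_centered ** 2).sum(axis=0) * (fd_centered ** 2).sum())
    return numerator / denominator  # one Pearson correlation per edge
```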
That's sort of what the Mahalanobis distance was doing, but it's for multivariate normally distributed data. I was thinking of Euclidean distance from the average matrix because Mahalanobis doesn't work directly when the number of variables > number of samples, but PCA + Mahalanobis seemed more established since I could find examples of its use. However... the results were still extremely significant/nonsignificant when I removed outliers, and I'm totally lost as to what to do at this point. Outlier correlation coefficients don't seem to be driving the significant findings, so I wonder if there's just a problem with how the significance tests are done.
Have you re-generated these for individual datasets, rather than collapsed across datasets, @tsalo? I believe you mentioned that, for Cambridge only, the DDMRA findings were similar to the original work. Is that still true?
I ran the dataset-specific analyses, but I haven't reviewed them yet. I'll update this comment when I have something to show.
The DDMRA results I get are pretty dissimilar to the original paper's findings. While I expect some differences driven by differences in preprocessing and denoising methods, these go beyond my expectations.
From the replication manuscript, our predictions were:
Analysis of Cambridge, CamCAN, and DuPre datasets
Here are the results using the Cambridge, CamCAN, and DuPre datasets, only dropping subjects with missing or zero-variance data in one or more ROIs (as well as subjects that had already been flagged from other steps). The table shows the p-values for the analyses.
The code used to perform the analysis is this script, and the version of the DDMRA package I used is at https://github.com/tsalo/ddmra/tree/f9be4687d6465c09aae158509566083d456f2567.
Replication on just the Cambridge dataset
Just in case the problem was due to intersite differences (as mentioned in Power et al., 2017), I ran the analyses just using the Cambridge dataset, which should have produced very similar results to the original paper.
Note that the scrubbing analysis results still look terrible, but also the MEDN intercepts were nowhere near significant and the MEDN+GODEC QC:RSFC intercept was significant (and slope was nearly significant).
Replication on subjects with mean FD < 0.2mm
Per Power et al., 2017, DDMRA analyses of multiple datasets can be corrupted by differences in baseline levels of motion, as well as outliers.
Specifically, what it says is:
Also, in the original paper's supplement, they re-run the analyses using only subjects with mean FD < 0.3 mm and < 0.2 mm, separately, and report that the results were basically the same. I was concerned that the CamCAN dataset, which has higher levels of motion (probably because it includes older participants), might be causing problems in the main analysis, so I ran the supplement's analysis using only subjects with mean FD < 0.2 mm. However, the results are still quite dissimilar from the original paper.
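For reference, the subject selection for this supplementary analysis amounts to a simple threshold on mean FD; a sketch with a hypothetical participants table (file and column names are placeholders):

```python
import pandas as pd

# Hypothetical metadata table; the real file and column names may differ.
participants = pd.read_table("participants.tsv")
low_motion_ids = participants.loc[participants["mean_fd"] < 0.2, "participant_id"]
```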
I don't know if there is a problem with the analyses/dataset or if the CamCAN dataset is simply driving different, but still valid, results.