Files here provide a unified summary of all Year 2 and later CPTAC3 genomic analysis results and their location on CPTAC-DCC.
DCC Analysis Summary files have the following initial columns:
1. case
2. disease
3. pipeline_name
4. pipeline_version
5. timestamp
6. C3Y
7. DCC_path
8. filesize
9. file_format
10. md5sum
NEW: Column C3Y indicates "CPTAC3 Year" and takes values Y1, Y2, etc. It is used for administrative purposes.
Additional columns are specific to individual pipelines and will typically indicate the input data associated with this analysis. Pipelines which generate multiple result files per case will have multiple entries in the analysis summary file.
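As a minimal sketch of reading one of these files, the ten initial columns can be separated from the pipeline-specific ones as shown below. This assumes the file is tab-delimited text whose first row names the columns; the delimiter and header layout are assumptions not stated here and should be checked against the actual file. The `load_summary` and `extra_columns` helpers, the `input_uuid` column, and all sample values are hypothetical.

```python
import csv
import io

# The ten initial columns listed above, in order.
COMMON_COLUMNS = [
    "case", "disease", "pipeline_name", "pipeline_version", "timestamp",
    "C3Y", "DCC_path", "filesize", "file_format", "md5sum",
]

def load_summary(handle, delimiter="\t"):
    """Read an analysis summary into a list of dicts, one per result file.

    Assumption: delimited text with a header row. A leading '#' on the
    header names, if present, is stripped.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = [name.strip().lstrip("#").strip() for name in next(reader)]
    missing = [c for c in COMMON_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return [dict(zip(header, row)) for row in reader if row]

def extra_columns(rows):
    """Return the pipeline-specific column names (everything beyond the common ten)."""
    if not rows:
        return []
    return [name for name in rows[0] if name not in COMMON_COLUMNS]

# Hypothetical example: one case with two result files, so two entries.
SAMPLE = "\n".join([
    "\t".join(COMMON_COLUMNS + ["input_uuid"]),
    "\t".join(["C3L-00001", "LUAD", "RNA-Seq_Fusion", "v1.0", "20200101",
               "Y2", "/hypothetical/path/a.tsv", "1024", "TSV", "0" * 32, "uuid-1"]),
    "\t".join(["C3L-00001", "LUAD", "RNA-Seq_Fusion", "v1.0", "20200101",
               "Y2", "/hypothetical/path/b.tsv", "2048", "TSV", "1" * 32, "uuid-1"]),
])

rows = load_summary(io.StringIO(SAMPLE))
print(len(rows), extra_columns(rows))
```

Keeping each row as a dict keyed by column name means the same loader works for every pipeline, whatever additional columns it carries.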
Counts of cases processed per disease and pipeline. Results aligned to a reference other than GDC hg38 are excluded from these counts.
Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total |
---|---|---|---|---|---|---|---|---|---|---|---|
Methylation Array | 43 | 222 | 8 | 116 | 111 | 113 | 229 | 164 | 19 | 246 | 1271 |
miRNA-Seq | 42 | 222 | 8 | 114 | 111 | 113 | 229 | 164 | 19 | 247 | 1269 |
miRNA-Seq QC | 11 | 28 | 5 | 52 | 17 | 27 | 10 | 77 | 19 | 39 | 285 |
RNA-Seq Expression | 11 | 221 | 8 | 114 | 111 | 113 | 164 | 164 | 19 | 181 | 1106 |
RNA-Seq Fusion | 11 | 112 | 8 | 119 | 111 | 113 | 164 | 83 | 19 | 77 | 817 |
RNA-Seq Transcript + Splicing | 1 | 112 | 3 | 119 | 111 | 113 | 53 | 83 | 0 | 77 | 672 |
RNA-Seq QC | 11 | 112 | 8 | 119 | 111 | 113 | 53 | 83 | 19 | 77 | 706 |
WGS SV | 0 | 0 | 0 | 59 | 109 * | 113 | 111 | 77 | 0 | 39 | 508 |
WGS CNV Somatic | 0 | 117 | 0 | 59 | 21 | 113 | 121 | 77 | 0 | 39 | 547 |
WGS QC | 0 | 0 | 0 | 60 | 21 | 113 | 0 | 77 | 0 | 39 | 310 |
WXS MSI | 0 | 105 | 0 | 118 | 111 | 113 | 111 | 0 | 0 | 143 | 701 |
WXS Normal Adjacent | 0 | 80 | 0 | 0 | 5 | 22 | 101 | 0 | 0 | 21 | 229 |
WXS QC | 5 | 87 | 4 | 118 | 111 | 113 | 101 | 44 | 9 | 51 | 643 |
WXS Somatic | 0 | 0 | 0 | 0 | 0 | 113 | 0 | 0 | 0 | 0 | 113 |
WXS Somatic SW | 0 | 0 | 0 | 0 | 0 | 113 | 109 | 0 | 0 | 0 | 222 |
Last update: 2/19/20
* HNSCC WGS SV analysis is an ad hoc analysis of data aligned by UMich to a custom reference.
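Per-disease case counts like those in the table can be approximated from analysis summary rows by counting distinct cases, so that a case with several result files is counted once. The sketch below is one plausible way to do this and is not necessarily how the published table was produced. It does not model the exclusion of results aligned to a reference other than GDC hg38, since that would depend on a pipeline-specific column. The `count_cases` helper and the sample rows are hypothetical.

```python
from collections import defaultdict

def count_cases(rows):
    """Count distinct cases per (pipeline_name, disease) pair.

    A case with multiple result files contributes only once.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[(row["pipeline_name"], row["disease"])].add(row["case"])
    return {key: len(cases) for key, cases in seen.items()}

# Hypothetical rows: the first case has two result files.
ROWS = [
    {"case": "C3L-00001", "disease": "LUAD", "pipeline_name": "RNA-Seq_Fusion"},
    {"case": "C3L-00001", "disease": "LUAD", "pipeline_name": "RNA-Seq_Fusion"},
    {"case": "C3L-00002", "disease": "LUAD", "pipeline_name": "RNA-Seq_Fusion"},
    {"case": "C3N-00003", "disease": "GBM", "pipeline_name": "RNA-Seq_Fusion"},
]

counts = count_cases(ROWS)
for (pipeline, disease), n in sorted(counts.items()):
    print(pipeline, disease, n)
```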
Processing performed during CPTAC3 Year 1 consisted of analyses for the CCRCC, LUAD, and UCEC discovery cohorts; a visual summary of processing per batch can be found in this processing update description. A subset of Year 1 calls is included in the DCC analysis summaries here, mainly those whose pipeline versions are consistent with those in Year 2. Year 1 analyses can be identified by "Y1" in the C3Y column and do not have details about input data.
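Since Year 1 entries are marked by "Y1" in the C3Y column, they can be separated from later analyses with a simple grouping. This is a minimal sketch that assumes the rows have already been read into dicts keyed by column name; the `split_by_year` helper and the sample values are hypothetical.

```python
def split_by_year(rows):
    """Group analysis summary rows by their C3Y value (e.g. 'Y1', 'Y2')."""
    by_year = {}
    for row in rows:
        by_year.setdefault(row["C3Y"], []).append(row)
    return by_year

# Hypothetical rows for illustration only.
ROWS = [
    {"case": "C3L-00001", "disease": "LUAD", "C3Y": "Y1"},
    {"case": "C3L-00002", "disease": "LUAD", "C3Y": "Y2"},
    {"case": "C3N-00003", "disease": "GBM", "C3Y": "Y2"},
]

by_year = split_by_year(ROWS)
year1_cases = [row["case"] for row in by_year.get("Y1", [])]
print(year1_cases)
```

Because Year 1 rows lack input data details, any downstream code that reads pipeline-specific columns may need to treat the "Y1" group separately.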
Details and notes about pipelines and processing status are given below. More complete pipeline details are in the documentation included with the data files on DCC.
Methylation array analysis has been performed for all cases available on DCC through December 2019 (details in Methylation_Array.DCC_analysis_summary.dat).
CPTAC3 Methylation pipeline details
Analysis details miRNA-Seq.DCC_analysis_summary.dat
Note that each sample has results for mature miRNA, precursor miRNA, and total miRNA.
miRNA-Seq pipeline documentation and processing description.
Analysis details miRNA-Seq_QC.DCC_analysis_summary.dat
965 cases (all Y1 and Y2 cases) have been analyzed.
Analysis details RNA-Seq_Expression.DCC_analysis_summary.dat
CPTAC3 RNA-Seq Expression pipeline
1259 samples across 817 cases have been analyzed, including all Y2 cases and LUAD cases from Y1.
Analysis details RNA-Seq_Fusion.DCC_analysis_summary.dat, and pipeline documentation on GitHub
Analysis details RNA-Seq_Transcript.DCC_analysis_summary.dat
Pipeline documentation on GitHub
Analysis details RNA-Seq_QC.DCC_analysis_summary.dat
Year 1 LUAD analyses are included.
Also in the DCC analysis summary file are 109 HNSCC cases aligned to a custom reference (GRCh38_full_analysis_set_plus_decoy_hla) for the UMich group.
Analysis details WGS_SV.DCC_analysis_summary.dat
CPTAC3 SomaticSV pipeline on GitHub
Analysis details WGS_CNV_Somatic.DCC_analysis_summary.dat
All Y1 analyses with pipeline version v2.0 have been added to the analysis summary.
Analysis details WGS_QC.DCC_analysis_summary.dat
Analysis details WXS_MSI.DCC_analysis_summary.dat
Analysis details WXS_Normal_Adjacent.DCC_analysis_summary.dat
WXS Normal Adjacent analysis was generated using the TinDaisy pipeline.
Analysis details WXS_QC.DCC_analysis_summary.dat
WXS Somatic analysis is new for CPTAC3 Year 3. So far, only 113 LSCC cases have been analyzed, using TinDaisy variant caller v2.1.
Analysis details WXS_Somatic.DCC_analysis_summary.dat
113 LSCC and 109 LUAD cases are in WXS_Somatic_Variant_SW.DCC_analysis_summary.dat.
This is an ad hoc upload of calls generated by SomaticWrapper v1.5 and is provided for backwards compatibility with prior SomaticWrapper calls. The WXS_Somatic_Variant pipeline above (based on TinDaisy) is expected to ultimately replace these calls.