This is a summary of data mirrored from the Genomic Data Commons (GDC) and processed by the GDCtools package. Note that some sample data will be filtered as unsuitable for downstream pipelines through one of three mechanisms: redactions, replicate filtering, and blacklisting. The report lists the counts and types of the sample data, in both hyperlinked tables and heatmap images; describes the three filtering mechanisms; lists the samples removed by filtering and why they were removed; and will eventually catalog how the data have been annotated by the respective projects that submitted them to the GDC.
There were 0 redactions, 1138 replicate aliquots, 0 blacklisted aliquots, and 0 FFPE aliquots. The table below gives the sample counts for the samples that were ingested into Firehose after filtering out redactions, replicates, and blacklisted data, and segregating FFPEs.
Table 1. Summary of TCGA Tumor Data. Click on a tumor type to display a tumor-type-specific Samples Report.
Cohort | BCR | Clinical | CN | mRNA | miR | MAF | Methylation |
---|---|---|---|---|---|---|---|
TCGA-ACC | 92 | 92 | 90 | 79 | 80 | 92 | 80 |
TCGA-BLCA | 412 | 412 | 412 | 408 | 409 | 412 | 412 |
TCGA-BRCA | 1097 | 1096 | 1094 | 1085 | 1078 | 1044 | 1094 |
TCGA-CESC | 307 | 307 | 295 | 304 | 307 | 305 | 307 |
TCGA-CHOL | 51 | 45 | 36 | 36 | 36 | 51 | 36 |
TCGA-COAD | 461 | 459 | 450 | 456 | 444 | 432 | 458 |
TCGA-COADREAD | 633 | 629 | 614 | 622 | 605 | 589 | 623 |
TCGA-DLBC | 58 | 48 | 48 | 48 | 47 | 48 | 48 |
TCGA-ESCA | 185 | 185 | 184 | 161 | 184 | 184 | 185 |
TCGA-GBM | 616 | 595 | 590 | 154 | 0 | 396 | 422 |
TCGA-GBMLGG | 1131 | 1109 | 1104 | 665 | 512 | 909 | 937 |
TCGA-HNSC | 527 | 527 | 517 | 500 | 523 | 510 | 527 |
TCGA-KICH | 113 | 113 | 66 | 65 | 66 | 66 | 66 |
TCGA-KIPAN | 940 | 940 | 886 | 883 | 873 | 693 | 889 |
TCGA-KIRC | 536 | 536 | 530 | 530 | 516 | 339 | 532 |
TCGA-KIRP | 291 | 291 | 290 | 288 | 291 | 288 | 291 |
TCGA-LAML | 200 | 200 | 143 | 151 | 103 | 149 | 140 |
TCGA-LGG | 515 | 514 | 514 | 511 | 512 | 513 | 515 |
TCGA-LIHC | 377 | 377 | 375 | 371 | 372 | 375 | 377 |
TCGA-LUAD | 584 | 521 | 518 | 513 | 513 | 569 | 577 |
TCGA-LUSC | 503 | 503 | 503 | 501 | 478 | 497 | 502 |
TCGA-MESO | 87 | 87 | 87 | 86 | 87 | 83 | 87 |
TCGA-OV | 607 | 586 | 568 | 374 | 489 | 441 | 591 |
TCGA-PAAD | 185 | 185 | 184 | 177 | 178 | 183 | 184 |
TCGA-PANGI | 1261 | 1257 | 1240 | 1158 | 1225 | 1214 | 1251 |
TCGA-PCPG | 179 | 179 | 178 | 178 | 179 | 179 | 179 |
TCGA-PRAD | 500 | 500 | 495 | 495 | 494 | 498 | 497 |
TCGA-READ | 172 | 170 | 164 | 166 | 161 | 157 | 165 |
TCGA-SARC | 261 | 261 | 260 | 259 | 259 | 255 | 261 |
TCGA-SKCM | 470 | 470 | 368 | 367 | 352 | 368 | 368 |
TCGA-STAD | 443 | 443 | 442 | 375 | 436 | 441 | 443 |
TCGA-STES | 628 | 628 | 626 | 536 | 620 | 625 | 628 |
TCGA-TGCT | 150 | 134 | 134 | 150 | 150 | 150 | 150 |
TCGA-THCA | 506 | 506 | 505 | 502 | 506 | 496 | 506 |
TCGA-THYM | 124 | 124 | 124 | 119 | 124 | 123 | 124 |
TCGA-UCEC | 559 | 547 | 540 | 543 | 538 | 542 | 546 |
TCGA-UCS | 57 | 57 | 56 | 56 | 57 | 57 | 57 |
TCGA-UVM | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
Totals | 11305 | 11150 | 10840 | 10088 | 10049 | 10323 | 10807 |
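
The Totals row appears to count only the individual cohorts, with the combined cohorts (COADREAD, GBMLGG, KIPAN, PANGI, STES) left out so that their participants are not counted twice. The report does not state this; it is inferred from the numbers, and the cohort groupings below are likewise an assumption. A minimal Python sketch that checks the inference against the BCR column of Table 1:

```python
# BCR counts copied from Table 1.
bcr = {
    "TCGA-ACC": 92, "TCGA-BLCA": 412, "TCGA-BRCA": 1097, "TCGA-CESC": 307,
    "TCGA-CHOL": 51, "TCGA-COAD": 461, "TCGA-COADREAD": 633, "TCGA-DLBC": 58,
    "TCGA-ESCA": 185, "TCGA-GBM": 616, "TCGA-GBMLGG": 1131, "TCGA-HNSC": 527,
    "TCGA-KICH": 113, "TCGA-KIPAN": 940, "TCGA-KIRC": 536, "TCGA-KIRP": 291,
    "TCGA-LAML": 200, "TCGA-LGG": 515, "TCGA-LIHC": 377, "TCGA-LUAD": 584,
    "TCGA-LUSC": 503, "TCGA-MESO": 87, "TCGA-OV": 607, "TCGA-PAAD": 185,
    "TCGA-PANGI": 1261, "TCGA-PCPG": 179, "TCGA-PRAD": 500, "TCGA-READ": 172,
    "TCGA-SARC": 261, "TCGA-SKCM": 470, "TCGA-STAD": 443, "TCGA-STES": 628,
    "TCGA-TGCT": 150, "TCGA-THCA": 506, "TCGA-THYM": 124, "TCGA-UCEC": 559,
    "TCGA-UCS": 57, "TCGA-UVM": 80,
}

# Assumed membership of the combined cohorts (inferred from the counts,
# not taken from the report).
aggregates = {
    "TCGA-COADREAD": ["TCGA-COAD", "TCGA-READ"],
    "TCGA-GBMLGG": ["TCGA-GBM", "TCGA-LGG"],
    "TCGA-KIPAN": ["TCGA-KICH", "TCGA-KIRC", "TCGA-KIRP"],
    "TCGA-STES": ["TCGA-STAD", "TCGA-ESCA"],
    "TCGA-PANGI": ["TCGA-COAD", "TCGA-READ", "TCGA-STAD", "TCGA-ESCA"],
}

# Sum only the individual cohorts.
individual_total = sum(n for cohort, n in bcr.items() if cohort not in aggregates)

# Check that each combined cohort equals the sum of its assumed members.
aggregates_consistent = all(
    bcr[name] == sum(bcr[member] for member in members)
    for name, members in aggregates.items()
)

print(individual_total)       # 11305, matching the Totals row
print(aggregates_consistent)  # True
```

The same check can be repeated for the other columns by swapping in their counts.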
Figure 1. Distribution of available data on a per-participant basis for TCGA-ACC.
![](TCGA-ACC.2017_10_17.low_res.heatmap.png)
Figure 2. Distribution of available data on a per-participant basis for TCGA-BLCA.
![](TCGA-BLCA.2017_10_17.low_res.heatmap.png)
Figure 3. Distribution of available data on a per-participant basis for TCGA-BRCA.
![](TCGA-BRCA.2017_10_17.low_res.heatmap.png)
Figure 4. Distribution of available data on a per-participant basis for TCGA-CESC.
![](TCGA-CESC.2017_10_17.low_res.heatmap.png)
Figure 5. Distribution of available data on a per-participant basis for TCGA-CHOL.
![](TCGA-CHOL.2017_10_17.low_res.heatmap.png)
Figure 6. Distribution of available data on a per-participant basis for TCGA-COAD.
![](TCGA-COAD.2017_10_17.low_res.heatmap.png)
Figure 7. Distribution of available data on a per-participant basis for TCGA-COADREAD.
![](TCGA-COADREAD.2017_10_17.low_res.heatmap.png)
Figure 8. Distribution of available data on a per-participant basis for TCGA-DLBC.
![](TCGA-DLBC.2017_10_17.low_res.heatmap.png)
Figure 9. Distribution of available data on a per-participant basis for TCGA-ESCA.
![](TCGA-ESCA.2017_10_17.low_res.heatmap.png)
Figure 10. Distribution of available data on a per-participant basis for TCGA-GBM.
![](TCGA-GBM.2017_10_17.low_res.heatmap.png)
Figure 11. Distribution of available data on a per-participant basis for TCGA-GBMLGG.
![](TCGA-GBMLGG.2017_10_17.low_res.heatmap.png)
Figure 12. Distribution of available data on a per-participant basis for TCGA-HNSC.
![](TCGA-HNSC.2017_10_17.low_res.heatmap.png)
Figure 13. Distribution of available data on a per-participant basis for TCGA-KICH.
![](TCGA-KICH.2017_10_17.low_res.heatmap.png)
Figure 14. Distribution of available data on a per-participant basis for TCGA-KIPAN.
![](TCGA-KIPAN.2017_10_17.low_res.heatmap.png)
Figure 15. Distribution of available data on a per-participant basis for TCGA-KIRC.
![](TCGA-KIRC.2017_10_17.low_res.heatmap.png)
Figure 16. Distribution of available data on a per-participant basis for TCGA-KIRP.
![](TCGA-KIRP.2017_10_17.low_res.heatmap.png)
Figure 17. Distribution of available data on a per-participant basis for TCGA-LAML.
![](TCGA-LAML.2017_10_17.low_res.heatmap.png)
Figure 18. Distribution of available data on a per-participant basis for TCGA-LGG.
![](TCGA-LGG.2017_10_17.low_res.heatmap.png)
Figure 19. Distribution of available data on a per-participant basis for TCGA-LIHC.
![](TCGA-LIHC.2017_10_17.low_res.heatmap.png)
Figure 20. Distribution of available data on a per-participant basis for TCGA-LUAD.
![](TCGA-LUAD.2017_10_17.low_res.heatmap.png)
Figure 21. Distribution of available data on a per-participant basis for TCGA-LUSC.
![](TCGA-LUSC.2017_10_17.low_res.heatmap.png)
Figure 22. Distribution of available data on a per-participant basis for TCGA-MESO.
![](TCGA-MESO.2017_10_17.low_res.heatmap.png)
Figure 23. Distribution of available data on a per-participant basis for TCGA-OV.
![](TCGA-OV.2017_10_17.low_res.heatmap.png)
Figure 24. Distribution of available data on a per-participant basis for TCGA-PAAD.
![](TCGA-PAAD.2017_10_17.low_res.heatmap.png)
Figure 25. Distribution of available data on a per-participant basis for TCGA-PANGI.
![](TCGA-PANGI.2017_10_17.low_res.heatmap.png)
Figure 26. Distribution of available data on a per-participant basis for TCGA-PCPG.
![](TCGA-PCPG.2017_10_17.low_res.heatmap.png)
Figure 27. Distribution of available data on a per-participant basis for TCGA-PRAD.
![](TCGA-PRAD.2017_10_17.low_res.heatmap.png)
Figure 28. Distribution of available data on a per-participant basis for TCGA-READ.
![](TCGA-READ.2017_10_17.low_res.heatmap.png)
Figure 29. Distribution of available data on a per-participant basis for TCGA-SARC.
![](TCGA-SARC.2017_10_17.low_res.heatmap.png)
Figure 30. Distribution of available data on a per-participant basis for TCGA-SKCM.
![](TCGA-SKCM.2017_10_17.low_res.heatmap.png)
Figure 31. Distribution of available data on a per-participant basis for TCGA-STAD.
![](TCGA-STAD.2017_10_17.low_res.heatmap.png)
Figure 32. Distribution of available data on a per-participant basis for TCGA-STES.
![](TCGA-STES.2017_10_17.low_res.heatmap.png)
Figure 33. Distribution of available data on a per-participant basis for TCGA-TGCT.
![](TCGA-TGCT.2017_10_17.low_res.heatmap.png)
Figure 34. Distribution of available data on a per-participant basis for TCGA-THCA.
![](TCGA-THCA.2017_10_17.low_res.heatmap.png)
Figure 35. Distribution of available data on a per-participant basis for TCGA-THYM.
![](TCGA-THYM.2017_10_17.low_res.heatmap.png)
Figure 36. Distribution of available data on a per-participant basis for TCGA-UCEC.
![](TCGA-UCEC.2017_10_17.low_res.heatmap.png)
Figure 37. Distribution of available data on a per-participant basis for TCGA-UCS.
![](TCGA-UCS.2017_10_17.low_res.heatmap.png)
Figure 38. Distribution of available data on a per-participant basis for TCGA-UVM.
![](TCGA-UVM.2017_10_17.low_res.heatmap.png)
NOT IMPLEMENTED YET: redactions are not yet exposed at the GDC. For examples of the annotation-based filtering performed in the past by the Broad GDAC Firehose pipeline, explore this legacy GDAC Firehose sample report.
The mRNA preprocess median module chooses the matrix for the platform (Affymetrix HG U133, Affymetrix Exon Array, or Agilent Gene Expression) with the largest number of samples.
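The platform selection described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name `choose_platform`, the input layout (platform name mapped to a list of sample IDs), and the sample IDs are all hypothetical. How ties are broken is not stated in the report; here the first platform encountered wins, which is an assumption.

```python
def choose_platform(samples_by_platform):
    """Return the platform whose matrix covers the most samples.

    samples_by_platform: dict mapping a platform name to the list of
    sample IDs in that platform's matrix (hypothetical layout).
    Ties go to the first platform in iteration order (an assumption).
    """
    return max(samples_by_platform, key=lambda p: len(samples_by_platform[p]))


# Hypothetical sample IDs, for illustration only.
example = {
    "Affymetrix HG U133": ["S1", "S2", "S3"],
    "Affymetrix Exon Array": ["S1", "S2"],
    "Agilent Gene Expression": ["S1", "S2", "S3", "S4"],
}

chosen = choose_platform(example)
print(chosen)  # Agilent Gene Expression
```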
The mRNAseq preprocessor picks the "scaled_estimate" (RSEM) value from the Illumina HiSeq/GA2 mRNAseq Level_3 (v2) data set and builds a log2-transformed mRNAseq matrix for downstream analysis. If samples overlap between the two platforms, the samples from Illumina HiSeq are selected. The pipeline also creates a log2-transformed RPKM matrix from the HiSeq/GA2 mRNAseq Level_3 (v1) data set.
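A rough sketch of the two steps described above, preferring HiSeq for overlapping samples and then applying log2, might look like this. The function name, the nested-dict layout, and the sample and gene values are hypothetical. The report does not say how zero values are handled before the log2 transform; mapping them to NaN here is an assumption.

```python
import math


def build_mrnaseq_matrix(hiseq, ga2):
    """Merge per-platform scaled_estimate values and log2-transform them.

    hiseq, ga2: dict mapping sample ID -> {gene: scaled_estimate}
    (hypothetical layout). When a sample appears on both platforms,
    the HiSeq values are kept.
    """
    merged = dict(ga2)
    merged.update(hiseq)  # HiSeq entries overwrite GA2 entries on overlap
    return {
        sample: {
            # Non-positive values have no log2; NaN is an assumed placeholder.
            gene: math.log2(value) if value > 0 else float("nan")
            for gene, value in genes.items()
        }
        for sample, genes in merged.items()
    }


# Hypothetical values, for illustration only.
hiseq = {"S1": {"TP53": 0.5}}
ga2 = {"S1": {"TP53": 0.25}, "S2": {"TP53": 0.125}}

matrix = build_mrnaseq_matrix(hiseq, ga2)
print(matrix["S1"]["TP53"])  # -1.0 (HiSeq value 0.5 was used)
print(matrix["S2"]["TP53"])  # -3.0 (only GA2 had this sample)
```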
The miRseq preprocessor picks the "RPM" (reads per million miRNA precursor reads) value from the Illumina HiSeq/GA miRseq Level_3 data set and builds a matrix of log2-transformed values.
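For readers unfamiliar with the unit, the sketch below illustrates what an RPM value means and how the log2 transform applies to it. The preprocessor itself reads RPM values already present in the Level_3 files; this code only illustrates the definition, and the function names, miRNA labels, and read counts are hypothetical. Treating zero RPM as NaN is an assumption, since the report does not describe zero handling.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that they sum to one million.

    counts: dict mapping miRNA precursor -> raw read count for one sample.
    """
    total = sum(counts.values())
    return {mirna: count * 1_000_000 / total for mirna, count in counts.items()}


def log2_rpm(rpm_values):
    """Log2-transform RPM values; zero maps to NaN (an assumption)."""
    return {
        mirna: math.log2(value) if value > 0 else float("nan")
        for mirna, value in rpm_values.items()
    }


# Hypothetical read counts for one sample (total is exactly one million).
counts = {"mir-A": 4, "mir-B": 999_996}

rpm_values = reads_per_million(counts)
log_values = log2_rpm(rpm_values)
print(rpm_values["mir-A"])  # 4.0
print(log_values["mir-A"])  # 2.0
```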
The methylation preprocessor filters methylation data for use in downstream pipelines. To learn more about this preprocessor, please visit the documentation.