stddata__2020_03_13 Samples Report

Overview

Introduction

This is a summary of data mirrored from the Genomic Data Commons (GDC) and processed by the GDCtools package. Some sample data are filtered out as unsuitable for downstream pipelines through one of three mechanisms: redactions, replicate filtering, and blacklisting. The report lists the counts and types of the sample data in both hyperlinked tables and heatmap images; describes the three filtering mechanisms; lists the samples removed by filtering and why they were removed; and will eventually catalog how the data have been annotated by the respective projects that submitted them to the GDC.

Summary

There were 0 redactions, 0 replicate aliquots, 0 blacklisted aliquots, and 0 FFPE aliquots. The table below shows the counts for the samples that were ingested into Firehose after filtering out redactions, replicates, and blacklisted data, and segregating FFPEs.

Table 1. Summary of TCGA Tumor Data. Click on a tumor type to display a tumor-type-specific Samples Report.

Cohort	BCR	Clinical	CN	mRNA	miR	MAF	Methylation
TCGA-ACC	92	92	90	79	80	92	80
TCGA-BLCA	412	412	412	408	409	412	412
TCGA-BRCA	1098	1096	1093	1082	1075	1041	1091
TCGA-CESC	307	307	294	303	306	304	306
TCGA-CHOL	51	48	36	36	36	51	36
TCGA-COAD	461	458	452	456	444	432	458
TCGA-COADREAD	633	629	617	622	605	589	623
TCGA-DLBC	58	48	48	48	47	37	48
TCGA-ESCA	185	185	184	161	184	184	185
TCGA-GBM	617	598	596	154	0	396	422
TCGA-GBMLGG	1133	1112	1111	665	512	909	937
TCGA-HNSC	528	527	522	500	523	510	527
TCGA-KICH	113	113	66	65	66	66	66
TCGA-KIPAN	941	940	888	883	873	693	889
TCGA-KIRC	537	536	532	530	516	339	532
TCGA-KIRP	291	291	290	288	291	288	291
TCGA-LAML	200	200	200	151	188	149	140
TCGA-LGG	516	514	515	511	512	513	515
TCGA-LIHC	377	377	375	371	372	375	377
TCGA-LUAD	585	522	518	513	513	569	577
TCGA-LUSC	504	503	503	501	478	497	502
TCGA-MESO	87	87	87	86	87	83	87
TCGA-OV	608	586	587	374	489	441	591
TCGA-PAAD	185	185	184	177	178	183	184
TCGA-PANGI	1261	1257	1243	1158	1225	1214	1251
TCGA-PCPG	179	179	178	178	179	179	179
TCGA-PRAD	500	499	495	495	494	498	497
TCGA-READ	172	171	165	166	161	157	165
TCGA-SARC	261	261	260	259	259	255	261
TCGA-SKCM	470	470	368	367	352	368	368
TCGA-STAD	443	443	442	375	436	441	443
TCGA-STES	628	628	626	536	620	625	628
TCGA-TGCT	150	134	150	150	150	150	150
TCGA-THCA	507	506	505	502	506	496	506
TCGA-THYM	124	124	124	119	124	123	124
TCGA-UCEC	560	547	540	543	538	542	546
TCGA-UCS	57	57	56	56	57	57	57
TCGA-UVM	80	80	80	80	80	80	80
Totals	11315	11156	10947	10084	10130	10308	10803
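
The counts in Table 1 can be cross-checked with a few lines of code. For the BCR column, the Totals row (11315) equals the sum over the 33 individual cohorts, so the combined cohorts (TCGA-COADREAD, TCGA-GBMLGG, TCGA-KIPAN, TCGA-PANGI, TCGA-STES) do not appear to be added again. The minimal Python sketch below checks, on rows copied from Table 1, that two combined cohorts equal the sum of their members in every column. The cohort membership is an assumption inferred from the counts, and the function names are hypothetical; none of this is part of GDCtools.

```python
# Rows copied from Table 1 (tab-separated in the report).
TABLE = """\
Cohort\tBCR\tClinical\tCN\tmRNA\tmiR\tMAF\tMethylation
TCGA-COAD\t461\t458\t452\t456\t444\t432\t458
TCGA-READ\t172\t171\t165\t166\t161\t157\t165
TCGA-COADREAD\t633\t629\t617\t622\t605\t589\t623
TCGA-GBM\t617\t598\t596\t154\t0\t396\t422
TCGA-LGG\t516\t514\t515\t511\t512\t513\t515
TCGA-GBMLGG\t1133\t1112\t1111\t665\t512\t909\t937
"""

# Assumed membership of the combined cohorts, inferred from the counts.
COMBINED = {
    "TCGA-COADREAD": ["TCGA-COAD", "TCGA-READ"],
    "TCGA-GBMLGG": ["TCGA-GBM", "TCGA-LGG"],
}


def parse_table(text):
    """Parse the tab-separated table into {cohort: {data type: count}}."""
    lines = text.strip().splitlines()
    header = lines[0].split("\t")[1:]
    rows = {}
    for line in lines[1:]:
        cohort, *counts = line.split("\t")
        rows[cohort] = dict(zip(header, map(int, counts)))
    return rows


def combined_matches_members(rows, combined, members):
    """True if every column of the combined cohort is the sum of its members."""
    return all(
        rows[combined][col] == sum(rows[m][col] for m in members)
        for col in rows[combined]
    )


rows = parse_table(TABLE)
for name, members in COMBINED.items():
    print(name, combined_matches_members(rows, name, members))
```

Both checks hold for the rows shown, which suggests that any grand total computed from the table should skip the combined cohorts to avoid double counting.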

Results

Sample Heatmaps

TCGA-ACC

Figure 1. This figure depicts the distribution of available data on a per-participant basis.

TCGA-BLCA

Figure 2. This figure depicts the distribution of available data on a per-participant basis.

TCGA-BRCA

Figure 3. This figure depicts the distribution of available data on a per-participant basis.

TCGA-CESC

Figure 4. This figure depicts the distribution of available data on a per-participant basis.

TCGA-CHOL

Figure 5. This figure depicts the distribution of available data on a per-participant basis.

TCGA-COAD

Figure 6. This figure depicts the distribution of available data on a per-participant basis.

TCGA-COADREAD

Figure 7. This figure depicts the distribution of available data on a per-participant basis.

TCGA-DLBC

Figure 8. This figure depicts the distribution of available data on a per-participant basis.

TCGA-ESCA

Figure 9. This figure depicts the distribution of available data on a per-participant basis.

TCGA-GBM

Figure 10. This figure depicts the distribution of available data on a per-participant basis.

TCGA-GBMLGG

Figure 11. This figure depicts the distribution of available data on a per-participant basis.

TCGA-HNSC

Figure 12. This figure depicts the distribution of available data on a per-participant basis.

TCGA-KICH

Figure 13. This figure depicts the distribution of available data on a per-participant basis.

TCGA-KIPAN

Figure 14. This figure depicts the distribution of available data on a per-participant basis.

TCGA-KIRC

Figure 15. This figure depicts the distribution of available data on a per-participant basis.

TCGA-KIRP

Figure 16. This figure depicts the distribution of available data on a per-participant basis.

TCGA-LAML

Figure 17. This figure depicts the distribution of available data on a per-participant basis.

TCGA-LGG

Figure 18. This figure depicts the distribution of available data on a per-participant basis.

TCGA-LIHC

Figure 19. This figure depicts the distribution of available data on a per-participant basis.

TCGA-LUAD

Figure 20. This figure depicts the distribution of available data on a per-participant basis.

TCGA-LUSC

Figure 21. This figure depicts the distribution of available data on a per-participant basis.

TCGA-MESO

Figure 22. This figure depicts the distribution of available data on a per-participant basis.

TCGA-OV

Figure 23. This figure depicts the distribution of available data on a per-participant basis.

TCGA-PAAD

Figure 24. This figure depicts the distribution of available data on a per-participant basis.

TCGA-PANGI

Figure 25. This figure depicts the distribution of available data on a per-participant basis.

TCGA-PCPG

Figure 26. This figure depicts the distribution of available data on a per-participant basis.

TCGA-PRAD

Figure 27. This figure depicts the distribution of available data on a per-participant basis.

TCGA-READ

Figure 28. This figure depicts the distribution of available data on a per-participant basis.

TCGA-SARC

Figure 29. This figure depicts the distribution of available data on a per-participant basis.

TCGA-SKCM

Figure 30. This figure depicts the distribution of available data on a per-participant basis.

TCGA-STAD

Figure 31. This figure depicts the distribution of available data on a per-participant basis.

TCGA-STES

Figure 32. This figure depicts the distribution of available data on a per-participant basis.

TCGA-TGCT

Figure 33. This figure depicts the distribution of available data on a per-participant basis.

TCGA-THCA

Figure 34. This figure depicts the distribution of available data on a per-participant basis.

TCGA-THYM

Figure 35. This figure depicts the distribution of available data on a per-participant basis.

TCGA-UCEC

Figure 36. This figure depicts the distribution of available data on a per-participant basis.

TCGA-UCS

Figure 37. This figure depicts the distribution of available data on a per-participant basis.

TCGA-UVM

Figure 38. This figure depicts the distribution of available data on a per-participant basis.

Filtered Samples

Redactions

Replicate Samples

Blacklisted Samples

Methods & Data

Redactions and Other Annotations

NOT IMPLEMENTED YET: redactions are not yet exposed at the GDC. For examples of the annotation-based filtering performed in the past by the Broad GDAC Firehose pipeline, see the legacy GDAC Firehose sample report.

Preprocessors

mRNA Preprocessor

The mRNA preprocess median module chooses the matrix for the platform (Affymetrix HG U133, Affymetrix Exon Array, or Agilent Gene Expression) with the largest number of samples.
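
The selection rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary layout, the sample identifiers, and the function name are hypothetical, and the report does not say how ties between platforms are broken (here the first platform encountered wins).

```python
def pick_largest_platform(matrices):
    """Return (platform, matrix) for the matrix with the most samples.

    `matrices` maps a platform name to a dict holding a "samples" list.
    Ties go to the first platform in iteration order (an assumption).
    """
    return max(matrices.items(), key=lambda item: len(item[1]["samples"]))


# Hypothetical sample lists for the three platforms named in the report.
matrices = {
    "Affymetrix HG U133": {"samples": ["S1", "S2"]},
    "Affymetrix Exon Array": {"samples": ["S1"]},
    "Agilent Gene Expression": {"samples": ["S1", "S2", "S3"]},
}

platform, matrix = pick_largest_platform(matrices)
print(platform)
```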

mRNAseq Preprocessor

The mRNAseq preprocessor picks the "scaled_estimate" (RSEM) value from the Illumina HiSeq/GA2 mRNAseq level_3 (v2) data set and builds a log2-transformed mRNAseq matrix for downstream analysis. If samples overlap between the two platforms, the samples from Illumina HiSeq are selected. The pipeline also creates a log2-transformed RPKM matrix from the HiSeq/GA2 mRNAseq level 3 (v1) data set.
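
A minimal Python sketch of the two steps described above, platform preference and log2 transformation, for the values of a single gene. The sample names, values, and function names are hypothetical, and the handling of zero values (mapped to None here) is an assumption, since the report does not describe it.

```python
import math


def merge_prefer_hiseq(hiseq, ga):
    """Combine per-sample values from two platforms; HiSeq wins on overlap."""
    merged = dict(ga)
    merged.update(hiseq)  # HiSeq values overwrite GA values for shared samples
    return merged


def log2_transform(values):
    """log2 of each value; non-positive values become None (an assumption)."""
    return {k: (math.log2(v) if v > 0 else None) for k, v in values.items()}


# Hypothetical scaled_estimate values for one gene.
hiseq = {"sample_A": 0.25, "sample_B": 4.0}
ga = {"sample_B": 8.0, "sample_C": 0.5}

merged = merge_prefer_hiseq(hiseq, ga)
log2_values = log2_transform(merged)
print(log2_values)
```

Here sample_B is present on both platforms, so its HiSeq value (4.0) is the one that gets transformed.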

miRseq Preprocessor

The miRseq preprocessor picks the "RPM" (reads per million miRNA precursor reads) value from the Illumina HiSeq/GA miRseq Level_3 data set and builds a matrix of log2-transformed values.
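
For orientation, RPM normalization can be sketched as below. The preprocessor reads RPM values that are already present in the Level_3 data rather than computing them, so this only illustrates what the quantity means. The miRNA names and read counts are hypothetical, and skipping zero values before the log2 step is an assumption.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {name: c / total * 1_000_000 for name, c in counts.items()}


# Hypothetical precursor read counts for one sample.
counts = {"mir-a": 250, "mir-b": 750}

rpm = reads_per_million(counts)
# Zero RPM values are skipped here because log2(0) is undefined (an assumption).
log2_rpm = {name: math.log2(v) for name, v in rpm.items() if v > 0}
print(rpm)
```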

Methylation Preprocessor

The methylation preprocessor filters methylation data for use in downstream pipelines. To learn more about this preprocessor, please visit the documentation.

Made with Nozzle