Mesothelioma (MESO) Samples Report
Overview
Introduction

The Broad GDAC mirrors data from the DCC on a daily basis. Although all data is mirrored, not every sample is ingested into Firehose. There are three main mechanisms that filter samples to ensure that only the most scientifically relevant samples make it into our standard data and analysis runs. These three mechanisms are redactions, replicate filtering, and blacklisting. This report summarizes the data that is ingested into Firehose, describes the three filtering mechanisms, lists those samples that are removed, and gives all available annotations from the DCC's Annotation Manager.

Summary

There were 0 redactions, 0 replicate aliquots, 0 blacklisted aliquots, and 0 FFPE aliquots. The table below represents the sample counts for those samples that were ingested into Firehose after filtering out redactions, replicates, and blacklisted data, and segregating FFPEs.

Table 1.  This table provides a breakdown of sample counts on a per sample type and, if applicable, per subtype basis. Each count is a link to a table containing a list of the samples that comprise that count and details pertaining to each individual sample (e.g. platform, sequencing center, etc.). Please note, there are usually multiple protocols per data type, so there are typically many more rows than the count implies.

Sample Type  BCR  Clinical  CN  LowP  Methylation  mRNA  mRNASeq  miR  miRSeq  RPPA  MAF
TP            37         0   0     0            0     0        0    0       0     0    0
Totals        37         0   0     0            0     0        0    0       0     0    0
MESO Primary Solid Tumor BCR Data

Table S1. 

TCGA Barcode Platform Center Data Level Protocol
TCGA-LK-A4NW Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4NY Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4NZ Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O0 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O2 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O4 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O5 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O6 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-LK-A4O7 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4KX Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LC Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LI Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LJ Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LM Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LP Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A4LV Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A6BL Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A6BN Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A6BQ Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A6BR Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-MQ-A6BS Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-NQ-A57I Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-NQ-A638 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SC-A6LM Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SC-A6LN Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SC-A6LP Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SC-A6LQ Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SC-A6LR Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SH-A7BC Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SH-A7BD Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-SH-A7BH Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7OY Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7P0 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7P3 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7P6 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7P8 Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen
TCGA-TS-A7PB Biospecimen Metadata - Complete Set Nationwide Children's Hospital 1 biospecimen

The following platforms are outdated and are not included in the counts depicted in the table above.

  • Agilent SurePrint G3 Human CGH Microarray Kit 1x1M

  • Agilent Human Genome CGH Microarray 244A

  • Agilent Human Genome CGH Custom Microarray 2x415K

  • Affymetrix Human Exon 1.0 ST Array

  • Illumina DNA Methylation OMA002 Cancer Panel I

  • Illumina DNA Methylation OMA003 Cancer Panel I

  • Illumina Human1M-Duo BeadChip

  • Illumina 550K Infinium HumanHap550 SNP Chip

The sample type short letter codes in the table above are defined in the following list.

  • TP: Primary Solid Tumor

  • TR: Recurrent Solid Tumor

  • TB: Primary Blood Derived Cancer - Peripheral Blood

  • TAP: Additional - New Primary

  • TM: Metastatic

  • TAM: Additional Metastatic

  • NB: Blood Derived Normal

  • NT: Solid Tissue Normal

Figure 1.  This figure depicts the distribution of available data on a per participant basis.

Results
Filtered Samples
Redactions

For TCGA data, redaction is the removal of cases from the data prior to publication or release. Redacted cases are generally rare, but cases must be redacted when the TSS/BCR subject link is incorrect ("unknown patient identity"), or in the case of genotype mismatch, completely wrong cancer, or completely wrong organ/tissue. Redaction occurs regardless of a case's analyte characterization or DCC data deposition status.

Rescission is the removal of samples from the list of redactions. This happens if the reason for redaction is eventually cleared up. For clarity, rescinded redactions do not appear in this report.

There were no redactions.

Replicate Samples

In many instances there is more than one aliquot for a given combination of individual, platform, and data type. However, only one aliquot may be ingested into Firehose. Therefore, a set of precedence rules is applied to select the most scientifically advantageous one among them. Two filters are applied to achieve this aim: an Analyte Replicate Filter and a Sort Replicate Filter.

Analyte Replicate Filter

The following precedence rules are applied when the aliquots have differing analytes. For RNA aliquots, T analytes are dropped in preference to H and R analytes, since T is the inferior extraction protocol. If H and R are encountered, H is the chosen analyte. This is somewhat arbitrary and subject to change, since it is not clear at present whether H or R is the better protocol. If there are multiple aliquots associated with the chosen RNA analyte, the aliquot with the later plate number is chosen. For DNA aliquots, D analytes (native DNA) are preferred over G, W, or X (whole-genome amplified) analytes, unless the G, W, or X analyte sample has a higher plate number.
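The analyte rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Firehose implementation: the barcodes are taken to follow the usual TCGA aliquot layout (analyte letter at the end of the fifth field, plate in the sixth), plate identifiers are compared as strings, and the function names and sample barcodes are hypothetical. Where the rules leave more than one candidate, the sketch returns all of them so that the Sort Replicate Filter can break the tie.

```python
RNA_PREFERENCE = ["H", "R", "T"]   # H is chosen over R; T is dropped when H or R exists
WGA_ANALYTES = {"G", "W", "X"}     # whole-genome amplified DNA


def parse_aliquot(barcode):
    """Return (analyte, plate) from a full aliquot barcode.

    Assumed layout: TCGA-TSS-participant-sample/vial-portion/analyte-plate-center.
    """
    fields = barcode.split("-")
    return fields[4][-1], fields[5]


def analyte_replicate_filter(barcodes):
    """Apply the analyte precedence rules to replicate aliquots (hypothetical helper)."""
    parsed = [(b, *parse_aliquot(b)) for b in barcodes]
    analytes = {analyte for _, analyte, _ in parsed}

    # RNA: pick the most preferred analyte present, then the latest plate.
    rna_present = [a for a in RNA_PREFERENCE if a in analytes]
    if rna_present:
        chosen = rna_present[0]
        candidates = [(b, plate) for b, a, plate in parsed if a == chosen]
        latest = max(plate for _, plate in candidates)
        return [b for b, plate in candidates if plate == latest]

    # DNA: native (D) wins unless an amplified aliquot has a higher plate.
    # Assumption: "higher" is read as a string comparison of plate identifiers.
    native = [(b, plate) for b, a, plate in parsed if a == "D"]
    amplified = [(b, plate) for b, a, plate in parsed if a in WGA_ANALYTES]
    if native and amplified:
        best_native = max(plate for _, plate in native)
        newer = [b for b, plate in amplified if plate > best_native]
        return newer if newer else [b for b, _ in native]
    # Only one DNA group present (or unrecognized analytes): nothing to decide here.
    return [b for b, _ in (native or amplified)] or list(barcodes)
```

For example, given hypothetical T, R, and H aliquots of the same sample, the sketch keeps only the H aliquot; given a D aliquot on plate 0100 and a W aliquot on plate 0200, it keeps the W aliquot.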

Sort Replicate Filter

The following precedence rules are applied when the analyte filter still produces more than one sample. The sort filter chooses the aliquot with the highest lexicographical sort value, to ensure that the barcode with the highest portion and/or plate number is selected when all other barcode fields are identical.
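As a minimal sketch of the sort rule, and assuming plain string comparison is what "lexicographical sort value" means here, the selection reduces to taking the maximum barcode. The function name and barcodes below are hypothetical.

```python
def sort_replicate_filter(barcodes):
    """Return the single aliquot with the highest lexicographic barcode."""
    # Python compares strings character by character, so when all earlier
    # fields match, the barcode with the higher portion or plate wins.
    return max(barcodes)


candidates = [
    "TCGA-XX-0001-01A-11D-0100-01",  # portion 11
    "TCGA-XX-0001-01A-12D-0100-01",  # portion 12
]
selected = sort_replicate_filter(candidates)
```

Here `selected` is the portion-12 aliquot, since the two barcodes differ only in the portion field.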

There were no replicate samples.

Blacklisted Samples

In certain circumstances, replicate filtering may choose the less favorable among two or more aliquots. For instance, an analyst may manually review two aliquots for a given individual and determine that one is superior. If the replicate filtering would choose the inferior sample, then the inferior sample can be added to this blacklist. This would result in the desired sample being chosen. This table lists those blacklisted samples and a reason for their being blacklisted.

There were no blacklisted samples.

FFPE cases
FFPE description from TCGA Tissue Sample Requirements (2013)

FFPE (formalin-fixed paraffin-embedded) samples are not suitable for molecular analysis because the RNA and DNA are trapped in the nucleic acid-protein cross-linking from the fixation process.

There were no FFPE cases.

Additional Annotations from the DCC's Annotations Manager
Methods & Data
Redactions and Other Annotations

Annotation data was taken from the TCGA Data Portal using the query string:

https://tcga-data.nci.nih.gov/annotations/resources/searchannotations/json?item=TCGA

Redaction information was generated by filtering for the annotationClassificationName "Redaction"

FFPE information was generated by filtering for "FFPE" in annotation note text

Additional FFPEs were garnered from clinical data

Remaining annotations were sorted into sections by annotationClassificationName
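The sorting steps above can be illustrated with a short Python sketch. The record layout is an assumption made for illustration: only the field name annotationClassificationName comes from this report, while the flat dictionary shape, the "item" and "notes" keys, and the function name are hypothetical and may not match the actual JSON returned by the portal. The order of the checks (redactions first, then FFPE notes) is also an assumption, and FFPEs garnered from clinical data are not covered.

```python
from collections import defaultdict


def sort_annotations(records):
    """Split annotation records into redactions, FFPE notes, and other sections."""
    redactions, ffpe, other = [], [], defaultdict(list)
    for rec in records:
        name = rec.get("annotationClassificationName", "")
        note_text = " ".join(rec.get("notes", []))  # hypothetical field
        if name == "Redaction":
            redactions.append(rec)
        elif "FFPE" in note_text:
            ffpe.append(rec)
        else:
            # Everything else is grouped by its classification name.
            other[name].append(rec)
    return redactions, ffpe, dict(other)


# Hypothetical records, for illustration only.
example = [
    {"item": "TCGA-XX-0001", "annotationClassificationName": "Redaction",
     "notes": ["Genotype mismatch"]},
    {"item": "TCGA-XX-0002", "annotationClassificationName": "Notification",
     "notes": ["FFPE sample"]},
    {"item": "TCGA-XX-0003", "annotationClassificationName": "Observation",
     "notes": ["Normal tissue noted"]},
]
redactions, ffpe, other = sort_annotations(example)
```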

Preprocessors
mRNA Preprocessor

The mRNA preprocess median module chooses the matrix for the platform (Affymetrix HG U133, Affymetrix Exon Array, and Agilent Gene Expression) with the largest number of samples.

mRNAseq Preprocessor

The mRNAseq preprocessor picks the "scaled_estimate" (RSEM) value from the Illumina HiSeq/GA2 mRNAseq level_3 (v2) data set and builds a log2-transformed mRNAseq matrix for downstream analysis. If samples overlap between the two platforms, the Illumina HiSeq samples are selected. The pipeline also creates a log2-transformed RPKM matrix from the HiSeq/GA2 mRNAseq level 3 (v1) data set.
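The two steps described above, preferring HiSeq for overlapping samples and applying a log2 transform, might look roughly like the following Python sketch. The input layout (nested dictionaries of sample to gene to scaled_estimate) and the function names are hypothetical, and mapping non-positive values to NaN is an assumption, since this report does not state how zeros are handled.

```python
import math


def log2_or_nan(value):
    """log2 of a positive value; NaN otherwise (assumed zero handling)."""
    return math.log2(value) if value > 0 else float("nan")


def build_mrnaseq_matrix(hiseq, ga):
    """Merge per-platform values and log2-transform them.

    hiseq, ga: dict mapping sample -> dict mapping gene -> scaled_estimate.
    """
    merged = dict(ga)
    merged.update(hiseq)  # HiSeq replaces GA for samples present on both
    return {
        sample: {gene: log2_or_nan(v) for gene, v in genes.items()}
        for sample, genes in merged.items()
    }


# Hypothetical values, for illustration only.
ga = {"S1": {"GENE_A": 8.0}, "S2": {"GENE_A": 0.0}}
hiseq = {"S1": {"GENE_A": 0.25}}
matrix = build_mrnaseq_matrix(hiseq, ga)
```

In this example, sample S1 takes its value from the HiSeq input, and the zero value for S2 becomes NaN rather than raising an error.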

miRseq Preprocessor

The miRseq preprocessor picks the "RPM" (reads per million miRNA precursor reads) value from the Illumina HiSeq/GA miRseq Level_3 data set and builds a matrix of log2-transformed values.

Methylation Preprocessor

The methylation preprocessor filters methylation data for use in downstream pipelines. To learn more about this preprocessor, please visit the documentation.