Analysis Overview - Colorectal Adenocarcinoma (Primary solid tumor)

17 October 2017

Maintainer Information

Maintained by Broad Institute GDAC (Broad Institute of MIT & Harvard)

Overview

Introduction

This is an overview of Colorectal Adenocarcinoma analysis pipelines from FireCloud run "17 October 2017".

Summary

Note: These results are offered to the community as an additional reference point, enabling a wide range of cancer biologists, clinical investigators, and genome and computational scientists to easily incorporate TCGA into the backdrop of ongoing research. While every effort is made to ensure that FireCloud input data and algorithms are of the highest possible quality, these analyses have not been reviewed by domain experts.

Results

Sequence and Copy Number Analyses

SNP6 Copy number analysis (GISTIC2)
View Report | There were 613 tumor samples used in this analysis: 41 significant arm-level results, 29 significant focal amplifications, and 37 significant focal deletions were found.

Correlations to Clinical Parameters

Correlation between copy number variation genes (focal events) and selected clinical features
View Report | Testing the association between 66 copy number variation focal events and 14 clinical features across 613 patients detected 321 significant findings with Q value < 0.25.
Correlation between arm-level copy number variations and selected clinical features
View Report | Testing the association between 81 copy number variation arm-level events and 14 clinical features across 613 patients detected 160 significant findings with Q value < 0.25.
Correlation between mRNAseq expression and clinical features
View Report | Testing the association between 18447 genes and 14 clinical features across 622 samples, statistically thresholded by P value < 0.05 and Q value < 0.3, found 13 clinical features related to at least one gene.
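
The correlation reports above count findings below a Q value cutoff. This overview does not state how the Q values were computed, so the following is only a minimal sketch that assumes a Benjamini-Hochberg adjustment. The function names and the example p-values are hypothetical and are not taken from the pipeline.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values, in the input order.

    Assumption: the report's Q values behave like BH-adjusted p-values.
    The actual pipeline may use a different estimator.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def count_significant(p_values, q_cutoff=0.25):
    """Count tests whose adjusted value falls below the cutoff."""
    return sum(q < q_cutoff for q in benjamini_hochberg(p_values))


# Hypothetical p-values from four event/feature tests.
example_p = [0.01, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
n_hits = count_significant(example_p, q_cutoff=0.25)
```

With a cutoff of 0.25, as used for the copy number reports, a finding is counted when its adjusted value is below 0.25. The mRNAseq report additionally requires P value < 0.05.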

Clustering Analyses

Clustering of copy number data by focal peak region with absolute value: consensus NMF
View Report | The most robust consensus NMF clustering of 613 samples using the 66 most variable genes was identified for k = 4 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Clustering of copy number data by peak region with threshold value: consensus NMF
View Report | The most robust consensus NMF clustering of 613 samples using the 66 most variable genes was identified for k = 4 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Clustering of lincRNA expression: consensus hierarchical
View Report | Median absolute deviation (MAD) was used to select the 2500 most variable lincRNAs. Consensus Ward linkage hierarchical clustering of 620 samples and 2500 lincRNAs identified 4 subtypes with the stability of the clustering increasing for k = 2 to k = 10.
Clustering of lincRNA expression: consensus NMF
View Report | The most robust consensus NMF clustering of 621 samples using the 2500 most variable lincRNAs was identified for k = 6 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Clustering of miR mature expression: consensus hierarchical
View Report | Median absolute deviation (MAD) was used to select the 272 most variable miRs. Consensus Ward linkage hierarchical clustering of 604 samples and 272 miRs identified 4 subtypes with the stability of the clustering increasing for k = 2 to k = 10.
Clustering of miR mature expression: consensus NMF
View Report | The most robust consensus NMF clustering of 605 samples using the 272 most variable miRs was identified for k = 4 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Clustering of Protein-coding gene expression: consensus hierarchical
View Report | Median absolute deviation (MAD) was used to select the 2500 most variable genes. Consensus Ward linkage hierarchical clustering of 621 samples and 2500 genes identified 4 subtypes with the stability of the clustering increasing for k = 2 to k = 10.
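
The consensus NMF reports above name the average silhouette width as one of the two criteria used to choose k. As a rough illustration of that quantity only, and not the pipeline's own implementation, the sketch below computes it from a precomputed distance matrix in plain Python. The function name, the toy points, and the labelings are hypothetical. The cophenetic correlation coefficient is not covered here.

```python
def average_silhouette_width(dist, labels):
    """Average silhouette width for one clustering.

    dist   : square list of lists, dist[i][j] = distance between samples i and j
    labels : cluster label for each sample
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)


# Toy data: four samples on a line, forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]

good = average_silhouette_width(dist, [0, 0, 1, 1])  # matches the groups
bad = average_silhouette_width(dist, [0, 1, 0, 1])   # splits the groups
```

A labeling that matches the structure of the data scores close to 1, while one that cuts across it scores near or below 0. Comparing this value across candidate values of k is one way to judge which clustering is more robust.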

Other Analyses

Identification of putative miR direct targets by sequencing data
View Report | The CLR algorithm was applied to 731 miRs and 18447 mRNAs across 601 samples. After 2 filtering steps, 69 miR:gene pairs were detected.
Methylation__HM27_Clustering_CNMF
View Report | The most robust consensus NMF clustering of 234 samples using the 2307 most variable genes was identified for k = 4 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Methylation__HM450_Clustering_CNMF
View Report | The most robust consensus NMF clustering of 393 samples using the 8352 most variable genes was identified for k = 3 clusters. We computed the clustering for k = 2 to k = 10 and used the cophenetic correlation coefficient and the average silhouette width calculation to determine the robust clusters.
Methylation__HM450_Clustering_Consensus_Plus
View Report | Median absolute deviation (MAD) was used to select the 2500 most variable genes. Consensus Ward linkage hierarchical clustering of 392 samples and 2500 genes identified 4 subtypes with the stability of the clustering increasing for k = 2 to k = 10.
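
Several hierarchical clustering reports in this overview select their input features by median absolute deviation (MAD). The sketch below shows one plausible reading of that step: rank features by MAD across samples and keep the top n. The data layout, the function names, and the toy values are assumptions made for illustration. The pipeline's own preprocessing is not described in this overview.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of one feature across samples."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_variable_features(matrix, n_top):
    """Return the n_top feature names with the largest MAD.

    matrix: dict mapping feature name -> list of per-sample values
            (a hypothetical layout chosen for this sketch).
    """
    ranked = sorted(matrix, key=lambda name: mad(matrix[name]), reverse=True)
    return ranked[:n_top]


# Toy matrix with three hypothetical features measured in five samples.
toy = {
    "featureA": [1, 1, 1, 1, 1],     # constant, MAD = 0
    "featureB": [1, 5, 9, 13, 17],   # widely spread, MAD = 4
    "featureC": [2, 3, 4, 5, 6],     # mildly spread, MAD = 1
}
selected = top_variable_features(toy, n_top=2)
```

In the reports, the same idea is applied with n_top set to values such as 2500 or 272 before consensus clustering.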

Methods & Data

Input

Summary Report Date = Thu Dec 14 13:45:51 2017
Protection = FALSE
