Stomach Adenocarcinoma: Correlations between copy number and mRNAseq expression

Maintained by TCGA GDAC Team (Broad Institute/Dana-Farber Cancer Institute/Harvard Medical School)

Overview

Introduction

Each TCGA sample is profiled for both copy number variation and gene expression. This pipeline correlates the copy number and RNAseq data of each gene across samples to determine whether copy number variations also result in differential expression. This report contains the correlation coefficients calculated from the genomic copy number (log2) values and the RNAseq expression of the corresponding feature across patients. High positive or low negative correlation coefficients indicate that genomic alterations result in differences in the expression of the mRNAs that the genomic regions transcribe.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th percentiles are 1114, 2164.4, 2861.6, 3546.8, 4239, 4936, 5614, 6380.6 and 7215, respectively.
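As a minimal sketch of how such a decile summary could be produced, the following uses only the Python standard library. The function name `deciles` and the toy input are illustrative assumptions, and the interpolation rule (`method="inclusive"`) is likewise an assumption; the report does not state which quantile definition the pipeline uses.

```python
import statistics


def deciles(values):
    """Return the 9 cut points at the 10th..90th percentiles.

    The "inclusive" method treats the data as the full population;
    this choice is an assumption, not taken from the report.
    """
    return statistics.quantiles(values, n=10, method="inclusive")


# Toy coefficients (hypothetical, not taken from the report).
example = [i / 10 for i in range(-5, 6)]  # -0.5, -0.4, ..., 0.5
cuts = deciles(example)

for percentile, value in zip(range(10, 100, 10), cuts):
    print(f"{percentile}th percentile: {value:.3f}")
```

Since Pearson coefficients are bounded by -1 and 1, every cut point computed this way from correlation values falls in that range.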

Results

Correlation results

The numbers of genes and samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile plot of those coefficients against a normal distribution. Table 2 shows the top 20 features ordered by correlation coefficient.

Table 1. Counts of genes and samples in the copy number and expression data sets, and counts common to both

Category	Copy number	Expression	Common
Sample	132	57	57
Genes	22749	19382	18553
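The "Common" column in Table 1 is the overlap between the two data sets. A minimal sketch of that bookkeeping, assuming each data set is represented simply by a collection of identifiers (the function name `common_counts` and the sample identifiers below are hypothetical):

```python
def common_counts(copy_number_ids, expression_ids):
    """Count identifiers in each data set and in their intersection."""
    cn = set(copy_number_ids)
    ex = set(expression_ids)
    return {
        "copy_number": len(cn),
        "expression": len(ex),
        "common": len(cn & ex),
    }


# Toy identifiers (hypothetical), standing in for sample barcodes.
cn_samples = ["S1", "S2", "S3", "S4", "S5"]
expr_samples = ["S2", "S4", "S6"]
sample_counts = common_counts(cn_samples, expr_samples)
print(sample_counts)
```

The same function applies to gene identifiers. Only the common samples and common genes enter the correlation calculation.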

Figure 1. Summary figures. Left: histogram showing the distribution of the calculated correlations across samples for all genes. Right: QQ plot of the calculated correlations across samples. The QQ plot compares the quantiles of the calculated correlation coefficients with those of a normal distribution. Points deviating from the blue line indicate deviation from normality.
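The coordinates behind a QQ plot like the right panel can be sketched as follows, using only the standard library. The plotting positions (i + 0.5) / n and the choice of a reference normal fitted to the sample mean and standard deviation are assumptions; the report does not specify how its figure was drawn, and `qq_points` is a hypothetical name.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted value with the matching normal quantile.

    Returns (theoretical, observed) tuples; points near the line
    y = x suggest the values are approximately normal.
    """
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(mean(ordered), stdev(ordered))
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Toy coefficients (hypothetical, not taken from the report).
points = qq_points([0.5, 0.1, 0.3, 0.2, 0.4])
for theoretical, observed in points:
    print(f"{theoretical:.3f}\t{observed:.3f}")
```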

Table 2. Top 20 features (defined by the feature column) ranked by correlation coefficients

Locus ID	Gene Symbol	Cytoband	cor
84299	MIEN1	17q12	0.9411
84060	RBM48	7q21.2	0.9314
830	CAPZA2	7q31.2	0.922
55717	WDR11	10q26.12	0.913
10775	POP4	19q12	0.909
137492	VPS37A	8p22	0.9048
54994	C20orf11	20q13.33	0.904
4848	CNOT2	12q15	0.898
10210	TOPORS	9p21.1	0.8976
79648	MCPH1	8p23.1	0.8963
55610	CCDC132	7q21.3	0.8955
889	KRIT1	7q21.2	0.8923
91782	CHMP7	8p21.3	0.8897
51271	UBAP1	9p13.3	0.8891
54467	ANKIB1	7q21.2	0.8873
55915	LANCL2	7p11.2	0.8852
9862	MED24	17q21.1	0.8849
10564	ARFGEF2	20q13.13	0.884
6780	STAU1	20q13.13	0.8834
93210	PGAP3	17q12	0.8824
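A ranking like Table 2 can be sketched by sorting (gene, coefficient) pairs in descending order of the coefficient. The function name `top_features` is hypothetical; the three rows used here are copied from Table 2.

```python
def top_features(rows, n=20):
    """Return the n rows with the highest correlation coefficient.

    rows: iterable of (gene_symbol, cor) tuples.
    """
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Three rows copied from Table 2, deliberately out of order.
rows = [("PGAP3", 0.8824), ("MIEN1", 0.9411), ("RBM48", 0.9314)]
ranked = top_features(rows, n=2)
print(ranked)
```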

Methods & Data

Input

Gene-level (TCGA Level III) mRNAseq expression data and copy number data of the corresponding genes derived by the GISTIC pipeline. Pearson correlation coefficients were calculated for each gene shared by the two data sets, across all the samples that were common to both.

Correlation across samples

Pairwise correlations between the log2 copy number and the expression of each gene across samples were calculated using Pearson correlation.
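The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the nested-dictionary layout (gene, then sample, then value), the names `pearson` and `correlate_genes`, and the choice to skip genes with constant values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns None when the coefficient is undefined (fewer than two
    points, or one of the sequences is constant).
    """
    n = len(x)
    if n != len(y) or n < 2:
        return None
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)


def correlate_genes(copy_number, expression):
    """Correlate copy number with expression for each shared gene.

    Both arguments map gene -> {sample: value}. Only genes present in
    both data sets, and samples present for both measurements, are used.
    """
    result = {}
    for gene in sorted(set(copy_number) & set(expression)):
        cn, ex = copy_number[gene], expression[gene]
        samples = sorted(set(cn) & set(ex))
        r = pearson([cn[s] for s in samples], [ex[s] for s in samples])
        if r is not None:
            result[gene] = r
    return result


# Toy data (hypothetical): log2 copy number and expression per sample.
copy_number = {
    "GENE_A": {"S1": -1.0, "S2": 0.0, "S3": 1.0, "S4": 2.0},
    "GENE_B": {"S1": 0.5, "S2": 0.1, "S3": -0.3, "S4": 0.0},
}
expression = {
    "GENE_A": {"S1": 2.0, "S2": 4.0, "S3": 6.0, "S4": 8.0},
    "GENE_B": {"S1": 1.0, "S2": 1.0, "S3": 1.0, "S4": 1.0},
}
correlations = correlate_genes(copy_number, expression)
print(correlations)
```

In this toy input GENE_A's expression rises linearly with its copy number, so its coefficient is 1, while GENE_B has constant expression and is skipped because the coefficient is undefined.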

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.
