Ovarian Serous Cystadenocarcinoma: Correlations between copy number and mRNA expression

Overview

Introduction

A TCGA sample is profiled to detect the copy number variations and expressions of genes. This pipeline attempts to correlate copy number and expression data of genes across samples to determine if the copy number variations also result in differential expressions. This report contains the calculated correlation coefficients based on measurements of genomic copy number (log2) values and intensity of the expressions of the corresponding feature across patients. High positive/low negative correlation coefficients indicate that genomic alterations result in differences in the expressions of mRNA the genomic regions transcribe.

Summary

The correlation coefficients in 10, 20, 30, 40, 50, 60, 70, 80, 90 percentiles are 0.00099, 0.06, 0.13267, 0.2298, 0.33605, 0.41974, 0.4879, 0.55122, 0.61761, respectively.

Results

Correlation results

Number of genes and samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of calculated correlation coefficients and quantile-quantile plot of the calculated correlation coefficients against a normal distribution. Table 2 shows the top 20 features ordered by the value of correlation coefficients.

Table 1. Counts of mRNA and number of samples in copy number and expression data sets and common to both

Category	Copy number	Expression	Common
Sample	520	539	513
Genes	29390	12043	10944

Figure 1. Summary figures. Left: histogram showing the distribution of the calculated correlations across samples for all Genes. Right: QQ plot of the calculated correlations across samples. The QQ plot is used to plot the quantiles of the calculated correlation coefficients against that derived from a normal distribution. Points deviating from the blue line indicate deviation from normality.

Table 2. Get Full Table Top 20 features (defined by the feature column) ranked by correlation coefficients

feature	r	chrom	start	end	geneid
FBXL12	0.8179	19	9781943	9790731	54850
CNOT2	0.8144	12	68923489	69033985	4848
KEAP1	0.8091	19	10457796	10475054	9817
USP14	0.8053	18	148553	202612	9097
PAF1	0.8032	19	44568109	44573519	54623
C19orf2	0.7983	19	35125265	35198456	8725
LSM1	0.7933	8	38140014	38153183	27257
CDC40	0.7904	6	110608317	110660116	51362
EIF2AK1	0.789	7	6029988	6065302	27102
UBE2L3	0.788	22	20251957	20308323	7332
WIPI2	0.786	7	5196361	5240012	26100
POP4	0.7822	19	34789041	34798547	10775
PSMC4	0.782	19	45168913	45179193	5704
PWP1	0.7778	12	106603720	106630387	11137
MTDH	0.777	8	98725583	98807714	92140
PIN1	0.7757	19	9806999	9821358	5300
WBP11	0.7744	12	14830679	14847668	51729
UBE3A	0.7738	15	23133489	23235221	7337
CSNK1A1	0.7726	5	148855038	148911200	1452
RIC8A	0.7707	11	198530	205113	60626

Methods & Data

Input

Gene level (TCGA Level III) expression data and copy number data of the corresponding loci derived by using the CNTools package of Bioconductor were used for the calculations. Pearson correlation coefficients were calculated for each pair of genes shared by the two data sets across all the samples that were common.

Correlation across sample

Pairwise correlations between the log2 copy numbers and expressions of each gene across samples were calculated using Pearson correlation.

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.