LUAD/00: Correlations between copy number and mRNA expression
Overview
Introduction

Each TCGA sample is profiled for both gene copy number variation and gene expression. This pipeline correlates the copy number and expression data of each gene across samples to determine whether copy number variations also result in differential expression.

Summary

This page reports the correlation coefficients calculated from genomic copy number (log2) values and the expression intensities of the corresponding features across patients. Strongly positive or strongly negative correlation coefficients suggest that genomic alterations result in differences in the expression of the features (microRNA or mRNA) that the genomic regions transcribe.

Results
Correlation results

The numbers of genes and samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile plot of those coefficients against a normal distribution. Table 2 shows the top 20 features ordered by correlation coefficient.

Table 1.  Numbers of genes and samples in the copy number and expression data sets, and the numbers common to both

Category Copy number Expression Common
Sample 56 33 21
Genes 29390 17815 15551

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlations calculated across samples for all genes. Right: QQ plot of the same correlations against a normal distribution. Points deviating from the blue line indicate deviation from normality.
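As a rough illustration of how the points in a QQ plot such as the right panel of Figure 1 can be obtained, the sketch below pairs sorted observations with standard normal quantiles. The function name `qq_points`, the plotting positions (i + 0.5) / n, and the toy values are assumptions made for this example; the report does not state how its plot was generated.

```python
from statistics import NormalDist


def qq_points(values):
    """Pair sorted observations with standard normal quantiles."""
    observed = sorted(values)
    n = len(observed)
    normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting positions (i + 0.5) / n keep every probability
    # strictly inside (0, 1), where the inverse CDF is defined.
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Toy correlation values for illustration only (not from the report).
points = qq_points([0.3, -0.1, 0.5])
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the observations were normally distributed, these points would fall close to a straight line when plotted.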

Table 2.  Top 20 features (defined by the Hybridization.REF column) ranked by correlation coefficient

Hybridization.REF cor p q chrom start end geneid
MRPL51 0.9581 8.97459884185992e-12 7.1649771478688e-08 12 6471577 6472718 51258
C14orf166 0.9546 1.92592608527775e-11 7.68792936188762e-08 14 51525943 51541163 51637
PSMC6 0.939 2.96587199244414e-10 7.89279663036257e-07 14 52243668 52264466 5706
ORAOV1 0.9301 1.04556230340336e-09 1.71044373810287e-06 11 69189512 69199346 220064
POP4 0.9299 1.07122088977007e-09 1.71044373810287e-06 19 34789041 34798547 10775
SCFD1 0.9276 1.45482514923856e-09 1.93579477889302e-06 14 30161272 30274769 23256
PIK3C3 0.9228 2.60701860099743e-09 2.97334877013278e-06 18 37789197 37915446 5289
POLR2F 0.9211 3.20894177896491e-09 3.20237078562836e-06 22 36679664 36693752 5435
MRPL21 0.9151 6.26991769614449e-09 4.63604728189718e-06 11 68415322 68427879 219927
PSMA6 0.9149 6.38764507954193e-09 4.63604728189718e-06 14 34831325 34856431 5687
UQCRFS1 0.9149 6.38487462900628e-09 4.63604728189718e-06 19 34390007 34395954 7386
TBL2 0.9132 7.69075980677769e-09 5.08673693909369e-06 7 72621935 72630908 26608
SEC61A1 0.9118 8.92005536101692e-09 5.08673693909369e-06 3 129253902 129273216 29927
VPS41 0.9118 8.90783380391724e-09 5.08673693909369e-06 7 38730068 38915325 27072
LSG1 0.9108 9.82679893013483e-09 5.23022744613477e-06 3 195842812 195874200 55341
FGFR1OP2 0.9086 1.23130021734141e-08 6.07653079184533e-06 12 26982583 27010129 26127
MRPL46 0.9081 1.29391235503817e-08 6.07653079184533e-06 15 86803714 86811623 26589
C18orf21 0.9004 2.68662936342423e-08 1.19161240722739e-05 18 31806586 31813239 83608
UBL7 0.8986 3.17753503509266e-08 1.26122292394471e-05 15 72525371 72540582 84993
KLHL11 0.8985 3.19446820107316e-08 1.26122292394471e-05 17 37263325 37275155 55175
Methods & Data
Input

Level III gene-level expression data and gene-by-sample copy number data, derived using the CNTools package of Bioconductor, were used for the calculations. Pearson correlation coefficients were calculated for each gene shared by the two data sets, across all samples common to both.
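A minimal sketch of the matching step, assuming each data set is identified by lists of gene and sample labels. The function name `common_labels` and the toy identifiers are hypothetical; the pipeline's own code is not shown on this page.

```python
def common_labels(cn_genes, cn_samples, expr_genes, expr_samples):
    """Return the sorted genes and samples present in both data sets."""
    genes = sorted(set(cn_genes) & set(expr_genes))
    samples = sorted(set(cn_samples) & set(expr_samples))
    return genes, samples


# Toy identifiers for illustration only (not taken from the report's data).
genes, samples = common_labels(
    cn_genes=["GENE_A", "GENE_B", "GENE_C"],
    cn_samples=["S1", "S2", "S3"],
    expr_genes=["GENE_B", "GENE_C", "GENE_D"],
    expr_samples=["S2", "S3", "S4"],
)
print(genes, samples)
```

The lengths of the two returned lists correspond to the "Common" column of Table 1.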

Correlation across samples

For each gene, the Pearson correlation between its log2 copy number values and its expression values was calculated across samples.
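The per-gene calculation can be sketched as follows. This is an illustrative implementation, not the pipeline's code: the names `pearson` and `correlate_genes`, the dictionary layout (gene mapped to a list of values in a shared sample order), and the toy numbers are all assumptions. The p and q values reported in Table 2 are not computed here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one series is constant
    return sxy / math.sqrt(sxx * syy)


def correlate_genes(copy_number, expression):
    """Correlate each shared gene across samples, highest coefficient first.

    Both arguments map gene -> list of values in the same sample order.
    Genes with an undefined correlation are dropped before ranking.
    """
    shared = set(copy_number) & set(expression)
    results = [(g, pearson(copy_number[g], expression[g])) for g in shared]
    results = [(g, r) for g, r in results if not math.isnan(r)]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy log2 copy number and expression values (hypothetical, four samples).
copy_number = {"GENE_A": [0.1, 0.5, 0.9, 1.3], "GENE_B": [0.0, 0.2, 0.4, 0.6]}
expression = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [5.0, 4.0, 3.0, 2.0]}
ranked = correlate_genes(copy_number, expression)
for gene, r in ranked:
    print(gene, round(r, 4))
```

Sorting by coefficient in descending order mirrors the ranking used for Table 2.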