Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.1399, -0.0963, -0.0656, -0.0369, -0.0086, 0.0194, 0.0497, 0.08814, and 0.14597, respectively.
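
As an illustration of how such a percentile summary can be derived from the per-gene correlation coefficients, here is a minimal sketch. It assumes the coefficients are available as a plain list of floats; the function name `decile_summary` and the toy input are hypothetical, and the interpolation method used by the original pipeline is not stated in this report.

```python
from statistics import quantiles


def decile_summary(correlations):
    """Return the 10th..90th percentiles of a list of correlation coefficients.

    Assumption: linear interpolation between data points ("inclusive" method);
    the report does not say which percentile definition the pipeline used.
    """
    cuts = quantiles(correlations, n=10, method="inclusive")  # 9 cut points
    return {pct: round(cut, 4) for pct, cut in zip(range(10, 100, 10), cuts)}


# Toy input (not real data): 101 evenly spaced values from -0.50 to 0.50.
example = [i / 100 for i in range(-50, 51)]
summary = decile_summary(example)
for pct, value in summary.items():
    print(f"{pct}th percentile: {value}")
```

A summary that is roughly symmetric around zero, as in the values reported above, suggests that most genes show little linear association with APOBEC_MutLoad_MinEstimate.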

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 578 599 554

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients, calculated across samples, for all genes. Right: QQ plot of the same correlation coefficients, plotting their quantiles against those of a normal distribution. Points that deviate from the blue line indicate departure from normality.

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
TMEM79|84283 0.3911 0 0
SLC38A2|54407 0.366 0 0
KRT1|3848 0.3654 8.88178419700125e-16 4.2687221279126e-13
C1orf74|148304 0.3643 0 0
GJB2|2706 0.3616 0 0
CAPNS2|84290 0.3598 1.82076576038526e-14 4.01510981325427e-12
KIAA1609|57707 0.3505 0 0
PERP|64065 0.3459 0 0
LGALS7|3963 0.3453 8.88178419700125e-16 4.2687221279126e-13
KRT14|3861 0.3437 2.59792187762287e-14 5.39688440457522e-12
PLS3|5358 0.3437 0 0
KRT16|3868 0.3434 4.44089209850063e-16 2.52242671194836e-13
DSC1|1823 0.3418 4.93161067538495e-13 5.70605620366762e-11
IGFL1|374918 0.3407 6.79190037544686e-12 5.46383607885733e-10
FAT2|2196 0.3404 2.22044604925031e-16 1.80956698900643e-13
APOBEC3A|200315 0.3397 4.44089209850063e-16 2.52242671194836e-13
PTHLH|5744 0.3397 2.22044604925031e-16 1.80956698900643e-13
KRT75|9119 0.3394 6.00298610820005e-09 1.83556234277491e-07
IL20RB|53833 0.3393 2.22044604925031e-16 1.80956698900643e-13
LYPD3|27076 0.3385 2.22044604925031e-16 1.80956698900643e-13
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene, across all samples common to both data sets.
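
The restriction to common samples can be sketched as a set intersection on sample identifiers. This is only an illustration: the function name `common_samples` and the sample identifiers below are hypothetical, and the report does not describe how the pipeline matches samples between the two data sets.

```python
def common_samples(apobec_samples, expression_samples):
    """Return the sorted sample IDs present in both data sets."""
    return sorted(set(apobec_samples) & set(expression_samples))


# Hypothetical sample identifiers, for illustration only.
apobec_ids = ["S01", "S02", "S03", "S05"]
expression_ids = ["S02", "S03", "S04", "S05", "S06"]

shared = common_samples(apobec_ids, expression_ids)
print(len(apobec_ids), len(expression_ids), len(shared))
```

The three printed counts correspond to the three columns of Table 1: samples with APOBEC_MutLoad_MinEstimate, samples with expression data, and samples common to both.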

Correlation across samples

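Pearson correlation coefficients were computed with the pairwise.complete.obs option (the name of a setting of the `use` argument of R's `cor` function). With this option, the correlation for each gene is computed only from samples in which both APOBEC_MutLoad_MinEstimate and that gene's expression value are present.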
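
To make the pairwise-complete behaviour concrete, the following is a minimal sketch of a Pearson correlation that drops any sample where either value is missing. It is not the pipeline's code: the function name `pearson_pairwise_complete`, the use of `None` or NaN to mark missing values, and the toy data are all assumptions made for illustration.

```python
import math


def pearson_pairwise_complete(x, y):
    """Pearson r using only positions where both x and y are present.

    Assumption: missing values are encoded as None or NaN.
    Returns NaN if fewer than two complete pairs remain or if either
    variable has zero variance over the complete pairs.
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    # Keep only samples where both measurements are available.
    pairs = [(a, b) for a, b in zip(x, y)
             if not is_missing(a) and not is_missing(b)]
    n = len(pairs)
    if n < 2:
        return float("nan")

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Toy data (not real): two of the five samples have a missing value.
mut_load = [1.0, 2.0, None, 4.0, 5.0]
expression = [2.0, 4.1, 6.0, None, 10.2]
r = pearson_pairwise_complete(mut_load, expression)
print(f"r computed from complete pairs only: {r:.4f}")
```

In this toy example only three of the five samples contribute to the coefficient. One consequence of the pairwise approach is that different genes may be correlated over slightly different subsets of samples.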