Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.1028, -0.0685, -0.0426, -0.0206, 0.0011, 0.0233, 0.0464, 0.075, and 0.1152, respectively.
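As an illustration of how such a decile summary can be obtained from a list of per-gene correlation coefficients, here is a minimal sketch. The report does not state which percentile rule the pipeline uses, so linear interpolation between order statistics is an assumption here, and the names `percentile`, `deciles`, and `example` are hypothetical.

```python
import math

def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) of a non-empty list,
    using linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def deciles(values):
    """Map each of the 10th..90th percentiles to its value."""
    return {q: percentile(values, q) for q in range(10, 100, 10)}

# Hypothetical correlation coefficients, not taken from the report.
example = [-0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
summary = deciles(example)
```

With real data, `example` would be replaced by the full vector of per-gene coefficients.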

Results
Correlation results

The numbers of samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and number of samples common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 533 515 478
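The counts in Table 1 correspond to a simple set intersection of sample identifiers. The sketch below shows the idea; the function name `sample_overlap` and the identifiers in the example are hypothetical, and the report does not describe how the pipeline matches identifiers.

```python
def sample_overlap(apobec_ids, expression_ids):
    """Count samples in each data set and those present in both."""
    apobec = set(apobec_ids)
    expression = set(expression_ids)
    return {
        "APOBEC_MutLoad_MinEstimate": len(apobec),
        "Expression": len(expression),
        "Common": len(apobec & expression),
    }

# Hypothetical sample identifiers, for illustration only.
counts = sample_overlap(
    ["S1", "S2", "S3", "S4"],
    ["S2", "S3", "S5"],
)
```

Only the samples counted under "Common" can contribute to a correlation, since both values are needed for each sample.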

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same correlation coefficients, in which their quantiles are plotted against the corresponding quantiles of a normal distribution. Points deviating from the blue line indicate deviation from normality.
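The points of a QQ plot like the one in Figure 1 can be computed as in the following sketch. Two choices here are assumptions rather than details from the report: the reference normal distribution is fitted with the sample mean and standard deviation, and the plotting positions are (i + 0.5) / n. The name `qq_points` is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_points(correlations):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot.

    Needs at least two values that are not all identical."""
    ordered = sorted(correlations)
    n = len(ordered)
    # Reference normal fitted to the data (an assumed choice).
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Assumed plotting positions: (i + 0.5) / n for i = 0..n-1.
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical coefficients, for illustration only.
points = qq_points([-0.2, -0.1, 0.0, 0.1, 0.2])
```

If the coefficients were exactly normal, the pairs would fall on the line y = x.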

Table 2.  Top 20 genes ranked by correlation coefficient

geneID cor p-value q-value
C13orf34|79866 0.3093 4.65094629475971e-12 5.33958461890194e-08
FBXO45|200933 0.308 5.82955905770177e-12 5.33958461890194e-08
RNF219|79596 0.301 1.81747950023237e-11 1.10981356549189e-07
SIM1|6492 0.2926 4.83646181610453e-05 0.00178267895390783
PNMA5|114824 0.2874 7.15645187554514e-09 7.74373391554395e-06
IL20RB|53833 0.2809 4.03600930454218e-10 1.8483913612477e-06
SPHKAP|80309 0.28 2.40952751846546e-05 0.00110989520062473
PAK2|5062 0.2765 7.80939313216322e-10 2.38433787980163e-06
SIAH2|6478 0.2751 9.50584055914305e-10 2.48767847432774e-06
FOXG1|2290 0.2738 0.000143770948294986 0.00380598266158359
TAC3|6866 0.2714 8.25541619990933e-05 0.0025588996508653
LOC100130386|100130386 0.2705 0.000391302644334068 0.00745142738207464
MAGEA6|4105 0.2681 9.28861340598885e-06 0.000559422827948153
TBL1XR1|79718 0.2655 3.75794817486508e-09 6.88418526153534e-06
MYO18B|84700 0.2653 4.90975794020798e-07 8.73221900064757e-05
ADAM21P1|145241 0.2652 1.43647370598465e-06 0.00016243680135761
C18orf2|56651 0.2649 1.75202592540558e-05 0.000857967434898444
ZNF675|171392 0.2637 4.7679415970947e-09 6.97312305741882e-06
ZNF200|7752 0.2633 5.03707542343079e-09 6.97312305741882e-06
WDR53|348793 0.2629 5.32909671946413e-09 6.97312305741882e-06
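The report does not say how the q-values in Table 2 were derived from the p-values. One widely used choice is the Benjamini-Hochberg step-up adjustment, which is assumed in the sketch below purely for illustration; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values, not taken from Table 2.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Because of the running minimum, neighbouring genes can share the same adjusted value under this kind of procedure.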
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

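Pearson correlation was computed with the pairwise.complete.obs option, so that for each gene only the samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient. As a result, the effective number of samples can differ from gene to gene.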
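The following minimal sketch shows what a pairwise-complete Pearson correlation does for a single gene. It is an illustration, not the pipeline's code: the function name `pearson_pairwise_complete` is hypothetical, and missing values are assumed to be encoded as `None` or NaN.

```python
import math

def pearson_pairwise_complete(x, y):
    """Pearson r over the positions where both x and y are present.

    Returns (r, n), where n is the number of complete pairs used.
    r is NaN if fewer than two pairs remain or a variable is constant.
    """
    pairs = [
        (a, b)
        for a, b in zip(x, y)
        if a is not None and b is not None
        and not math.isnan(a) and not math.isnan(b)
    ]
    n = len(pairs)
    if n < 2:
        return float("nan"), n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan"), n
    return sxy / math.sqrt(sxx * syy), n

# Hypothetical values: one missing entry in each vector.
r, n_used = pearson_pairwise_complete(
    [1.0, 2.0, None, 4.0, 5.0],
    [2.0, 4.0, 6.0, float("nan"), 10.0],
)
```

In this example only three of the five positions are complete in both vectors, so the coefficient is based on three samples.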