Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether variation in APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th percentiles are -0.0945, -0.0647, -0.04202, -0.0244, -0.0071, 0.0107, 0.0306, 0.05348 and 0.08384, respectively.
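
The decile summary above can be reproduced from the full list of per-gene coefficients. The following is a minimal sketch, not the pipeline's code: the function name `deciles` and the sample values are hypothetical, and the report does not state which interpolation rule it used, so the "inclusive" rule of Python's `statistics.quantiles` is an assumption.

```python
import statistics

def deciles(values):
    """Return the 10th, 20th, ..., 90th percentiles of a list of numbers."""
    # n=10 yields the 9 cut points between deciles.
    # method="inclusive" is an assumption; the report does not name its rule.
    return statistics.quantiles(values, n=10, method="inclusive")

# Hypothetical correlation coefficients, for illustration only.
cors = [-0.2, -0.1, -0.05, 0.0, 0.02, 0.05, 0.1, 0.15, 0.2, 0.3, 0.38]
for pct, value in zip(range(10, 100, 10), deciles(cors)):
    print(f"{pct}th percentile: {value:.4f}")
```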

Results
Correlation results

The numbers of samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category  APOBEC_MutLoad_MinEstimate  Expression  Common
Sample    290                         367         290

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same correlation coefficients, in which their quantiles are plotted against the corresponding quantiles of a normal distribution. Points that depart from the blue line indicate deviation from normality.
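
To make the QQ construction concrete, the sketch below pairs each sorted coefficient with a theoretical normal quantile. It is an illustration under assumptions: the function name `qq_points` is hypothetical, a standard normal is used as the reference, and the plotting positions (i + 0.5) / n are one common choice; the report does not specify any of these.

```python
from statistics import NormalDist

def qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a normal QQ plot."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist()  # standard normal; an assumed reference distribution
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1),
    # where the inverse CDF is defined.
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical coefficients, for illustration only.
for expected, observed in qq_points([0.3, -0.1, 0.0, 0.1, -0.3]):
    print(f"{expected:+.3f}  {observed:+.3f}")
```

If the observed values were normally distributed, these pairs would fall close to a straight line when plotted.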

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
AHSG|197 0.3797 3.64947199917864e-05 0.0549458421743005
PRB2|653247 0.351 0.000523283548035547 0.163213817218766
AMY1A|276 0.3382 3.11037913647283e-07 0.00280976099293273
PRB1|5542 0.3228 0.00111732722117397 0.225294741216969
PARP6|56965 0.3195 2.62622421587366e-08 0.000474479929081895
LOC153328|153328 0.3076 0.000969013082315229 0.213501943392552
ITLN2|142683 0.2689 0.000118288649798126 0.0926222638907202
GAGE12J|729396 0.2618 0.00372294721519539 0.344428288745074
PRB3|5544 0.2617 0.00647732198136652 0.423185745506196
ITGB1BP1|9270 0.2557 1.03413674445996e-05 0.0311395809369301
HAO2|51179 0.2515 0.00068438386661196 0.180004570286463
APOA2|336 0.2509 0.0136709041178493 0.491038220073924
LOC285033|285033 0.2475 2.01296175987853e-05 0.0398077669095581
OR7E5P|219445 0.2434 0.0193642893335331 0.540735931249808
FGF5|2250 0.2424 0.00104218865985661 0.213968437700334
RPL23AP82|284942 0.2412 3.29491461945608e-05 0.0541174749361027
CALHM3|119395 0.236 0.0153427263015629 0.507454986682769
NBPF4|148545 0.2358 0.000318022438845489 0.145724059006324
VIL1|7429 0.2331 0.0172582734665108 0.515380540032149
ADK|132 0.2329 6.23935892352101e-05 0.0751509984475027
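
The report does not state how the q-values in Table 2 were derived from the p-values. As an illustration only, the sketch below applies a Benjamini-Hochberg adjustment, which is an assumption here; the function name `benjamini_hochberg` and the example p-values are hypothetical. The smallest p-value/q-value pairs in Table 2 appear numerically consistent with this kind of adjustment over roughly 18,000 genes, but that is an inference from the table, not something the report confirms.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values never
    # increase as p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical p-values, for illustration only.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because the adjustment depends on each p-value's rank among all genes, a gene with a large coefficient but few usable samples can still receive a large q-value.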
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used in this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and each gene across all samples common to both data sets.
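
Restricting the analysis to common samples amounts to intersecting the sample identifiers of the two inputs. A minimal sketch, assuming each input is a mapping keyed by sample ID; the function name `common_samples` and the IDs shown are hypothetical placeholders, not real identifiers from the data.

```python
def common_samples(apobec_by_sample, expression_by_sample):
    """Return the sample IDs present in both inputs, in sorted order."""
    return sorted(set(apobec_by_sample) & set(expression_by_sample))

# Hypothetical inputs keyed by placeholder sample IDs.
apobec = {"S1": 12.0, "S2": 0.0, "S3": 48.5}
expression = {"S2": {"GENE_A": 5.1}, "S3": {"GENE_A": 7.3}, "S4": {"GENE_A": 6.0}}

shared = common_samples(apobec, expression)
print(len(apobec), len(expression), len(shared))
```

The three printed counts correspond to the three columns of Table 1.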

Correlation across samples

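Pearson correlation was computed with the pairwise.complete.obs option: for each gene, the coefficient uses only those samples in which both APOBEC_MutLoad_MinEstimate and the gene's expression value are present, so the effective sample size can differ from gene to gene.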
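
`pairwise.complete.obs` is the name of an option accepted by the `use` argument of R's `cor` function. The sketch below shows the same idea in plain Python and is not the pipeline's code: the function name `pearson_pairwise_complete` is hypothetical, and treating `None` and NaN as missing values is an assumption.

```python
import math

def pearson_pairwise_complete(x, y):
    """Return (r, n): Pearson r over positions where both values are present."""
    def present(v):
        # None and NaN are treated as missing (an assumption of this sketch).
        return v is not None and not (isinstance(v, float) and math.isnan(v))

    pairs = [(a, b) for a, b in zip(x, y) if present(a) and present(b)]
    n = len(pairs)
    if n < 2:
        return float("nan"), n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        # A constant variable has no defined correlation.
        return float("nan"), n
    return sxy / math.sqrt(sxx * syy), n

# Hypothetical values; the third and fourth samples each miss one measurement.
load = [1.0, 2.0, None, 4.0, 5.0]
expr = [2.0, 4.0, 6.0, float("nan"), 10.0]
r, n_used = pearson_pairwise_complete(load, expr)
print(r, n_used)
```

Returning the number of pairs alongside the coefficient makes it visible when a gene's correlation rests on fewer samples than the common set.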