Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differences in gene expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th percentiles are -0.08786, -0.0564, -0.0343, -0.01464, 0.0046, 0.02354, 0.0445, 0.0688 and 0.1022, respectively.
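As an illustration of how such a percentile summary can be derived from a list of per-gene correlation coefficients, here is a minimal Python sketch. The function name `decile_summary` and the toy input are hypothetical, and the interpolation rule (`method="inclusive"`) is an assumption; the report does not state which percentile definition the pipeline uses.

```python
import statistics

def decile_summary(coefficients):
    """Return the 10th, 20th, ..., 90th percentiles of the coefficients."""
    # n=10 yields the nine cut points between deciles.
    # method="inclusive" interpolates linearly between observed values;
    # this choice is an assumption of the sketch, not taken from the report.
    return statistics.quantiles(coefficients, n=10, method="inclusive")

# Toy input for illustration only (not the report's data)
toy = [i / 10 for i in range(-5, 6)]  # -0.5, -0.4, ..., 0.5
deciles = decile_summary(toy)
```

Applied to the full vector of per-gene coefficients, the nine returned values correspond to the nine percentiles listed above, up to the choice of interpolation rule.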

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of the coefficients against a normal distribution. Table 2 lists the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 395 408 391

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same coefficients, plotting their quantiles against those of a normal distribution. Points that depart from the blue line indicate deviation from normality.
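The coordinates of a QQ plot like the right panel can be sketched as follows. This is only an illustration: the function name `qq_points`, the toy input, the use of a standard normal as the reference, and the plotting positions (i - 0.5) / n are all assumptions, since the report does not describe how the figure was generated.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each sorted value with the matching standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumption of this sketch.
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Toy coefficients for illustration only (not the report's data)
points = qq_points([0.04, -0.2, 0.0, 0.21, -0.05])
```

Each pair is (theoretical quantile, observed coefficient). If the coefficients were normally distributed, the pairs would fall close to a straight line.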

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
OR52N2|390077 0.3681 6.10396429978621e-06 0.00585177419582136
KIR2DS4|3809 0.3598 7.56890834452406e-08 0.000689338327477529
KIR3DL1|3811 0.3408 2.67395309716534e-06 0.00401560323703896
PRSS41|360226 0.3355 4.19513075264888e-05 0.0155947564611223
COX7B2|170712 0.3091 9.58805802802054e-05 0.0203077298814412
SSX4|6759 0.3031 0.000154988847433257 0.0247642268069893
KIR2DL3|3804 0.2976 8.19794125450635e-06 0.00678752272503787
TTPA|7274 0.2975 2.24549485006165e-06 0.00401560323703896
KLRC4|8302 0.2894 1.78125919383376e-05 0.00983201097444908
SSX6|280657 0.2794 0.000696487314185923 0.0428356936417541
TUBA3C|7278 0.2784 0.000697874079978789 0.0428356936417541
KIAA1841|84542 0.2753 3.11988452850187e-08 0.000568286966866616
KIR2DL1|3802 0.272 0.000425241442910451 0.0364980540676649
GPR12|2835 0.2693 0.000115563702312249 0.0211532868231449
CLNK|116449 0.2637 0.000119005385876392 0.0214622089479057
KLRC3|3823 0.2635 3.08638184565169e-06 0.00401560323703896
RFPL4B|442247 0.26 0.000834390042969702 0.0464783322100707
IFNG|3458 0.2581 3.3423900167584e-06 0.00405877561035028
APOC1|341 0.2475 7.2217283220155e-07 0.00328859453463781
CSAG3|389903 0.2463 3.00853115065003e-06 0.00401560323703896
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and the APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used as input. Pearson correlation coefficients between APOBEC_MutLoad_MinEstimate and the expression of each gene were calculated across all samples common to both data sets.
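Restricting the analysis to samples present in both inputs amounts to a set intersection on sample identifiers. The sketch below illustrates this with hypothetical sample IDs, values and data layout (dictionaries keyed by sample ID); none of these are taken from the pipeline itself.

```python
def common_samples(mutload_by_sample, expression_by_sample):
    """Return the sorted sample IDs present in both data sets."""
    return sorted(set(mutload_by_sample) & set(expression_by_sample))

# Hypothetical inputs: sample ID -> APOBEC_MutLoad_MinEstimate,
# and sample ID -> {gene: expression value}
mutload = {"S1": 0.0, "S2": 12.5, "S3": 3.0}
expression = {
    "S2": {"GENE_A": 5.1},
    "S3": {"GENE_A": 4.7},
    "S4": {"GENE_A": 6.0},
}

shared = common_samples(mutload, expression)
# Counts in the same layout as Table 1: (mutation load, expression, common)
counts = (len(mutload), len(expression), len(shared))
```

The three counts mirror the columns of Table 1: the size of each input and the size of their overlap.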

Correlation across samples

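Pearson correlation was computed with the pairwise.complete.obs option (as in the use argument of R's cor function): for each gene, only the samples with non-missing values for both APOBEC_MutLoad_MinEstimate and that gene's expression contribute to the coefficient.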
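To make the pairwise-complete handling concrete, here is a minimal Python sketch of a Pearson coefficient that drops any sample where either value is missing. It is an illustration, not the pipeline's code: the function name `pearson_pairwise_complete`, the use of `None`/NaN to mark missing values, and the toy vectors are assumptions.

```python
import math

def pearson_pairwise_complete(x, y):
    """Pearson r over positions where both x and y are observed.

    Missing values are None or NaN. Returns (r, n_used); r is None
    when fewer than two complete pairs remain or a variance is zero.
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    # Keep only samples with both values present
    pairs = [(a, b) for a, b in zip(x, y)
             if not is_missing(a) and not is_missing(b)]
    n = len(pairs)
    if n < 2:
        return None, n

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None, n
    return sxy / math.sqrt(sxx * syy), n

# Toy vectors for illustration only; None marks a missing measurement
mutload = [1.0, 2.0, 3.0, 4.0, None]
gene_expr = [2.0, 4.1, None, 8.0, 9.0]
r, n_used = pearson_pairwise_complete(mutload, gene_expr)
```

Because missing values are dropped per gene, the number of samples contributing to each coefficient can differ from gene to gene.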