Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differences in gene expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.10232, -0.0682, -0.0421, -0.0203, 0.0011, 0.0232, 0.0463, 0.07454, and 0.1143, respectively.
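As an illustration of how such a decile summary can be produced, the sketch below computes the nine cut points (10th through 90th percentile) of a list of correlation coefficients using Python's standard library. The input values are hypothetical toy data, and the interpolation rule (`method="inclusive"`) is an assumption; the report does not state which percentile definition the pipeline uses.

```python
import statistics

# Hypothetical toy data: 101 evenly spaced "correlation coefficients"
# from -0.50 to 0.50. These are NOT the values from this report.
toy_correlations = [(i - 50) / 100 for i in range(101)]

# n=10 splits the data into ten equal-probability groups and returns the
# nine cut points between them, i.e. the 10th, 20th, ..., 90th percentiles.
# method="inclusive" treats the data as the full population (assumption).
deciles = statistics.quantiles(toy_correlations, n=10, method="inclusive")

for pct, value in zip(range(10, 100, 10), deciles):
    print(f"{pct}th percentile: {value:.4f}")
```

Different percentile definitions give slightly different cut points on small inputs, so an exact reproduction of the figures above would require knowing the rule the pipeline applied.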

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 533 515 478
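The three counts in Table 1 correspond to the sizes of two sample sets and of their intersection. A minimal sketch of that bookkeeping follows; the function name `count_samples` and the sample identifiers are hypothetical and only serve to illustrate the idea.

```python
def count_samples(mut_load_ids, expression_ids):
    """Return the number of samples in each data set and in both.

    Duplicated identifiers are counted once, since sets are used.
    """
    mut_load = set(mut_load_ids)
    expression = set(expression_ids)
    return {
        "APOBEC_MutLoad_MinEstimate": len(mut_load),
        "Expression": len(expression),
        "Common": len(mut_load & expression),  # samples present in both
    }

# Hypothetical sample identifiers, not taken from the report.
toy_mut_load_ids = ["S01", "S02", "S03", "S04"]
toy_expression_ids = ["S02", "S03", "S05"]

counts = count_samples(toy_mut_load_ids, toy_expression_ids)
print(counts)
```

Only the samples in the "Common" set can contribute to the per-gene correlations.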

Figure 1.  Summary figures. Left: histogram of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the quantiles of the calculated correlation coefficients against those of a normal distribution. Points that deviate from the blue line indicate departure from normality.

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
FBXO45|200933 0.3073 6.55764331725095e-12 6.59729739727055e-08
C13orf34|79866 0.3067 7.20268289455817e-12 6.59729739727055e-08
RNF219|79596 0.3004 2.01083594220108e-11 1.22788345417272e-07
PNMA5|114824 0.2896 5.39717825986941e-09 7.54235519878262e-06
SIM1|6492 0.2876 6.59030846528186e-05 0.00234423030632036
SPHKAP|80309 0.2849 1.7051355865938e-05 0.000882383582226322
IL20RB|53833 0.2792 5.2491055946291e-10 2.40395913470026e-06
FOXG1|2290 0.2763 0.000124080473687282 0.00352954999608281
PAK2|5062 0.2762 8.15381984153873e-10 2.53996227883633e-06
SIAH2|6478 0.2755 9.02268926239458e-10 2.53996227883633e-06
LOC100130386|100130386 0.273 0.000343789971467112 0.00718117273353025
TBL1XR1|79718 0.267 3.0317259813728e-09 5.55381882527684e-06
MAGEA6|4105 0.2636 1.32837173159039e-05 0.000737407325788013
MYO18B|84700 0.2633 6.03048508462933e-07 0.000108188530717285
ZNF675|171392 0.2633 5.07515740544306e-09 7.54235519878262e-06
TAC3|6866 0.2628 0.000140380014420405 0.00386709997619157
ZNF200|7752 0.2625 5.67391555961194e-09 7.54235519878262e-06
C18orf2|56651 0.2624 2.10560656133119e-05 0.00103135311756754
WDR53|348793 0.2619 6.17584627882195e-09 7.54235519878262e-06
ADAM21P1|145241 0.2615 2.04263919401093e-06 0.000220112396441684
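The report does not state how the q-values in Table 2 were derived from the p-values. One common choice for this kind of multiple-testing adjustment is the Benjamini-Hochberg procedure, sketched below purely as an illustration; it is an assumption, not a description of the pipeline. The function name and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that each adjusted value is
    # the minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values, not taken from Table 2.
toy_p_values = [0.01, 0.04, 0.03, 0.005]
toy_q_values = benjamini_hochberg(toy_p_values)
print(toy_q_values)
```

Because the adjustment depends on the total number of tests, it has to be run over the p-values of all genes rather than over the top 20 alone.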
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients between APOBEC_MutLoad_MinEstimate and each gene were calculated across all samples common to both data sets.

Correlation across samples

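Pearson correlation was calculated with the pairwise.complete.obs option (the name matches a value of the use argument of R's cor function), so that for each gene only samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient.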
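To make the pairwise-complete behaviour concrete, the following sketch computes a Pearson coefficient from only those samples where both values are present. It is a plain-Python illustration under stated assumptions, not the pipeline's own code: the function name, the use of NaN or None to mark missing values, and the toy data are all hypothetical.

```python
import math

def pearson_pairwise_complete(x, y):
    """Pearson r using only positions where both x and y are observed."""
    # Keep a pair only if neither value is missing (None or NaN).
    pairs = [
        (a, b)
        for a, b in zip(x, y)
        if a is not None and b is not None
        and not math.isnan(a) and not math.isnan(b)
    ]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant variable has no defined correlation
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: one missing value in each variable.
toy_mut_load = [1.0, 2.0, float("nan"), 4.0, 5.0]
toy_expression = [2.0, 4.0, 6.0, float("nan"), 10.0]

r = pearson_pairwise_complete(toy_mut_load, toy_expression)
print(r)
```

In the toy data only three of the five samples are complete in both variables, so only those three enter the calculation. Applied gene by gene, the number of samples behind each coefficient can therefore differ from gene to gene.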