Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th percentiles are -0.0564, -0.0395, -0.0259, -0.0141, -0.0035, 0.0069, 0.018, 0.0314 and 0.0498, respectively.
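As an illustration of how such a decile summary can be produced, the following minimal sketch computes the nine cut points from a list of per-gene correlation coefficients. The input values are made up for the example, and the interpolation rule (`method="inclusive"`) is an assumption; the report does not state which percentile definition the pipeline uses.

```python
import statistics

def decile_cut_points(values):
    """Return the 9 cut points (10th to 90th percentiles) of `values`.

    Assumption: "inclusive" interpolation, i.e. the data are treated as the
    whole population. The pipeline's actual percentile rule is not documented.
    """
    return statistics.quantiles(values, n=10, method="inclusive")

# Hypothetical per-gene correlation coefficients (not taken from the report).
example_cors = [-0.08, -0.05, -0.03, -0.01, 0.00, 0.01, 0.02, 0.04, 0.06, 0.09, 0.31]

deciles = decile_cut_points(example_cors)
for pct, value in zip(range(10, 100, 10), deciles):
    print(f"{pct}th percentile: {value:.4f}")
```

A narrow spread of deciles around zero, as in the summary above, indicates that most genes show little linear association with the mutation-load estimate.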

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category   APOBEC_MutLoad_MinEstimate   Expression   Common
Sample     978                          1093         974

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same correlation coefficients. The QQ plot compares the quantiles of the calculated correlation coefficients with those of a normal distribution. Points that fall away from the blue line indicate deviation from normality.
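The coordinates behind a QQ plot like the right panel of Figure 1 can be sketched as follows. This is only an illustration: the plotting positions `(i + 0.5) / n` and the choice of a reference normal fitted to the sample mean and standard deviation are assumptions, since the report does not describe how the figure was generated.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each sorted observation with the matching normal quantile.

    Assumptions (not stated in the report): plotting positions (i + 0.5) / n
    and a reference normal with the sample mean and standard deviation.
    """
    observed = sorted(values)
    n = len(observed)
    reference = NormalDist(mean(observed), stdev(observed))
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))

# Hypothetical correlation coefficients (not taken from the report).
example_cors = [-0.06, -0.02, 0.00, 0.03, 0.30]
points = qq_points(example_cors)
for theo, obs in points:
    print(f"theoretical={theo:+.4f}  observed={obs:+.4f}")
```

If the observations were normally distributed, the printed pairs would lie close to the line where theoretical equals observed; a heavy upper tail shows up as observed values rising above that line at the right end.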

Table 2.  Top 20 genes ranked by correlation coefficient

geneID cor p-value q-value
HFE2|148738 0.3074 2.43114595122051e-09 2.22401231617653e-05
CACNA1S|779 0.268 2.91497706417232e-06 0.00280696949295246
ABRA|137735 0.2343 1.00563434024536e-07 0.000229988573614115
TBX10|347853 0.1925 0.000309245891950916 0.0463767445830652
PROKR1|10887 0.1918 2.37452006577321e-05 0.00868884382467734
OR1Q1|158131 0.1863 0.000574190064553726 0.0659842181582006
CHRNA4|1137 0.182 0.000189695270141232 0.0324675201750846
SUSD2|56241 0.179 1.86831059512116e-08 8.54565266208418e-05
CDH10|1008 0.1753 0.000315819193703604 0.0465986126451704
ASB5|140458 0.1752 0.00130308763832487 0.0996150070466432
SNX11|29916 0.1721 6.5118462622138e-08 0.000170201056019234
OVCH1|341350 0.1719 0.00155976586092477 0.108096500725301
FBXO40|51725 0.1678 0.000383190373016262 0.0535179470588209
LOC400891|400891 0.1649 0.00162179682096375 0.110306299763393
MYL2|4633 0.1638 0.0012057557025793 0.0955548580695832
KCNT1|57582 0.1614 6.24746180877267e-06 0.00394150211218292
TEX11|56159 0.1576 3.53917629292155e-06 0.0030834652121568
SLC10A1|6554 0.1574 0.00140275816386604 0.103151994109108
TMC1|117531 0.1573 3.55047654312379e-05 0.0111999170401712
SEPX1|51734 0.156 1.00074233921887e-06 0.00143152409696613
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used in this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.
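The restriction to common samples amounts to intersecting the sample identifiers of the two inputs, which is how a "Common" count such as the one in Table 1 arises. The sketch below shows the idea; the dictionaries, sample IDs and values are hypothetical and do not reflect the pipeline's actual data structures.

```python
def common_samples(apobec_by_sample, expression_by_sample):
    """Return the sorted sample IDs present in both inputs."""
    return sorted(set(apobec_by_sample) & set(expression_by_sample))

# Hypothetical inputs keyed by sample ID (IDs and values are made up).
apobec_by_sample = {"S1": 0.0, "S2": 12.5, "S3": 3.0}
expression_by_sample = {"S2": 7.1, "S3": 6.4, "S4": 8.8, "S5": 5.2}

shared = common_samples(apobec_by_sample, expression_by_sample)
print(len(apobec_by_sample), len(expression_by_sample), len(shared))
```

Only the shared samples are carried forward, so the common count can never exceed the smaller of the two input counts.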

Correlation across samples

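Pearson correlation coefficients were computed with the pairwise.complete.obs option (the name of a setting of R's cor function), so that for each gene only samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient.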
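To make the pairwise-complete behaviour concrete, here is a minimal Python sketch of a Pearson correlation that drops any sample where either value is missing. It is an illustrative re-implementation, not the pipeline's code; the function name and the use of `None` or NaN to mark missing values are assumptions.

```python
import math

def _is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def pearson_pairwise_complete(x, y):
    """Pearson r over the positions where both x and y are present.

    Returns None when fewer than two complete pairs remain or when either
    variable is constant over the complete pairs.
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if not _is_missing(a) and not _is_missing(b)]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for one gene; the last two samples are incomplete.
mut_load = [1.0, 2.0, 3.0, float("nan"), 5.0]
expression = [2.0, 1.0, 4.0, 3.0, None]
r = pearson_pairwise_complete(mut_load, expression)
print(r)
```

Because incomplete pairs are dropped gene by gene, the effective sample size can differ between genes, so coefficients of similar size may be based on different numbers of samples.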