Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.1698, -0.1114, -0.06929, -0.0321, 0.0026, 0.0386, 0.07809, 0.1244, and 0.1873, respectively.
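
As an illustration of how a decile summary like this can be produced, the minimal Python sketch below computes the 10th through 90th percentiles of a list of per-gene correlation coefficients. The values in `cors` are made-up placeholders, not the pipeline's data. The interpolation rule (`method="inclusive"`) is an assumption, since the report does not state which quantile definition was used.

```python
import statistics

# Hypothetical per-gene correlation coefficients (placeholder values only).
cors = [-0.30, -0.20, -0.10, -0.05, 0.00, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50]

# n=10 yields the 9 cut points at the 10th, 20th, ..., 90th percentiles.
# method="inclusive" interpolates linearly between the sorted data points
# (an assumed choice; other quantile definitions give slightly different values).
deciles = statistics.quantiles(cors, n=10, method="inclusive")

for pct, value in zip(range(10, 100, 10), deciles):
    print(f"{pct}th percentile: {value:.4f}")
```

A median (50th percentile) close to zero, as in the summary above, means that roughly as many genes have positive coefficients as negative ones.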

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 185 184 184

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients, calculated across samples, for all genes. Right: QQ plot of the same correlation coefficients, in which their quantiles are plotted against the corresponding quantiles of a normal distribution. Points deviating from the blue line indicate deviation from normality.
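
To make the construction of the QQ plot concrete, the following minimal Python sketch pairs each observed coefficient with the matching quantile of a normal distribution fitted to the data. It is only a sketch under stated assumptions: the `cors` values are placeholders, the function name `qq_points` is hypothetical, and the plotting positions (i - 0.5) / n are one common convention that the report does not confirm.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical per-gene correlation coefficients (placeholder values only).
cors = [-0.30, -0.20, -0.10, -0.05, 0.00, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50]


def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    ordered = sorted(values)
    n = len(ordered)
    # Reference normal distribution with the sample mean and standard deviation.
    ref = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n are an assumed convention.
    return [
        (ref.inv_cdf((i - 0.5) / n), observed)
        for i, observed in enumerate(ordered, start=1)
    ]


points = qq_points(cors)
for theoretical, observed in points:
    print(f"{theoretical:8.4f} {observed:8.4f}")
```

If the coefficients were normally distributed, the printed pairs would lie close to the identity line; systematic departures in the tails correspond to the deviations described in the caption.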

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
OR2W3|343171 0.4843 2.7538493707624e-08 7.38346357006695e-05
RNF11|26994 0.4359 6.20910434179223e-10 3.88441567622522e-06
CYB5R1|51706 0.4251 1.79783521403465e-09 8.43544282425057e-06
CAPZB|832 0.4076 9.33630506239069e-09 3.50447546821897e-05
IFNK|56832 0.4035 3.69880425088454e-06 0.000701203617985868
PRB1|5542 0.3984 0.00122289206215487 0.0167405092797393
ASB15|142685 0.39 0.00131964431043574 0.0175156184004653
TMEM79|84283 0.3861 6.18746489600142e-08 0.000144994727327467
HSPA2|3306 0.3847 6.95307196263428e-08 0.000144994727327467
PLEKHG5|57449 0.3823 8.5200481780845e-08 0.000153273475794565
C1orf74|148304 0.3817 8.9834198302441e-08 0.000153273475794565
S100A11|6282 0.3743 1.661087956073e-07 0.000211094856358329
CITED4|163732 0.3741 1.68713919723729e-07 0.000211094856358329
FAM178B|51252 0.3732 2.46857399854861e-07 0.000243843141077686
SLC38A2|54407 0.371 2.17428239190909e-07 0.000240040776066763
FOLR3|2352 0.367 3.2072548125095e-05 0.00194173413939285
APOBEC3A|200315 0.3669 3.50869431731127e-07 0.000278904205873213
KIAA1609|57707 0.3664 3.12661796009195e-07 0.000278904205873213
MMP28|79148 0.3659 3.25488486829784e-07 0.000278904205873213
MAP7D1|55700 0.3649 3.52045949281177e-07 0.000278904205873213
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and the APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used as input. Pearson correlation coefficients between APOBEC_MutLoad_MinEstimate and the expression of each gene were calculated across all samples common to both data sets.

Correlation across samples

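Pearson correlation was computed with the pairwise.complete.obs option, so each gene's coefficient is based only on the samples that have non-missing values for both APOBEC_MutLoad_MinEstimate and that gene's expression.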
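
The pairwise-complete calculation can be sketched in Python as follows. This is not the pipeline's own code: the function name `pearson_pairwise`, the gene names, and all data values are hypothetical, and representing missing values as `None` or NaN is an assumption of the sketch.

```python
import math


def pearson_pairwise(x, y):
    """Pearson r over positions where both x and y are non-missing.

    Missing values are None or NaN (an assumption of this sketch).
    Returns (r, n_used); r is NaN if fewer than two complete pairs
    exist or if either variable is constant over the complete pairs.
    """
    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    # Keep only samples observed in both variables (pairwise deletion).
    pairs = [(a, b) for a, b in zip(x, y) if not missing(a) and not missing(b)]
    n = len(pairs)
    if n < 2:
        return math.nan, n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return math.nan, n
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical data: one mutation-load value per sample, and per-gene
# expression values in the same sample order.
mut_load = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = {
    "GENE_A": [2.0, 4.0, None, 8.0, 10.0],
    "GENE_B": [5.0, math.nan, 3.0, 2.0, 1.0],
}

results = {gene: pearson_pairwise(mut_load, values)
           for gene, values in expression.items()}
for gene, (r, n_used) in results.items():
    print(f"{gene}: r = {r:.4f} from {n_used} samples")
```

Because the number of complete pairs can differ from gene to gene, two genes with similar coefficients can have different p-values, which is consistent with the pattern seen in Table 2.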