Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differences in gene expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.0886, -0.0593, -0.0375, -0.0192, -0.0013, 0.017, 0.03689, 0.0603, and 0.09273, respectively.
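A decile summary of this kind can be reproduced from a list of per-gene correlation coefficients. The sketch below is illustrative only: the function name `decile_summary` is hypothetical, and the interpolation rule (linear interpolation between order statistics) is an assumption, since the report does not state which quantile definition the pipeline uses.

```python
from statistics import quantiles

def decile_summary(coefficients):
    """Return the 10th, 20th, ..., 90th percentiles of the coefficients.

    Assumption: linear interpolation between order statistics
    (the 'inclusive' method); the report does not specify the rule.
    """
    return quantiles(coefficients, n=10, method="inclusive")

# Toy input, not data from this report.
toy = [float(v) for v in range(11)]
print(decile_summary(toy))
```

With a different quantile definition the values can shift slightly, most noticeably for small gene sets.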

Results
Correlation results

The numbers of samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and number of samples common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 510 520 502

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients, calculated across samples, for all genes. Right: QQ plot of the same correlation coefficients, which plots their quantiles against those of a normal distribution. Points that depart from the blue line indicate deviation from normality.
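The points of a normal QQ plot like the one described in Figure 1 can be computed as in the following sketch. The function name `normal_qq_points` is hypothetical, and the plotting positions (i + 0.5) / n are an assumption; the report does not say which positions the pipeline uses.

```python
from statistics import NormalDist

def normal_qq_points(values):
    """Pair each sorted value with a standard normal quantile.

    Assumption: plotting positions (i + 0.5) / n for i = 0..n-1.
    Returns (theoretical_quantile, observed_value) tuples.
    """
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i + 0.5) / n), observed)
        for i, observed in enumerate(ordered)
    ]

# Toy coefficients, not data from this report.
for theoretical, observed in normal_qq_points([0.3, -0.1, 0.0]):
    print(round(theoretical, 4), observed)
```

If the observed values were normally distributed, these points would fall close to a straight line.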

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
GAGE4|2576 0.3616 6.31556404928624e-06 0.00218867555570412
RAX|30062 0.3038 2.421508061623e-06 0.00148570308619631
PTCHD3|374308 0.2981 0.000111302217288944 0.0104362864762052
CPN1|1369 0.2964 0.000105253721326637 0.0102884046407866
WDR87|83889 0.2582 0.000215163202629043 0.0148099975202867
C1orf126|200197 0.2476 2.09640704795788e-08 0.0001301050076657
LTA4H|4048 0.246 2.35184123287269e-08 0.0001301050076657
KIR2DL1|3802 0.2431 3.91976157509966e-05 0.00610486256162555
CDCP1|64866 0.2368 7.96948989023605e-08 0.000209233264575369
KLRC2|3822 0.2331 2.55950215466427e-07 0.000414163600778917
KYNU|8942 0.2321 1.44266028456741e-07 0.000294591230108665
CT45A5|441521 0.2313 0.00201273761640564 0.0501899687041702
?|340602 0.2301 0.0024657148843743 0.0559126054378087
GOLGA6L1|283767 0.2299 1.29231322532064e-06 0.00131945180305237
CLEC7A|64581 0.2297 1.95423442539422e-07 0.00035914920269895
MAGEA4|4103 0.2283 3.44356890424891e-06 0.00158214773305716
KIR2DL4|3805 0.2276 4.81650013428592e-07 0.000590117596452711
GLDC|2731 0.2263 3.06303240549255e-07 0.000433018534985708
CALCA|796 0.2235 0.00449246621456134 0.0755301676408008
SHISA5|51246 0.2172 8.9825179294678e-07 0.000971063026516231
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used as input. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

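Pearson correlation coefficients were computed with the pairwise.complete.obs option, so that for each gene only the samples with non-missing values for both APOBEC_MutLoad_MinEstimate and that gene's expression contribute to the coefficient.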
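To illustrate what pairwise-complete Pearson correlation means, here is a minimal sketch. It is not the pipeline's code: the function name `pearson_pairwise_complete` is hypothetical, and representing missing values as `None` or NaN is an assumption made for the example.

```python
import math

def _is_missing(value):
    # Assumption: missing observations are encoded as None or NaN.
    return value is None or (isinstance(value, float) and math.isnan(value))

def pearson_pairwise_complete(x, y):
    """Pearson r over the samples where both x and y are present."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not _is_missing(a) and not _is_missing(b)]
    n = len(pairs)
    if n < 2:
        return float("nan")  # too few complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant variable has no defined correlation
    return sxy / math.sqrt(sxx * syy)

# Toy vectors, not data from this report: the last two samples are
# dropped because one of the two values is missing in each.
mut_load = [1.0, 2.0, 3.0, None, 5.0]
expression = [2.0, 4.0, 6.0, 8.0, float("nan")]
print(pearson_pairwise_complete(mut_load, expression))
```

Because missing values are dropped per gene, the number of samples behind each coefficient can differ from gene to gene.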