Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is also associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.14214, -0.0981, -0.0672, -0.042, -0.0175, 0.006, 0.031, 0.0613, and 0.10194, respectively.
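As an illustration of how such a decile summary can be derived from a list of per-gene correlation coefficients, here is a minimal sketch using only the Python standard library. The function name `decile_summary`, the example values, and the choice of the "inclusive" interpolation method are assumptions made for this example; the report does not state which quantile definition the pipeline uses.

```python
import statistics


def decile_summary(correlations):
    """Map 10, 20, ..., 90 to the corresponding percentile of the input values."""
    # Assumption: linear interpolation between order statistics ("inclusive").
    # The pipeline's actual quantile definition is not given in the report.
    cuts = statistics.quantiles(correlations, n=10, method="inclusive")
    return dict(zip(range(10, 100, 10), cuts))


# Hypothetical coefficients, not taken from the report.
example = [-0.30, -0.15, -0.10, -0.05, -0.02, 0.00, 0.03, 0.06, 0.10, 0.25, 0.48]
deciles = decile_summary(example)
for percentile, value in deciles.items():
    print(f"{percentile}th percentile: {value:.4f}")
```

With real data, `correlations` would hold one coefficient per gene, and the printed values would correspond to the nine numbers listed in the summary.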

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 194 304 193
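The counts in Table 1 amount to the sizes of two sets of sample identifiers and of their intersection. The sketch below shows that bookkeeping; the function name `sample_overlap` and the sample identifiers are hypothetical and serve only as an illustration.

```python
def sample_overlap(apobec_samples, expression_samples):
    """Count samples in each data set and those present in both."""
    apobec = set(apobec_samples)
    expression = set(expression_samples)
    common = apobec & expression  # only these samples can enter the correlation
    return {
        "APOBEC_MutLoad_MinEstimate": len(apobec),
        "Expression": len(expression),
        "Common": len(common),
    }


# Hypothetical sample identifiers, not taken from the report.
counts = sample_overlap(
    ["S01", "S02", "S03", "S04"],
    ["S02", "S03", "S04", "S05", "S06"],
)
print(counts)
```

Applied to the actual data sets, the three values would correspond to the 194, 304, and 193 reported in Table 1.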

Figure 1.  Summary figures. Left: histogram showing the distribution of the calculated correlations across samples for all genes. Right: QQ plot of the calculated correlations across samples. The QQ plot compares the quantiles of the calculated correlation coefficients with those of a normal distribution. Points deviating from the blue line indicate deviation from normality.
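To make the construction of the QQ plot concrete, the following sketch pairs each sorted correlation coefficient with a theoretical quantile of a standard normal distribution. The function name `qq_points`, the plotting positions (i - 0.5)/n, and the use of a standard normal reference are assumptions for this example; the report does not specify how the figure was drawn.

```python
from statistics import NormalDist


def qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a normal QQ plot."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist()  # assumption: standard normal reference
    # Assumption: plotting positions (i - 0.5) / n for i = 1..n.
    theoretical = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical coefficients, not taken from the report.
points = qq_points([0.10, -0.20, 0.05, 0.30, -0.10])
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the coefficients were normally distributed, these pairs would fall close to a straight line, which is the role of the blue reference line in the figure.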

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
MYL7|58498 0.4827 6.64923105020421e-06 0.0126330189235998
PRSS37|136242 0.4726 0.000137425189183471 0.0423618968759286
MAGEA4|4103 0.3787 0.00029862119846058 0.0622802215913949
MAGEA8|4107 0.3623 4.78680021336331e-05 0.0262088296248366
PRTN3|5657 0.3622 0.00327062059122873 0.151741777277237
ADCYAP1R1|117 0.3582 0.00292085363789241 0.147559903089859
GJB7|375519 0.3546 1.59714488345841e-05 0.0156125478889854
C3orf49|132200 0.3518 0.00628123699036576 0.18666153128069
SSX4|6759 0.326 0.00627088193478142 0.186658804824664
PGPEP1L|145814 0.3202 9.11345316541023e-05 0.0352651856849608
TAS2R13|50838 0.3051 0.0187970016258188 0.274399823538777
PDP1|54704 0.3041 1.70771108289891e-05 0.0156125478889854
KIR3DL3|115653 0.304 0.00210728919112269 0.132829111132642
TRIM17|51127 0.2992 2.37569987013853e-05 0.0180028556409206
C1orf49|84066 0.2983 0.0115214676572508 0.235404592024457
RFPL4A|342931 0.2972 0.0106795126898511 0.226947185829069
COX8C|341947 0.2887 0.00851813236342114 0.208505078456986
GBP5|115362 0.2879 4.89965473824405e-05 0.0262088296248366
LOC100271832|100271832 0.2876 0.0150117184467025 0.254487010767607
P2RY6|5031 0.2831 6.63859744596262e-05 0.0301840429374306
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

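Pearson correlation was calculated with the pairwise.complete.obs option, so that for each gene only the samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient.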
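The following is a minimal sketch of a Pearson correlation with pairwise-complete handling of missing values, written with the Python standard library. It is an illustration of the idea, not the pipeline's own code; the function name `pearson_pairwise_complete` and the example values are hypothetical, and missing values are assumed to be encoded as `None` or NaN.

```python
import math


def pearson_pairwise_complete(x, y):
    """Pearson r using only positions where both x and y are present."""
    # Keep a pair only if neither value is None or NaN (assumed encodings).
    pairs = [
        (a, b)
        for a, b in zip(x, y)
        if a is not None and b is not None
        and not (math.isnan(a) or math.isnan(b))
    ]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant variable
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: one mutation-load vector and one gene's expression.
mut_load = [1.0, 2.0, None, 4.0, 5.0]
expression = [2.0, 4.0, 6.0, float("nan"), 10.0]
r = pearson_pairwise_complete(mut_load, expression)
print(r)
```

In a per-gene analysis, this function would be applied once per gene, pairing the same APOBEC_MutLoad_MinEstimate vector with each gene's expression values over the common samples.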