Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th percentiles are -0.14214, -0.0984, -0.06712, -0.0422, -0.0178, 0.0059, 0.031, 0.061 and 0.101, respectively.
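As an illustration of how such a percentile summary can be produced from a list of per-gene correlation coefficients, here is a minimal Python sketch. The function name `decile_cut_points` and the toy input are hypothetical, and the interpolation rule (`method="inclusive"`) is an assumption, since the report does not state which percentile definition the pipeline used.

```python
import statistics


def decile_cut_points(coefficients):
    """Return the 9 cut points at the 10th, 20th, ..., 90th percentiles.

    Assumption: linear interpolation between the observed minimum and
    maximum ("inclusive" method); the pipeline's own rule is not documented.
    """
    return statistics.quantiles(coefficients, n=10, method="inclusive")


# Hypothetical toy coefficients: -0.5, -0.4, ..., 0.5 (11 values).
toy_coefficients = [i / 10 for i in range(-5, 6)]
deciles = decile_cut_points(toy_coefficients)

for percent, value in zip(range(10, 100, 10), deciles):
    print(f"{percent}th percentile: {value:.4f}")
```

A different interpolation rule would shift the cut points slightly, so values computed this way may not match the reported ones to the last digit.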

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and number of samples common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 194 304 193

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same correlation coefficients, plotting their quantiles against those of a normal distribution. Points deviating from the blue line indicate deviation from normality.
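The coordinates behind a QQ plot like the right panel can be sketched in Python as follows. This is a sketch under stated assumptions, not the pipeline's plotting code: the function name `normal_qq_points` is hypothetical, the reference normal is fitted to the sample mean and standard deviation, and the plotting positions (i + 0.5) / n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Pair each ordered observation with its theoretical normal quantile.

    Assumptions: the reference normal uses the sample mean and standard
    deviation, and plotting positions are (i + 0.5) / n.
    """
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(mean(ordered), stdev(ordered))
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    # Each tuple is (theoretical quantile, observed quantile).
    return list(zip(theoretical, ordered))


# Hypothetical toy coefficients, symmetric around zero.
toy_coefficients = [0.1, -0.2, 0.0, 0.2, -0.1]
qq_points = normal_qq_points(toy_coefficients)

for theoretical_q, observed_q in qq_points:
    print(f"{theoretical_q:+.4f}  {observed_q:+.4f}")
```

With this fitted reference, points from a normally distributed sample fall close to the line y = x, and systematic departures in the tails indicate heavier or lighter tails than normal.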

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
MYL7|58498 0.4852 5.84671357772848e-06 0.0118149088709053
PRSS37|136242 0.4703 0.000149425389998292 0.0438322510951442
MAGEA4|4103 0.3741 0.000357731367427894 0.0625846835070673
ADCYAP1R1|117 0.363 0.00253461389236964 0.14133976848833
PRTN3|5657 0.3625 0.0032438936092527 0.152677746344702
MAGEA8|4107 0.3612 5.04354682537844e-05 0.0269785253273993
C3orf49|132200 0.3567 0.00554816038790729 0.181668944913246
GJB7|375519 0.3537 1.69315037301665e-05 0.0153448213690089
SSX4|6759 0.3275 0.006013199160531 0.187033783856913
PGPEP1L|145814 0.32 9.23039321381047e-05 0.0349735752874106
TAS2R13|50838 0.3081 0.0176144985265181 0.273302964548592
PDP1|54704 0.3064 1.46900584154785e-05 0.0153448213690089
KIR3DL3|115653 0.3034 0.00214751446585359 0.135928921714178
RFPL4A|342931 0.3009 0.0096820305522165 0.222383736120283
TRIM17|51127 0.2968 2.77203142649007e-05 0.0195251528621723
LOC100271832|100271832 0.2881 0.0148331201626304 0.256191791450863
GBP5|115362 0.2831 6.62581840529253e-05 0.0314674224439581
COX8C|341947 0.2816 0.0103872067642137 0.226986527155563
P2RY6|5031 0.2816 7.28361894841711e-05 0.0327766360846321
C1orf49|84066 0.2805 0.0178281335883055 0.273357709320204
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

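Pearson correlation was computed with the pairwise.complete.obs option (the name of this option in R's cor function), so for each gene only the samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient.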
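To make the pairwise-complete behaviour concrete, the following Python sketch computes a Pearson coefficient per gene while dropping, for that gene only, any sample in which either value is missing. It is an illustration, not the pipeline's code: the function names, the gene labels GENE_A and GENE_B, and the toy values are all hypothetical, and missing values are assumed to be encoded as None or NaN.

```python
import math


def _is_present(value):
    """True unless the value is None or a float NaN (assumed encodings)."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))


def pearson_pairwise_complete(x, y):
    """Pearson r over the positions where both x and y are present."""
    pairs = [(a, b) for a, b in zip(x, y) if _is_present(a) and _is_present(b)]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant variable has no defined correlation
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: one mutation-load value per sample, and one
# expression vector per gene over the same samples (None marks missing).
apobec_load = [1.0, 2.0, 3.0, 4.0, None]
expression = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 1.0],
    "GENE_B": [4.0, None, 2.0, 1.0, 9.0],
}

correlations = {
    gene: pearson_pairwise_complete(apobec_load, values)
    for gene, values in expression.items()
}
for gene, r in correlations.items():
    print(gene, round(r, 4))
```

Because samples are dropped per gene rather than globally, different genes can be correlated over different numbers of samples, which also affects the p-values attached to coefficients of similar size.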