Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples to determine whether differences in APOBEC_MutLoad_MinEstimate are accompanied by differences in gene expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.0567, -0.0398, -0.026, -0.0142, -0.0036, 0.007, 0.0182, 0.0315, and 0.0499, respectively.
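As an illustration of how such a percentile summary can be obtained from a list of per-gene correlation coefficients, here is a minimal sketch using Python's standard library. The function name and the toy values are hypothetical, and the interpolation method is an assumption; the report does not state how its percentiles were computed.

```python
import statistics

def correlation_deciles(coefficients):
    """Return the 10th, 20th, ..., 90th percentiles of the coefficients."""
    # n=10 yields the 9 cut points between deciles. method="inclusive"
    # interpolates linearly between the sample minimum and maximum
    # (an assumed convention, not one stated in the report).
    return statistics.quantiles(coefficients, n=10, method="inclusive")

# Toy coefficients (hypothetical values, not taken from the report).
toy = [i / 100 for i in range(-5, 6)]  # -0.05, -0.04, ..., 0.05
deciles = correlation_deciles(toy)
```

A median close to zero with roughly symmetric tails, as in the summary above, indicates that most genes show little linear association with the mutation-load estimate.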

Results
Correlation results

The numbers of samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 978 1093 974
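The counts in Table 1 correspond to the sizes of two sample sets and of their intersection. A minimal sketch of that bookkeeping, assuming each data set is identified by a list of sample IDs (the function name and the toy IDs are hypothetical):

```python
def sample_counts(apobec_ids, expression_ids):
    """Count samples in each data set and those present in both."""
    apobec = set(apobec_ids)
    expression = set(expression_ids)
    common = apobec & expression  # samples usable for the correlation
    counts = {
        "APOBEC_MutLoad_MinEstimate": len(apobec),
        "Expression": len(expression),
        "Common": len(common),
    }
    return counts, sorted(common)

# Toy sample IDs (hypothetical, for illustration only).
counts, common = sample_counts(["S1", "S2", "S3"], ["S2", "S3", "S4", "S5"])
```

Only the samples in the common set can contribute to the per-gene correlations.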

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients, calculated across samples, for all genes. Right: QQ plot of the calculated correlation coefficients, which plots their quantiles against those of a normal distribution. Points deviating from the blue line indicate deviation from normality.
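The coordinates of a normal QQ plot like the one described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's plotting code: the function name is hypothetical, the reference distribution is taken to be the standard normal, and the plotting positions (i + 0.5) / n are one common convention.

```python
from statistics import NormalDist

def normal_qq_points(values):
    """Pair each sorted value with the matching standard-normal quantile."""
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting positions: (i + 0.5) / n for i = 0 .. n-1.
    return [(std_normal.inv_cdf((i + 0.5) / n), v) for i, v in enumerate(data)]

# Toy coefficients (hypothetical values).
points = normal_qq_points([0.3, -0.1, 0.0])
```

Each pair is (theoretical quantile, observed value). If the coefficients were normally distributed, the points would fall close to a straight line.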

Table 2.  Top 20 genes ranked by correlation coefficient

geneID cor p-value q-value
HFE2|148738 0.3017 4.92876051083613e-09 4.50883011531289e-05
CACNA1S|779 0.266 3.46300125286803e-06 0.00289035671378529
ABRA|137735 0.2251 3.20085347116361e-07 0.000532389228258268
TBX10|347853 0.1932 0.000293478731580699 0.0431638279629016
PROKR1|10887 0.1926 2.19678084611985e-05 0.00803846047212176
OR1Q1|158131 0.1874 0.000533310389485564 0.0627924417907852
CHRNA4|1137 0.1835 0.000168260230674289 0.0292741389024907
SUSD2|56241 0.1796 1.65089473203039e-08 7.55119250430702e-05
CDH10|1008 0.177 0.00027573679993953 0.0413514794401118
SNX11|29916 0.1727 5.80909835790067e-08 0.000151833233651644
ASB5|140458 0.172 0.00160128154558414 0.106534716938209
OVCH1|341350 0.1719 0.00156694912835098 0.106487689515663
LOC400891|400891 0.1701 0.00114086424729853 0.091549352055149
FBXO40|51725 0.167 0.000410011890184858 0.0531349807878518
KCNT1|57582 0.1618 5.88433384707265e-06 0.00371240593331177
MYL2|4633 0.1593 0.00164242111564783 0.108482804086255
TMC1|117531 0.159 2.89909036019775e-05 0.00959964512502387
SLC10A1|6554 0.1579 0.00136000789368351 0.099134280569058
TEX11|56159 0.1579 3.38379499642549e-06 0.00289035671378529
SEPX1|51734 0.1563 9.474744027127e-07 0.00131686632153752
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used in this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

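Pearson correlation was computed with the pairwise.complete.obs option (as in the use argument of R's cor function), so each coefficient is based only on the samples in which both APOBEC_MutLoad_MinEstimate and the gene's expression value are present.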
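The pairwise-complete approach can be sketched in Python as below. This is a minimal illustration, not the pipeline's implementation: the function names are hypothetical, and missing values are assumed to be encoded as None or NaN.

```python
import math

def _is_missing(v):
    """Treat None and NaN as missing (an assumed encoding)."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def pearson_pairwise_complete(x, y):
    """Pearson r over the pairs in which both values are present.

    Returns (r, n), where n is the number of complete pairs used.
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if not _is_missing(a) and not _is_missing(b)]
    n = len(pairs)
    if n < 2:
        return float("nan"), n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan"), n  # correlation undefined for a constant series
    return sxy / math.sqrt(sxx * syy), n

# Toy data: the third and fourth samples are dropped as incomplete pairs.
mut_load = [1.0, 2.0, None, 4.0, 5.0]
expression = [2.0, 4.0, 6.0, float("nan"), 10.0]
r, n_used = pearson_pairwise_complete(mut_load, expression)
```

Returning the number of complete pairs alongside the coefficient is a design choice made here: with pairwise deletion the effective sample size can differ from gene to gene, and any significance test should use that per-gene count.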