Correlations between APOBEC_MutLoad_MinEstimate and mRNAseq expression
Overview
Introduction

This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.

Summary

The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.091, -0.0558, -0.03001, -0.0077, 0.0131, 0.03478, 0.0571, 0.08304, and 0.1193, respectively.
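As an illustration of how such a decile summary could be produced, the sketch below computes the nine cut points from a list of per-gene correlation coefficients. The report does not say which quantile definition was used; the "inclusive" linear-interpolation method and the toy values are assumptions made only for this example.

```python
from statistics import quantiles

def decile_summary(correlations):
    """Return the 10th..90th percentiles of a list of correlation coefficients."""
    # n=10 yields the nine cut points between deciles.
    # method="inclusive" is an assumption; the report does not name a method.
    return quantiles(correlations, n=10, method="inclusive")

# Hypothetical per-gene correlation coefficients, not taken from the report.
toy_correlations = [i / 10 - 0.5 for i in range(11)]  # -0.5, -0.4, ..., 0.5
deciles = decile_summary(toy_correlations)
for percent, value in zip(range(10, 100, 10), deciles):
    print(f"{percent}th percentile: {value:.4f}")
```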

Results
Correlation results

The number of samples used for the calculation is shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ordered by correlation coefficient.

Table 1.  Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both

Category APOBEC_MutLoad_MinEstimate Expression Common
Sample 177 501 177
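The counts in Table 1 amount to a set intersection over sample identifiers. A minimal sketch, assuming each data set is available as a list of sample IDs (the function name and the IDs below are hypothetical):

```python
def sample_counts(mut_load_samples, expression_samples):
    """Count samples in each data set and those present in both."""
    mut = set(mut_load_samples)
    expr = set(expression_samples)
    return {
        "APOBEC_MutLoad_MinEstimate": len(mut),
        "Expression": len(expr),
        "Common": len(mut & expr),  # only these samples enter the correlation
    }

# Hypothetical sample identifiers, for illustration only.
counts = sample_counts(["S1", "S2", "S3"], ["S2", "S3", "S4", "S5"])
print(counts)
```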

Figure 1.  Summary figures. Left: histogram showing the distribution of the correlation coefficients calculated across samples for all genes. Right: QQ plot of the same correlation coefficients, which plots their quantiles against those of a normal distribution. Points deviating from the blue line indicate deviation from normality.
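The points of a QQ plot like the one described in Figure 1 can be computed as follows. This is only a sketch: the reference normal distribution is fitted to the sample mean and standard deviation, and the plotting positions (i + 0.5) / n are one common convention. Both are assumptions, since the report does not specify how the plot was built.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each sorted observation with the matching normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    # Reference normal fitted to the data (assumption, see lead-in).
    reference = NormalDist(mean(ordered), stdev(ordered))
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical correlation coefficients, not taken from the report.
points = qq_points([-0.09, -0.03, 0.01, 0.06, 0.12])
for theoretical, observed in points:
    print(f"{theoretical:+.4f}  {observed:+.4f}")
```

Points lying close to the line y = x would indicate agreement with the fitted normal distribution.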

Table 2.  Top 20 genes ranked by correlation coefficients

geneID cor p-value q-value
TSPYL6|388951 0.5781 2.98381297447747e-08 0.000552423134094759
C11orf86|254439 0.4779 2.50474980545423e-05 0.0662470541402566
SLC10A1|6554 0.4159 2.05020617163321e-05 0.0632625284360286
FAM75C1|441452 0.4083 0.000894303331782531 0.367936264102706
ATP4B|496 0.4056 0.000801333902122892 0.362553969104832
ALPP|250 0.3571 5.86888532749796e-06 0.0362188476510991
OLFM4|10562 0.3512 3.48935851168797e-06 0.0323009917426955
C11orf85|283129 0.3425 0.00688754967740124 0.579950713666998
CYP4F8|11283 0.3394 0.0148429297342312 0.645632246523302
CGA|1081 0.3345 0.00242376193145444 0.476423030323529
AMAC1|146861 0.3274 0.00827759667052819 0.593874537637112
C20orf141|128653 0.3272 0.00345293134324942 0.532729757407665
INHA|3623 0.3265 9.76297271670923e-06 0.0451879192192887
TRIM17|51127 0.316 1.83494550967112e-05 0.0632625284360286
C8G|733 0.313 4.02855914163336e-05 0.09323092993525
LCE5A|254910 0.3046 0.00979684226331745 0.596182849930428
OR1F1|4992 0.3028 0.00285761779867921 0.497923409941477
THRSP|7069 0.2994 0.00774556144293648 0.585994538325235
GPR78|27201 0.2975 0.0233062458127198 0.683193044987412
TMED1|11018 0.2969 5.99622199286554e-05 0.123348948862125
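The report does not state how the q-values in Table 2 were derived from the p-values. One widely used approach is the Benjamini-Hochberg adjustment, sketched here purely as an assumption and not as the pipeline's documented method. The input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values, for illustration only.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q_values)
```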
Methods & Data
Input

Gene-level (TCGA Level III) mRNAseq expression data and APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.

Correlation across samples

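Pearson correlation coefficients were calculated with the pairwise.complete.obs option (as in R's cor function), so that for each gene only samples with non-missing values for both APOBEC_MutLoad_MinEstimate and expression contribute to the coefficient.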
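To make the pairwise-complete behaviour concrete, the sketch below computes a Pearson coefficient after dropping any sample in which either value is missing. It is a plain-Python illustration, not the pipeline's own code; the function name and the toy data are hypothetical, and missing values are assumed to be encoded as None or NaN.

```python
import math

def pearson_pairwise_complete(x, y):
    """Pearson r using only positions where both values are present."""
    pairs = [
        (a, b) for a, b in zip(x, y)
        if a is not None and b is not None
        and not math.isnan(a) and not math.isnan(b)
    ]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for one gene across five samples.
mut_load = [1.0, 2.0, None, 4.0, 5.0]
expression = [2.0, 4.0, 6.0, float("nan"), 10.0]
r = pearson_pairwise_complete(mut_load, expression)
print(r)  # computed from the three complete pairs only
```

In a per-gene analysis this function would be called once per gene, so the number of samples contributing can differ from gene to gene.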