This pipeline calculates the Pearson correlation between APOBEC_MutLoad_MinEstimate and the mRNAseq expression of each gene across samples, to determine whether APOBEC_MutLoad_MinEstimate is associated with differential expression.
The correlation coefficients at the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles are -0.088, -0.0567, -0.0344, -0.01474, 0.0046, 0.0235, 0.0446, 0.069, and 0.10216, respectively.
The numbers of samples used for the calculation are shown in Table 1. Figure 1 shows the distribution of the calculated correlation coefficients and a quantile-quantile (QQ) plot of those coefficients against a normal distribution. Table 2 shows the top 20 genes ranked by correlation coefficient.
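The percentile summary above can be reproduced from a list of per-gene correlation coefficients. The following is a minimal sketch, not the pipeline's own code: the function name `decile_summary` and the toy input are hypothetical, and the interpolation rule (`method="inclusive"`) is an assumption, since the report does not state which quantile definition was used.

```python
from statistics import quantiles

def decile_summary(coefficients):
    """Return {percentile: value} for the 10th..90th percentiles.

    Assumption: linear interpolation with the minimum treated as the
    0th percentile and the maximum as the 100th ("inclusive" method).
    """
    cuts = quantiles(coefficients, n=10, method="inclusive")  # 9 cut points
    return dict(zip(range(10, 100, 10), cuts))

# Toy coefficients, evenly spaced from -0.5 to 0.5 (illustrative only).
toy_coefficients = [i / 100 - 0.5 for i in range(101)]
summary = decile_summary(toy_coefficients)
for percentile, value in summary.items():
    print(f"{percentile}th percentile: {value:.4f}")
```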
Table 1. Number of samples in the APOBEC_MutLoad_MinEstimate and mRNAseq expression data sets, and the number common to both
Category | APOBEC_MutLoad_MinEstimate | Expression | Common |
---|---|---|---|
Sample | 395 | 408 | 391 |
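The three counts in Table 1 correspond to the sizes of two sample sets and of their intersection. A minimal sketch of that bookkeeping follows; the function name `sample_counts` and the sample identifiers are hypothetical and stand in for the real sample barcodes.

```python
def sample_counts(apobec_ids, expression_ids):
    """Count samples in each data set and the samples common to both."""
    apobec = set(apobec_ids)
    expression = set(expression_ids)
    return {
        "APOBEC_MutLoad_MinEstimate": len(apobec),
        "Expression": len(expression),
        "Common": len(apobec & expression),  # set intersection
    }

# Hypothetical sample identifiers, for illustration only.
apobec_ids = ["S01", "S02", "S03", "S04"]
expression_ids = ["S02", "S03", "S04", "S05", "S06"]
counts = sample_counts(apobec_ids, expression_ids)
print(counts)
```

Only the samples in the "Common" set can contribute to a correlation coefficient, since both measurements are needed for each sample.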
Figure 1. Summary figures. Left: histogram showing the distribution of the calculated correlation coefficients across all genes. Right: QQ plot of the quantiles of the calculated correlation coefficients against the corresponding quantiles of a normal distribution. Points deviating from the blue line indicate deviation from normality.
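The coordinates behind a normal QQ plot like the right panel can be sketched as follows. This is an illustration under stated assumptions, not the code that produced Figure 1: the function name `normal_qq_points` is hypothetical, the reference normal is fitted to the sample mean and standard deviation, and the plotting positions (i + 0.5)/n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Return (theoretical quantile, observed quantile) pairs.

    Assumptions: the reference normal uses the sample mean and standard
    deviation; plotting positions are (i + 0.5) / n for i = 0..n-1.
    """
    observed = sorted(values)
    n = len(observed)
    fitted = NormalDist(mean(observed), stdev(observed))
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))

# Toy correlation coefficients (illustrative only).
toy = [0.10, -0.05, 0.02, -0.09, 0.00, 0.04, -0.02]
points = normal_qq_points(toy)
for theoretical_q, observed_q in points:
    print(f"{theoretical_q:+.4f}  {observed_q:+.4f}")
```

If the coefficients were normally distributed, these pairs would fall close to the identity line; systematic departures in the tails indicate heavier or lighter tails than a normal distribution.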

Table 2. Top 20 genes ranked by correlation coefficient
geneID | cor | p-value | q-value |
---|---|---|---|
OR52N2\|390077 | 0.3669 | 6.57697862171958e-06 | 0.00630524555761169 |
KIR2DS4\|3809 | 0.3596 | 7.71140280519944e-08 | 0.000702316010483539 |
KIR3DL1\|3811 | 0.3421 | 2.43114360065633e-06 | 0.00384652980721829 |
PRSS41\|360226 | 0.3346 | 4.42176917296866e-05 | 0.0162657057838258 |
COX7B2\|170712 | 0.3098 | 9.25224976682237e-05 | 0.0203489279352552 |
SSX4\|6759 | 0.301 | 0.000172630047932998 | 0.025634710508236 |
TTPA\|7274 | 0.3004 | 1.76764757942038e-06 | 0.00384652980721829 |
KIR2DL3\|3804 | 0.2958 | 9.36078987101574e-06 | 0.00741333858698051 |
KLRC4\|8302 | 0.289 | 1.83517766689789e-05 | 0.0098316944713368 |
SSX6\|280657 | 0.2788 | 0.000716291465384877 | 0.0433463423321779 |
KIAA1841\|84542 | 0.2774 | 2.42870705768894e-08 | 0.00044238899055804 |
TUBA3C\|7278 | 0.2766 | 0.000758138554261212 | 0.0446909183361423 |
KIR2DL1\|3802 | 0.2748 | 0.00036907202065839 | 0.0340842338879753 |
GPR12\|2835 | 0.2692 | 0.000116110247520229 | 0.0215811036589895 |
KLRC3\|3823 | 0.2631 | 3.17448657671804e-06 | 0.00384652980721829 |
CLNK\|116449 | 0.2623 | 0.000129318599570993 | 0.022453663760489 |
RFPL4B\|442247 | 0.2604 | 0.000818162410854306 | 0.0456210915579083 |
IFNG\|3458 | 0.258 | 3.3787799569307e-06 | 0.00384652980721829 |
APOC1\|341 | 0.2466 | 7.92513032532582e-07 | 0.00327716341308726 |
CSAG3\|389903 | 0.2464 | 2.99151906646422e-06 | 0.00384652980721829 |
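The report does not state how the q-values in Table 2 were derived from the p-values. As an illustration of what a q-value column typically represents, the sketch below applies the Benjamini-Hochberg procedure; this choice is an assumption and may differ from the method the pipeline used. The function name `benjamini_hochberg` and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: this is one common multiple-testing adjustment; the
    report does not say which method produced its q-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy p-values (illustrative only, not taken from Table 2).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
print(toy_q)
```

Because the adjustment depends on the rank of each p-value among all tested genes, it has to be computed over the full gene list and not just over the top 20 rows shown in the table.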
Gene-level (TCGA Level III) mRNAseq expression data and the APOBEC_MutLoad_MinEstimate values derived by the Mutation_APOBEC pipeline were used for this analysis. Pearson correlation coefficients were calculated between APOBEC_MutLoad_MinEstimate and the expression of each gene across all samples common to both data sets.
The correlations were computed with the pairwise.complete.obs option, so that for each gene only the samples with non-missing values for both variables contribute to the coefficient.
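The pairwise-complete calculation described above can be sketched in plain Python. This mimics the behaviour of the pairwise.complete.obs option (an option of that name exists in R's `cor` function) and is not the pipeline's own code. The function names, the gene labels `GENE_A` and `GENE_B`, and the toy values are hypothetical; missing values are assumed to be encoded as `None` or NaN.

```python
import math

def _present(value):
    """True if the value is neither None nor NaN."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))

def pearson_pairwise_complete(x, y):
    """Pearson r over positions where both x and y are present.

    Returns (r, n_used); r is NaN if fewer than 3 complete pairs remain
    or if either variable is constant over the complete pairs.
    """
    pairs = [(a, b) for a, b in zip(x, y) if _present(a) and _present(b)]
    n = len(pairs)
    if n < 3:
        return float("nan"), n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan"), n
    return sxy / math.sqrt(sxx * syy), n

# Toy data: one APOBEC estimate per sample, one expression vector per gene.
apobec = [1.0, 2.0, 3.0, 4.0, None]
expression = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],
    "GENE_B": [4.0, float("nan"), 2.0, 1.0, 0.5],
}
results = {gene: pearson_pairwise_complete(apobec, values)
           for gene, values in expression.items()}
for gene, (r, n_used) in results.items():
    print(f"{gene}: r={r:.4f} (n={n_used})")
```

One consequence of this approach is that the number of samples behind each coefficient can differ from gene to gene, so equal correlation coefficients do not necessarily carry equal p-values.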