This pipeline computes the correlation between APOBEC groups and selected clinical features.
Testing the association between APOBEC groups identified by 2 different APOBEC scores and 12 clinical features across 376 patients detected 5 significant findings with Q value < 0.25.
- 3 subtypes were identified in the current cancer cohort by 'APOBEC MUTLOAD MINESTIMATE'. These subtypes correlate with 'Time to Death' and 'RACE'.
- 3 subtypes were identified in the current cancer cohort by 'APOBEC ENRICH'. These subtypes correlate with 'Time to Death', 'PATHOLOGY_T_STAGE', and 'RACE'.
Table 1. Overview of the association between APOBEC groups defined by 2 different APOBEC scores and 12 clinical features. Shown in the table are P values (Q values). Thresholded by Q value < 0.25, 5 significant findings were detected.
Clinical Features | Statistical Tests | APOBEC MUTLOAD MINESTIMATE | APOBEC ENRICH |
---|---|---|---|
Time to Death | logrank test | 0.0119 (0.0714) | 0.00505 (0.0404) |
YEARS TO BIRTH | Kruskal-Wallis (anova) | 0.764 (0.873) | 0.639 (0.767) |
NEOPLASM DISEASESTAGE | Fisher's exact test | 0.214 (0.348) | 0.243 (0.364) |
PATHOLOGY T STAGE | Fisher's exact test | 0.106 (0.289) | 0.0342 (0.164) |
PATHOLOGY N STAGE | Fisher's exact test | 0.17 (0.335) | 0.168 (0.335) |
PATHOLOGY M STAGE | Fisher's exact test | 0.547 (0.692) | 0.0995 (0.289) |
GENDER | Fisher's exact test | 0.122 (0.293) | 0.845 (0.921) |
KARNOFSKY PERFORMANCE SCORE | Kruskal-Wallis (anova) | 0.182 (0.335) | 0.444 (0.606) |
NUMBER PACK YEARS SMOKED | Kruskal-Wallis (anova) | 0.217 (0.348) | 0.455 (0.606) |
NUMBER OF LYMPH NODES | Kruskal-Wallis (anova) | 0.064 (0.256) | 0.108 (0.289) |
RACE | Fisher's exact test | 0.00034 (0.00408) | 0.00016 (0.00384) |
ETHNICITY | Fisher's exact test | 1 (1.00) | 1 (1.00) |
Table S1. Description of APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'
Cluster Labels | 0 | HIGH | LOW |
---|---|---|---|
Number of samples | 46 | 126 | 204 |
P value = 0.0119 (logrank test), Q value = 0.071
Table S2. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #1: 'Time to Death'
Cluster Labels | nPatients | nDeath | Duration Range (Median), Month |
---|---|---|---|
ALL | 370 | 151 | 0.1 - 166.0 (15.6) |
0 | 43 | 22 | 2.0 - 93.0 (13.0) |
HIGH | 126 | 41 | 0.4 - 166.0 (17.6) |
LOW | 201 | 88 | 0.1 - 140.8 (15.3) |
Figure S1. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #1: 'Time to Death'

P value = 0.00034 (Fisher's exact test), Q value = 0.0041
Table S3. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'
nPatients | ASIAN | BLACK OR AFRICAN AMERICAN | WHITE |
---|---|---|---|
ALL | 38 | 21 | 301 |
0 | 13 | 5 | 26 |
HIGH | 8 | 5 | 110 |
LOW | 17 | 11 | 165 |
Figure S2. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'

Table S4. Description of APOBEC group #2: 'APOBEC ENRICH'
Cluster Labels | FC.HIGH.SIG | FC.LOW.NONSIG | FC.NEUTRAL |
---|---|---|---|
Number of samples | 315 | 46 | 15 |
P value = 0.00505 (logrank test), Q value = 0.04
Table S5. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #1: 'Time to Death'
Cluster Labels | nPatients | nDeath | Duration Range (Median), Month |
---|---|---|---|
ALL | 370 | 151 | 0.1 - 166.0 (15.6) |
FC.HIGH.SIG | 312 | 120 | 0.4 - 166.0 (16.5) |
FC.LOW.NONSIG | 43 | 22 | 2.0 - 93.0 (13.0) |
FC.NEUTRAL | 15 | 9 | 0.1 - 30.9 (12.9) |
Figure S3. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #1: 'Time to Death'

P value = 0.0342 (Fisher's exact test), Q value = 0.16
Table S6. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #4: 'PATHOLOGY_T_STAGE'
nPatients | T0+T1+T2 | T3 | T4 |
---|---|---|---|
ALL | 114 | 179 | 52 |
FC.HIGH.SIG | 89 | 158 | 41 |
FC.LOW.NONSIG | 22 | 15 | 7 |
FC.NEUTRAL | 3 | 6 | 4 |
Figure S4. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #4: 'PATHOLOGY_T_STAGE'

P value = 0.00016 (Fisher's exact test), Q value = 0.0038
Table S7. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'
nPatients | ASIAN | BLACK OR AFRICAN AMERICAN | WHITE |
---|---|---|---|
ALL | 38 | 21 | 301 |
FC.HIGH.SIG | 23 | 16 | 264 |
FC.LOW.NONSIG | 13 | 5 | 26 |
FC.NEUTRAL | 2 | 0 | 11 |
Figure S5. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'

- APOBEC groups file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/APOBEC_Pipelines/BLCA-TP/15185889/APOBEC_clinical_corr_input_15190444/APOBEC_for_clinical.correlaion.input.categorical.txt
- Clinical data file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/Append_Data/BLCA-TP/15076720/BLCA-TP.merged_data.txt
- Number of patients = 376
- Number of selected clinical features = 12
APOBEC classification based on APOBEC_MutLoad_MinEstimate: a. no-APOBEC group: samples with a zero value; b. APOBEC high group: samples above the median value among non-zero samples; c. APOBEC low group: samples below the median value among non-zero samples.
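The three-way split on APOBEC_MutLoad_MinEstimate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline's own code: the function name `classify_mutload` is hypothetical, the returned labels mirror the cluster labels in Table S1 ('0', 'HIGH', 'LOW'), and a value exactly equal to the median is assigned to LOW because the text does not say how ties are handled.

```python
from statistics import median


def classify_mutload(values):
    """Label each sample '0', 'HIGH', or 'LOW' from its MutLoad_MinEstimate value.

    Assumption: a value equal to the median of the non-zero samples goes to
    'LOW'; the report does not specify tie handling.
    """
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        # Every sample has a zero value, so there is no median to split on.
        return ["0"] * len(values)
    cutoff = median(nonzero)  # median is taken over non-zero samples only
    labels = []
    for v in values:
        if v == 0:
            labels.append("0")
        elif v > cutoff:
            labels.append("HIGH")
        else:
            labels.append("LOW")
    return labels


# Toy values, invented for illustration only.
print(classify_mutload([0, 1, 2, 3, 10]))
```

Taking the median over the non-zero samples only, as the text describes, keeps the zero-valued samples from pulling the HIGH/LOW cutoff down.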
APOBEC classification based on APOBEC_enrich: a. No enrichment group: all samples with BH_Fisher_p-value_tCw >= 0.05; b. Small enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich <= 2; c. High enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich > 2.
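The enrichment-based rule can be written as a small decision function. The sketch below is illustrative only: `classify_enrichment` and its parameter names are hypothetical, the returned strings are descriptive and are not the cluster labels used in Table S4, and a corrected P value of exactly 0.05 is treated as "no enrichment" (an assumption, because the stated thresholds overlap at that point).

```python
def classify_enrichment(bh_p_value, enrich, alpha=0.05, fold_cutoff=2.0):
    """Assign a sample to an enrichment group.

    bh_p_value  -- the sample's BH_Fisher_p-value_tCw
    enrich      -- the sample's APOBEC_enrich value
    alpha       -- significance cutoff (0.05 in the report)
    fold_cutoff -- enrichment cutoff (2 in the report)

    Assumption: bh_p_value == alpha counts as "no enrichment".
    """
    if bh_p_value >= alpha:
        return "no enrichment"
    if enrich <= fold_cutoff:
        return "small enrichment"
    return "high enrichment"


# Toy inputs, invented for illustration only.
for p, e in [(0.20, 3.0), (0.01, 1.5), (0.01, 2.5)]:
    print(p, e, classify_enrichment(p, e))
```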
For survival clinical features, Kaplan-Meier survival curves of the tumors in each APOBEC group were plotted, and the P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
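To make the logrank comparison concrete, here is a minimal Python sketch of the test for two groups. It is a simplification under stated assumptions and not a reimplementation of R's 'survdiff': the report compares three groups, which needs a chi-square statistic with 2 degrees of freedom, whereas this version handles only two groups (1 degree of freedom). The function name and the toy survival times are hypothetical.

```python
from math import erfc, sqrt


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi-square statistic, P value).

    times_*  -- follow-up durations
    events_* -- 1 if the patient died at that time, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the death count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy data, invented for illustration: group A dies early, group B late.
chi2, p = logrank_two_groups([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(round(chi2, 2), p)
```

Censored patients contribute to the at-risk counts up to their last follow-up time but never to the death counts.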
For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
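For a 2x2 table, the two-tailed Fisher's exact test can be computed directly from the hypergeometric distribution. The Python sketch below is an illustration under stated assumptions, not the pipeline's code: the function name is hypothetical, the counts are invented, and only the 2x2 case is covered, whereas the tables in this report have three groups. The two-sided P value sums the probabilities of all tables that are no more likely than the observed one, which is the usual definition.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-tailed Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)


# Invented counts for illustration only.
print(fisher_exact_2x2(3, 1, 1, 3))  # 34/70, about 0.486
```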
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
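The Benjamini-Hochberg conversion is short enough to sketch in Python. The function name `bh_adjust` is hypothetical, and this is an illustration of the standard step-up procedure rather than the pipeline's code. The input is the set of 24 rounded P values from Table 1 (12 features for each of the 2 scores), so the resulting Q values should agree with the reported ones only up to rounding.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    m = len(p_values)
    # Visit the P values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Rounded P values copied from Table 1: first the 'APOBEC MUTLOAD MINESTIMATE'
# column, then the 'APOBEC ENRICH' column, in table row order.
P_VALUES = [
    0.0119, 0.764, 0.214, 0.106, 0.17, 0.547,
    0.122, 0.182, 0.217, 0.064, 0.00034, 1,
    0.00505, 0.639, 0.243, 0.0342, 0.168, 0.0995,
    0.845, 0.444, 0.455, 0.108, 0.00016, 1,
]

q = bh_adjust(P_VALUES)
print(sum(v < 0.25 for v in q))  # 5 findings pass the Q value < 0.25 threshold
```

The running minimum, applied from the largest P value downward, is what keeps the Q values monotone, so a smaller P value never receives a larger Q value than a bigger one.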