Correlation between APOBEC groups and selected clinical features
Bladder Urothelial Carcinoma (Primary solid tumor)
28 January 2016  |  analyses__2016_01_28
Maintainer Information
Maintained by Hailei Zhang (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2016): Correlation between APOBEC groups and selected clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C19K49J4
Overview
Introduction

This pipeline computes the correlation between APOBEC groups and selected clinical features.

Summary

Testing the association between APOBEC groups identified by 2 different APOBEC scores and 13 clinical features across 395 patients, 8 significant findings were detected with Q value < 0.25.

  • 3 subtypes identified in current cancer cohort by 'APOBEC MUTLOAD MINESTIMATE'. These subtypes correlate to 'Time to Death',  'GENDER',  'RADIATION_THERAPY',  'NUMBER_OF_LYMPH_NODES', and 'RACE'.

  • 3 subtypes identified in current cancer cohort by 'APOBEC ENRICH'. These subtypes correlate to 'Time to Death',  'RADIATION_THERAPY', and 'RACE'.

Results
Overview of the results

Table 1.  Overview of the association between APOBEC groups by 2 different APOBEC scores and 13 clinical features. Shown in the table are P values (Q values). Thresholded by Q value < 0.25, 8 significant findings were detected.

Clinical Features            Statistical Tests       APOBEC MUTLOAD MINESTIMATE  APOBEC ENRICH
Time to Death                logrank test            0.000892 (0.00773)          0.0188 (0.092)
YEARS TO BIRTH               Kruskal-Wallis (anova)  0.637 (0.772)               0.611 (0.772)
PATHOLOGIC STAGE             Fisher's exact test     0.121 (0.337)               0.365 (0.635)
PATHOLOGY T STAGE            Fisher's exact test     0.381 (0.635)               0.135 (0.337)
PATHOLOGY N STAGE            Fisher's exact test     0.143 (0.337)               0.552 (0.755)
PATHOLOGY M STAGE            Fisher's exact test     0.907 (0.982)               0.216 (0.468)
GENDER                       Fisher's exact test     0.0248 (0.092)              0.653 (0.772)
RADIATION THERAPY            Fisher's exact test     0.0227 (0.092)              0.00376 (0.0244)
KARNOFSKY PERFORMANCE SCORE  Kruskal-Wallis (anova)  0.808 (0.914)               0.488 (0.704)
NUMBER PACK YEARS SMOKED     Kruskal-Wallis (anova)  0.39 (0.635)                0.391 (0.635)
NUMBER OF LYMPH NODES        Kruskal-Wallis (anova)  0.0501 (0.163)              0.454 (0.695)
RACE                         Fisher's exact test     0.00022 (0.00286)           4e-05 (0.00104)
ETHNICITY                    Fisher's exact test     1 (1.00)                    1 (1.00)

APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Table S1.  Description of APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Cluster Labels 0 HIGH LOW
Number of samples 46 172 177
'APOBEC MUTLOAD MINESTIMATE' versus 'Time to Death'

P value = 0.000892 (logrank test), Q value = 0.0077

Table S2.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 392 175 0.6 - 166.0 (17.6)
0 45 27 2.0 - 93.0 (13.6)
HIGH 171 62 0.6 - 166.0 (19.8)
LOW 176 86 0.6 - 165.7 (16.1)

Figure S1.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #1: 'Time to Death'

'APOBEC MUTLOAD MINESTIMATE' versus 'GENDER'

P value = 0.0248 (Fisher's exact test), Q value = 0.092

Table S3.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #7: 'GENDER'

nPatients FEMALE MALE
ALL 102 293
0 13 33
HIGH 33 139
LOW 56 121

Figure S2.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #7: 'GENDER'

'APOBEC MUTLOAD MINESTIMATE' versus 'RADIATION_THERAPY'

P value = 0.0227 (Fisher's exact test), Q value = 0.092

Table S4.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #8: 'RADIATION_THERAPY'

nPatients NO YES
ALL 350 20
0 35 6
HIGH 156 5
LOW 159 9

Figure S3.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #8: 'RADIATION_THERAPY'

'APOBEC MUTLOAD MINESTIMATE' versus 'NUMBER_OF_LYMPH_NODES'

P value = 0.0501 (Kruskal-Wallis (anova)), Q value = 0.16

Table S5.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'NUMBER_OF_LYMPH_NODES'

nPatients Mean (Std.Dev)
ALL 288 1.8 (4.4)
0 29 0.9 (1.7)
HIGH 128 1.7 (4.8)
LOW 131 2.1 (4.3)

Figure S4.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'NUMBER_OF_LYMPH_NODES'

'APOBEC MUTLOAD MINESTIMATE' versus 'RACE'

P value = 0.00022 (Fisher's exact test), Q value = 0.0029

Table S6.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #12: 'RACE'

nPatients ASIAN BLACK OR AFRICAN AMERICAN WHITE
ALL 39 22 317
0 13 4 27
HIGH 9 7 150
LOW 17 11 140

Figure S5.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #12: 'RACE'

APOBEC group #2: 'APOBEC ENRICH'

Table S7.  Description of APOBEC group #2: 'APOBEC ENRICH'

Cluster Labels FC.HIGH.ENRICH FC.LOW.ENRICH FC.NO.ENRICH
Number of samples 327 22 46
'APOBEC ENRICH' versus 'Time to Death'

P value = 0.0188 (logrank test), Q value = 0.092

Table S8.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 392 175 0.6 - 166.0 (17.6)
FC.HIGH.ENRICH 325 137 0.6 - 166.0 (17.9)
FC.LOW.ENRICH 22 11 1.9 - 52.0 (13.7)
FC.NO.ENRICH 45 27 2.0 - 93.0 (13.6)

Figure S6.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #1: 'Time to Death'

'APOBEC ENRICH' versus 'RADIATION_THERAPY'

P value = 0.00376 (Fisher's exact test), Q value = 0.024

Table S9.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #8: 'RADIATION_THERAPY'

nPatients NO YES
ALL 350 20
FC.HIGH.ENRICH 297 11
FC.LOW.ENRICH 18 3
FC.NO.ENRICH 35 6

Figure S7.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #8: 'RADIATION_THERAPY'

'APOBEC ENRICH' versus 'RACE'

P value = 4e-05 (Fisher's exact test), Q value = 0.001

Table S10.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #12: 'RACE'

nPatients ASIAN BLACK OR AFRICAN AMERICAN WHITE
ALL 39 22 317
FC.HIGH.ENRICH 22 16 276
FC.LOW.ENRICH 4 2 14
FC.NO.ENRICH 13 4 27

Figure S8.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #12: 'RACE'

Methods & Data
Input
  • APOBEC groups file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/APOBEC_Pipelines/BLCA-TP/22506785/APOBEC_clinical_corr_input_22536864/APOBEC_for_clinical.correlaion.input.categorical.txt

  • Clinical data file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/Append_Data/BLCA-TP/22506467/BLCA-TP.merged_data.txt

  • Number of patients = 395

  • Number of selected clinical features = 13

APOBEC classification

APOBEC classification based on APOBEC_MutLoad_MinEstimate: (a) APOBEC non group: samples with a value of zero; (b) APOBEC high group: samples above the median of the non-zero samples; (c) APOBEC low group: samples below the median of the non-zero samples.
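
The MutLoad rule above can be sketched in a few lines of Python. This is an illustration only, not the pipeline's code: the function name `classify_mutload` is hypothetical, the labels are taken from Table S1, and the handling of a sample exactly equal to the median (placed in LOW here) is an assumption, since the rule does not say which group such a sample belongs to.

```python
from statistics import median

def classify_mutload(values):
    """Assign '0', 'HIGH' or 'LOW' to each APOBEC_MutLoad_MinEstimate value."""
    nonzero = [v for v in values if v != 0]
    # The cutoff is the median of the non-zero samples only.
    cutoff = median(nonzero) if nonzero else None
    labels = []
    for v in values:
        if v == 0:
            labels.append("0")
        elif v > cutoff:
            labels.append("HIGH")
        else:
            # Assumption: a value equal to the median is placed in LOW.
            labels.append("LOW")
    return labels

# Hypothetical scores, for illustration only.
example_labels = classify_mutload([0, 1, 2, 3, 4, 0])
```

With an even number of non-zero samples the median falls between two values, so the tie case never arises; with an odd number, exactly one sample sits on the cutoff.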

APOBEC classification based on APOBEC_enrich: (a) No enrichment group: all samples with BH_Fisher_p-value_tCw > 0.05; (b) Low enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich <= 2; (c) High enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich > 2.
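
The enrichment rule above reduces to two threshold checks per sample. The sketch below is a minimal Python illustration under that reading; the function name `classify_enrich` and its argument names are hypothetical, and the returned labels are the cluster labels shown in Table S7.

```python
def classify_enrich(bh_fisher_p, apobec_enrich):
    """Map one sample to an APOBEC enrichment group.

    bh_fisher_p   -- the sample's BH_Fisher_p-value_tCw
    apobec_enrich -- the sample's APOBEC_enrich score
    """
    if bh_fisher_p > 0.05:
        # The enrichment is not significant, whatever its size.
        return "FC.NO.ENRICH"
    if apobec_enrich > 2:
        return "FC.HIGH.ENRICH"
    # Significant, but the enrichment score is 2 or less.
    return "FC.LOW.ENRICH"

# Hypothetical samples, for illustration only.
groups = [classify_enrich(p, e) for p, e in [(0.20, 3.0), (0.01, 1.5), (0.01, 2.5)]]
```

Note that the significance check comes first, so a sample with a large enrichment score but an adjusted P value above 0.05 still lands in the no-enrichment group.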

Survival analysis

For survival clinical features, Kaplan-Meier survival curves of the APOBEC groups were plotted, and P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
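
To make the logrank statistic concrete, here is a minimal pure-Python sketch of the two-group case. It is not the pipeline's implementation, which calls R's 'survdiff' and compares three groups; the function name and the toy follow-up data are hypothetical. At each distinct death time the observed deaths in group A are compared with the number expected if both groups shared one hazard, and the summed difference is scaled by its hypergeometric variance to give a chi-square statistic with one degree of freedom.

```python
from math import erfc, sqrt

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi-square, P value) for a two-group logrank test.

    events_* hold 1 for an observed death and 0 for a censored follow-up.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    death_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk in B
        n = n_a + n_b
        d = sum(1 for u, e, _ in data if u == t and e)          # deaths at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Toy data (hypothetical): group A dies early, group B dies late.
chi2, p_value = logrank_two_groups([1, 2, 3, 4, 5], [1] * 5,
                                   [6, 7, 8, 9, 10], [1] * 5)
```

With more than two groups the statistic becomes a quadratic form in the vector of observed-minus-expected counts, with one fewer degree of freedom than the number of groups.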

Fisher's exact test

For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
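
For a table with several groups and a binary feature, the two-tailed exact test can be sketched by enumerating every table with the observed margins and summing the probabilities of those no more likely than the observed one. The Python below is a sketch under that definition, not the pipeline's code; the function name `fisher_exact_rx2` is hypothetical, and the counts are copied from Table S3 ('APOBEC MUTLOAD MINESTIMATE' versus 'GENDER'). It uses exact integer arithmetic, whereas R's 'fisher.test' applies a small relative tolerance when comparing probabilities, so results may differ marginally.

```python
from math import comb

def fisher_exact_rx2(table):
    """Two-tailed Fisher's exact P value for an r x 2 contingency table."""
    rows = [a + b for a, b in table]
    col1 = sum(a for a, _ in table)

    def weight(first_col):
        # Proportional to the probability of a table with fixed margins.
        w = 1
        for a, r in zip(first_col, rows):
            w *= comb(r, a)
        return w

    observed = weight([a for a, _ in table])
    extreme = 0

    def walk(i, remaining, w):
        nonlocal extreme
        if i == len(rows) - 1:
            # The last row's first cell is fixed by the column total.
            if remaining <= rows[i]:
                w_full = w * comb(rows[i], remaining)
                if w_full <= observed:
                    extreme += w_full
            return
        for a in range(min(rows[i], remaining) + 1):
            walk(i + 1, remaining - a, w * comb(rows[i], a))

    walk(0, col1, 1)
    return extreme / comb(sum(rows), col1)

# Rows: groups 0, HIGH, LOW; columns: FEMALE, MALE (Table S3).
gender_table = [[13, 33], [33, 139], [56, 121]]
gender_p = fisher_exact_rx2(gender_table)
```

If the counts in Table S3 are exactly what the pipeline tested, `gender_p` should land close to the reported P value of 0.0248. Features with more than two categories, such as 'RACE', need the same enumeration over more columns, which this sketch does not cover.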

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
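
The Benjamini-Hochberg adjustment is short enough to write out. The Python sketch below is a stand-in for R's 'p.adjust', not the pipeline's code, and the function name `bh_adjust` is hypothetical. It is applied to the 26 P values of Table 1, on the assumption that both APOBEC scores were corrected together as a single family of tests. Each P value is multiplied by n / rank, and a running minimum taken from the largest P value downward keeps the Q values monotone.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg Q values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value to the smallest.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values

# P values from Table 1, in row order (as rounded in the report).
mutload_p = [0.000892, 0.637, 0.121, 0.381, 0.143, 0.907, 0.0248,
             0.0227, 0.808, 0.39, 0.0501, 0.00022, 1.0]
enrich_p = [0.0188, 0.611, 0.365, 0.135, 0.552, 0.216, 0.653,
            0.00376, 0.488, 0.391, 0.454, 4e-05, 1.0]

q_values = bh_adjust(mutload_p + enrich_p)
n_significant = sum(q < 0.25 for q in q_values)
```

Under that assumption, `n_significant` comes out as 8, matching the summary. Because the inputs are the rounded P values printed in the report, individual Q values can differ from Table 1 in the last digit.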

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References
[1] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)
[2] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)
[3] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57(1):289-300 (1995)