Correlation between APOBEC groups and selected clinical features
Breast Invasive Carcinoma (Primary solid tumor)
02 April 2015  |  analyses__2015_04_02
Maintainer Information
Maintained by Hailei Zhang (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2015): Correlation between APOBEC groups and selected clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C1ZW1JWC
Overview
Introduction

This pipeline computes the correlation between APOBEC groups and selected clinical features.

Summary

We tested the association between APOBEC groups identified by 2 different APOBEC scores and 12 clinical features across 967 patients; 7 significant findings were detected with Q value < 0.25.

  • 3 subtypes were identified in the current cancer cohort by 'APOBEC MUTLOAD MINESTIMATE'. These subtypes correlate with 'YEARS_TO_BIRTH', 'HISTOLOGICAL_TYPE', 'RADIATIONS_RADIATION_REGIMENINDICATION', and 'RACE'.

  • 3 subtypes were identified in the current cancer cohort by 'APOBEC ENRICH'. These subtypes correlate with 'HISTOLOGICAL_TYPE', 'RADIATIONS_RADIATION_REGIMENINDICATION', and 'RACE'.

Results
Overview of the results

Table 1.  Overview of the association between APOBEC groups by 2 different APOBEC scores and 12 clinical features. Shown in the table are P values (Q values). Thresholded by Q value < 0.25, 7 significant findings were detected.

Clinical Features | Statistical Tests | APOBEC MUTLOAD MINESTIMATE | APOBEC ENRICH
Time to Death | logrank test | 0.184 (0.481) | 0.618 (0.872)
YEARS TO BIRTH | Kruskal-Wallis (anova) | 0.0568 (0.195) | 0.875 (0.913)
NEOPLASM DISEASESTAGE | Fisher's exact test | 0.727 (0.913) | 0.802 (0.913)
PATHOLOGY T STAGE | Fisher's exact test | 0.358 (0.573) | 0.341 (0.573)
PATHOLOGY N STAGE | Fisher's exact test | 0.284 (0.525) | 0.81 (0.913)
PATHOLOGY M STAGE | Fisher's exact test | 0.579 (0.869) | 0.662 (0.882)
GENDER | Fisher's exact test | 0.237 (0.481) | 1 (1.00)
HISTOLOGICAL TYPE | Fisher's exact test | 6e-05 (0.00144) | 0.00019 (0.00228)
RADIATIONS RADIATION REGIMENINDICATION | Fisher's exact test | 0.0013 (0.00749) | 0.00156 (0.00749)
NUMBER OF LYMPH NODES | Kruskal-Wallis (anova) | 0.174 (0.481) | 0.241 (0.481)
RACE | Fisher's exact test | 0.00407 (0.0163) | 0.00034 (0.00272)
ETHNICITY | Fisher's exact test | 0.862 (0.913) | 0.208 (0.481)

APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Table S1.  Description of APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Cluster Labels 0 HIGH LOW
Number of samples 745 129 93
'APOBEC MUTLOAD MINESTIMATE' versus 'YEARS_TO_BIRTH'

P value = 0.0568 (Kruskal-Wallis (anova)), Q value = 0.19

Table S2.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #2: 'YEARS_TO_BIRTH'

nPatients Mean (Std.Dev)
ALL 954 58.7 (13.1)
0 736 58.7 (13.1)
HIGH 126 60.6 (13.8)
LOW 92 56.0 (12.0)

Figure S1.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #2: 'YEARS_TO_BIRTH'

'APOBEC MUTLOAD MINESTIMATE' versus 'HISTOLOGICAL_TYPE'

P value = 6e-05 (Fisher's exact test), Q value = 0.0014

Table S3.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #8: 'HISTOLOGICAL_TYPE'

nPatients | INFILTRATING CARCINOMA NOS | INFILTRATING DUCTAL CARCINOMA | INFILTRATING LOBULAR CARCINOMA | MEDULLARY CARCINOMA | METAPLASTIC CARCINOMA | MIXED HISTOLOGY (PLEASE SPECIFY) | MUCINOUS CARCINOMA | OTHER SPECIFY
ALL | 1 | 710 | 164 | 5 | 1 | 27 | 14 | 44
0 | 1 | 555 | 111 | 1 | 0 | 23 | 14 | 40
HIGH | 0 | 86 | 39 | 2 | 0 | 1 | 0 | 1
LOW | 0 | 69 | 14 | 2 | 1 | 3 | 0 | 3

Figure S2.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #8: 'HISTOLOGICAL_TYPE'

'APOBEC MUTLOAD MINESTIMATE' versus 'RADIATIONS_RADIATION_REGIMENINDICATION'

P value = 0.0013 (Fisher's exact test), Q value = 0.0075

Table S4.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #9: 'RADIATIONS_RADIATION_REGIMENINDICATION'

nPatients NO YES
ALL 282 685
0 236 509
HIGH 21 108
LOW 25 68

Figure S3.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #9: 'RADIATIONS_RADIATION_REGIMENINDICATION'

'APOBEC MUTLOAD MINESTIMATE' versus 'RACE'

P value = 0.00407 (Fisher's exact test), Q value = 0.016

Table S5.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'

nPatients | AMERICAN INDIAN OR ALASKA NATIVE | ASIAN | BLACK OR AFRICAN AMERICAN | WHITE
ALL | 1 | 57 | 116 | 698
0 | 1 | 34 | 82 | 556
HIGH | 0 | 16 | 18 | 82
LOW | 0 | 7 | 16 | 60

Figure S4.  Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'

APOBEC group #2: 'APOBEC ENRICH'

Table S6.  Description of APOBEC group #2: 'APOBEC ENRICH'

Cluster Labels FC.HIGH.SIG FC.LOW.NONSIG FC.NEUTRAL
Number of samples 213 645 109
'APOBEC ENRICH' versus 'HISTOLOGICAL_TYPE'

P value = 0.00019 (Fisher's exact test), Q value = 0.0023

Table S7.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #8: 'HISTOLOGICAL_TYPE'

nPatients | INFILTRATING CARCINOMA NOS | INFILTRATING DUCTAL CARCINOMA | INFILTRATING LOBULAR CARCINOMA | MEDULLARY CARCINOMA | METAPLASTIC CARCINOMA | MIXED HISTOLOGY (PLEASE SPECIFY) | MUCINOUS CARCINOMA | OTHER SPECIFY
ALL | 1 | 710 | 164 | 5 | 1 | 27 | 14 | 44
FC.HIGH.SIG | 0 | 148 | 52 | 4 | 1 | 4 | 0 | 3
FC.LOW.NONSIG | 1 | 487 | 90 | 1 | 0 | 20 | 13 | 33
FC.NEUTRAL | 0 | 75 | 22 | 0 | 0 | 3 | 1 | 8

Figure S5.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #8: 'HISTOLOGICAL_TYPE'

'APOBEC ENRICH' versus 'RADIATIONS_RADIATION_REGIMENINDICATION'

P value = 0.00156 (Fisher's exact test), Q value = 0.0075

Table S8.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #9: 'RADIATIONS_RADIATION_REGIMENINDICATION'

nPatients NO YES
ALL 282 685
FC.HIGH.SIG 42 171
FC.LOW.NONSIG 208 437
FC.NEUTRAL 32 77

Figure S6.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #9: 'RADIATIONS_RADIATION_REGIMENINDICATION'

'APOBEC ENRICH' versus 'RACE'

P value = 0.00034 (Fisher's exact test), Q value = 0.0027

Table S9.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'

nPatients | AMERICAN INDIAN OR ALASKA NATIVE | ASIAN | BLACK OR AFRICAN AMERICAN | WHITE
ALL | 1 | 57 | 116 | 698
FC.HIGH.SIG | 0 | 23 | 34 | 133
FC.LOW.NONSIG | 0 | 31 | 73 | 477
FC.NEUTRAL | 1 | 3 | 9 | 88

Figure S7.  Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'

Methods & Data
Input
  • APOBEC groups file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/APOBEC_Pipelines/BRCA-TP/15165570/APOBEC_clinical_corr_input_15169895/APOBEC_for_clinical.correlaion.input.categorical.txt

  • Clinical data file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/Append_Data/BRCA-TP/15076769/BRCA-TP.merged_data.txt

  • Number of patients = 967

  • Number of selected clinical features = 12

APOBEC classification

APOBEC classification based on APOBEC_MutLoad_MinEstimate: a. APOBEC none group (cluster label '0') -- samples with zero value, b. APOBEC high group -- samples above the median value of the non-zero samples, c. APOBEC low group -- samples below the median value of the non-zero samples.

APOBEC classification based on APOBEC_enrich: a. No enrichment group -- all samples with BH_Fisher_p-value_tCw >= 0.05, b. Small enrichment group -- samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich <= 2, c. High enrichment group -- samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich > 2.
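
The two grouping rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's own code: the function names, the descriptive enrichment labels, and the toy values are hypothetical. The source does not say how a value exactly equal to the median is handled, and its enrichment rules overlap at a BH-adjusted P value of exactly 0.05, so the tie-breaking choices below are assumptions.

```python
from statistics import median


def classify_mutload(values):
    """Label each sample '0', 'HIGH' or 'LOW' from its APOBEC_MutLoad_MinEstimate."""
    nonzero = [v for v in values if v != 0]
    cutoff = median(nonzero) if nonzero else None
    labels = []
    for v in values:
        if v == 0:
            labels.append("0")        # zero-value samples form their own group
        elif v > cutoff:
            labels.append("HIGH")     # above the median of the non-zero samples
        else:
            # Assumption: a value exactly equal to the median is assigned to LOW.
            labels.append("LOW")
    return labels


def classify_enrich(bh_p, enrich, alpha=0.05, fold=2.0):
    """Label one sample from its BH-adjusted Fisher P value and APOBEC_enrich score."""
    # Assumption: at bh_p == alpha the 'no enrichment' rule takes precedence.
    if bh_p >= alpha:
        return "no enrichment"
    return "high enrichment" if enrich > fold else "small enrichment"


# Toy values for illustration only (not taken from the cohort).
toy_labels = classify_mutload([0, 0, 3, 10, 5, 1])
```

With the toy input, the non-zero values are 3, 10, 5 and 1, so the median cutoff is 4 and the samples with values 10 and 5 fall into the HIGH group.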

Survival analysis

For survival clinical features, the Kaplan-Meier survival curves of tumors in each APOBEC group were plotted, and the statistical significance (P value) was estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
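
To make the logrank statistic concrete, the following Python sketch computes it for two groups. It is a simplification under stated assumptions: the report itself used R's 'survdiff' and compared three groups, whereas this hypothetical function `logrank_two_groups` handles only two, and the follow-up times shown are invented toy data.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group logrank test.

    events: 1 = death observed, 0 = censored.
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # deaths at t, both groups
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A at time t
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_sq = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_sq / 2))  # upper tail of chi-square, 1 df
    return chi_sq, p_value


# Toy follow-up times (hypothetical): group A dies early, group B late.
chi_sq, p_value = logrank_two_groups([1, 2, 3, 4], [1, 1, 1, 1],
                                     [10, 11, 12, 13], [1, 1, 1, 1])
```

Extending this to three or more groups requires a vector of observed-minus-expected counts and a covariance matrix, which is omitted here for brevity.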

Fisher's exact test

For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
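
As an illustration of the two-tailed test, here is a minimal Python sketch for a 2x2 table. The function `fisher_exact_2x2` is hypothetical and is not the report's implementation, which used R's 'fisher.test'. The example table takes only the HIGH and LOW rows of Table S4, so it is a different test from the 3x2 comparison reported there and is not expected to reproduce that P value.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# HIGH vs LOW rows of Table S4 (radiation regimen indication: NO, YES).
p_high_vs_low = fisher_exact_2x2(21, 108, 25, 68)
```

The two-sided P value is defined here as the total probability of all tables, with the same margins, that are at most as probable as the observed table.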

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
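
The Benjamini-Hochberg adjustment can be sketched in plain Python as follows. The function name `bh_adjust` is hypothetical; the report itself used R's 'p.adjust'. The input is the set of 24 rounded P values from Table 1, so the resulting Q values may differ from the reported ones in the last digit.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), returned in input order."""
    m = len(p_values)
    # Walk from the largest P value to the smallest, keeping a running minimum
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of p_values[i] in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# P values of Table 1, row by row (MUTLOAD MINESTIMATE, then ENRICH).
p_table1 = [
    0.184, 0.618, 0.0568, 0.875, 0.727, 0.802, 0.358, 0.341,
    0.284, 0.81, 0.579, 0.662, 0.237, 1.0, 6e-05, 0.00019,
    0.0013, 0.00156, 0.174, 0.241, 0.00407, 0.00034, 0.862, 0.208,
]
q_table1 = bh_adjust(p_table1)
n_significant = sum(q < 0.25 for q in q_table1)  # 7, matching the report's count
```

Each Q value is the P value multiplied by the number of tests and divided by its rank, then capped so that Q values never decrease as P values increase.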

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References
[1] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)
[2] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)
[3] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57(1):289-300 (1995)