Correlation between APOBEC groups and selected clinical features

Breast Invasive Carcinoma (Primary solid tumor)

21 August 2015 | analyses__2015_08_21

Maintainer Information

Maintained by Hailei Zhang (Broad Institute)

Citation Information

Cite as Broad Institute TCGA Genome Data Analysis Center (2015): Correlation between APOBEC groups and selected clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C11R6PPG

Overview

Introduction

This pipeline computes the correlation between APOBEC groups and selected clinical features.

Summary

We tested the association between APOBEC groups identified by 2 different APOBEC scores and 12 clinical features across 977 patients; 5 significant findings were detected with Q value < 0.25.

3 subtypes were identified in the current cancer cohort by 'APOBEC MUTLOAD MINESTIMATE'. These subtypes correlate with 'HISTOLOGICAL_TYPE' and 'RACE'.
3 subtypes were identified in the current cancer cohort by 'APOBEC ENRICH'. These subtypes correlate with 'HISTOLOGICAL_TYPE', 'RACE', and 'ETHNICITY'.

Results

Overview of the results

Table 1. Overview of the association between APOBEC groups by 2 different APOBEC scores and 12 clinical features. Shown in the table are P values (Q values). Thresholded by Q value < 0.25, 5 significant findings were detected.


Clinical Features	Statistical Tests	APOBEC MUTLOAD MINESTIMATE	APOBEC ENRICH
Time to Death	logrank test	0.457 (0.783)	0.781 (0.986)
YEARS TO BIRTH	Kruskal-Wallis (anova)	0.228 (0.563)	0.861 (1.00)
PATHOLOGIC STAGE	Fisher's exact test	0.729 (0.986)	0.572 (0.915)
PATHOLOGY T STAGE	Fisher's exact test	0.344 (0.635)	0.225 (0.563)
PATHOLOGY N STAGE	Fisher's exact test	0.258 (0.563)	0.637 (0.956)
PATHOLOGY M STAGE	Fisher's exact test	1 (1.00)	1 (1.00)
GENDER	Fisher's exact test	0.244 (0.563)	1 (1.00)
RADIATION THERAPY	Fisher's exact test	0.339 (0.635)	0.741 (0.986)
HISTOLOGICAL TYPE	Fisher's exact test	0.00041 (0.00984)	0.00143 (0.0124)
NUMBER OF LYMPH NODES	Kruskal-Wallis (anova)	0.195 (0.563)	0.195 (0.563)
RACE	Fisher's exact test	0.00289 (0.0173)	0.00155 (0.0124)
ETHNICITY	Fisher's exact test	0.905 (1.00)	0.0461 (0.221)
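The "P value (Q value)" cells in Table 1 can be read programmatically and thresholded at Q < 0.25. The following is a minimal Python sketch, not part of the pipeline: the function names `parse_cell` and `significant` and the dictionary layout are hypothetical, and only four of the twelve rows are copied from the table for brevity.

```python
import re

# Matches a cell such as "0.00041 (0.00984)": a P value followed by a Q value in parentheses.
CELL = re.compile(r"^\s*([0-9.eE+-]+)\s*\(([0-9.eE+-]+)\)\s*$")

def parse_cell(cell):
    """Split one table cell into (p_value, q_value) floats."""
    m = CELL.match(cell)
    if m is None:
        raise ValueError("unexpected cell format: %r" % cell)
    return float(m.group(1)), float(m.group(2))

def significant(rows, q_cutoff=0.25):
    """Return (score, feature, p, q) for every cell with q below the cutoff."""
    hits = []
    for feature, cells in rows.items():
        for score, cell in cells.items():
            p, q = parse_cell(cell)
            if q < q_cutoff:
                hits.append((score, feature, p, q))
    return hits

# A subset of Table 1 (hypothetical in-memory layout; values copied from the table).
rows = {
    "GENDER": {"APOBEC MUTLOAD MINESTIMATE": "0.244 (0.563)",
               "APOBEC ENRICH": "1 (1.00)"},
    "HISTOLOGICAL TYPE": {"APOBEC MUTLOAD MINESTIMATE": "0.00041 (0.00984)",
                          "APOBEC ENRICH": "0.00143 (0.0124)"},
    "RACE": {"APOBEC MUTLOAD MINESTIMATE": "0.00289 (0.0173)",
             "APOBEC ENRICH": "0.00155 (0.0124)"},
    "ETHNICITY": {"APOBEC MUTLOAD MINESTIMATE": "0.905 (1.00)",
                  "APOBEC ENRICH": "0.0461 (0.221)"},
}

if __name__ == "__main__":
    for score, feature, p, q in significant(rows):
        print(score, "vs", feature, "P =", p, "Q =", q)
```

With these rows the filter returns the five cells whose Q value is below 0.25.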

APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Table S1. Description of APOBEC group #1: 'APOBEC MUTLOAD MINESTIMATE'

Cluster Labels	0	HIGH	LOW
Number of samples	750	133	94

'APOBEC MUTLOAD MINESTIMATE' versus 'HISTOLOGICAL_TYPE'

P value = 0.00041 (Fisher's exact test), Q value = 0.0098

Table S2. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #9: 'HISTOLOGICAL_TYPE'

nPatients	INFILTRATING CARCINOMA NOS	INFILTRATING DUCTAL CARCINOMA	INFILTRATING LOBULAR CARCINOMA	MEDULLARY CARCINOMA	METAPLASTIC CARCINOMA	MIXED HISTOLOGY (PLEASE SPECIFY)	MUCINOUS CARCINOMA	OTHER SPECIFY
ALL	1	712	171	5	6	26	14	41
0	1	555	116	1	5	21	14	37
HIGH	0	89	40	2	0	1	0	1
LOW	0	68	15	2	1	4	0	3

Figure S1. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #9: 'HISTOLOGICAL_TYPE'

'APOBEC MUTLOAD MINESTIMATE' versus 'RACE'

P value = 0.00289 (Fisher's exact test), Q value = 0.017

Table S3. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'

nPatients	AMERICAN INDIAN OR ALASKA NATIVE	ASIAN	BLACK OR AFRICAN AMERICAN	WHITE
ALL	1	57	121	705
0	1	33	86	561
HIGH	0	16	19	85
LOW	0	8	16	59

Figure S2. Clustering Approach #1: 'APOBEC MUTLOAD MINESTIMATE' versus Clinical Feature #11: 'RACE'

APOBEC group #2: 'APOBEC ENRICH'

Table S4. Description of APOBEC group #2: 'APOBEC ENRICH'

Cluster Labels	FC.HIGH.SIG	FC.LOW.NONSIG	FC.NEUTRAL
Number of samples	220	656	101

'APOBEC ENRICH' versus 'HISTOLOGICAL_TYPE'

P value = 0.00143 (Fisher's exact test), Q value = 0.012

Table S5. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #9: 'HISTOLOGICAL_TYPE'

nPatients	INFILTRATING CARCINOMA NOS	INFILTRATING DUCTAL CARCINOMA	INFILTRATING LOBULAR CARCINOMA	MEDULLARY CARCINOMA	METAPLASTIC CARCINOMA	MIXED HISTOLOGY (PLEASE SPECIFY)	MUCINOUS CARCINOMA	OTHER SPECIFY
ALL	1	712	171	5	6	26	14	41
FC.HIGH.SIG	0	152	54	4	1	5	0	3
FC.LOW.NONSIG	1	491	95	1	4	18	13	33
FC.NEUTRAL	0	69	22	0	1	3	1	5

Figure S3. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #9: 'HISTOLOGICAL_TYPE'

'APOBEC ENRICH' versus 'RACE'

P value = 0.00155 (Fisher's exact test), Q value = 0.012

Table S6. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'

nPatients	AMERICAN INDIAN OR ALASKA NATIVE	ASIAN	BLACK OR AFRICAN AMERICAN	WHITE
ALL	1	57	121	705
FC.HIGH.SIG	0	24	35	138
FC.LOW.NONSIG	1	31	74	488
FC.NEUTRAL	0	2	12	79

Figure S4. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #11: 'RACE'

'APOBEC ENRICH' versus 'ETHNICITY'

P value = 0.0461 (Fisher's exact test), Q value = 0.22

Table S7. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #12: 'ETHNICITY'

nPatients	HISPANIC OR LATINO	NOT HISPANIC OR LATINO
ALL	34	772
FC.HIGH.SIG	6	175
FC.LOW.NONSIG	28	512
FC.NEUTRAL	0	85

Figure S5. Clustering Approach #2: 'APOBEC ENRICH' versus Clinical Feature #12: 'ETHNICITY'

Methods & Data

Input

APOBEC groups file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/APOBEC_Pipelines/BRCA-TP/20231607/APOBEC_clinical_corr_input_20231679/APOBEC_for_clinical.correlaion.input.categorical.txt
Clinical data file = /xchip/cga/gdac-prod/tcga-gdac/jobResults/Append_Data/BRCA-TP/19775058/BRCA-TP.merged_data.txt
Number of patients = 977
Number of selected clinical features = 12

APOBEC classification

APOBEC classification based on APOBEC_MutLoad_MinEstimate: a. APOBEC zero group (cluster label '0'): samples with a value of zero; b. APOBEC high group (cluster label 'HIGH'): samples above the median value of the nonzero samples; c. APOBEC low group (cluster label 'LOW'): samples below the median value of the nonzero samples.
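The three-way split on APOBEC_MutLoad_MinEstimate can be sketched in Python as follows. This is a minimal illustration, not the pipeline's code: the function name `classify_mutload` is hypothetical, the labels follow the cluster labels in Table S1, and placing a nonzero sample that falls exactly on the median into LOW is an assumption, since the text does not say how ties are handled.

```python
from statistics import median

def classify_mutload(values):
    """Label each sample '0', 'HIGH', or 'LOW' from its MutLoad MinEstimate value."""
    nonzero = [v for v in values if v != 0]
    # The median is taken over nonzero samples only, as described in the text.
    cutoff = median(nonzero) if nonzero else None
    labels = []
    for v in values:
        if v == 0:
            labels.append("0")
        elif v > cutoff:
            labels.append("HIGH")
        else:
            # Assumption: values equal to the median are grouped with LOW.
            labels.append("LOW")
    return labels

if __name__ == "__main__":
    # Made-up values for illustration; the nonzero median here is 2.5.
    print(classify_mutload([0, 1, 2, 3, 0, 10]))
    # -> ['0', 'LOW', 'LOW', 'HIGH', '0', 'HIGH']
```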

APOBEC classification based on APOBEC_enrich: a. No enrichment group: all samples with BH_Fisher_p-value_tCw >= 0.05; b. Small enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich <= 2; c. High enrichment group: samples with BH_Fisher_p-value_tCw <= 0.05 and APOBEC_enrich > 2.
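The enrichment-based rule can be sketched the same way. In this illustrative Python snippet the function name, parameter names, and returned label strings are all hypothetical, and they are not the cluster labels shown in Table S4. Because the stated thresholds overlap at a p-value of exactly 0.05, the sketch assumes that such a sample falls in the no-enrichment group.

```python
def classify_enrich(bh_fisher_p, apobec_enrich, p_cutoff=0.05, fold_cutoff=2.0):
    """Assign a sample to the no, small, or high enrichment group.

    bh_fisher_p   -- the sample's BH_Fisher_p-value_tCw
    apobec_enrich -- the sample's APOBEC_enrich value
    """
    # Assumption: p exactly equal to the cutoff counts as no enrichment.
    if bh_fisher_p >= p_cutoff:
        return "NO_ENRICHMENT"
    if apobec_enrich <= fold_cutoff:
        return "SMALL_ENRICHMENT"
    return "HIGH_ENRICHMENT"

if __name__ == "__main__":
    # Made-up (p, enrichment) pairs for illustration.
    for p, fold in [(0.20, 5.0), (0.01, 1.5), (0.01, 3.0)]:
        print(p, fold, classify_enrich(p, fold))
```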

Survival analysis

For survival clinical features, the Kaplan-Meier survival curves of tumors in the different APOBEC groups were plotted, and the statistical significance P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
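To make the logrank statistic concrete, here is a minimal Python sketch for the two-group case. It is a simplification and not the report's implementation: the report compares three APOBEC groups with R's 'survdiff', whereas this hypothetical `logrank_two_groups` function handles only two groups, using the standard observed-minus-expected statistic with a hypergeometric variance and a chi-square P value on 1 degree of freedom. The example data are made up.

```python
import math

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group logrank test; events are 1 (death observed) or 0 (censored).

    Returns (chi_square, p_value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected deaths in group A
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # deaths at t, both groups
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

if __name__ == "__main__":
    # Made-up survival times; every subject has an observed event.
    chi2, p = logrank_two_groups([1, 2, 3, 4, 5], [1] * 5,
                                 [6, 7, 8, 9, 10], [1] * 5)
    print("chi-square = %.3f, P = %.4f" % (chi2, p))
```

Extending this to three or more groups requires a vector of observed-minus-expected counts and a covariance matrix, which is what 'survdiff' computes.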

Fisher's exact test

For binary and other categorical clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
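The idea behind a two-tailed Fisher's exact test can be shown on a 2x2 table. The Python sketch below is illustrative only: the function name `fisher_exact_2x2` is hypothetical, it covers only 2x2 tables (the contingency tables in this report are larger, and R's 'fisher.test' handles that general case), and the small relative tolerance used when comparing probabilities is an assumption made here to guard against floating-point ties.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    p = sum(prob(k) for k in range(lo, hi + 1)
            if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

if __name__ == "__main__":
    # Classic "lady tasting tea" table, not data from this report.
    print(fisher_exact_2x2(3, 1, 1, 3))  # 34/70, about 0.486
```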

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
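The Benjamini-Hochberg adjustment itself is short enough to sketch. The Python function below (the name `bh_adjust` is hypothetical) follows the usual step-up formulation: each P value is multiplied by n / rank, and a running minimum is taken from the largest P value downward. It is applied here to the 24 rounded P values printed in Table 1, on the assumption that all 24 tests were adjusted together, so small rounding differences from the report's Q values are possible.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    n = len(pvalues)
    # Visit P values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    qvalues = [0.0] * n
    running_min = 1.0
    for k, i in enumerate(order):
        rank = n - k  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues

# P values from Table 1: the 12 clinical features for
# 'APOBEC MUTLOAD MINESTIMATE', then the 12 for 'APOBEC ENRICH'.
table1_p = [
    0.457, 0.228, 0.729, 0.344, 0.258, 1, 0.244, 0.339,
    0.00041, 0.195, 0.00289, 0.905,
    0.781, 0.861, 0.572, 0.225, 0.637, 1, 1, 0.741,
    0.00143, 0.195, 0.00155, 0.0461,
]

if __name__ == "__main__":
    q = bh_adjust(table1_p)
    n_significant = sum(1 for value in q if value < 0.25)
    print("findings with Q < 0.25:", n_significant)  # 5
```

Note how the running minimum matters: the P value 0.00143 has a raw adjusted value of 0.00143 * 24 / 2 = 0.0172, but it inherits the smaller value 0.0124 from the next-ranked P value 0.00155, matching the Q value in Table 1.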

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References

[1] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)

[2] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)

[3] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57:289-300 (1995)
