Correlation between copy number variations of arm-level result and selected clinical features
Acute Myeloid Leukemia (Primary blood derived cancer - Peripheral blood)
22 February 2013  |  analyses__2013_02_22
Maintainer Information
Maintained by TCGA GDAC Team (Broad Institute/MD Anderson Cancer Center/Harvard Medical School)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2013): Correlation between copy number variations of arm-level result and selected clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C1ZP44B9
Overview
Introduction

This pipeline computes the correlation between significant arm-level copy number variations (CNVs) and selected clinical features.

Summary

Testing the association between subtypes identified by 22 different clustering approaches and 3 clinical features across 191 patients detected 4 significant findings with Q value < 0.25.

  • 2 subtypes identified in the current cancer cohort by '1p gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '4p gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '4q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '8p gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '8q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '10q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '11p gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '11q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '13q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '17q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '19p gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '19q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '21q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '22q gain mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '5q loss mutation analysis'. These subtypes correlate with 'Time to Death'.

  • 2 subtypes identified in the current cancer cohort by '7p loss mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '7q loss mutation analysis'. These subtypes correlate with 'Time to Death'.

  • 2 subtypes identified in the current cancer cohort by '12p loss mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '17p loss mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '17q loss mutation analysis'. These subtypes do not correlate with any clinical features.

  • 2 subtypes identified in the current cancer cohort by '18p loss mutation analysis'. These subtypes correlate with 'Time to Death'.

  • 2 subtypes identified in the current cancer cohort by '18q loss mutation analysis'. These subtypes correlate with 'Time to Death'.

Results
Overview of the results

Table 1.  Overview of the association between subtypes identified by 22 different clustering approaches and 3 clinical features. Shown in the table are P values (Q values). Thresholded by Q value < 0.25, 4 significant findings were detected.

Clinical Features  Time to Death  AGE  GENDER
Statistical Tests  logrank test  t-test  Fisher's exact test
1p gain  0.0913 (1.00)  0.592 (1.00)
4p gain  0.587 (1.00)  0.425 (1.00)  0.627 (1.00)
4q gain  0.587 (1.00)  0.425 (1.00)  0.627 (1.00)
8p gain  0.48 (1.00)  0.119 (1.00)  0.0597 (1.00)
8q gain  0.529 (1.00)  0.168 (1.00)  0.0382 (1.00)
10q gain  0.947 (1.00)  0.252 (1.00)
11p gain  0.188 (1.00)  1 (1.00)
11q gain  0.57 (1.00)  0.0154 (0.834)  1 (1.00)
13q gain  0.511 (1.00)  0.592 (1.00)
17q gain  0.729 (1.00)  0.29 (1.00)  1 (1.00)
19p gain  0.347 (1.00)  0.789 (1.00)  0.0642 (1.00)
19q gain  0.347 (1.00)  0.789 (1.00)  0.0642 (1.00)
21q gain  0.0186 (0.966)  0.616 (1.00)  0.0732 (1.00)
22q gain  0.887 (1.00)  0.00598 (0.341)  0.73 (1.00)
5q loss  0.00073 (0.0438)  0.0601 (1.00)  0.0325 (1.00)
7p loss  0.00716 (0.401)  0.325 (1.00)  1 (1.00)
7q loss  0.00315 (0.183)  0.325 (1.00)  0.805 (1.00)
12p loss  0.514 (1.00)  1 (1.00)
17p loss  0.376 (1.00)  0.308 (1.00)  0.114 (1.00)
17q loss  0.0166 (0.882)  0.0075 (0.412)  0.378 (1.00)
18p loss  0.00202 (0.119)  0.355 (1.00)  0.0642 (1.00)
18q loss  0.000113 (0.00689)  0.0297 (1.00)  0.252 (1.00)
Clustering Approach #1: '1p gain mutation analysis'

Table S1.  Description of clustering approach #1: '1p gain mutation analysis'

Cluster Labels 1P GAIN MUTATED 1P GAIN WILD-TYPE
Number of samples 3 188
Clustering Approach #2: '4p gain mutation analysis'

Table S2.  Description of clustering approach #2: '4p gain mutation analysis'

Cluster Labels 4P GAIN MUTATED 4P GAIN WILD-TYPE
Number of samples 4 187
Clustering Approach #3: '4q gain mutation analysis'

Table S3.  Description of clustering approach #3: '4q gain mutation analysis'

Cluster Labels 4Q GAIN MUTATED 4Q GAIN WILD-TYPE
Number of samples 4 187
Clustering Approach #4: '8p gain mutation analysis'

Table S4.  Description of clustering approach #4: '8p gain mutation analysis'

Cluster Labels 8P GAIN MUTATED 8P GAIN WILD-TYPE
Number of samples 20 171
Clustering Approach #5: '8q gain mutation analysis'

Table S5.  Description of clustering approach #5: '8q gain mutation analysis'

Cluster Labels 8Q GAIN MUTATED 8Q GAIN WILD-TYPE
Number of samples 21 170
Clustering Approach #6: '10q gain mutation analysis'

Table S6.  Description of clustering approach #6: '10q gain mutation analysis'

Cluster Labels 10Q GAIN MUTATED 10Q GAIN WILD-TYPE
Number of samples 3 188
Clustering Approach #7: '11p gain mutation analysis'

Table S7.  Description of clustering approach #7: '11p gain mutation analysis'

Cluster Labels 11P GAIN MUTATED 11P GAIN WILD-TYPE
Number of samples 4 187
Clustering Approach #8: '11q gain mutation analysis'

Table S8.  Description of clustering approach #8: '11q gain mutation analysis'

Cluster Labels 11Q GAIN MUTATED 11Q GAIN WILD-TYPE
Number of samples 7 184
Clustering Approach #9: '13q gain mutation analysis'

Table S9.  Description of clustering approach #9: '13q gain mutation analysis'

Cluster Labels 13Q GAIN MUTATED 13Q GAIN WILD-TYPE
Number of samples 3 188
Clustering Approach #10: '17q gain mutation analysis'

Table S10.  Description of clustering approach #10: '17q gain mutation analysis'

Cluster Labels 17Q GAIN MUTATED 17Q GAIN WILD-TYPE
Number of samples 3 188
Clustering Approach #11: '19p gain mutation analysis'

Table S11.  Description of clustering approach #11: '19p gain mutation analysis'

Cluster Labels 19P GAIN MUTATED 19P GAIN WILD-TYPE
Number of samples 5 186
Clustering Approach #12: '19q gain mutation analysis'

Table S12.  Description of clustering approach #12: '19q gain mutation analysis'

Cluster Labels 19Q GAIN MUTATED 19Q GAIN WILD-TYPE
Number of samples 5 186
Clustering Approach #13: '21q gain mutation analysis'

Table S13.  Description of clustering approach #13: '21q gain mutation analysis'

Cluster Labels 21Q GAIN MUTATED 21Q GAIN WILD-TYPE
Number of samples 8 183
Clustering Approach #14: '22q gain mutation analysis'

Table S14.  Description of clustering approach #14: '22q gain mutation analysis'

Cluster Labels 22Q GAIN MUTATED 22Q GAIN WILD-TYPE
Number of samples 8 183
Clustering Approach #15: '5q loss mutation analysis'

Table S15.  Description of clustering approach #15: '5q loss mutation analysis'

Cluster Labels 5Q LOSS MUTATED 5Q LOSS WILD-TYPE
Number of samples 6 185
'5q loss mutation analysis' versus 'Time to Death'

P value = 0.00073 (logrank test), Q value = 0.044

Table S16.  Clustering Approach #15: '5q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 168 106 0.9 - 94.1 (12.0)
5Q LOSS MUTATED 6 6 1.0 - 12.0 (7.0)
5Q LOSS WILD-TYPE 162 100 0.9 - 94.1 (12.5)

Figure S1.  Clustering Approach #15: '5q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

Clustering Approach #16: '7p loss mutation analysis'

Table S17.  Description of clustering approach #16: '7p loss mutation analysis'

Cluster Labels 7P LOSS MUTATED 7P LOSS WILD-TYPE
Number of samples 15 176
Clustering Approach #17: '7q loss mutation analysis'

Table S18.  Description of clustering approach #17: '7q loss mutation analysis'

Cluster Labels 7Q LOSS MUTATED 7Q LOSS WILD-TYPE
Number of samples 18 173
'7q loss mutation analysis' versus 'Time to Death'

P value = 0.00315 (logrank test), Q value = 0.18

Table S19.  Clustering Approach #17: '7q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 168 106 0.9 - 94.1 (12.0)
7Q LOSS MUTATED 15 13 1.0 - 40.0 (9.0)
7Q LOSS WILD-TYPE 153 93 0.9 - 94.1 (13.9)

Figure S2.  Clustering Approach #17: '7q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

Clustering Approach #18: '12p loss mutation analysis'

Table S20.  Description of clustering approach #18: '12p loss mutation analysis'

Cluster Labels 12P LOSS MUTATED 12P LOSS WILD-TYPE
Number of samples 3 188
Clustering Approach #19: '17p loss mutation analysis'

Table S21.  Description of clustering approach #19: '17p loss mutation analysis'

Cluster Labels 17P LOSS MUTATED 17P LOSS WILD-TYPE
Number of samples 10 181
Clustering Approach #20: '17q loss mutation analysis'

Table S22.  Description of clustering approach #20: '17q loss mutation analysis'

Cluster Labels 17Q LOSS MUTATED 17Q LOSS WILD-TYPE
Number of samples 5 186
Clustering Approach #21: '18p loss mutation analysis'

Table S23.  Description of clustering approach #21: '18p loss mutation analysis'

Cluster Labels 18P LOSS MUTATED 18P LOSS WILD-TYPE
Number of samples 5 186
'18p loss mutation analysis' versus 'Time to Death'

P value = 0.00202 (logrank test), Q value = 0.12

Table S24.  Clustering Approach #21: '18p loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 168 106 0.9 - 94.1 (12.0)
18P LOSS MUTATED 5 5 1.0 - 12.0 (7.0)
18P LOSS WILD-TYPE 163 101 0.9 - 94.1 (12.0)

Figure S3.  Clustering Approach #21: '18p loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

Clustering Approach #22: '18q loss mutation analysis'

Table S25.  Description of clustering approach #22: '18q loss mutation analysis'

Cluster Labels 18Q LOSS MUTATED 18Q LOSS WILD-TYPE
Number of samples 3 188
'18q loss mutation analysis' versus 'Time to Death'

P value = 0.000113 (logrank test), Q value = 0.0069

Table S26.  Clustering Approach #22: '18q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 168 106 0.9 - 94.1 (12.0)
18Q LOSS MUTATED 3 3 1.0 - 7.0 (2.0)
18Q LOSS WILD-TYPE 165 103 0.9 - 94.1 (12.0)

Figure S4.  Clustering Approach #22: '18q loss mutation analysis' versus Clinical Feature #1: 'Time to Death'

Methods & Data
Input
  • Cluster data file = broad_values_by_arm.mutsig.cluster.txt

  • Clinical data file = LAML-TB.clin.merged.picked.txt

  • Number of patients = 191

  • Number of clustering approaches = 22

  • Number of selected clinical features = 3

  • Exclude small clusters that include fewer than K patients, K = 3

Survival analysis

For survival clinical features, Kaplan-Meier survival curves of tumors with and without each arm-level copy number variation were plotted, and P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
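To make the logrank comparison concrete, the following is a minimal sketch in Python of a two-group logrank test. It is not the pipeline's code, which calls 'survdiff' in R. The function name `logrank_test` and the toy durations are hypothetical, chosen only for illustration. The P value assumes the statistic follows a chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi-square statistic, P value).

    times_*  : follow-up durations (e.g. months)
    events_* : 1 if the patient died at that time, 0 if censored
    """
    # Pool both groups as (time, event, group) records; group 0 = A, 1 = B.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)           # deaths at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the death count in group A at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("logrank statistic is undefined (zero variance)")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical toy data: two patients per group, all deaths observed.
chi2, p_value = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
```

The sketch recomputes the risk sets at every event time, which is quadratic in the number of patients but easy to follow for a cohort of this size.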

Student's t-test analysis

For continuous numerical clinical features, a two-tailed Student's t-test with unequal variance (Lehmann and Romano 2005) was applied to compare the clinical values between two tumor subtypes using the 't.test' function in R.
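The unequal-variance (Welch) form of the test can be sketched as follows. This is an illustrative Python version, not the pipeline's R call. The function name `welch_t` and the sample values are hypothetical. The sketch returns only the test statistic and the Welch-Satterthwaite degrees of freedom. Turning these into a two-tailed P value requires the t distribution, which a statistics library would normally supply.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df

# Hypothetical values standing in for a clinical feature such as age.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Each group needs at least two values, since the sample variance is undefined otherwise.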

Fisher's exact test

For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
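As an illustration of the two-tailed test on a 2x2 table, here is a minimal Python sketch, again a stand-in for the R call rather than the pipeline's code. The function name `fisher_exact_two_sided` and the counts are hypothetical. It assumes the common two-sided definition: sum the probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so floating-point ties count as "as extreme".
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

# Hypothetical counts: rows = subtype (mutated, wild-type), columns = gender.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```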

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
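The Benjamini and Hochberg step-up adjustment can be sketched in a few lines of Python. This is an illustrative re-implementation, not the pipeline's R code. The function name `bh_adjust` and the input P values are hypothetical and are not taken from Table 1, so the output is not expected to match the Q values reported there.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Visit P values from largest to smallest so a running minimum can
    # enforce monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical P values; the 0.25 cutoff is the threshold used in this report.
q_values = bh_adjust([0.005, 0.5, 0.01])
significant = [q < 0.25 for q in q_values]
```

The adjusted value for each test is the smallest scaled P value among all tests ranked at or above it, capped at 1, so a test can never receive a smaller Q value than one with a smaller P value.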

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References
[1] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)
[2] Lehmann and Romano, Testing Statistical Hypotheses (3rd ed.), New York: Springer. ISBN 0387988645 (2005)
[3] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)
[4] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57(1):289-300 (1995)