Prostate Adenocarcinoma: Correlation between molecular cancer subtypes and selected clinical features
Maintained by TCGA GDAC Team (Broad Institute/Dana-Farber Cancer Institute/Harvard Medical School)
Overview
Introduction

This pipeline computes the correlation between cancer subtypes identified by different molecular patterns and selected clinical features.

Summary

Testing the association between subtypes identified by 6 different clustering approaches and 4 clinical features across 127 patients detected one significant finding (P value < 0.05).

  • 3 subtypes identified in current cancer cohort by 'CN CNMF'. These subtypes do not correlate with any clinical features.

  • 3 subtypes identified in current cancer cohort by 'METHLYATION CNMF'. These subtypes correlate with 'AGE'.

  • CNMF clustering analysis on sequencing-based mRNA expression data identified 3 subtypes that do not correlate with any clinical features.

  • Consensus hierarchical clustering analysis on sequencing-based mRNA expression data identified 3 subtypes that do not correlate with any clinical features.

  • CNMF clustering analysis on sequencing-based miR expression data identified 4 subtypes that do not correlate with any clinical features.

  • Consensus hierarchical clustering analysis on sequencing-based miR expression data identified 3 subtypes that do not correlate with any clinical features.

Results
Overview of the results

Table 1.  Overview of the association between subtypes identified by 6 different clustering approaches and 4 clinical features. Shown in the table are P values from statistical tests. At a threshold of P value < 0.05, one significant finding was detected.

Clinical Features          Time to Death  AGE      RADIATIONS.RADIATION.REGIMENINDICATION  NEOADJUVANT.THERAPY
Statistical Tests          logrank test   ANOVA    Fisher's exact test                     Fisher's exact test
CN CNMF                    1              0.113    0.852                                   0.815
METHLYATION CNMF           1              0.00809  1                                       0.385
RNAseq CNMF subtypes       1              0.132    0.861                                   0.546
RNAseq cHierClus subtypes  1              0.103    1                                       0.129
MIRseq CNMF subtypes       1              0.195    0.758                                   0.161
MIRseq cHierClus subtypes  1              0.877    0.306                                   0.579
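
The thresholding step behind Table 1 can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: the P values are transcribed from the table, and the names FEATURES, P_VALUES and significant_pairs are hypothetical.

```python
# P values transcribed from Table 1 (one row per clustering approach,
# one column per clinical feature, in the order given in FEATURES).
FEATURES = [
    "Time to Death",
    "AGE",
    "RADIATIONS.RADIATION.REGIMENINDICATION",
    "NEOADJUVANT.THERAPY",
]
P_VALUES = {
    "CN CNMF": [1, 0.113, 0.852, 0.815],
    "METHLYATION CNMF": [1, 0.00809, 1, 0.385],
    "RNAseq CNMF subtypes": [1, 0.132, 0.861, 0.546],
    "RNAseq cHierClus subtypes": [1, 0.103, 1, 0.129],
    "MIRseq CNMF subtypes": [1, 0.195, 0.758, 0.161],
    "MIRseq cHierClus subtypes": [1, 0.877, 0.306, 0.579],
}


def significant_pairs(p_values, features, alpha=0.05):
    """Return (approach, feature, p) for every test with p below alpha."""
    return [
        (approach, feature, p)
        for approach, row in p_values.items()
        for feature, p in zip(features, row)
        if p < alpha
    ]


hits = significant_pairs(P_VALUES, FEATURES)
print(hits)
```

Of the 24 tests in the table, only 'METHLYATION CNMF' versus 'AGE' falls below the 0.05 threshold.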
Clustering Approach #1: 'CN CNMF'

Table S1.  Description of clustering approach #1: 'CN CNMF'

Cluster Labels 1 2 3
Number of samples 28 58 40
'CN CNMF' versus 'Time to Death'

P value = 1 (logrank test)

Table S2.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 126 1 0.3 - 66.0 (18.4)
subtype1 28 0 0.3 - 63.3 (13.8)
subtype2 58 0 1.0 - 65.9 (20.7)
subtype3 40 1 0.9 - 66.0 (22.9)

Figure S1.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #1: 'Time to Death'

'CN CNMF' versus 'AGE'

P value = 0.113 (ANOVA)

Table S3.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 126 60.9 (6.7)
subtype1 28 62.3 (6.1)
subtype2 58 59.6 (7.3)
subtype3 40 62.0 (5.9)

Figure S2.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #2: 'AGE'

'CN CNMF' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 0.852 (Fisher's exact test)

Table S4.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 121
subtype1 1 27
subtype2 3 55
subtype3 1 39

Figure S3.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'CN CNMF' versus 'NEOADJUVANT.THERAPY'

P value = 0.815 (Fisher's exact test)

Table S5.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 4 122
subtype1 0 28
subtype2 2 56
subtype3 2 38

Figure S4.  Clustering Approach #1: 'CN CNMF' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Clustering Approach #2: 'METHLYATION CNMF'

Table S6.  Description of clustering approach #2: 'METHLYATION CNMF'

Cluster Labels 1 2 3
Number of samples 38 38 51
'METHLYATION CNMF' versus 'Time to Death'

P value = 1 (logrank test)

Table S7.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 127 1 0.3 - 66.0 (18.2)
subtype1 38 0 0.3 - 65.9 (23.8)
subtype2 38 0 1.1 - 54.9 (15.6)
subtype3 51 1 1.0 - 66.0 (19.5)

Figure S5.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #1: 'Time to Death'

'METHLYATION CNMF' versus 'AGE'

P value = 0.00809 (ANOVA)

Table S8.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 127 61.0 (6.7)
subtype1 38 62.6 (5.8)
subtype2 38 58.2 (7.3)
subtype3 51 61.9 (6.3)

Figure S6.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #2: 'AGE'

'METHLYATION CNMF' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 1 (Fisher's exact test)

Table S9.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 122
subtype1 1 37
subtype2 2 36
subtype3 2 49

Figure S7.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'METHLYATION CNMF' versus 'NEOADJUVANT.THERAPY'

P value = 0.385 (Fisher's exact test)

Table S10.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 4 123
subtype1 1 37
subtype2 0 38
subtype3 3 48

Figure S8.  Clustering Approach #2: 'METHLYATION CNMF' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Clustering Approach #3: 'RNAseq CNMF subtypes'

Table S11.  Description of clustering approach #3: 'RNAseq CNMF subtypes'

Cluster Labels 1 2 3
Number of samples 34 31 37
'RNAseq CNMF subtypes' versus 'Time to Death'

P value = 1 (logrank test)

Table S12.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 102 1 0.3 - 66.0 (16.8)
subtype1 34 0 0.3 - 65.9 (19.6)
subtype2 31 0 1.1 - 54.9 (13.0)
subtype3 37 1 1.0 - 66.0 (17.1)

Figure S9.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

'RNAseq CNMF subtypes' versus 'AGE'

P value = 0.132 (ANOVA)

Table S13.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 102 61.3 (6.8)
subtype1 34 62.4 (6.0)
subtype2 31 59.3 (7.4)
subtype3 37 62.0 (6.6)

Figure S10.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

'RNAseq CNMF subtypes' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 0.861 (Fisher's exact test)

Table S14.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 97
subtype1 1 33
subtype2 2 29
subtype3 2 35

Figure S11.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'RNAseq CNMF subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.546 (Fisher's exact test)

Table S15.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 4 98
subtype1 2 32
subtype2 0 31
subtype3 2 35

Figure S12.  Clustering Approach #3: 'RNAseq CNMF subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Clustering Approach #4: 'RNAseq cHierClus subtypes'

Table S16.  Description of clustering approach #4: 'RNAseq cHierClus subtypes'

Cluster Labels 1 2 3
Number of samples 28 32 42
'RNAseq cHierClus subtypes' versus 'Time to Death'

P value = 1 (logrank test)

Table S17.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 102 1 0.3 - 66.0 (16.8)
subtype1 28 0 1.0 - 65.9 (19.6)
subtype2 32 1 1.0 - 66.0 (14.8)
subtype3 42 0 0.3 - 54.9 (15.1)

Figure S13.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

'RNAseq cHierClus subtypes' versus 'AGE'

P value = 0.103 (ANOVA)

Table S18.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 102 61.3 (6.8)
subtype1 28 61.7 (6.2)
subtype2 32 63.1 (6.8)
subtype3 42 59.8 (6.9)

Figure S14.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

'RNAseq cHierClus subtypes' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 1 (Fisher's exact test)

Table S19.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 97
subtype1 1 27
subtype2 2 30
subtype3 2 40

Figure S15.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'RNAseq cHierClus subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.129 (Fisher's exact test)

Table S20.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 4 98
subtype1 1 27
subtype2 3 29
subtype3 0 42

Figure S16.  Clustering Approach #4: 'RNAseq cHierClus subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Clustering Approach #5: 'MIRseq CNMF subtypes'

Table S21.  Description of clustering approach #5: 'MIRseq CNMF subtypes'

Cluster Labels 1 2 3 4
Number of samples 39 37 21 28
'MIRseq CNMF subtypes' versus 'Time to Death'

P value = 1 (logrank test)

Table S22.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 125 1 0.3 - 66.0 (18.2)
subtype1 39 0 0.9 - 65.9 (22.1)
subtype2 37 0 3.0 - 54.9 (19.9)
subtype3 21 0 0.3 - 64.1 (21.8)
subtype4 28 1 1.1 - 66.0 (9.4)

Figure S17.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

'MIRseq CNMF subtypes' versus 'AGE'

P value = 0.195 (ANOVA)

Table S23.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 125 61.0 (6.6)
subtype1 39 62.6 (6.2)
subtype2 37 59.3 (6.8)
subtype3 21 61.0 (6.1)
subtype4 28 61.2 (7.1)

Figure S18.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

'MIRseq CNMF subtypes' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 0.758 (Fisher's exact test)

Table S24.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 120
subtype1 2 37
subtype2 2 35
subtype3 1 20
subtype4 0 28

Figure S19.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'MIRseq CNMF subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.161 (Fisher's exact test)

Table S25.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 3 122
subtype1 3 36
subtype2 0 37
subtype3 0 21
subtype4 0 28

Figure S20.  Clustering Approach #5: 'MIRseq CNMF subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Clustering Approach #6: 'MIRseq cHierClus subtypes'

Table S26.  Description of clustering approach #6: 'MIRseq cHierClus subtypes'

Cluster Labels 1 2 3
Number of samples 21 67 37
'MIRseq cHierClus subtypes' versus 'Time to Death'

P value = 1 (logrank test)

Table S27.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 125 1 0.3 - 66.0 (18.2)
subtype1 21 0 0.3 - 64.1 (23.8)
subtype2 67 0 0.9 - 65.9 (23.0)
subtype3 37 1 1.0 - 66.0 (8.5)

Figure S21.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

'MIRseq cHierClus subtypes' versus 'AGE'

P value = 0.877 (ANOVA)

Table S28.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 125 61.0 (6.6)
subtype1 21 60.5 (5.8)
subtype2 67 61.0 (6.9)
subtype3 37 61.4 (6.7)

Figure S22.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

'MIRseq cHierClus subtypes' versus 'RADIATIONS.RADIATION.REGIMENINDICATION'

P value = 0.306 (Fisher's exact test)

Table S29.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

nPatients NO YES
ALL 5 120
subtype1 1 20
subtype2 4 63
subtype3 0 37

Figure S23.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #3: 'RADIATIONS.RADIATION.REGIMENINDICATION'

'MIRseq cHierClus subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.579 (Fisher's exact test)

Table S30.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 3 122
subtype1 0 21
subtype2 3 64
subtype3 0 37

Figure S24.  Clustering Approach #6: 'MIRseq cHierClus subtypes' versus Clinical Feature #4: 'NEOADJUVANT.THERAPY'

Methods & Data
Input
  • Cluster data file = PRAD.mergedcluster.txt

  • Clinical data file = PRAD.clin.merged.picked.txt

  • Number of patients = 127

  • Number of clustering approaches = 6

  • Number of selected clinical features = 4

  • Exclude small clusters that include fewer than K patients, K = 3
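
The small-cluster exclusion rule in the last input setting can be illustrated with a short Python sketch. It is an assumption about how such a filter might look, not the pipeline's code; the function name drop_small_clusters and the patient identifiers are hypothetical.

```python
from collections import Counter


def drop_small_clusters(assignments, k=3):
    """Keep only patients whose cluster has at least k members.

    assignments maps a patient identifier to a cluster label.
    """
    sizes = Counter(assignments.values())
    return {
        patient: cluster
        for patient, cluster in assignments.items()
        if sizes[cluster] >= k
    }


# Hypothetical example: cluster 2 has only two patients and is excluded.
assignments = {
    "P1": 1, "P2": 1, "P3": 1,
    "P4": 2, "P5": 2,
    "P6": 3, "P7": 3, "P8": 3, "P9": 3,
}
kept = drop_small_clusters(assignments)
print(sorted(kept))
```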

Clustering approaches
CNMF clustering

Consensus non-negative matrix factorization (CNMF) clustering approach (Brunet et al. 2004)

Consensus hierarchical clustering

Resampling-based clustering method (Monti et al. 2003)

Survival analysis

For survival clinical features, Kaplan-Meier survival curves of the tumor subtypes were plotted, and the statistical significance (P value) was estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
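
To make the logrank calculation concrete, here is a minimal Python sketch of the two-group case. It is not the R 'survdiff' function the report relies on, and it does not cover the three- and four-subtype comparisons in this report, which need the multi-group form of the test. The function name logrank_two_groups and the toy data are hypothetical.

```python
from math import erfc, sqrt


def logrank_two_groups(times, events, groups):
    """Two-group logrank test.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    groups : 0 or 1, the group each patient belongs to
    Returns (chi_square, p_value) on 1 degree of freedom.
    """
    records = list(zip(times, events, groups))
    observed = expected = variance = 0.0
    # Visit each distinct time at which at least one death occurred.
    for t in sorted({time for time, event, _ in records if event}):
        at_risk = [g for time, _, g in records if time >= t]
        n, n1 = len(at_risk), sum(at_risk)
        deaths = [g for time, event, g in records if time == t and event]
        d, d1 = len(deaths), sum(deaths)
        observed += d1
        expected += d * n1 / n  # deaths expected in group 1 under the null
        if n > 1:  # hypergeometric variance; zero when one patient is at risk
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0  # degenerate case: the data carry no information
    chi_square = (observed - expected) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi_square, erfc(sqrt(chi_square / 2))


# Toy data: four patients, all with an observed death.
chi_square, p_value = logrank_two_groups(
    times=[1, 2, 3, 4], events=[1, 1, 1, 1], groups=[0, 1, 0, 1]
)
print(chi_square, p_value)
```

The sketch returns a P value of 1 whenever the accumulated variance is zero, for example when the only death occurs while a single patient remains at risk.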

ANOVA analysis

For continuous numerical clinical features, one-way analysis of variance (Howell 2002) was applied to compare the clinical values between tumor subtypes using the 'anova' function in R.
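
A one-way ANOVA F statistic can be recomputed from the per-subtype summary rows alone. The Python sketch below does this for Table S8 ('METHLYATION CNMF' versus 'AGE'). It is not the R 'anova' call used by the pipeline, and it assumes that the reported Std.Dev is the sample standard deviation (n - 1 denominator). The function names are hypothetical, and the closed-form tail probability applies only to three groups (2 numerator degrees of freedom).

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) triples, one per group.

    Returns (f_statistic, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


def f_upper_tail_df1_is_2(f_statistic, df_within):
    """P(F > f) for an F(2, df_within) distribution (three groups only)."""
    return (1 + 2 * f_statistic / df_within) ** (-df_within / 2)


# (nPatients, mean age, Std.Dev) for subtype1..subtype3 in Table S8.
TABLE_S8 = [(38, 62.6, 5.8), (38, 58.2, 7.3), (51, 61.9, 6.3)]

f_statistic, df_between, df_within = anova_from_summary(TABLE_S8)
p_value = f_upper_tail_df1_is_2(f_statistic, df_within)
print(f_statistic, df_between, df_within, p_value)
```

Because the table reports means and standard deviations rounded to one decimal place, the result is of the same order as the reported P value of 0.00809 but not identical to it.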

Fisher's exact test

For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
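
With three or four subtypes, the contingency tables in this report are larger than 2 x 2. A common generalization of the two-tailed test, used here as an assumption about what the R function computes, sums the probabilities of all tables with the same margins that are no more likely than the observed one. The Python sketch below applies that rule to Table S10 ('METHLYATION CNMF' versus 'NEOADJUVANT.THERAPY'); the function name fisher_exact_rx2 is hypothetical.

```python
from math import comb


def fisher_exact_rx2(table):
    """Two-sided Fisher's exact test for an r x 2 table of counts.

    table is a list of (first_column_count, second_column_count) rows.
    """
    row_totals = [a + b for a, b in table]
    first_col_total = sum(a for a, _ in table)

    def weight(first_col):
        # Unnormalized probability of a table with fixed margins.
        w = 1
        for n, a in zip(row_totals, first_col):
            w *= comb(n, a)
        return w

    def first_columns(remaining, rows):
        # Enumerate every first column compatible with the margins.
        if len(rows) == 1:
            if remaining <= rows[0]:
                yield (remaining,)
            return
        for a in range(min(remaining, rows[0]) + 1):
            for rest in first_columns(remaining - a, rows[1:]):
                yield (a,) + rest

    observed = weight([a for a, _ in table])
    total = comb(sum(row_totals), first_col_total)
    # Integer weights keep the "no more likely than observed" test exact.
    extreme = sum(
        w
        for w in map(weight, first_columns(first_col_total, row_totals))
        if w <= observed
    )
    return extreme / total


# (NO, YES) counts for subtype1..subtype3 in Table S10.
TABLE_S10 = [(1, 37), (0, 38), (3, 48)]
p_value = fisher_exact_rx2(TABLE_S10)
print(round(p_value, 3))
```

For Table S10 this rounds to 0.385, which agrees with the P value reported above.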

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References
[1] Brunet et al., Metagenes and molecular pattern discovery using matrix factorization, PNAS 101(12):4164-9 (2004)
[2] Monti et al., Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data, Machine Learning 52(1-2):91-118 (2003)
[3] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)
[4] Howell, D, Statistical Methods for Psychology. (5th ed.), Duxbury Press:324-5 (2002)
[5] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)