Thyroid Adenocarcinoma: Correlation between molecular cancer subtypes and selected clinical features
Maintained by TCGA GDAC Team (Broad Institute/Dana-Farber Cancer Institute/Harvard Medical School)
Overview
Introduction

This pipeline computes the correlation between cancer subtypes identified by different molecular patterns and selected clinical features.

Summary

Testing the association between subtypes identified by 5 different clustering approaches and 5 clinical features across 157 patients detected 13 significant findings with P value < 0.05.

  • The 'METHLYATION CNMF' clustering approach identified 3 subtypes in the current cancer cohort that correlate to 'Time to Death', 'AGE', and 'HISTOLOGICAL.TYPE'.

  • CNMF clustering analysis on sequencing-based mRNA expression data identified 4 subtypes that correlate to 'AGE' and 'HISTOLOGICAL.TYPE'.

  • Consensus hierarchical clustering analysis on sequencing-based mRNA expression data identified 3 subtypes that correlate to 'AGE',  'GENDER', and 'HISTOLOGICAL.TYPE'.

  • CNMF clustering analysis on sequencing-based miR expression data identified 3 subtypes that correlate to 'Time to Death' and 'HISTOLOGICAL.TYPE'.

  • Consensus hierarchical clustering analysis on sequencing-based miR expression data identified 3 subtypes that correlate to 'Time to Death',  'AGE', and 'HISTOLOGICAL.TYPE'.

Results
Overview of the results

Table 1.  Overview of the association between subtypes identified by 5 different clustering approaches and 5 clinical features. Shown in the table are P values from statistical tests. At a threshold of P value < 0.05, 13 significant findings were detected.

Clinical Features | Time to Death | AGE | GENDER | HISTOLOGICAL TYPE | NEOADJUVANT THERAPY
Statistical Tests | logrank test | ANOVA | Fisher's exact test | Chi-square test | Fisher's exact test
METHLYATION CNMF | 0.0279 | 0.00227 | 0.879 | 9.77e-10 | 1
RNAseq CNMF subtypes | 0.0652 | 0.0134 | 0.108 | 2.43e-08 | 1
RNAseq cHierClus subtypes | 0.102 | 0.00898 | 0.00519 | 2.72e-10 | 0.24
MIRseq CNMF subtypes | 0.0404 | 0.13 | 0.261 | 1.32e-06 | 1
MIRseq cHierClus subtypes | 0.000532 | 0.0021 | 0.424 | 3.09e-05 | 0.662
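
The count of significant findings can be re-derived from the P values in Table 1. The following is a minimal sketch in Python, not part of the original pipeline; the variable names are illustrative and the values are copied from the table above.

```python
# P values copied from Table 1; column order follows the table header.
features = ["Time to Death", "AGE", "GENDER",
            "HISTOLOGICAL.TYPE", "NEOADJUVANT.THERAPY"]
p_values = {
    "METHLYATION CNMF":          [0.0279,   0.00227, 0.879,   9.77e-10, 1],
    "RNAseq CNMF subtypes":      [0.0652,   0.0134,  0.108,   2.43e-08, 1],
    "RNAseq cHierClus subtypes": [0.102,    0.00898, 0.00519, 2.72e-10, 0.24],
    "MIRseq CNMF subtypes":      [0.0404,   0.13,    0.261,   1.32e-06, 1],
    "MIRseq cHierClus subtypes": [0.000532, 0.0021,  0.424,   3.09e-05, 0.662],
}

THRESHOLD = 0.05  # significance threshold used in the report

# Collect every (approach, feature) pair whose P value is below the threshold.
significant = [
    (approach, feature)
    for approach, row in p_values.items()
    for feature, p in zip(features, row)
    if p < THRESHOLD
]

print(len(significant))  # 13
```

The thresholding is applied to unadjusted P values, as in the report; no multiple-testing correction is assumed here.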
Clustering Approach #1: 'METHLYATION CNMF'

Table S1.  Description of clustering approach #1: 'METHLYATION CNMF'

Cluster Labels 1 2 3
Number of samples 42 24 91
'METHLYATION CNMF' versus 'Time to Death'

P value = 0.0279 (logrank test)

Table S2.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 157 1 0.1 - 66.1 (9.0)
subtype1 42 1 0.2 - 65.9 (7.7)
subtype2 24 0 0.1 - 66.1 (6.9)
subtype3 91 0 0.2 - 66.1 (10.4)

Figure S1.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #1: 'Time to Death'

'METHLYATION CNMF' versus 'AGE'

P value = 0.00227 (ANOVA)

Table S3.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 157 46.7 (15.9)
subtype1 42 53.6 (16.9)
subtype2 24 41.4 (14.2)
subtype3 91 44.9 (14.9)

Figure S2.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #2: 'AGE'

'METHLYATION CNMF' versus 'GENDER'

P value = 0.879 (Fisher's exact test)

Table S4.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #3: 'GENDER'

nPatients FEMALE MALE
ALL 44 113
subtype1 13 29
subtype2 6 18
subtype3 25 66

Figure S3.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #3: 'GENDER'

'METHLYATION CNMF' versus 'HISTOLOGICAL.TYPE'

P value = 9.77e-10 (Chi-square test)

Table S5.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

nPatients | OTHER | THYROID PAPILLARY CARCINOMA - CLASSICAL/USUAL | THYROID PAPILLARY CARCINOMA - FOLLICULAR (>= 99% FOLLICULAR PATTERNED) | THYROID PAPILLARY CARCINOMA - TALL CELL (>= 50% TALL CELL FEATURES)
ALL | 7 | 88 | 43 | 19
subtype1 | 4 | 8 | 28 | 2
subtype2 | 1 | 15 | 5 | 3
subtype3 | 2 | 65 | 10 | 14

Figure S4.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

'METHLYATION CNMF' versus 'NEOADJUVANT.THERAPY'

P value = 1 (Fisher's exact test)

Table S6.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 154 3
subtype1 41 1
subtype2 24 0
subtype3 89 2

Figure S5.  Clustering Approach #1: 'METHLYATION CNMF' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

Clustering Approach #2: 'RNAseq CNMF subtypes'

Table S7.  Description of clustering approach #2: 'RNAseq CNMF subtypes'

Cluster Labels 1 2 3 4
Number of samples 33 28 42 13
'RNAseq CNMF subtypes' versus 'Time to Death'

P value = 0.0652 (logrank test)

Table S8.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 116 1 0.1 - 65.9 (8.9)
subtype1 33 1 0.3 - 65.9 (7.6)
subtype2 28 0 0.5 - 65.7 (9.3)
subtype3 42 0 1.0 - 65.9 (10.2)
subtype4 13 0 0.1 - 65.9 (9.2)

Figure S6.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

'RNAseq CNMF subtypes' versus 'AGE'

P value = 0.0134 (ANOVA)

Table S9.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 116 47.7 (16.3)
subtype1 33 55.0 (16.9)
subtype2 28 42.3 (13.6)
subtype3 42 46.6 (16.2)
subtype4 13 44.4 (15.4)

Figure S7.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

'RNAseq CNMF subtypes' versus 'GENDER'

P value = 0.108 (Fisher's exact test)

Table S10.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #3: 'GENDER'

nPatients FEMALE MALE
ALL 35 81
subtype1 11 22
subtype2 4 24
subtype3 17 25
subtype4 3 10

Figure S8.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #3: 'GENDER'

'RNAseq CNMF subtypes' versus 'HISTOLOGICAL.TYPE'

P value = 2.43e-08 (Chi-square test)

Table S11.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

nPatients | OTHER | THYROID PAPILLARY CARCINOMA - CLASSICAL/USUAL | THYROID PAPILLARY CARCINOMA - FOLLICULAR (>= 99% FOLLICULAR PATTERNED) | THYROID PAPILLARY CARCINOMA - TALL CELL (>= 50% TALL CELL FEATURES)
ALL | 5 | 63 | 32 | 16
subtype1 | 3 | 6 | 23 | 1
subtype2 | 0 | 22 | 4 | 2
subtype3 | 1 | 28 | 3 | 10
subtype4 | 1 | 7 | 2 | 3

Figure S9.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

'RNAseq CNMF subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 1 (Fisher's exact test)

Table S12.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 113 3
subtype1 32 1
subtype2 27 1
subtype3 41 1
subtype4 13 0

Figure S10.  Clustering Approach #2: 'RNAseq CNMF subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

Clustering Approach #3: 'RNAseq cHierClus subtypes'

Table S13.  Description of clustering approach #3: 'RNAseq cHierClus subtypes'

Cluster Labels 1 2 3
Number of samples 67 17 32
'RNAseq cHierClus subtypes' versus 'Time to Death'

P value = 0.102 (logrank test)

Table S14.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 116 1 0.1 - 65.9 (8.9)
subtype1 67 0 0.1 - 65.9 (9.4)
subtype2 17 0 0.5 - 65.7 (8.0)
subtype3 32 1 0.3 - 65.9 (7.7)

Figure S11.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

'RNAseq cHierClus subtypes' versus 'AGE'

P value = 0.00898 (ANOVA)

Table S15.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 116 47.7 (16.3)
subtype1 67 45.3 (15.4)
subtype2 17 43.4 (14.7)
subtype3 32 55.1 (17.0)

Figure S12.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

'RNAseq cHierClus subtypes' versus 'GENDER'

P value = 0.00519 (Fisher's exact test)

Table S16.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #3: 'GENDER'

nPatients FEMALE MALE
ALL 35 81
subtype1 24 43
subtype2 0 17
subtype3 11 21

Figure S13.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #3: 'GENDER'

'RNAseq cHierClus subtypes' versus 'HISTOLOGICAL.TYPE'

P value = 2.72e-10 (Chi-square test)

Table S17.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

nPatients | OTHER | THYROID PAPILLARY CARCINOMA - CLASSICAL/USUAL | THYROID PAPILLARY CARCINOMA - FOLLICULAR (>= 99% FOLLICULAR PATTERNED) | THYROID PAPILLARY CARCINOMA - TALL CELL (>= 50% TALL CELL FEATURES)
ALL | 5 | 63 | 32 | 16
subtype1 | 2 | 44 | 6 | 15
subtype2 | 0 | 14 | 3 | 0
subtype3 | 3 | 5 | 23 | 1

Figure S14.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

'RNAseq cHierClus subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.24 (Fisher's exact test)

Table S18.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 113 3
subtype1 66 1
subtype2 17 0
subtype3 30 2

Figure S15.  Clustering Approach #3: 'RNAseq cHierClus subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

Clustering Approach #4: 'MIRseq CNMF subtypes'

Table S19.  Description of clustering approach #4: 'MIRseq CNMF subtypes'

Cluster Labels 1 2 3
Number of samples 34 46 30
'MIRseq CNMF subtypes' versus 'Time to Death'

P value = 0.0404 (logrank test)

Table S20.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 110 1 0.1 - 66.1 (8.1)
subtype1 34 1 0.3 - 65.9 (7.8)
subtype2 46 0 0.2 - 66.1 (16.0)
subtype3 30 0 0.1 - 65.9 (6.9)

Figure S16.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #1: 'Time to Death'

'MIRseq CNMF subtypes' versus 'AGE'

P value = 0.13 (ANOVA)

Table S21.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 110 45.8 (16.2)
subtype1 34 50.0 (18.5)
subtype2 46 42.6 (14.5)
subtype3 30 46.0 (15.5)

Figure S17.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #2: 'AGE'

'MIRseq CNMF subtypes' versus 'GENDER'

P value = 0.261 (Fisher's exact test)

Table S22.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #3: 'GENDER'

nPatients FEMALE MALE
ALL 33 77
subtype1 13 21
subtype2 10 36
subtype3 10 20

Figure S18.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #3: 'GENDER'

'MIRseq CNMF subtypes' versus 'HISTOLOGICAL.TYPE'

P value = 1.32e-06 (Chi-square test)

Table S23.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

nPatients | OTHER | THYROID PAPILLARY CARCINOMA - CLASSICAL/USUAL | THYROID PAPILLARY CARCINOMA - FOLLICULAR (>= 99% FOLLICULAR PATTERNED) | THYROID PAPILLARY CARCINOMA - TALL CELL (>= 50% TALL CELL FEATURES)
ALL | 6 | 58 | 34 | 12
subtype1 | 3 | 6 | 23 | 2
subtype2 | 2 | 32 | 8 | 4
subtype3 | 1 | 20 | 3 | 6

Figure S19.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

'MIRseq CNMF subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 1 (Fisher's exact test)

Table S24.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 108 2
subtype1 33 1
subtype2 45 1
subtype3 30 0

Figure S20.  Clustering Approach #4: 'MIRseq CNMF subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

Clustering Approach #5: 'MIRseq cHierClus subtypes'

Table S25.  Description of clustering approach #5: 'MIRseq cHierClus subtypes'

Cluster Labels 1 2 3
Number of samples 20 45 45
'MIRseq cHierClus subtypes' versus 'Time to Death'

P value = 0.000532 (logrank test)

Table S26.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

nPatients nDeath Duration Range (Median), Month
ALL 110 1 0.1 - 66.1 (8.1)
subtype1 20 1 0.4 - 46.7 (7.4)
subtype2 45 0 0.2 - 66.1 (8.0)
subtype3 45 0 0.1 - 66.0 (12.3)

Figure S21.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #1: 'Time to Death'

'MIRseq cHierClus subtypes' versus 'AGE'

P value = 0.0021 (ANOVA)

Table S27.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

nPatients Mean (Std.Dev)
ALL 110 45.8 (16.2)
subtype1 20 56.6 (15.7)
subtype2 45 41.6 (15.0)
subtype3 45 45.2 (15.9)

Figure S22.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #2: 'AGE'

'MIRseq cHierClus subtypes' versus 'GENDER'

P value = 0.424 (Fisher's exact test)

Table S28.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #3: 'GENDER'

nPatients FEMALE MALE
ALL 33 77
subtype1 8 12
subtype2 14 31
subtype3 11 34

Figure S23.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #3: 'GENDER'

'MIRseq cHierClus subtypes' versus 'HISTOLOGICAL.TYPE'

P value = 3.09e-05 (Chi-square test)

Table S29.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

nPatients | OTHER | THYROID PAPILLARY CARCINOMA - CLASSICAL/USUAL | THYROID PAPILLARY CARCINOMA - FOLLICULAR (>= 99% FOLLICULAR PATTERNED) | THYROID PAPILLARY CARCINOMA - TALL CELL (>= 50% TALL CELL FEATURES)
ALL | 6 | 58 | 34 | 12
subtype1 | 3 | 2 | 14 | 1
subtype2 | 1 | 27 | 14 | 3
subtype3 | 2 | 29 | 6 | 8

Figure S24.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #4: 'HISTOLOGICAL.TYPE'

'MIRseq cHierClus subtypes' versus 'NEOADJUVANT.THERAPY'

P value = 0.662 (Fisher's exact test)

Table S30.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

nPatients NO YES
ALL 108 2
subtype1 20 0
subtype2 43 2
subtype3 45 0

Figure S25.  Clustering Approach #5: 'MIRseq cHierClus subtypes' versus Clinical Feature #5: 'NEOADJUVANT.THERAPY'

Methods & Data
Input
  • Cluster data file = THCA.mergedcluster.txt

  • Clinical data file = THCA.clin.merged.picked.txt

  • Number of patients = 157

  • Number of clustering approaches = 5

  • Number of selected clinical features = 5

  • Exclude small clusters that include fewer than K patients, K = 3
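
The small-cluster exclusion listed above can be sketched as follows. This is an illustration only: the layout of the cluster data file is not described in this report, so the patient identifiers and the dictionary structure below are hypothetical.

```python
from collections import Counter

def drop_small_clusters(labels, k=3):
    """Keep only patients whose cluster contains at least k patients.

    `labels` maps a patient identifier to a cluster label (assumed layout).
    """
    sizes = Counter(labels.values())
    return {pid: c for pid, c in labels.items() if sizes[c] >= k}

# Hypothetical labels: cluster 2 has only two members and is excluded for K = 3.
example_labels = {"P1": 1, "P2": 1, "P3": 1, "P4": 2, "P5": 2}
kept = drop_small_clusters(example_labels, k=3)
print(sorted(kept))  # ['P1', 'P2', 'P3']
```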

Clustering approaches
CNMF clustering

Consensus non-negative matrix factorization (CNMF) clustering approach (Brunet et al. 2004)

Consensus hierarchical clustering

Resampling-based clustering method (Monti et al. 2003)

Survival analysis

For survival clinical features, the Kaplan-Meier survival curves of the tumor subtypes were plotted and the statistical significance P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R
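
As an illustration of the logrank statistic, here is a minimal two-group sketch in plain Python. It is not the 'survdiff' implementation used by the pipeline, it handles only two groups (the report compares three or four subtypes), and the follow-up data are invented for the example because per-patient survival data are not included in this report.

```python
from math import erfc, sqrt

def logrank_two_groups(times, events, groups):
    """Two-group logrank test.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    groups : 0 or 1 per patient
    Returns (chi-square statistic, P value) with 1 degree of freedom.
    """
    obs_minus_exp = 0.0
    variance = 0.0
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)      # patients still at risk just before t
        n1 = sum(at_risk)     # of those, patients in group 1
        deaths = [g for ti, e, g in zip(times, events, groups)
                  if ti == t and e == 1]
        d = len(deaths)       # deaths at time t
        d1 = sum(deaths)      # of those, deaths in group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins at time t.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Invented data: group 1 dies early, group 0 dies late, no censoring.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_events = [1] * 10
toy_groups = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
chi2, p = logrank_two_groups(toy_times, toy_events, toy_groups)
print(chi2, p)
```

With a single death in the whole cohort, as in the survival tables of this report, a logrank P value rests on very little information and should be read with caution.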

ANOVA analysis

For continuous numerical clinical features, one-way analysis of variance (Howell 2002) was applied to compare the clinical values between tumor subtypes using the 'anova' function in R
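
The one-way ANOVA F statistic can be approximated from the per-subtype summary statistics alone. The sketch below is in plain Python rather than R and uses the rounded counts, means and standard deviations from Table S3, so it only approximates the reported P value of 0.00227, which was computed from per-patient ages. It assumes the reported Std.Dev values are sample standard deviations (n - 1 denominator), and the closed-form tail probability is valid only when exactly three groups are compared.

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) triples, one per group.

    Returns (F statistic, df between groups, df within groups).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Assumes sd is the sample standard deviation of each group.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def f_upper_tail_df1_is_2(f_stat, df_within):
    """P(F > f_stat) for an F distribution with (2, df_within) degrees of freedom."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

# Table S3: 'METHLYATION CNMF' versus 'AGE' as (nPatients, mean, Std.Dev).
table_s3 = [(42, 53.6, 16.9), (24, 41.4, 14.2), (91, 44.9, 14.9)]
f_stat, df_between, df_within = anova_from_summary(table_s3)
assert df_between == 2  # the closed form below requires exactly 3 groups
anova_p = f_upper_tail_df1_is_2(f_stat, df_within)
print(f_stat, anova_p)
```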

Fisher's exact test

For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R
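
For a table with several subtypes and one binary feature, the exact test enumerates every table with the same row and column totals and sums the probabilities of those no more likely than the observed one. The following is a from-scratch sketch in plain Python under that definition, not the 'fisher.test' code itself; the function name is illustrative. It compares table probabilities exactly with integers, whereas R applies a small relative tolerance.

```python
from math import comb

def fisher_exact_rx2(table):
    """Two-sided exact P value for an r x 2 contingency table.

    `table` is a list of (count_a, count_b) pairs, one per subtype.
    """
    row_totals = [a + b for a, b in table]
    first_col_total = sum(a for a, _ in table)

    def weight(first_col):
        # Proportional to the probability of a table with fixed margins.
        w = 1
        for n, a in zip(row_totals, first_col):
            w *= comb(n, a)
        return w

    def first_columns(i, remaining):
        # Every way to spread `remaining` counts over rows i..end.
        if i == len(row_totals) - 1:
            if remaining <= row_totals[i]:
                yield (remaining,)
            return
        for a in range(min(row_totals[i], remaining) + 1):
            for rest in first_columns(i + 1, remaining - a):
                yield (a,) + rest

    observed = weight([a for a, _ in table])
    total = 0
    as_or_less_likely = 0
    for cols in first_columns(0, first_col_total):
        w = weight(cols)
        total += w
        if w <= observed:
            as_or_less_likely += w
    return as_or_less_likely / total

# Table S16: 'RNAseq cHierClus subtypes' versus 'GENDER' (FEMALE, MALE).
table_s16 = [(24, 43), (0, 17), (11, 21)]
# Table S6: 'METHLYATION CNMF' versus 'NEOADJUVANT.THERAPY' (NO, YES).
table_s6 = [(41, 1), (24, 0), (89, 2)]
p_s16 = fisher_exact_rx2(table_s16)
p_s6 = fisher_exact_rx2(table_s6)
print(p_s16, p_s6)
```

For Table S6 the observed table is the most probable one under its margins, so the sketch returns 1, in agreement with the report. For Table S16 the report lists P value = 0.00519.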

Chi-square test

For multi-class clinical features (nominal or ordinal), Chi-square tests (Greenwood and Nikulin 1996) were used to estimate the P values using the 'chisq.test' function in R
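
The Pearson chi-square statistic can be recomputed directly from the contingency tables in this report. The sketch below is plain Python, not the 'chisq.test' code; it uses the counts from Table S5, and the closed-form tail probability applies only when the degrees of freedom are even (6 for a 3 x 4 table).

```python
from math import exp

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_square_upper_tail_even_df(stat, df):
    """P(X > stat) for a chi-square distribution with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive, even df")
    half = stat / 2
    term = 1.0
    total = 1.0
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total

# Table S5: 'METHLYATION CNMF' versus 'HISTOLOGICAL.TYPE'.
# Columns: OTHER, CLASSICAL/USUAL, FOLLICULAR, TALL CELL.
table_s5 = [
    [4, 8, 28, 2],    # subtype1
    [1, 15, 5, 3],    # subtype2
    [2, 65, 10, 14],  # subtype3
]
chi2_stat, chi2_df = chi_square_statistic(table_s5)
chi2_p = chi_square_upper_tail_even_df(chi2_stat, chi2_df)
print(chi2_stat, chi2_df, chi2_p)
```

The resulting P value is on the order of 1e-9, consistent with the 9.77e-10 reported for this comparison. Several expected counts in the OTHER column are below 5, so the chi-square approximation is rough for this table.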

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References
[1] Brunet et al., Metagenes and molecular pattern discovery using matrix factorization, PNAS 101(12):4164-9 (2004)
[2] Monti et al., Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data, Machine Learning 52(1-2):91-118 (2003)
[3] Bland and Altman, Statistics notes: The logrank test, BMJ 328(7447):1073 (2004)
[4] Howell, D, Statistical Methods for Psychology. (5th ed.), Duxbury Press:324-5 (2002)
[5] Fisher, R.A., On the interpretation of chi-square from contingency tables, and the calculation of P, Journal of the Royal Statistical Society 85(1):87-94 (1922)
[6] Greenwood and Nikulin, A guide to chi-squared testing, Wiley, New York. ISBN 047155779X (1996)