This pipeline computes the correlation between cancer subtypes identified by different molecular patterns and selected clinical features.
Testing the association between subtypes identified by 4 different clustering approaches and 2 clinical features across 10 patients, no significant finding was detected at P value < 0.05 and Q value < 0.25.
- 2 subtypes were identified in the current cancer cohort by 'Copy Number Ratio CNMF subtypes'. These subtypes do not correlate with any clinical features.
- 2 subtypes were identified in the current cancer cohort by 'MIRSEQ CNMF'. These subtypes do not correlate with any clinical features.
- 2 subtypes were identified in the current cancer cohort by 'MIRSEQ CHIERARCHICAL'. These subtypes do not correlate with any clinical features.
- 3 subtypes were identified in the current cancer cohort by 'MIRseq Mature cHierClus subtypes'. These subtypes do not correlate with any clinical features.
Each cell gives the P value, with the Q value in parentheses.

Clinical Features | AGE | GENDER |
---|---|---|
Statistical Tests | ANOVA | Fisher's exact test |
Copy Number Ratio CNMF subtypes | 0.521 (1.00) | 1 (1.00) |
MIRSEQ CNMF | 0.648 (1.00) | 1 (1.00) |
MIRSEQ CHIERARCHICAL | 0.906 (1.00) | 0.429 (1.00) |
MIRseq Mature cHierClus subtypes | 0.747 (1.00) | 0.7 (1.00) |

Cluster Labels | 3 | 4 |
---|---|---|
Number of samples | 4 | 5 |
P value = 0.521 (t-test), Q value = 1
Subtype | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 9 | 46.8 (11.0) |
subtype3 | 4 | 49.5 (8.1) |
subtype4 | 5 | 44.6 (13.4) |
P value = 1 (Fisher's exact test), Q value = 1
nPatients | FEMALE | MALE |
---|---|---|
ALL | 6 | 3 |
subtype3 | 3 | 1 |
subtype4 | 3 | 2 |
Cluster Labels | 1 | 2 | 3 |
---|---|---|---|
Number of samples | 3 | 5 | 2 |
P value = 0.648 (t-test), Q value = 1
Subtype | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 8 | 47.1 (11.5) |
subtype1 | 3 | 43.7 (17.6) |
subtype2 | 5 | 49.2 (7.9) |
P value = 1 (Fisher's exact test), Q value = 1
nPatients | FEMALE | MALE |
---|---|---|
ALL | 6 | 2 |
subtype1 | 2 | 1 |
subtype2 | 4 | 1 |
Cluster Labels | 1 | 2 | 3 | 4 |
---|---|---|---|---|
Number of samples | 1 | 3 | 4 | 2 |
P value = 0.906 (t-test), Q value = 1
Subtype | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 7 | 50.1 (8.6) |
subtype2 | 3 | 49.7 (7.1) |
subtype3 | 4 | 50.5 (10.7) |
P value = 0.429 (Fisher's exact test), Q value = 1
nPatients | FEMALE | MALE |
---|---|---|
ALL | 5 | 2 |
subtype2 | 3 | 0 |
subtype3 | 2 | 2 |
Cluster Labels | 1 | 2 | 3 |
---|---|---|---|
Number of samples | 4 | 3 | 3 |
P value = 0.747 (ANOVA), Q value = 1
Subtype | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 10 | 48.3 (11.4) |
subtype1 | 4 | 50.8 (11.0) |
subtype2 | 3 | 49.7 (7.1) |
subtype3 | 3 | 43.7 (17.6) |
P value = 0.7 (Fisher's exact test), Q value = 1
nPatients | FEMALE | MALE |
---|---|---|
ALL | 7 | 3 |
subtype1 | 2 | 2 |
subtype2 | 3 | 0 |
subtype3 | 2 | 1 |
- Cluster data file = PCPG-TP.mergedcluster.txt
- Clinical data file = PCPG-TP.merged_data.txt
- Number of patients = 10
- Number of clustering approaches = 4
- Number of selected clinical features = 2
- Exclude small clusters that include fewer than K patients, K = 3
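
The small-cluster exclusion can be illustrated with a minimal Python sketch. The helper name `drop_small_clusters` and the dictionary layout are assumptions made for illustration and are not part of the pipeline; the cluster sizes are the ones reported above for the approach with four cluster labels.

```python
def drop_small_clusters(cluster_sizes, k=3):
    """Keep only clusters with at least k patients (hypothetical helper)."""
    return {label: n for label, n in cluster_sizes.items() if n >= k}


# Cluster label -> number of samples, as listed in the four-label table above.
sizes = {1: 1, 2: 3, 3: 4, 4: 2}
kept = drop_small_clusters(sizes, k=3)
print(kept)                 # {2: 3, 3: 4}
print(sum(kept.values()))   # 7
```

With K = 3, only cluster labels 2 and 3 remain, which is consistent with the 7 patients and the subtype2/subtype3 rows shown in the corresponding result tables.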
For continuous numerical clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the clinical values between two tumor subtypes using the 't.test' function in R.
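
As a rough illustration of the unequal-variance (Welch) t test, the following Python sketch works from summary statistics rather than raw data. It is not the pipeline's code (which uses R's 't.test'); the function names are hypothetical, and the tail probability is obtained by numerically integrating the Student t density with Simpson's rule, which is an assumption chosen to keep the example dependency-free.

```python
import math


def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, via Simpson's rule on the Student t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


def welch_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Two-tailed Welch t test from summary statistics (illustrative sketch)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, 2.0 * t_upper_tail(abs(t), df)


# AGE summary for subtype3 (n=4) and subtype4 (n=5) from the first AGE table.
t, df, p = welch_t_test(4, 49.5, 8.1, 5, 44.6, 13.4)
print(round(t, 3), round(df, 2), round(p, 2))
```

Because the inputs are the rounded means and standard deviations from the table, the result should land close to, but not necessarily exactly on, the reported P value of 0.521.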
For binary clinical features, two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
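
For a 2x2 table, the two-tailed Fisher's exact P value can be sketched in a few lines of Python. This is an illustrative re-implementation under the usual convention of summing all tables that are no more probable than the observed one; the function name `fisher_exact_2x2` is hypothetical, and tables with more than two subtypes (such as the three-subtype GENDER table) would need a more general routine.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-tailed Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the tables that are no more probable than the observed one
    # (a small relative tolerance guards against floating-point ties).
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# GENDER counts (FEMALE, MALE) for subtype2 and subtype3 in the
# 'MIRSEQ CHIERARCHICAL' table.
print(round(fisher_exact_2x2(3, 0, 2, 2), 3))  # 0.429
```

The same function applied to the 'Copy Number Ratio CNMF subtypes' counts (3, 1, 3, 2) gives a P value of 1, in line with the reported result.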
For continuous numerical clinical features, one-way analysis of variance (Howell 2002) was applied to compare the clinical values between tumor subtypes using the 'anova' function in R.
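
The one-way ANOVA F statistic can be reconstructed from per-group summary statistics, as in the Python sketch below. The function names are hypothetical and this is not the pipeline's R code. The P value helper relies on a closed form that holds only when the numerator degrees of freedom equal 2 (that is, exactly three groups); a general implementation would need the F distribution's cumulative function from a statistics library.

```python
def one_way_anova_f(groups):
    """F statistic and degrees of freedom from (n, mean, sd) per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df1, df2 = len(groups) - 1, n_total - len(groups)
    return (ss_between / df1) / (ss_within / df2), df1, df2


def f_upper_tail_df1_2(f, df2):
    """P(F > f) when the numerator df is exactly 2 (closed form)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


# AGE summary (n, mean, sd) for the three 'MIRseq Mature cHierClus' subtypes.
groups = [(4, 50.8, 11.0), (3, 49.7, 7.1), (3, 43.7, 17.6)]
f_stat, df1, df2 = one_way_anova_f(groups)
p_value = f_upper_tail_df1_2(f_stat, df2)
print(round(f_stat, 3), df1, df2, round(p_value, 2))
```

Since the table values are rounded, the computed P value should be close to the reported 0.747 rather than identical to it.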
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
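
The Benjamini and Hochberg conversion from P values to Q values can be sketched in Python as follows. The function name `bh_adjust` is hypothetical, and the example assumes that all eight tests in this report (4 clustering approaches by 2 clinical features) are corrected together, which the report does not state explicitly.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    n = len(p_values)
    # Walk from the largest P value down, keeping a running minimum so the
    # adjusted values stay monotone in the ranks.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    q = [0.0] * n
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = n - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q


# The eight P values from the summary table (AGE, GENDER for each approach).
p = [0.521, 1, 0.648, 1, 0.906, 0.429, 0.747, 0.7]
print(bh_adjust(p))  # every Q value is 1.0
```

Under that assumption, every adjusted value is capped at 1, which agrees with the Q value of 1 reported for each test.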
The full results of the analysis summarized in this report can be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or the TCGA Data Coordination Center Portal.