This pipeline computes the correlation between significantly recurrent gene mutations and selected clinical features.
Testing the association between the mutation status of 16 genes and 10 clinical features across 154 patients detected 6 significant findings at Q value < 0.25.
- PIK3CA mutation correlated to 'NUMBER.OF.LYMPH.NODES'.
- BRAF mutation correlated to 'HISTOLOGICAL.TYPE'.
- ACVR2A mutation correlated to 'NUMBER.OF.LYMPH.NODES'.
- SMAD2 mutation correlated to 'NEOPLASM.DISEASESTAGE'.
- PCBP1 mutation correlated to 'NEOPLASM.DISEASESTAGE'.
- GGT1 mutation correlated to 'NUMBER.OF.LYMPH.NODES'.
Gene | nMutated (%) | nWild-Type | Time to Death | AGE | NEOPLASM DISEASESTAGE | PATHOLOGY T STAGE | PATHOLOGY N STAGE | PATHOLOGY M STAGE | GENDER | HISTOLOGICAL TYPE | COMPLETENESS OF RESECTION | NUMBER OF LYMPH NODES |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Statistical test | | | logrank test | t-test | Chi-square test | Fisher's exact test | Fisher's exact test | Fisher's exact test | Fisher's exact test | Fisher's exact test | Fisher's exact test | t-test |
PIK3CA | 26 (17%) | 128 | 0.77 (1.00) | 0.228 (1.00) | 0.0285 (1.00) | 0.577 (1.00) | 0.01 (1.00) | 0.333 (1.00) | 0.83 (1.00) | 0.217 (1.00) | 0.446 (1.00) | 0.000218 (0.0338) |
BRAF | 20 (13%) | 134 | 0.315 (1.00) | 0.0284 (1.00) | 0.13 (1.00) | 0.0621 (1.00) | 0.798 (1.00) | 0.401 (1.00) | 0.0915 (1.00) | 3.92e-05 (0.00611) | 0.59 (1.00) | 0.808 (1.00) |
ACVR2A | 8 (5%) | 146 | 0.156 (1.00) | 0.695 (1.00) | 0.495 (1.00) | 0.891 (1.00) | 0.369 (1.00) | 1 (1.00) | 0.0632 (1.00) | 1 (1.00) | 1 (1.00) | 1.59e-06 (0.000251) |
SMAD2 | 10 (6%) | 144 | 0.608 (1.00) | 0.765 (1.00) | 0.000742 (0.114) | 0.412 (1.00) | 0.672 (1.00) | 0.636 (1.00) | 0.327 (1.00) | 0.16 (1.00) | 0.407 (1.00) | 0.275 (1.00) |
PCBP1 | 4 (3%) | 150 | 0.395 (1.00) | 0.00485 (0.733) | 5.99e-06 (0.00094) | 0.325 (1.00) | 0.641 (1.00) | 0.0347 (1.00) | 1 (1.00) | 1 (1.00) | 1 (1.00) | 0.132 (1.00) |
GGT1 | 3 (2%) | 151 | 0.852 (1.00) | 0.905 (1.00) | 0.712 (1.00) | 0.571 (1.00) | 1 (1.00) | 1 (1.00) | 1 (1.00) | 1 (1.00) | 6.02e-09 (9.58e-07) | |
APC | 103 (67%) | 51 | 0.715 (1.00) | 0.461 (1.00) | 0.811 (1.00) | 0.797 (1.00) | 0.893 (1.00) | 0.201 (1.00) | 0.0394 (1.00) | 0.809 (1.00) | 0.0413 (1.00) | 0.603 (1.00) |
KRAS | 58 (38%) | 96 | 0.684 (1.00) | 0.00449 (0.686) | 0.68 (1.00) | 0.968 (1.00) | 0.402 (1.00) | 0.271 (1.00) | 0.244 (1.00) | 0.813 (1.00) | 0.278 (1.00) | 0.383 (1.00) |
TP53 | 74 (48%) | 80 | 0.967 (1.00) | 0.0988 (1.00) | 0.144 (1.00) | 0.64 (1.00) | 0.217 (1.00) | 0.525 (1.00) | 0.42 (1.00) | 0.00469 (0.713) | 0.68 (1.00) | 0.113 (1.00) |
FBXW7 | 29 (19%) | 125 | 0.388 (1.00) | 0.041 (1.00) | 0.046 (1.00) | 0.114 (1.00) | 0.611 (1.00) | 0.0169 (1.00) | 0.216 (1.00) | 0.0151 (1.00) | 0.0805 (1.00) | 0.754 (1.00) |
NRAS | 15 (10%) | 139 | 0.889 (1.00) | 0.151 (1.00) | 0.483 (1.00) | 0.524 (1.00) | 0.0775 (1.00) | 0.496 (1.00) | 0.0268 (1.00) | 1 (1.00) | 0.278 (1.00) | 0.128 (1.00) |
SMAD4 | 18 (12%) | 136 | 0.028 (1.00) | 0.878 (1.00) | 0.95 (1.00) | 0.912 (1.00) | 0.834 (1.00) | 1 (1.00) | 0.803 (1.00) | 0.299 (1.00) | 0.773 (1.00) | 0.404 (1.00) |
FAM123B | 19 (12%) | 135 | 0.136 (1.00) | 0.579 (1.00) | 0.796 (1.00) | 0.755 (1.00) | 0.254 (1.00) | 1 (1.00) | 1 (1.00) | 0.74 (1.00) | 1 (1.00) | 0.665 (1.00) |
SOX9 | 9 (6%) | 145 | 0.387 (1.00) | 0.298 (1.00) | 0.138 (1.00) | 0.22 (1.00) | 0.899 (1.00) | 1 (1.00) | 1 (1.00) | 1 (1.00) | 1 (1.00) | 0.326 (1.00) |
TNFRSF10C | 6 (4%) | 148 | 0.816 (1.00) | 0.953 (1.00) | 0.334 (1.00) | 0.104 (1.00) | 0.246 (1.00) | 0.0736 (1.00) | 0.681 (1.00) | 0.594 (1.00) | 0.106 (1.00) | 0.178 (1.00) |
ACOT4 | 3 (2%) | 151 | 0.633 (1.00) | 0.0308 (1.00) | 0.982 (1.00) | 0.035 (1.00) | 1 (1.00) | 0.377 (1.00) | 1 (1.00) | 0.377 (1.00) | 0.368 (1.00) | 0.649 (1.00) |
P value = 0.000218 (t-test), Q value = 0.034
Group | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 153 | 2.2 (4.5) |
PIK3CA MUTATED | 26 | 0.6 (1.6) |
PIK3CA WILD-TYPE | 127 | 2.6 (4.8) |
P value = 3.92e-05 (Fisher's exact test), Q value = 0.0061
nPatients | COLON ADENOCARCINOMA | COLON MUCINOUS ADENOCARCINOMA |
---|---|---|
ALL | 130 | 22 |
BRAF MUTATED | 10 | 10 |
BRAF WILD-TYPE | 120 | 12 |
P value = 1.59e-06 (t-test), Q value = 0.00025
Group | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 153 | 2.2 (4.5) |
ACVR2A MUTATED | 8 | 0.2 (0.5) |
ACVR2A WILD-TYPE | 145 | 2.3 (4.6) |
P value = 0.000742 (Chi-square test), Q value = 0.11
nPatients | STAGE I | STAGE II | STAGE IIA | STAGE IIB | STAGE III | STAGE IIIA | STAGE IIIB | STAGE IIIC | STAGE IV | STAGE IVA |
---|---|---|---|---|---|---|---|---|---|---|
ALL | 31 | 14 | 38 | 4 | 14 | 2 | 12 | 16 | 21 | 1 |
SMAD2 MUTATED | 1 | 4 | 0 | 1 | 0 | 1 | 0 | 0 | 2 | 0 |
SMAD2 WILD-TYPE | 30 | 10 | 38 | 3 | 14 | 1 | 12 | 16 | 19 | 1 |
P value = 5.99e-06 (Chi-square test), Q value = 0.00094
nPatients | STAGE I | STAGE II | STAGE IIA | STAGE IIB | STAGE III | STAGE IIIA | STAGE IIIB | STAGE IIIC | STAGE IV | STAGE IVA |
---|---|---|---|---|---|---|---|---|---|---|
ALL | 31 | 14 | 38 | 4 | 14 | 2 | 12 | 16 | 21 | 1 |
PCBP1 MUTATED | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
PCBP1 WILD-TYPE | 30 | 14 | 36 | 4 | 14 | 2 | 12 | 16 | 21 | 0 |
P value = 6.02e-09 (t-test), Q value = 9.6e-07
Group | nPatients | Mean (Std.Dev) |
---|---|---|
ALL | 153 | 2.2 (4.5) |
GGT1 MUTATED | 3 | 0.0 (0.0) |
GGT1 WILD-TYPE | 150 | 2.3 (4.5) |
- Mutation data file = transformed.cor.cli.txt
- Clinical data file = COAD-TP.merged_data.txt
- Number of patients = 154
- Number of significantly mutated genes = 16
- Number of selected clinical features = 10
- Genes mutated in fewer than K tumors are excluded, K = 3
For survival clinical features, the Kaplan-Meier survival curves of tumors with and without gene mutations were plotted, and the statistical significance P values were estimated by the logrank test (Bland and Altman 2004) using the 'survdiff' function in R.
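To illustrate what the logrank test computes, the following is a minimal Python sketch of the two-group statistic, written with the standard library only. It is not the pipeline's code (the report used R's 'survdiff'); the function name `logrank_test` and the toy survival times are assumptions made for this example.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi-square statistic, P value).

    times_*  : follow-up times
    events_* : 1 if the death was observed, 0 if the patient was censored
    """
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk = [s for s in samples if s[0] >= t]
        n = len(at_risk)                                   # all patients at risk
        n_a = sum(1 for s in at_risk if s[2] == 0)         # at risk in group A
        d = sum(1 for s in at_risk if s[0] == t and s[1] == 1)  # deaths at t
        d_a = sum(1 for s in at_risk
                  if s[0] == t and s[1] == 1 and s[2] == 0)     # deaths in A at t
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for a single patient
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value

# Toy data (hypothetical): five early deaths versus five late deaths.
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                             [10, 11, 12, 13, 14], [1] * 5)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

In the report's setting, group A would be the tumors carrying a mutation in a given gene and group B the wild-type tumors.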
For continuous numerical clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the clinical values between tumors with and without gene mutations using the 't.test' function in R.
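The unequal-variance (Welch) form of the test can be sketched from summary statistics alone. The Python snippet below is an illustration, not the pipeline's R call; the helper name `welch_t` is an assumption. It uses the rounded PIK3CA means and standard deviations from the NUMBER.OF.LYMPH.NODES table in this report, so it will only approximate the value computed from the per-patient data. The P value itself additionally requires the t distribution's tail probability, which is left out here.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1's mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Rounded summary statistics: PIK3CA mutated (n = 26) vs. wild-type (n = 127).
t, df = welch_t(0.6, 1.6, 26, 2.6, 4.8, 127)
print(f"t = {t:.2f}, df = {df:.1f}")
```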
For multi-class clinical features (nominal or ordinal), Chi-square tests (Greenwood and Nikulin 1996) were used to estimate the P values using the 'chisq.test' function in R.
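As a worked illustration, the Pearson chi-square statistic can be computed directly from a contingency table of counts. The Python sketch below is not the pipeline's code (the report used R's 'chisq.test'); the function name `chi_square_statistic` is an assumption, and the counts are the SMAD2 by NEOPLASM.DISEASESTAGE table from this report. Converting the statistic to a P value requires the chi-square distribution's upper tail at the returned degrees of freedom, which is omitted here.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Stage counts for SMAD2 mutated (first row) and wild-type (second row) tumors.
smad2_table = [
    [1, 4, 0, 1, 0, 1, 0, 0, 2, 0],
    [30, 10, 38, 3, 14, 1, 12, 16, 19, 1],
]
stat, df = chi_square_statistic(smad2_table)
print(f"chi2 = {stat:.2f}, df = {df}")
```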
For binary or multi-class clinical features (nominal or ordinal), two-tailed Fisher's exact tests (Fisher 1922) were used to estimate the P values using the 'fisher.test' function in R.
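For the binary case, the two-tailed test can be sketched in a few lines of Python. This is an illustration rather than the pipeline's code (the report used R's 'fisher.test'), it covers 2 x 2 tables only, and the function name `fisher_exact_2x2` is an assumption. The counts are the BRAF by HISTOLOGICAL.TYPE table from this report, for which the result should be close to the reported P value of 3.92e-05.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-7))

# BRAF mutated: 10 adenocarcinoma, 10 mucinous; wild-type: 120 and 12.
p_value = fisher_exact_2x2(10, 10, 120, 12)
print(f"P = {p_value:.3g}")
```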
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
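The Benjamini and Hochberg adjustment named above can be sketched as follows. This is a plain Python illustration of the textbook procedure, not the pipeline's R call; the function name `benjamini_hochberg` and the example P values are hypothetical and are not taken from this report.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in the input order."""
    m = len(p_values)
    # Walk from the largest P value to the smallest so that a running minimum
    # keeps the adjusted values monotone.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of p_values[i] in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical P values for four tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A finding would then be called significant when its Q value falls below the chosen threshold, which is 0.25 in this report.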
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or the TCGA Data Coordination Center Portal.