(primary solid tumor cohort)
This pipeline uses various statistical tests to identify genes whose promoter methylation levels correlate with selected clinical features.
We tested the association between 17321 genes and 8 clinical features across 529 samples. At a significance threshold of Q value < 0.05, 7 clinical features were related to at least one gene.
- 1 gene correlated to 'Time to Death'.
  - CDC73
- 131 genes correlated to 'AGE'.
  - KIF15, MEX3C, EGR2, C10ORF35, LGALS8, ...
- 189 genes correlated to 'GENDER'.
  - ALDOC, ZNF486, CRIP1, DNAJC15, NMNAT3, ...
- 162 genes correlated to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
  - CCDC86, NDUFB4, TUBA4B, HS1BP3, MAP3K10, ...
- 12 genes correlated to 'DISTANT.METASTASIS'.
  - NHEDC1, RHBDL3, MMAB, SAG, PACRGL, ...
- 68 genes correlated to 'LYMPH.NODE.METASTASIS'.
  - SS18L1, ZNF235, MANBAL, HCRTR2, TMEM33, ...
- 26 genes correlated to 'NEOPLASM.DISEASESTAGE'.
  - IDH3B, DRG2, HIST1H4C, WDR74, RHBDL3, ...
No genes correlated to 'NUMBER.OF.LYMPH.NODES'.
The complete statistical result table is provided in Supplement Table 1.
Clinical feature | Statistical test | Significant genes | Associated with | | Associated with | |
---|---|---|---|---|---|---|
Time to Death | Cox regression test | N=1 | shorter survival | N=0 | longer survival | N=1 |
AGE | Spearman correlation test | N=131 | older | N=119 | younger | N=12 |
GENDER | t test | N=189 | male | N=42 | female | N=147 |
RADIATIONS RADIATION REGIMENINDICATION | t test | N=162 | yes | N=139 | no | N=23 |
DISTANT METASTASIS | ANOVA test | N=12 | ||||
LYMPH NODE METASTASIS | ANOVA test | N=68 | ||||
NUMBER OF LYMPH NODES | Spearman correlation test | N=0 | ||||
NEOPLASM DISEASESTAGE | ANOVA test | N=26 |
Time to Death | Duration (Months) | 0-223.4 (median=17.9) |
censored | N = 443 | |
death | N = 58 | |
Significant markers | N = 1 | |
associated with shorter survival | 0 | |
associated with longer survival | 1 |
Gene | HazardRatio | Wald_P | Q | C_index |
---|---|---|---|---|
CDC73 | 0 | 1.405e-06 | 0.024 | 0.355 |
AGE | Mean (SD) | 57.58 (13) |
Significant markers | N = 131 | |
pos. correlated | 119 | |
neg. correlated | 12 |
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
KIF15 | 0.3171 | 8.453e-14 | 1.46e-09 |
MEX3C | 0.2847 | 2.668e-11 | 4.62e-07 |
EGR2 | 0.284 | 2.983e-11 | 5.17e-07 |
C10ORF35 | 0.2822 | 3.997e-11 | 6.92e-07 |
LGALS8 | -0.2817 | 4.347e-11 | 7.53e-07 |
RPL13A | 0.2792 | 6.589e-11 | 1.14e-06 |
FASN | 0.2735 | 1.631e-10 | 2.82e-06 |
RPL27A | 0.2654 | 5.773e-10 | 1e-05 |
RPL7A | 0.2643 | 6.913e-10 | 1.2e-05 |
CACNA2D1 | 0.2625 | 9.072e-10 | 1.57e-05 |
GENDER | Labels | N |
FEMALE | 523 | |
MALE | 6 | |
Significant markers | N = 189 | |
Higher in MALE | 42 | |
Higher in FEMALE | 147 |
Gene | T(pos if higher in 'MALE') | ttestP | Q | AUC |
---|---|---|---|---|
ALDOC | -25.64 | 2.894e-93 | 5.01e-89 | 0.8674 |
ZNF486 | -18.28 | 5.134e-58 | 8.89e-54 | 0.8174 |
CRIP1 | -16.89 | 1.803e-51 | 3.12e-47 | 0.8725 |
DNAJC15 | -13.78 | 1.012e-35 | 1.75e-31 | 0.7333 |
NMNAT3 | -13.23 | 4.002e-34 | 6.93e-30 | 0.6918 |
LOC400043 | -13.14 | 3.487e-31 | 6.04e-27 | 0.6025 |
RND2 | -13.17 | 1.867e-28 | 3.23e-24 | 0.7916 |
EML1 | -11.43 | 5.597e-27 | 9.69e-23 | 0.6083 |
SPC25 | -12.2 | 2.984e-26 | 5.17e-22 | 0.7511 |
HSPC157 | -12.96 | 6.478e-25 | 1.12e-20 | 0.6316 |
162 genes related to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
RADIATIONS.RADIATION.REGIMENINDICATION | Labels | N |
NO | 141 | |
YES | 388 | |
Significant markers | N = 162 | |
Higher in YES | 139 | |
Higher in NO | 23 |
Gene | T(pos if higher in 'YES') | ttestP | Q | AUC |
---|---|---|---|---|
CCDC86 | 6.92 | 2.585e-11 | 4.48e-07 | 0.6804 |
NDUFB4 | 6.58 | 1.989e-10 | 3.45e-06 | 0.6688 |
TUBA4B | 6.41 | 3.966e-10 | 6.87e-06 | 0.6407 |
HS1BP3 | 6.3 | 9.065e-10 | 1.57e-05 | 0.6613 |
MAP3K10 | 6.27 | 1.214e-09 | 2.1e-05 | 0.6505 |
TFAP4 | 6.22 | 1.286e-09 | 2.23e-05 | 0.641 |
PTRH1 | 6.15 | 1.522e-09 | 2.63e-05 | 0.616 |
TICAM1 | 6.19 | 1.946e-09 | 3.37e-05 | 0.6622 |
DDX54 | 6.17 | 2.074e-09 | 3.59e-05 | 0.6544 |
RASL11A | 6.08 | 2.549e-09 | 4.41e-05 | 0.6328 |
DISTANT.METASTASIS | Labels | N |
CM0 (I+) | 1 | |
M0 | 459 | |
M1 | 6 | |
MX | 63 | |
Significant markers | N = 12 |
Gene | ANOVA_P | Q |
---|---|---|
NHEDC1 | 6.226e-25 | 1.08e-20 |
RHBDL3 | 2.073e-21 | 3.59e-17 |
MMAB | 6.627e-20 | 1.15e-15 |
SAG | 4.444e-11 | 7.7e-07 |
PACRGL | 1.58e-09 | 2.74e-05 |
C9ORF153 | 9.672e-09 | 0.000167 |
TRIM21 | 1.656e-07 | 0.00287 |
ABHD10 | 2.799e-07 | 0.00485 |
LARS | 2.919e-07 | 0.00505 |
CRKL | 1.092e-06 | 0.0189 |
LYMPH.NODE.METASTASIS | Labels | N |
N0 | 141 | |
N0 (I+) | 14 | |
N0 (I-) | 74 | |
N0 (MOL+) | 1 | |
N1 | 70 | |
N1A | 81 | |
N1B | 23 | |
N1C | 2 | |
N1MI | 13 | |
N2 | 37 | |
N2A | 38 | |
N3 | 11 | |
N3A | 16 | |
N3B | 1 | |
NX | 7 | |
Significant markers | N = 68 |
Gene | ANOVA_P | Q |
---|---|---|
SS18L1 | 6.738e-72 | 1.17e-67 |
ZNF235 | 2.768e-51 | 4.79e-47 |
MANBAL | 1.293e-34 | 2.24e-30 |
HCRTR2 | 5.343e-25 | 9.25e-21 |
TMEM33 | 3.921e-20 | 6.79e-16 |
ZNF33A | 5.597e-19 | 9.69e-15 |
SCRN2 | 5.621e-16 | 9.73e-12 |
ING5 | 1.378e-12 | 2.39e-08 |
KCNQ5 | 4.066e-11 | 7.04e-07 |
TMEM208 | 2.539e-10 | 4.4e-06 |
NUMBER.OF.LYMPH.NODES | Mean (SD) | 2.42 (4.5) |
Significant markers | N = 0 |
NEOPLASM.DISEASESTAGE | Labels | N |
STAGE I | 43 | |
STAGE IA | 38 | |
STAGE IB | 2 | |
STAGE II | 8 | |
STAGE IIA | 176 | |
STAGE IIB | 121 | |
STAGE III | 2 | |
STAGE IIIA | 88 | |
STAGE IIIB | 14 | |
STAGE IIIC | 25 | |
STAGE IV | 6 | |
STAGE X | 5 | |
Significant markers | N = 26 |
- Expression data file = BRCA-TP.meth.for_correlation.filtered_data.txt
- Clinical data file = BRCA-TP.clin.merged.picked.txt
- Number of patients = 529
- Number of genes = 17321
- Number of clinical features = 8
For survival clinical features, Wald's test in univariate Cox regression analysis with a proportional hazards model (Andersen and Gill 1982) was used to estimate the P values using the 'coxph' function in R. Kaplan-Meier survival curves were plotted using the four quartile subgroups of patients based on expression levels.
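The quartile grouping and the Kaplan-Meier estimate mentioned above can be illustrated with a minimal Python sketch that uses only the standard library. The report's own analysis was run in R; the function names `quartile_groups` and `kaplan_meier` and the toy follow-up times below are assumptions made for illustration, and the Cox regression step itself is not reproduced here.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_groups(levels):
    """Assign each value to a quartile group 0..3 (hypothetical helper)."""
    cuts = quantiles(levels, n=4)  # three cut points, default 'exclusive' method
    return [bisect_right(cuts, v) for v in levels]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (not taken from the report): months of follow-up and death flags.
toy_times = [2, 3, 3, 5, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

In practice one curve would be computed per quartile group, by applying `kaplan_meier` to the patients that `quartile_groups` places in each group.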
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
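As an illustration of what the rank correlation measures, here is a minimal standard-library Python sketch (the report itself used R). The helper names and the toy age and methylation values are assumptions for illustration. The sketch returns only the coefficient; the two-tailed P value is not computed.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy data (not from the report): patient age and one gene's methylation level.
toy_age = [35, 42, 50, 58, 63, 71]
toy_level = [0.21, 0.25, 0.24, 0.31, 0.38, 0.40]
toy_rho = spearman_rho(toy_age, toy_level)
```

A positive coefficient corresponds to the "pos. correlated" count in the AGE table (higher levels in older patients), a negative one to "neg. correlated".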
For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
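The unequal-variance (Welch) form of the t statistic can be sketched in standard-library Python as follows (the report itself used R). The function name `welch_t` and the toy values are assumptions for illustration. Turning the statistic into a P value needs the t distribution's CDF, which the Python standard library does not provide, so the sketch stops at the statistic and its degrees of freedom.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    The statistic is positive when group `a` has the higher mean.
    """
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of mean(a)
    vb = variance(b) / nb  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy data (not from the report): levels of one gene in two clinical classes.
toy_class_a = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_class_b = [2.0, 4.0, 6.0, 8.0, 10.0]
toy_t, toy_df = welch_t(toy_class_a, toy_class_b)
```

The sign convention matches the "T(pos if higher in ...)" columns in the tables: the statistic is positive when the first class has the higher mean.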
For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between different clinical classes using the 'anova' function in R.
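For readers unfamiliar with one-way ANOVA, the F statistic compares the spread between class means with the spread within classes. The following standard-library Python sketch (the report itself used R) computes it; the function name `one_way_anova_f` and the toy groups are assumptions for illustration, and the P value from the F distribution is not computed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Toy data (not from the report): one gene's levels in three clinical classes.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
toy_f, toy_df_between, toy_df_within = one_way_anova_f(toy_groups)
```

Note that very small classes, such as the single-sample labels in the metastasis tables, contribute no within-class variation, which is worth keeping in mind when reading the ANOVA P values.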
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
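The Benjamini and Hochberg step-up adjustment can be sketched in standard-library Python as below. This is an illustrative re-implementation of the published procedure, not the report's R code; the function name `bh_adjust` and the toy P values are assumptions.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvalues)
    # Walk from the largest P value down so a running minimum keeps the
    # adjusted values monotone in the rank of the raw P values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = n - pos  # 1-based rank of this P value in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy P values (not from the report), then the report's Q < 0.05 threshold.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = bh_adjust(toy_p)
toy_significant = [i for i, q in enumerate(toy_q) if q < 0.05]
```

Each adjusted value is the smallest FDR at which that test would still be called significant, which is the definition of the Q value given above.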
This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.