This pipeline uses various statistical tests to identify genes whose promoter methylation levels correlate with selected clinical features.
We tested the association between 19646 genes and 8 clinical features across 127 samples. At a significance threshold of Q value < 0.05, 7 clinical features were related to at least one gene.
- 3 genes correlated to 'AGE': RASL11B, WTIP, CACNA2D2
- 118 genes correlated to 'NEOPLASM.DISEASESTAGE': SLC12A9, SEPSECS, SMU1, C16ORF61, CENPN, ...
- 1 gene correlated to 'PATHOLOGY.T.STAGE': TPCN2
- 210 genes correlated to 'PATHOLOGY.N.STAGE': PTP4A3, FAM157A, MSC, PCDHGA1__9, PCDHGA2__9, ...
- 32 genes correlated to 'PATHOLOGY.M.STAGE': SEPSECS, SLC12A9, KLHL7, HSPB11, LRRC42, ...
- 12 genes correlated to 'GENDER': ALG11__1, UTP14C, ALDH3A1, SLC22A11, ZNF35, ...
- 10 genes correlated to 'COMPLETENESS.OF.RESECTION': SEPSECS, BIVM, KDELC1, CCDC94, ZNF540, ...
- No genes correlated to 'Time to Death'.
The complete statistical result table is provided in Supplement Table 1.
Table 1. This table shows the clinical features, statistical methods used, and the number of genes that are significantly associated with each clinical feature at Q value < 0.05.
Clinical feature | Statistical test | Significant genes | Associated with | N | Associated with | N |
---|---|---|---|---|---|---|
Time to Death | Cox regression test | N=0 | | | | |
AGE | Spearman correlation test | N=3 | older | N=3 | younger | N=0 |
NEOPLASM DISEASESTAGE | ANOVA test | N=118 | | | | |
PATHOLOGY T STAGE | Spearman correlation test | N=1 | higher stage | N=1 | lower stage | N=0 |
PATHOLOGY N STAGE | t test | N=210 | class1 | N=36 | class0 | N=174 |
PATHOLOGY M STAGE | ANOVA test | N=32 | | | | |
GENDER | t test | N=12 | male | N=3 | female | N=9 |
COMPLETENESS OF RESECTION | ANOVA test | N=10 | | | | |
Table S1. Basic characteristics of clinical feature: 'Time to Death'
Time to Death | Duration (Months) | 0-113 (median=14.6) |
censored | N = 68 | |
death | N = 54 | |
Significant markers | N = 0 |
Table S2. Basic characteristics of clinical feature: 'AGE'
AGE | Mean (SD) | 61.38 (14) |
Significant markers | N = 3 | |
pos. correlated | 3 | |
neg. correlated | 0 |
Table S3. List of 3 genes significantly correlated to 'AGE' by Spearman correlation test
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
RASL11B | 0.4408 | 2.675e-07 | 0.00526 |
WTIP | 0.4188 | 1.172e-06 | 0.023 |
CACNA2D2 | 0.4075 | 2.4e-06 | 0.0472 |
Figure S1. As an example, this figure shows the association of RASL11B to 'AGE'. P value = 2.68e-07 with Spearman correlation analysis. The straight line represents the best linear regression.

Table S4. Basic characteristics of clinical feature: 'NEOPLASM.DISEASESTAGE'
NEOPLASM.DISEASESTAGE | Labels | N |
STAGE I | 48 | |
STAGE II | 27 | |
STAGE III | 2 | |
STAGE IIIA | 30 | |
STAGE IIIB | 3 | |
STAGE IIIC | 5 | |
STAGE IV | 1 | |
STAGE IVA | 1 | |
STAGE IVB | 1 | |
Significant markers | N = 118 |
Table S5. List of top 10 genes differentially expressed by 'NEOPLASM.DISEASESTAGE'
Gene | ANOVA_P | Q |
---|---|---|
SLC12A9 | 1.153e-68 | 2.26e-64 |
SEPSECS | 1.891e-60 | 3.72e-56 |
SMU1 | 4.847e-43 | 9.52e-39 |
C16ORF61 | 9.19e-32 | 1.81e-27 |
CENPN | 9.19e-32 | 1.81e-27 |
KLHL7 | 5.058e-30 | 9.93e-26 |
HSPB11 | 1.933e-29 | 3.8e-25 |
LRRC42 | 1.933e-29 | 3.8e-25 |
KILLIN | 2.179e-29 | 4.28e-25 |
PTEN | 2.179e-29 | 4.28e-25 |
Figure S2. As an example, this figure shows the association of SLC12A9 to 'NEOPLASM.DISEASESTAGE'. P value = 1.15e-68 with ANOVA analysis.

Table S6. Basic characteristics of clinical feature: 'PATHOLOGY.T.STAGE'
PATHOLOGY.T.STAGE | Mean (SD) | 2.02 (0.97) |
N | ||
1 | 51 | |
2 | 30 | |
3 | 39 | |
4 | 7 | |
Significant markers | N = 1 | |
pos. correlated | 1 | |
neg. correlated | 0 |
Table S7. List of one gene significantly correlated to 'PATHOLOGY.T.STAGE' by Spearman correlation test
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
TPCN2 | 0.4313 | 4.157e-07 | 0.00817 |
Figure S3. As an example, this figure shows the association of TPCN2 to 'PATHOLOGY.T.STAGE'. P value = 4.16e-07 with Spearman correlation analysis.

Table S8. Basic characteristics of clinical feature: 'PATHOLOGY.N.STAGE'
PATHOLOGY.N.STAGE | Labels | N |
class0 | 83 | |
class1 | 3 | |
Significant markers | N = 210 | |
Higher in class1 | 36 | |
Higher in class0 | 174 |
Table S9. List of top 10 genes differentially expressed by 'PATHOLOGY.N.STAGE'
Gene | T(pos if higher in 'class1') | ttestP | Q | AUC |
---|---|---|---|---|
PTP4A3 | -15.24 | 5.021e-20 | 9.86e-16 | 0.9598 |
FAM157A | -14.35 | 2.016e-19 | 3.96e-15 | 0.9438 |
MSC | -10.72 | 3.534e-17 | 6.94e-13 | 0.9076 |
PCDHGA1__9 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
PCDHGA2__9 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
PCDHGA3__8 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
PCDHGA4__7 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
PCDHGB1__8 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
PCDHGB2__7 | -9.96 | 7.214e-16 | 1.42e-11 | 0.8916 |
THBD | -9.5 | 1.935e-14 | 3.8e-10 | 0.8112 |
Figure S4. As an example, this figure shows the association of PTP4A3 to 'PATHOLOGY.N.STAGE'. P value = 5.02e-20 with T-test analysis.

Table S10. Basic characteristics of clinical feature: 'PATHOLOGY.M.STAGE'
PATHOLOGY.M.STAGE | Labels | N |
M0 | 101 | |
M1 | 2 | |
MX | 24 | |
Significant markers | N = 32 |
Table S11. List of top 10 genes differentially expressed by 'PATHOLOGY.M.STAGE'
Gene | ANOVA_P | Q |
---|---|---|
SEPSECS | 2.855e-20 | 5.61e-16 |
SLC12A9 | 4.703e-19 | 9.24e-15 |
KLHL7 | 9.885e-16 | 1.94e-11 |
HSPB11 | 2.851e-13 | 5.6e-09 |
LRRC42 | 2.851e-13 | 5.6e-09 |
C16ORF61 | 5.438e-13 | 1.07e-08 |
CENPN | 5.438e-13 | 1.07e-08 |
ERCC2 | 5.137e-12 | 1.01e-07 |
ALG8 | 2.396e-11 | 4.7e-07 |
SAMHD1 | 7.836e-11 | 1.54e-06 |
Figure S5. As an example, this figure shows the association of SEPSECS to 'PATHOLOGY.M.STAGE'. P value = 2.86e-20 with ANOVA analysis.

Table S12. Basic characteristics of clinical feature: 'GENDER'
GENDER | Labels | N |
FEMALE | 49 | |
MALE | 78 | |
Significant markers | N = 12 | |
Higher in MALE | 3 | |
Higher in FEMALE | 9 |
Table S13. List of top 10 genes differentially expressed by 'GENDER'
Gene | T(pos if higher in 'MALE') | ttestP | Q | AUC |
---|---|---|---|---|
ALG11__1 | 12.44 | 1.128e-17 | 2.22e-13 | 0.95 |
UTP14C | 12.44 | 1.128e-17 | 2.22e-13 | 0.95 |
ALDH3A1 | -7.31 | 3.285e-11 | 6.45e-07 | 0.7972 |
SLC22A11 | -5.79 | 6.67e-08 | 0.00131 | 0.7195 |
ZNF35 | 5.74 | 7.133e-08 | 0.0014 | 0.7588 |
MAP3K8 | -5.7 | 1.078e-07 | 0.00212 | 0.7713 |
NKD1 | -5.4 | 3.387e-07 | 0.00665 | 0.7179 |
FAM83A | -5.14 | 1.266e-06 | 0.0249 | 0.7462 |
LOC100131726 | -5.14 | 1.266e-06 | 0.0249 | 0.7462 |
DHODH | -5.02 | 2.088e-06 | 0.041 | 0.7386 |
Figure S6. As an example, this figure shows the association of ALG11__1 to 'GENDER'. P value = 1.13e-17 with T-test analysis.

Table S14. Basic characteristics of clinical feature: 'COMPLETENESS.OF.RESECTION'
COMPLETENESS.OF.RESECTION | Labels | N |
R0 | 102 | |
R1 | 10 | |
R2 | 1 | |
RX | 9 | |
Significant markers | N = 10 |
Table S15. List of 10 genes differentially expressed by 'COMPLETENESS.OF.RESECTION'
Gene | ANOVA_P | Q |
---|---|---|
SEPSECS | 6.302e-67 | 1.24e-62 |
BIVM | 4.319e-23 | 8.48e-19 |
KDELC1 | 4.319e-23 | 8.48e-19 |
CCDC94 | 7.105e-22 | 1.4e-17 |
ZNF540 | 5.669e-16 | 1.11e-11 |
ZNF571 | 5.669e-16 | 1.11e-11 |
C5ORF42 | 2.076e-13 | 4.08e-09 |
TBC1D15 | 4.683e-11 | 9.2e-07 |
C1ORF101 | 4.495e-09 | 8.83e-05 |
CCDC117 | 1.354e-06 | 0.0266 |
Figure S7. As an example, this figure shows the association of SEPSECS to 'COMPLETENESS.OF.RESECTION'. P value = 6.3e-67 with ANOVA analysis.

- Expression data file = LIHC-TP.meth.by_min_clin_corr.data.txt
- Clinical data file = LIHC-TP.merged_data.txt
- Number of patients = 127
- Number of genes = 19646
- Number of clinical features = 8
For survival clinical features, Wald's test in univariate Cox regression analysis with a proportional hazards model (Andersen and Gill 1982) was used to estimate the P values using the 'coxph' function in R. Kaplan-Meier survival curves were plotted using the four quartile subgroups of patients based on expression levels.
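The quartile split used for the Kaplan-Meier curves can be illustrated with a short Python sketch. This is not the pipeline's R code: the function name `quartile_groups`, the toy values, and the handling of values that fall exactly on a cut point are all assumptions made for illustration.

```python
from statistics import quantiles

def quartile_groups(levels):
    """Assign each patient to quartile group 1-4 by measured level.

    Hypothetical helper: values equal to a cut point go to the lower group,
    which is an assumption, not a documented rule of the pipeline.
    """
    q1, q2, q3 = quantiles(levels, n=4, method="inclusive")
    groups = []
    for value in levels:
        if value <= q1:
            groups.append(1)
        elif value <= q2:
            groups.append(2)
        elif value <= q3:
            groups.append(3)
        else:
            groups.append(4)
    return groups

# Toy per-patient levels for one gene (made-up numbers).
levels = [0.12, 0.80, 0.35, 0.55, 0.91, 0.27, 0.64, 0.48]
groups = quartile_groups(levels)
print(groups)
```

Each group would then get its own survival curve; the curve estimation itself is left to a survival analysis library.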
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
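As a rough illustration of what the Spearman coefficient measures, the following Python sketch ranks both variables (ties receive their average rank) and computes the Pearson correlation of the ranks. The helper names and the toy data are hypothetical, and the sketch stops at the coefficient: the two-tailed P value that 'cor.test' reports is not computed here.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy data: patient ages and one gene's methylation level (made-up numbers).
ages = [45, 52, 61, 67, 70, 74]
meth = [0.21, 0.30, 0.28, 0.41, 0.39, 0.52]
rho = spearman_rho(ages, meth)
print(round(rho, 4))
```

A positive coefficient corresponds to the "pos. correlated" rows in the feature tables, for example higher methylation in older patients.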
For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between different clinical classes using the 'anova' function in R.
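The F statistic behind a one-way analysis of variance can be sketched in Python as the ratio of between-class to within-class mean squares. The function name `one_way_anova_f` and the toy classes are assumptions for illustration; converting F to a P value needs the F distribution, which this stdlib-only sketch leaves out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-class value lists."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean, weighted by class size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own class mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy levels for three clinical classes (made-up numbers).
classes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(classes)
print(f_stat, df_b, df_w)
```

Classes with a single sample, such as several stage labels in the tables above, contribute nothing to the within-class sum of squares, so results for such features rest on the larger classes.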
For two-class clinical features, two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
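The unequal-variance (Welch) form of the t statistic, together with its approximate degrees of freedom, can be sketched in Python as follows. The name `welch_t` and the toy samples are hypothetical; the sign convention (positive when the first class has the higher mean) is chosen to mirror the "T(pos if higher in ...)" column headers, and the P value lookup in the t distribution is omitted.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for two samples without assuming equal variances.

    t is positive when the mean of `a` exceeds the mean of `b`.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t_stat = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df

# Toy levels for two clinical classes (made-up numbers).
class_a = [1.0, 2.0, 3.0, 4.0, 5.0]
class_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, df = welch_t(class_a, class_b)
print(round(t_stat, 4), round(df, 4))
```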
For multiple hypothesis correction, Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
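The Benjamini and Hochberg adjustment can be written in a few lines of Python. This is a minimal sketch of the standard step-up procedure, not the pipeline's code; the function name `bh_adjust` and the toy P values are assumptions.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    # Visit P values from largest to smallest so a running minimum can
    # enforce that adjusted values never increase as P decreases.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Toy P values for four tests (made-up numbers).
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = bh_adjust(pvals)
significant = [q < 0.05 for q in qvals]
print(qvals, significant)
```

Tests whose adjusted value falls below the chosen threshold, 0.05 in this report, are the ones called significant.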
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.