This pipeline uses various statistical tests to identify mRNAs whose expression levels correlate with selected clinical features.
The association between 18295 genes and 7 clinical features was tested across 480 samples. At a significance threshold of Q value < 0.05, 6 clinical features were related to at least one gene.
- 2315 genes correlated to 'Time to Death'.
  - ANKRD56|345079, B3GNTL1|146712, COL7A1|1294, DONSON|29980, ADAMTS14|140766, ...
- 19 genes correlated to 'AGE'.
  - RANBP17|64901, RFPL1S|10740, WFDC1|58189, UTY|7404, PALLD|23022, ...
- 227 genes correlated to 'GENDER'.
  - XIST|7503, PRKY|5616, NLGN4Y|22829, RPS4Y1|6192, ZFY|7544, ...
- 317 genes correlated to 'DISTANT.METASTASIS'.
  - GARNL3|84253, IL20RB|53833, PLEKHA9|51054, C22ORF9|23313, BIRC5|332, ...
- 31 genes correlated to 'LYMPH.NODE.METASTASIS'.
  - PI3|5266, CEP55|55165, FAM64A|54478, RPSAP52|204010, UBE2T|29089, ...
- 2142 genes correlated to 'NEOPLASM.DISEASESTAGE'.
  - NR3C2|4306, PLEKHA9|51054, ALDH6A1|4329, FKBP11|51303, ACADSB|36, ...
- No genes correlated to 'KARNOFSKY.PERFORMANCE.SCORE'.
The complete statistical result table is provided in Supplement Table 1.
Clinical feature | Statistical test | Significant genes | Associated with | Count | Associated with | Count |
---|---|---|---|---|---|---|
Time to Death | Cox regression test | N=2315 | shorter survival | N=1496 | longer survival | N=819 |
AGE | Spearman correlation test | N=19 | older | N=3 | younger | N=16 |
GENDER | t test | N=227 | male | N=146 | female | N=81 |
KARNOFSKY PERFORMANCE SCORE | Spearman correlation test | N=0 | | | | |
DISTANT METASTASIS | t test | N=317 | m1 | N=264 | m0 | N=53 |
LYMPH NODE METASTASIS | ANOVA test | N=31 | | | | |
NEOPLASM DISEASESTAGE | ANOVA test | N=2142 | | | | |
Time to Death | Duration (Months) | 0.1-111 (median=34.3) |
censored | N = 323 | |
death | N = 154 | |
Significant markers | N = 2315 | |
associated with shorter survival | 1496 | |
associated with longer survival | 819 |
Gene | HazardRatio | Wald_P | Q | C_index |
---|---|---|---|---|
ANKRD56|345079 | 0.71 | 0 | 0 | 0.319 |
B3GNTL1|146712 | 2.4 | 0 | 0 | 0.684 |
COL7A1|1294 | 1.32 | 0 | 0 | 0.676 |
DONSON|29980 | 2.7 | 0 | 0 | 0.686 |
ADAMTS14|140766 | 1.44 | 1.11e-16 | 2e-12 | 0.684 |
SLC16A12|387700 | 0.78 | 1.11e-16 | 2e-12 | 0.311 |
NUMBL|9253 | 1.85 | 2.22e-16 | 4.1e-12 | 0.687 |
ANAPC7|51434 | 5.8 | 3.331e-16 | 6.1e-12 | 0.677 |
STX1A|6804 | 1.73 | 3.331e-16 | 6.1e-12 | 0.678 |
RGS17|26575 | 1.43 | 4.441e-16 | 8.1e-12 | 0.666 |
AGE | Mean (SD) | 60.58 (12) |
Significant markers | N = 19 | |
pos. correlated | 3 | |
neg. correlated | 16 |
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
RANBP17|64901 | -0.2556 | 1.392e-08 | 0.000255 |
RFPL1S|10740 | -0.2559 | 1.702e-08 | 0.000311 |
WFDC1|58189 | -0.2476 | 3.998e-08 | 0.000731 |
UTY|7404 | -0.2758 | 1.044e-07 | 0.00191 |
PALLD|23022 | -0.2349 | 1.983e-07 | 0.00363 |
NEFH|4744 | -0.2324 | 2.701e-07 | 0.00494 |
DIO2|1734 | -0.2285 | 4.416e-07 | 0.00808 |
FNDC1|84624 | -0.2249 | 6.595e-07 | 0.0121 |
ZNF610|162963 | -0.2249 | 6.608e-07 | 0.0121 |
KDM5D|8284 | -0.248 | 8.594e-07 | 0.0157 |
GENDER | Labels | N |
FEMALE | 167 | |
MALE | 313 | |
Significant markers | N = 227 | |
Higher in MALE | 146 | |
Higher in FEMALE | 81 |
Gene | T(pos if higher in 'MALE') | ttestP | Q | AUC |
---|---|---|---|---|
XIST|7503 | -44.56 | 1.4e-165 | 2.56e-161 | 0.9853 |
PRKY|5616 | 40.36 | 1.88e-110 | 3.44e-106 | 0.9858 |
NLGN4Y|22829 | 38.51 | 1.975e-84 | 3.61e-80 | 0.9856 |
RPS4Y1|6192 | 36.63 | 5.211e-74 | 9.53e-70 | 0.9894 |
ZFY|7544 | 37.95 | 7.956e-71 | 1.46e-66 | 0.9873 |
TSIX|9383 | -23.93 | 6.198e-70 | 1.13e-65 | 0.9675 |
DDX3Y|8653 | 32.19 | 2.102e-59 | 3.84e-55 | 0.9819 |
KDM5C|8242 | -17.04 | 1.013e-49 | 1.85e-45 | 0.8987 |
NCRNA00183|554203 | -16.62 | 4.036e-46 | 7.38e-42 | 0.8681 |
KDM5D|8284 | 26.38 | 9.922e-41 | 1.81e-36 | 0.9806 |
No gene related to 'KARNOFSKY.PERFORMANCE.SCORE'.
KARNOFSKY.PERFORMANCE.SCORE | Mean (SD) | 90.88 (18) |
Score | N | |
0 | 1 | |
70 | 1 | |
80 | 3 | |
90 | 12 | |
100 | 17 | |
Significant markers | N = 0 |
DISTANT.METASTASIS | Labels | N |
M0 | 403 | |
M1 | 77 | |
Significant markers | N = 317 | |
Higher in M1 | 264 | |
Higher in M0 | 53 |
Gene | T(pos if higher in 'M1') | ttestP | Q | AUC |
---|---|---|---|---|
GARNL3|84253 | -6.98 | 2.078e-10 | 3.8e-06 | 0.7458 |
IL20RB|53833 | 6.7 | 1.089e-09 | 1.99e-05 | 0.7308 |
PLEKHA9|51054 | 6.68 | 1.135e-09 | 2.08e-05 | 0.7383 |
C22ORF9|23313 | 6.61 | 1.462e-09 | 2.67e-05 | 0.735 |
BIRC5|332 | 6.63 | 1.615e-09 | 2.95e-05 | 0.7295 |
INHBE|83729 | 6.55 | 2.516e-09 | 4.6e-05 | 0.7275 |
NFE2L3|9603 | 6.48 | 2.544e-09 | 4.65e-05 | 0.7237 |
TYMP|1890 | 6.38 | 3.096e-09 | 5.66e-05 | 0.7102 |
OIP5|11339 | 6.38 | 4.433e-09 | 8.11e-05 | 0.7251 |
CENPA|1058 | 6.35 | 6.395e-09 | 0.000117 | 0.7237 |
LYMPH.NODE.METASTASIS | Labels | N |
N0 | 228 | |
N1 | 17 | |
NX | 235 | |
Significant markers | N = 31 |
Gene | ANOVA_P | Q |
---|---|---|
PI3|5266 | 4.011e-09 | 7.34e-05 |
CEP55|55165 | 7.237e-08 | 0.00132 |
FAM64A|54478 | 7.259e-08 | 0.00133 |
RPSAP52|204010 | 8.717e-08 | 0.00159 |
UBE2T|29089 | 8.96e-08 | 0.00164 |
SKA1|220134 | 9.772e-08 | 0.00179 |
FOXM1|2305 | 1.654e-07 | 0.00302 |
IQGAP3|128239 | 1.943e-07 | 0.00355 |
CDCA8|55143 | 2.399e-07 | 0.00439 |
AURKB|9212 | 3.879e-07 | 0.00709 |
NEOPLASM.DISEASESTAGE | Labels | N |
STAGE I | 233 | |
STAGE II | 49 | |
STAGE III | 120 | |
STAGE IV | 78 | |
Significant markers | N = 2142 |
Gene | ANOVA_P | Q |
---|---|---|
NR3C2|4306 | 3.868e-18 | 7.08e-14 |
PLEKHA9|51054 | 4.808e-18 | 8.8e-14 |
ALDH6A1|4329 | 6.075e-18 | 1.11e-13 |
FKBP11|51303 | 7.418e-18 | 1.36e-13 |
ACADSB|36 | 1.267e-17 | 2.32e-13 |
BIRC5|332 | 2.499e-17 | 4.57e-13 |
TRAF6|7189 | 6.463e-17 | 1.18e-12 |
UBE2C|11065 | 6.442e-17 | 1.18e-12 |
TMEM150C|441027 | 8.129e-17 | 1.49e-12 |
FAM160A1|729830 | 9.057e-17 | 1.66e-12 |
- Expression data file = KIRC-TP.uncv2.mRNAseq_RSEM_normalized_log2.txt
- Clinical data file = KIRC-TP.clin.merged.picked.txt
- Number of patients = 480
- Number of genes = 18295
- Number of clinical features = 7
For survival clinical features, Wald's test in a univariate Cox regression analysis with a proportional hazards model (Andersen and Gill 1982) was used to estimate the P values, using the 'coxph' function in R. Kaplan-Meier survival curves were plotted for the four quartile subgroups of patients, defined by expression level.
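To make the Kaplan-Meier step concrete, here is a minimal Python sketch of the product-limit estimator for a single patient subgroup. It is not the pipeline's code (the report used R); the function name `kaplan_meier` and the toy data are assumptions made for illustration. In the setting described above, an estimator like this would be applied separately to each of the four expression quartile subgroups.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate for one group.

    times  : follow-up duration for each patient
    events : True if the patient died at that time, False if censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)          # patients still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0              # deaths + censorings at time t
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored patients lower the number at risk for later time points without causing a drop in the curve, which is why the censored and death counts are reported separately in the summary table.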
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
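As an illustration of what the Spearman coefficient measures, the following Python sketch ranks both variables (averaging ranks for ties) and then computes the Pearson correlation of the ranks. It is a sketch under stated assumptions, not the pipeline's implementation: the helper names `average_ranks`, `pearson`, and `spearman_rho` are hypothetical, and the P value computation performed by 'cor.test' is not reproduced here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative coefficient, as in the AGE table, means expression tends to be lower in older patients.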
For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes, using the 't.test' function in R.
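The unequal-variance t test (often called Welch's t test) can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline's code: the function name `welch_t` is an assumption, and only the statistic and the Welch-Satterthwaite degrees of freedom are computed, not the two-tailed P value that 't.test' reports.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and degrees of freedom.

    The statistic is positive when the mean of `a` exceeds the mean of `b`,
    matching sign conventions such as "pos if higher in 'MALE'".
    """
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy log2-expression values for two hypothetical classes.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```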
For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between the different clinical classes, using the 'anova' function in R.
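For reference, the F statistic behind a one-way analysis of variance compares the variation between class means with the variation within classes. The Python sketch below shows that computation; the function name `anova_f` and the toy groups are assumptions, and the conversion of F into a P value (done by R in the report) is omitted.

```python
def anova_f(groups):
    """One-way ANOVA F statistic.

    groups : list of lists, one list of expression values per clinical class
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own class mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Toy example with three hypothetical classes.
f, df_b, df_w = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Because the F test only asks whether any class mean differs, the ANOVA tables report a P value and Q value per gene but no direction of association.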
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
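The Benjamini and Hochberg adjustment can be sketched as follows. This Python function is a hedged illustration of the step-up procedure, not the code used to produce the tables in this report; the name `bh_adjust` and the example P values are assumptions.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Each P value is scaled by n / rank (rank 1 = smallest P value), then a
    running minimum from the largest P value downward enforces monotonicity.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0            # also caps adjusted values at 1
    for position, index in enumerate(order):
        rank = n - position      # ascending rank of this P value
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Toy example: call tests significant at Q < 0.05.
q_values = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in q_values]
```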
This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.