This pipeline uses various statistical tests to identify mRNAs whose expression levels correlate with selected clinical features.
We tested the association between 17814 genes and 5 clinical features across 529 samples. At a threshold of Q value < 0.05, all 5 clinical features were related to at least one gene.

2 genes correlated to 'Time to Death'.

RPS26 , PPP1R14D

386 genes correlated to 'AGE'.

ESR1 , CNTNAP3 , FOXD2 , KLK6 , NUDT16 , ...

6 genes correlated to 'GENDER'.

PI3 , TMEM16C , CACNG1 , RP13-36C9.6 , MAPK4 , ...

3 genes correlated to 'RADIATIONS.RADIATION.REGIMENINDICATION'.

OR13C4 , TMEM92 , CYP3A5

12 genes correlated to 'NEOADJUVANT.THERAPY'.

ZCCHC7 , DUSP1 , MGC33407 , RGS1 , C2ORF60 , ...
The complete statistical result table is provided in Supplement Table 1.
Clinical feature  Statistical test  Significant genes  Associated with  Associated with  

Time to Death  Cox regression test  N=2  shorter survival  N=2  longer survival  N=0 
AGE  Spearman correlation test  N=386  older  N=188  younger  N=198 
GENDER  t test  N=6  male  N=1  female  N=5 
RADIATIONS RADIATION REGIMENINDICATION  t test  N=3  yes  N=3  no  N=0 
NEOADJUVANT THERAPY  t test  N=12  yes  N=5  no  N=7 
Time to Death  Duration (Months)  0.1-223.4 (median=24.1) 
censored  N = 432  
death  N = 65  
Significant markers  N = 2  
associated with shorter survival  2  
associated with longer survival  0 
AGE  Mean (SD)  57.89 (13) 
Significant markers  N = 386  
pos. correlated  188  
neg. correlated  198 
Gene  SpearmanCorr  corrP  Q  
ESR1  0.414  2.515e-23  4.48e-19 
CNTNAP3  0.2907  9.302e-12  1.66e-07 
FOXD2  0.289  1.233e-11  2.2e-07 
KLK6  0.2889  1.247e-11  2.22e-07 
NUDT16  0.2875  1.591e-11  2.83e-07 
MAGED4B  0.2858  2.117e-11  3.77e-07 
KRT17  0.2842  2.751e-11  4.9e-07 
MFGE8  0.2842  2.763e-11  4.92e-07 
SYT8  0.2838  2.942e-11  5.24e-07 
PHOSPHO2  0.2791  6.34e-11  1.13e-06 
GENDER  Labels  N 
FEMALE  523  
MALE  6  
Significant markers  N = 6  
Higher in MALE  1  
Higher in FEMALE  5 
Gene  T(pos if higher in 'MALE')  ttestP  Q  AUC  
PI3  9.33  2.686e-11  4.79e-07  0.7129 
TMEM16C  14.29  1.571e-10  2.8e-06  0.9261 
CACNG1  18.46  1.712e-08  0.000305  0.9614 
RP13-36C9.6  6.29  6.462e-07  0.0115  0.6651 
MAPK4  9.13  1.424e-06  0.0254  0.8011 
PLA2G3  10.66  1.754e-06  0.0312  0.8311 
3 genes related to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
RADIATIONS.RADIATION.REGIMENINDICATION  Labels  N 
NO  147  
YES  382  
Significant markers  N = 3  
Higher in YES  3  
Higher in NO  0 
Gene  T(pos if higher in 'YES')  ttestP  Q  AUC  
OR13C4  5.11  6.374e-07  0.0114  0.63 
TMEM92  4.98  9.615e-07  0.0171  0.6099 
CYP3A5  4.96  1.028e-06  0.0183  0.5852 
NEOADJUVANT.THERAPY  Labels  N 
NO  221  
YES  308  
Significant markers  N = 12  
Higher in YES  5  
Higher in NO  7 
Gene  T(pos if higher in 'YES')  ttestP  Q  AUC  
ZCCHC7  5.7  2.181e-08  0.000388  0.639 
DUSP1  5.32  1.584e-07  0.00282  0.6308 
MGC33407  5.31  1.786e-07  0.00318  0.6328 
RGS1  5.25  2.221e-07  0.00396  0.6173 
C2ORF60  5.15  3.791e-07  0.00675  0.6169 
EGR1  5.12  4.448e-07  0.00792  0.6247 
OR13C4  5.02  7.424e-07  0.0132  0.62 
FOS  5.02  7.445e-07  0.0133  0.6208 
EED  4.93  1.093e-06  0.0195  0.6048 
NLRP12  4.84  1.701e-06  0.0303  0.5973 

Expression data file = BRCA.medianexp.txt

Clinical data file = BRCA.clin.merged.picked.txt

Number of patients = 529

Number of genes = 17814

Number of clinical features = 5
For survival clinical features, Wald's test in univariate Cox regression analysis with proportional hazards model (Andersen and Gill 1982) was used to estimate the P values using the 'coxph' function in R. Kaplan-Meier survival curves were plotted using the four quartile subgroups of patients based on expression levels.
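The quartile grouping and the Kaplan-Meier estimate described above can be sketched in a few lines. This is a minimal Python illustration, not the pipeline's R code: the function names are hypothetical, the quartile cut points follow the default convention of Python's statistics.quantiles (which may differ from the one the pipeline used), and the Cox regression step itself is not reproduced.

```python
from statistics import quantiles


def quartile_groups(expression):
    """Assign each sample to a quartile group 0..3 by expression level."""
    q1, q2, q3 = quantiles(expression, n=4)
    # Count how many cut points the value exceeds: 0 = lowest quartile.
    return [sum(value > cut for cut in (q1, q2, q3)) for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

To draw one curve per subgroup, one would call kaplan_meier separately on the times and events of the patients in each of the four groups returned by quartile_groups.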
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
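As an illustration of what the Spearman coefficient measures, the sketch below computes it as the Pearson correlation of the ranks, with tied values sharing their average rank. It is a plain Python sketch with hypothetical function names, not the pipeline's R code, and it omits the P value that 'cor.test' also reports.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting x would hold patient ages and y the expression levels of one gene across the same patients; a positive coefficient means expression tends to be higher in older patients.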
For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
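The unequal-variance (Welch) form of the t test can be illustrated as follows. This is a hedged Python sketch with a hypothetical function name; it returns only the t statistic and the Welch-Satterthwaite degrees of freedom, and leaves out the P value, which requires the t distribution and is what 't.test' in R additionally computes.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for a two-sample t test with unequal variances.

    t is positive when the mean of group_a is higher than that of group_b.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard errors of the two group means (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

Here group_a and group_b would hold the log2-expression levels of one gene in the two clinical classes, for example 'YES' and 'NO'; each group needs at least two samples for the variance to be defined.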
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
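The Benjamini-Hochberg adjustment itself is short enough to sketch. The Python function below (a hypothetical name, written for illustration rather than taken from the pipeline) follows the standard procedure: scale each P value by m / rank, then take a running minimum from the largest P value downward so that the adjusted values are monotone.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvalues)
    # Indices of the P values, from the largest P value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P values for four tests.
q = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = [value < 0.05 for value in q]
```

With the threshold used in this report, a gene would be called significant when its adjusted value falls below 0.05.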