Correlation between mRNA expression and clinical features
Lung Adenocarcinoma (MOLECULAR_NONSMOKER)
07 February 2013  |  awg_luad__2013_02_07
Maintainer Information
Maintained by TCGA GDAC Team (Broad Institute/MD Anderson Cancer Center/Harvard Medical School)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2013): Correlation between mRNA expression and clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C1TB150D
Overview
Introduction

This pipeline uses several statistical tests to identify mRNAs whose expression levels correlate with selected clinical features.

Summary

Testing the association between 17814 genes and 8 clinical features across 9 samples, with significance thresholded at Q value < 0.05, found no clinical feature related to any gene.

  • No genes correlated with 'AGE', 'GENDER', 'PATHOLOGY.T', 'PATHOLOGY.N', 'TUMOR.STAGE', 'NUMBERPACKYEARSSMOKED', 'TOBACCOSMOKINGHISTORYINDICATOR', or 'YEAROFTOBACCOSMOKINGONSET'.

Results
Overview of the results

The complete statistical result table is provided in Supplement Table 1.

Table 1.  This table shows the clinical features, the statistical methods used, and the number of genes that are significantly associated with each clinical feature at Q value < 0.05.

Clinical feature                  Statistical test             Significant genes
AGE                               Spearman correlation test    N=0
GENDER                            t test                       N=0
PATHOLOGY.T                       t test                       N=0
PATHOLOGY.N                       Spearman correlation test    N=0
TUMOR.STAGE                       Spearman correlation test    N=0
NUMBERPACKYEARSSMOKED             Spearman correlation test    N=0
TOBACCOSMOKINGHISTORYINDICATOR    ANOVA test                   N=0
YEAROFTOBACCOSMOKINGONSET         Spearman correlation test    N=0
Clinical variable #1: 'AGE'

No gene related to 'AGE'.

Table S1.  Basic characteristics of clinical feature: 'AGE'

AGE Mean (SD) 65.89 (12)
  Significant markers N = 0
Clinical variable #2: 'GENDER'

No gene related to 'GENDER'.

Table S2.  Basic characteristics of clinical feature: 'GENDER'

GENDER Labels N
  FEMALE 5
  MALE 4
     
  Significant markers N = 0
Clinical variable #3: 'PATHOLOGY.T'

No gene related to 'PATHOLOGY.T'.

Table S3.  Basic characteristics of clinical feature: 'PATHOLOGY.T'

PATHOLOGY.T Labels N
  T1 2
  T2 7
     
  Significant markers N = 0
Clinical variable #4: 'PATHOLOGY.N'

No gene related to 'PATHOLOGY.N'.

Table S4.  Basic characteristics of clinical feature: 'PATHOLOGY.N'

PATHOLOGY.N Mean (SD) 0.78 (0.97)
  N
  N0 5
  N1 1
  N2 3
     
  Significant markers N = 0
Clinical variable #5: 'TUMOR.STAGE'

No gene related to 'TUMOR.STAGE'.

Table S5.  Basic characteristics of clinical feature: 'TUMOR.STAGE'

TUMOR.STAGE Mean (SD) 1.89 (1.2)
  N
  Stage 1 5
  Stage 2 1
  Stage 3 2
  Stage 4 1
     
  Significant markers N = 0
Clinical variable #6: 'NUMBERPACKYEARSSMOKED'

No gene related to 'NUMBERPACKYEARSSMOKED'.

Table S6.  Basic characteristics of clinical feature: 'NUMBERPACKYEARSSMOKED'

NUMBERPACKYEARSSMOKED Mean (SD) 46.5 (11)
  Value N
  30 1
  50 2
  56 1
     
  Significant markers N = 0
Clinical variable #7: 'TOBACCOSMOKINGHISTORYINDICATOR'

No gene related to 'TOBACCOSMOKINGHISTORYINDICATOR'.

Table S7.  Basic characteristics of clinical feature: 'TOBACCOSMOKINGHISTORYINDICATOR'

TOBACCOSMOKINGHISTORYINDICATOR Labels N
  CURRENT REFORMED SMOKER FOR > 15 YEARS 2
  CURRENT SMOKER 3
  LIFELONG NON-SMOKER 4
     
  Significant markers N = 0
Clinical variable #8: 'YEAROFTOBACCOSMOKINGONSET'

No gene related to 'YEAROFTOBACCOSMOKINGONSET'.

Table S8.  Basic characteristics of clinical feature: 'YEAROFTOBACCOSMOKINGONSET'

YEAROFTOBACCOSMOKINGONSET Mean (SD) 1963.4 (9.7)
  Value N
  1955 1
  1956 1
  1962 1
  1965 1
  1979 1
     
  Significant markers N = 0
Methods & Data
Input
  • Expression data file = MOLECULAR_NONSMOKER.medianexp.txt

  • Clinical data file = MOLECULAR_NONSMOKER.clin.merged.picked.txt

  • Number of patients = 9

  • Number of genes = 17814

  • Number of clinical features = 8

Correlation analysis

For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
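As a rough illustration of the correlation step, the sketch below computes Spearman's rho for one gene against one continuous clinical feature, with a two-tailed P value from exact enumeration of permutations. It is written in plain Python rather than R and is not the pipeline's code; the function names and the example values are hypothetical. R's 'cor.test' may use a different P value computation (an approximation when ties are present), so results can differ slightly.

```python
from itertools import permutations
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_exact(x, y):
    """Spearman's rho and a two-tailed exact permutation P value (small n only)."""
    rx, ry = average_ranks(x), average_ranks(y)
    rho = pearson(rx, ry)
    hits = total = 0
    for perm in permutations(ry):
        total += 1
        # Small tolerance so permutations equal to the observed rho are counted.
        if abs(pearson(rx, perm)) >= abs(rho) - 1e-12:
            hits += 1
    return rho, hits / total


# Hypothetical example: age versus log2 expression of one gene in six patients.
age = [50, 61, 64, 70, 72, 80]
expression = [2.1, 2.0, 3.5, 3.1, 4.0, 4.2]
rho, p_value = spearman_exact(age, expression)
print(round(rho, 3), round(p_value, 4))
```

Full enumeration grows factorially with the number of samples, so this approach only suits very small cohorts such as the 9 patients analysed here.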

Student's t-test analysis

For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
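The unequal-variance (Welch) form of the t test can be sketched as follows. This is an illustrative Python version, not the pipeline's R code; the function names and example values are hypothetical. Because the Python standard library has no t distribution, the two-tailed P value is obtained here by numerically integrating the t density with Simpson's rule, which is an assumption of this sketch rather than how 't.test' computes it.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value: 1 - 2 * integral of the density over [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * s * h / 3)


def welch_t_test(a, b):
    """Welch's t statistic, Welch-Satterthwaite df, and two-tailed P value."""
    va = variance(a) / len(a)  # squared standard error of each group mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


# Hypothetical log2 expression of one gene in two clinical classes.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [6.0, 6.4, 5.8, 6.2]
t_stat, dof, p_value = welch_t_test(group_a, group_b)
print(round(t_stat, 2), round(dof, 2), p_value)
```

The degrees of freedom are generally not an integer, which is expected for the Welch test.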

ANOVA analysis

For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between different clinical classes using the 'anova' function in R.
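The core of a one-way ANOVA is the F statistic, the ratio of between-class to within-class mean squares. The Python sketch below computes only that statistic and its degrees of freedom for one gene; converting F to a P value requires the F distribution, which is omitted here. The function name and example values are hypothetical, and this is not the pipeline's R code.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-class sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-class sum of squares: spread of values around their group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2 expression of one gene in three clinical classes.
classes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(classes)
print(f_stat, df_b, df_w)
```

With class sizes as small as 2 to 4 patients, as in this cohort, the within-class variance estimate rests on very few degrees of freedom.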

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
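The Benjamini and Hochberg adjustment can be sketched in a few lines. The version below is an illustrative Python reimplementation of the step-up procedure, not the R 'p.adjust' source; the function name and example P values are hypothetical. Each P value is scaled by n divided by its ascending rank, and a running minimum taken from the largest P value downward keeps the Q values monotone.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg Q values in the same order as the input."""
    n = len(pvalues)
    # Visit P values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    qvalues = [0.0] * n
    running_min = 1.0  # also caps every Q value at 1
    for position, index in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * n / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical P values for four genes against one clinical feature.
p = [0.01, 0.04, 0.03, 0.005]
q = bh_adjust(p)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(q, significant)
```

In the analysis reported here, 17814 genes are tested per clinical feature, so the smallest P value must fall below roughly 0.05 / 17814 for any gene to reach Q < 0.05 on its own, which is hard to achieve with 9 samples.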

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References
[1] Spearman, C, The proof and measurement of association between two things, Amer. J. Psychol 15:72-101 (1904)
[2] Lehmann and Romano, Testing Statistical Hypotheses (3rd ed.), New York: Springer. ISBN 0387988645 (2005)
[3] Howell, D, Statistical Methods for Psychology. (5th ed.), Duxbury Press:324-5 (2002)
[4] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57:289-300 (1995)