Correlation between RPPA expression and clinical features
Pheochromocytoma and Paraganglioma (Primary solid tumor)
02 April 2015  |  analyses__2015_04_02
Maintainer Information
Maintained by Juok Cho (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2015): Correlation between RPPA expression and clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C16Q1WBS
Overview
Introduction

This pipeline uses various statistical tests to identify RPPAs whose expression levels correlate with selected clinical features.

Summary

This analysis tested the association between 192 genes and 3 clinical features across 79 samples. At thresholds of P value < 0.05 and Q value < 0.3, all 3 clinical features were related to at least one gene.

  • 1 gene correlated to 'YEARS_TO_BIRTH'.

    • EEF2K|EEF2K-R-V

  • 2 genes correlated to 'GENDER'.

    • RPS6KA1|P90RSK_PT359_S363-R-C ,  BIRC2 |CIAP-R-V

  • 1 gene correlated to 'RACE'.

    • GSK3A GSK3B|GSK3-ALPHA-BETA-M-V

Results
Overview of the results

The complete statistical result table is provided in Supplement Table 1.

Table 1.  This table shows the clinical features, statistical methods used, and the number of genes that are significantly associated with each clinical feature at P value < 0.05 and Q value < 0.3.

Clinical feature   Statistical test            Significant genes   Associated with   Associated with
YEARS_TO_BIRTH     Spearman correlation test   N=1                 older N=0         younger N=1
GENDER             Wilcoxon test               N=2                 male N=2          female N=0
RACE               Kruskal-Wallis test         N=1
Clinical variable #1: 'YEARS_TO_BIRTH'

One gene related to 'YEARS_TO_BIRTH'.

Table S1.  Basic characteristics of clinical feature: 'YEARS_TO_BIRTH'

YEARS_TO_BIRTH Mean (SD) 47.77 (15)
  Significant markers N = 1
  pos. correlated 0
  neg. correlated 1
List of one gene differentially expressed by 'YEARS_TO_BIRTH'

Table S2.  List of one gene significantly correlated to 'YEARS_TO_BIRTH' by Spearman correlation test

Gene              SpearmanCorr   corrP       Q
EEF2K|EEF2K-R-V   -0.381         0.0005318   0.102
Clinical variable #2: 'GENDER'

2 genes related to 'GENDER'.

Table S3.  Basic characteristics of clinical feature: 'GENDER'

GENDER Labels N
  FEMALE 40
  MALE 39
     
  Significant markers N = 2
  Higher in MALE 2
  Higher in FEMALE 0
List of 2 genes differentially expressed by 'GENDER'

Table S4.  List of 2 genes differentially expressed by 'GENDER'. No significant genes located on sex chromosomes were filtered out.

Gene                            W(pos if higher in 'MALE')   wilcoxontestP   Q       AUC
RPS6KA1|P90RSK_PT359_S363-R-C   416                          0.0003647       0.07    0.7333
BIRC2 |CIAP-R-V                 1097                         0.001912        0.184   0.7032
Clinical variable #3: 'RACE'

One gene related to 'RACE'.

Table S5.  Basic characteristics of clinical feature: 'RACE'

RACE Labels N
  ASIAN 4
  BLACK OR AFRICAN AMERICAN 9
  WHITE 65
     
  Significant markers N = 1
List of one gene differentially expressed by 'RACE'

Table S6.  List of one gene differentially expressed by 'RACE'

Gene                              kruskal_wallis_P   Q
GSK3A GSK3B|GSK3-ALPHA-BETA-M-V   0.0001786          0.0343
Methods & Data
Input
  • Expression data file = PCPG-TP.rppa.txt

  • Clinical data file = PCPG-TP.merged_data.txt

  • Number of patients = 79

  • Number of genes = 192

  • Number of clinical features = 3

Selected clinical features
  • For the clinical features selected for this analysis and their value conversions, see the documentation on selected CDEs.

  • Survival time data

    • Survival time data is a combined value of days_to_death and days_to_last_followup. For each patient, the pipeline creates a combined value 'days_to_death_or_last_fup' using the conversion process below.

      • if 'vital_status'==1 (dead), 'days_to_last_followup' is always NA, so the 'days_to_death' value is used for 'days_to_death_or_fup'

      • if 'vital_status'==0 (alive),

        • if 'days_to_death'==NA & 'days_to_last_followup'!=NA, the 'days_to_last_followup' value is used for 'days_to_death_or_fup'

        • if 'days_to_death'!=NA, the case is excluded from survival analysis and reported.

      • if 'vital_status'==NA, the case is excluded from survival analysis and reported.

    • cf. In certain disease types such as SKCM, the days_to_death parameter is replaced with time_from_specimen_dx or time_from_specimen_procurement_to_death.

  • This analysis excluded clinical variables that have only NA values.
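
The survival-time conversion rules listed above can be sketched in Python as follows. This is only an illustration, not the pipeline's own code: the function name `combine_survival_time` is hypothetical, `None` stands in for NA, and the handling of cases the rules do not mention (a dead patient with no 'days_to_death', or a living patient with both fields missing) is an assumption.

```python
from typing import Optional, Tuple


def combine_survival_time(
    vital_status: Optional[int],
    days_to_death: Optional[float],
    days_to_last_followup: Optional[float],
) -> Tuple[Optional[float], Optional[str]]:
    """Return (combined_days, exclusion_reason).

    combined_days is None when the case is excluded; exclusion_reason is
    None when the case is kept. None plays the role of NA.
    """
    if vital_status is None:
        return None, "vital_status is NA"

    if vital_status == 1:  # dead: use days_to_death
        if days_to_death is None:
            # Assumption: not covered by the stated rules.
            return None, "dead but days_to_death is NA"
        return days_to_death, None

    if vital_status == 0:  # alive
        if days_to_death is not None:
            return None, "alive but days_to_death is not NA"
        if days_to_last_followup is not None:
            return days_to_last_followup, None
        # Assumption: not covered by the stated rules.
        return None, "alive but both time fields are NA"

    # Assumption: any other vital_status code is treated as unusable.
    return None, "unrecognized vital_status"


# Made-up patient records, for illustration only.
for record in [(1, 350, None), (0, None, 900), (0, 120, 900), (None, None, 900)]:
    days, reason = combine_survival_time(*record)
    print(record, "->", days if reason is None else "excluded: " + reason)
```

Returning the exclusion reason alongside the value keeps the "report the case" part of the rules explicit without deciding how the report is produced.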

Correlation analysis

For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
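
To make the statistic concrete, here is a minimal pure-Python sketch of Spearman's rank correlation: rank both variables (ties receive the average rank) and take the Pearson correlation of the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical and the input values are made up; the report itself used R's 'cor.test', which also returns the two-tailed P value that this sketch omits.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Made-up ages and expression values: expression falls as age rises.
ages = [30, 45, 60, 75]
expression = [2.0, 1.5, 1.1, 0.2]
print(spearman_rho(ages, expression))
```

A negative coefficient, as in the made-up data here, corresponds to the "neg. correlated" count reported for 'YEARS_TO_BIRTH'.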

Wilcoxon rank sum test (Mann-Whitney U test)

For continuous data split into two groups (mutant or wild-type), the Wilcoxon rank sum test (Mann and Whitney, 1947) was applied to compare the two groups using the 'wilcox.test(continuous.clinical ~ as.factor(group), exact=FALSE)' function in R. This test is equivalent to the Mann-Whitney U test; it compares the rank distributions of the two groups rather than their means.
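
As an illustration of what the rank sum test computes, the sketch below derives the U statistic by pair counting and a two-sided P value from the normal approximation with tie and continuity corrections, which is the approach R takes when 'exact=FALSE' is set with default options. The function name `rank_sum_test` is hypothetical and the inputs are made up. The sketch also returns U / (n1 × n2), the standard AUC interpretation of U; the AUC column in Table S4 appears consistent with this ratio (for example 1097 / (40 × 39) ≈ 0.7032), although the report does not state how its AUC is computed.

```python
import math
from collections import Counter


def rank_sum_test(first, second):
    """Return (U, two-sided P value, AUC) for two groups of numbers.

    U counts pairs where the value from `first` exceeds the value from
    `second`, with ties counted as one half. Assumes the pooled data are
    not all identical (otherwise the variance below is zero).
    """
    n1, n2 = len(first), len(second)
    u = 0.0
    for a in first:
        for b in second:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5

    n = n1 + n2
    # Tie correction: sum of t^3 - t over groups of tied values.
    tie_term = sum(t ** 3 - t for t in Counter(list(first) + list(second)).values())
    sigma = math.sqrt((n1 * n2 / 12.0) * ((n + 1) - tie_term / (n * (n - 1))))

    diff = u - n1 * n2 / 2.0
    # Continuity correction of 0.5 toward the null value.
    correction = math.copysign(0.5, diff) if diff != 0 else 0.0
    z = (diff - correction) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u, p_value, u / (n1 * n2)


# Made-up expression values for two groups.
u, p, auc = rank_sum_test([5, 6, 7], [1, 2, 3, 4])
print(u, p, auc)
```

An AUC of 0.5 means the groups are indistinguishable by rank, while values near 0 or 1 mean one group is almost always higher than the other.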

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
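
The Benjamini and Hochberg adjustment can be sketched in a few lines of Python: each P value is scaled by n / rank (rank 1 being the smallest P value), and a running minimum taken from the largest P value downward keeps the Q values monotone. The function name `bh_adjust` is hypothetical; the example below combines the smallest P value from Table S2 with 191 made-up P values of 0.5, purely to show the arithmetic for 192 tests.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    n = len(p_values)
    # Visit P values from largest to smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank 1 = smallest P value
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


# One P value from Table S2 plus 191 made-up P values of 0.5.
p_values = [0.0005318] + [0.5] * 191
q_values = bh_adjust(p_values)

# Apply the report's thresholds: P value < 0.05 and Q value < 0.3.
significant = [
    i for i, (p, q) in enumerate(zip(p_values, q_values)) if p < 0.05 and q < 0.3
]
print(round(q_values[0], 3), significant)
```

With 192 tests, the smallest P value is simply multiplied by 192, which is how a P value of 0.0005318 becomes a Q value of about 0.102.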

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References
[1] Spearman, C, The proof and measurement of association between two things, Amer. J. Psychol 15:72-101 (1904)
[2] Mann and Whitney, On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, Annals of Mathematical Statistics 18 (1), 50-60 (1947)
[3] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57:289-300 (1995)