Correlation between RPPA expression and clinical features
Pheochromocytoma and Paraganglioma (Primary solid tumor)
28 January 2016  |  analyses__2016_01_28
Maintainer Information
Maintained by Juok Cho (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2016): Correlation between RPPA expression and clinical features. Broad Institute of MIT and Harvard. doi:10.7908/C1DB819Q
Overview
Introduction

This pipeline uses various statistical tests to identify RPPAs whose expression levels correlate with selected clinical features. The input file "PCPG-TP.rppa.txt" is generated by the pipeline RPPA_AnnotateWithGene in the stddata run.

Summary

We tested the association between 192 genes and 7 clinical features across 79 samples. At the thresholds P value < 0.05 and Q value < 0.3, 5 clinical features were related to at least one gene.

  • 1 gene correlated to 'YEARS_TO_BIRTH'.

    • EEF2K|EEF2K

  • 30 genes correlated to 'TUMOR_TISSUE_SITE'.

    • ANXA1|ANNEXIN-1 ,  CAV1|CAVEOLIN-1 ,  BAK1|BAK ,  MYH9|MYOSIN-IIA_PS1943 ,  PREX1|PREX1 ,  ...

  • 2 genes correlated to 'GENDER'.

    • RPS6KA1|P90RSK_PT359_S363 ,  BIRC2 |CIAP

  • 1 gene correlated to 'NUMBER_OF_LYMPH_NODES'.

    • RPS6|S6_PS240_S244

  • 1 gene correlated to 'RACE'.

    • GSK3A GSK3B|GSK3-ALPHA-BETA

  • No genes correlated to 'KARNOFSKY_PERFORMANCE_SCORE' or 'HISTOLOGICAL_TYPE'.

Results
Overview of the results

The complete statistical result table is provided in Supplement Table 1.

Table 1.  This table shows the clinical features, statistical methods used, and the number of genes that are significantly associated with each clinical feature at P value < 0.05 and Q value < 0.3.

Clinical feature             Statistical test           Significant genes  Associated with                   Associated with
YEARS_TO_BIRTH               Spearman correlation test  N=1                older N=0                         younger N=1
TUMOR_TISSUE_SITE            Wilcoxon test              N=30               extra-adrenal site N=30           adrenal gland N=0
GENDER                       Wilcoxon test              N=2                male N=2                          female N=0
KARNOFSKY_PERFORMANCE_SCORE  Wilcoxon test              N=0
HISTOLOGICAL_TYPE            Kruskal-Wallis test        N=0
NUMBER_OF_LYMPH_NODES        Spearman correlation test  N=1                higher number_of_lymph_nodes N=0  lower number_of_lymph_nodes N=1
RACE                         Kruskal-Wallis test        N=1
Clinical variable #1: 'YEARS_TO_BIRTH'

One gene related to 'YEARS_TO_BIRTH'.

Table S1.  Basic characteristics of clinical feature: 'YEARS_TO_BIRTH'

YEARS_TO_BIRTH Mean (SD) 47.77 (15)
  Significant markers N = 1
  pos. correlated 0
  neg. correlated 1
List of one gene differentially expressed by 'YEARS_TO_BIRTH'

Table S2.  List of one gene significantly correlated to 'YEARS_TO_BIRTH' by Spearman correlation test

Gene         SpearmanCorr  corrP      Q
EEF2K|EEF2K  -0.381        0.0005318  0.102
Clinical variable #2: 'TUMOR_TISSUE_SITE'

30 genes related to 'TUMOR_TISSUE_SITE'.

Table S3.  Basic characteristics of clinical feature: 'TUMOR_TISSUE_SITE'

TUMOR_TISSUE_SITE Labels N
  ADRENAL GLAND 67
  EXTRA-ADRENAL SITE 12
     
  Significant markers N = 30
  Higher in EXTRA-ADRENAL SITE 30
  Higher in ADRENAL GLAND 0
List of top 10 genes differentially expressed by 'TUMOR_TISSUE_SITE'

Table S4.  List of top 10 genes differentially expressed by 'TUMOR_TISSUE_SITE'

Gene                    W(pos if higher in 'EXTRA-ADRENAL SITE')  wilcoxontestP  Q       AUC
ANXA1|ANNEXIN-1         657                                       0.0005086      0.0548  0.8172
CAV1|CAVEOLIN-1         649                                       0.0007601      0.0548  0.8072
BAK1|BAK                160                                       0.0009715      0.0548  0.801
MYH9|MYOSIN-IIA_PS1943  638                                       0.001297       0.0548  0.7935
PREX1|PREX1             636                                       0.001426       0.0548  0.791
YAP1|YAP                629                                       0.001976       0.0632  0.7823
SMAD3|SMAD3             183                                       0.002841       0.0682  0.7724
SYK|SYK                 621                                       0.002841       0.0682  0.7724
STAT3|STAT3_PY705       613                                       0.004038       0.0861  0.7624
NDRG1|NDRG1_PT346       605                                       0.005676       0.103   0.7525
Clinical variable #3: 'GENDER'

2 genes related to 'GENDER'.

Table S5.  Basic characteristics of clinical feature: 'GENDER'

GENDER Labels N
  FEMALE 40
  MALE 39
     
  Significant markers N = 2
  Higher in MALE 2
  Higher in FEMALE 0
List of 2 genes differentially expressed by 'GENDER'

Table S6.  List of 2 genes differentially expressed by 'GENDER'. No significant genes located on sex chromosomes were filtered out.

Gene                       W(pos if higher in 'MALE')  wilcoxontestP  Q      AUC
RPS6KA1|P90RSK_PT359_S363  416                         0.0003647      0.07   0.7333
BIRC2 |CIAP                1097                        0.001912       0.184  0.7032
Clinical variable #4: 'KARNOFSKY_PERFORMANCE_SCORE'

No gene related to 'KARNOFSKY_PERFORMANCE_SCORE'.

Table S7.  Basic characteristics of clinical feature: 'KARNOFSKY_PERFORMANCE_SCORE'

KARNOFSKY_PERFORMANCE_SCORE Labels N
  class100 29
  class90 4
     
  Significant markers N = 0
Clinical variable #5: 'HISTOLOGICAL_TYPE'

No gene related to 'HISTOLOGICAL_TYPE'.

Table S8.  Basic characteristics of clinical feature: 'HISTOLOGICAL_TYPE'

HISTOLOGICAL_TYPE Labels N
  PARAGANGLIOMA 5
  PARAGANGLIOMA; EXTRA-ADRENAL PHEOCHROMOCYTOMA 6
  PHEOCHROMOCYTOMA 68
     
  Significant markers N = 0
Clinical variable #6: 'NUMBER_OF_LYMPH_NODES'

One gene related to 'NUMBER_OF_LYMPH_NODES'.

Table S9.  Basic characteristics of clinical feature: 'NUMBER_OF_LYMPH_NODES'

NUMBER_OF_LYMPH_NODES Mean (SD) 1.6 (4)
  Value N
  0 6
  1 3
  13 1
     
  Significant markers N = 1
  pos. correlated 0
  neg. correlated 1
List of one gene differentially expressed by 'NUMBER_OF_LYMPH_NODES'

Table S10.  List of one gene significantly correlated to 'NUMBER_OF_LYMPH_NODES' by Spearman correlation test

Gene                SpearmanCorr  corrP      Q
RPS6|S6_PS240_S244  -0.8739       0.0009485  0.182
Clinical variable #7: 'RACE'

One gene related to 'RACE'.

Table S11.  Basic characteristics of clinical feature: 'RACE'

RACE Labels N
  ASIAN 4
  BLACK OR AFRICAN AMERICAN 9
  WHITE 65
     
  Significant markers N = 1
List of one gene differentially expressed by 'RACE'

Table S12.  List of one gene differentially expressed by 'RACE'

Gene                         kruskal_wallis_P  Q
GSK3A GSK3B|GSK3-ALPHA-BETA  0.0001786         0.0343
Methods & Data
Input
  • Expression data file = PCPG-TP.rppa.txt

  • Clinical data file = PCPG-TP.merged_data.txt

  • Number of patients = 79

  • Number of genes = 192

  • Number of clinical features = 7

Selected clinical features
  • For further details on the clinical features selected for this analysis, see the documentation on selected CDEs (Clinical Data Elements). The first column of that file is a formula to convert values and the second column is a clinical parameter name.

  • Survival time data

    • Survival time data is a combined value of days_to_death and days_to_last_followup. For each patient, a combined value 'days_to_death_or_last_fup' is created using the conversion process below.

      • If 'vital_status'==1 (dead), 'days_to_last_followup' is always NA, so the 'days_to_death' value is used for 'days_to_death_or_fup'.

      • If 'vital_status'==0 (alive):

        • If 'days_to_death'==NA & 'days_to_last_followup'!=NA, the 'days_to_last_followup' value is used for 'days_to_death_or_fup'.

        • If 'days_to_death'!=NA, the case is excluded from survival analysis and reported.

      • If 'vital_status'==NA, the case is excluded from survival analysis and reported.

    • cf. In certain disease types such as SKCM, the days_to_death parameter is replaced with time_from_specimen_dx or time_from_specimen_procurement_to_death.

  • This analysis excluded clinical variables that have only NA values.
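
The survival time conversion rules above can be sketched in Python as follows. This is a minimal illustration and not the pipeline's own code: the function name combine_survival_time is hypothetical, None stands in for NA, and the handling of a living patient with both fields NA is an assumption, since the rules above do not cover that case.

```python
def combine_survival_time(vital_status, days_to_death, days_to_last_followup):
    """Return (days_to_death_or_fup, note); the value is None for excluded cases.

    None plays the role of NA. The function name and the note strings are
    illustrative only.
    """
    if vital_status is None:
        return None, "excluded: vital_status is NA"
    if vital_status == 1:
        # Dead: days_to_last_followup is NA by convention, so use days_to_death.
        return days_to_death, "dead: days_to_death used"
    # vital_status == 0 (alive)
    if days_to_death is not None:
        return None, "excluded: alive but days_to_death is not NA"
    if days_to_last_followup is not None:
        return days_to_last_followup, "alive: days_to_last_followup used"
    # Assumption: not covered by the stated rules, treated here as excluded.
    return None, "excluded: no usable time value"


print(combine_survival_time(1, 400, None))
print(combine_survival_time(0, None, 900))
print(combine_survival_time(0, 120, 900))
```

Returning a note alongside the value makes it straightforward to report the excluded cases, as the rules require.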

Correlation analysis

For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
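
To show what the coefficient measures, the sketch below computes Spearman's rank correlation in plain Python: both variables are converted to ranks (tied values share their average rank) and the Pearson correlation of the ranks is taken. It is an illustration only, not the R 'cor.test' implementation, it does not compute P values, and the helper names and sample values are made up.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie block i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Made-up values: expression falls steadily as age rises, so rho is -1.
age = [30, 45, 50, 62, 71]
expression = [1.2, 0.9, 0.4, 0.1, -0.3]
print(spearman_rho(age, expression))
```

A negative coefficient, as reported for EEF2K|EEF2K against 'YEARS_TO_BIRTH', means expression tends to be lower in older patients.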

Wilcoxon rank sum test (Mann-Whitney U test)

For two groups of continuous data (in this report, the two classes of a binary clinical feature such as 'GENDER'), the Wilcoxon rank sum test (Mann and Whitney 1947) was applied to compare the distributions of the two groups using the 'wilcox.test(continuous.clinical ~ as.factor(group), exact=FALSE)' function in R. This test is equivalent to the Mann-Whitney U test.
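
The W statistic reported by R's 'wilcox.test' for two samples equals the number of cross-group pairs in which the first-sample value exceeds the second-sample value, with ties counted as one half. Dividing W by the number of pairs gives an AUC. The sketch below is a plain Python illustration with hypothetical function names. The folding step max(f, 1 - f) is an inference from the reported numbers (for example, 657 / (12 × 67) ≈ 0.8172 and 1 - 160 / (12 × 67) ≈ 0.801 in Table S4), not a documented formula of this pipeline.

```python
def wilcoxon_w(x, y):
    """W for samples x and y: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def auc_from_w(w, n_x, n_y):
    """Fraction of concordant pairs, folded so the result is at least 0.5.

    The folding is an assumption inferred from the AUC column of the tables.
    """
    frac = w / (n_x * n_y)
    return max(frac, 1.0 - frac)


# Made-up samples: 8.5 of the 9 cross-group pairs favour the first group.
print(wilcoxon_w([3, 4, 5], [1, 2, 3]))

# Group sizes from Table S3: 12 extra-adrenal and 67 adrenal gland samples.
print(round(auc_from_w(657, 12, 67), 4))
print(round(auc_from_w(160, 12, 67), 4))
```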

Q value calculation

For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
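
The Benjamini and Hochberg adjustment can be sketched in plain Python as below. Each P value is scaled by n / rank, where rank is its position in ascending order, and a running minimum taken from the largest P value downward keeps the Q values monotone. This is an illustrative reimplementation with a hypothetical function name, not the R source. As a consistency check, with 192 tests the smallest P value in Table S2 gives 0.0005318 × 192 ≈ 0.102, which matches the reported Q.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg Q values, returned in the input order."""
    n = len(pvalues)
    # Visit P values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    qvalues = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues


# Made-up P values for illustration.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

A gene is then called significant in this report when both its P value is below 0.05 and its Q value is below 0.3.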

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References
[1] Spearman, C, The proof and measurement of association between two things, Amer. J. Psychol. 15:72-101 (1904)
[2] Mann and Whitney, On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, Annals of Mathematical Statistics 18(1):50-60 (1947)
[3] Benjamini and Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society Series B 57:289-300 (1995)