Lung Adenocarcinoma: Preprocessing of clinical data
Maintained by TCGA GDAC Team (Broad Institute/Dana-Farber Cancer Institute/Harvard Medical School)
Overview
Introduction

The clinical information for each TCGA tumor sample is stored in an XML file whose entries include the patient ID and the tumor and treatment information. These XML files have been preprocessed for further analysis.

Summary

Clinical data were generated for the tier 1 clinical variables listed in Table 1.

Table 1.  Tier 1 clinical variables

Tumor.Feature                   Date.Statistics
gender                          dccuploaddate
primarysiteofdesease            dateofbirth
histologicaltype                dateofdeath
tumorstage                      dateoflastfollowup
tumorgrade                      dateoftumorrecurrence
patienttumorrecurrencestatus    dateofinitialpathologicdiagnosis
radiationtherapy                datelastknownalive
neoadjuvanttherapy              vitalstatus
pathologicspread(pt)
pathologicspread(pn)
karnofskyperformancescore
Results
Tier 1 Data Statistics

Table 2.  Statistics of selected clinical variables.

Clinical.Variable       Statistics
age                     mean: 66, std: 10
vitalstatus             156 living, 67 deceased
gender                  98 male, 125 female
histologicaltype        136 lung adenocarcinoma- not otherwise specified (nos), 54 lung adenocarcinoma mixed subtype, 1 lung mucinous adenocarcinoma, 8 lung papillary adenocarcinoma, 4 mucinous (colloid) adenocarcinoma, 5 lung acinar adenocarcinoma, 10 lung bronchioloalveolar carcinoma nonmucinous, 3 lung bronchioloalveolar carcinoma mucinous, 2 lung micropapillary adenocarcinoma
tumorstage              68 stage ib, 38 stage iiia, 46 stage ia, 9 stage iiib, 9 stage iv, 29 stage iib, 18 stage iia, 1 stage i
pathologicspread(pt)    101 t2, 36 t1, 15 t4, 17 t3, 14 t1b, 23 t2a, 6 t2b, 10 t1a
pathologicspread(pn)    132 n0, 41 n2, 43 n1, 1 n3, 5 nx
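
The category counts in Table 2 can be cross-checked against one another. The Python sketch below is only an illustration and is not part of the pipeline described in this report: it sums the counts for each categorical variable and compares each total with a cohort size that is assumed here to equal the vitalstatus total. The names TABLE2_COUNTS, COHORT_SIZE and GAPS are hypothetical. A nonzero gap may indicate patients with no recorded value for that variable, but the report does not state this.

```python
# Category counts copied from Table 2 of this report.
TABLE2_COUNTS = {
    "vitalstatus": {"living": 156, "deceased": 67},
    "gender": {"male": 98, "female": 125},
    "histologicaltype": {
        "lung adenocarcinoma- not otherwise specified (nos)": 136,
        "lung adenocarcinoma mixed subtype": 54,
        "lung mucinous adenocarcinoma": 1,
        "lung papillary adenocarcinoma": 8,
        "mucinous (colloid) adenocarcinoma": 4,
        "lung acinar adenocarcinoma": 5,
        "lung bronchioloalveolar carcinoma nonmucinous": 10,
        "lung bronchioloalveolar carcinoma mucinous": 3,
        "lung micropapillary adenocarcinoma": 2,
    },
    "tumorstage": {
        "stage ib": 68, "stage iiia": 38, "stage ia": 46, "stage iiib": 9,
        "stage iv": 9, "stage iib": 29, "stage iia": 18, "stage i": 1,
    },
    "pathologicspread(pt)": {
        "t2": 101, "t1": 36, "t4": 15, "t3": 17,
        "t1b": 14, "t2a": 23, "t2b": 6, "t1a": 10,
    },
    "pathologicspread(pn)": {"n0": 132, "n2": 41, "n1": 43, "n3": 1, "nx": 5},
}


def totals(counts):
    """Sum the category counts of each variable."""
    return {var: sum(cats.values()) for var, cats in counts.items()}


def unaccounted(counts, cohort_size):
    """Number of patients not covered by any listed category, per variable."""
    return {var: cohort_size - n for var, n in totals(counts).items()}


# Assumption: every patient has a vital status, so its total is the cohort size.
COHORT_SIZE = sum(TABLE2_COUNTS["vitalstatus"].values())
GAPS = unaccounted(TABLE2_COUNTS, COHORT_SIZE)

for var, gap in GAPS.items():
    print(f"{var}\ttotal={COHORT_SIZE - gap}\tunaccounted={gap}")
```

Under that assumption, vitalstatus, gender and histologicaltype each account for every patient, while tumorstage and the two pathologicspread variables fall slightly short.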
Tier 1 Data

Table 3.  Illustration of the tier 1 data for three patients

Clinical.Variable     Sample_1     Sample_2     Sample_3
yearstobirth          67           68           66
daystodeath           NA           NA           NA
daystolastfollowup    1158         606          426
vitalstatus           0            0            0
dccuploaddate         31-8-2012    31-8-2012    31-8-2012
Methods & Data
Work Flow

1. Each XML file is converted to a tab-delimited text file by our R package.

2. All text files are aggregated into one large table by the Clinical_Aggregate_Tier1 pipeline. The first column of the table holds the entry names from the XML files, and the remaining columns hold the associated data for the samples.

3. Data for the tier 1 clinical variables are extracted by the Clinical_Picker_Tier1 pipeline.
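
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the R package or the Clinical_Aggregate_Tier1 and Clinical_Picker_Tier1 pipelines themselves: the XML snippets are made-up, heavily simplified stand-ins for real TCGA clinical files, the "entry" column header and the "NA" fill value are assumptions, and every function and variable name is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified clinical XML documents, one per sample.
SAMPLE_XML = {
    "Sample_1": "<patient><gender>male</gender><vitalstatus>living</vitalstatus>"
                "<tumorstage>stage ib</tumorstage><batchnumber>1</batchnumber></patient>",
    "Sample_2": "<patient><gender>female</gender><vitalstatus>deceased</vitalstatus>"
                "<batchnumber>2</batchnumber></patient>",
}

# A small subset of the tier 1 variable names from Table 1.
TIER1 = {"gender", "vitalstatus", "tumorstage"}


def xml_to_rows(xml_text):
    """Step 1: flatten one XML document into (entry name, value) pairs."""
    rows = []
    for elem in ET.fromstring(xml_text).iter():
        if len(elem) == 0:  # leaf elements carry the values
            name = elem.tag.split("}")[-1]  # drop an XML namespace, if any
            rows.append((name, (elem.text or "").strip() or "NA"))
    return rows


def aggregate(per_sample):
    """Step 2: build one table; first column is the entry name, then one column per sample."""
    samples = list(per_sample)
    entries = []
    for rows in per_sample.values():
        for name, _ in rows:
            if name not in entries:
                entries.append(name)
    lookup = {sample: dict(rows) for sample, rows in per_sample.items()}
    table = [["entry"] + samples]
    for name in entries:
        table.append([name] + [lookup[s].get(name, "NA") for s in samples])
    return table


def pick(table, wanted):
    """Step 3: keep the header row and the rows for the wanted variables."""
    return [table[0]] + [row for row in table[1:] if row[0] in wanted]


def to_tsv(table):
    """Serialize a table as tab-delimited text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(table)
    return buf.getvalue()


PER_SAMPLE = {sample: xml_to_rows(text) for sample, text in SAMPLE_XML.items()}
FULL_TABLE = aggregate(PER_SAMPLE)
TIER1_TABLE = pick(FULL_TABLE, TIER1)
print(to_tsv(TIER1_TABLE))
```

In this sketch an entry that is absent from a sample's XML file is filled with NA, which mirrors how missing values appear in Table 3. How the actual pipelines handle missing entries is not described in this report.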

Diagram of Clinical Data Dicer

Figure 1.  Diagram that displays the work flow of processing clinical data. Clinical variables of interest and their associated values are marked in red and blue, respectively.

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.