The clinical information for each TCGA tumor sample is stored in an XML file. Each file contains entries such as the patient ID, tumor characteristics, and treatment information. These XML files have been preprocessed for further analysis.
Clinical data are generated for the tier 1 clinical variables listed in Table 1.
Table 1. Tier 1 clinical variables
Tumor.Feature | Date.Statistics |
---|---|
gender | dccuploaddate |
primarysiteofdesease | dateofbirth |
histologicaltype | dateofdeath |
tumorstage | dateoflastfollowup |
tumorgrade | dateoftumorrecurrence |
patienttumorrecurrencestatus | dateofinitialpathologicdiagnosis |
radiationtherapy | datelastknownalive |
neoadjuvanttherapy | vitalstatus |
pathologicspread(pt) | |
pathologicspread(pn) | |
karnofskyperformancescore | |
Table 2. Statistics of selected clinical variables.
Clinical.Variable | Statistics |
---|---|
age | mean: 60, std: 14 |
vitalstatus | 3933 living, 1738 deceased |
gender | 2288 male, 3373 female |
histologicaltype | 31 squamous cell carcinoma, 1 adenocarcinoma, 364 colon adenocarcinoma, 56 colon mucinous adenocarcinoma, 23 untreated primary (de novo) gbm, 2 treated primary gbm, 311 head & neck squamous cell carcinoma, 502 kidney clear cell renal carcinoma, 95 kidney papillary renal cell carcinoma, 53 astrocytoma, 67 oligodendroglioma, 40 oligoastrocytoma, 62 hepatocellular carcinoma, 187 lung adenocarcinoma- not otherwise specified (nos), 65 lung adenocarcinoma mixed subtype, 10 lung papillary adenocarcinoma, 2 lung mucinous adenocarcinoma, 4 mucinous (colloid) adenocarcinoma, 2 lung clear cell adenocarcinoma, 6 lung acinar adenocarcinoma, 3 lung bronchioloalveolar carcinoma mucinous, 9 lung bronchioloalveolar carcinoma nonmucinous, 3 lung micropapillary adenocarcinoma, 1 lung solid pattern predominant adenocarcinoma, 269 lung squamous cell carcinoma- not otherwise specified (nos), 7 lung basaloid squamous cell carcinoma, 1 lung small cell squamous cell carcinoma, 1 lung papillary squamous cell caricnoma, 1 lung papillary squamous cell carcinoma, 570 serous cystadenocarcinoma, 127 prostate adenocarcinoma acinar type, 149 rectal adenocarcinoma, 13 rectal mucinous adenocarcinoma, 16 stomach adenocarcinoma - diffuse type, 89 stomach adenocarcinoma - not otherwise specified (nos), 10 stomach intestinal adenocarcinoma - tubular type, 3 stomach intestinal adenocarcinoma - papillary type, 28 stomach intestinal adenocarcinoma - type not otherwise specified (nos), 8 stomach intestinal adenocarcinoma - mucinous type, 59 thyroid papillary carcinoma - follicular (>= 99% follicular patterned), 106 thyroid papillary carcinoma - classical/usual, 21 thyroid papillary carcinoma - tall cell (>= 50% tall cell features), 7 other, 79 serous endometrial adenocarcinoma, 333 endometrioid endometrial adenocarcinoma, 18 mixed serous and endometrioid, 7 endometrioid endometrial adenocarcinoma (grade 3), 3 endometrioid endometrial adenocarcinoma (grade 1 or 2), 9 endometrioid endometrial adenocarcinoma (grade 1), 2 endometrioid endometrial adenocarcinoma (grade 2) |
patienttumorrecurrencestatus | 5671 without recurrence |
tumorgrade | 248 grade 3, 106 grade 2, 90 grade 1, 7 high grade |
tumorstage | 203 stage iia, 106 stage iiib, 108 stage iib, 47 stage iiic, 192 stage iv, 415 stage i, 179 stage iva, 4 stage iic, 177 stage ii, 218 stage iii, 140 stage iiia, 6 stage ivb, 114 stage ia, 201 stage ib, 403 iiic, 3 ib, 24 iiib, 87 iv, 19 iic, 3 ia, 8 iiia, 10 ic, 4 iib, 3 iia |
pathologicspread(pt) | 585 t3, 10 t4b, 84 t4, 547 t2, 131 t4a, 167 t1, 1 t0, 1 tis, 55 tx, 153 t1b, 60 t3b, 177 t1a, 72 t2a, 137 t3a, 2 t3c, 34 t2b |
pathologicspread(pn) | 1087 n0, 355 n1, 193 n2, 17 n1b, 73 n2b, 11 n2a, 17 n1a, 3 n1c, 401 nx, 33 n2c, 24 n3, 2 n3a |
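As a rough illustration of how summaries in the style of Table 2 could be produced, the sketch below tallies a categorical variable and computes the mean and standard deviation of a numeric one. The function names, the `NA` convention for missing values, and the use of the sample (n-1) standard deviation are assumptions made for this example; the report does not state how its statistics were computed.

```python
from collections import Counter
from statistics import mean, stdev

def summarize_categorical(values):
    """Count each distinct value, skipping missing entries ("NA" is assumed)."""
    counts = Counter(v for v in values if v != "NA")
    return ", ".join(f"{n} {label}" for label, n in counts.items())

def summarize_numeric(values):
    """Mean and sample standard deviation, skipping missing entries."""
    nums = [float(v) for v in values if v != "NA"]
    return f"mean: {mean(nums):.0f}, std: {stdev(nums):.0f}"

# Toy inputs; these are not the cohort behind Table 2.
vital_summary = summarize_categorical(["living", "deceased", "living", "NA"])
age_summary = summarize_numeric(["73", "57", "65", "NA"])
print(vital_summary)
print(age_summary)
```

Missing values are dropped before counting, which may explain why per-variable totals in a table like this need not add up to the same number of samples.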
Table 3. Illustration of the tier 1 data for three patients
Clinical.Variable | Sample_1 | Sample_2 | Sample_3 |
---|---|---|---|
yearstobirth | 73 | 57 | 65 |
daystodeath | NA | 223 | 81 |
daystolastfollowup | 389 | NA | NA |
vitalstatus | 0 | 1 | 1 |
dccuploaddate | 27-7-2012 | 27-7-2012 | 27-7-2012 |
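Table 3 stores variables as rows and samples as columns. A downstream analysis often wants one record per sample instead, so the following sketch shows one possible way to load such a table and transpose it. It assumes a tab-delimited layout and `NA` as the missing-value marker; the helper names `parse_value` and `load_by_sample` are hypothetical and are not part of the pipelines described here.

```python
import csv
import io

# The rows of Table 3, written as tab-delimited text for illustration.
TABLE3 = """Clinical.Variable\tSample_1\tSample_2\tSample_3
yearstobirth\t73\t57\t65
daystodeath\tNA\t223\t81
daystolastfollowup\t389\tNA\tNA
vitalstatus\t0\t1\t1
dccuploaddate\t27-7-2012\t27-7-2012\t27-7-2012
"""

def parse_value(text):
    """Map "NA" to None, digit-only strings to int, and leave the rest as text."""
    if text == "NA":
        return None
    return int(text) if text.isdigit() else text

def load_by_sample(tsv_text):
    """Transpose a variables-by-samples table into {sample: {variable: value}}."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    records = {sample: {} for sample in header[1:]}
    for row in body:
        for sample, cell in zip(header[1:], row[1:]):
            records[sample][row[0]] = parse_value(cell)
    return records

records = load_by_sample(TABLE3)
print(records["Sample_2"])
```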
1. Each XML file is converted to a tab-delimited text file by our R package.
2. All text files are aggregated into one large table by the Clinical_Aggregate_Tier1 pipeline. The first column of the table holds the entry names from the XML files, and the remaining columns hold the associated data for each sample.
3. Data for the tier 1 clinical variables are extracted by the Clinical_Picker_Tier1 pipeline.
Figure 1. Diagram of the workflow for processing clinical data. Clinical variables of interest and their associated values are marked in red and blue, respectively.
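The three processing steps listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the R package or the Clinical_Aggregate_Tier1 and Clinical_Picker_Tier1 pipelines themselves. The XML layout, the tag-name normalization, the function names, and the three-variable `TIER1` subset are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of the tier 1 variables from Table 1.
TIER1 = ["gender", "tumorstage", "vitalstatus"]

# Hypothetical, heavily simplified patient files (real TCGA XML is richer).
SAMPLE_1 = ("<patient><gender>MALE</gender><vital_status>LIVING</vital_status>"
            "<tumor_stage>Stage IIA</tumor_stage><icd_10>C18.7</icd_10></patient>")
SAMPLE_2 = ("<patient><gender>FEMALE</gender>"
            "<vital_status>DECEASED</vital_status></patient>")

def xml_to_pairs(xml_text):
    """Step 1: flatten one XML file into {entry name: value} for leaf elements."""
    pairs = {}
    for elem in ET.fromstring(xml_text).iter():
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:
            # Drop any namespace prefix, lowercase, and remove underscores.
            name = elem.tag.split("}")[-1].lower().replace("_", "")
            pairs[name] = text
    return pairs

def aggregate(samples):
    """Step 2: first column is the entry name, remaining columns are samples."""
    ids = list(samples)
    names = sorted({name for pairs in samples.values() for name in pairs})
    rows = [[name] + [samples[s].get(name, "NA") for s in ids] for name in names]
    return [["entry"] + ids] + rows

def pick(table, wanted):
    """Step 3: keep the header plus the rows for the wanted variables."""
    return [table[0]] + [row for row in table[1:] if row[0] in wanted]

def to_tsv(table):
    """Render a table as tab-delimited text."""
    return "\n".join("\t".join(row) for row in table)

table = aggregate({"Sample_1": xml_to_pairs(SAMPLE_1),
                   "Sample_2": xml_to_pairs(SAMPLE_2)})
tier1 = pick(table, TIER1)
print(to_tsv(tier1))
```

Entries missing from a sample's file are filled with `NA` during aggregation, so every row has one cell per sample before the tier 1 rows are picked.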
