LowPass Copy number analysis (GISTIC2)

Colon Adenocarcinoma (Primary solid tumor)

23 September 2013 | analyses__2013_09_23

Maintainer Information

Citation Information

Maintained by Spring Yingchun Liu (Broad Institute)

Cite as Broad Institute TCGA Genome Data Analysis Center (2013): LowPass Copy number analysis (GISTIC2). Broad Institute of MIT and Harvard. doi:10.7908/C1ST7N57

Overview

Introduction

GISTIC identifies genomic regions that are significantly gained or lost across a set of tumors. The pipeline first filters out normal samples from the segmented copy-number data by inspecting the TCGA barcodes and then executes GISTIC version 2.0.19 (Firehose task version: 125).

Summary

There were 68 tumor samples used in this analysis: 17 significant arm-level results, 9 significant focal amplifications, and 10 significant focal deletions were found.

Results

Focal results

Figure 1. Genomic positions of amplified regions: the X-axis represents the normalized amplification signals (top) and significance by Q value (bottom). The green line represents the significance cutoff at Q value=0.25.

Table 1. Get Full Table Amplifications Table - 9 significant amplifications found. Click the link in the last column to view a comprehensive list of candidate genes. If no genes were identified within the peak, the nearest gene appears in brackets.

Cytoband	Q value	Residual Q value	Wide Peak Boundaries	# Genes in Wide Peak
14q11.2	4.9231e-06	4.9231e-06	chr14:21725002-22028999	10
17q12	4.9231e-06	4.9231e-06	chr17:35082002-35511330	5
8p11.23	0.0023807	0.0023807	chr8:38093542-38720999	8
7p14.1	0.0034225	0.0034225	chr7:38259002-38726000	5
5q31.1	0.0038827	0.0038827	chr5:134285002-134292999	1
19q13.11	0.040955	0.040955	chr19:23976001-32820288	19
3p11.1	0.060697	0.060697	chr3:90387002-95001999	6
20q12	0.061642	0.061642	chr20:41531002-43781999	32
13q12.13	0.18467	0.18467	chr13:18975002-33955000	101

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 14q11.2.

Table S1. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
HNRNPC
SALL2
TOX4
SUPT16H
METTL3
RPGRIP1
CHD8
RAB2B
SNORD8
SNORD9

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 17q12.

Table S2. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
hsa-mir-2909
ACACA
LHX1
AATF
MIR2909

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 8p11.23.

Table S3. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
FGFR1
WHSC1L1
TACC1
DDHD2
PPAPDC1B
LETM2
RNF5P1
C8orf86

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 7p14.1.

Table S4. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
AMPH
STARD3NL
FAM183B
TARP
LOC100506776

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 5q31.1.

Table S5. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
PCBD2

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 19q13.11.

Table S6. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
CCNE1
UQCRFS1
URI1
ZNF254
ZNF536
POP4
TSHZ3
PLEKHF1
C19orf12
DKFZp566F0947
LOC148145
LOC148189
LOC284395
VSTM2B
RPSAP58
ZNF726
LOC100101266
LOC100505835
THEG5

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 3p11.1.

Table S7. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
PROS1
NSUN3
ARL13B
DHFRL1
LOC255025
STX19

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 20q12.

Table S8. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
ADA
HNF4A
KCNS1
MYBL2
SRSF6
STK4
YWHAB
WISP2
SGK2
TOMM34
SERINC3
PTPRT
PKIG
L3MBTL1
IFT52
C20orf111
JPH2
KCNK15
GDAP1L1
LOC79015
TTPAL
PABPC1L
TOX2
FITM2
WFDC12
RIMS4
R3HDML
GTSF1L
WFDC5
MIR3646
LOC100505783
LOC100505826

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 13q12.13.

Table S9. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
BRCA2
CDX2
FLT3
hsa-mir-2276
PARP4
ALOX5AP
ATP12A
CDK8
FGF9
FLT1
GJA3
GJB2
GPR12
GTF3A
HMGB1
PDX1
MIPEP
PABPC3
UBL3
RNF6
RPL21
SGCG
SLC7A1
TUBA3C
ZMYM2
IFT88
MTMR6
ZMYM5
KL
NUPL1
FRY
USPL1
SAP18
N4BP2L2
GJB6
HSPH1
WASF3
PDS5B
MTUS2
SACS
LATS2
SNORD102
POLR1D
CRYL1
POMP
ATP8A2
IL17D
MPHOSPH8
PSPC1
TNFRSF19
CENPJ
RNF17
XPO4
MRP63
KATNAL1
C13orf33
STARD13
N4BP2L1
TPTE2
CG030
RXFP2
TEX26
B3GALTL
EEF1DP3
FAM123A
USP12
MTIF3
GSX1
N6AMT2
SKA3
EFHA1
SPATA13
LNX2
ZDHHC20
PAN3
PHF2P1
SLC46A3
ANKRD20A9P
C1QTNF9
LINC00442
TPTE2P6
RASL11A
C1QTNF9B
SHISA2
ATP5EP2
LOC440131
C1QTNF9B-AS1
SNORA27
BASP1P1
TPTE2P1
PRHOXNB
ZAR1L
ANKRD26P3
RPL21P28
LINC00426
LINC00421
PAN3-AS1
MIR2276
LINC00327
TEX26-AS1
MIR4499

Figure 2. Genomic positions of deleted regions: the X-axis represents the normalized deletion signals (top) and significance by Q value (bottom). The green line represents the significance cutoff at Q value=0.25.

Table 2. Get Full Table Deletions Table - 10 significant deletions found. Click the link in the last column to view a comprehensive list of candidate genes. If no genes were identified within the peak, the nearest gene appears in brackets.

Cytoband	Q value	Residual Q value	Wide Peak Boundaries	# Genes in Wide Peak
16p13.3	6.0373e-15	6.0373e-15	chr16:6491002-6589853	1
8p11.22	0.0091733	0.0091733	chr8:39099379-39516999	5
8p22	0.0091733	0.0091733	chr8:1-30983845	262
4q35.2	0.016325	0.016325	chr4:168192002-191154276	102
16q23.1	0.02841	0.02841	chr16:77057138-77366999	3
4q22.1	0.036381	0.036381	chr4:91433002-92003999	1
15q11.1	0.055614	0.055614	chr15:1-24844893	40
1p36.13	0.076178	0.076178	chr1:5019002-33293999	427
3p14.2	0.089678	0.089678	chr3:59472001-61198999	1
17p12	0.096544	0.096544	chr17:10790001-11970794	4

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 16p13.3.

Table S10. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
RBFOX1

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 8p11.22.

Table S11. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
ADAM3A
ADAM18
ADAM32
ADAM5P
LOC100130964

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 8p22.

Table S12. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
PCM1
WRN
hsa-mir-3148
hsa-mir-4288
hsa-mir-4287
hsa-mir-548h-4
hsa-mir-320a
hsa-mir-548v
hsa-mir-383
hsa-mir-598
hsa-mir-1322
hsa-mir-4286
hsa-mir-124-1
hsa-mir-597
hsa-mir-548i-3
hsa-mir-596
NAT1
NAT2
ADRA1A
ANGPT2
ASAH1
ATP6V1B2
BLK
BMP1
POLR3D
BNIP3L
CHRNA2
CLU
CTSB
DEFA1
DEFA3
DEFA4
DEFA5
DEFA6
DEFB1
DEFB4A
DPYSL2
DUSP4
EGR3
EPB49
EPHX2
CLN8
EXTL3
PTK2B
FDFT1
FGL1
GATA4
GFRA2
GNRH1
GSR
GTF2E2
LOXL2
LPL
MSR1
MSRA
NEFM
NEFL
NKX3-1
PDGFRL
PNOC
PPP2CB
PPP2R2A
PPP3CC
SFTPC
SLC7A2
SLC18A1
STC1
FZD3
TUSC3
UBXN8
TNKS
ADAM7
TNFRSF10D
TNFRSF10C
TNFRSF10B
TNFRSF10A
FGF17
DOK2
MTMR7
MYOM2
DLGAP2
MFHAS1
ENTPD4
ARHGEF10
PHYHIP
KBTBD11
SORBS3
NPM2
DLC1
SPAG11B
DCTN6
PNMA2
ADAM28
RBPMS
LZTS1
XPO7
TRIM35
RHOBTB2
KIF13B
PSD3
LEPROTL1
SLC39A14
FBXO25
FGF20
ADAMDEC1
CNOT7
PURG
ZDHHC2
SLC25A37
SCARA3
TMEM66
KCTD9
PINX1
PIWIL2
ELP3
INTS10
CCDC25
AGPAT5
INTS9
CSGALNACT1
HR
PBK
ZNF395
DEFB103B
BIN3
TEX15
MTUS1
KIAA1456
KIAA1967
SH2D4A
PDLIM2
CSMD1
EBF2
FAM160B2
MTMR9
HMBOX1
MCPH1
PPP1R3B
NUDT18
DOCK5
FLJ14107
REEP4
STMN4
SOX7
FAM167A
SLC35G5
LINC00208
C8orf12
FAM86B1
ERI1
LONRF1
CHMP7
RP1L1
CLDN23
VPS37A
NKX2-6
SGCZ
DEFB104A
LOC157273
SGK223
PEBP4
CDCA2
ESCO2
FBXO16
LOC157627
C8orf42
ERICH1
TDH
C8orf48
ZNF596
DEFT1P
R3HCC1
PRSS55
C8orf74
LGI3
DEFB105A
DEFB106A
DEFB107A
DEFB109P1
DEFB130
NEIL2
LOC254896
FLJ10661
XKR6
LOC286059
LOC286083
EFHA2
LOC286114
SCARA5
LOC286135
LOC340357
LOC349196
USP17L2
XKR5
FAM90A25P
LOC389641
C8orf80
LOC392196
MIR124-1
MIR320A
DEFB103A
OR4F21
FAM90A13
FAM90A5
FAM90A7
FAM90A8
FAM90A18
FAM90A9
FAM90A10
DEFA10P
MIR383
DEFB107B
DEFB104B
DEFB106B
DEFB105B
C8orf58
DEFB135
DEFB136
DEFB134
C8orf75
MBOAT4
DEFB109P1B
RPL23AP53
FAM90A14
FAM86B2
SPAG11A
MIR596
MIR597
MIR598
DEFA1B
FAM90A20
FAM90A19
ZNF705D
LOC100128750
FAM66B
LOC100128993
ZNF705G
FAM66E
LOC100132396
FAM66D
FAM66A
LOC100133267
LOC100287015
DEFT1P2
DEFB4B
MIR1322
MIR548I3
MIR4287
MIR3148
MIR4288
MIR3926-2
MIR3622A
MIR3926-1
MIR3622B
LOC100506990
LOC100507156
LOC100507341
MIR548O2
MIR4659A
MIR4660
MIR4659B
LOC100652791

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 4q35.2.

Table S13. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
DUX4
hsa-mir-1305
hsa-mir-4276
hsa-mir-548t
AGA
SLC25A4
CASP3
CLCN3
DCTD
F11
ACSL1
FAT1
FRG1
GPM6A
HMGB2
HPGD
HSP90AA4P
ING2
IRF2
KLKB1
MTNR1A
NEK1
TLR3
VEGFC
GLRA3
SORBS2
SAP30
HAND2
MFAP3L
ADAM29
ANXA10
SCRG1
PALLD
FAM149A
FBXO8
DUX2
PDLIM3
AADAT
GALNT7
CLDN22
C4orf27
NEIL3
UFSP2
DDX60
CDKN2AIP
ODZ3
LRP2BP
STOX2
KIAA1430
SH3RF1
SPCS3
TRAPPC11
MLF1IP
NBLA00301
WWC2
CEP44
SNX25
CBR4
MGC45800
DDX60L
WDR17
ZFP42
SPATA4
ENPP6
ASB5
C4orf38
RWDD4
CCDC111
TRIML2
CCDC110
CYP4V2
LOC285441
LOC285501
LOC339975
TRIML1
ANKRD37
LOC389247
HELT
LOC401164
FAM92A3
HSP90AA6P
C4orf47
DUX4L4
GALNTL6
FRG2
SLED1
FLJ38576
DUX4L6
DUX4L5
DUX4L3
LINC00290
LOC728175
DUX4L2
LOC731424
CLDN24
LOC100288255
MIR1305
MIR4276
MIR3945
LOC100506085
LOC100506122
LOC100506229

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 16q23.1.

Table S14. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
MON1B
ADAMTS18
SYCE1L

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 4q22.1.

Table S15. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
FAM190A

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 15q11.1.

Table S16. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
hsa-mir-1268
hsa-mir-3118-6
hsa-mir-3118-4
NBEAP1
NDN
MKRN3
CYFIP1
MAGEL2
NIPA2
TUBGCP5
NIPA1
LOC283683
OR4N4
HERC2P3
GOLGA6L1
GOLGA8IP
WHAMMP3
POTEB
LOC348120
GOLGA8E
OR4M2
OR4N3P
HERC2P2
NF1P2
CHEK2P2
LOC646214
CXADRP2
REREP3
LOC653061
GOLGA6L6
LOC727924
GOLGA8C
PWRN1
PWRN2
HERC2P7
GOLGA8DP
MIR4509-1
MIR4509-2
MIR4508
MIR4509-3

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 1p36.13.

Table S17. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
LCK
PAX7
RPL22
SDHB
ARID1A
MDS2
hsa-mir-4254
hsa-mir-1976
hsa-mir-3115
hsa-mir-4253
hsa-mir-1256
hsa-mir-1290
hsa-mir-1273d
hsa-mir-34a
hsa-mir-4252
ALPL
RERE
BAI2
C1QA
C1QB
C1QC
CA6
CAPZB
CASP9
RUNX3
TNFRSF8
CDA
CDC42
CD52
RCC1
CLCN6
CLCNKA
CLCNKB
CNR2
COL16A1
CORT
DDOST
DFFA
E2F2
ECE1
EPHA2
ENO1
EPB41
EPHA8
EPHB2
EXTL1
EYA3
FABP3
FGR
MTOR
FUCA1
IFI6
GALE
SFN
GPR3
HCRTR1
HDAC1
ZBTB48
HMGN2
HMGCL
HSPG2
HTR1D
HTR6
ID3
TNFRSF9
STMN1
MATN1
MFAP2
MTHFR
NBL1
NPPA
NPPB
OPRD1
PAFAH2
PEX14
PGD
PIK3CD
PLA2G2A
PLA2G5
PLOD1
EXOSC10
PPP1R8
PTAFR
RAP1GAP
RBBP4
RHCE
RHD
RPA2
RPL11
RPS6KA1
RSC1A1
SRSF4
SLC2A5
SLC9A1
SRM
TAF12
TCEA3
TCEB3
TNFRSF1B
ZBTB17
SLC30A2
LUZP1
PRDM2
LAPTM5
PTP4A2
SNHG3
NR0B2
KCNAB2
FCN3
YARS
AKR7A2
ALDH4A1
EIF3I
EIF4G3
TNFRSF25
PER3
MAP3K6
DHRS3
VAMP3
SNRNP40
C1orf38
H6PD
SDC3
CROCC
PUM1
KLHL21
ZBTB40
MFN2
PTPRU
CELA3A
WASF2
ANGPTL7
HNRNPR
SRRM1
CNKSR1
UBE4B
MAD2L2
PDPN
KHDRBS1
GMEB1
NUDC
MASP2
SRSF10
UTS2
RCAN3
MST1P2
MST1P9
PADI2
LYPLA2
PARK7
CTRC
ACOT7
DNAJC8
CLSTN1
AKR7A3
SPEN
KDM1A
WDTC1
KIAA0090
KIF1B
PLEKHM2
OTUD3
KAZN
CAMTA1
DNAJC16
UBR4
ATP13A2
TARDBP
CELA3B
ICMT
PADI4
TMEM50A
KPNA6
STX12
CLIC4
SYF2
CHD5
C1orf144
LDLRAP1
FBXO2
FBXO6
PLA2G2D
RNU11
HSPB7
AHDC1
SMPDL3B
PRO0611
LINC00339
UBIAD1
PADI1
PLA2G2E
SLC45A1
HP1BP3
CELA2B
ZNF593
MECR
MRTO4
YTHDF2
ZCCHC17
PADI3
ERRFI1
WNT4
FBXO42
RNF186
HES2
GPN2
FBLIM1
MED18
PQLC2
CASZ1
TRNAU1AP
AIM1L
TMEM51
BSDC1
XKR8
TMEM39B
ARHGEF10L
VPS13D
TMEM57
CAMK2N1
ASAP3
PNRC2
PIGV
NBPF1
NECAP2
IQCC
DNAJC11
RCC2
TMEM234
FAM54B
CTNNBIP1
C1orf63
AGTRAP
PITHD1
MAN1C1
NIPAL3
SEPN1
PLEKHG5
PTCHD2
KIF17
KIAA1522
GRHL3
IL22RA1
MIIP
CELA2A
GPATCH3
TINAGL1
PLA2G2F
S100PBP
CEP85
NMNAT1
PINK1
MARCKSL1
PRAMEF1
PRAMEF2
PHACTR4
C1orf135
CCDC28B
EFHD2
RSG1
NKAIN1
MUL1
NOL9
LIN28A
AGMAT
FAM110D
DHDDS
GPR157
SPSB1
ZNF436
TAS1R2
TAS1R1
SYNC
ACTL8
TSSK3
SH3BGRL3
SESN2
ESPN
TMEM222
USP48
NBPF3
ZDHHC18
SLC25A33
DDI2
LZIC
TRIM63
FAM167B
CROCCP2
SYTL1
IGSF21
SNHG12
KIAA2013
THAP3
C1orf201
SPOCD1
UBXN11
C1orf158
FBXO44
ATPIF1
CROCCP3
FHAD1
RAB42
FAM46B
RBP7
C1orf172
LRRC38
AADACL3
IFFO2
MYOM3
KLHDC7A
VWA5B1
UBXN10
ARHGEF19
C1orf127
PHF13
C1orf213
DCDC2B
LOC149086
PDIK1L
C1orf64
SLC2A7
IL28RA
FAM43B
PAQR7
FAM76A
TMEM201
TXLNA
C1orf126
AKR7L
TMCO4
ZNF683
NPHP4
LOC284551
LOC284632
SLC25A34
ESPNP
MTMR9LP
ZBTB8OS
LOC339505
AADACL4
PRAMEF5
HNRNPCL1
PRAMEF9
PRAMEF10
SERINC2
FAM131C
PADI6
C1orf187
SPATA21
APITD1
CATSPER4
GPR153
RNF207
TMEM82
TRNP1
CD164L2
HES3
PRAMEF12
PRAMEF21
PRAMEF8
PRAMEF18
PRAMEF17
PLA2G2C
TMEM200B
PRAMEF4
PRAMEF13
SH2D5
C1orf130
PRAMEF3
LDLRAD2
MIR34A
PRAMEF11
PRAMEF6
LOC440563
UQCRHL
MINOS1
PRAMEF7
PEF1
LOC644961
C1orf200
PRAMEF19
PRAMEF20
LOC646471
LOC649330
ZBTB8A
LOC653566
PRAMEF22
PRAMEF15
PRAMEF16
SCARNA1
SNORA44
SNORA61
SNORA59B
SNORA59A
SNORA16A
SNORD85
SNORD99
SNORD103A
SNORD103B
ZBTB8B
LOC729059
PRAMEF14
FLJ37453
LOC100128071
LOC100129196
MIR1976
NPPA-AS1
MIR3115
MIR4253
MIR4252
MIR4254
MIR3917
MIR3675
ENO1-AS1
LOC100506730
LOC100506801
LOC100506963
APITD1-CORT
C1orf151-NBL1
MIR4695
MIR4420
MIR4684
MIR4689
MIR4632
MIR4417
MIR378F
RCAN3AS

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 3p14.2.

Table S18. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
FHIT

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 17p12.

Table S19. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
MAP2K4
DNAH9
ZNF18
SHISA6

Arm-level results

Table 3. Get Full Table Arm-level significance table - 17 significant results found. The significance cutoff is at Q value=0.25.

Arm	# Genes	Amp Frequency	Amp Z score	Amp Q value	Del Frequency	Del Z score	Del Q value
1p	2121	0.05	0.0281	1	0.15	3.78	0.000457
1q	1955	0.22	5.13	7.36e-07	0.09	0.856	0.712
2p	924	0.01	-3.06	1	0.00	-3.38	1
2q	1556	0.01	-2.25	1	0.01	-2.25	1
3p	1062	0.03	-2.43	1	0.06	-1.76	1
3q	1139	0.06	-1.69	1	0.02	-2.7	1
4p	489	0.00	-3.31	1	0.26	1.81	0.155
4q	1049	0.02	-2.55	1	0.19	1.39	0.328
5p	270	0.07	-2.59	1	0.11	-1.77	1
5q	1427	0.03	-1.86	1	0.11	0.0049	1
6p	1173	0.09	-0.818	1	0.05	-1.84	1
6q	839	0.09	-1.35	1	0.05	-2.28	1
7p	641	0.46	6.45	5.25e-10	0.03	-2.29	1
7q	1277	0.42	7.62	2.49e-13	0.03	-1.82	1
8p	580	0.33	2.75	0.0133	0.48	5.99	8.16e-09
8q	859	0.43	6.11	2.79e-09	0.19	0.717	0.789
9p	422	0.08	-2.21	1	0.06	-2.49	1
9q	1113	0.08	-1.25	1	0.06	-1.58	1
10p	409	0.02	-3.41	1	0.10	-1.72	1
10q	1268	0.02	-2.47	1	0.10	-0.354	1
11p	862	0.10	-1.19	1	0.10	-1.19	1
11q	1515	0.07	-0.872	1	0.11	0.281	1
12p	575	0.18	0.204	1	0.14	-0.658	1
12q	1447	0.16	1.36	0.348	0.07	-0.88	1
13q	654	0.48	6.92	3.02e-11	0.06	-1.79	1
14q	1341	0.02	-2.07	1	0.27	4.04	0.00018
15q	1355	0.02	-1.9	1	0.34	6.02	8.16e-09
16p	872	0.12	-0.604	1	0.05	-2.17	1
16q	702	0.11	-1.21	1	0.05	-2.41	1
17p	683	0.00	-2.67	1	0.47	6.8	1.07e-10
17q	1592	0.06	-0.768	1	0.09	0.026	1
18p	143	0.09	-1.65	1	0.54	6.58	3.18e-10
18q	446	0.03	-2.18	1	0.55	7.8	1.29e-13
19p	995	0.09	-1.15	1	0.03	-2.45	1
19q	1709	0.09	0.216	1	0.03	-1.46	1
20p	355	0.51	6.31	9.32e-10	0.19	-0.0246	1
20q	753	0.65	10.9	0	0.08	-1.08	1
21q	509	0.00	-3.49	1	0.18	-0.0454	1
22q	921	0.04	-2.16	1	0.26	2.57	0.0255
Xq	1312	0.37	6.43	5.25e-10	0.07	-0.973	1

Methods & Data

Input

Description

Segmentation File: The segmentation file contains the segmented data for all the samples identified by GLAD, CBS, or some other segmentation algorithm. (See GLAD file format in the Genepattern file formats documentation.) It is a six column, tab-delimited file with an optional first line identifying the columns. Positions are in base pair units.The column headers are: (1) Sample (sample name), (2) Chromosome (chromosome number), (3) Start Position (segment start position, in bases), (4) End Position (segment end position, in bases), (5) Num markers (number of markers in segment), (6) Seg.CN (log2() -1 of copy number).
Markers File: The markers file identifies the marker names and positions of the markers in the original dataset (before segmentation). It is a three column, tab-delimited file with an optional header. The column headers are: (1) Marker Name, (2) Chromosome, (3) Marker Position (in bases).
Reference Genome: The reference genome file contains information about the location of genes and cytobands on a given build of the genome. Reference genome files are created in Matlab and are not viewable with a text editor.
CNV Files: There are two options for the cnv file. The first option allows CNVs to be identified by marker name. The second option allows the CNVs to be identified by genomic location. Option #1: A two column, tab-delimited file with an optional header row. The marker names given in this file must match the marker names given in the markers file. The CNV identifiers are for user use and can be arbitrary. The column headers are: (1) Marker Name, (2) CNV Identifier. Option #2: A 6 column, tab-delimited file with an optional header row. The 'CNV Identifier' is for user use and can be arbitrary. 'Narrow Region Start' and 'Narrow Region End' are also not used. The column headers are: (1) CNV Identifier, (2) Chromosome, (3) Narrow Region Start, (4) Narrow Region End, (5) Wide Region Start, (6) Wide Region End
Amplification Threshold: Threshold for copy number amplifications. Regions with a log2 ratio above this value are considered amplified.
Deletion Threshold: Threshold for copy number deletions. Regions with a log2 ratio below the negative of this value are considered deletions.
Cap Values: Minimum and maximum cap values on analyzed data. Regions with a log2 ratio greater than the cap are set to the cap value; regions with a log2 ratio less than -cap value are set to -cap. Values must be positive.
Broad Length Cutoff: Threshold used to distinguish broad from focal events, given in units of fraction of chromosome arm.
Remove X-Chromosome: Flag indicating whether to remove data from the X-chromosome before analysis. Allowed values= {1,0} (1: Remove X-Chromosome, 0: Do not remove X-Chromosome.
Confidence Level: Confidence level used to calculate the region containing a driver.
Join Segment Size: Smallest number of markers to allow in segments from the segmented data. Segments that contain fewer than this number of markers are joined to the neighboring segment that is closest in copy number.
Arm Level Peel Off: Flag set to enable arm-level peel-off of events during peak definition. The arm-level peel-off enhancement to the arbitrated peel-off method assigns all events in the same chromosome arm of the same sample to a single peak. It is useful when peaks are split by noise or chromothripsis. Allowed values= {1,0} (1: Use arm level peel off, 0: Use normal arbitrated peel-off).
Maximum Sample Segments: Maximum number of segments allowed for a sample in the input data. Samples with more segments than this threshold are excluded from the analysis.
Gene GISTIC: When enabled (value = 1), this option causes GISTIC to analyze deletions using genes instead of array markers to locate the lesion. In this mode, the copy number assigned to a gene is the lowest copy number among the markers that represent the gene.

Values

List of inputs used for this run of GISTIC2. All files listed should be included in the archived results.

Segmentation File = /xchip/cga/gdac-prod/tcga-gdac/jobResults/PrepareGisticDNASeq/COAD-TP/4425314/segmentationfile.txt
Markers File = /xchip/cga/gdac-prod/tcga-gdac/jobResults/PrepareGisticDNASeq/COAD-TP/4425314/markersfile.txt
Reference Genome = /xchip/cga/reference/gistic2/hg19_with_miR_20120227.mat
CNV Files = /xchip/cga/reference/gistic2/CNV.hg19.bypos.111213.txt
Amplification Threshold = 0.3
Deletion Threshold = 0.3
Cap Values = 2
Broad Length Cutoff = 0.5
Remove X-Chromosome = 0
Confidence Level = 0.99
Join Segment Size = 10
Arm Level Peel Off = 1
Maximum Sample Segments = 10000
Gene GISTIC = 0

Table 4. Get Full Table First 10 out of 68 Input Tumor Samples.

Tumor Sample Names
TCGA-A6-2671-01A-01D-1405-02
TCGA-A6-2674-01A-02D-1167-02
TCGA-A6-2676-01A-01D-1167-02
TCGA-A6-2678-01A-01D-1167-02
TCGA-A6-2679-01A-02D-1405-02
TCGA-A6-2680-01A-01D-1405-02
TCGA-A6-2681-01A-01D-1405-02
TCGA-A6-2683-01A-01D-1167-02
TCGA-A6-2684-01A-01D-1405-02
TCGA-A6-3807-01A-01D-1167-02

Figure 3. Segmented copy number profiles in the input data

Output

All Lesions File (all_lesions.conf_##.txt, where ## is the confidence level)

The all lesions file summarizes the results from the GISTIC run. It contains data about the significant regions of amplification and deletion as well as which samples are amplified or deleted in each of these regions. The identified regions are listed down the first column, and the samples are listed across the first row, starting in column 10.

Region Data

Columns 1-9 present the data about the significant regions as follows:

Unique Name: A name assigned to identify the region.
Descriptor: The genomic descriptor of that region.
Wide Peak Limits: The 'wide peak' boundaries most likely to contain the targeted genes. These are listed in genomic coordinates and marker (or probe) indices.
Peak Limits: The boundaries of the region of maximal amplification or deletion.
Region Limits: The boundaries of the entire significant region of amplification or deletion.
Q values: The Q value of the peak region.
Residual Q values: The Q value of the peak region after removing ('peeling off') amplifications or deletions that overlap other, more significant peak regions in the same chromosome.
Broad or Focal: Identifies whether the region reaches significance due primarily to broad events (called 'broad'), focal events (called 'focal'), or independently significant broad and focal events (called 'both').
Amplitude Threshold: Key giving the meaning of values in the subsequent columns associated with each sample.

Sample Data

Each of the analyzed samples is represented in one of the columns following the lesion data (columns 10 through end). The data contained in these columns varies slightly by section of the file. The first section can be identified by the key given in column 9 - it starts in row 2 and continues until the row that reads 'Actual Copy Change Given.' This section contains summarized data for each sample. A '0' indicates that the copy number of the sample was not amplified or deleted beyond the threshold amount in that peak region. A '1' indicates that the sample had low-level copy number aberrations (exceeding the low threshold indicated in column 9), and a '2' indicates that the sample had high-level copy number aberrations (exceeding the high threshold indicated in column 9).The second section can be identified the rows in which column 9 reads 'Actual Copy Change Given.' The second section exactly reproduces the first section, except that here the actual changes in copy number are provided rather than zeroes, ones, and twos.The final section is similar to the first section, except that here only broad events are included. A 1 in the samples columns (columns 10+) indicates that the median copy number of the sample across the entire significant region exceeded the threshold given in column 9. That is, it indicates whether the sample had a geographically extended event, rather than a focal amplification or deletion covering little more than the peak region.

Amplification Genes File (amp_genes.conf_##.txt, where ## is the confidence level)

The amp genes file contains one column for each amplification peak identified in the GISTIC analysis. The first four rows are:

Cytoband
Q value
Residual Q value
Wide Peak Boundaries

These rows identify the lesion in the same way as the all lesions file.The remaining rows list the genes contained in each wide peak. For peaks that contain no genes, the nearest gene is listed in brackets.

Deletion Genes File (del_genes.conf_##.txt, where ## is the confidence level)

The del genes file contains one column for each deletion peak identified in the GISTIC analysis. The file format for the del genes file is identical to the format for the amp genes file.

Gistic Scores File (scores.gistic)

The scores file lists the Q values [presented as -log10(q)], G scores, average amplitudes among aberrant samples, and frequency of aberration, across the genome for both amplifications and deletions. The scores file is viewable with the Genepattern SNPViewer module and may be imported into the Integrated Genomics Viewer (IGV).

Segmented Copy Number (raw_copy_number.{fig|pdf|png} )

The segmented copy number is a pdf file containing a colormap image of the segmented copy number profiles in the input data.

Amplification Score GISTIC plot (amp_qplot.{fig|pdf|png|v2.pdf})

The amplification pdf is a plot of the G scores (top) and Q values (bottom) with respect to amplifications for all markers over the entire region analyzed.

Deletion Score GISTIC plot (del_qplot.{fig|pdf|png|v2.pdf})

The deletion pdf is a plot of the G scores (top) and Q values (bottom) with respect to deletions for all markers over the entire region analyzed.

Tables (table_{amp|del}.conf_##.txt, where ## is the confidence level)

Tables of basic information about the genomic regions (peaks) that GISTIC determined to be significantly amplified or deleted. These describe three kinds of peak boundaries, and list the genes contained in two of them. The region start and region end columns (along with the chromosome column) delimit the entire area containing the peak that is above the significance level. The region may be the same for multiple peaks. The peak start and end delimit the maximum value of the peak. The extended peak is the peak determined by robust, and is contained within the wide peak reported in {amp|del}_genes.txt by one marker.

Broad Significance Results (broad_significance_results.txt)

A table of per-arm statistical results for the data set. Each arm is a row in the table. The first column specifies the arm and the second column counts the number of genes known to be on the arm. For both amplification and deletion, the table has columns for the frequency of amplification or deletion of the arm, and a Z score and Q value.

Broad Values By Arm (broad_values_by_arm.txt)

A table of chromosome arm amplification levels for each sample. Each row is a chromosome arm, and each column a sample. The data are in units of absolute copy number -2.

All Data By Genes (all_data_by_genes.txt)

A gene-level table of copy number values for all samples. Each row is the data for a gene. The first three columns name the gene, its NIH locus ID, and its cytoband - the remaining columns are the samples. The copy number values in the table are in units of (copy number -2), so that no amplification or deletion is 0, genes with amplifications have positive values, and genes with deletions are negative values. The data are converted from marker level to gene level using the extreme method: a gene is assigned the greatest amplification or the least deletion value among the markers it covers.

Broad Data By Genes (broad_data_by_genes.txt)

A gene-level table of copy number data similar to the all_data_by_genes.txt output, but using only broad events with lengths greater than the broad length cutoff. The structure of the file and the methods and units used for the data analysis are otherwise identical to all_data_by_genes.txt.

Focal Data By Genes (focal_data_by_genes.txt)

A gene-level table of copy number data similar to the all_data_by_genes.txt output, but using only focal events with lengths greater than the focal length cutoff. The structure of the file and the methods and units used for the data analysis are otherwise identical to all_data_by_genes.txt.

All Thresholded By Genes (all_thresholded.by_genes.txt)

A gene-level table of discrete amplification and deletion indicators at for all samples. There is a row for each gene. The first three columns name the gene, its NIH locus ID, and its cytoband - the remaining columns are the samples. A table value of 0 means no amplification or deletion above the threshold. Amplifications are positive numbers: 1 means amplification above the amplification threshold; 2 means amplifications larger to the arm level amplifications observed for the sample. Deletions are represented by negative table values: -1 represents deletion beyond the threshold; -2 means deletions greater than the minimum arm-level deletion observed for the sample.

Sample Cutoffs (sample_cutoffs.txt)

A table of the per-sample threshold cutoffs (in units of absolute copy number -2) used to distinguish the high level amplifications (+/-2) from ordinary amplifications (+/-1) in the all_thresholded.by_genes.txt output file. The table contains three columns: the sample identifier followed by the low (deletion) and high (amplification) cutoff values. The cutoffs are calculated as the minimum arm-level amplification level less the deletion threshold for deletions and the maximum arm-level amplification plus the amplification threshold for amplifications.

Focal Input To Gistic (focal_input.seg.txt)

A list of copy number segments describing just the focal events present in the data. The segment amplification/deletion levels are in units of (copy number -2), with amplifications positive and deletions negative numbers. This file may be viewed with IGV.

Gene Counts vs. Copy Number Alteration Frequency (freqarms_vs_ngenes.{fig|pdf})

An image showing the correlation between gene counts and frequency of copy number alterations.

Confidence Intervals (regions_track.conf_##.bed, where ## is the confidence level)

A file indicating the position of the confidence intervals around GISTIC peaks that can be loaded as a track in a compatible viewer browser such as IGV or the UCSC genome browser.

GISTIC

GISTIC identifies genomic regions that are significantly gained or lost across a set of tumors. It takes segmented copy number ratios as input, separates arm-level events from focal events, and then performs two tests: (i) identifies significantly amplified/deleted chromosome arms; and (ii) identifies regions that are significantly focally amplified or deleted. For the focal analysis, the significance levels (Q values) are calculated by comparing the observed gains/losses at each locus to those obtained by randomly permuting the events along the genome to reflect the null hypothesis that they are all 'passengers' and could have occurred anywhere. The locus-specific significance levels are then corrected for multiple hypothesis testing. The arm-level significance is calculated by comparing the frequency of gains/losses of each arm to the expected rate given its size. The method outputs genomic views of significantly amplified and deleted regions, as well as a table of genes with gain or loss scores. A more in depth discussion of the GISTIC algorithm and its utility is given in [1], [3], and [5].

CNV Description

Regions of the genome that are prone to germ line variations in copy number are excluded from the GISTIC analysis using a list of germ line copy number variations (CNVs). A CNV is a DNA sequence that may be found at different copy numbers in the germ line of two different individuals. Such germ line variations can confound a GISTIC analysis, which finds significant somatic copy number variations in cancer. A more in depth discussion is provided in [6]. GISTIC currently uses two CNV exclusion lists. One is based on the literature describing copy number variation, and a second one comes from an analysis of significant variations among the blood normals in the TCGA data set.

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

References

[1] Beroukhim et al, Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma, Proc Natl Acad Sci U S A. Vol. 104:50 (2007)

[2] GISTIC version 1

[3] Mermel et al, GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers, Genome Biology Vol. 12:4 (2011)

[4] GISTIC version 2

[5] Beroukhim et al., The landscape of somatic copy-number alteration across human cancers, Nature Vol. 463:7283 (2010)

[6] McCarroll, S. A. et al., Integrated detection and population-genetic analysis of SNPs and copy number variation, Nat Genet Vol. 40(10):1166-1174 (2008)

[7] The Sanger Institute: Cancer Gene Census

Made with Nozzle