LowPass Copy number analysis (GISTIC2)

Skin Cutaneous Melanoma (Metastatic)

15 January 2014 | analyses__2014_01_15

Maintainer Information

Maintained by Spring Yingchun Liu (Broad Institute)

Citation Information

Cite as Broad Institute TCGA Genome Data Analysis Center (2014): LowPass Copy number analysis (GISTIC2). Broad Institute of MIT and Harvard. doi:10.7908/C11V5CF8

Overview

Introduction

GISTIC identifies genomic regions that are significantly gained or lost across a set of tumors. The pipeline first filters out normal samples from the segmented copy-number data by inspecting the TCGA barcodes and then executes GISTIC version 2.0.20 (Firehose task version: 126).
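As an illustration of the barcode-based filter described above, the following minimal Python sketch classifies samples by the two-digit sample-type code in the fourth barcode field. It assumes the standard TCGA convention that codes 01-09 denote tumor samples and 10-19 denote normal samples. The function names are hypothetical and are not part of the Firehose pipeline.

```python
def sample_type_code(barcode: str) -> int:
    """Return the two-digit sample-type code from a TCGA barcode.

    Assumes the layout TCGA-TSS-Participant-SampleVial-..., where the
    fourth field starts with the sample-type code (e.g. '06A' -> 6).
    """
    fields = barcode.split("-")
    if len(fields) < 4 or fields[0] != "TCGA":
        raise ValueError(f"not a sample-level TCGA barcode: {barcode!r}")
    return int(fields[3][:2])


def is_tumor(barcode: str) -> bool:
    """Assumed convention: tumor codes are 01-09, normal codes are 10-19."""
    return 1 <= sample_type_code(barcode) <= 9


barcodes = [
    "TCGA-D3-A1Q7-06A-11D-A18Z-02",  # listed in Table 4 (06 = metastatic)
    "TCGA-D3-A1Q7-10A-01D-A18Z-02",  # hypothetical normal-sample barcode
]
tumors = [b for b in barcodes if is_tumor(b)]
print(tumors)
```

Only the first barcode passes the filter; the second is a made-up example of a sample that would be dropped before GISTIC runs.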

Summary

There were 103 tumor samples used in this analysis: 20 significant arm-level results, 8 significant focal amplifications, and 6 significant focal deletions were found.

Results

Focal results

Figure 1. Genomic positions of amplified regions: the X-axis represents the normalized amplification signals (top) and significance by Q value (bottom). The green line represents the significance cutoff at Q value=0.25.

Table 1. Amplifications Table - 8 significant amplifications found. The comprehensive list of candidate genes for each peak follows the table. If no genes were identified within the peak, the nearest gene appears in brackets.

Cytoband	Q value	Residual Q value	Wide Peak Boundaries	# Genes in Wide Peak
1p12	0.004072	0.004072	chr1:119726941-150032773	98
3p13	0.0074385	0.0074385	chr3:69568300-70340162	1
12q15	0.012434	0.012434	chr12:68983694-69478370	7
22q13.2	0.026049	0.026049	chr22:40441512-42059356	27
7p22.2	0.081588	0.081588	chr7:1-8646788	93
11q13.3	0.11498	0.11498	chr11:68594424-71666457	34
5p15.31	0.15635	0.15635	chr5:8145603-10301016	12
20q13.2	0.19463	0.19463	chr20:47396006-63025520	199
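The Wide Peak Boundaries column uses a chrN:start-end notation. The following Python sketch shows one way to parse such a string and compute the peak width. It assumes the coordinates are inclusive; the function names are hypothetical.

```python
import re
from typing import Tuple

# Matches strings such as "chr3:69568300-70340162"
PEAK_RE = re.compile(r"^chr(\w+):(\d+)-(\d+)$")


def parse_wide_peak(boundaries: str) -> Tuple[str, int, int]:
    """Split a wide-peak boundary string into (chromosome, start, end)."""
    match = PEAK_RE.match(boundaries.strip())
    if match is None:
        raise ValueError(f"unrecognized boundary format: {boundaries!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def peak_width(boundaries: str) -> int:
    """Width in bases, assuming inclusive start and end coordinates."""
    _, start, end = parse_wide_peak(boundaries)
    return end - start + 1


# Two rows from Table 1: (cytoband, Q value, wide peak boundaries)
peaks = [
    ("3p13", 0.0074385, "chr3:69568300-70340162"),
    ("12q15", 0.012434, "chr12:68983694-69478370"),
]
for cytoband, q_value, boundaries in peaks:
    print(cytoband, q_value, peak_width(boundaries))
```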

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 1p12.

Table S1. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
BCL9
NOTCH2
PDE4DIP
hsa-mir-3118-3
hsa-mir-3118-2
hsa-mir-3118-1
FCGR1A
FCGR1B
FMO5
GJA5
GJA8
HMGCS2
HSD3B1
HSD3B2
PDZK1
PRKAB2
HIST2H2AA3
HIST2H2AC
HIST2H2BE
HIST2H4A
ITGA10
PEX11B
SEC22B
CHD1L
SV2A
RBM8A
SF3B4
PIAS3
POLR3C
TXNIP
MTMR11
ADAM30
CD160
NBPF14
PHGDH
RNF115
BOLA1
HAO2
ACP6
GPR89B
OTUD7B
FAM91A2
REG4
POLR3GL
ZNF697
GNRHR2
HIST2H3C
LIX1L
HSD3BP4
HFE2
ANKRD35
PPIAL4A
PDIA3P
NBPF11
NUDT17
NBPF15
ANKRD34A
HIST2H2AB
HIST2H3A
HIST2H2BC
HIST2H2BA
NBPF7
LOC375010
NOTCH2NL
FLJ39739
LOC388692
NBPF9
HIST2H2BF
HIST2H4B
LOC644242
PPIAL4G
PPIAL4D
LOC645166
EMBP1
SRGAP2P2
PPIAL4B
LOC653513
GPR89A
PPIAL4C
HIST2H3D
FAM72B
HIST2H2AA4
FAM72D
LOC728855
LOC728875
NBPF24
GPR89C
NBPF16
PDZK1P1
PPIAL4F
LOC728989
PPIAL4E
PFN1P2
LOC100130000
NBPF10
FCGR1C
LOC100286793
LOC100289211

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 3p13.

Table S2. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
MITF

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 12q15.

Table S3. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
MDM2
CPM
RAP1B
SLC35E3
NUP107
SNORA70G
LOC100507250

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 22q13.2.

Table S4. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
EP300
MKL1
hsa-mir-1281
ACO2
ADSL
XRCC6
MCHR1
PMM1
RANGAP1
ST13
TEF
RBX1
SLC25A17
TOB2
TNRC6B
ZC3H7B
CSDC2
PPPDE2
SGSM3
XPNPEP3
L3MBTL2
PHF5A
DNAJB7
CHADL
POLR3H
MIR1281
MIR4766

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 7p22.2.

Table S5. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
PMS2
CARD11
hsa-mir-589
hsa-mir-339
ACTB
GNA12
GPER
ICA1
LFNG
NUDT1
PDGFA
PRKAR1B
RAC1
RPA3
FSCN1
ZNF12
AIMP2
MAFK
MAD1L1
EIF3B
CYTH3
KIAA0415
KDELR2
ADAP1
IQCE
SUN1
WIPI2
INTS1
EIF2AK1
SNX8
FTSJ2
NXPH1
GET4
CCZ1
MIOS
RNF216
ZNF853
CYP2W1
HEATR2
ZDHHC4
CHST12
RADIL
PAPOLB
C1GALT1
FAM20C
RBAK
C7orf26
MICALL2
FBXL18
TTYH3
USP42
PSMG3
C7orf50
TNRC18
C7orf70
ZFAND2A
COX19
GLCCI1
KIAA1908
GPR146
AMZ1
TMEM184A
BRAT1
SDK1
FOXK1
MMD2
DAGLB
CCZ1B
SLC29A4
RSPH10B
TFAMP1
UNCX
COL28A1
LOC389458
ELFN1
GRID2IP
ZNF815
RNF216P1
PMS2CL
FLJ44511
LOC442497
MIR339
ZNF890P
OCM
MIR589
RSPH10B2
LOC729852
LOC100131257
LOC100288524
RBAK-LOC389458
MIR4648
MIR4655
MIR4656

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 11q13.3.

Table S6. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
CCND1
hsa-mir-548k
hsa-mir-3164
CPT1A
DHCR7
CTTN
FGF3
FGF4
IGHMBP2
KRTAP5-9
PPFIA1
FADD
FGF19
SHANK2
MYEOV
ANO1
NADSYN1
FAM86C1
RNF121
KRTAP5-8
MRGPRD
MRGPRF
MRPL21
TPCN2
ORAOV1
DEFB108B
KRTAP5-10
FLJ42102
KRTAP5-7
KRTAP5-11
LOC100129216
LOC100133315
MIR548K
MIR3664

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 5p15.31.

Table S7. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
SEMA5A
CCT5
TAS2R1
FAM173B
CMBL
LOC285692
LOC729506
SNORD123
LOC100505738
LOC100505806
MIR4458
MIR4636

Genes in Wide Peak

This is the comprehensive list of amplified genes in the wide peak for 20q13.2.

Table S8. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
GNAS
SS18L1
hsa-mir-647
hsa-mir-4326
hsa-mir-124-3
hsa-mir-133a-2
hsa-mir-3195
hsa-mir-1257
hsa-mir-646
hsa-mir-298
hsa-mir-4325
hsa-mir-1302-5
hsa-mir-1259
hsa-mir-3194
ATP5E
BMP7
CDH4
CEBPB
CHRNA4
COL9A3
CSE1L
CSTF1
CTSZ
CYP24A1
EDN3
EEF1A2
NPBWR2
KCNB1
KCNG1
KCNQ2
LAMA5
MC3R
MYT1
NFATC2
NTSR1
OPRL1
PCK1
PFDN4
PPP1R3D
PSMA7
PTGIS
PTK6
PTPN1
RPS21
SNAI1
SRMS
STAU1
AURKA
TAF4
TCEA2
TFAP2C
TPD52L2
UBE2V1
ZNF217
RAE1
BCAS1
STX16
TNFRSF6B
DPM1
VAPB
B4GALT5
SPATA2
OSBPL2
ATP9A
ARFRP1
RGS19
SYCP2
ARFGEF2
TCFL5
ADRM1
OGFR
DIDO1
HRH3
SLC9A8
ADNP
SPO11
PRPF6
GTPBP5
GMEB2
SNORD12C
MOCS3
SLCO4A1
STMN3
SLMO2
TH1L
C20orf43
RTEL1
SOX18
YTHDF1
LIME1
UCKL1
C20orf11
PCMTD2
C20orf20
PPP4R1L
RBM38
BCAS4
DDX27
ZFP64
ARFGAP1
DOK5
RNF114
SLC2A4RG
PMEPA1
CASS4
SALL4
ZNFX1
RAB22A
ZNF512B
PREX1
COL20A1
CDH26
SLC17A9
LOC63930
FAM217B
C20orf195
PPDPF
BIRC7
NPEPL1
DNAJC5
TUBB1
ZBP1
CABLES2
PARD6B
ZGPAT
PRIC285
FAM210B
PHACTR3
BHLHE23
NKAIN4
TSHZ2
C20orf85
ZNF831
C20orf166
GATA5
ZBTB46
GCNT7
CBLN4
CTCFL
SAMD10
ABHD16B
LINC00266-1
FAM65C
C20orf151
LOC149773
GNAS-AS1
LSM14B
APCDD1L
C20orf201
FAM209A
C20orf166-AS1
LINC00176
LOC284751
C20orf197
LOC284757
TMEM189
TMEM189-UBE2V1
FAM209B
SUMO1P1
MIR1-1
MIR124-3
MIR133A2
MIR296
ZNFX1-AS1
SNORD12
MIR645
MIR647
HAR1A
HAR1B
UCKL1-AS1
SNORD12B
MIR298
MIR941-1
MIR941-4
MIR941-2
MIR941-3
LOC100127888
DPH3P1
LINC00029
LOC100144597
FLJ16779
MIR1914
MIR1257
MIR4325
MIR3194
MIR4326
MIR3196
MTRNR2L3
LOC100505815
LOC100506384
RTEL1-TNFRSF6B
SLMO2-ATP5E
STX16-NPEPL1
MIR4756
MIR4758
MIR4532
MIR4533
MIR5095
LOC100652730

Figure 2. Genomic positions of deleted regions: the X-axis represents the normalized deletion signals (top) and significance by Q value (bottom). The green line represents the significance cutoff at Q value=0.25.

Table 2. Deletions Table - 6 significant deletions found. The comprehensive list of candidate genes for each peak follows the table. If no genes were identified within the peak, the nearest gene appears in brackets.

Cytoband	Q value	Residual Q value	Wide Peak Boundaries	# Genes in Wide Peak
9p21.3	1.5461e-60	9.4529e-23	chr9:21946194-21977643	2
9p21.3	4.6329e-60	1.8582e-17	chr9:21975057-22009600	3
11q23.3	1.4059e-05	1.4059e-05	chr11:105910449-135006516	278
4q34.3	0.11077	0.11077	chr4:182105085-182565336	0 [LINC00290]
10q23.31	0.13031	0.13031	chr10:89609324-89919188	2
1p36.31	0.1634	0.1634	chr1:1-9528391	156

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 9p21.3.

Table S9. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
CDKN2A
C9orf53

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 9p21.3.

Table S10. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
CDKN2A
CDKN2B
CDKN2B-AS1

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 11q23.3.

Table S11. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
ATM
CBL
DDX6
DDX10
FLI1
MLL
PAFAH1B2
POU2AF1
SDHD
PCSK7
ARHGEF12
hsa-mir-3167
hsa-mir-100
hsa-mir-4301
hsa-mir-34c
ACAT1
ACRV1
APLP2
APOA1
APOA4
APOC3
ARCN1
FXYD2
CXCR5
CD3D
CD3E
CD3G
CHEK1
CRYAB
DLAT
DPAGT1
DRD2
ETS1
FDX1
SLC37A4
GRIK4
GUCY1A2
H2AFX
HMBS
HSPA8
HSPB2
HTR3A
IL10RA
IL18
STT3A
KCNJ1
KCNJ5
VWA5A
MCAM
NCAM1
NFRKB
NNMT
NPAT
NRGN
OPCML
PPP2R1B
PTS
PVRL1
RDX
RPS25
SC5DL
SCN2B
SCN4B
ST3GAL4
SLN
SORL1
SRPR
ST14
TAGLN
TECTA
THY1
UPK2
ZBTB16
ZNF202
CUL5
BARX2
ZNF259
USP2
HTR3B
ZW10
UBE4A
EI24
FEZ1
ARHGAP32
C2CD2L
RBM7
MPZL2
HYOU1
ATP5L
ADAMTS8
TREH
CEP164
IGSF9B
EXPH5
PHLDB1
SIK2
NCAPD3
SIK3
VSIG2
BACE1
TRIM29
CADM1
POU2F3
HINFP
REXO2
OR8G2
OR8B8
OR8G1
TIMM8B
OR8B2
ACAD8
B3GAT1
DCPS
ZBTB44
THYN1
DDX25
NTM
CDON
SIDT2
TRAPPC4
SPA17
FXYD6
SIAE
C11orf71
ROBO4
SLC35F2
RAB39A
BTG4
FAM55D
TTC12
C11orf57
ELMOD1
FOXRED1
SCN3B
VPS11
TEX12
CRTAM
TMPRSS4
IFT46
PRDM10
DSCAML1
GRAMD1B
ARHGAP20
USP28
AASDHPPT
PKNOX2
TP53AIP1
ABCG4
ROBO3
C11orf1
RNF26
FAM118B
NLRX1
C11orf61
ALG9
CLMP
PDZD3
C11orf63
CCDC15
TMPRSS5
PUS3
MFRP
JAM3
BCO2
TMPRSS13
KIRREL3
BUD13
TMEM25
RPUSD4
TBRG1
UBASH3B
DIXDC1
ZC3H12C
GLB1L2
ESAM
ALKBH8
FDXACB1
C11orf52
VPS26B
GLB1L3
TIRAP
C1QTNF5
PANX3
APOA5
TMEM45B
C11orf93
PIH1D2
FAM55A
FAM55B
AMICA1
KBTBD3
CWF19L2
KDELC2
LAYN
TTC36
PATE1
C11orf65
ADAMTS15
MPZL3
C11orf45
HYLS1
TMEM218
SLC37A2
OR8B12
OR8G5
OR10G8
OR10G9
OR10S1
OR6T1
OR4D5
TBCEL
TMEM136
SPATA19
HEPACAM
OAF
ANKK1
RNF214
LOC283143
BCL9L
FOXR1
CCDC153
OR8D1
OR8D2
OR8B4
KIRREL3-AS3
LOC283174
LOC283177
CCDC84
TMEM225
OR8D4
C11orf53
LOC341056
C11orf34
BSX
OR6X1
OR6M1
OR10G4
OR10G7
OR8B3
OR8A1
C11orf87
C11orf92
C11orf88
MIR100HG
PATE2
PATE4
FLJ39051
SNX19
MIRLET7A2
MIR100
MIR125B1
MIR34B
MIR34C
BLID
LINC00167
HEPN1
LOC643923
CLDN25
LOC649133
RPL23AP64
LOC100128239
LOC100132078
PATE3
LOC100288346
BACE1-AS
MIR4301
MIR3167
LOC100499227
MIR3656
LOC100507392
LOC100526771
HSPB2-C11orf52
FXYD6-FXYD2
MIR4697
MIR4493
MIR4491
MIR4492
LOC100652768

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 10q23.31.

Table S12. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
PTEN
KLLN

Genes in Wide Peak

This is the comprehensive list of deleted genes in the wide peak for 1p36.31.

Table S13. Genes in bold are cancer genes as defined by The Sanger Institute's Cancer Gene Census [7].

Genes
RPL22
TNFRSF14
PRDM16
hsa-mir-34a
hsa-mir-4252
hsa-mir-551a
hsa-mir-4251
hsa-mir-429
hsa-mir-1302-2
RERE
CA6
CDK11B
DFFB
DVL1
MEGF6
ENO1
GABRD
GNB1
ZBTB48
TNFRSF9
PEX10
PRKCZ
SCNN1D
SKI
SLC2A5
TP73
TNFRSF4
MMP23B
MMP23A
KCNAB2
TNFRSF25
TNFRSF18
PER3
VAMP3
H6PD
ISG15
PLCH2
CEP104
KLHL21
SLC35E2
UTS2
RER1
PARK7
ACOT7
CAMTA1
ICMT
CHD5
NOC2L
OR4F3
ARHGEF16
SSU72
WRAP73
SLC45A1
SDF4
ERRFI1
MXRA8
HES2
CPSF3L
C1orf159
AURKAIP1
MRPL20
ATAD3A
PANK4
DNAJC11
AJAP1
TP73-AS1
PLEKHG5
LRRC47
HES4
VWA1
NADK
MMEL1
OR4F5
NOL9
LINC00115
MORN1
GPR157
SPSB1
GLTPD1
TAS1R1
OR4F16
CCNL2
ESPN
TAS1R3
ATAD3B
PLEKHN1
C1orf170
KIAA1751
THAP3
LOC115110
ACAP3
UBE2J2
PUSL1
B3GALT6
TPRG1L
FAM213B
ACTRT2
MIB2
SAMD11
LOC148413
PHF13
CCDC27
SLC2A7
CALML6
C1orf86
ATAD3C
LOC254099
TTLL10
NPHP4
FAM41C
LOC284661
C1orf174
KLHL17
TMEM240
TMEM52
AGRN
GPR153
FAM132A
HES5
LOC388588
RNF207
HES3
RNF223
MIR200A
MIR200B
MIR34A
FLJ42875
ANKRD65
MIR429
FAM138F
LOC643837
TMEM88B
C1orf233
FAM138A
WASH7P
MIR551A
CDK11A
SLC35E2B
LOC728716
LOC729737
OR4F29
LOC100129534
LOC100130417
LOC100132062
LOC100132287
LOC100133331
LOC100133445
LOC100133612
DDX11L1
TTC34
LOC100288069
MIR4251
MIR4252
ENO1-AS1
MIR4689
MIR4417

Arm-level results

Table 3. Arm-level significance table - 20 significant results found. The significance cutoff is at Q value=0.25.

Arm	# Genes	Amp Frequency	Amp Z score	Amp Q value	Del Frequency	Del Z score	Del Q value
1p	2121	0.15	-0.796	1	0.17	-0.338	0.999
1q	1955	0.42	6.03	1.08e-08	0.11	-1.42	0.999
2p	924	0.10	-2.02	1	0.12	-1.56	0.999
2q	1556	0.12	-1.54	1	0.11	-1.77	0.999
3p	1062	0.06	-2.94	1	0.08	-2.47	0.999
3q	1139	0.06	-2.88	1	0.10	-1.95	0.999
4p	489	0.08	-2.37	1	0.17	-0.284	0.999
4q	1049	0.06	-2.88	1	0.18	0.13	0.943
5p	270	0.13	-1.09	1	0.14	-0.862	0.999
5q	1427	0.08	-2.32	1	0.27	2.28	0.0452
6p	1173	0.42	5.66	7.64e-08	0.32	3.03	0.00601
6q	839	0.13	-0.848	1	0.59	10.5	0
7p	641	0.41	6.04	1.08e-08	0.10	-1.72	0.999
7q	1277	0.47	7.5	1.29e-12	0.09	-1.8	0.999
8p	580	0.14	-0.93	1	0.19	0.223	0.915
8q	859	0.29	2.78	0.0137	0.07	-2.53	0.999
9p	422	0.02	-2.57	1	0.61	11.4	0
9q	1113	0.04	-2.79	1	0.47	7.52	2.78e-13
10p	409	0.02	-3.2	1	0.44	6.97	1.31e-11
10q	1268	0.00	-3.43	1	0.48	7.86	2.66e-14
11p	862	0.11	-1.62	1	0.22	1.14	0.39
11q	1515	0.14	-0.992	1	0.31	3.33	0.00249
12p	575	0.09	-2.21	1	0.14	-1.05	0.999
12q	1447	0.03	-3.72	1	0.09	-2.33	0.999
13q	654	0.25	1.66	0.204	0.22	0.983	0.465
14q	1341	0.05	-2.81	1	0.29	2.95	0.00697
15q	1355	0.16	-0.54	1	0.09	-2.16	0.999
16p	872	0.09	-2.29	1	0.12	-1.59	0.999
16q	702	0.08	-2.31	1	0.19	0.241	0.915
17p	683	0.14	-0.965	1	0.31	3.37	0.00248
17q	1592	0.33	3.6	0.00106	0.24	1.37	0.287
18p	143	0.11	-1.52	1	0.24	1.71	0.157
18q	446	0.05	-3.09	1	0.20	0.634	0.658
19p	995	0.12	-1.44	1	0.13	-1.2	0.999
19q	1709	0.14	-1.13	1	0.08	-2.52	0.999
20p	355	0.30	3.13	0.00499	0.08	-2.17	0.999
20q	753	0.37	4.94	3.09e-06	0.03	-3.14	0.999
21q	509	0.13	-1.25	1	0.17	-0.0966	0.999
22q	921	0.24	1.64	0.204	0.16	-0.422	0.999
Xq	1312	0.19	0.221	1	0.21	0.677	0.658
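The count of significant arm-level results can be reproduced from the Q value columns of Table 3. The sketch below transcribes only the arm name and the two Q values, and assumes that "significant" means a Q value strictly below the 0.25 cutoff and that an arm significant for both amplification and deletion is counted once.

```python
Q_CUTOFF = 0.25

# (arm, amplification Q value, deletion Q value), transcribed from Table 3
ARM_Q = [
    ("1p", 1, 0.999), ("1q", 1.08e-08, 0.999), ("2p", 1, 0.999), ("2q", 1, 0.999),
    ("3p", 1, 0.999), ("3q", 1, 0.999), ("4p", 1, 0.999), ("4q", 1, 0.943),
    ("5p", 1, 0.999), ("5q", 1, 0.0452), ("6p", 7.64e-08, 0.00601), ("6q", 1, 0),
    ("7p", 1.08e-08, 0.999), ("7q", 1.29e-12, 0.999), ("8p", 1, 0.915), ("8q", 0.0137, 0.999),
    ("9p", 1, 0), ("9q", 1, 2.78e-13), ("10p", 1, 1.31e-11), ("10q", 1, 2.66e-14),
    ("11p", 1, 0.39), ("11q", 1, 0.00249), ("12p", 1, 0.999), ("12q", 1, 0.999),
    ("13q", 0.204, 0.465), ("14q", 1, 0.00697), ("15q", 1, 0.999), ("16p", 1, 0.999),
    ("16q", 1, 0.915), ("17p", 1, 0.00248), ("17q", 0.00106, 0.287), ("18p", 1, 0.157),
    ("18q", 1, 0.658), ("19p", 1, 0.999), ("19q", 1, 0.999), ("20p", 0.00499, 0.999),
    ("20q", 3.09e-06, 0.999), ("21q", 1, 0.999), ("22q", 0.204, 0.999), ("Xq", 1, 0.658),
]

amp_arms = [arm for arm, amp_q, _ in ARM_Q if amp_q < Q_CUTOFF]
del_arms = [arm for arm, _, del_q in ARM_Q if del_q < Q_CUTOFF]
# Arms significant in at least one direction, each counted once
significant_arms = [arm for arm, amp_q, del_q in ARM_Q if min(amp_q, del_q) < Q_CUTOFF]

print(len(amp_arms), len(del_arms), len(significant_arms))
```

Under these assumptions, 10 arms are significantly amplified and 11 are significantly deleted; 6p appears in both lists, which gives 20 distinct arms.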

Methods & Data

Input

Description

Segmentation File: The segmentation file contains the segmented data for all the samples identified by GLAD, CBS, or some other segmentation algorithm. (See GLAD file format in the GenePattern file formats documentation.) It is a six column, tab-delimited file with an optional first line identifying the columns. Positions are in base pair units. The column headers are: (1) Sample (sample name), (2) Chromosome (chromosome number), (3) Start Position (segment start position, in bases), (4) End Position (segment end position, in bases), (5) Num markers (number of markers in segment), (6) Seg.CN (log2(copy number) - 1).
Markers File: The markers file identifies the marker names and positions of the markers in the original dataset (before segmentation). It is a three column, tab-delimited file with an optional header. The column headers are: (1) Marker Name, (2) Chromosome, (3) Marker Position (in bases).
Reference Genome: The reference genome file contains information about the location of genes and cytobands on a given build of the genome. Reference genome files are created in Matlab and are not viewable with a text editor.
CNV Files: There are two options for the cnv file. The first option allows CNVs to be identified by marker name. The second option allows the CNVs to be identified by genomic location. Option #1: A two column, tab-delimited file with an optional header row. The marker names given in this file must match the marker names given in the markers file. The CNV identifiers are for user use and can be arbitrary. The column headers are: (1) Marker Name, (2) CNV Identifier. Option #2: A 6 column, tab-delimited file with an optional header row. The 'CNV Identifier' is for user use and can be arbitrary. 'Narrow Region Start' and 'Narrow Region End' are also not used. The column headers are: (1) CNV Identifier, (2) Chromosome, (3) Narrow Region Start, (4) Narrow Region End, (5) Wide Region Start, (6) Wide Region End
Amplification Threshold: Threshold for copy number amplifications. Regions with a log2 ratio above this value are considered amplified.
Deletion Threshold: Threshold for copy number deletions. Regions with a log2 ratio below the negative of this value are considered deletions.
Cap Values: Minimum and maximum cap values on analyzed data. Regions with a log2 ratio greater than the cap are set to the cap value; regions with a log2 ratio less than -cap value are set to -cap. Values must be positive.
Broad Length Cutoff: Threshold used to distinguish broad from focal events, given in units of fraction of chromosome arm.
Remove X-Chromosome: Flag indicating whether to remove data from the X-chromosome before analysis. Allowed values= {1,0} (1: Remove X-Chromosome, 0: Do not remove X-Chromosome).
Confidence Level: Confidence level used to calculate the region containing a driver.
Join Segment Size: Smallest number of markers to allow in segments from the segmented data. Segments that contain fewer than this number of markers are joined to the neighboring segment that is closest in copy number.
Arm Level Peel Off: Flag set to enable arm-level peel-off of events during peak definition. The arm-level peel-off enhancement to the arbitrated peel-off method assigns all events in the same chromosome arm of the same sample to a single peak. It is useful when peaks are split by noise or chromothripsis. Allowed values= {1,0} (1: Use arm level peel off, 0: Use normal arbitrated peel-off).
Maximum Sample Segments: Maximum number of segments allowed for a sample in the input data. Samples with more segments than this threshold are excluded from the analysis.
Gene GISTIC: When enabled (value = 1), this option causes GISTIC to analyze deletions using genes instead of array markers to locate the lesion. In this mode, the copy number assigned to a gene is the lowest copy number among the markers that represent the gene.
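The six-column segmentation format described above can be read with a few lines of Python. The sketch below is illustrative only: the sample name and segment values are hypothetical, the helper names are not part of GISTIC, and the conversion assumes Seg.CN is log2(copy number) - 1, so that a value of 0 corresponds to two copies.

```python
import csv
import io
from typing import Dict, Iterator, Union

# Hypothetical two-segment excerpt; values are illustrative, not from this run.
SEG_TEXT = (
    "Sample\tChromosome\tStart Position\tEnd Position\tNum markers\tSeg.CN\n"
    "sample_A\t1\t61735\t16153497\t8500\t0.0\n"
    "sample_A\t9\t21946194\t21977643\t25\t-1.0\n"
)


def read_segments(handle) -> Iterator[Dict[str, Union[str, int, float]]]:
    """Yield one dict per segment, skipping the optional header line."""
    reader = csv.reader(handle, delimiter="\t")
    for line_no, row in enumerate(reader):
        if len(row) < 6:
            continue  # skip blank or short lines
        if line_no == 0 and not row[2].strip().isdigit():
            continue  # optional header line
        yield {
            "sample": row[0],
            "chromosome": row[1],
            "start": int(row[2]),
            "end": int(row[3]),
            "num_markers": int(row[4]),
            "seg_cn": float(row[5]),
        }


def seg_cn_to_copy_number(seg_cn: float) -> float:
    """Invert Seg.CN = log2(copy number) - 1 (assumed definition)."""
    return 2.0 ** (seg_cn + 1.0)


segments = list(read_segments(io.StringIO(SEG_TEXT)))
copy_numbers = [seg_cn_to_copy_number(s["seg_cn"]) for s in segments]
print(copy_numbers)  # [2.0, 1.0]
```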

Values

List of inputs used for this run of GISTIC2. All files listed should be included in the archived results.

Segmentation File = /xchip/cga/gdac-prod/tcga-gdac/jobResults/PrepareGisticDNASeq/SKCM-TM/6154881/segmentationfile.txt
Markers File = /xchip/cga/gdac-prod/tcga-gdac/jobResults/PrepareGisticDNASeq/SKCM-TM/6154881/markersfile.txt
Reference Genome = /xchip/cga/reference/gistic2/hg19_with_miR_20120227.mat
CNV Files = /xchip/cga/reference/gistic2/CNV.hg19.bypos.111213.txt
Amplification Threshold = 0.3
Deletion Threshold = 0.3
Cap Values = 2
Broad Length Cutoff = 0.5
Remove X-Chromosome = 0
Confidence Level = 0.99
Join Segment Size = 10
Arm Level Peel Off = 1
Maximum Sample Segments = 10000
Gene GISTIC = 0
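To make the roles of the cap and threshold settings concrete, the sketch below applies the values listed for this run to individual log2 ratios. It is a simplified illustration of the parameter descriptions, not GISTIC's own code, and the function names are hypothetical.

```python
AMP_THRESHOLD = 0.3  # Amplification Threshold for this run
DEL_THRESHOLD = 0.3  # Deletion Threshold for this run
CAP = 2.0            # Cap Values for this run


def cap_value(log2_ratio: float, cap: float = CAP) -> float:
    """Clamp a log2 ratio to the interval [-cap, +cap]."""
    return max(-cap, min(cap, log2_ratio))


def call_segment(log2_ratio: float) -> str:
    """Label a segment after capping, using strict comparisons."""
    value = cap_value(log2_ratio)
    if value > AMP_THRESHOLD:
        return "amplified"
    if value < -DEL_THRESHOLD:
        return "deleted"
    return "neutral"


for ratio in (3.5, 0.31, 0.3, -0.5):
    print(ratio, cap_value(ratio), call_segment(ratio))
```

A ratio exactly equal to the threshold is treated as neutral here because the parameter descriptions say "above" and "below"; that boundary handling is an assumption.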

Table 4. First 10 out of 103 Input Tumor Samples.

Tumor Sample Names
TCGA-D3-A1Q7-06A-11D-A18Z-02
TCGA-D3-A1Q8-06A-11D-A18Z-02
TCGA-D3-A1Q9-06A-11D-A18Z-02
TCGA-D3-A1QB-06A-11D-A18Z-02
TCGA-D3-A2J6-06A-11D-A18Z-02
TCGA-D3-A2JC-06A-11D-A18Z-02
TCGA-D3-A2JD-06A-11D-A18Z-02
TCGA-D3-A2JP-06A-11D-A18Z-02
TCGA-D3-A3C3-06A-12D-A18Z-02
TCGA-D3-A3C8-06A-12D-A18Z-02

Figure 3. Segmented copy number profiles in the input data

Output

All Lesions File (all_lesions.conf_##.txt, where ## is the confidence level)

The all lesions file summarizes the results from the GISTIC run. It contains data about the significant regions of amplification and deletion as well as which samples are amplified or deleted in each of these regions. The identified regions are listed down the first column, and the samples are listed across the first row, starting in column 10.

Region Data

Columns 1-9 present the data about the significant regions as follows:

Unique Name: A name assigned to identify the region.
Descriptor: The genomic descriptor of that region.
Wide Peak Limits: The 'wide peak' boundaries most likely to contain the targeted genes. These are listed in genomic coordinates and marker (or probe) indices.
Peak Limits: The boundaries of the region of maximal amplification or deletion.
Region Limits: The boundaries of the entire significant region of amplification or deletion.
Q values: The Q value of the peak region.
Residual Q values: The Q value of the peak region after removing ('peeling off') amplifications or deletions that overlap other, more significant peak regions in the same chromosome.
Broad or Focal: Identifies whether the region reaches significance due primarily to broad events (called 'broad'), focal events (called 'focal'), or independently significant broad and focal events (called 'both').
Amplitude Threshold: Key giving the meaning of values in the subsequent columns associated with each sample.

Sample Data

Each of the analyzed samples is represented in one of the columns following the lesion data (columns 10 through end). The data contained in these columns varies slightly by section of the file.

The first section can be identified by the key given in column 9: it starts in row 2 and continues until the row that reads 'Actual Copy Change Given.' This section contains summarized data for each sample. A '0' indicates that the copy number of the sample was not amplified or deleted beyond the threshold amount in that peak region. A '1' indicates that the sample had low-level copy number aberrations (exceeding the low threshold indicated in column 9), and a '2' indicates that the sample had high-level copy number aberrations (exceeding the high threshold indicated in column 9).

The second section can be identified by the rows in which column 9 reads 'Actual Copy Change Given.' The second section exactly reproduces the first section, except that here the actual changes in copy number are provided rather than zeroes, ones, and twos.

The final section is similar to the first section, except that here only broad events are included. A 1 in the samples columns (columns 10+) indicates that the median copy number of the sample across the entire significant region exceeded the threshold given in column 9. That is, it indicates whether the sample had a geographically extended event, rather than a focal amplification or deletion covering little more than the peak region.
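The three sections of the all lesions file can be separated by inspecting column 9. The sketch below assumes the sections appear in the order described (summary, actual copy change, broad) and that rows following the 'Actual Copy Change Given' block belong to the final section. The example rows, key strings, and function names are hypothetical.

```python
from typing import Dict, List

ACTUAL_KEY = "Actual Copy Change Given"
KEY_COLUMN = 8  # column 9 (Amplitude Threshold), zero-based


def split_sections(rows: List[List[str]]) -> Dict[str, List[List[str]]]:
    """Group data rows (header row excluded) into the three sections."""
    sections: Dict[str, List[List[str]]] = {"summary": [], "actual": [], "broad": []}
    seen_actual = False
    for row in rows:
        key = row[KEY_COLUMN].strip()
        if key == ACTUAL_KEY:
            seen_actual = True
            sections["actual"].append(row)
        elif seen_actual:
            sections["broad"].append(row)  # assumed: rows after the actual block
        else:
            sections["summary"].append(row)
    return sections


def make_row(name: str, key: str, sample_values: List[str]) -> List[str]:
    """Build a toy row: name, 7 empty region columns, key, then sample values."""
    return [name] + [""] * 7 + [key] + sample_values


rows = [
    make_row("Amplification Peak 1", "0: none; 1: low; 2: high", ["0", "2"]),
    make_row("Deletion Peak 1", "0: none; 1: low; 2: high", ["1", "0"]),
    make_row("Amplification Peak 1", ACTUAL_KEY, ["0.05", "1.40"]),
    make_row("Deletion Peak 1", ACTUAL_KEY, ["-0.45", "0.00"]),
    make_row("Amplification Peak 1", "0: none; 1: broad", ["0", "1"]),
]
sections = split_sections(rows)
print({name: len(block) for name, block in sections.items()})
```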

Amplification Genes File (amp_genes.conf_##.txt, where ## is the confidence level)

The amp genes file contains one column for each amplification peak identified in the GISTIC analysis. The first four rows are:

Cytoband
Q value
Residual Q value
Wide Peak Boundaries

These rows identify the lesion in the same way as the all lesions file. The remaining rows list the genes contained in each wide peak. For peaks that contain no genes, the nearest gene is listed in brackets.
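Because the genes files are organized by column rather than by row, reading them requires a transposition step. The sketch below assumes a leading column of row labels followed by one column per peak; that layout, the label strings, and the function names are assumptions. The excerpt reuses values from Table 1 and is truncated to three genes for 12q15.

```python
from typing import Dict, List

# Abbreviated, illustrative excerpt in the assumed layout.
AMP_GENES_TEXT = (
    "cytoband\t3p13\t12q15\n"
    "q value\t0.0074385\t0.012434\n"
    "residual q value\t0.0074385\t0.012434\n"
    "wide peak boundaries\tchr3:69568300-70340162\tchr12:68983694-69478370\n"
    "genes in wide peak\tMITF\tMDM2\n"
    "\t\tCPM\n"
    "\t\tRAP1B\n"
)


def _cell(rows: List[List[str]], r: int, col: int) -> str:
    """Return a stripped cell, or '' if the row is shorter than the column."""
    return rows[r][col].strip() if col < len(rows[r]) else ""


def parse_genes_file(text: str) -> List[Dict[str, object]]:
    """Return one record per peak column."""
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    n_peaks = max(len(r) for r in rows) - 1
    peaks = []
    for col in range(1, n_peaks + 1):
        genes = [_cell(rows, r, col) for r in range(4, len(rows)) if _cell(rows, r, col)]
        peaks.append({
            "cytoband": _cell(rows, 0, col),
            "q_value": float(_cell(rows, 1, col)),
            "residual_q_value": float(_cell(rows, 2, col)),
            "wide_peak": _cell(rows, 3, col),
            "genes": genes,
        })
    return peaks


peaks = parse_genes_file(AMP_GENES_TEXT)
for peak in peaks:
    print(peak["cytoband"], peak["genes"])
```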

Deletion Genes File (del_genes.conf_##.txt, where ## is the confidence level)

The del genes file contains one column for each deletion peak identified in the GISTIC analysis. The file format for the del genes file is identical to the format for the amp genes file.

Gistic Scores File (scores.gistic)

The scores file lists the Q values [presented as -log10(q)], G scores, average amplitudes among aberrant samples, and frequency of aberration, across the genome for both amplifications and deletions. The scores file is viewable with the GenePattern SNPViewer module and may be imported into the Integrative Genomics Viewer (IGV).
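Since the scores file stores Q values on a -log10 scale, they need to be converted before comparison with the 0.25 cutoff used in this report. A minimal Python sketch of the conversion in both directions follows; the function names are hypothetical.

```python
import math


def q_from_neg_log10(score: float) -> float:
    """Convert a -log10(q) score back to a Q value."""
    return 10.0 ** (-score)


def neg_log10_q(q: float) -> float:
    """Convert a Q value to the -log10 scale; q = 0 maps to infinity."""
    if q == 0:
        return math.inf
    return -math.log10(q)


cutoff_score = neg_log10_q(0.25)
print(round(cutoff_score, 3))  # 0.602
```

On this scale, a locus is below the Q value cutoff of 0.25 when its score exceeds roughly 0.602.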

Segmented Copy Number (raw_copy_number.{fig|pdf|png})

The segmented copy number is a pdf file containing a colormap image of the segmented copy number profiles in the input data.

Amplification Score GISTIC plot (amp_qplot.{fig|pdf|png|v2.pdf})

The amplification pdf is a plot of the G scores (top) and Q values (bottom) with respect to amplifications for all markers over the entire region analyzed.

Deletion Score GISTIC plot (del_qplot.{fig|pdf|png|v2.pdf})

The deletion pdf is a plot of the G scores (top) and Q values (bottom) with respect to deletions for all markers over the entire region analyzed.

Tables (table_{amp|del}.conf_##.txt, where ## is the confidence level)

Tables of basic information about the genomic regions (peaks) that GISTIC determined to be significantly amplified or deleted. These describe three kinds of peak boundaries, and list the genes contained in two of them. The region start and region end columns (along with the chromosome column) delimit the entire area containing the peak that is above the significance level. The region may be the same for multiple peaks. The peak start and end delimit the maximum value of the peak. The extended peak is the peak determined by robust, and is contained within the wide peak reported in {amp|del}_genes.txt by one marker.

Broad Significance Results (broad_significance_results.txt)

A table of per-arm statistical results for the data set. Each arm is a row in the table. The first column specifies the arm and the second column counts the number of genes known to be on the arm. For both amplification and deletion, the table has columns for the frequency of amplification or deletion of the arm, and a Z score and Q value.

Broad Values By Arm (broad_values_by_arm.txt)

A table of chromosome arm amplification levels for each sample. Each row is a chromosome arm, and each column a sample. The data are in units of absolute copy number -2.

All Data By Genes (all_data_by_genes.txt)

A gene-level table of copy number values for all samples. Each row is the data for a gene. The first three columns name the gene, its NIH locus ID, and its cytoband; the remaining columns are the samples. The copy number values in the table are in units of (copy number -2), so that no amplification or deletion is 0, genes with amplifications have positive values, and genes with deletions have negative values. The data are converted from marker level to gene level using the extreme method: a gene is assigned the greatest amplification or the least deletion value among the markers it covers.
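One reading of the extreme method is that each gene takes the marker value furthest from zero. The sketch below implements that reading; it is an assumption about the description above rather than GISTIC's actual code, and details such as tie handling may differ.

```python
from typing import Sequence


def collapse_extreme(marker_values: Sequence[float]) -> float:
    """Return the marker value with the largest absolute magnitude.

    Assumed interpretation of the 'extreme' method. On ties, the first
    value encountered is returned.
    """
    if not marker_values:
        raise ValueError("gene covers no markers")
    return max(marker_values, key=abs)


# Hypothetical marker values (copy number - 2) for two genes
print(collapse_extreme([0.1, 0.8, -0.3]))  # 0.8
print(collapse_extreme([-1.2, 0.4]))       # -1.2
```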

Broad Data By Genes (broad_data_by_genes.txt)

A gene-level table of copy number data similar to the all_data_by_genes.txt output, but using only broad events with lengths greater than the broad length cutoff. The structure of the file and the methods and units used for the data analysis are otherwise identical to all_data_by_genes.txt.

Focal Data By Genes (focal_data_by_genes.txt)

A gene-level table of copy number data similar to the all_data_by_genes.txt output, but using only focal events with lengths less than the broad length cutoff. The structure of the file and the methods and units used for the data analysis are otherwise identical to all_data_by_genes.txt.

All Thresholded By Genes (all_thresholded.by_genes.txt)

A gene-level table of discrete amplification and deletion indicators for all samples. There is a row for each gene. The first three columns name the gene, its NIH locus ID, and its cytoband; the remaining columns are the samples. A table value of 0 means no amplification or deletion above the threshold. Amplifications are positive numbers: 1 means amplification above the amplification threshold; 2 means amplification larger than the arm-level amplifications observed for the sample. Deletions are represented by negative table values: -1 represents deletion beyond the threshold; -2 means deletion greater than the minimum arm-level deletion observed for the sample.

Sample Cutoffs (sample_cutoffs.txt)

A table of the per-sample threshold cutoffs (in units of absolute copy number -2) used to distinguish high-level amplifications and deletions (+/-2) from ordinary amplifications and deletions (+/-1) in the all_thresholded.by_genes.txt output file. The table contains three columns: the sample identifier followed by the low (deletion) and high (amplification) cutoff values. The low cutoff is calculated as the minimum arm-level amplification level less the deletion threshold, and the high cutoff as the maximum arm-level amplification level plus the amplification threshold.
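The relationship between the per-sample cutoffs and the thresholded gene table can be sketched as follows. The arm-level values are hypothetical, the function names are not part of GISTIC, and the sketch assumes that arm-level values, gene values, and thresholds share the same units and that comparisons are strict.

```python
from typing import Sequence, Tuple

AMP_THRESHOLD = 0.3  # value used for this run
DEL_THRESHOLD = 0.3  # value used for this run


def sample_cutoffs(arm_values: Sequence[float]) -> Tuple[float, float]:
    """Return (low, high) cutoffs for one sample from its arm-level values."""
    low = min(arm_values) - DEL_THRESHOLD
    high = max(arm_values) + AMP_THRESHOLD
    return low, high


def threshold_gene(value: float, low: float, high: float) -> int:
    """Map a gene-level value to one of -2, -1, 0, 1, 2."""
    if value > high:
        return 2
    if value > AMP_THRESHOLD:
        return 1
    if value < low:
        return -2
    if value < -DEL_THRESHOLD:
        return -1
    return 0


# Hypothetical arm-level values for a single sample
low, high = sample_cutoffs([-0.5, 0.0, 0.4])
calls = [threshold_gene(v, low, high) for v in (1.0, 0.5, 0.1, -0.5, -1.0)]
print(calls)  # [2, 1, 0, -1, -2]
```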

Focal Input To Gistic (focal_input.seg.txt)

A list of copy number segments describing just the focal events present in the data. The segment amplification/deletion levels are in units of (copy number -2), with amplifications positive and deletions negative numbers. This file may be viewed with IGV.

Gene Counts vs. Copy Number Alteration Frequency (freqarms_vs_ngenes.{fig|pdf})

An image showing the correlation between gene counts and frequency of copy number alterations.

Confidence Intervals (regions_track.conf_##.bed, where ## is the confidence level)

A file indicating the position of the confidence intervals around GISTIC peaks that can be loaded as a track in a compatible genome browser such as IGV or the UCSC genome browser.

GISTIC

GISTIC identifies genomic regions that are significantly gained or lost across a set of tumors. It takes segmented copy number ratios as input, separates arm-level events from focal events, and then performs two tests: (i) identifies significantly amplified/deleted chromosome arms; and (ii) identifies regions that are significantly focally amplified or deleted. For the focal analysis, the significance levels (Q values) are calculated by comparing the observed gains/losses at each locus to those obtained by randomly permuting the events along the genome to reflect the null hypothesis that they are all 'passengers' and could have occurred anywhere. The locus-specific significance levels are then corrected for multiple hypothesis testing. The arm-level significance is calculated by comparing the frequency of gains/losses of each arm to the expected rate given its size. The method outputs genomic views of significantly amplified and deleted regions, as well as a table of genes with gain or loss scores. A more in-depth discussion of the GISTIC algorithm and its utility is given in [1], [3], and [5].
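The multiple-hypothesis correction step turns per-locus p-values into Q values. This report does not name the procedure, so the sketch below uses the Benjamini-Hochberg false discovery rate method as an assumed, commonly used choice; the p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


p_values = [0.01, 0.04, 0.03, 0.20]  # hypothetical per-locus p-values
q_values = benjamini_hochberg(p_values)
significant = [q < 0.25 for q in q_values]  # cutoff used in this report
print(q_values, significant)
```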

CNV Description

Regions of the genome that are prone to germ line variations in copy number are excluded from the GISTIC analysis using a list of germ line copy number variations (CNVs). A CNV is a DNA sequence that may be found at different copy numbers in the germ line of two different individuals. Such germ line variations can confound a GISTIC analysis, which finds significant somatic copy number variations in cancer. A more in-depth discussion is provided in [6]. GISTIC currently uses two CNV exclusion lists. One is based on the literature describing copy number variation, and a second one comes from an analysis of significant variations among the blood normals in the TCGA data set.

Download Results

The full results of the analysis summarized in this report can be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or the TCGA Data Coordination Center Portal.

References

[1] Beroukhim et al., Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma, Proc Natl Acad Sci U S A. Vol. 104:50 (2007)

[2] GISTIC version 1

[3] Mermel et al., GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers, Genome Biology Vol. 12:4 (2011)

[4] GISTIC version 2

[5] Beroukhim et al., The landscape of somatic copy-number alteration across human cancers, Nature Vol. 463:7283 (2010)

[6] McCarroll, S. A. et al., Integrated detection and population-genetic analysis of SNPs and copy number variation, Nat Genet Vol. 40(10):1166-1174 (2008)

[7] The Sanger Institute: Cancer Gene Census
