Mutation Analysis (MutSig v2.0 and MutSigCV v0.9 merged result)

Skin Cutaneous Melanoma (Metastatic)

23 May 2013 | analyses__2013_05_23

Maintainer Information

Maintained by Dan DiCara (Broad Institute)

Citation Information

Cite as Broad Institute TCGA Genome Data Analysis Center (2013): Mutation Analysis (MutSig v2.0 and MutSigCV v0.9 merged result). Broad Institute of MIT and Harvard. doi:10.7908/C1MP51CS

Overview

Introduction

This report describes the mutational landscape and properties of a given individual set and ranks genes and genesets by mutational significance. The results in this report were generated from the merged output of MutSig v2.0 and MutSigCV v0.9.

Working with individual set: SKCM-TM
Number of patients in set: 228

Input

The input for this pipeline is a set of individuals with the following files associated for each:

An annotated .maf file describing the mutations called for the respective individual, and their properties.
A .wig file that contains information about the coverage of the sample.

Summary

MAF used for this analysis: SKCM-TM.final_analysis_set.maf
Significantly mutated genes (q ≤ 0.1): 50
Mutations seen in COSMIC: 639
Significantly mutated genes in COSMIC territory: 42
Significantly mutated genesets: 2

Mutation Preprocessing

Read 228 MAFs of type "Broad"
Total number of mutations in input MAFs: 199529
After removing 4 mutations outside chr1-24: 199525
After removing 795 blacklisted mutations: 198730
After removing 3688 noncoding mutations: 195042
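
The running totals in the preprocessing log above can be checked with a few lines of Python. This is only a bookkeeping sketch: the step labels and counts are copied from the log, and the function name is a hypothetical helper, not part of MutSig.

```python
# Removal steps copied from the preprocessing log: (reason, number removed).
STEPS = [
    ("outside chr1-24", 4),
    ("blacklisted", 795),
    ("noncoding", 3688),
]

def running_totals(start, steps):
    """Return (reason, mutations remaining) after each removal step."""
    remaining = start
    totals = []
    for reason, removed in steps:
        remaining -= removed
        totals.append((reason, remaining))
    return totals

totals = running_totals(199529, STEPS)
for reason, remaining in totals:
    print(f"after removing {reason}: {remaining}")
```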

Mutation Filtering

Number of mutations before filtering: 195042
After removing 2459 mutations outside gene set: 192583
After removing 179 mutations outside category set: 192404
After removing 7 "impossible" mutations in gene-patient-category bins of zero coverage: 189759

Results

Breakdown of Mutations by Type

Table 1. Breakdown of mutations by type.

type	count
Frame_Shift_Del	924
Frame_Shift_Ins	271
In_Frame_Del	339
In_Frame_Ins	45
Missense_Mutation	117473
Nonsense_Mutation	7172
Nonstop_Mutation	49
Silent	64243
Splice_Site	1835
Translation_Start_Site	53
Total	192404
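
As a quick consistency check, the per-type counts in Table 1 can be summed and compared with the Total row. The sketch below copies the counts from the table; the derived fractions are illustrative and are not reported by the pipeline itself.

```python
# Counts per mutation type, copied from Table 1.
TYPE_COUNTS = {
    "Frame_Shift_Del": 924,
    "Frame_Shift_Ins": 271,
    "In_Frame_Del": 339,
    "In_Frame_Ins": 45,
    "Missense_Mutation": 117473,
    "Nonsense_Mutation": 7172,
    "Nonstop_Mutation": 49,
    "Silent": 64243,
    "Splice_Site": 1835,
    "Translation_Start_Site": 53,
}

total = sum(TYPE_COUNTS.values())
# Share of each type among all mutations in the table.
fractions = {name: count / total for name, count in TYPE_COUNTS.items()}

print("total:", total)
for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{name}\t{frac:.4f}")
```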

Breakdown of Mutation Rates by Category Type

Table 2. Breakdown of mutation rates per category discovered for this individual set.

category	n	N	rate	rate_per_mb	relative_rate	exp_ns_s_ratio
(C/T)p*C->T	88940	1812910184	0.000049	49	2.5	1.6
(A/G)p*C->T	9827	1521271502	6.5e-06	6.5	0.33	1.9
A->G	5054	3220290978	1.6e-06	1.6	0.08	2.3
transver	13699	6554472664	2.1e-06	2.1	0.11	5
indel+null	10478	6554472664	1.6e-06	1.6	0.082	NaN
double_null	159	6554472664	2.4e-08	0.024	0.0012	NaN
Total	128157	6554472664	2e-05	20	1	3.5
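
The report does not define the derived columns of Table 2, but the printed values appear consistent with rate = n / N, rate_per_mb = rate x 1e6, and relative_rate = category rate divided by the overall rate. The sketch below recomputes them under that assumption; the function and variable names are hypothetical.

```python
# (n, N) per category, copied from Table 2.
CATEGORIES = {
    "(C/T)p*C->T": (88940, 1812910184),
    "(A/G)p*C->T": (9827, 1521271502),
    "A->G": (5054, 3220290978),
    "transver": (13699, 6554472664),
    "indel+null": (10478, 6554472664),
    "double_null": (159, 6554472664),
}
TOTAL_N = 6554472664  # N from the Total row

def derive_rates(categories, total_N):
    """Recompute rate, rate_per_mb and relative_rate (assumed definitions)."""
    total_n = sum(n for n, _ in categories.values())
    total_rate = total_n / total_N
    derived = {}
    for name, (n, N) in categories.items():
        rate = n / N
        derived[name] = {
            "rate": rate,
            "rate_per_mb": rate * 1e6,
            "relative_rate": rate / total_rate,
        }
    return total_n, total_rate, derived

total_n, total_rate, derived = derive_rates(CATEGORIES, TOTAL_N)
for name, cols in derived.items():
    print(name, f"{cols['rate_per_mb']:.2g}", f"{cols['relative_rate']:.2g}")
```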

Target Coverage for Each Individual

The x axis represents the samples. The y axis represents the exons, one row per exon, sorted by average coverage across samples. Exons with exactly the same average coverage are sorted next by the %GC of the exon. (This secondary sort is especially useful for the zero-coverage exons at the bottom.)

Figure 1.

Distribution of Mutation Counts, Coverage, and Mutation Rates Across Samples

Figure 2. Patients counts and rates file used to generate this plot: SKCM-TM.patients.counts_and_rates.txt

Needs description.

Figure 3. Needs description.

Figure 4. Needs description.

CoMut Plot

Figure 5. The matrix in the center of the figure represents individual mutations in patient samples, color-coded by type of mutation, for the significantly mutated genes. The rates of synonymous and non-synonymous mutations are displayed at the top of the matrix. The barplot on the left of the matrix shows the number of mutations in each gene. The percentages represent the fraction of tumors with at least one mutation in the specified gene. The barplot to the right of the matrix displays the q-values for the most significantly mutated genes. The purple boxplots below the matrix (only displayed if required columns are present in the provided MAF) represent the distributions of allelic fractions observed in each sample. The plot at the bottom represents the base substitution distribution of individual samples, using the same categories that were used to calculate significance.

Significantly Mutated Genes

Column Descriptions:

N = number of sequenced bases in this gene across the individual set
n = number of (nonsilent) mutations in this gene across the individual set
npat = number of patients (individuals) with at least one nonsilent mutation
nsite = number of unique sites having a nonsilent mutation
nsil = number of silent mutations in this gene across the individual set
n1 = number of nonsilent mutations of type: (C/T)p*C->T
n2 = number of nonsilent mutations of type: (A/G)p*C->T
n3 = number of nonsilent mutations of type: A->G
n4 = number of nonsilent mutations of type: transver
n5 = number of nonsilent mutations of type: indel+null
n6 = number of nonsilent mutations of type: double_null
p_cons = p-value for enrichment of mutations at evolutionarily most-conserved sites in gene
p_joint = p-value for clustering + conservation
p = p-value (overall)
q = q-value, False Discovery Rate (Benjamini-Hochberg procedure)
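
The q column is described as a Benjamini-Hochberg false discovery rate. A minimal sketch of that procedure follows; it is the textbook step-up adjustment, not MutSig's own code, and the function name is hypothetical. The report's q-values are presumably computed over every gene tested, so they cannot be reproduced from the displayed rows alone.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Illustrative p-values (not taken from the report).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example_q)
```

Genes would then be called significant when q is at or below the chosen threshold (0.1 in this report).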

Table 3. A Ranked List of Significantly Mutated Genes. Number of significant genes found: 50. Number of genes displayed: 35. Click on a gene name to display its stick figure depicting the distribution of mutations and mutation types across the chosen gene (this feature may not be available for all significant genes).

rank	gene	description	N	n	npat	nsite	nsil	n1	n2	n3	n4	n5	n6	p_cons	p_joint	p_cv	p	q
1	C15orf23	chromosome 15 open reading frame 23	232193	15	14	7	4	15	0	0	0	0	0	1	0	0.0059	0	0
2	CDKN2A	cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)	209713	33	33	15	1	10	0	0	1	22	0	0	0	8.3e-15	0	0
3	NRAS	neuroblastoma RAS viral (v-ras) oncogene homolog	133556	67	67	9	1	2	1	23	41	0	0	1.2e-06	0	0	0	0
4	BRAF	v-raf murine sarcoma viral oncogene homolog B1	505506	123	117	17	3	13	1	5	103	1	0	0	0	1e-15	0	0
5	OXA1L	oxidase (cytochrome c) assembly 1-like	347630	7	7	3	3	7	0	0	0	0	0	1	0	0.56	0	0
6	TP53	tumor protein p53	276551	39	35	33	1	17	0	4	4	14	0	0.028	3.6e-06	1.4e-13	0	0
7	STK19	serine/threonine kinase 19	246849	13	11	8	0	11	1	0	1	0	0	0.96	0	1	0	0
8	PTEN	phosphatase and tensin homolog (mutated in multiple advanced cancers 1)	257977	18	18	16	0	1	0	3	3	11	0	0.71	0.12	7.1e-15	3.2e-14	7.1e-11
9	DSG1	desmoglein 1	725270	56	39	49	10	45	2	0	3	6	0	0.65	0.11	4.6e-09	1.1e-08	0.000022
10	PPP6C	protein phosphatase 6, catalytic subunit	228898	19	18	14	2	13	0	0	2	4	0	0.32	0.12	7e-09	1.8e-08	0.000032
11	RAC1	ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1)	140096	17	17	8	1	15	0	0	2	0	0	0.17	2e-05	0.0016	5.9e-07	0.00096
12	AOAH	acyloxyacyl hydrolase (neutrophil)	439824	22	21	20	9	12	0	0	5	5	0	0.18	0.014	3.1e-06	8e-07	0.0012
13	IDH1	isocitrate dehydrogenase 1 (NADP+), soluble	287241	12	12	4	1	10	0	0	2	0	0	0.97	3.4e-06	0.029	1.7e-06	0.0024
14	OR51S1	olfactory receptor, family 51, subfamily S, member 1	221261	30	27	21	8	23	3	0	2	2	0	0.99	0.014	8e-06	1.9e-06	0.0025
15	IL32	interleukin 32	113333	9	9	6	3	2	1	0	1	5	0	0.9	0.022	6.7e-06	2.5e-06	0.003
16	PPIAL4G	peptidylprolyl isomerase A (cyclophilin A)-like 4G	113772	12	12	9	7	7	0	0	3	2	0	0.99	0.15	1.6e-06	4e-06	0.0044
17	UGT2B15	UDP glucuronosyltransferase 2 family, polypeptide B15	692010	25	21	24	13	14	2	1	2	6	0	0.78	5.4e-06	0.067	5.7e-06	0.0061
18	TTN	titin	23403499	1045	176	962	369	807	69	33	94	33	9	1	1.8e-06	0.26	7.3e-06	0.0073
19	THEMIS	thymocyte selection associated	425802	33	27	25	14	27	1	1	2	2	0	0.64	0.00055	0.0011	9.2e-06	0.0087
20	LRTM1	leucine-rich repeats and transmembrane domains 1	237820	28	23	27	14	17	4	2	3	2	0	0.98	0.1	0.000012	0.000018	0.016
21	TCEB3C	transcription elongation factor B polypeptide 3C (elongin A3)	230241	31	26	23	9	23	2	0	2	4	0	1	0.1	0.000014	0.000021	0.018
22	ELF5	E74-like factor 5 (ets domain transcription factor)	185836	6	6	6	2	4	0	0	0	2	0	0.12	0.000046	0.051	0.000032	0.026
23	GPR141	G protein-coupled receptor 141	210024	14	14	10	5	11	0	1	1	1	0	0.12	0.14	0.000021	4e-05	0.031
24	NMNAT3	nicotinamide nucleotide adenylyltransferase 3	148123	8	8	8	8	3	1	0	1	3	0	0.0011	0.0012	0.0027	0.000046	0.035
25	EIF2B1	eukaryotic translation initiation factor 2B, subunit 1 alpha, 26kDa	216804	9	9	3	3	1	0	0	1	7	0	0.97	1	4.2e-06	0.000056	0.041
26	CHGB	chromogranin B (secretogranin 1)	426128	28	25	28	9	23	1	0	0	4	0	0.077	0.022	0.00021	0.000063	0.044
27	COPG2	coatomer protein complex, subunit gamma 2	280429	11	9	8	1	1	1	0	4	5	0	0.01	0.0029	0.0017	0.000066	0.044
28	RAPGEF5	Rap guanine nucleotide exchange factor (GEF) 5	370075	13	12	11	3	10	1	0	0	2	0	0.74	0.000036	0.16	0.000076	0.049
29	MS4A2	membrane-spanning 4-domains, subfamily A, member 2 (Fc fragment of IgE, high affinity I, receptor for; beta polypeptide)	180888	10	10	10	1	8	0	0	0	2	0	0.87	0.59	0.000012	0.000089	0.055
30	MPP7	membrane protein, palmitoylated 7 (MAGUK p55 subfamily member 7)	395617	31	25	27	5	25	1	0	2	3	0	0.61	0.12	0.000057	0.000091	0.055
31	C1QTNF9	C1q and tumor necrosis factor related protein 9	188774	14	14	12	3	10	1	0	2	1	0	0.96	0.12	0.000061	0.000091	0.055
32	DNAH7	dynein, axonemal, heavy chain 7	2773342	164	83	150	60	122	11	6	3	19	3	0.99	0.35	2e-05	0.000091	0.055
33	FAM58A	family with sequence similarity 58, member A	133422	5	5	4	0	0	0	0	0	5	0	0.085	0.02	0.00039	0.000098	0.055
34	LCE1B	late cornified envelope 1B	82168	12	12	12	0	8	0	0	2	2	0	0.68	1	8.2e-06	0.0001	0.055
35	SCN5A	sodium channel, voltage-gated, type V, alpha subunit	1272711	83	57	80	43	59	8	4	4	8	0	0.89	0.42	0.000022	0.00011	0.059
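
The npat column of Table 3 can be converted into the percentage of patients carrying at least one nonsilent mutation in a gene, using the 228 patients in this set. The sketch below does this for a few rows; whether these match the percentages drawn on the CoMut plot exactly is an assumption, and the helper name is hypothetical.

```python
N_PATIENTS = 228  # number of patients in set SKCM-TM

# npat values copied from Table 3 for a few genes.
NPAT = {"BRAF": 117, "NRAS": 67, "TP53": 35, "CDKN2A": 33}

def percent_mutated(npat, n_patients=N_PATIENTS):
    """Percentage of patients with at least one nonsilent mutation."""
    return 100.0 * npat / n_patients

for gene, npat in NPAT.items():
    print(f"{gene}\t{percent_mutated(npat):.1f}%")
```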

C15orf23

Figure S1. This figure depicts the distribution of mutations and mutation types across the C15orf23 significant gene.

CDKN2A

Figure S2. This figure depicts the distribution of mutations and mutation types across the CDKN2A significant gene.

NRAS

Figure S3. This figure depicts the distribution of mutations and mutation types across the NRAS significant gene.

BRAF

Figure S4. This figure depicts the distribution of mutations and mutation types across the BRAF significant gene.

OXA1L

Figure S5. This figure depicts the distribution of mutations and mutation types across the OXA1L significant gene.

TP53

Figure S6. This figure depicts the distribution of mutations and mutation types across the TP53 significant gene.

STK19

Figure S7. This figure depicts the distribution of mutations and mutation types across the STK19 significant gene.

PTEN

Figure S8. This figure depicts the distribution of mutations and mutation types across the PTEN significant gene.

DSG1

Figure S9. This figure depicts the distribution of mutations and mutation types across the DSG1 significant gene.

PPP6C

Figure S10. This figure depicts the distribution of mutations and mutation types across the PPP6C significant gene.

RAC1

Figure S11. This figure depicts the distribution of mutations and mutation types across the RAC1 significant gene.

IDH1

Figure S12. This figure depicts the distribution of mutations and mutation types across the IDH1 significant gene.

OR51S1

Figure S13. This figure depicts the distribution of mutations and mutation types across the OR51S1 significant gene.

IL32

Figure S14. This figure depicts the distribution of mutations and mutation types across the IL32 significant gene.

PPIAL4G

Figure S15. This figure depicts the distribution of mutations and mutation types across the PPIAL4G significant gene.

UGT2B15

Figure S16. This figure depicts the distribution of mutations and mutation types across the UGT2B15 significant gene.

TTN

Figure S17. This figure depicts the distribution of mutations and mutation types across the TTN significant gene.

THEMIS

Figure S18. This figure depicts the distribution of mutations and mutation types across the THEMIS significant gene.

LRTM1

Figure S19. This figure depicts the distribution of mutations and mutation types across the LRTM1 significant gene.

TCEB3C

Figure S20. This figure depicts the distribution of mutations and mutation types across the TCEB3C significant gene.

ELF5

Figure S21. This figure depicts the distribution of mutations and mutation types across the ELF5 significant gene.

GPR141

Figure S22. This figure depicts the distribution of mutations and mutation types across the GPR141 significant gene.

NMNAT3

Figure S23. This figure depicts the distribution of mutations and mutation types across the NMNAT3 significant gene.

EIF2B1

Figure S24. This figure depicts the distribution of mutations and mutation types across the EIF2B1 significant gene.

CHGB

Figure S25. This figure depicts the distribution of mutations and mutation types across the CHGB significant gene.

COPG2

Figure S26. This figure depicts the distribution of mutations and mutation types across the COPG2 significant gene.

RAPGEF5

Figure S27. This figure depicts the distribution of mutations and mutation types across the RAPGEF5 significant gene.

MS4A2

Figure S28. This figure depicts the distribution of mutations and mutation types across the MS4A2 significant gene.

MPP7

Figure S29. This figure depicts the distribution of mutations and mutation types across the MPP7 significant gene.

C1QTNF9

Figure S30. This figure depicts the distribution of mutations and mutation types across the C1QTNF9 significant gene.

DNAH7

Figure S31. This figure depicts the distribution of mutations and mutation types across the DNAH7 significant gene.

FAM58A

Figure S32. This figure depicts the distribution of mutations and mutation types across the FAM58A significant gene.

LCE1B

Figure S33. This figure depicts the distribution of mutations and mutation types across the LCE1B significant gene.

COSMIC analyses

In this analysis, COSMIC is used as a filter to increase power by restricting the territory of each gene. COSMIC version: v48.

Table 4. Significantly mutated genes (COSMIC territory only). Number of significant genes found: 42. Number of genes displayed: 10.

rank	gene	description	n	cos	n_cos	N_cos	cos_ev	p	q
1	STK19	serine/threonine kinase 19	13	2	6	456	12	0	0
2	IDH1	isocitrate dehydrogenase 1 (NADP+), soluble	12	5	10	1140	14920	0	0
3	NRAS	neuroblastoma RAS viral (v-ras) oncogene homolog	67	33	65	7524	80983	0	0
4	BRAF	v-raf murine sarcoma viral oncogene homolog B1	123	89	116	20292	1494071	0	0
5	TP53	tumor protein p53	39	356	37	81168	3489	0	0
6	CDKN2A	cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)	33	332	33	75696	1291	0	0
7	EPHA6	EPH receptor A6	60	8	5	1824	5	4.6e-10	3e-07
8	EPHA7	EPH receptor A7	42	13	5	2964	5	5.2e-09	2.9e-06
9	PTEN	phosphatase and tensin homolog (mutated in multiple advanced cancers 1)	18	767	18	174876	374	2.5e-08	0.000013
10	NF1	neurofibromin 1 (neurofibromatosis, von Recklinghausen disease, Watson disease)	40	285	11	64980	25	1.1e-07	0.000048

Note:

n = number of (nonsilent) mutations in this gene across the individual set
cos = number of unique mutated sites in this gene in COSMIC
n_cos = overlap between n and cos
N_cos = number of individuals times cos
cos_ev = total evidence: number of reports in COSMIC for mutations seen in this gene
p = p-value for seeing the observed amount of overlap in this gene
q = q-value, False Discovery Rate (Benjamini-Hochberg procedure)
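
The definition of N_cos (number of individuals times cos) can be verified against Table 4 with the 228 patients in this set. The values below are copied from the table; the dictionary names are hypothetical.

```python
N_PATIENTS = 228  # number of patients in set SKCM-TM

# cos and reported N_cos, copied from Table 4.
COS = {"STK19": 2, "IDH1": 5, "NRAS": 33, "BRAF": 89, "TP53": 356,
       "CDKN2A": 332, "PTEN": 767}
REPORTED_N_COS = {"STK19": 456, "IDH1": 1140, "NRAS": 7524, "BRAF": 20292,
                  "TP53": 81168, "CDKN2A": 75696, "PTEN": 174876}

# N_cos = number of individuals times number of unique COSMIC sites.
computed_n_cos = {gene: N_PATIENTS * cos for gene, cos in COS.items()}

for gene, value in computed_n_cos.items():
    status = "ok" if value == REPORTED_N_COS[gene] else "mismatch"
    print(f"{gene}\t{value}\t{status}")
```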

Geneset Analyses

Table 5. A Ranked List of Significantly Mutated Genesets. (Source: MSigDB GSEA Canonical Pathway Set). Number of significant genesets found: 2. Number of genesets displayed: 10.

rank	geneset	description	genes	N_genes	mut_tally	N	n	npat	nsite	nsil	n1	n2	n3	n4	n5	n6	p_ns_s	p	q
1	SA_G1_AND_S_PHASES	Cdk2, 4, and 6 bind cyclin D in G1, while cdk2/cyclin E promotes the G1/S transition.	ARF1, ARF3, CCND1, CDK2, CDK4, CDKN1A, CDKN1B, CDKN2A, CFL1, E2F1, E2F2, MDM2, NXT1, PRB1, TP53	15	CCND1(1), CDK4(5), CDKN1A(4), CDKN1B(1), CDKN2A(33), CFL1(1), E2F1(5), E2F2(4), MDM2(3), NXT1(2), PRB1(29), TP53(39)	2837450	127	79	99	16	62	3	6	16	40	0	2.6e-09	4.2e-06	0.0026
2	ST_G_ALPHA_S_PATHWAY	The G-alpha-s protein activates adenylyl cyclases, which catalyze cAMP formation.	ASAH1, BF, BFAR, BRAF, CAMP, CREB1, CREB3, CREB5, EPAC, GAS, GRF2, MAPK1, RAF1, SNX13, SRC, TERF2IP	12	BRAF(123), CAMP(1), CREB3(1), CREB5(9), MAPK1(4), RAF1(8), SNX13(1), SRC(1), TERF2IP(1)	3751054	149	130	43	30	28	4	6	107	4	0	0.02	0.00013	0.04
3	SA_REG_CASCADE_OF_CYCLIN_EXPR	Expression of cyclins regulates progression through the cell cycle by activating cyclin-dependent kinases.	CCNA1, CCNA2, CCND1, CCNE1, CCNE2, CDK2, CDK4, CDKN1B, CDKN2A, E2F1, E2F2, E2F4, PRB1	13	CCNA1(13), CCND1(1), CCNE1(4), CCNE2(12), CDK4(5), CDKN1B(1), CDKN2A(33), E2F1(5), E2F2(4), E2F4(1), PRB1(29)	3023744	108	76	82	17	61	0	4	18	25	0	2.7e-06	0.032	1
4	HSA00472_D_ARGININE_AND_D_ORNITHINE_METABOLISM	Genes involved in D-arginine and D-ornithine metabolism	DAO	1	DAO(10)	240807	10	10	10	2	7	1	1	1	0	0	0.12	0.11	1
5	TERTPATHWAY	hTERC, the RNA subunit of telomerase, and hTERT, the catalytic protein subunit, are required for telomerase activity and are overexpressed in many cancers.	HDAC1, MAX, MYC, SP1, SP3, TP53, WT1, ZNF42	7	HDAC1(1), MYC(5), SP1(5), SP3(1), TP53(39), WT1(5)	2379389	56	44	50	8	26	2	7	6	15	0	0.00078	0.22	1
6	HSA00627_1,4_DICHLOROBENZENE_DEGRADATION	Genes involved in 1,4-dichlorobenzene degradation	CMBL	1	CMBL(4)	172740	4	4	4	1	4	0	0	0	0	0	0.48	0.46	1
7	ARFPATHWAY	Cyclin-dependent kinase inhibitor 2A is a tumor suppressor that induces G1 arrest and can activate the p53 pathway, leading to G2/M arrest.	ABL1, CDKN2A, E2F1, MDM2, MYC, PIK3CA, PIK3R1, POLR1A, POLR1B, POLR1C, POLR1D, RAC1, RB1, TBX2, TP53, TWIST1	16	ABL1(12), CDKN2A(33), E2F1(5), MDM2(3), MYC(5), PIK3CA(8), PIK3R1(3), POLR1A(11), POLR1B(8), POLR1C(1), RAC1(17), RB1(8), TBX2(6), TP53(39), TWIST1(1)	6784714	160	96	126	24	84	8	6	16	43	3	9.2e-10	0.47	1
8	FOSBPATHWAY	FOSB gene expression and drug abuse	CDK5, FOSB, GRIA2, JUND, PPP1R1B	5	CDK5(4), FOSB(5), GRIA2(34), JUND(1), PPP1R1B(2)	1226014	46	39	43	15	28	3	2	6	7	0	0.069	0.82	1
9	HSA00401_NOVOBIOCIN_BIOSYNTHESIS	Genes involved in novobiocin biosynthesis	GOT1, GOT2, TAT	3	GOT1(6), GOT2(5), TAT(17)	883649	28	18	27	7	21	2	2	1	2	0	0.015	0.88	1
10	SLRPPATHWAY	Small leucine-rich proteoglycans (SLRPs) interact with and reorganize collagen fibers in the extracellular matrix.	BGN, DCN, DSPG3, FMOD, KERA, LUM	5	BGN(6), DCN(17), FMOD(4), KERA(14), LUM(11)	1178331	52	38	48	16	41	4	3	1	3	0	0.0018	0.88	1
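
The mut_tally column of Table 5 lists per-gene mutation counts as GENE(count) pairs, and in the rows checked here their sum matches the geneset's n column. A short sketch for parsing that column follows; the regular expression and function name are assumptions about the format as it appears in this report.

```python
import re

def parse_mut_tally(tally):
    """Parse a string like 'BRAF(123), CAMP(1)' into {gene: count}."""
    pairs = re.findall(r"([^\s,()]+)\((\d+)\)", tally)
    return {gene: int(count) for gene, count in pairs}

# mut_tally strings copied from the first two rows of Table 5.
G1_S = ("CCND1(1), CDK4(5), CDKN1A(4), CDKN1B(1), CDKN2A(33), CFL1(1), "
        "E2F1(5), E2F2(4), MDM2(3), NXT1(2), PRB1(29), TP53(39)")
G_ALPHA_S = ("BRAF(123), CAMP(1), CREB3(1), CREB5(9), MAPK1(4), RAF1(8), "
             "SNX13(1), SRC(1), TERF2IP(1)")

g1_s_counts = parse_mut_tally(G1_S)
g_alpha_s_counts = parse_mut_tally(G_ALPHA_S)
print(sum(g1_s_counts.values()), sum(g_alpha_s_counts.values()))
```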

Methods & Data

Methods

In brief, we tabulate the number of mutations and the number of covered bases for each gene. The counts are broken down by mutation context category: four context categories that are discovered by MutSig, and one for indel and 'null' mutations, which include indels, nonsense mutations, splice-site mutations, and non-stop (read-through) mutations. For each gene, we calculate the probability of seeing the observed constellation of mutations, i.e. the product P1 x P2 x ... x Pm, or a more extreme one, given the background mutation rates calculated across the dataset. [1]
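
To make the idea of "the observed constellation of mutations, or a more extreme one" concrete, the sketch below computes a one-category binomial upper-tail probability: the chance of seeing at least n mutations in N covered bases at a given background rate. This is a deliberate simplification and an assumption, not MutSig's actual method, which combines several context categories; the function names and example numbers are hypothetical.

```python
import math

def log_binom_pmf(k, n_trials, p):
    """Log of the Binomial(n_trials, p) probability of exactly k successes."""
    log_choose = (math.lgamma(n_trials + 1) - math.lgamma(k + 1)
                  - math.lgamma(n_trials - k + 1))
    return log_choose + k * math.log(p) + (n_trials - k) * math.log1p(-p)

def upper_tail(n_obs, n_bases, rate):
    """P(X >= n_obs) for X ~ Binomial(n_bases, rate), with 0 < rate < 1.

    Computed as 1 minus the lower tail, so very small p-values lose
    precision; adequate for illustration only.
    """
    if n_obs <= 0:
        return 1.0
    below = sum(math.exp(log_binom_pmf(k, n_bases, rate))
                for k in range(n_obs))
    return max(0.0, 1.0 - below)

# Hypothetical gene: 12 mutations in 250,000 covered bases at 2e-05 per base.
print(upper_tail(12, 250000, 2e-05))
```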

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References

[1] TCGA, Integrated genomic analyses of ovarian carcinoma, Nature 474:609-615 (2011)
