Mutation Analysis (MutSigCV v0.9)

Lung Adenocarcinoma (Primary solid tumor)

23 May 2013 | analyses__2013_05_23

Maintainer Information

Maintained by Dan DiCara (Broad Institute)

Citation Information

Cite as: Broad Institute TCGA Genome Data Analysis Center (2013): Mutation Analysis (MutSigCV v0.9). Broad Institute of MIT and Harvard. doi:10.7908/C1QR4V58

Overview

Introduction

This report describes the mutational landscape and properties of a given individual set, and ranks genes and gene sets according to mutational significance. MutSigCV v0.9 was used to generate the results in this report.

Working with individual set: LUAD-TP
Number of patients in set: 248

Input

The input for this pipeline is a set of individuals with the following files associated for each:

An annotated .maf file describing the mutations called for the respective individual, and their properties.
A .wig file that contains information about the coverage of the sample.

Summary

MAF used for this analysis: LUAD-TP.final_analysis_set.maf
Significantly mutated genes (q ≤ 0.1): 168

Results

Target Coverage for Each Individual

The x axis represents the samples. The y axis represents the exons, one row per exon, sorted by average coverage across samples. Exons with exactly the same average coverage are sorted next by the %GC of the exon. (The secondary sort is especially useful for the zero-coverage exons at the bottom.)

Figure 1.

Distribution of Mutation Counts, Coverage, and Mutation Rates Across Samples

Figure 2. Patient counts and rates file used to generate this plot: LUAD-TP.patients.counts_and_rates.txt

CoMut Plot

Figure 3. The matrix in the center of the figure represents individual mutations in patient samples, color-coded by type of mutation, for the significantly mutated genes. The rates of synonymous and non-synonymous mutations are displayed at the top of the matrix. The barplot on the left of the matrix shows the number of mutations in each gene. The percentages represent the fraction of tumors with at least one mutation in the specified gene. The barplot to the right of the matrix displays the q-values for the most significantly mutated genes. The purple boxplots below the matrix (only displayed if the required columns are present in the provided MAF) represent the distributions of allelic fractions observed in each sample. The plot at the bottom represents the base substitution distribution of individual samples, using the same categories that were used to calculate significance.

Significantly Mutated Genes

Column Descriptions:

nnon = number of (nonsilent) mutations in this gene across the individual set
npat = number of patients (individuals) with at least one nonsilent mutation
nsite = number of unique sites having a nonsilent mutation
nflank = number of noncoding mutations from this gene's flanking region, across the individual set
nsil = number of silent mutations in this gene across the individual set
p = p-value (overall)
q = q-value, False Discovery Rate (Benjamini-Hochberg procedure)
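
For readers unfamiliar with the Benjamini-Hochberg procedure named above, the following is a minimal Python sketch of how q-values can be derived from a list of p-values. It is a generic textbook version, not MutSigCV's own code; the function name and the example p-values are illustrative assumptions and do not come from this report.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input.

    Generic illustration only; not taken from the MutSigCV source.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value
    # is never larger than the q-value of any less significant test.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values, chosen only to show the monotonicity step.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
print(example_q)
```

Because each q-value depends on the total number of tests, q-values cannot be recomputed from the 35 displayed rows alone; the full gene table would be needed.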

Table 1. A Ranked List of Significantly Mutated Genes. Number of significant genes found: 168. Number of genes displayed: 35. Click on a gene name to display its stick figure depicting the distribution of mutations and mutation types across the chosen gene (this feature may not be available for all significant genes).

gene	Nnon	Nsil	Nflank	nnon	npat	nsite	nsil	nflank	nnei	fMLE	p	score	time	q
TP53	234360	68448	328837	137	128	106	2	3	4	0.86	1.9e-15	510	0.33	3.4e-11
KRAS	151498	36947	240371	64	64	6	0	1	1	0.64	5.2e-15	170	0.33	3.4e-11
KEAP1	326774	95698	169320	42	42	37	0	2	20	0.58	5.6e-15	140	0.33	3.4e-11
STK11	135168	38871	118819	21	20	20	0	2	20	0.66	7.7e-15	100	0.19	3.5e-11
EGFR	762139	208330	1647564	34	28	19	7	26	20	0.91	1.2e-14	100	0.18	4.4e-11
CRIPAK	246948	82068	57463	15	13	8	4	0	20	0.78	2.3e-13	69	0.18	7e-10
RBM10	366145	103516	422882	14	14	12	1	4	16	0.77	4.1e-12	73	0.19	1.1e-08
FLG	2289073	679455	115159	140	65	132	26	4	15	1	8e-12	140	0.19	1.8e-08
CDKN2A	94378	27828	65677	15	15	14	1	4	6	1.4	1.3e-11	61	0.32	2.7e-08
NF1	2346812	656897	2970763	35	29	32	4	14	0	0.44	1.5e-11	110	0.18	2.7e-08
SMARCA4	830974	235187	1278748	20	19	18	2	7	20	0.67	1.6e-11	86	0.18	2.7e-08
GPR112	1753032	556983	1165020	60	50	56	20	21	20	0.86	4.4e-11	110	0.18	6.8e-08
COL11A1	1077726	338658	2665261	73	55	67	15	94	18	3	1.5e-09	130	0.28	2.2e-06
MUC7	206073	76627	103495	20	18	19	1	1	20	1.3	4.2e-09	53	0.18	5.5e-06
CSMD3	2219330	619687	3353706	177	107	160	40	186	5	5	7.4e-08	190	0.24	9e-05
HRNR	1194657	395410	112967	55	33	55	11	3	20	1.1	9.3e-08	89	0.18	0.00011
BRAF	429511	122750	917805	19	19	12	1	11	19	0.9	1e-07	54	0.18	0.00011
RB1	713108	187949	1498581	13	13	13	1	14	20	0.76	1.5e-07	59	0.24	0.00016
RIMS2	834187	234583	2133181	47	41	45	4	66	6	2.1	5.3e-07	94	0.33	0.00051
RIT1	131430	36456	239389	11	10	9	1	2	20	0.94	8.4e-07	36	0.31	0.00077
LRP1B	2802163	699979	4354925	159	93	147	29	168	6	3.4	9e-07	180	0.18	0.00078
SPRR3	96715	29512	50421	7	7	4	0	1	20	1.7	1e-06	36	0.16	0.00085
ZCCHC5	241236	68746	12483	16	15	15	4	0	11	0.9	1.2e-06	41	0.18	0.00093
LTBP1	956591	254468	1763850	31	31	28	7	37	20	1.2	1.3e-06	75	0.18	0.001
MYL10	103664	25792	237044	7	7	7	1	6	20	0.81	2e-06	31	0.24	0.0015
FTSJD1	454554	118787	39000	12	12	11	0	1	20	1.1	2.3e-06	51	0.18	0.0016
ARID1A	1108788	326611	872549	17	15	16	1	4	2	0.64	2.6e-06	70	0.18	0.0018
SMAD4	329324	91755	469614	9	9	8	1	3	20	0.76	3.7e-06	41	0.18	0.0024
OR4Q3	178808	53320	35127	14	14	14	5	3	20	1.5	3.8e-06	40	0.18	0.0024
SVOP	121567	37290	216214	7	7	7	0	6	8	0.74	5.7e-06	30	0.18	0.0035
MGA	1641699	481751	779596	25	18	25	3	5	3	0.69	6.2e-06	77	0.33	0.0037
OVCH1	595102	165704	903961	30	21	26	7	30	20	1.7	9e-06	63	0.32	0.0051
SETD2	1240637	332711	988668	23	19	22	1	2	7	0.88	0.000012	71	0.19	0.0068
COL19A1	668801	205582	2178715	28	25	28	6	53	9	1.5	0.000013	65	0.18	0.0071
PCK1	368513	102176	429324	14	14	12	2	9	20	1.1	0.000014	47	0.2	0.0075
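
The rows of Table 1 can be read programmatically. The sketch below, in Python, parses the first three rows as they appear in this report and computes the percentage of patients carrying at least one nonsilent mutation in each gene, using the 248 patients reported for LUAD-TP. The helper names are hypothetical, and treating npat divided by the patient count as the per-gene percentage is an assumption made here for illustration.

```python
N_PATIENTS = 248  # "Number of patients in set" stated in this report

HEADER = ("gene Nnon Nsil Nflank nnon npat nsite nsil nflank "
          "nnei fMLE p score time q").split()

# First three rows of Table 1, copied from this report.
ROWS = """
TP53 234360 68448 328837 137 128 106 2 3 4 0.86 1.9e-15 510 0.33 3.4e-11
KRAS 151498 36947 240371 64 64 6 0 1 1 0.64 5.2e-15 170 0.33 3.4e-11
KEAP1 326774 95698 169320 42 42 37 0 2 20 0.58 5.6e-15 140 0.33 3.4e-11
"""


def parse_rows(text):
    """Turn whitespace-separated table rows into a list of dicts."""
    return [dict(zip(HEADER, line.split()))
            for line in text.strip().splitlines()]


def percent_mutated(record, n_patients=N_PATIENTS):
    """Percentage of patients with at least one nonsilent mutation."""
    return 100.0 * int(record["npat"]) / n_patients


records = parse_rows(ROWS)
for record in records:
    # Keep only genes at the significance threshold used in the Summary.
    if float(record["q"]) <= 0.1:
        print(f'{record["gene"]}: {percent_mutated(record):.1f}% of patients')
```

The same parsing should work on the full table, since every row has the same 15 whitespace-separated fields.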

Methods & Data

Methods

In brief, we tabulate the number of mutations and the number of covered bases for each gene. The counts are broken down by mutation context category: four context categories that are discovered by MutSig, and one for indel and 'null' mutations, which include indels, nonsense mutations, splice-site mutations, and non-stop (read-through) mutations. For each gene, we calculate the probability of seeing the observed constellation of mutations (i.e., the product P1 x P2 x ... x Pm) or a more extreme one, given the background mutation rates calculated across the dataset. [1]
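
As a rough illustration of the product P1 x P2 x ... x Pm, the Python sketch below computes the log-probability of one gene's observed per-category mutation counts. It assumes each category follows an independent binomial model with a known background rate; that model, the function names, and all numbers in the example are assumptions for illustration and are not taken from MutSigCV. The "or a more extreme one" tail, which is what turns this quantity into a p-value, is not computed here.

```python
import math


def log_binom_pmf(n_mut, n_bases, rate):
    """Log binomial probability of n_mut mutations in n_bases covered bases.

    Requires 0 < rate < 1. Computed in log space for numerical stability.
    """
    log_choose = (math.lgamma(n_bases + 1)
                  - math.lgamma(n_mut + 1)
                  - math.lgamma(n_bases - n_mut + 1))
    return (log_choose
            + n_mut * math.log(rate)
            + (n_bases - n_mut) * math.log1p(-rate))


def log_constellation_probability(counts, coverage, rates):
    """Sum of per-category log probabilities, i.e. log(P1 x P2 x ... x Pm)."""
    return sum(log_binom_pmf(n, big_n, mu)
               for n, big_n, mu in zip(counts, coverage, rates))


# Hypothetical gene with five categories (four context categories plus
# one indel/null category). All values below are made up.
counts = [3, 1, 0, 2, 1]
coverage = [50_000, 40_000, 60_000, 30_000, 180_000]
rates = [5e-6, 2e-6, 1e-6, 8e-6, 5e-7]

log_p = log_constellation_probability(counts, coverage, rates)
print(f"log probability of the observed constellation: {log_p:.2f}")
```

Working in log space avoids underflow when the per-category probabilities are very small, as they would be for genes with p-values near 1e-15.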

Download Results

This is an experimental feature. The full results of the analysis summarized in this report can be downloaded from the TCGA Data Coordination Center.

References

[1] TCGA, Integrated genomic analyses of ovarian carcinoma, Nature 474:609-615 (2011)
