Identification of putative miR direct targets by sequencing data
Stomach and Esophageal carcinoma (Primary solid tumor)
02 April 2015  |  analyses__2015_04_02
Maintainer Information
Maintained by Hailei Zhang (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2015): Identification of putative miR direct targets by sequencing data. Broad Institute of MIT and Harvard. doi:10.7908/C1GT5M8M
Overview
Introduction

This pipeline infers putative direct gene targets of miRs from miRseq and mRNAseq expression profiles across multiple samples. It has the following steps:

  1. The CLR approach[8] is applied to infer putative miR:gene regulatory connections.

  2. miR:gene pairs are filtered by Pearson correlation (<= -0.3).

  3. miR:gene pairs are further filtered by predicted interactions in three sequence prediction databases (Miranda, Pictar, Targetscan).

Summary

The CLR algorithm was applied to 775 miRs and 23517 mRNAs across 415 samples. After the 2 filtering steps, 188 miR:gene pairs were detected.

Results
Significant miR:gene pairs

Figure 1.  All miR hubs with their strong anti-correlated genes and predicted interactions in three sequence prediction databases.

Table 1.  List of miR:gene pairs with corr < -0.30 and predicted interactions in three sequence prediction databases.

Mir             Gene      Corr   p        q        Prediction.DBs             Miranda  Pictar  Targetscan  Total
hsa-miR-101-3p  RAD54L    -0.53  4e-10    1.3e-09  miranda,pictar,targetscan  1        1       1           3
hsa-miR-101-3p  AMMECR1   -0.45  5.2e-06  5.5e-06  miranda,pictar,targetscan  1        1       1           3
hsa-miR-107     KIAA1033  -0.31  2.9e-07  4.6e-07  miranda,pictar,targetscan  1        1       1           3
hsa-miR-137     ESRRA     -0.34  4.7e-07  7.1e-07  miranda,pictar,targetscan  1        1       1           3
hsa-miR-137     PNKD      -0.33  1.3e-06  1.7e-06  miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p  ZCCHC24   -0.48  3.6e-08  7.4e-08  miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p  ZEB2      -0.47  0        0        miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p  CXCL12    -0.44  3.6e-09  9.7e-09  miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p  WIPF1     -0.44  6.7e-16  6.2e-15  miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p  DIXDC1    -0.42  3.1e-07  4.9e-07  miranda,pictar,targetscan  1        1       1           3
miR connections

Table 2.  All miR hubs with their associated genes in the putative direct target network.

Mir Number.of.Genes Genes
hsa-miR-29a-3p 19 ATRN, BLMH, CCDC117, COMMD2, DNAJB11, DNMT3B, E2F7, ENTPD7, IREB2, NANP, PDHX, SENP1, SNX4, SPAST, SS18L1, TDG, TRIM37, YY1, ABCE1
hsa-miR-195-5p 18 CDC25A, CHEK1, CHORDC1, CIT, E2F7, EIF4E, ESRP1, FAM60A, KIF23, OTX1, RAD23B, SERBP1, SPTY2D1, SRPK1, TUBA1A, WBP11, ZNF367, ATP13A3
hsa-miR-29c-3p 13 ABCE1, ANKRD13B, DNAJB11, DNMT3B, E2F7, FAM136A, HIATL1, NKRF, SLC16A1, TDG, TUBB, YY1, ABCB6
hsa-miR-26a-5p 13 DEPDC1B, E2F7, EZH2, FAM98A, GNPNAT1, GRSF1, KPNA2, LSM12, MTDH, NUS1, PPP3R1, SERBP1, CHORDC1
hsa-miR-200b-3p 10 CFL2, DENND5A, DNAJB5, FRMD6, GLI3, KCTD15, NCS1, WASF1, ZNF532, BASP1
hsa-miR-27a-3p 8 APAF1, AQP11, CCNG1, CCNJ, CHD7, SCML1, ZNF238, ADCY6
hsa-miR-200c-3p 7 CYP1B1, LHFP, ST6GALNAC5, WIPF1, ZEB1, ZEB2, CDH11
hsa-miR-141-3p 7 DIXDC1, DLC1, KHDRBS2, WIPF1, ZCCHC24, ZEB2, CXCL12
hsa-miR-16-5p 7 CAPZA2, CTDSPL, PRKAR2A, RAB11FIP2, RNF125, RSBN1, ATXN2
hsa-miR-26b-5p 5 LSM12, ME2, PPP3R1, UBE2D1, FAM98A
Gene connections

Table 3.  All gene hubs with their associated miRs in the putative direct target network.

Gene Number.of.Mirs Mirs
E2F7 5 hsa-miR-195-5p, hsa-miR-29a-3p, hsa-miR-29c-3p, hsa-miR-30d-5p, hsa-miR-26a-5p
CHD7 4 hsa-miR-152-3p, hsa-miR-23a-3p, hsa-miR-27a-3p, hsa-miR-22-3p
CDKN1B 3 hsa-miR-221-3p, hsa-miR-222-3p, hsa-miR-24-3p
TDG 3 hsa-miR-29b-3p, hsa-miR-29c-3p, hsa-miR-29a-3p
DNMT3B 3 hsa-miR-29b-3p, hsa-miR-29c-3p, hsa-miR-29a-3p
LSM12 3 hsa-miR-214-3p, hsa-miR-26b-5p, hsa-miR-26a-5p
TOP1 3 hsa-miR-23b-3p, hsa-miR-24-3p, hsa-miR-23a-3p
BORA 2 hsa-miR-23b-3p, hsa-miR-23a-3p
YY1 2 hsa-miR-29c-3p, hsa-miR-29a-3p
TMED5 2 hsa-miR-27b-3p, hsa-miR-98-5p
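
The hub tables (Tables 2 and 3) can be thought of as two groupings of the same list of miR:gene pairs. The sketch below illustrates that grouping in Python. The function name `hub_table` and the alphabetical tie-breaking are illustrative assumptions and are not taken from the pipeline itself.

```python
from collections import defaultdict

def hub_table(pairs, key_index=0):
    """Group (mir, gene) pairs by one side and count the partners.

    key_index=0 groups by miR (layout of Table 2);
    key_index=1 groups by gene (layout of Table 3).
    Returns rows of (hub, number_of_partners, sorted partner list),
    largest hubs first. Ties are broken alphabetically (an assumption).
    """
    partners = defaultdict(set)
    for pair in pairs:
        partners[pair[key_index]].add(pair[1 - key_index])
    rows = [(hub, len(p), sorted(p)) for hub, p in partners.items()]
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows

# A few pairs taken from the tables in this report.
pairs = [
    ("hsa-miR-141-3p", "ZEB2"),
    ("hsa-miR-141-3p", "WIPF1"),
    ("hsa-miR-200c-3p", "ZEB2"),
    ("hsa-miR-101-3p", "RAD54L"),
]
mir_hubs = hub_table(pairs, key_index=0)
gene_hubs = hub_table(pairs, key_index=1)
```

With these four pairs, hsa-miR-141-3p is the largest miR hub (two genes) and ZEB2 is the largest gene hub (two miRs).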
Methods & Data
Input

The following files and database versions were used as input.

  • miRseq (at precursor level) RPM values (reads per million reads aligned to miRBase precursor), log2 transformed = /xchip/cga/gdac-prod/tcga-gdac/jobResults/miRseq_mature_preprocess/STES-TP/14527770/STES-TP.miRseq_mature_RPM_log2.txt

  • mRNAseq RSEM/RPKM values, log2 transformed = /xchip/cga/gdac-prod/tcga-gdac/jobResults/mRNAseq_preprocessor/STES-TP/14527769/STES-TP.mRNAseq_RPKM_log2.txt

  • miR:gene predicted interactions file = /xchip/cga_home/hailei/FH/CLR/human_interactions.predicted.v2.mirbase21.txt

  • Miranda = microrna.org Aug 2010 release, Microcosm version 5

  • Pictar = version 1

  • Targetscan = release 5.2
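
As an illustration of the RPM definition used for the miRseq input (reads per million aligned reads, then log2 transformed), a minimal Python sketch follows. The function name `log2_rpm`, the use of the column sum as the number of aligned reads, and the treatment of zero counts as missing values are assumptions made for this example; the preprocessing step that produced the input files may differ.

```python
import numpy as np

def log2_rpm(counts):
    """Convert raw read counts to log2 RPM.

    counts: 2-D array, rows = miRs, columns = samples.
    """
    counts = np.asarray(counts, dtype=float)
    # Assumption: the aligned-read total for a sample is the column sum.
    total = counts.sum(axis=0, keepdims=True)
    rpm = counts / total * 1e6
    # Assumption: zero counts become NaN instead of receiving a pseudocount.
    rpm[rpm == 0] = np.nan
    return np.log2(rpm)

counts = [[1, 0],
          [3, 4]]
values = log2_rpm(counts)
```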

CLR method

The CLR (Context Likelihood of Relatedness) algorithm builds on relevance network strategies by adding a background correction step. After computing the mutual information between regulators and their potential target genes, CLR calculates the statistical likelihood of each mutual information value within its network context. The algorithm compares the mutual information of a miR/gene pair to the background distribution of mutual information scores for all possible miR/gene pairs that include either the miR or its target. After this background correction, the most probable interactions are those whose mutual information scores stand significantly above the background distribution of mutual information scores[8].
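
The background correction described above can be sketched in Python as follows. This is a minimal illustration of the commonly described CLR scoring (a z-score of each mutual information value against its row and its column, combined into one score), not the pipeline's own code. It assumes the mutual information matrix has already been estimated; the function name `clr_scores` and the handling of zero-variance rows or columns are assumptions for this example.

```python
import numpy as np

def clr_scores(mi):
    """Background-correct a miR-by-gene mutual information (MI) matrix.

    mi: 2-D array, rows = miRs, columns = genes, entries = MI values.
    Returns an array of the same shape holding combined z-scores.
    """
    mi = np.asarray(mi, dtype=float)
    # Background for each miR: all MI values in its row.
    row_mu = mi.mean(axis=1, keepdims=True)
    row_sd = mi.std(axis=1, keepdims=True)
    # Background for each gene: all MI values in its column.
    col_mu = mi.mean(axis=0, keepdims=True)
    col_sd = mi.std(axis=0, keepdims=True)
    # Assumption: a flat background (zero spread) contributes no signal.
    row_sd = np.where(row_sd == 0, np.inf, row_sd)
    col_sd = np.where(col_sd == 0, np.inf, col_sd)
    # Values below the background mean are clipped to zero.
    z_mir = np.maximum(0.0, (mi - row_mu) / row_sd)
    z_gene = np.maximum(0.0, (mi - col_mu) / col_sd)
    return np.sqrt(z_mir ** 2 + z_gene ** 2)

mi = [[0.1, 0.1, 0.9],
      [0.1, 0.1, 0.1],
      [0.1, 0.1, 0.1]]
scores = clr_scores(mi)
```

In this toy matrix only the pair in the first row and third column stands above both its miR background and its gene background, so it is the only entry with a non-zero score.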

Pearson correlation

Pairwise Pearson correlation coefficients are first computed for all miR:gene pairs. A gene is identified as a putative direct target of a miR if its correlation with that miR is less than the user-defined threshold (-0.3) and it has been predicted as a target of that miR in three sequence-based prediction databases: Miranda[1][2], Pictar[3][4], and TargetScan[5][6][7]. We infer a direct target miR:gene network that comprises all such putative direct associations.

  • threshold = -0.3
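
The correlation and database filters can be sketched in Python as below. This is an illustration under stated assumptions, not the pipeline's implementation: the function names, the `predicted` dictionary layout, and the `min_dbs` parameter are hypothetical, since the format of the predicted interactions file and the exact database requirement are not described here. Pairs exactly at the threshold are excluded, following the "less than" wording, and rows with constant expression are not handled.

```python
import numpy as np

def pearson_matrix(mir_expr, gene_expr):
    """Pearson r between every miR and every gene.

    Both inputs are 2-D arrays with rows = features and columns = the
    same samples in the same order. Returns an (n_mir, n_gene) array.
    """
    a = np.asarray(mir_expr, dtype=float)
    b = np.asarray(gene_expr, dtype=float)
    # Center each row, then scale it to unit length.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def putative_targets(corr, mirs, genes, predicted, threshold=-0.3, min_dbs=1):
    """Keep pairs with r < threshold that at least min_dbs databases predict.

    predicted: dict mapping (mir, gene) to a set of database names
    (hypothetical layout chosen for this example).
    """
    hits = []
    for i, j in zip(*np.where(corr < threshold)):
        dbs = predicted.get((mirs[i], genes[j]), set())
        if len(dbs) >= min_dbs:
            hits.append((mirs[i], genes[j], float(corr[i, j]), sorted(dbs)))
    return hits

# Toy data: one miR, two genes, four samples.
corr = pearson_matrix([[1, 2, 3, 4]], [[4, 3, 2, 1], [1, 2, 3, 4]])
predicted = {("miR-A", "G1"): {"miranda", "pictar", "targetscan"}}
hits = putative_targets(corr, ["miR-A"], ["G1", "G2"], predicted, min_dbs=3)
```

Here G1 is perfectly anti-correlated with the miR and predicted by all three databases, so it is kept; G2 is positively correlated and is dropped by the correlation filter.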

Download Results

The full results of the analysis summarized in this report can be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or the TCGA Data Coordination Center Portal.

References
[1] Betel D, Wilson M, Gabow A, Marks DS, Sander C, microRNA target predictions: The microRNA.org resource: targets and expression, Nucleic Acids Res 36:D149-53 (2008)
[2] John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS, miRanda: Human MicroRNA targets, PLoS Biol 3(7):e264 (2005)
[3] Krek A, Grun D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, Piedade ID, Gunsalus KC, Stoffel M, Rajewsky N, Combinatorial microRNA target predictions, Nature Genetics 37:495-500 (2005)
[4] Chen K, Rajewsky N, Natural selection on human microRNA binding sites inferred from SNP data, Nat Genet 38:1452-1456 (2006)
[5] Lewis BP, Burge CB, Bartel DP, Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets, Cell 120(1):15-20 (2005)
[6] Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP, MicroRNA Targeting Specificity in Mammals: Determinants beyond Seed Pairing, Molecular Cell 27:91-105 (2007)
[7] Friedman RC, Farh KK, Burge CB, Bartel DP, Most Mammalian mRNAs Are Conserved Targets of MicroRNAs, Genome Research 19:92-105 (2009)
[8] Genovese G, Ergun A, Shukla SA, Campos B, Hanna J, Ghosh P, Quayle SN, Rai K, Colla S, Ying H, Wu CJ, Sarkar S, Xiao Y, Zhang J, Zhang H, Kwong L, Dunn K, Wiedemeyer WR, Brennan C, Zheng H, Rimm DL, Collins JJ, Chin L, microRNA regulatory network inference identifies miR-34a as a novel regulator of TGF-β signaling in glioblastoma, Cancer Discovery 2(8):736-49 (2012)