Identification of putative miR direct targets by sequencing data
Prostate Adenocarcinoma (Primary solid tumor)
21 August 2015  |  analyses__2015_08_21
Maintainer Information
Maintained by Hailei Zhang (Broad Institute)
Citation Information
Cite as Broad Institute TCGA Genome Data Analysis Center (2015): Identification of putative miR direct targets by sequencing data. Broad Institute of MIT and Harvard. doi:10.7908/C10R9NQB

This pipeline infers putative direct gene targets of miRs based on miRseq and mRNAseq expression profiles across multiple samples. This pipeline has the following steps:

  1. The CLR approach[8] is applied to infer putative miR:gene regulatory connections.

  2. miR:gene pairs are filtered by Pearson correlation (<= -0.3).

  3. miR:gene pairs are filtered by predicted interactions in three sequence prediction databases (Miranda, Pictar, Targetscan).


The CLR algorithm was applied to 687 miRs and 18274 mRNAs across 493 samples. After the 2 filtering steps, 60 miR:gene pairs were detected.

Significant miR:gene pairs

Figure 1.  All miR hubs with their strongly anti-correlated genes and predicted interactions in three sequence prediction databases.

Table 1.  List of miR:gene pairs with corr < -0.30 and predicted interactions in three sequence prediction databases.

Mir              Gene     Corr   p        q        Prediction.DBs             Miranda  Pictar  Targetscan  Total
hsa-miR-1-3p     XPO6     -0.41  6.8e-09  1.8e-08  miranda,pictar,targetscan  1        1       1           3
hsa-miR-137      SLC1A5   -0.3   2.5e-12  1.4e-11  miranda,pictar,targetscan  1        1       1           3
hsa-miR-137      PNKD     -0.3   0        0        miranda,pictar,targetscan  1        1       1           3
hsa-miR-141-3p   CALU     -0.32  2.7e-08  5.9e-08  miranda,pictar,targetscan  1        1       1           3
hsa-miR-16-5p    SLC13A3  -0.31  5.2e-07  7.6e-07  miranda,pictar,targetscan  1        1       1           3
hsa-miR-193b-3p  CNOT6    -0.34  4.8e-11  2.1e-10  miranda,pictar,targetscan  1        1       1           3
hsa-miR-195-5p   ZNF622   -0.34  7.1e-08  1.4e-07  miranda,pictar,targetscan  1        1       1           3
hsa-miR-200b-3p  B3GNT1   -0.42  0        0        miranda,pictar,targetscan  1        1       1           3
hsa-miR-204-5p   RAB10    -0.35  4.2e-07  6.4e-07  miranda,pictar,targetscan  1        1       1           3
hsa-miR-205-5p   FAM84B   -0.34  3.4e-06  3.8e-06  miranda,pictar,targetscan  1        1       1           3
miR connections

Table 2.  All miR hubs with their associated genes in the putative direct target network.

Mir              Number.of.Genes  Genes
hsa-miR-30b-5p   15               ANKRD17,CCPG1,CSNK1G1,DOCK7,EAF1,EIF2C3,ELMOD2,FBXL20,GALNT1,IDH1,IGF2R,MAN1A2,MFSD6,ZDHHC21,ADAM9
hsa-miR-137      2                SLC1A5,PNKD
hsa-miR-205-5p   2                UBE2N,FAM84B
hsa-miR-29b-3p   2                SLC39A9,ENTPD7
hsa-miR-29c-3p   2                TMEM132A,MYBL2
hsa-miR-96-5p    2                PAK3,GAN
hsa-miR-320a     1                PHF8
hsa-miR-214-3p   1                SPTBN2
hsa-miR-141-3p   1                CALU
Gene connections

Table 3.  All gene hubs with their associated miRs in the putative direct target network.

Gene     Number.of.Mirs  Mirs
FBXL20   2               hsa-miR-30b-5p,hsa-miR-30c-5p
ANKRD17  2               hsa-miR-30b-5p,hsa-miR-30c-5p
ZDHHC21  2               hsa-miR-30b-5p,hsa-miR-30c-5p
EAF1     2               hsa-miR-30b-5p,hsa-miR-30c-5p
CSNK1G1  2               hsa-miR-30b-5p,hsa-miR-30c-5p
ELMOD2   2               hsa-miR-30b-5p,hsa-miR-30c-5p
RAB10    1               hsa-miR-204-5p
REEP3    1               hsa-miR-30c-5p
MTDH     1               hsa-miR-30c-5p
GALNT1   1               hsa-miR-30b-5p
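
Tables 2 and 3 read as two groupings of the same list of miR:gene pairs, once by miR and once by gene. The following is a minimal sketch of that grouping, not the pipeline's own code. The function name hub_table is hypothetical, and the five example pairs are taken from rows of Table 3 above.

```python
from collections import defaultdict

def hub_table(pairs, key_index):
    """Group (mir, gene) pairs by miR (key_index=0) or by gene (key_index=1).

    Returns rows of (hub, number_of_partners, partners), largest hubs first.
    This is an illustrative helper, not part of the Firehose pipeline.
    """
    groups = defaultdict(list)
    for pair in pairs:
        groups[pair[key_index]].append(pair[1 - key_index])
    rows = [(hub, len(partners), partners) for hub, partners in groups.items()]
    # sorted() is stable, so hubs with equal counts keep their input order.
    return sorted(rows, key=lambda row: -row[1])

# A few pairs copied from Table 3 (a subset, for illustration only).
pairs = [
    ("hsa-miR-30b-5p", "FBXL20"),
    ("hsa-miR-30c-5p", "FBXL20"),
    ("hsa-miR-204-5p", "RAB10"),
    ("hsa-miR-30c-5p", "REEP3"),
    ("hsa-miR-30b-5p", "GALNT1"),
]

mir_hubs = hub_table(pairs, key_index=0)   # layout of Table 2
gene_hubs = hub_table(pairs, key_index=1)  # layout of Table 3
```

Because the example uses only a subset of the 60 pairs, the counts it produces for each miR are smaller than those reported in Table 2.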
Methods & Data

The following files and database versions were used as input.

  • miRseq (at precursor level) RPM values (reads per million reads aligned to miRBase precursor), log2 transformed = /xchip/cga/gdac-prod/tcga-gdac/jobResults/miRseq_mature_preprocess/PRAD-TP/19087630/PRAD-TP.miRseq_mature_RPM_log2.txt

  • mRNAseq RSEM/RPKM values, log2 transformed = /xchip/cga/gdac-prod/tcga-gdac/jobResults/mRNAseq_preprocessor/PRAD-TP/19438991/PRAD-TP.uncv2.mRNAseq_RSEM_normalized_log2.txt

  • miR:gene predicted interactions file = /xchip/cga_home/hailei/FH/CLR/human_interactions.predicted.v2.mirbase21.txt

  • Miranda = Aug 2010 release, Microcosm version 5

  • Pictar = version 1

  • Targetscan = release 5.2

CLR method

The CLR (Context Likelihood of Relatedness) algorithm builds on relevance network strategies by applying a background correction step. After computing the mutual information between regulators and their potential target genes, CLR calculates the statistical likelihood of each mutual information value within its network context. The algorithm compares the mutual information of a miR/gene pair to the background distribution of mutual information scores for all possible miR/gene pairs that include either the miR or its target. After this background correction, the most probable interactions are those whose mutual information scores stand significantly above the background distribution of mutual information scores[8].
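
The background correction described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: it assumes the mutual information matrix (miRs in rows, genes in columns) has already been estimated, and it uses the z-score form of the original CLR formulation, in which each value is standardized against its row and its column, negative z-scores are clipped to zero, and the two are combined as a Euclidean norm. The function name clr_scores, the use of the population standard deviation, and the toy matrix are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev

def clr_scores(mi):
    """Background-correct a miR x gene mutual information matrix.

    mi: list of rows (one per miR), each a list of MI values (one per gene).
    Returns a matrix of the same shape with CLR-style scores.
    """
    cols = list(zip(*mi))
    row_mu = [mean(r) for r in mi]
    row_sd = [pstdev(r) for r in mi]
    col_mu = [mean(c) for c in cols]
    col_sd = [pstdev(c) for c in cols]

    def z(value, mu, sd):
        # Clip at zero: only MI values above the background count.
        # A flat background (sd == 0) carries no evidence, so score 0.
        return max(0.0, (value - mu) / sd) if sd > 0 else 0.0

    return [
        [
            math.hypot(
                z(mi[i][j], row_mu[i], row_sd[i]),  # context of the miR
                z(mi[i][j], col_mu[j], col_sd[j]),  # context of the gene
            )
            for j in range(len(cols))
        ]
        for i in range(len(mi))
    ]

# Toy matrix (made-up values): 3 miRs x 3 genes.
toy_mi = [
    [0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1],
    [0.1, 0.1, 0.2],
]
toy_scores = clr_scores(toy_mi)
```

In the toy matrix, the entry 0.2 in the last row receives the same score as the entry 0.9 in the first row, because each is judged against its own row and column rather than against a global cutoff. That is the point of the context correction.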

Pearson correlation

Pairwise Pearson correlation coefficients are first computed for all miR:gene pairs. A gene is identified as a putative direct target of a miR if its correlation with that miR is below the user-defined threshold (-0.3) and it has been predicted as a target of that miR in three sequence-based prediction databases: Miranda[1][2], Pictar[3][4], and TargetScan[5][6][7]. The inferred direct target miR:gene network comprises all such putative direct associations.

  • threshold = -0.3
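
The two filters can be sketched as below. This is a hedged illustration rather than the pipeline's code: the function names, the dictionary layout of the inputs, and the min_dbs parameter are assumptions, and the miR and gene names and expression values are invented placeholders. The comparison uses <= to match the step list at the top of this report, and min_dbs=3 reflects the fact that every row shown in Table 1 is predicted by all three databases.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for constant profiles; treated as no correlation here
    return sxy / math.sqrt(sxx * syy)

def putative_targets(mir_expr, gene_expr, predicted, threshold=-0.3, min_dbs=3):
    """Keep pairs that are anti-correlated and supported by enough databases.

    mir_expr, gene_expr: dict of name -> log2 expression values, same sample order.
    predicted: dict of (mir, gene) -> set of database names predicting the pair.
    """
    hits = []
    for (mir, gene), dbs in predicted.items():
        if len(dbs) < min_dbs:
            continue
        if mir not in mir_expr or gene not in gene_expr:
            continue
        r = pearson(mir_expr[mir], gene_expr[gene])
        if r <= threshold:
            hits.append((mir, gene, r))
    return hits

# Placeholder data (not from the PRAD cohort): 1 miR, 3 genes, 4 samples.
mir_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0]}
gene_expr = {
    "GENE1": [4.0, 3.0, 2.0, 1.0],  # anti-correlated, predicted by 3 databases
    "GENE2": [1.0, 2.0, 3.0, 4.0],  # positively correlated
    "GENE3": [4.0, 3.0, 2.0, 1.0],  # anti-correlated, predicted by only 2 databases
}
all_dbs = {"miranda", "pictar", "targetscan"}
predicted = {
    ("miR-A", "GENE1"): all_dbs,
    ("miR-A", "GENE2"): all_dbs,
    ("miR-A", "GENE3"): {"miranda", "targetscan"},
}
hits = putative_targets(mir_expr, gene_expr, predicted)
```

With the placeholder data only the miR-A:GENE1 pair passes both filters. GENE2 fails the correlation threshold and GENE3 fails the database requirement.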

Download Results

In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.

[1] Betel D, Wilson M, Gabow A, Marks DS, Sander C, The microRNA.org resource: targets and expression, Nucleic Acids Res 36:D149-53 (2008)
[2] John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS, miRanda: Human MicroRNA targets, PLoS Biol 3(7):e264 (2005)
[3] Krek A, Grun D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, Piedade ID, Gunsalus KC, Stoffel M, Rajewsky N, Combinatorial microRNA target predictions, Nature Genetics 37:495-500 (2005)
[4] Chen K, Rajewsky N, Natural selection on human microRNA binding sites inferred from SNP data, Nat Genet 38:1452-1456 (2006)
[5] Lewis BP, Burge CB, Bartel DP, Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets, Cell 120(1):15-20 (2005)
[6] Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP, MicroRNA Targeting Specificity in Mammals: Determinants beyond Seed Pairing, Molecular Cell 27:91-105 (2007)
[7] Friedman RC, Farh KK, Burge CB, Bartel DP, Most Mammalian mRNAs Are Conserved Targets of MicroRNAs, Genome Research 19:92-105 (2009)
[8] Genovese G, Ergun A, Shukla SA, Campos B, Hanna J, Ghosh P, Quayle SN, Rai K, Colla S, Ying H, Wu CJ, Sarkar S, Xiao Y, Zhang J, Zhang H, Kwong L, Dunn K, Wiedemeyer WR, Brennan C, Zheng H, Rimm DL, Collins JJ, Chin L, microRNA regulatory network inference identifies miR-34a as a novel regulator of TGF-β signaling in glioblastoma, Cancer Discovery 2(8):736-49 (2012)