

What is miRWalk2.0 and how does it work?

miRWalk2.0 is an improved version of the earlier miRWalk database [1]. It is so far the only freely accessible, comprehensive archive that supplies the largest available collection of predicted and experimentally verified miRNA-target interactions, together with various novel and unique features that are missing from the previous version (miRWalk [1]) and from other resources [2-17], to assist the miRNA research community. Currently, it amalgamates miRNA-target interactions for human, mouse and rat. In addition, it provides miRNA-miRNA interactions for 15 species: human, orangutan, chimpanzee, monkey, mouse, rat, pig, chicken, dog, cow, frog, zebrafish, opossum, fruitfly and worm.

miRWalk2.0 not only documents miRNA binding sites within the complete sequence of a gene, but also combines this information with a comparison of binding sites resulting from 12 existing miRNA-target prediction programs: DIANA-microTv4.0 [2], DIANA-microT-CDS [3], miRanda-rel2010 [4], mirBridge [5], miRDB4.0 [6], miRmap [7], miRNAMap [8], doRiNA (i.e. PicTar2) [9], PITA [10], RNA22v2 [11], RNAhybrid2.1 [12] and Targetscan6.2 [13]. These are used to build novel comparative platforms of binding sites for the promoter (4 prediction datasets), CDS (5 prediction datasets), 5'-UTR (5 prediction datasets) and 3'-UTR (13 prediction datasets) regions. It also documents experimentally verified miRNA-target interaction information collected via an automated text-mining search and from existing resources that offer such information (miRTarBase [14], PhenomiR2.0 [15], miR2Disease [16] and HMDD [17]). A total of 13,650 publications on validated miRNA-target interactions are documented. It documents experimentally validated interactions on 3,081 miRNAs and reports more than 151,666,930 relationships associated with 19,395 genes, 1,955 DOs, 12 gene classes, 4,371 GOBPs, 1,331 GOMFs, 715 GOCCs, 6,463 HPOs, 4,087 OMIM disorders, 546 pathways, 28 protein classes, 450 diseases, 671 organs and 87 cell lines. In addition, it presents information on proteins known to be engaged in miRNA processing.



Figure 1:
miRWalk2.0 was developed with the aim of providing a public resource that supplies putative as well as experimentally verified miRNA interactions associated with the complete sequence of genes, mitochondrial genomes, other miRNAs, pathways, gene, disease and human phenotype ontologies, OMIM disorders, classes, cell lines and organs. The structure of miRWalk2.0 can be broadly divided into four sections: putative miRNA-target interactions, validated miRNA-target interactions, functional annotation and the web-interface. Briefly, first, all the genomic sequences (promoter, mitochondrial and miRNA) were downloaded and five prediction algorithms were executed locally to generate putative miRNA binding sites within the downloaded sequences. In parallel, 8 prediction datasets were gathered from existing resources and merged with the findings of the locally executed algorithms. Thereafter, these miRNA binding sites were separated into 6 different lists. Second, the experimentally verified miRNA-target interactions were retrieved via an automated text-mining survey of PubMed and from four databases that host such information. Third, functional annotation information such as pathways, ontologies and diseases was obtained to further dissect the putative as well as verified miRNA-target interactions. In the last step, the web-interface was designed to host the collated information, which is stored in a MySQL database (miRWalk2.0). The web-interface of miRWalk2.0 has two modules (the Predicted Target and the Validated Target modules) which can be interrogated to acquire miRNA-target interactions for human, mouse and rat. Moreover, external links have been integrated into the result pages, allowing users to obtain more annotation and information on queried genes, miRNAs, pathways, ontologies and/or diseases.



Figure 1. Roadmap of miRWalk2.0.


Why does miRWalk2.0 search possible miRNA binding sites within the complete sequence of a gene?

For more than a decade, attempts to study the interaction of miRNAs with their targets were limited to the mRNA 3'-UTR region. However, several investigators have recently suggested an alternative mode of gene regulation in which miRNAs anneal within the promoter, CDS, 5'-UTR and/or 3'-UTR regions of their targets, thereby regulating their translation [18-21]. Therefore, it is of paramount importance to search for possible miRNA binding sites within the complete sequence (promoter, 5'-UTR, CDS and 3'-UTR) of a gene.

To support such interactions, miRWalk2.0 offers possible miRNA interactions with all the regions of a gene by gathering 13 prediction datasets from existing miRNA-target resources [1-13]. These 13 prediction datasets are preprocessed and unified, and the processed information is then used to build novel comparative platforms of miRNA interactions, enabling users to access new targets in the promoter, CDS, 5'-UTR and/or 3'-UTR regions.


What does miRWalk2.0 cover?


miRWalk2.0 novelties are as follows:

How do I use the miRWalk2.0 database?

The web-interface of miRWalk2.0 is broadly classified into the Predicted Target (PTM) and the Validated Target (VTM) modules. These two modules are further categorized into different search pages, allowing users to fetch miRNA associated information using different identifiers.

Search methods implemented under the PTM: Gene-miRNA Targets search page

Figure 2:

Step1. Select a species, database and input identifier type from the given drop-down menus (Figure 2) and either paste or upload a list of identifiers.

Step2. Select at least one check box to obtain information on input identifiers and their functional association.

Step3. Select the starting position (from 1 to 6) of the miRNA seed and the region(s) of the input genes in which to search for possible miRNA binding sites (a maximum of 10 kb, i.e. 10,000, is allowed for the promoter region). Enter the minimum seed length of the miRNA and/or a P-value, and choose at least two algorithms to obtain a comparative overview of the miRNA binding sites resulting from the 13 different prediction datasets within the promoter, 5'-UTR, CDS and 3'-UTR regions.

Step4. Click on the "SEARCH" button to execute the query.
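
The constraints stated in Steps 1 to 4 can be summarized as a small validation routine. The sketch below is illustrative only: the class `GeneMirnaQuery`, its field names and the promoter default of 2,000 are hypothetical and do not correspond to any miRWalk2.0 interface. Only the limits themselves (seed start 1 to 6, promoter length up to 10,000, at least two algorithms, minimum seed length 7 and default p-value 0.05) are taken from this help page.

```python
from dataclasses import dataclass

REGIONS = {"promoter", "5'-UTR", "CDS", "3'-UTR"}
MAX_PROMOTER_LENGTH = 10_000  # upper limit stated in Step 3


@dataclass
class GeneMirnaQuery:
    """Hypothetical container for Gene-miRNA Targets search settings."""
    species: str
    identifiers: list
    regions: set
    algorithms: set
    seed_start: int = 1           # Step 3 allows 1 to 6
    min_seed_length: int = 7      # 7 nt is the stated lower bound
    promoter_length: int = 2_000  # assumed default, not from the source
    p_value: float = 0.05         # default p-value stated on this page

    def problems(self):
        """Return a list of violated constraints (empty if the query is valid)."""
        issues = []
        if not self.identifiers:
            issues.append("no identifiers supplied")
        if not self.regions or not self.regions <= REGIONS:
            issues.append("regions must be chosen from promoter, 5'-UTR, CDS, 3'-UTR")
        if not 1 <= self.seed_start <= 6:
            issues.append("seed start must be between 1 and 6")
        if self.min_seed_length < 7:
            issues.append("minimum seed length must be at least 7")
        if "promoter" in self.regions and not 0 < self.promoter_length <= MAX_PROMOTER_LENGTH:
            issues.append("promoter length must not exceed 10,000")
        if len(self.algorithms) < 2:
            issues.append("select at least two algorithms")
        if not 0 < self.p_value <= 1:
            issues.append("p-value must lie in (0, 1]")
        return issues
```

Checking the settings before submitting a long identifier list avoids a failed query; an empty list from `problems()` means every stated constraint is met.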




Figure 2. Gene-miRNA search page.



Figure 3:
In the Gene-miRNA Targets search, a tabulated result page (Figure 3) is presented with links for the gene (Figure 4), allowing users to retrieve data including gene information (Figure 4), genomic location (Figure 5), gene synonyms, RefseqIDs and homologous information (Figure 6), external links (Figure 7), information on gene and protein classes (Figure 8), functional association (Figure 9) and miRNA binding sites predicted with different combinations of algorithms (Figure 10). Additionally, information on human homologous genes across 15 species can be downloaded to conduct an interspecies analysis (Figure 6). External links (Figure 7) are also provided, permitting the user to obtain data on phenotype, genotype, SNPs, splice junctions, functional networks, neighbouring genomic members, expression of genes and proteins in human organs, their MS/MS spectra and relevant PubMed articles. This page offers a one-stop place to collect an abundance of information on queried genes (Figure 3).



Figure 3. Gene-miRNA result page.



Figure 4:
By clicking on the GeneTab link (Figure 3), a user can gather basic information (such as EntrezIDs, chromosome, map and definition) on queried genes (Figure 4). The table can be downloaded with a single click on the "Download Table" link. The contents of this result table are hyperlinked to external databases (Gene and Taxonomy at NCBI) to obtain further information.



Figure 4. Gene information table.



Figure 5:
By clicking on the "Gene Location" link (Figure 3), one can obtain information on genomic location (such as ContigID, start and end positions, chromosome, map and strand) and epigenomics for queried identifiers (Figure 5). The table can be downloaded with a single click on the "Download Table" link. It is hyperlinked to external databases (Gene, Nucleotide and Epigenomics at NCBI) to get additional information.



Figure 5. Gene location table.



Figure 6:
By clicking on the "Synonymous", "Refseq Table" and "Homologous Table" links (Figure 3), a user can collect information on synonyms (such as genes, synonyms, EnsemblIDs, RefseqIDs [44], UCSCIDs, VegaIDs [45], UniGeneIDs, LocusTagIDs, RefseqPIDs, HGNCIDs, UniProtIDs, OMIMIDs and UniSTS), mRNAs (RefseqIDs, CDS start and end positions and the length of mRNAs) and homologs (a comprehensive atlas of human homologous genes among 15 different species) for queried genes (Figure 6). These tables can be downloaded with a single click on the "Download Table" links. They are hyperlinked to Gene and Nucleotide (Refseq) at NCBI to get additional information.



Figure 6. Information on synonymous, mRNAs and homologous genes.



Figure 7:
By clicking on the "External links" (Figure 3), users can obtain information on their genes of interest from several databases via the given links (Figure 7). These external databases are UniGene, HGNC, OMIM, Ensembl, UCSC, AceView, DGV, CCDS, Genotype, ClinVar, dbVar, PheGenl, GeneMania, Nucleotide, EST, Probe, Protein, CDD, GEO, ProteomicsDB, Human Proteome Map (HPM), UniProt, PubChem Compound, PMC and PubMed. All these external links can be downloaded by clicking on the "Download Table" link.



Figure 7. External database links.



Figure 8:
Users can retrieve information on gene and protein classes (Figure 8) associated with their input identifiers by clicking on the "Gene" and "Protein" class links (Figure 3). Moreover, a comparative overview of protein and gene classes among 15 different species can be viewed and/or downloaded. The "Gene" and "Class" fields are also hyperlinked to external databases (Gene and Panther) to obtain further information on input genes and their protein classes.



Figure 8. Information on gene and protein classes.



Figure 9:
One can fetch information on pathways and ontologies associated with queried identifiers (Figure 9) by clicking on "KEGG", "WIKI", "Panther", "GOBP", "GOMF" and "GOCC" links (Figure 3). Moreover, comparative overviews of pathways and ontologies among 15 different species can be viewed and/or downloaded. The "Gene", "KEGG", "Wiki", "Panther" and "GO" fields are also hyperlinked with external databases (Gene, KEGG, WikiPathways, Panther and Gene Ontology) to obtain further information.



Figure 9. Information on genes associated with pathways (e.g., KEGG, Wiki and/or Panther) and ontologies (GO).



Figure 10:
One can obtain the possible miRNA binding sites within the complete sequence of genes resulting from the miRWalk algorithm and 12 other prediction datasets (Figure 10) by clicking on the Promoter, 5'-UTR, CDS and 3'-UTR links integrated in the result page (Figure 3). "Green" and "red" cells in the comparative platforms indicate whether a given miRNA-target interaction is "predicted" or "not predicted", respectively. These tables can be downloaded at any time by clicking on the "Download Table" links. The "Gene", "RefseqID" and "miRNA" fields are also hyperlinked to external databases (Gene, Nucleotide and miRBase, respectively) for further annotation.
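
The comparative platform described here is essentially a presence/absence matrix over algorithms. The following is a minimal sketch of how such a table could be assembled; the function name `comparative_table`, the input layout and the toy identifiers are assumptions made for illustration and say nothing about how miRWalk2.0 stores or renders its data.

```python
def comparative_table(predictions):
    """Build a presence/absence table of miRNA-target pairs per algorithm.

    predictions: dict mapping an algorithm name to a set of
                 (mirna, gene) pairs predicted by that algorithm.
    Returns (algorithms, rows); each row is (mirna, gene, flags, total),
    where flags[i] is 1 ("predicted", a green cell) or 0 ("not predicted",
    a red cell) for algorithms[i], and total is the number of algorithms
    supporting the pair.
    """
    algorithms = sorted(predictions)
    all_pairs = sorted(set().union(*predictions.values()))
    rows = []
    for mirna, gene in all_pairs:
        flags = [1 if (mirna, gene) in predictions[a] else 0 for a in algorithms]
        rows.append((mirna, gene, flags, sum(flags)))
    return algorithms, rows


# Toy input with placeholder names (not real predictions).
toy = {
    "algoA": {("miR-X", "GENE1"), ("miR-X", "GENE2")},
    "algoB": {("miR-X", "GENE1")},
}
algorithms, rows = comparative_table(toy)
```

The per-row total makes it easy to sort or filter pairs by the number of supporting algorithms.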



Figure 10. miRNA-target interactions.



Figure 11:
The MicroRNA-target search page (Figure 11) is organized similarly to the "Gene-based" interface (Figure 2). A user can carry out "miRNA-based" searches by selecting a species, database and type of identifier; providing miRNA identifiers; picking result tables; selecting search parameters such as promoter, 5'-UTR, p-value and external databases; choosing functional annotations; and clicking on the "SEARCH" button to execute the query (as shown in Figure 11).



Figure 11. miRNA information retrieval system.



Figure 12:
The "miRNA-based" result page (Figure 12) is organized in a similar manner to the "Gene-based" result page (Figure 3). It hosts a multi-layered view of information on input miRNAs: sequences, accessions, families, other miRNAs having similar seeds, sequence alignment, host gene and other necessary data, as well as lists of putative targets and statistically enriched pathways, ontologies, and gene and protein classes (Figures 12 to 14). Tables integrated into the "miRNA-based" result page are hyperlinked to the miRBase, Gene, KEGG, WikiPathways, Panther, GO and Taxonomy databases to gather further annotation data.



Figure 12. Result page of miRNA information retrieval system.



Figure 13:
Users can collect information on their miRNAs of interest, for example which other miRNAs have a similar sequence or similar seeds, data on their families along with their identities, alignments of family members and miRNA host-gene information (Figure 13). These tables are hyperlinked to miRBase for further annotation of queried miRNAs.



Figure 13. Information on miRNAs, similar sequences, similar seeds, families, alignment and miRNA host-gene.



Figure 14:
Users can assemble data such as pre-miRNAs, pre-miRNA aligned profiles, identity of pre-miRNA aligned profile, possible targets predicted by 13 different prediction data-sets and enriched miRNAs within different pathways, ontologies and classes (Figure 14) on their miRNAs of interest. These tables are hyperlinked with miRBase, Gene, Refseq, KEGG, WikiPathways, Panther and GO for further information.



Figure 14. Information on pre-miRNAs, alignment and miRNAs enriched within different pathways.



Figure 15:
Recently, miRNAs have been shown to base-pair with other miRNAs [23]. These observations may not only help to understand the complexity of regulatory networks, but also can open new avenues to better understand how these regulators fine-tune each other to maintain the integrity of a cell. Nonetheless, this information is missing in the existing resources. This information is therefore generated and integrated into miRWalk2.0 with the help of a "miRNA-miRNA-based" search page (Figure 15). One can gather basic information such as miRNA identifiers, sequences, alignments, miRNA host-gene and miRNA-miRNA binding sites predicted by the miRWalk algorithm using the "miRNA-miRNA-based" search. In addition, a comprehensive platform is presented to offer a comparative overview of miRNA-miRNA binding sites (Figure 15) on queried miRNAs. These tables are hyperlinked with miRBase for further information.



Figure 15. miRNA:miRNA interaction query and result pages.


Figure 16:
Using the "Gene-miRNA-pathway Targets" or "Pathway information retrieval system" search, the users can gather putative miRNA binding sites within the complete sequence of all genes belonging to one or more queried pathways (a maximum of 10 is allowed). In addition, it is also possible to obtain a list of genes associated with a given pathway and to collect miRNAs which are enriched for their binding sites within these pathways (Figure 16). These tables are hyperlinked with miRBase, Gene, Refseq, KEGG and WikiPathways for further information.
Other search methods ("Gene-class Targets", "Chromosome Targets", "Gene-miRNA-OMIM Targets", "Disease Targets" and "Human Phenotype Ontologies (HPO) Targets") are organized in a similar manner to the "Gene-miRNA-Pathway Targets" search and result pages.



Figure 16. Pathway interaction query and result pages.



Figure 17:
Using the "Mitochondrial Targets" search page, one can fetch putative miRNA binding sites within the complete mitochondrial genome by selecting a species of interest from the drop-down menu (Figure 17). The Mitochondrial Targets result page is organized similarly to the other result pages. Users can obtain information on mitochondrial genes, their association with pathways and putative miRNA binding site predictions, as well as a comparative view resulting from 5 different prediction datasets (Figure 17).



Figure 17. Mitochondrial interaction query and result pages.



Figure 18:
Large-scale experiments such as next-generation sequencing or transcriptomic profiling produce large amounts of data (> 1,000 significant genes/miRNAs). Still, no single miRNA resource is available that either allows users to perform functional enrichment analysis on all the significant candidates at once or supplies a way to download customized datasets for stand-alone tools such as GSEA [28] and DAVID [46]. To foster large-scale enrichment analysis, a novel feature named "Customized data-sets" is implemented within miRWalk2.0, through which users can generate a customized list of putative targets for their miRNAs of interest from 13 different datasets for the promoter, CDS, 5'-UTR and/or 3'-UTR regions (Figure 18).



Figure 18. Customized data-set query and result pages.



Figure 19:
Previous studies suggest that several mammalian miRNA genes are co-expressed with their host gene and/or neighbouring genes by utilizing their transcriptional machinery, and that they promote synergistic and/or antagonistic effects on them (Figure 19). miRWalk2.0 provides a genomic location search for genes to determine which miRNA(s) share the same or a nearby location (Figure 19). A list of disease-specific or significant genes obtained from a microarray profiling study can be interrogated to find miRNAs that may be expressed with the queried genes and could be involved in the genetic regulation of a specific condition. Further, one can use this information to choose miRNAs that are located near or within the highly differentially regulated genes (Figure 19) and perform qPCR experiments to validate potential miRNAs without relying on miRNA microarray profiling studies.
Moreover, it is possible to obtain a list of all miRNAs located within the exon, introns, 5'- and/or 3'-UTR regions of human, mouse and rat genomes by selecting check-boxes given on the Genomic Location Search page (Figure 19).
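
The idea of finding miRNAs at the same or a nearby location as a gene reduces to an interval-overlap test on one chromosome. The sketch below is a hedged illustration: the `Feature` record, the `flank` parameter and the toy coordinates are hypothetical, and the actual distance criteria used by the Genomic Location Search are not stated on this page.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A genomic feature with 1-based, inclusive coordinates."""
    name: str
    chrom: str
    start: int
    end: int


def mirnas_near(gene, mirnas, flank=0):
    """Return names of miRNAs that overlap the gene extended by `flank` bases.

    With flank=0 only miRNAs inside or overlapping the gene are reported;
    a positive flank also captures neighbouring miRNAs.
    """
    lo, hi = gene.start - flank, gene.end + flank
    return [m.name for m in mirnas
            if m.chrom == gene.chrom and m.start <= hi and m.end >= lo]


# Toy coordinates with placeholder names (not real annotations).
gene = Feature("GENE1", "chr1", 1000, 5000)
mirnas = [
    Feature("miR-A", "chr1", 1200, 1280),  # inside the gene
    Feature("miR-B", "chr1", 5500, 5580),  # 500 bases downstream
    Feature("miR-C", "chr2", 1200, 1280),  # different chromosome
]
```

Strand is deliberately ignored here; a stricter variant could require the miRNA and host gene to share a strand.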



Figure 19. Genomic location search and result pages.


Why does miRWalk2.0 supply miRNA binding sites within all the regions of a gene?

According to current understanding, a new mode of action of miRNAs has been shown, through which they may regulate gene expression by binding to the promoter as well as to the coding sequence [18-21]. Therefore, it is of paramount importance to search for possible miRNA binding sites within the complete sequence (promoter, 5'-UTR, CDS and 3'-UTR) of a gene.

To incorporate such interactions, we have generated possible miRNA interactions with all the regions of a gene by gathering 13 prediction datasets from the existing miRNA-target resources [1-13]. These interactions are documented in miRWalk2.0, enabling users to access new targets in the promoter, CDS, 5'-UTR and/or 3'-UTR regions.


Does miRWalk2.0 integrate all transcripts encoded by a gene?

Yes, miRWalk2.0 integrates all transcripts encoded by a gene. It has previously been shown that a gene can encode different transcripts of different lengths due to alternative splicing. For example, the TP63 gene is known to encode six different transcripts with 5'-UTR, CDS and 3'-UTR regions of varying length.


What does "other databases" mean?

After the complete sequence of all genes/miRNAs (including mitochondrial genomes) of human, mouse and rat is scanned for possible miRNA binding sites using the "miRWalk algorithm", the prediction datasets resulting from 12 databases are gathered to build novel comparative platforms for comparing results. Indeed, it has become common practice to consider the union and/or intersection of miRNA-target interactions resulting from multiple algorithms [50-56]. Therefore, miRWalk2.0 supplies novel platforms of miRNA-target interaction information for the promoter, 5'-UTR, CDS, 3'-UTR, mitochondrial genomes and miRNA-miRNA pairs.
It is important to select at least two algorithms with logical operators (OR or AND) to obtain a comparative view.
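
The OR and AND operators correspond to the union and intersection of the prediction sets of the selected algorithms. A minimal sketch under that reading follows; the function name `combine`, the input layout and the toy data are hypothetical and are not part of miRWalk2.0.

```python
def combine(predictions, algorithms, operator="AND"):
    """Combine miRNA-target predictions of the selected algorithms.

    predictions: dict mapping an algorithm name to a set of (mirna, gene) pairs.
    algorithms:  names of the algorithms to compare (at least two).
    operator:    "AND" keeps pairs predicted by every selected algorithm
                 (intersection); "OR" keeps pairs predicted by any of them (union).
    """
    if len(algorithms) < 2:
        raise ValueError("select at least two algorithms")
    selected = [set(predictions[name]) for name in algorithms]
    if operator == "AND":
        return set.intersection(*selected)
    if operator == "OR":
        return set.union(*selected)
    raise ValueError("operator must be 'AND' or 'OR'")


# Toy input with placeholder names (not real predictions).
toy_predictions = {
    "algoA": {("miR-X", "GENE1"), ("miR-X", "GENE2")},
    "algoB": {("miR-X", "GENE1"), ("miR-X", "GENE3")},
}
```

AND gives a shorter, more conservative list, while OR maximizes coverage at the cost of more candidates.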


What is minimum seed length?

The minimum seed length is the minimum number of nucleotides (nt) of the miRNA seed sequence (counted from the 5' end) through which a miRNA can bind to its targets, i.e. the promoter, 5'-UTR, CDS, 3'-UTR and/or another miRNA.
Binding sites cannot be searched with a seed shorter than 7 nt. Therefore, a user should enter a value of at least 7 in the given text box.
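
To make the seed parameters concrete, the sketch below extracts a seed of a given start position and length from a miRNA and looks for its perfect Watson-Crick complement in a target sequence. It is a simplified illustration, not the miRWalk implementation: the function names and the toy sequences are hypothetical, and G:U wobble pairs are deliberately not considered.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=1, length=7):
    """Return the target sequence (5'->3') that pairs perfectly with the seed.

    start:  1-based position of the first seed nucleotide from the miRNA 5' end
            (1 to 6, as on the search page).
    length: seed length in nt (at least 7).
    """
    if length < 7:
        raise ValueError("minimum seed length is 7 nt")
    if not 1 <= start <= 6:
        raise ValueError("seed start must be between 1 and 6")
    mirna = mirna.upper().replace("T", "U")
    seed = mirna[start - 1:start - 1 + length]
    if len(seed) < length:
        raise ValueError("miRNA is too short for the requested seed")
    # Reverse complement: the miRNA 5' end faces the 3' end of the site.
    return seed.translate(COMPLEMENT)[::-1]


def find_sites(target, mirna, start=1, length=7):
    """Return 0-based start positions of perfect seed matches in the target."""
    site = seed_site(mirna, start, length)
    target = target.upper().replace("T", "U")
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]


# Made-up sequences for illustration only.
toy_mirna = "UAGCUUAUCAGACUG"
toy_target = "GGUAAGCUACC"
```

Raising the minimum seed length makes a random match less likely and therefore shortens the list of candidate sites.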


What is p-value?

The probability distribution of random matches of a subsequence (from the 5' end of the miRNA sequence) in a given sequence (gene, miRNA and/or mitochondrial genome sequence) is calculated using the Poisson distribution. A low probability implies a significant hit. More information on the Poisson distribution is given in [1-13].
The default p-value is set to 0.05.
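
As a rough illustration of a Poisson-based p-value, the sketch below estimates the probability of seeing at least one random match of a given length in a sequence. The parameterization is an assumption made for this example (uniform base frequencies, expected count equal to the number of possible positions times 0.25 per matched nucleotide); the exact formulation used by miRWalk is described in the cited references and may differ.

```python
import math


def poisson_p_value(seq_length, match_length, base_prob=0.25):
    """Probability of at least one random match under a Poisson model.

    seq_length:   length of the scanned sequence in nt.
    match_length: length of the matching miRNA subsequence in nt.
    base_prob:    assumed probability of a given base at any position.
    """
    positions = seq_length - match_length + 1
    if positions <= 0:
        return 0.0
    # Expected number of random occurrences of the subsequence.
    lam = positions * base_prob ** match_length
    # P(X >= 1) = 1 - exp(-lam), computed stably for small lam.
    return -math.expm1(-lam)
```

Under this model a longer match or a shorter scanned sequence lowers the p-value, so longer complementary stretches are less likely to be chance hits.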


A list of resources which are utilized in miRWalk2.0

We sincerely acknowledge all publicly available data sources (listed below) which have been used in miRWalk2.0.

Resource | Version | Information | Link | Reference

Information on genes, and their synonymous, identifiers and sequences

NCBI | April 2014 | symbols & identifiers | Click Here |
Refseq | 61 | mRNA sequences | Click Here | [44]
Organelle Genome Resources | July 2014 | Mitochondrial genome | Click Here |
Ensembl | May 2014 | Promoter (10kb upstream flanking region) | Click Here | [35]
HomoloGene | May 2014 | Homologous genes | Click Here |
MGI | April 2014 | genes & identifiers | Click Here | [47]
RGD | April 2014 | genes & identifiers | Click Here | [48]
HGNC | April 2014 | genes & identifiers | Click Here | [34]

Information on miRNAs, and their synonymous and identifiers

miRBase | Release 10 to 20 | miRNAs, synonymous & sequences (only rel20) | Click Here | [36]
NCBI | April 2014 | names & EntrezIDs | Click Here |
MGI | April 2014 | Identifiers | Click Here | [47]
RGD | April 2014 | Identifiers | Click Here | [48]
HGNC | April 2014 | Identifiers | Click Here | [34]
EMBL | April 2014 | Identifiers | Click Here |
RFAM | April 2014 | Identifiers | Click Here | [49]

Functional annotation information

DAVID | 6.7 | KEGG pathways and their gene-sets | Click Here | [46]
PANTHER | 9 | Pathways, protein classes and their gene-sets | Click Here | [25]
WikiPathways | April 2014 | Pathways and their gene-sets | Click Here | [26]
GSEA | 2.0.14 | Gene classes and their gene-sets | Click Here | [28]
DGV | July 2013 | CNV genes | Click Here | [38]
GO | April 2014 | GOBP, GOMF, GOCC and their gene-sets | Click Here | [21]
OMIM | July 2014 | OMIM disorders and their gene-sets | Click Here | [32]
HPO | Build 553 | Human Phenotype Ontologies and their gene-sets | Click Here | [30]
DO | September 2014 | Diseases and their gene-sets | Click Here | [29]

Putative miRNA-target interaction information

Diana-microT | 4.0 and 5.0 | miRNA binding sites within 3'-UTR | Click Here | [2]
Diana-microT-CDS | 5.0 | miRNA binding sites within CDS | Click Here | [3]
miRanda | August 2010 | Locally executed to identify miRNA binding sites within the complete sequence | Click Here | [4]
miRBridge | 4.0 | miRNA binding sites within 3'-UTR | Click Here | [5]
miRDB | 4.0 | miRNA binding sites within 3'-UTR | Click Here | [6]
miRMap | 2013 | miRNA binding sites within 3'-UTR | Click Here | [7]
miRNAMap | 2008 | miRNA binding sites within 3'-UTR | Click Here | [8]
doRiNA (PICTAR2) | Version 2 | miRNA binding sites within 3'-UTR | Click Here | [9]
PITA | 2007 | miRNA binding sites within CDS, 5'- & 3'-UTR, and miRNA (locally executed) | Click Here | [10]
RNA22 | version 2 | miRNA binding sites within CDS, 5'- & 3'-UTR | Click Here | [11]
RNAhybrid | 2.1 | Locally executed to identify miRNA binding sites within the complete sequence | Click Here | [12]
Targetscan | 6.1 | Locally executed to identify miRNA binding sites within the complete sequence | Click Here | [13]

Validated miRNA-target interaction information

miRTarBase | 4.0 | miRNA-target interactions | Click Here | [14]
PhenomiR | 2.0 | miRNA-disease interactions | Click Here | [15]
miR2Disease | 2008 | miRNA-disease interactions | Click Here | [16]
HMDD | 2.0 | miRNA-disease interactions | Click Here | [17]
PubMed | September 2014 | miRNA interactions with miRNAs, genes, diseases, cell lines, organs, processing proteins | Click Here |


A list of resources which are hyperlinked with the result tables of miRWalk2.0

We sincerely acknowledge all the useful data sources (see the table below) which have been hyperlinked with miRWalk2.0.

Resource Link
Gene Click Here
UniGene Click Here
Nucleotide Click Here
HGNC Click Here
RGD Click Here
MGI Click Here
VEGA Click Here
OMIM Click Here
Ensembl Click Here
UCSC Click Here
AceView Click Here
DGV Click Here
CCDS Click Here
Genotype Click Here
ClinVar Click Here
dbVar Click Here
PheGenl Click Here
Neighbourhood Click Here
GeneMania Click Here
EST Click Here
Probe Click Here
Epigenomics Click Here
Protein Click Here
CDD Click Here
GEO Click Here
ProteomicsDB Click Here
HPM Click Here
UniProt Click Here
PubChem Compound Click Here
PMC Click Here
PubMed Click Here
miRBase Click Here
RFAM Click Here
KEGG Click Here
PANTHER Click Here
WikiPathways Click Here
DO Click Here
HPO Click Here
Taxonomy Click Here

What is the current status of miRWalk2.0?

Currently, the PTM of miRWalk2.0 hosts putative interaction information between more than 11,740 miRNAs and genes, miRNAs, mitochondrial genomes of human, mouse, and rat resulting from 13 different prediction datasets. In addition, it supplies the predicted miRNA binding sites on genes linked to biological pathways, gene ontologies, diseases, OMIM disorders, human phenotype ontologies, gene and protein classes.

In the VTM, more than 13,650 publications on miRNAs are documented. This module documents experimentally validated interactions on 3,081 miRNAs and reports more than 151,666,930 relationships associated with 19,395 genes, 1,955 DOs, 12 gene classes, 4,371 GOBPs, 1,331 GOMFs, 715 GOCCs, 6,463 HPOs, 4,087 OMIM disorders, 546 pathways, 28 protein classes, 450 diseases, 671 organs and 87 cell lines. In addition, it presents information on proteins known to be engaged in miRNA processing. This module was last updated on 29th September 2014.

Categories | Human | Mouse | Rat | Total

General information
Genes | 20,022 | 22,232 | 22,817 | 308,700
mRNAs | 65,520 | 28,916 | 28,928 | 512,412
miRNAs | 2,578 | 1,908 | 728 | 11,748
Identifiers | 994,135 | 684,553 | 364,268 | 5,077,757

Functional annotation information
KEGG pathways | 204 | 197 | 196 | 2,701
Panther pathways | 160 | 149 | 147 | 2,014
WikiPathways | 226 | 146 | 150 | 1,519
Gene ontologies (GO) | 7,506 | 5,441 | 5,447 | 49,035
Disease ontologies (DO) | 2,035 | NA | NA | 2,035
Human Phenotype ontologies (HPO) | 6,727 | NA | NA | 6,727
OMIM disorders | 4,980 | NA | NA | NA
Gene classes | 12 | 12 | 12 | 180
Protein classes | 29 | 27 | 29 | 430

Putative miRNA-target interaction information
Promoter (5 algorithms) | 146,354,554 | 123,529,954 | 38,037,935 | 307,922,443
5'-UTR (5 algorithms) | 71,409,379 | 18,621,336 | 3,663,590 | 93,694,305
CDS (5 algorithms) | 143,634,119 | 25,667,955 | 16,594,605 | 185,896,679
3'-UTR | 127,216,865 | 34,970,008 | 9,253,524 | 171,440,397
miRNA-miRNA (5 algorithms) | 2,116,365 | 1,559,205 | 579,638 | 9,747,305
Mitochondrial (5 algorithms) | 30,372 | 20,745 | 7,805 | 58,922

Validated miRNA-target interaction information

Category | Total (N) | Interactions | Articles | Genes (14 species) | miRNAs (14 species)
Genes | 19,395 | 3,511,084 | 8,866 | 19,395 | 3,081
miRNAs | 3,081 | 3,511,084 | 9,282 | 19,395 | 3,081
Diseases | 450 | 209,397 | 3,650 | 3,656 | 2,347
KEGG pathways | 200 | 4,715,062 | 4,783 | 6,200 | 2,454
PantherDB pathways | 151 | 1,698,749 | 3,777 | 3,018 | 2,351
WikiPathways | 195 | 2,939,796 | 4,981 | 4,609 | 2,296
GOBP | 4,371 | 26,139,879 | 6,108 | 16,581 | 2,597
GOMF | 1,331 | 10,533,323 | 6,068 | 16,414 | 2,598
GOCC | 715 | 11,395,932 | 6,128 | 17,682 | 6,128
Gene classes | 12 | 3,510,396 | 6,144 | 19,377 | 2,605
Protein classes | 28 | 3,306,759 | 5,389 | 9,242 | 2,515
DOs | 1,955 | 17,101,736 | 5,491 | 6,085 | 2,276
HPOs | 6,463 | 64,456,044 | 3,832 | 2,364 | 2,080
OMIM disorders | 4,087 | 2,133,681 | 4,111 | 2,543 | 2,124
Organs | 671 | 182,863 | 8,441 | NA | 2,659
Cell lines | 87 | 15,092 | 1,156 | NA | 1,422
Articles | 13,650


What are the future plans of miRWalk2.0?

More annotations and additional species will be integrated to further expand this resource.

Please let us know if you encounter problems while using miRWalk2.0, or if you have suggestions for improving the user interface or for new features to incorporate into this resource.

To obtain further information about miRWalk2.0, please contact: miRWalkTeam at mirwalkteam@medma.uni-heidelberg.de


How does miRWalk2.0 store all the putative targets resulting from 13 prediction datasets?

All the possible targets predicted by the established miRNA-target prediction programs (3rd party algorithms) are stored in the miRWalk database with no threshold or filter applied. Currently, miRWalk2.0 hosts all the putative targets of the 3rd party algorithms, both those that match and those that do not match the miRWalk prediction data.


What is the miRWalk algorithm?

In 2011, we developed the miRWalk algorithm [1] to identify all possible interactions between miRNA and gene sequences. Briefly, based on Watson-Crick complementarity, it walks along the complete gene sequence and mitochondrial genome, starting with a miRNA seed of 7 nt (heptamer), and identifies possible miRNA binding sites, extending each match as far as possible on the complete sequence of all known genes. It returns all the identified bindings and then assigns these miRNA binding sites to four regions (promoter, 5'-UTR, CDS and 3'-UTR) of protein coding genes and mitochondrial genes. In addition, the probability distribution of random matches of a subsequence (from the 5' end of the miRNA sequence) in the analyzed sequence is calculated using the Poisson distribution [12]. In the next step, miRWalk compares its identified miRNA binding sites with the results of 8 established miRNA-target prediction programs, i.e. DIANA-microT, miRanda, miRDB, PicTar, PITA, RNA22, RNAhybrid and TargetScan/TargetScanS. Finally, it incorporates all the predicted miRNA binding sites produced by the miRWalk algorithm and the 8 established programs into a relational database (miRWalk). Thereafter, it performs an automated text-mining search of PubMed titles/abstracts to retrieve experimentally verified information on human, mouse and rat miRNAs and their interactions linked to genes, pathways, diseases, organs, cell lines, OMIM disorders and proteins known to be involved in miRNA processing. This information is compiled and stored as experimentally verified miRNA-target interactions in the miRWalk database.
Information on the predicted as well as validated miRNA-target interactions is generated by executing automated Perl and BioPerl scripts on the server of bwGRID Cluster Heidelberg (High Performance Cluster). Please read Dweep et al. [1] for more information on the miRWalk algorithm.
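
The "walking" step can be pictured as follows: wherever the first 7 nt of the miRNA pair with the target, the match is extended along the miRNA for as long as consecutive Watson-Crick pairs continue. The code below is a simplified Python sketch of that idea only; it is not the published Perl/BioPerl implementation, and the function name `walk`, the return format and the toy sequences are assumptions made for illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick pairs only


def walk(target, mirna, seed_len=7):
    """Find seed matches and extend them along the miRNA.

    Returns a list of (site_start, paired_length) tuples, where site_start is
    the 0-based start of the site in the target and paired_length is the number
    of consecutive miRNA nucleotides (from the 5' end) that pair with it.
    """
    target = target.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    hits = []
    for end in range(seed_len, len(target) + 1):
        # miRNA nucleotide j (0-based from the 5' end) faces target[end - 1 - j],
        # because the two strands pair in antiparallel orientation.
        length = 0
        while (length < len(mirna)
               and end - 1 - length >= 0
               and PAIR.get(mirna[length]) == target[end - 1 - length]):
            length += 1
        if length >= seed_len:
            hits.append((end - length, length))
    return hits


# Made-up sequences for illustration only.
toy_mirna = "UAGCUUAUCAGACUG"
short_site = "GGUAAGCUACC"   # pairs with the first 7 nt of toy_mirna
longer_site = "GAUAAGCUACC"  # pairs with the first 9 nt of toy_mirna
```

Assigning each reported site to the promoter, 5'-UTR, CDS or 3'-UTR would then only require the region boundaries of the scanned transcript.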


A list of customized dataset files for downloading

All the customized datasets are available for download in two popular, ready-to-use file formats (Rdata and GMT) via the Holistic view search page implemented under the PTM of miRWalk2.0.
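
For orientation, a GMT file as used by GSEA is a tab-separated text file with one gene set per line: set name, a description field, then the member genes. The sketch below writes miRNA target sets in that general layout; the function name `to_gmt`, the placeholder description and the toy identifiers are assumptions, and the exact content of the files offered by miRWalk2.0 is not specified on this page.

```python
def to_gmt(target_sets, description="na"):
    """Serialize miRNA target sets as GMT text.

    target_sets: dict mapping a miRNA name to an iterable of target genes.
    description: value for the second (description) column of each line.
    """
    lines = []
    for mirna in sorted(target_sets):
        genes = sorted(set(target_sets[mirna]))
        lines.append("\t".join([mirna, description] + genes))
    return "\n".join(lines) + "\n"


# Toy input with placeholder names (not real predictions).
toy_sets = {"miR-X": {"GENE2", "GENE1"}, "miR-Y": {"GENE3"}}
gmt_text = to_gmt(toy_sets)
```

Writing `gmt_text` to a file with a `.gmt` extension yields one gene set per miRNA that stand-alone enrichment tools can read.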


How can I reduce the number of putative target genes on my miRNAs of interest?

Many computational approaches have been developed using different search rules (e.g. base-pairing, thermodynamic stability, conservation, and cooperativity and multiplicity of miRNA binding sites) to identify possible miRNA-target interactions [1-4]. These algorithms have proven useful; however, comparative investigations suggest that no program is consistently superior to all others [5-6]. To overcome this issue, researchers have started focusing on prediction information generated by combinations of different programs [7-9]. This approach has become very popular and has been applied in hundreds of publications [7-9]. To explore whether combining algorithms acts as a stringent filter, we estimated the median number of targets within the 3'-UTR region for human (hsa), mouse (mmu) and rat (rno) miRNAs with different numbers of algorithms (at least 2 to 10). Considering only interactions predicted by at least 2 algorithms, the median number of targets for hsa, mmu and rno was 7967, 7793.5 and 3857, respectively. A rapid decrease in the median values was observed as the number of algorithms increased (Figure 20a). For example, with at least 4 algorithms, the median values decreased to 3865 (hsa), 2724 (mmu) and 1146 (rno). With at least 6 algorithms, the median values decreased further to 1000, 1504 and 66 for hsa, mmu and rno, respectively.

Figure 20:
Reduction of the median (targets) values within (a) 3'-UTR, (b) promoter, (c) 5'-UTR and (d) CDS regions as the number of algorithms is increased.



Figure 20. Reduction of the median (targets) values.


Similarly, this filtering criterion was applied to the other regions, namely the promoter (2 kb) (Figure 20b), 5'-UTR (Figure 20c) and CDS (Figure 20d), to assess changes in the median values. The median values decreased with an increasing number of algorithms in a fashion similar to that observed for the 3'-UTR region. For instance, the median values for the human promoter, 5'-UTR and CDS with at least 2 algorithms were 5505, 2217 and 10279; after increasing the number of algorithms to at least 3, the values decreased rapidly to 3041, 608 and 4720. These observations suggest that requiring agreement among different algorithms can work as a stringent filter to reduce the number of target genes for one or more miRNAs.
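The filtering idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `predictions` table, the miRNA and gene labels, and the function `median_targets` are hypothetical, and the sketch assumes that the median is taken over the number of targets per miRNA, with miRNAs that lose all targets counted as zero. It is not the code used to produce Figure 20.

```python
from statistics import median

# Hypothetical predictions: (miRNA, gene) -> set of algorithms reporting the pair.
predictions = {
    ("miR-A", "GENE1"): {"TargetScan", "miRanda", "PITA"},
    ("miR-A", "GENE2"): {"TargetScan", "miRanda"},
    ("miR-A", "GENE3"): {"PITA"},
    ("miR-B", "GENE1"): {"TargetScan", "miRanda", "PITA", "RNA22"},
    ("miR-B", "GENE4"): {"miRanda", "RNA22"},
    ("miR-B", "GENE5"): {"TargetScan", "PITA", "RNA22"},
    ("miR-B", "GENE6"): {"miRanda"},
}

def median_targets(predictions, min_algorithms):
    """Median number of targets per miRNA, keeping pairs seen by >= min_algorithms."""
    # Assumption: every miRNA stays in the median, even with zero remaining targets.
    counts = {mirna: 0 for mirna, _gene in predictions}
    for (mirna, _gene), algorithms in predictions.items():
        if len(algorithms) >= min_algorithms:
            counts[mirna] += 1
    return median(counts.values())

for n in (2, 3, 4):
    print(n, median_targets(predictions, n))
```

Raising `min_algorithms` can only remove interactions, so the per-miRNA counts (and therefore the median) never increase, which is the behaviour reported for all four regions.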

Additionally, several studies have demonstrated that a considerable number of miRNAs co-target the 3'-UTR and the CDS or 5'-UTR region [10-13]. For example, Lee et al. showed that reporter constructs containing miRNA binding sites in both the 5'-UTR and 3'-UTR are down-regulated to a greater extent than those harboring a 3'-UTR site alone [13]. Fang et al. reanalyzed previously published studies and observed that genes harboring miRNA binding sites within both regions (CDS and 3'-UTR) show significantly stronger regulation than those with sites in the 3'-UTR only [10]. These observations were confirmed in another study, whose authors also found that some miRNAs (especially those related to the cell cycle) appear to preferentially anneal to the CDS region, which they found to be effective in rapid inhibition of translation [12].

Hence, co-targeting can be applied as an additional filter to further reduce the number of target genes per miRNA. The steps needed to reduce the number of target genes are depicted in Figure 22. Briefly, first, one can obtain the miRNA binding site results within the promoter, 5'-UTR, CDS and/or 3'-UTR regions by applying the multiple-algorithm approach (as described in Figures 1-4). The second step is to combine these sites according to the region(s) of interest to collect co-target sites (only within 5'-UTR+3'-UTR and/or CDS+3'-UTR). In the final step, one can carry out an overrepresentation analysis of the co-target sites within the genes of interest. This enrichment analysis will further reduce the number of miRNAs to a few potential candidates.
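The second step (collecting co-targets across regions) amounts to a set intersection. The following Python sketch shows the idea; the `targets_by_region` dictionary, the gene labels and the function `co_targets` are hypothetical placeholders for per-region target lists already filtered by the multiple-algorithm approach.

```python
# Hypothetical per-region target sets for one miRNA, already consensus-filtered.
targets_by_region = {
    "5'-UTR": {"GENE1", "GENE2", "GENE7"},
    "CDS": {"GENE2", "GENE3", "GENE4"},
    "3'-UTR": {"GENE1", "GENE2", "GENE3", "GENE5"},
}

def co_targets(targets_by_region, regions):
    """Genes that carry binding sites in every one of the listed regions."""
    region_sets = [targets_by_region[region] for region in regions]
    return set.intersection(*region_sets)

five_and_three = co_targets(targets_by_region, ["5'-UTR", "3'-UTR"])
cds_and_three = co_targets(targets_by_region, ["CDS", "3'-UTR"])
print(sorted(five_and_three), sorted(cds_and_three))
```

Each resulting set is never larger than the smallest of its input sets, so the combination works as a further filter on top of the algorithm count.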

To further complement the information hosted by the comparative platform of miRWalk2.0, ~13 million interactions gathered from CLIP datasets are integrated through additional tables that display validation information (how many of the predicted interactions are already verified and documented in miRTarBase and/or CLIP datasets) for putative gene-miRNA interactions (Figure 21). These interactions are available for download in two formats (Rdata and GMT files) to enable stand-alone, large-scale overrepresentation analysis. A holistic view of these datasets can also be downloaded via the Holistic.html page implemented under the Predicted Target module of miRWalk2.0.


References:
1. Dweep, H., Sticht, C. & Gretz, N. In-Silico Algorithms for the Screening of Possible microRNA Binding Sites and Their Interactions. Curr Genomics 14, 127-36 (2013).
2. Min, H. & Yoon, S. Got target? Computational methods for microRNA target prediction and their extension. Exp Mol Med 42, 233-44 (2010).
3. Peterson, S.M. et al. Common features of microRNA target prediction tools. Front Genet 5, 23 (2014).
4. Yue, D., Liu, H. & Huang, Y. Survey of Computational Algorithms for MicroRNA Target Prediction. Curr Genomics 10, 478-92 (2009).
5. Megraw, M., Sethupathy, P., Corda, B. & Hatzigeorgiou, A.G. miRGen: a database for the study of animal microRNA genomic organization and function. Nucleic Acids Res 35, D149-55 (2007).
6. Rajewsky, N. microRNA target predictions in animals. Nat Genet 38 Suppl, S8-13 (2006).
7. Bavamian, S. et al. Dysregulation of miR-34a links neuronal development to genetic risk factors for bipolar disorder. Mol Psychiatry (2015).
8. Dweep, H., Sticht, C., Kharkar, A., Pandey, P. & Gretz, N. Parallel analysis of mRNA and microRNA microarray profiles to explore functional regulatory patterns in polycystic kidney disease: using PKD/Mhm rat model. PLoS One 8, e53780 (2013).
9. Felekkis, K. et al. Increased number of microRNA target sites in genes encoded in CNV regions. Evidence for an evolutionary genomic interaction. Mol Biol Evol 28, 2421-4 (2011).
10. Fang, Z. & Rajewsky, N. The impact of miRNA target sites in coding sequences and in 3'UTRs. PLoS One 6, e18067 (2011).
11. Forman, J.J. & Coller, H.A. The code within the code: microRNAs target coding regions. Cell Cycle 9, 1533-41 (2010).
12. Hausser, J., Syed, A.P., Bilen, B. & Zavolan, M. Analysis of CDS-located miRNA target sites suggests that they can effectively inhibit translation. Genome Res 23, 604-15 (2013).
13. Lee, I. et al. New class of microRNA targets containing simultaneous 5'-UTR and 3'-UTR interaction sites. Genome Res 19, 1175-83 (2009).

Figure 21:
The Venn diagrams (a to c) describe RBP (RNA-binding protein) interactions within the 5'-UTR, CDS and 3'-UTR obtained from four different human CLIP datasets; (d) depicts a comparative view of miRNA-target interactions observed in the human CLASH dataset; and (e) shows RBP interactions within the 5'-UTR, CDS and 3'-UTR of mouse genes using three CLIP datasets.


Figure 21. Overview of CLIP datasets.


Figure 22:
To reduce the number of putative target genes for miRNAs of interest, the steps below can be followed (Figure 22).

Step 1. Collect information on target genes (by considering at least 2 algorithms) that have binding sites for the miRNAs of interest within the mRNA 5'-UTR, CDS and 3'-UTR regions (as described in Figure 11-14) via the microRNA information retrieval system or Holistic.html implemented under the PTM of miRWalk2.0.

Step 2. Compile the information obtained from step 1 and create separate lists (files) of target genes that have binding sites for the miRNAs of interest within different combinations of regions, i.e. 5'-UTR+CDS, 5'-UTR+3'-UTR and CDS+3'-UTR.

Step 3. Subject all the files resulting from step 2 to stand-alone enrichment analyses and/or map them to experimentally verified data (validated target genes and/or CLIP datasets). For further help, please contact the miRWalk team at mirwalkteam@medma.uni-heidelberg.de.


Figure 22. Steps to reduce the number of putative target genes on miRNAs of interest.


Similarly, the above three steps can be used to reduce the number of miRNAs for a list of significant genes.
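For the stand-alone enrichment analysis in step 3, one common choice is a one-sided hypergeometric test. miRWalk2.0 does not prescribe a particular test, so the Python sketch below is an assumption rather than the documented method. The function names, the background of 20 genes and the gene labels are all hypothetical; `math.comb` requires Python 3.8 or later.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K co-targets, n genes drawn)."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total

def enrichment_p(co_target_genes, genes_of_interest, background):
    """One-sided p-value for the overlap of co-targets with the genes of interest."""
    co = co_target_genes & background          # restrict both lists to the background
    interest = genes_of_interest & background
    overlap = len(co & interest)
    return hypergeom_upper_tail(overlap, len(background), len(co), len(interest))

# Hypothetical example: 20 background genes, 5 co-targets, 4 genes of interest.
background = {f"G{i}" for i in range(20)}
co_target_genes = {"G0", "G1", "G2", "G3", "G4"}
genes_of_interest = {"G0", "G1", "G2", "G10"}
p_value = enrichment_p(co_target_genes, genes_of_interest, background)
print(round(p_value, 4))
```

When many miRNAs are tested against the same gene list, the resulting p-values would normally be corrected for multiple testing before the candidates are ranked.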


References