Abstracts - 2007
Discovery and Phylogenetic Analysis of microRNAs in Mammalian Species
Shay Artzi, Adam Kiezun & Noam Shomron
MicroRNAs (miRNAs) are small non-coding RNAs that control gene expression by negatively regulating translation. miRNAs have emerged as a major class of regulatory genes in most metazoans and as important regulators of a diverse range of biological processes. Resolving the branching order of mammalian evolution is of great importance, but even now, when large genomic sequences are available, some relationships within the phylogenetic tree remain controversial. We use miRNA genes to study one such controversial relationship within the mammalian phylogenetic tree: the three-taxon placement of rodents, primates, and carnivores.
We have created a fully automated tool, miRNAminer, that identifies candidate miRNAs (precursor and mature sequences) by homology search and alignment. We applied the tool to the set of known metazoan miRNAs. miRNAminer searches for non-paralogous sets of miRNAs from the miRNA database of the Sanger Institute.
Using miRNAminer, we complemented the Sanger miRNA database with hundreds of orthologous mammalian miRNA genes. This enabled us to compile a phylogenetic tree based on presence/absence patterns. The resulting miRNA dataset allows phylogenetic reconstruction by entirely different means, namely miRNA genes, and yields evidence supporting a primate-rodent clade to the exclusion of carnivores.
To search genome databases, miRNAminer uses BLASTN. In our experiments, we used seven mammalian genomes from ENSEMBL. For each candidate miRNA, miRNAminer searches the genome for the precursor sequences of all known miRNAs of the same name. Hits with an E-value of at most 0.1 per chromosome are selected for further evaluation. To further filter potential miRNAs, miRNAminer applies criteria on the conservation of mature miRNA sequences, the conservation of precursor miRNA sequences, and the precursor miRNA secondary structure, i.e., fold (the first three criteria were proposed in previous work, such as [1, 2]). We estimated the parameters of these selection criteria from data in the miRNA registry and chose the values that included at least 95% of known miRNA genes.
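The E-value prefilter described above can be sketched as follows. This is a minimal illustration, not miRNAminer's actual code: it assumes the hits are available in the standard 12-column tabular BLAST output (E-value in column 11), and the names `Hit`, `parse_tabular`, and `select_candidates` are hypothetical. The example rows are made up, and any per-chromosome handling beyond the plain threshold is not modeled.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str     # name of the known precursor used as the query
    subject: str   # chromosome or scaffold that was hit
    start: int     # subject start coordinate
    end: int       # subject end coordinate
    evalue: float  # BLAST expectation value


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column tabular BLAST rows, skipping blanks and comments."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], int(f[8]), int(f[9]), float(f[10])))
    return hits


def select_candidates(hits: List[Hit], max_evalue: float = 0.1) -> List[Hit]:
    """Keep hits whose E-value is at most the threshold (0.1 in the text)."""
    return [h for h in hits if h.evalue <= max_evalue]


# Made-up rows, for illustration only.
example = [
    "mir-1\tchr1\t98.6\t71\t1\t0\t1\t71\t1000\t1070\t2e-30\t130",
    "mir-1\tchr2\t85.0\t40\t6\t0\t5\t44\t500\t539\t0.5\t30.1",
    "mir-1\tchr3\t92.0\t60\t5\t0\t3\t62\t9000\t8941\t0.1\t40.0",
]
kept = select_candidates(parse_tabular(example))
print([h.subject for h in kept])
```

Hits that pass this prefilter would then go on to the conservation and fold criteria, which are not modeled here.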
Phylogenetic Tree Reconstruction
We exploited the rarity of miRNA loss and of convergent evolution to create an algorithm for phylogenetic tree reconstruction that uses information about the presence/absence of miRNAs in a set of species. The algorithm applies bootstrapping to reconstruct trees for large datasets, using a subroutine that works under the stronger assumption of no miRNA loss (i.e., a miRNA occurs in every species descended from the ancestral species in which the miRNA appeared).
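One consequence of the no-loss assumption is that the species carrying a given miRNA form a complete clade, so the species sets of any two miRNAs must be nested or disjoint. The sketch below checks that condition and draws a bootstrap resample of miRNA names. It illustrates the assumption only and is not the authors' subroutine; the function names and the toy presence data are hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, List, Set


def compatible_with_no_loss(presence: Dict[str, Set[str]]) -> bool:
    """True if every pair of miRNA species sets is nested or disjoint."""
    sets = [frozenset(s) for s in presence.values() if s]
    for a, b in combinations(sets, 2):
        if not (a <= b or b <= a or a.isdisjoint(b)):
            return False
    return True


def bootstrap_sample(names: List[str], rng: random.Random) -> List[str]:
    """Resample miRNA names with replacement (same size as the input)."""
    return [rng.choice(names) for _ in names]


# Toy presence data (made up): miRNA name -> species that carry it.
presence = {
    "mir-a": {"human", "chimp"},
    "mir-b": {"human", "chimp", "mouse"},
    "mir-c": {"dog", "cow"},
}
print(compatible_with_no_loss(presence))   # nested or disjoint sets

conflicting = dict(presence, **{"mir-d": {"human", "dog"}})
print(compatible_with_no_loss(conflicting))  # mir-d overlaps mir-a and mir-c

sample = bootstrap_sample(sorted(presence), random.Random(0))
```

A resampled set of miRNAs that passes this check can be explained by some tree without any loss; a set that fails it cannot.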
We also created an algorithm that reconstructs phylogenetic trees by minimizing miRNA loss. That is, given miRNA presence data for a set of species, every phylogenetic tree topology implies a number of miRNAs that are lost (as predicted by the topology). The problem is to find the tree topology that minimizes this number. We implemented a stochastic search for the minimizing tree and compared that tree with the bootstrapped one. After enough iterations (ca. 30,000), bootstrapping and stochastic search reconstructed identical tree topologies for our experimental data.
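The loss count implied by a topology can be sketched as follows. The exact counting rule is not given in the text, so this example assumes a Dollo-style rule: each miRNA is gained once, at the most recent common ancestor of the species that carry it, and every maximal subclade below that ancestor that lacks the miRNA counts as one loss. Trees are nested 2-tuples of species names; all names and the toy data are hypothetical.

```python
from typing import Dict, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> Set[str]:
    """Set of species names below a node."""
    if isinstance(tree, str):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])


def losses_for_mirna(tree: Tree, present: Set[str]) -> int:
    """Loss events for one miRNA under the single-gain assumption."""
    present = present & leaves(tree)
    if not present:
        return 0
    # Descend to the most recent common ancestor of the carriers.
    node = tree
    while not isinstance(node, str):
        left, right = node
        if present <= leaves(left):
            node = left
        elif present <= leaves(right):
            node = right
        else:
            break

    def count(sub: Tree) -> int:
        if not (leaves(sub) & present):
            return 1  # the whole subclade lacks the miRNA: one loss
        if isinstance(sub, str):
            return 0
        return count(sub[0]) + count(sub[1])

    return count(node)


def total_losses(tree: Tree, presence: Dict[str, Set[str]]) -> int:
    """Sum of implied losses over all miRNAs."""
    return sum(losses_for_mirna(tree, s) for s in presence.values())


# Two candidate topologies and made-up presence data.
primate_rodent = ((("human", "mouse"), "dog"), "opossum")
primate_carnivore = ((("human", "dog"), "mouse"), "opossum")
presence = {
    "mir-w": {"human", "mouse"},
    "mir-x": {"human", "mouse"},
    "mir-y": {"human", "mouse", "dog"},
    "mir-z": {"human", "dog"},
}
print(total_losses(primate_rodent, presence))
print(total_losses(primate_carnivore, presence))
```

A search procedure, whether stochastic or exhaustive, would evaluate a score of this kind for each candidate topology and keep the minimum.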
Results and Discussion
We used miRNAminer to perform a comprehensive homology search for miRNA precursors in the studied species. For the search, we used all 2925 vertebrate miRNAs listed in the Sanger miRNA registry (release 9.0 of October 2006). Figure 1 summarizes the miRNAs listed in the Sanger registry and the new miRNAs identified by our method. Conclusively confirming the presence of the identified candidates in the studied species requires experimental verification. However, the candidates identified by our method are close homologs of known miRNAs and as such are not required to meet criteria as stringent to be annotated as miRNAs.
There is an ongoing discussion about the phylogenetic relationship of mammalian orders. A recent study based on full genome sequences supports a primate-carnivore clade to the exclusion of rodents. Other studies support the primate-rodent clade. We used the bootstrapping reconstruction algorithm described above to evaluate both hypothetical phylogenies against the miRNA data. It is conjectured that miRNA data is nearly homoplasy-free, which makes it well suited for reconstructing phylogenies.
Figure 2 shows the confidence scores for the two hypothesized phylogenies. The numbers were computed as the percentage of bootstrapped trees supporting the bifurcation at each intermediate node. The results strongly favor the primate-rodent clade (Figure 2b) over the primate-carnivore clade (Figure 2a): the confidence score of the relevant split is 62% vs. 20%. The phylogenetic tree in Figure 2b was constructed more than 50% of the time (2.5 times more often than any other tree) over 100,000 iterations of the bootstrapping algorithm. We also exhaustively checked all possible 7-species trees; this tree also minimizes miRNA loss.
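The confidence score of a split, defined above as the percentage of bootstrapped trees that contain it, can be computed as in the following sketch. It assumes rooted trees written as nested tuples of species names. The function names and the four toy trees are hypothetical and do not reproduce the reported figures.

```python
from typing import FrozenSet, Iterable, List, Set


def clades(tree) -> Set[FrozenSet[str]]:
    """Leaf sets of all internal nodes of a nested-tuple tree."""
    found: Set[FrozenSet[str]] = set()

    def walk(node) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found


def support(trees: List, clade: Iterable[str]) -> float:
    """Percentage of trees in which the given clade appears."""
    target = frozenset(clade)
    hits = sum(1 for tree in trees if target in clades(tree))
    return 100.0 * hits / len(trees)


# Toy "bootstrapped" trees (made up): three group human with mouse,
# one groups human with dog.
pr = ((("human", "mouse"), "dog"), "opossum")
pc = ((("human", "dog"), "mouse"), "opossum")
trees = [pr, pr, pr, pc]

print(support(trees, {"human", "mouse"}))
print(support(trees, {"human", "dog"}))
```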
[1] X. Xie, J. Lu, E. Kulbokas, T. Golub, V. Mootha, K. Lindblad-Toh, E. Lander, and M. Kellis. Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature, 434, 2005.
[2] L. F. Sempere, C. N. Cole, M. A. McPeek, and K. J. Peterson. The phylogenetic distribution of metazoan microRNAs: insights into evolutionary complexity and constraint. J. Exp. Zool., 2006.
[3] B. P. Lewis, C. B. Burge, and D. P. Bartel. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell, 120(1):15-20, January 2005.
[4] B. P. Lewis, I. H. Shih, M. W. Jones-Rhoades, D. P. Bartel, and C. B. Burge. Prediction of mammalian microRNA targets. Cell, 115(7):787-798, December 2003.
[5] V. Ambros, B. Bartel, D. P. Bartel, C. B. Burge, J. C. Carrington, X. Chen, G. Dreyfuss, S. R. Eddy, S. Griffiths-Jones, M. Marshall, M. Matzke, G. Ruvkun, and T. Tuschl. A uniform system for microRNA annotation. RNA, 9(3):277-279, March 2003.
[6] G. Cannarozzi, A. Schneider, and G. Gonnet. A phylogenomic study of human, dog, and mouse. PLoS Comput Biol, 3(1), January 2007.
[7] J. O. Kriegs, G. Churakov, M. Kiefmann, U. Jordan, J. Brosius, and J. Schmitz. Retroposed elements as archives for the evolutionary history of placental mammals. PLoS Biol, 4(4), March 2006.