microRNA family sizes

[Figure: microRNA family size distribution, miRBase 17]

It is now well established in the microRNA field that many microRNAs come in families sharing the same seed (typically defined as positions 2-8 from the 5′ end of the mature microRNA). The seed largely determines which targets a microRNA regulates, so microRNAs with the same seed are often thought to have redundant functions.
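To make the seed definition concrete, here is a minimal Ruby sketch. The helper name `seed_of` is my own invention for this illustration (it is not part of any library), and the let-7a and let-7b sequences are typed in by hand, so check them against miRBase before relying on them.

```ruby
# Extract the 7-mer seed (positions 2-8, counted from the 5' end)
# from a mature microRNA sequence. `seed_of` is a hypothetical helper.
def seed_of(mature_sequence)
  # Ruby strings are 0-indexed, so position 2 is index 1; take 7 characters.
  mature_sequence[1, 7]
end

# Sequences entered by hand for illustration; verify against miRBase.
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
let7b = "UGAGGUAGUAGGUUGUGUGGUU"

puts seed_of(let7a)                    # the seed of let-7a
puts seed_of(let7a) == seed_of(let7b)  # true when both belong to one family
```

Two microRNAs are in the same seed family, in the sense used here, exactly when this comparison is true.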

At Santaris Pharma we have developed a technology that we call tiny LNA antimiRs (trademark, trademark!): a short, all-LNA oligonucleotide that hybridizes only to the seed of the microRNA we want to knock down. See our paper, Obad et al. in Nature Genetics, for details.
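As a rough illustration of what "hybridizes to the seed" means at the sequence level, the sketch below computes the reverse complement of a seed. It only shows the base-pairing idea: `seed_antisense` is a hypothetical helper, and the actual oligonucleotide length, exact target positions and LNA chemistry are described in the paper, not here.

```ruby
# Return the reverse complement of an RNA seed, written in DNA letters.
# `seed_antisense` is a hypothetical helper for illustration only.
def seed_antisense(seed)
  # Reverse the seed, then swap each base for its Watson-Crick partner.
  seed.upcase.reverse.tr("ACGU", "TGCA")
end

# Antisense of the 7-mer seed GAGGUAG, read 5' to 3'.
puts seed_antisense("GAGGUAG")
```

Because every member of a seed family has the same seed, one such antisense sequence is complementary to the whole family.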

The question I will answer in this post is:

How many human microRNAs occur in families with the same seed? And what is the distribution of family sizes?

The authoritative source for microRNA annotation is miRBase, maintained by Sam Griffiths-Jones. However, the miRBase interface is designed primarily for looking up information about single miRNAs, not for computing the summary statistics needed to answer our question. Fortunately, Anders Jacobsen and I have created miRmaid, a programmatic interface (API) to the data contained in miRBase.

With miRmaid the following code can answer the question:

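# Human ('hsa') seed families, restricted to those defined by a 7-mer seed
families = Species.find_by_abbreviation('hsa').seed_families.uniq.select { |sf| sf.sequence.length == 7 }

# Print one "name,size" line per family (size = number of unique mature miRs)
families.each do |sf|
    puts [sf.name, sf.matures.uniq.size].join(",")
end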

The code is in Ruby and is fairly easy to read: find the species with abbreviation ‘hsa’ (human), get its unique seed families, and select those with a seed of length 7 (miRmaid also defines seed families based on 6-mer seeds). Then, for each seed family, print its name and its size (number of members).
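For readers without a miRmaid installation, the same grouping can be sketched in plain Ruby. This is not how miRmaid does it internally; `seed_families` is a hypothetical function of my own, and the three sequences are typed in by hand as a toy stand-in for the full miRBase data.

```ruby
# Group mature microRNAs by their 7-mer seed using plain Ruby.
# `matures` is a hypothetical name => sequence hash; in practice the
# data would come from miRBase (for example via miRmaid).
matures = {
  "hsa-let-7a" => "UGAGGUAGUAGGUUGUAUAGUU",
  "hsa-let-7b" => "UGAGGUAGUAGGUUGUGUGGUU",
  "hsa-miR-21" => "UAGCUUAUCAGACUGAUGUUGA"
}

# Map each seed (positions 2-8) to the list of miRNA names carrying it.
def seed_families(matures)
  families = Hash.new { |hash, seed| hash[seed] = [] }
  matures.each { |name, sequence| families[sequence[1, 7]] << name }
  families
end

# Print one "seed,size" line per family, mirroring the miRmaid output.
seed_families(matures).each do |seed, members|
  puts [seed, members.size].join(",")
end
```

With this toy input the two let-7 members end up in one family of size 2 and miR-21 is a singleton.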

Creating a histogram of the results yields the above graphic.
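The histogram step itself is just a tally of how many families have each size. A minimal sketch, assuming the "name,size" lines printed by the miRmaid code have been collected into an array; `size_histogram` is a hypothetical helper and the sample lines are made up, not real miRBase output.

```ruby
# Count how many seed families have each size, given "name,size" lines.
# `size_histogram` is a hypothetical helper for illustration.
def size_histogram(lines)
  counts = Hash.new(0)
  lines.each do |line|
    # The size is the last comma-separated field on the line.
    size = line.strip.split(",").last.to_i
    counts[size] += 1
  end
  counts.sort.to_h # order by family size
end

# Made-up sample input, not real miRBase data.
sample = ["famA,1", "famB,3", "famC,1", "famD,2"]

# Draw one row per family size, one '#' per family of that size.
size_histogram(sample).each do |size, n_families|
  puts format("%3d %s", size, "#" * n_families)
end
```

Any plotting tool can then turn the size and count pairs into a proper bar chart.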

Summarizing the numbers for human microRNAs in miRBase version 17: 1733 human mature microRNAs, encoded by 1424 precursors (some have miRs annotated on both arms of the hairpin), have 1419 different 7-mer seeds. Of these, 1222 miRs do not share a 7-mer seed with any other annotated miR, while 511 share their 7-mer seed with at least one other microRNA; those 511 therefore fall into 1419 − 1222 = 197 families with more than one member.
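The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; `family_summary` and its keys are hypothetical names chosen for this illustration.

```ruby
# Cross-check the reported counts and derive the number of
# multi-member families. `family_summary` is a hypothetical helper.
def family_summary(matures, distinct_seeds, singletons, shared)
  # Each singleton has a seed of its own, so the remaining seeds
  # are the ones shared by two or more miRs.
  multi = distinct_seeds - singletons
  {
    totals_agree: singletons + shared == matures,
    multi_member_families: multi,
    mean_multi_family_size: (shared.to_f / multi).round(2)
  }
end

# Numbers quoted in the text for human microRNAs in miRBase 17.
p family_summary(1733, 1419, 1222, 511)
```

The singleton and shared counts add up to the total, and the shared miRs average between two and three members per family.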

Having followed the microRNA field for many years now, I have observed that the number of annotated miRs has expanded, especially as a consequence of developments in deep sequencing technology. A relatively large proportion of the newly discovered miRs are singletons with low expression levels that are only detectable by deep sequencing. Among the more highly expressed and evolutionarily conserved miRs, the family sizes tend to be larger.
