The Latest

Apply now for exciting PhD course – August 2012

On January 25, 2012, in COAT, Education, miRNA, Sequencing, by Jeppe Vinther

We are planning an exciting PhD course to take place from the 20th to the 24th of August in Copenhagen. The course will cover the following topics: parallel sequencing technologies, microRNA biology and targeting, protein-RNA interactions, RNA structure, and antisense oligonucleotide (ASO) drugs. The course is aimed at researchers who have no formal training in bioinformatics but want to learn how to analyse sequencing-based data obtained from small RNA cloning experiments, RNA-seq, etc., perform RNA structure prediction, and analyse miRNA and ASO experiments. The course is a mixture of lectures and hands-on exercises, and we will invite international experts to talk about their research on the final day of the course. Our aim is to make it possible for the students to do the different types of analyses themselves after participating in the course. Updates regarding the course will be posted on rna.dk. For more information, including how to apply for participation, use this link.

 

How vast is the chemical space of RNA drugs?

On January 11, 2012, in Education, Oligoinformatics, by Peter Hagedorn

Chemical space

Chemical space is the space spanned by all chemical compounds. As of January 2012, there are more than 64 million organic and inorganic substances reported in the scientific literature. This is, however, still only a vanishing fraction of the total number of possible small organic molecules, which has been estimated to exceed $10^{60}$ (read: 10 to the power of 60). Drug discoverers have to navigate the mind-boggling vastness of this chemical space to identify those few compounds that can be used as drugs. In some cases, knowledge about the drug target and the chemistry involved in drug-target binding allows rational drug design principles to be applied. That is, drug space is truly navigated in the sense of purposeful steering. In many other cases, however, as with high-throughput screening, the discovery process is more like drifting through space on a random-walk-like trajectory.

For the RNA drug subspace, sequence information allows us to estimate the number of possible oligos that can be designed against a given target. In this post I will focus on oligos consisting of only DNA and LNA nucleotides. Also, I will consider only two general types of design: gapmers against mRNA targets and mixmers against miRNA targets.

Gapmers

Gapmers are oligos that have a stretch of DNA nucleotides in the middle and LNAs in the flanks. A shorthand notation for a 10nt gapmer with 3 LNAs in each flank and 4 DNAs in the middle (the gap) is LLLDDDDLLL. When a gapmer binds to the region of its target transcript to which it is perfectly complementary, the DNAs in the gap form a DNA:RNA duplex with the transcript, which can recruit RNase H. Upon RNase H binding, the transcript is cleaved and the cleaved fragments are degraded by exonucleases.

So, for a given target, how many gapmers can be designed? Say that the transcript is $n$ nucleotides long and that the oligo is $l$ nucleotides long. Then we can tile $n - l + 1$ oligos along the transcript. For $n \gg l$, $n - l + 1$ is approximately equal to $n$. For example, a transcript that is 1000nt long can have $1000 - 15 + 1 = 986$ oligos of length 15 tiled along its length.

Concerning gapmer length, say that oligos between $l_{\min}$ and $l_{\max}$ nucleotides in length are relevant to consider. That is, at each starting position the oligo length $l$ can be anywhere between $l_{\min}$ and $l_{\max}$, and there are therefore $l_{\max} - l_{\min} + 1$ different lengths that need to be evaluated. Also, several ($k$) gapmer designs may be relevant. For example, the number of LNAs in each flank may vary between 2 and 3, giving $k = 4$ (e.g. LLDDDDLL, LLLDDLLL, LLDDDLLL, LLLDDDLL).

The total number of oligos then equals

$$k \times \sum_{l = l_{\min}}^{l_{\max}} (n - l + 1)$$

where $n$ is the transcript length and the sum runs over the oligo length $l$, from $l_{\min}$ to $l_{\max}$.

For the example used here with a 1000nt transcript, oligos between 10nt and 20nt in length, and four different gapmer designs, the total number of oligos that can be designed is 43384. A simple approximation is to say that at each position in the 1000nt long transcript, we can design 11 × 4 = 44 different gapmers (varying length and design), giving 44000 in all.
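The arithmetic in this example can be checked with a few lines of Ruby. This is only a minimal sketch: the method names `tiled_positions` and `gapmer_count` and their parameters are illustrative choices, not part of any library.

```ruby
# Number of start positions for an oligo of a given length tiled along a
# transcript (never negative, in case the oligo is longer than the transcript).
def tiled_positions(transcript_length, oligo_length)
  [transcript_length - oligo_length + 1, 0].max
end

# Total number of gapmers: every design, at every start position,
# for every oligo length from shortest to longest.
def gapmer_count(transcript_length, shortest, longest, designs)
  designs * (shortest..longest).sum { |len| tiled_positions(transcript_length, len) }
end

puts gapmer_count(1000, 10, 20, 4)   # exact count: 43384
puts 1000 * (20 - 10 + 1) * 4        # simple approximation: 44000
```

The exact count is lower than the approximation only because longer oligos run out of room near the end of the transcript.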

Mixmers

A mixmer binding to a miRNA hinders the binding of the RISC-incorporated miRNA mature strand to its mRNA targets. Effectively, it can be seen as an antagonist to the miRNA’s natural “ligands”. For a mixmer, DNA and LNA nucleotides are mixed to optimize binding affinity and transport. A 10nt mixmer could for example be written LDLDLDLDLL.

In contrast to gapmer targets, miRNA targets are usually only around 22nt long, much shorter than the mRNA targets for gapmers. On the other hand, the possibility of mixing LNA and DNA in the oligos makes the number of different designs explode, as shown in the following.

In its simplest form, for a mixmer of length $l$, where at each position we can have either a DNA or an LNA, the total number of possibilities is $2^l$. For a 10mer, for example, there are $2^{10} = 1024$ possibilities. Three design considerations constrain the possibilities: (1) there has to be at least one LNA in each flank to protect against degradation by exonucleases, (2) depending on the oligo length, the LNA/DNA ratio is usually constrained to a certain interval, e.g. between 20% and 80% for a 16mer, since both too high and too low affinity are unwanted, and (3) there should be no more than 3 DNAs in a row to avoid RNase H recruitment.
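To make the three constraints concrete, here is a small brute-force sketch in Ruby. It rests on assumptions that are not specified in this post: "at least one LNA in each flank" is read as an LNA at both the first and the last position, and the LNA share is held at a fixed 20% to 80% for every length. The method name `mixmer_designs` and its parameters are illustrative. Because of these simplifications the counts are not expected to match numbers obtained under other assumptions.

```ruby
# Enumerate every L/D pattern of the given length and keep those that pass
# the three design constraints. All thresholds are illustrative assumptions.
def mixmer_designs(length, min_lna_pct: 20, max_lna_pct: 80, max_dna_run: 3)
  too_long_dna_run = "D" * (max_dna_run + 1)
  %w[L D].repeated_permutation(length).map(&:join).select do |design|
    lna_pct = 100.0 * design.count("L") / length
    design.start_with?("L") && design.end_with?("L") &&  # (1) LNA at both ends
      lna_pct.between?(min_lna_pct, max_lna_pct) &&      # (2) LNA share within bounds
      !design.include?(too_long_dna_run)                  # (3) no overly long DNA run
  end
end

puts 2**10                        # 1024 unconstrained 10mers
puts mixmer_designs(6).inspect    # the surviving 6mer patterns
puts mixmer_designs(10).size      # number of 10mers left after filtering
```

Brute force is fine for short oligos; for lengths above 20 or so, counting with a dynamic programme over (position, LNA count, current DNA run) would be a more practical design.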

I will skip the equations for the total number of possibilities under these constraints for now (perhaps a topic for a separate blog), and simply tabulate the possibilities for mixmers between 8nt and 23nt:

Mixmer length (nt)    Number of possible designs
8                     2
9                     16
10                    74
11                    92
12                    352
13                    1108
14                    3024
15                    7432
16                    16862
17                    39882
18                    78280
19                    152782
20                    294270
21                    567524
22                    1091972
23                    1053464

For this example I furthermore assumed that the seed portion (nucleotides 2-7 counting from the 5′-end) of the miRNA should be covered by the mixmer and that the LNA/DNA ratio interval went linearly from 100% LNA for 8mers to max 80% and min 20% for 23mers. All in all there are a bit more than 3.3 million possibilities. Notice that even with these constraints, the number of mixmers that can be designed scales exponentially with the mixmer length.

Gapmer vs mixmer space

With design criteria as specified here, the number of gapmers that can be designed scales linearly with the length of the target mRNA, whereas the number of mixmers scales exponentially with the length of the mixmer. Perhaps surprisingly, this means that the mixmer space quickly becomes much larger than the gapmer space. For both mixmer and gapmer spaces, however, it seems reasonable that the number of oligos is $<10^{10}$, even when allowing other nucleotide modifications besides LNA. This is still a relatively small number compared to small molecule compounds, where, just for derivatives of n-hexane, for example, there are more than $10^{29}$ possibilities (Weininger, pp 425–530 in Encyclopedia of Computational Chemistry).

 

RNA world hypothesis

On December 21, 2011, in Education, by Peter Hagedorn

All life, and therefore also life based on RNA, must be able to decrease or at least maintain its entropy. The functional repertoire of RNA needs to be quite varied for this to work. The RNA world hypothesis can therefore be seen as an ambitious research activity aimed at discovering such life-sustaining functions, like structure-based catalysis, that lie beyond the “classic” function as a message transfer molecule. As such, the structure activities in COAT might eventually contribute to this endeavor. For those interested in these things, Michael Yarus’s book  “Life from an RNA world” provides a good introduction. Happy Christmas vacation reading.

Cover from Life from an RNA world

 

COAT and the Department of Biology invite you to a talk by Professor John Mattick: The Central Role of RNA in the Evolution of Complexity

When: December 2nd 2011 at 2:00 p.m. Where: Lundbeck Foundation Auditorium, Biocentre, Ole Maaloes Vej 5, 2200-DK

The talk is in honour of Professor Peter Arctander on the occasion of his retirement. After the talk everyone is invited to attend a reception for Peter (starts ~15:30).

 

OTS day 1: new RNAi triggers

On September 8, 2011, in Conferences, by Morten Lindow

The Oligonucleotide Therapeutics Society annual meeting has started, taking place in the architecturally pleasing Royal Library in Copenhagen.

I will try to report on my subjective pick of highlights from this conference. I will be blogging from my iPad, but will update some of the posts later with pictures, links, etc.

The most interesting session of the first day was the ‘hot topics’ selected from the submitted abstracts.

Two alternative ways to trigger RNAi were presented. They are interesting both scientifically and from a business perspective, since they most likely fall outside of Alnylam’s IP dominance in the RNAi field.

Dong-ki Lee presented data, new to me but at least partially published, on alternative structures to the canonical Tuschl-type siRNA. Lee’s group demonstrated that asymmetric duplexes in which the passenger strand is truncated were in some cases superior to standard siRNA. LNA modifications increased potency.

The lecture hall of the Royal Danish Library

Eric Swayze, from ISIS, presented thorough medicinal chemistry work showing that ssRNA with various modifications, especially 2′F, causes Ago2-dependent knockdown of target mRNAs. In vivo they reached the same ED50 as standard MOE-gapmers.

 

microRNA family sizes

On August 3, 2011, in Oligoinformatics, by Morten Lindow

microRNA family size distribution, miRbase 17

It is now well established in the microRNA field that many microRNAs come in families with the same seed (typically defined as positions 2-8 from the 5′ end of the mature microRNA). The seed largely determines which targets a microRNA regulates. microRNAs with the same seed are often thought to have redundant functions.

At Santaris Pharma we have developed a technology that we call tiny LNA antimiRs (trademark, trademark!), which is simply a short all-LNA oligonucleotide that only hybridizes to the seed of a microRNA that we want to knock down. See our paper Obad et al. in Nature Genetics for details.

The question I will answer in this post is:

How many human microRNAs occur in families with the same seed? And what is the distribution of family sizes?

The authoritative source for microRNA annotation is miRbase, maintained by Sam Griffiths-Jones. However, the interface to miRbase is made primarily for finding information about single miRNAs, not for computing the summary statistics necessary to answer our question. Fortunately, Anders Jacobsen and I have created miRmaid, which is a programmatic interface (API) to the data contained in miRbase.

With miRmaid the following code can answer the question:

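# Assumes a miRmaid session in which the Species model is available.
human = Species.find_by_abbreviation('hsa')
# Keep only the 7-mer seed families (miRmaid also defines 6-mer families).
seven_mer_families = human.seed_families.uniq.select { |sf| sf.sequence.length == 7 }
# Print one "name,size" line per family; size is the number of unique matures.
seven_mer_families.each do |sf|
  puts [sf.name, sf.matures.uniq.size].join(",")
end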

The code is in Ruby and is fairly easy to read: find the species with abbreviation ‘hsa’ (this is human), get the unique seed families for this species and select those of length 7 (miRmaid also defines seed families based on 6-mer seeds), then for each seed family print its name and its size (number of members).

Creating a histogram of the results yields the above graphic.

Summarizing the numbers for the human microRNAs in miRbase version 17, we get that 1733 human mature microRNAs encoded by 1424 precursors (some have miRs annotated on both sides of the hairpin) have 1419 different 7mer seeds. There are 1222 miRs that do not share a 7-mer seed with any other annotated miR, while 511 have at least one other microRNA with the same 7mer seed.
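For readers without a miRmaid installation, the grouping step itself can be sketched in plain Ruby. The sequences below are invented toy data, not miRbase entries, and the names `TOY_MATURES` and `seed` are illustrative; the sketch only shows how sharing a 7-mer seed (positions 2-8) turns into family sizes and a histogram.

```ruby
# Toy mature sequences: invented for illustration, NOT taken from miRbase.
TOY_MATURES = {
  "toy-a1" => "AGCUUAGGCAUCGAUCGGAUCA",
  "toy-a2" => "UGCUUAGGAAUCCGAUCAGGCU",
  "toy-a3" => "CGCUUAGGUUACGGAUCCAUGA",
  "toy-b1" => "UCAUGGACUUAGCGAUCGAUUC",
  "toy-b2" => "ACAUGGACGGAUUCAGCUAGCU",
  "toy-c1" => "UUUGACCGAUCGGAUAGCUAGC"
}.freeze

# 7-mer seed: positions 2-8 counted from the 5' end (string indices 1..7).
def seed(mature_sequence)
  mature_sequence[1, 7]
end

# Seed => number of mature sequences sharing it.
family_sizes = TOY_MATURES.group_by { |_name, seq| seed(seq) }
                          .transform_values(&:size)

# Family size => number of families of that size (the histogram).
size_histogram = Hash.new(0)
family_sizes.each_value { |size| size_histogram[size] += 1 }

singletons  = family_sizes.count { |_seed, size| size == 1 }
in_families = TOY_MATURES.size - singletons

puts family_sizes.inspect
puts size_histogram.inspect
puts "#{singletons} singleton(s), #{in_families} sequences sharing a seed"
```

With real data, the hash of toy sequences would be replaced by the mature sequences exported from the annotation source.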

From following the microRNA field for many years now, I have observed that the number of annotated miRs has expanded, especially as a consequence of developments in deep sequencing technology. A relatively large proportion of the newly discovered miRs are singletons with low expression levels that are only detectable by deep sequencing. Among the more highly expressed and evolutionarily conserved miRs, the family sizes tend to be larger.

 

 

 

Project: MicroRNA Masters project at Santaris Pharma

On July 8, 2011, in Education, by Morten Lindow

Santaris offers a Masters project on microRNA mechanisms in metabolic diseases starting in Sept 2011.

Santaris Pharma is a biopharmaceutical company that develops innovative RNA-targeted medicines based on its proprietary locked nucleic acid (LNA) drug platform. The Company’s LNA oligonucleotide compounds are designed to effectively down-regulate disease-implicated mRNAs and microRNAs.

microRNAs are short non-coding RNAs that have been shown to be involved in the development of human diseases, such as cancer, CNS disorders and cardiovascular disease. We are currently working on targeting of microRNAs implicated in metabolic disorders. The primary objective of the Masters project will be to study selected miRNAs that play important roles in metabolic disease, such as atherosclerosis and dyslipidemia, using cell culture assays and LNA-antimiR oligonucleotides.

We offer: a dynamic and fun environment, good laboratory facilities, and an opportunity to use and develop your skills in an industrial setting. Great office space, lunch, fruit, coffee etc will be provided.

We expect: highly motivated and self-driven students with knowledge about cell culture and molecular biology techniques. We expect the student to be present and work regularly at our site in Hørsholm.

To apply: Please send an email to mwl@santaris.com or suo@santaris.com, with your CV, information about yourself and your academic track record. We will be interviewing candidates for the project during August.

 

UPDATE: This project is filled.

Background

At the Center for Computational and Applied Transcriptomics (COAT), experimental and computational methods are being used to investigate RNA structure and RNA-protein interactions on a transcriptome-wide scale to enable the design of effective and safe RNA drugs (more information at rna.dk). As part of COAT, Santaris Pharma is looking for a bioinformatics masters student to work on the project described below.

Outline of project

RNA molecules are single-stranded molecules with a strong tendency to fold back on themselves and form Watson-Crick pairs, leading to secondary structural motifs of various lengths and complexities. These motifs can further assemble into intricate three-dimensional architectures. RNA structure is a key element in many regulatory processes (Westhof and Romby, Nat Methods, 2010), and, more pragmatically, may limit the accessibility of RNA drugs (modified oligonucleotides designed to be reverse complementary to their target RNAs).

Experimentally, high-throughput sequencing technology has recently been used to determine RNA secondary structure in full transcriptomes of yeast (Kertesz et al., Nature, 2010) and mouse (Underwood et al., Nat Methods, 2010) cells.

Computationally, RNA structure prediction can be based on deriving minimum free energy structures, which maximize the number of base pairs and use empirical estimates of thermodynamic parameters for neighbouring interactions and loop entropies (Zuker and Stiegler, NAR, 1981, and Hofacker et al., 1994). Alternatively, comparative genomics methods that assume different evolutionary pressures on single-stranded and double-stranded parts of a folded RNA molecule can be used (Pedersen et al., PLoS Comput Biol, 2006). The thesis project will compare the experimentally observed structures derived from the sequencing data with the predictions made by these methods.
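As a rough illustration of the dynamic programming idea behind such predictions, the sketch below maximizes the number of nested base pairs in the style of the Nussinov algorithm. It is a deliberately simplified stand-in: it uses no thermodynamic parameters at all and is not the energy model of the methods cited above. The method names, the set of allowed pairs (Watson-Crick plus G-U) and the minimum hairpin loop of 3 nucleotides are assumptions made for the example.

```ruby
# Allowed pairs in this sketch: Watson-Crick plus G-U wobble (an assumption).
PAIRS = [%w[A U], %w[U A], %w[G C], %w[C G], %w[G U], %w[U G]].freeze

def can_pair?(a, b)
  PAIRS.include?([a, b])
end

# best[i][j] holds the maximum number of nested base pairs within
# sequence[i..j], requiring at least min_loop unpaired bases in a hairpin.
def max_base_pairs(sequence, min_loop: 3)
  n = sequence.length
  return 0 if n.zero?

  best = Array.new(n) { Array.new(n, 0) }
  (min_loop + 1).upto(n - 1) do |span|
    0.upto(n - 1 - span) do |i|
      j = i + span
      score = best[i][j - 1]                       # case 1: j stays unpaired
      i.upto(j - min_loop - 1) do |k|              # case 2: j pairs with k
        next unless can_pair?(sequence[k], sequence[j])

        left = k > i ? best[i][k - 1] : 0
        score = [score, left + 1 + best[k + 1][j - 1]].max
      end
      best[i][j] = score
    end
  end
  best[0][n - 1]
end

puts max_base_pairs("GGGAAACCC")   # a three-pair hairpin with an AAA loop
```

Energy-based folding replaces the "+1 per pair" score with stacking and loop energies, but fills a table over subsequences in much the same way.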

Based on this work, possible accessibility limitations due to secondary structure will be evaluated for a number of RNA molecules and compared with Santaris Pharma’s experimentally observed performances of sets of LNA-modified oligonucleotides designed against each of them.

Requirements

  • Fluency in R and perhaps also in Python or similar.
  • Solid understanding of RNA biology.
  • Experience with analysing sequencing data is a plus.

For information about how to apply, please see this post.

 

 

UPDATE: This project is now filled.

microRNAs are known to regulate the expression level of a large number of mRNA targets, which can be predicted with reasonable accuracy using bioinformatics. When the activity of a microRNA is perturbed, the expression levels of its targets change. Such expression level changes can be detected using global mRNA expression data (e.g. from microarrays) and are known as a miR-signature. miR-signatures constitute an indirect but functional measure of changed microRNA activity and thereby provide an alternative to direct profiling of microRNA expression levels. ArrayExpress and Gene Expression Omnibus are large public databases containing a wealth of mRNA expression changes in a wide variety of states, including disease states.
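One very simple way to quantify such a signature is to compare the average expression change of predicted targets with that of all other genes. The Ruby sketch below does this on made-up numbers: the gene names, the log2 fold changes, the target list and the method name `signature_shift` are all hypothetical, and a real analysis would add a proper statistical test on top.

```ruby
# Hypothetical log2 fold changes (perturbed vs. control), one per gene.
LOG2_FC = {
  "geneA" => 0.8, "geneB" => 0.6, "geneC" => 1.0,   # predicted targets
  "geneD" => 0.1, "geneE" => -0.1, "geneF" => 0.0   # all other genes
}.freeze

# Hypothetical list of predicted targets of the microRNA of interest.
PREDICTED_TARGETS = %w[geneA geneB geneC].freeze

def mean(values)
  values.sum / values.size.to_f
end

# Mean log2 fold change of the predicted targets minus that of the rest.
def signature_shift(log2_fc, targets)
  target_fc, other_fc = log2_fc.partition { |gene, _fc| targets.include?(gene) }
  mean(target_fc.map(&:last)) - mean(other_fc.map(&:last))
end

puts signature_shift(LOG2_FC, PREDICTED_TARGETS).round(3)
```

A clearly positive shift would be consistent with reduced microRNA activity (de-repressed targets), and a clearly negative shift with increased activity.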

The objective of this project is to mine the public expression databases for miR-signatures pinpointing abnormal microRNA activities linked to diseases. Such findings might suggest new therapeutic targets for LNA-antimiR therapy. This project will be part of the Cardiomir EU collaboration, in which Santaris participates. While the method to be developed and applied is general, it will be focused on cardiometabolic disease, for which there is an opportunity to do experimental follow-up studies within the context of the Cardiomir project.

Requirements
  • Experience with analysis of many large data sets is an advantage.
  • A high level in R or a similar language is important.

For information about how to apply see this post. Discuss or ask questions about the project in the comments.
