Leveraging Big Data to Accelerate Drug Discovery

MedicalResearch.com Interview with:
Neel S. Madhukar
Graduate student in the lab of
Olivier Elemento, PhD, Associate Professor
Head, Laboratory of Cancer Systems Biology
Department of Physiology and Biophysics
Institute for Computational Biomedicine
Weill Cornell Medical College

Medical Research: What is the background for this study? What are the main findings?

Response: It takes on average 2.6 billion dollars and 10-15 years to develop a single new drug. Despite massive investment in drug discovery by pharmaceutical companies, the number of drugs obtaining FDA approval each year has remained constant over the past decade. One of the biggest bottlenecks in developing a new drug is understanding precisely how the drug works: what it binds to in cells, how it binds, and what it does once bound. This process is collectively called target identification and characterization of mechanisms of action. At present, target identification is a slow and failure-prone process, driven by laborious experimentation. Every time we seek to develop a new drug, this experimentation has to be redone from scratch. We are not learning from the data acquired in our past successes and failures.

In Dr Olivier Elemento’s research laboratory at Weill Cornell Medical College, we have embarked on a journey to reboot the drug discovery process by adopting a Big Data strategy leveraging the plethora of drug data available.

In work presented at AACR’15 in Philadelphia (http://meyercancer.weill.cornell.edu/news/2015-04-21/new-tool-predicts-drug-targets-and-ids-new-anticancer-compounds) (http://www.abstractsonline.com/Plan/ViewAbstract.aspx?mID=3682&sKey=260955b6-99db-46c5-afd1-d0f832af42b5&cKey=d9d0b004-de88-4d49-962d-214863b271ba&mKey=19573a54-ae8f-4e00-9c23-bd6d62268424), we described a new Big Data technique for accelerating drug discovery called BANDIT: a Bayesian Approach to determine Novel Drug Interaction Targets. BANDIT takes in information from a variety of datasets (in vitro drug efficacy across hundreds of cell lines, changes in gene expression of thousands of genes upon drug treatment, millions of bioassays, complex descriptions of drug structure, and reported side effects) and combines them through a probabilistic approach to predict the targets of a given small molecule. Each of these datasets is quite large on its own, so combining them produced an enormous database that required Big Data analytical algorithms and supercomputers to analyze.
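To make the idea of combining several data types "through a probabilistic approach" more concrete, here is a minimal sketch in Python. It is not BANDIT's actual implementation, which the interview does not describe. It assumes that each data type contributes an independent likelihood ratio for the hypothesis that two drugs share a target, and that these ratios are combined with Bayes' rule in odds form. The function name, the data-type labels, the prior, and all numbers are hypothetical.

```python
import math


def posterior_shared_target(prior, likelihood_ratios):
    """Combine independent evidence sources with Bayes' rule in odds form.

    prior: assumed prior probability that two drugs share a target.
    likelihood_ratios: mapping of data type -> P(evidence | shared target)
        divided by P(evidence | no shared target). Values are illustrative.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Work in log-odds so that many ratios can be added without underflow.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios, one per data type named in the interview.
evidence = {
    "cell_line_efficacy": 3.0,
    "gene_expression": 2.0,
    "bioassays": 4.0,
    "structure": 1.5,
    "side_effects": 0.8,  # a ratio below 1 counts as evidence against
}

posterior = posterior_shared_target(prior=0.01, likelihood_ratios=evidence)
print(f"Posterior probability of a shared target: {posterior:.3f}")
```

In this sketch, every additional data type simply multiplies the odds by one more factor, which is one simple way a method of this kind could keep absorbing new datasets. The independence assumption is a simplification made for illustration.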

Here are the main findings of this study:

  • Through the creation of BANDIT we have put together the largest and most comprehensive database of small molecule target and effect information. This database will be invaluable to future research on improving drug discovery.
  • We created a Big Data-driven method that predicts the targets of drugs in a rapid and comprehensive manner. Using drugs with known targets and a statistical procedure called cross-validation, we clearly demonstrated that when it comes to drug discovery, the more data the better. The more data describing small molecule activity we integrated into BANDIT, the more accurately we recovered the targets across a test set of known drugs.
  • We applied BANDIT to over 50,000 small molecules with no known target or mechanistic information to predict targets and identify a potential therapeutic role. We made predictions for over 40% of these small molecules. In some cases, these predictions point to a new role for known drugs. For example, we predicted that vismodegib, a drug used to treat forms of skin cancer, also acts as a tyrosine kinase inhibitor, indicating that it could potentially be used in other treatment settings.
  • We found a set of novel molecules that BANDIT predicted to disrupt microtubules. We focused on microtubules because they are important targets for cancer chemotherapy. Together with our colleagues Evi Giannakakou and Prashant Khade at Weill Cornell Medical College, we performed experiments that clearly validate BANDIT’s prediction that these small molecules inhibit microtubules.
  • We also demonstrated that BANDIT not only predicts the target of small molecules with high accuracy, but it also predicts precise mechanisms of action of small molecules, for example whether a molecule that acts on microtubules works by perturbing microtubule polymerization or depolymerization or works through a completely novel mechanism.
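The cross-validation finding above (more integrated data, better target recovery) can be illustrated with a small Python sketch. This is not the authors' procedure or data: it uses leave-one-out cross-validation with a nearest-neighbor rule on four made-up drugs, and the data-type names `type_a` and `type_b`, the scores, and the target labels are all hypothetical.

```python
# Toy, made-up data: each drug has a known target and one score per data type.
DRUGS = {
    "T1": {"target": "tubulin", "type_a": 0.0, "type_b": 0.0},
    "T2": {"target": "tubulin", "type_a": 0.3, "type_b": 0.1},
    "K1": {"target": "kinase", "type_a": 0.1, "type_b": 1.0},
    "K2": {"target": "kinase", "type_a": 0.4, "type_b": 0.9},
}


def distance(drug_1, drug_2, data_types):
    """Sum of absolute score differences over the data types in use."""
    return sum(abs(drug_1[t] - drug_2[t]) for t in data_types)


def leave_one_out_accuracy(drugs, data_types):
    """Hold out each drug in turn and predict its target from the most
    similar remaining drug; return the fraction predicted correctly."""
    correct = 0
    for name, held_out in drugs.items():
        others = [d for n, d in drugs.items() if n != name]
        nearest = min(others, key=lambda d: distance(held_out, d, data_types))
        if nearest["target"] == held_out["target"]:
            correct += 1
    return correct / len(drugs)


acc_one_type = leave_one_out_accuracy(DRUGS, ["type_a"])
acc_two_types = leave_one_out_accuracy(DRUGS, ["type_a", "type_b"])
print(f"one data type:  {acc_one_type:.2f}")
print(f"two data types: {acc_two_types:.2f}")
```

The toy scores were chosen so that the first data type alone is misleading while the combination separates the two target classes. Real datasets would not behave this cleanly; the sketch only shows how held-out accuracy can be compared as data types are added.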

Medical Research: What should clinicians and patients take away from your report?

  • We predict that BANDIT will significantly reduce the time and cost of drug development, allowing treatments to reach patients much sooner than current practices allow. BANDIT will help not only by predicting individual targets, but also by helping researchers determine the most cost-effective way to proceed with their drug development efforts. That is, BANDIT can indicate which data-generating experiment is most informative about a small molecule’s mechanism of action.
  • Some of the new molecules that we showed computationally and experimentally to target microtubules might end up being valuable novel chemotherapeutic agents. Microtubules play a key role in cell proliferation, and inhibiting them has tremendous efficacy in killing cancer cells and targeting other proliferative diseases. A commonly used antimicrotubule drug such as docetaxel represents a market of more than 3 billion dollars. However, many patients do not respond to approved antimicrotubule agents or develop resistance to them. Adding new antimicrotubule drugs to the oncologist’s arsenal would enable better treatment of many patients, might reach certain cancers that are hard to treat with standard microtubule chemotherapy, and could provide additional options for patients who have become resistant to existing antimicrotubule drugs.
  • More broadly, one of the major problems in cancer patient care is acquired drug resistance. Many drugs work well for a short period of time, but eventually patients develop resistance and relapse. By finding new molecules that act upon clinically relevant targets, there is a potential to discover new drugs that could overcome resistance mechanisms. BANDIT has the potential to dramatically accelerate the discovery of such molecules.
  • Additionally, many diseases, such as brain cancers, are difficult to treat because drugs are unable to cross the blood-brain barrier. However, by identifying a set of structurally varied molecules that all share the same target, there is the potential to find a new drug able to overcome this challenge.

Medical Research: What recommendations do you have for future research as a result of this study?

Response: We think that Big Data analytics will play an increasingly important role in drug discovery. As discussed before, we are not developing drugs fast enough. On the other hand, we have accumulated tremendous amounts of data on small molecules’ clinical safety, efficacy, and toxicity. Technologies such as automation, next-generation sequencing, and genome engineering (CRISPR) can now be used to generate ever more such data. Now is the time to integrate everything we know about small molecules to predict their activity. We can build predictive models of drug toxicity and Big Data-driven models that predict how to combine drugs to maximize efficacy, decrease the likelihood of acquired resistance, and minimize toxicity.

To complement the work presented at AACR’15, our laboratory at Weill Cornell is essentially adopting a “Moneyball” approach to drug discovery. Using machine learning, we are actively seeking features of small molecules that predict a drug’s efficacy, lack of toxicity, and eventual FDA approval. For these analyses we are combing through large databases of successful and failed clinical trials in humans, seeking to combine these data with small molecule activity and target information. Our initial results look promising and indicate, for example, that one can predict with reasonably high accuracy which clinical trials will fail early because of toxicity. Such Big Data models may speed up the drug discovery process by identifying early the drugs and targets that are most likely to fail to reach FDA approval.

Presented at the April 2015 AACR Conference in Philadelphia

MedicalResearch.com Interview with: Neel S. Madhukar, Graduate student in the lab of Olivier Elemento, PhD, Associate Professor (2015). Leveraging Big Data to Accelerate Drug Discovery