Winning the rat race - The sbv IMPROVER Species Translation Challenge

19 November 2013
EE0-002 (ENUS Building)


This talk presents and discusses our use of machine learning techniques for analyzing biological data sets in the context of the recent sbv IMPROVER Species Translation Challenge, which focused on understanding the limits of rodent models for human biology. The sbv (systems biology verification) IMPROVER project is a collaborative effort to develop new methods for verifying scientific data and results in the biomedical domain. It is funded by Philip Morris International (PMI) R&D and organized jointly with IBM Research.

In the challenge, competing teams were scored against a gold standard: predictions were compared to unreleased experimental data. Our interdisciplinary, transatlantic team ranked first in three of the four sub-challenges. In the first of these, the phosphorylation status of a number of proteins was to be predicted from gene expression levels in cells of the same species (rat) exposed to various chemical stimuli. The second sub-challenge required predicting the stimulus-dependent protein phosphorylation status in human cells based on measurements in rat cells. The goal of the third sub-challenge was the inter-species prediction of the expression of pre-defined gene sets.

The talk opens with a brief introduction to the challenge set-up, the structure of the data sets, and essential pre-processing steps. It then focuses on the machine learning techniques that were instrumental in analyzing the high-dimensional gene expression data and in obtaining accurate predictions from the limited information available. Potential methodological extensions of our analysis beyond the actual challenge results will also be discussed.