Our research
Research Centre for Cheminformatics
Statistical modelling
We have participated in research projects related to regression, classification and survival modelling for over a decade. Our goal has always been to improve the predictive models we build. In 2014 we published a paper on cross-validation pitfalls in which we provided clearly defined algorithms for selecting and assessing regression and classification models. Our aim with the paper was not to tell others how to do it, but to ask the wider scientific community whether there are better ways of doing it. One area of research we are keen to pursue is the use of bootstrapping to select and assess models, and the comparison of those models with models selected by cross-validation.
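To illustrate the difference between selecting and assessing a model, here is a minimal sketch of nested cross-validation in plain Python. It is a generic illustration, not the algorithms from our 2014 paper: the one-feature ridge regression, the synthetic data and all function names (`k_fold_indices`, `fit_ridge`, `nested_cv` and so on) are assumptions made for the example.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge(xs, ys, lam):
    """Slope of a one-feature ridge regression without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def mse(slope, xs, ys):
    """Mean squared prediction error of the fitted slope."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def split(xs, ys, fold):
    """Return training x, y and held-out x, y for one fold."""
    held = set(fold)
    train = [i for i in range(len(xs)) if i not in held]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in fold], [ys[i] for i in fold])

def cv_error(xs, ys, lam, folds):
    """Mean held-out error of the penalty lam over the given folds."""
    errors = []
    for fold in folds:
        tx, ty, vx, vy = split(xs, ys, fold)
        errors.append(mse(fit_ridge(tx, ty, lam), vx, vy))
    return sum(errors) / len(errors)

def nested_cv(xs, ys, lams, outer_k=5, inner_k=4, seed=0):
    """Assess the whole 'select lam by CV, then fit' procedure."""
    rng = random.Random(seed)
    outer_errors = []
    for fold in k_fold_indices(len(xs), outer_k, rng):
        tx, ty, vx, vy = split(xs, ys, fold)
        # Selection: the inner folds are drawn from the outer training part only.
        inner_folds = k_fold_indices(len(tx), inner_k, rng)
        best = min(lams, key=lambda lam: cv_error(tx, ty, lam, inner_folds))
        # Assessment: the outer fold played no role in choosing `best`.
        outer_errors.append(mse(fit_ridge(tx, ty, best), vx, vy))
    return sum(outer_errors) / len(outer_errors)

# Synthetic data (an assumption for the example): y = 2x + Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1, 1) for _ in range(60)]
ys = [2 * x + data_rng.gauss(0, 0.1) for x in xs]
estimate = nested_cv(xs, ys, lams=[0.01, 0.1, 1.0, 10.0])
print(estimate)
```

The point of the design is that the error reported by `nested_cv` comes from data that took no part in choosing the penalty. Reporting the best inner cross-validation error instead would tend to be optimistic.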
Stratified medicine
Since 2010 we have worked on research projects where the goal, where possible, was to find a sub-population with better or worse survival than the rest of the population. Our practical experience is that the hazard ratio between the Kaplan-Meier curves of two sub-populations can be a misleading measure of their survival separation. Our paper on that subject is currently under review. We also prefer the term Stratified medicine to Personalised medicine or Precision medicine, because the results of research are expected to be actionable for a sub-population.
Survival analysis
While working on numerous right-censored (survival) datasets, we have noticed how "fragile" they are, and how much our estimates depend on certain data points as well as on certain assumptions. We have found that the random censorship model, one of the fundamental assumptions in the theory of survival analysis, is presumed too often and too easily. Our paper on that subject is currently under review.
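As a small illustration of this fragility, the sketch below computes the standard Kaplan-Meier estimate in plain Python and shows how changing the status of a single observation moves the curve. The function name `kaplan_meier` and the five-patient dataset are invented for the example and are not taken from our projects.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) pairs.

    events[i] is True if the event was observed at times[i] and
    False if the observation was right-censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # The curve drops only at observed event times.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without a drop.
        at_risk -= removed
    return curve

times = [2, 3, 3, 5, 8]
original = kaplan_meier(times, [True, True, False, True, False])
# Same data, but the subject censored at time 3 is now recorded as an event.
changed = kaplan_meier(times, [True, True, True, True, False])
print(original)
print(changed)
```

With the original statuses the estimated survival at time 5 is about 0.3. Recording the one censored subject at time 3 as an event lowers it to about 0.2, a sizeable shift caused by a single data point.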
Causal inference and Bayesian statistics
We are keen to apply our knowledge of causal inference and Bayesian statistics in future research projects.
Taxonomy and phylogenetic trees
We are keen to better understand the use of statistics in current taxonomy, with special emphasis on phylogenetic trees. Our applications are in mycology and entomology.