Discover the numerous projects that our doctoral researchers are currently engaged in and have been working on in recent years.
At HIDSS4Health, our doctoral researchers and associated collaborators tackle cutting-edge challenges at the intersection of data science and the life sciences. Their interdisciplinary projects encompass diverse areas such as Imaging & Diagnostics, Surgery & Intervention 4.0, and advanced Models for Personalized Medicine.
These endeavors align with our mission to push boundaries, foster collaboration, and drive transformative advances in health and biomedical research. Explore how their innovative work is shaping the future of science and healthcare.

Using Anatomical Knowledge to Improve Medical Image Analysis
Doctoral researcher: Alexander Jaus
Institution: KIT
Data science PI: Rainer Stiefelhagen
Life science PI: Jens Kleesiek
While there has been impressive progress in the field of biomedical image segmentation, current approaches hardly incorporate anatomical knowledge or common sense. Whereas radiologists have explicit knowledge of the human body, current models learn this knowledge only implicitly and mostly focus on specific regions of the body.
To alleviate this problem, we aim to develop a dataset spanning multiple modalities for the entire human body, allowing models to learn the entire anatomy. Furthermore, we will leverage a dataset of radiological reports, which serve as weak labels describing how the reported anatomy matches or differs from a certain norm. Combining these two approaches, we will train a model which incorporates prior knowledge of human anatomy and pathologies.
While the described model enables many possible downstream tasks, one focus application within this thesis will be to develop a framework which makes anatomical databases searchable beyond standard queries. By indexing the databases with the help of the developed model, fine-grained semantic searches such as “retrieve all CT scans with left lungs with a diameter of X with pulmonary embolism” should be possible.

Analysis Pipelines and Data Fusion for Cerebral Organoids
Doctoral researcher: Luca Deininger
Institution: KIT
Data science PI: Ralf Mikut
Life science PI: Sabine Jung-Klawitter
Organoids are self-assembled three-dimensional aggregates generated from human pluripotent stem cells (hPSC) with cell types and cytoarchitectures that resemble human organs and tissues. Cerebral organoids are a growing field of interest for understanding and modeling the development of the human brain in health and disease, in particular in common and rare inherited diseases. Data on cerebral organoids can come from heterogeneous sources, such as longitudinal Magnetic Resonance Imaging (MRI), immunofluorescence analysis of organoids, and metabolomic and transcriptomic data at bulk (whole-organoid) or single-cell level.
A main bottleneck of organoid research is the lack of systematic analysis pipelines covering the full range of acquired data. The aim of the proposed project is to establish such a standard analysis pipeline for cerebral organoids and to show the potential of this pipeline for the characterization of human diseases, especially for inborn errors of neurotransmitter metabolism.

Anomaly Detection in Sparse Image Time Series
Doctoral researcher: Nico Disch
Institution: DKFZ
Data science PI: Klaus Maier-Hein
Life science PI: Jens Kleesiek, Rainer Stiefelhagen
Current image analysis of patient data uses only single images and does not incorporate previous measurements into the model. Furthermore, methods for image time series in other domains, such as video, rely on uniformly sampled time points in order to find anomalies or to predict time series progression. Other anomaly detection methods learn the distribution of images, but they struggle to account for distribution shifts. Therefore, a model has to be developed that is able to deal with sparse image time series and find anomalies in them. The developed models will help to distinguish regular from anomalous progression in patients’ data.
These insights might lead to a better understanding of the human aging process and of disease progression, and possibly aid the decision making of medical practitioners.

Interactive Annotation of Volumetric Imaging Data Incorporating Report Information
Doctoral researcher: Zdravko Marinov
Institution: KIT
Data science PI: Rainer Stiefelhagen
Life science PI: Jens Kleesiek
Annotated medical data is a prerequisite for successful and robust deep-learning models. However, curating labels for medical images is difficult due to the complexity of the data and the need for expert knowledge. Interactive annotation alleviates the burden of manual labeling by reducing the annotation time significantly and allowing the annotators to iteratively refine the labels until they are satisfied with their quality. This project aims to explore how to integrate interactive segmentation models for personalized diagnostics on multimodal medical data, e.g., PET-CT, using active learning and sparse labels. Medical reports will also be integrated to ease and enhance the annotation process and improve the segmentation by combining textual and visual information.
The simulation of user interactions during training and suggesting slices for annotation with active learning are open challenges for interactive segmentation. Typically, user interactions, e.g., in the form of clicks, are simulated in locations where the model has produced segmentation errors. However, in multimodal imaging, the sources of error are two-fold, and errors from one modality can be used for co-training of the other one. Thus, this project aims to explore various multimodal fusion paradigms to leverage the mutual information between the modalities and investigate ways to simulate interactions for multimodal imaging.
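The click simulation described above can be illustrated with a minimal, single-modality sketch. It assumes binary 2D masks stored as nested lists and places the click at the mislabeled pixel closest to the centroid of all errors; the function name `simulate_click` and this placement rule are hypothetical choices for illustration, not the project's actual method.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # binary 2D mask, 1 = foreground, 0 = background


def simulate_click(pred: Mask, truth: Mask) -> Optional[Tuple[int, int, str]]:
    """Simulate one corrective click where the prediction is wrong.

    Returns (row, col, kind), where kind is "positive" for a missed
    foreground pixel and "negative" for a falsely predicted one.
    Returns None if the prediction already matches the ground truth.
    """
    # Collect every pixel where prediction and ground truth disagree.
    errors = [
        (r, c)
        for r, row in enumerate(truth)
        for c, label in enumerate(row)
        if pred[r][c] != label
    ]
    if not errors:
        return None

    # Assumed placement rule: click the error pixel nearest to the
    # centroid of all error pixels.
    mean_r = sum(r for r, _ in errors) / len(errors)
    mean_c = sum(c for _, c in errors) / len(errors)
    r, c = min(errors, key=lambda p: (p[0] - mean_r) ** 2 + (p[1] - mean_c) ** 2)

    kind = "positive" if truth[r][c] == 1 else "negative"
    return r, c, kind


truth = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]
pred = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(simulate_click(pred, truth))
```

In a multimodal setting such as PET-CT, one could run a rule like this per modality and share the resulting clicks across modalities, but how best to do so is exactly the open question the project addresses.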

Model-based Artificial Intelligence in Surgical Data Science
Doctoral researcher: Piotr Kalinowski
Institution: DKFZ
Data science PI: Lena Maier-Hein
Life science PI: Hannes Kenngott
Death within 30 days after surgery has recently been found to be the third-leading cause of death worldwide [1], with research suggesting that a large proportion of these deaths are due to surgical error. The field of Surgical Data Science [2] aims to address this issue with data-driven methods. However, a large international consortium of experts [3] revealed a lack of clinical success stories and attributed this issue to the lack of large, annotated databases.
In this project, we investigate an entirely new approach to address this roadblock. Specifically, we propose encoding existing medical knowledge in ontologies and integrating this prior knowledge into a Graph Neural Network (GNN)-based approach to surgical decision support. The method will be validated on existing data sets from different surgical disciplines.
References:
[1] D. Nepogodiev, J. Martin, B. Biccard, A. Makupe, A. Bhangu, and National Institute for Health Research Global Health Research Unit on Global Surgery, “Global burden of postoperative death,” The Lancet, vol. 393, no. 10170, p. 401, Feb. 2019.
[2] L. Maier-Hein et al., “Surgical data science for next-generation interventions,” Nat Biomed Eng, vol. 1, no. 9, pp. 691–696, Sep. 2017.
[3] L. Maier-Hein et al., “Surgical data science - from concepts toward clinical translation,” Medical Image Analysis, vol. 76, 2022.

Next-Generation-Sequencing Data
Pharmacogenomic Analyses
Doctoral researcher: Sebastian Pirmann
Institution: DKFZ
Data science PI: Benedikt Brors
Life science PI: Daniel Hübschmann
Pharmacogenomics (PGx) studies how variations in the genome affect drug response in patients. There are genomic variants in so-called pharmacogenes which have an impact on a drug’s pharmacokinetics, influencing drug response and side effects. Although this link is well known, it is still not clinical practice to comprehensively determine a patient’s PGx profile prior to drug treatment. Especially in cancer therapy, PGx profiling could increase treatment success and reduce potential side effects.
In this project, PGx analysis is performed retrospectively for large cancer patient cohorts, using computational methods for genotyping from next-generation-sequencing data. This will make it possible to derive distributions of PGx variants, genotypes, and phenotypes, and to decipher their link to the observed treatment outcomes. In addition, sequencing data of matched tumor samples will be used to capture PGx differences acquired through, e.g., somatic resistance mutations associated with metabolizer phenotype changes.
Other somatic effects contributing to the development of resistance, such as allele-specific expression or epigenetic changes of pharmacogenes, will also be assessed. Finally, this project aims to develop supervised models for prediction of treatment outcome and side effects based on the aforementioned PGx data.

Ensembling Experts for Improved Accuracy and Privacy in Predictive Models for Healthcare
Doctoral researcher: Sonja Adomeit
Institution: UNI HD
Data science PI: Stefan Riezler
Life science PI: Daniel Durstewitz
To build machine learning models for predictive healthcare, the standard approach is to learn a unified prediction model on data pooled from several patients. Instead, we propose to learn predictive models on data labeled by experts for individual patients and to combine these patient-specific models using ensembling techniques. However, working with individual patient information creates a need for improved privacy protection of the training data. The goal of the proposed project is to find the optimal trade-off between accuracy and privacy, both from a theoretical perspective and in experimental applications of predictive models for sepsis and for diagnosis tasks in psychiatry.
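As a rough illustration of combining patient-specific models, the sketch below averages the outputs of several per-patient predictors. The project description does not specify the ensembling technique, so uniform averaging, the `Model` type, and the name `ensemble_predict` are assumptions made for this example; no privacy mechanism is shown.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Hypothetical model type: maps a feature vector to a risk probability in [0, 1].
Model = Callable[[Sequence[float]], float]


def ensemble_predict(models: List[Model], features: Sequence[float]) -> float:
    """Combine patient-specific models by uniform averaging (an assumed rule)."""
    if not models:
        raise ValueError("ensemble needs at least one model")
    return mean(model(features) for model in models)


# Three toy "patient-specific" models that ignore their input.
patient_models: List[Model] = [
    lambda x: 0.25,
    lambda x: 0.5,
    lambda x: 0.75,
]
print(ensemble_predict(patient_models, [1.0, 2.0]))
```

Weighted averaging or a learned gating function would be natural alternatives; which combination rule gives the best accuracy and privacy trade-off is part of what the project sets out to study.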