Task 5.1 - Health-disease model (m13-21): Empirical metagenome metadata will be obtained from the PreDicta and CURE cohorts. Dirichlet-multinomial mixture models will be constructed to identify states of the respiratory microbiome that are associated with disease. Least angle regression and model weighting approaches will be used to identify the best individual metagenomic predictors of health/disease, and these predictors will feed into T5.2.
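
As an illustration of the mixture-model step, the sketch below evaluates the Dirichlet-multinomial log-likelihood of one sample's taxon counts and assigns the sample to the most probable community state. It is a minimal sketch, not the project's implementation: the function names, the two-state mixture, and all parameter values (WEIGHTS, ALPHAS, the example counts) are hypothetical and would in practice be estimated from the cohort data.

```python
import math


def dm_log_likelihood(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: log n! - sum_i log x_i!
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalising constants.
    ll += math.lgamma(a) - math.lgamma(n + a)
    ll += sum(math.lgamma(x + al) - math.lgamma(al)
              for x, al in zip(counts, alpha))
    return ll


def state_posteriors(counts, weights, alphas):
    """Posterior probability of each mixture component (community state)."""
    logs = [math.log(w) + dm_log_likelihood(counts, al)
            for w, al in zip(weights, alphas)]
    m = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - m) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-state mixture over three taxa (illustrative values only).
WEIGHTS = [0.5, 0.5]
ALPHAS = [[8.0, 1.0, 1.0],   # state 0: dominated by taxon 0
          [1.0, 1.0, 8.0]]   # state 1: dominated by taxon 2

sample = [40, 5, 5]          # made-up read counts for one sample
post = state_posteriors(sample, WEIGHTS, ALPHAS)
assigned_state = post.index(max(post))
```

In a full analysis the component weights and concentration parameters would be fitted (for example by expectation-maximisation) and the number of states chosen by a model-selection criterion; only the likelihood and assignment steps are shown here.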

Task 5.2 - Time-series model (m19-36): Lotka-Volterra models will be fitted to time-series data from the CURE cohorts to predict the ecological dynamics of the respiratory microbiome. Initially, we will use existing time-series data for bacterial families, together with our own preliminary data, to parameterise a simple version of the model. We will use this model to generate a simulated data set, which we will analyse to optimise the sampling intervals for the remaining data collection (Milestone 3, D5.2.1). We will then add viral and bacterial families to the model sequentially, in the order of importance identified in T5.1, cross-validating against the empirical data at each step to maximise predictive capacity. If necessary, separate models will be fitted for each sex and for distinct immune states.
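
The simulation step described above can be sketched with a generalised Lotka-Volterra model, dx_i/dt = x_i (r_i + sum_j A_ij x_j), integrated numerically and sampled at a chosen interval. This is a minimal sketch under stated assumptions: the two-family system, the growth rates r, the interaction matrix A, the fourth-order Runge-Kutta integrator and the sampling interval are all hypothetical choices made for illustration, not fitted values or the project's chosen method.

```python
def glv_derivative(x, r, A):
    """Right-hand side of the generalised Lotka-Volterra equations."""
    n = len(x)
    return [x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(n)))
            for i in range(n)]


def rk4_step(x, r, A, dt):
    """Advance the abundances by one classical Runge-Kutta step."""
    k1 = glv_derivative(x, r, A)
    k2 = glv_derivative([xi + 0.5 * dt * k for xi, k in zip(x, k1)], r, A)
    k3 = glv_derivative([xi + 0.5 * dt * k for xi, k in zip(x, k2)], r, A)
    k4 = glv_derivative([xi + dt * k for xi, k in zip(x, k3)], r, A)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, r, A, dt, n_steps, sample_every):
    """Return (time, abundances) pairs recorded every `sample_every` steps."""
    x = list(x0)
    samples = [(0.0, list(x))]
    for step in range(1, n_steps + 1):
        x = rk4_step(x, r, A, dt)
        if step % sample_every == 0:
            samples.append((step * dt, list(x)))
    return samples


# Hypothetical two-family system with weak competition (illustrative only).
r = [1.0, 1.0]
A = [[-1.0, -0.5],
     [-0.5, -1.0]]

trajectory = simulate([0.1, 0.5], r, A, dt=0.01, n_steps=5000, sample_every=100)
final_time, final_state = trajectory[-1]
```

With these illustrative parameters both families approach the coexistence equilibrium at 2/3. Re-running `simulate` with different values of `sample_every`, and refitting the model to each thinned trajectory, is one way the effect of sampling interval on parameter recovery could be explored.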

Task 5.3 - Stochastic optimisation (m31-48): We will use the time-series models from T5.2 to simulate ecological dynamics starting from diseased states and to predict phage interventions that will guide the trajectories towards the healthy state. Stochastic optimisation will be used to identify potential control strategies (Milestone 6, D5.3.1). A software interface will be developed to disseminate the modelling tools. UMAN and UMEÅ will collaborate on all of the above tasks through a shared post-doctoral researcher.
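
To illustrate the idea of searching for an intervention that moves the community from a diseased to a healthy state, the sketch below applies pure random search, one of the simplest stochastic optimisers, to a toy bistable Lotka-Volterra system. Everything here is an assumption made for illustration: the two-family model, the representation of a phage intervention as a temporary reduction in the growth rate of the targeted family, the loss function, and all numerical values. The optimisation method and intervention model used in the project may differ.

```python
import random

# Hypothetical bistable system: family 0 dominates the diseased state,
# family 1 dominates the healthy state (illustrative values only).
R = [1.0, 1.0]
A = [[-1.0, -2.0],
     [-2.0, -1.0]]
HEALTHY = [0.0, 1.0]
DISEASED_START = [0.9, 0.05]


def derivative(x, dose):
    """Lotka-Volterra dynamics; the phage dose lowers family 0's growth rate."""
    r = [R[0] - dose, R[1]]
    return [x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(2)))
            for i in range(2)]


def rk4_step(x, dose, dt):
    """Advance the abundances by one classical Runge-Kutta step."""
    k1 = derivative(x, dose)
    k2 = derivative([xi + 0.5 * dt * k for xi, k in zip(x, k1)], dose)
    k3 = derivative([xi + 0.5 * dt * k for xi, k in zip(x, k2)], dose)
    k4 = derivative([xi + dt * k for xi, k in zip(x, k3)], dose)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def final_state(dose, dt=0.05, treat_steps=200, follow_steps=600):
    """Treat for a fixed period, then follow the untreated community."""
    x = list(DISEASED_START)
    for _ in range(treat_steps):
        x = rk4_step(x, dose, dt)
    for _ in range(follow_steps):
        x = rk4_step(x, 0.0, dt)
    return x


def loss(dose, cost=0.05):
    """Squared distance from the healthy state plus a dose penalty."""
    x = final_state(dose)
    return sum((xi - hi) ** 2 for xi, hi in zip(x, HEALTHY)) + cost * dose


def random_search(n_candidates=40, max_dose=2.0, seed=0):
    """Pure random search over the dose, seeded for reproducibility."""
    rng = random.Random(seed)
    best_dose, best_loss = 0.0, loss(0.0)
    for _ in range(n_candidates):
        dose = rng.uniform(0.0, max_dose)
        value = loss(dose)
        if value < best_loss:
            best_dose, best_loss = dose, value
    return best_dose, best_loss


untreated_loss = loss(0.0)
best_dose, best_loss = random_search()
```

Without treatment the toy community stays in the diseased state, whereas a sufficiently large temporary dose tips it into the basin of the healthy state, where it remains after treatment stops. The dose penalty makes the search favour the smallest dose that achieves this.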

List of deliverables