Introducing (simple) neural networks using longitudinal health data and highlighting the importance of model explainability
In this 3-hour tutorial, Daniel and Michael will introduce neural networks using a synthetic health dataset with risk factors such as BMI, blood pressure, and age to predict various health outcomes of individuals over time. The tutorial consists of four parts: creating the synthetic dataset, applying generalized linear models, transitioning to shallow and deep neural networks, and model explainability and risk factor importance. All four parts will be accompanied by ready-to-use Python scripts for the audience to explore during and after the tutorial.
Part 1: Synthetic health dataset
Based on publicly available data and medical literature, we will first construct (or provide) a simplified but realistic function that maps individual health risk factors such as BMI, blood pressure, and age to one or several health outcomes, e.g. incidence rates of cardiovascular diseases or mortality rates. We will then simulate a longitudinal dataset.
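To make the idea concrete, here is a minimal sketch of such a simulation. The function `true_logit`, its coefficients, the risk factor distributions, and the yearly follow-up scheme are all illustrative assumptions, not the function or data the tutorial will use.

```python
import numpy as np

def true_logit(age, bmi, sbp):
    """Hypothetical 'ground truth': yearly log-odds of a first event.
    Coefficients are made up for illustration, not taken from the literature."""
    return -7.0 + 0.05 * age + 0.08 * (bmi - 25.0) + 0.02 * (sbp - 120.0)

def simulate_cohort(n_individuals=1000, n_years=5, seed=0):
    """Return one row per individual and year at risk.
    Columns: id, year, age, bmi, sbp (systolic blood pressure), event."""
    rng = np.random.default_rng(seed)
    ids = np.arange(n_individuals)
    age0 = rng.uniform(30.0, 70.0, n_individuals)   # age at baseline
    bmi = rng.normal(27.0, 4.0, n_individuals)      # held constant over time
    sbp = rng.normal(130.0, 15.0, n_individuals)    # held constant over time
    at_risk = np.ones(n_individuals, dtype=bool)
    records = []
    for year in range(n_years):
        age = age0 + year
        p = 1.0 / (1.0 + np.exp(-true_logit(age, bmi, sbp)))
        # An event can only occur for individuals who are still at risk.
        event = (rng.random(n_individuals) < p) & at_risk
        block = np.column_stack(
            [ids, np.full(n_individuals, year), age, bmi, sbp, event.astype(float)]
        )
        records.append(block[at_risk])
        # Individuals leave the risk set after their first event.
        at_risk &= ~event
    return np.vstack(records)

data = simulate_cohort(n_individuals=1000, n_years=5, seed=0)
print(data.shape, "observed events:", int(data[:, 5].sum()))
```

Each individual contributes one row per year until the first event, which gives the dataset its longitudinal structure.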
Part 2: Generalized linear models
Using the synthetic dataset, we will show how to set up generalized linear models (GLMs) to model, or essentially "reverse-engineer", the function that was used to create the dataset. We will very briefly go through the usual challenges:
- Defining the "formula" of the GLM.
- Choosing functional forms: linear, quadratic, higher-order polynomials, or other types of functions.
- How to select interactions?
- How to treat missing values?
- How to make use of the information provided by an earlier or later incidence?
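As a rough illustration of the "reverse-engineering" idea, the sketch below fits a binomial GLM with logit link by iteratively reweighted least squares (Newton's method) using only NumPy. The data-generating coefficients in `true_beta`, the chosen "formula" (intercept, age, BMI, and an age-BMI interaction), and the helper name `fit_logistic_glm` are assumptions for this example. In practice a statistics library would typically be used instead.

```python
import numpy as np

def fit_logistic_glm(X, y, n_iter=25):
    """Fit a logistic-regression GLM by Newton/IRLS steps.
    X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                      # IRLS weights
        grad = X.T @ (y - mu)                    # score vector
        hess = X.T @ (X * w[:, None])            # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: standardized age and BMI plus their interaction.
rng = np.random.default_rng(0)
n = 5000
z_age = (rng.uniform(30.0, 70.0, n) - 50.0) / 10.0
z_bmi = (rng.normal(27.0, 4.0, n) - 27.0) / 4.0

# The design matrix plays the role of the GLM "formula":
# outcome ~ 1 + age + bmi + age:bmi
X = np.column_stack([np.ones(n), z_age, z_bmi, z_age * z_bmi])
true_beta = np.array([-1.0, 0.8, 0.5, 0.3])      # assumed for illustration
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat = fit_logistic_glm(X, y)
print("true coefficients:     ", true_beta)
print("estimated coefficients:", np.round(beta_hat, 2))
```

Because the fitted formula matches the generating function here, the estimates land close to `true_beta`. Dropping the interaction column from `X` shows how a misspecified formula distorts the remaining coefficients.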
Part 3: Neural networks
In this part of the tutorial, we will show (and visualize where possible)
- that neural networks with linear activation functions in the hidden layers (and the inverse link function as output activation) are equivalent to GLMs,
- how activation functions, network structures, network depths, etc. affect the family of functions that can be modelled by the neural network,
- how GLMs can be used to "pre-train" neural networks.
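The first and last points can be sketched numerically. With an identity activation in the hidden layer and a sigmoid output, the two weight matrices collapse into a single coefficient vector, i.e. a logistic GLM. A nonlinear activation such as tanh breaks this equivalence. The final part shows one possible reading of "pre-training": a skip connection carries the GLM predictor and the output weights of the hidden layer start at zero, so the network initially reproduces the GLM. The weights, the shapes, and the helper `glm_initialised_network` are assumptions for illustration, and the tutorial may use a different scheme.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def network(X, W1, b1, W2, b2, hidden_activation):
    """One hidden layer, sigmoid output (inverse of the logit link)."""
    hidden = hidden_activation(X @ W1 + b1)
    return sigmoid(hidden @ W2 + b2)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                   # 3 risk factors (standardized)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=4), 0.3

# Linear hidden layer: the network is a GLM with collapsed coefficients.
p_linear_net = network(X, W1, b1, W2, b2, lambda z: z)
beta = W1 @ W2
intercept = b1 @ W2 + b2
p_glm = sigmoid(X @ beta + intercept)
print("linear net equals GLM:", np.allclose(p_linear_net, p_glm))

# Nonlinear hidden layer: same weights, but a different function.
p_tanh_net = network(X, W1, b1, W2, b2, np.tanh)
print("tanh net equals GLM:  ", np.allclose(p_tanh_net, p_glm))

def glm_initialised_network(X, beta, intercept, W1, b1, W2):
    """Hypothetical pre-training scheme: GLM predictor plus a tanh correction."""
    return sigmoid(X @ beta + intercept + np.tanh(X @ W1 + b1) @ W2)

# With zero output weights the network starts exactly at the GLM fit.
p_start = glm_initialised_network(X, beta, intercept, W1, b1, np.zeros(4))
print("GLM-initialised net equals GLM at start:", np.allclose(p_start, p_glm))
```

Training would then only need to learn the deviation from the GLM rather than the whole function from scratch.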
Part 4: Model explainability and risk factor importance
- Why explainability is always relevant.
- Permutation importance: How to measure feature importance for any model?
- From Individual Conditional Expectations to Partial Dependence: Studying feature effects for any model.
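Both techniques only need a prediction function, which is what makes them model-agnostic. The sketch below implements them with NumPy on a toy model. The function names, the toy `predict` function, and the mean squared error loss are assumptions for illustration, not the tutorial's scripts.

```python
import numpy as np

def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Average increase in loss when a single feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = loss(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X[:, j])  # break the link to y
            increases.append(loss(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

def ice_curves(predict, X, feature, grid):
    """Individual Conditional Expectation: one curve per row of X."""
    curves = np.empty((X.shape[0], len(grid)))
    for k, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value        # set the feature to the grid value
        curves[:, k] = predict(X_mod)
    return curves

def partial_dependence(predict, X, feature, grid):
    """Partial dependence is the average of the ICE curves."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)

# Toy model (hypothetical): uses features 0 and 1 and ignores feature 2.
def predict(A):
    return 2.0 * A[:, 0] + A[:, 1] ** 2

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = predict(X) + rng.normal(scale=0.1, size=500)

imp = permutation_importance(predict, X, y, mse)
grid = np.linspace(-2.0, 2.0, 5)
ice = ice_curves(predict, X, 0, grid)
pdp = partial_dependence(predict, X, 0, grid)
print("permutation importance:", np.round(imp, 3))
print("partial dependence of feature 0:", np.round(pdp, 3))
```

Since the toy model ignores feature 2, shuffling that column leaves the loss unchanged and its importance is zero. The partial dependence of feature 0 is a straight line with slope 2, shifted by the average contribution of feature 1.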