
Objectives

Introduce (simple) neural networks using longitudinal health data and highlight the importance of model explainability.

In this 3-hour tutorial, Daniel and Michael will introduce neural networks on a synthetic health dataset with health risk factors such as BMI, blood pressure and age, and use them to predict various health outcomes of individuals over time. The tutorial consists of 4 parts: creating the synthetic dataset, applying generalized linear models, transitioning to shallow and deep neural networks, and model explainability and risk factor importance. All 4 parts will be accompanied by ready-to-use Python scripts for the audience to explore during and after the tutorial.

Agenda

Part 1: Synthetic health dataset

Based on publicly available data and medical literature, we will first construct (or provide) a simplified but realistic function that maps individual health risk factors such as BMI, blood pressure and age to one or several health outcomes, e.g. incidence rates of cardiovascular diseases or mortality rates. We will then simulate a longitudinal dataset.
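To make the idea concrete, here is a minimal sketch of how such a dataset could be simulated. The function name `simulate_cohort`, the risk factor distributions and all coefficients are illustrative assumptions, not the function used in the tutorial: a log-linear annual incidence rate is assumed, and each individual is followed year by year until the first event or the end of the observation window.

```python
import numpy as np

def simulate_cohort(n_individuals=1000, n_years=10, seed=0):
    """Simulate yearly follow-up records from a known (hypothetical) log-linear hazard.

    Returns an array with columns: id, year, age, bmi, sbp, event.
    """
    rng = np.random.default_rng(seed)
    # Baseline risk factors; distributions are illustrative, not calibrated to real data.
    age0 = rng.uniform(30, 70, n_individuals)
    bmi = rng.normal(27, 4, n_individuals)
    sbp = rng.normal(130, 15, n_individuals)  # systolic blood pressure in mmHg

    rows = []
    for i in range(n_individuals):
        for t in range(n_years):
            age = age0[i] + t
            # Assumed "true" function: log of the annual incidence rate.
            log_rate = -9.0 + 0.07 * age + 0.04 * (bmi[i] - 27) + 0.015 * (sbp[i] - 130)
            # Probability of at least one event within the year.
            p_event = 1.0 - np.exp(-np.exp(log_rate))
            event = rng.random() < p_event
            rows.append((i, t, age, bmi[i], sbp[i], int(event)))
            if event:
                break  # follow-up ends at the first event
    return np.array(rows)

data = simulate_cohort(n_individuals=200, n_years=5, seed=1)
print(data.shape, "events:", int(data[:, 5].sum()))
```

Each row is one person-year, so individuals who experience an event early contribute fewer rows than those who remain event-free.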

Part 2: Generalized linear models

Using the synthetic dataset, we will show how to set up generalized linear models (GLMs) to model, or essentially "reverse-engineer", the function that was used to create the dataset. We will very briefly go through the usual challenges:

  • Defining the "formula" of the GLM.
  • Choosing functional forms: linear, quadratic, higher-order polynomials, or other types of functions.
  • How to select interactions?
  • How to treat missing values?
  • How to make use of the information provided by an earlier or later incidence?
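As a rough illustration of the "reverse-engineering" idea, the sketch below fits a binomial GLM with logit link by iteratively reweighted least squares, using only NumPy. The helper `fit_logistic_glm`, the toy data and the chosen "formula" (age, BMI and a quadratic BMI term) are assumptions made for this example. In practice a library such as statsmodels or scikit-learn would typically be used instead.

```python
import numpy as np

def fit_logistic_glm(X, y, n_iter=25, ridge=1e-8):
    """Fit a binomial GLM with logit link by iteratively reweighted least squares."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        eta = X1 @ beta                          # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))          # inverse link (sigmoid)
        w = np.clip(mu * (1.0 - mu), 1e-10, None)  # IRLS weights
        z = eta + (y - mu) / w                   # working response
        A = X1.T @ (w[:, None] * X1) + ridge * np.eye(X1.shape[1])
        beta = np.linalg.solve(A, X1.T @ (w * z))
    return beta

# Toy data generated from a known function (illustrative coefficients).
rng = np.random.default_rng(0)
n = 5000
age_s = (rng.uniform(30, 70, n) - 50) / 10   # standardized age
bmi_s = (rng.normal(27, 4, n) - 27) / 4      # standardized BMI

# The "formula": intercept + age + bmi + bmi^2.
design = np.column_stack([age_s, bmi_s, bmi_s ** 2])
true_beta = np.array([-2.0, 0.8, 0.3, 0.2])
eta = true_beta[0] + design @ true_beta[1:]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

beta_hat = fit_logistic_glm(design, y)
print("true:     ", true_beta)
print("estimated:", np.round(beta_hat, 2))
```

Because the quadratic BMI term is part of the design matrix, the GLM can recover it. Leaving that column out would force the model to approximate a curved effect with a straight line, which is exactly the kind of formula choice the list above refers to.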

Part 3: Neural networks

In this part of the tutorial, we will show (and visualize where possible)

  • that neural networks whose hidden layers use linear activation functions are equivalent to GLMs (the output activation plays the role of the inverse link function),
  • how activation functions, network structures, network depths, etc. affect the family of functions that can be modelled by the neural network,
  • how GLMs can be used to "pre-train" neural networks.
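The first and last points can be sketched in a few lines of NumPy. The first half shows that stacking layers with linear (identity) hidden activations collapses to a single coefficient vector, i.e. a GLM with the sigmoid as inverse link. The second half shows one possible way to warm-start a network from a GLM: a skip connection carries the GLM coefficients while the output weights of a ReLU hidden layer start at zero. This warm-start architecture and the name `warm_started_net` are assumptions for illustration; the tutorial may use a different scheme.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 individuals, 3 risk factors

# Two-layer network with linear (identity) hidden activation and sigmoid output.
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)
net_out = sigmoid((X @ W1 + b1) @ W2 + b2)

# The same map written as a GLM: one coefficient vector and one intercept.
beta = W1 @ W2
intercept = b1 @ W2 + b2
glm_out = sigmoid(X @ beta + intercept)
print(np.allclose(net_out, glm_out))

def warm_started_net(X, beta, intercept, W_h, b_h, v):
    """GLM part via a skip connection plus a ReLU hidden layer with output weights v."""
    hidden = np.maximum(0.0, X @ W_h + b_h)
    return sigmoid(X @ beta + intercept + hidden @ v)

# With v initialised to zero, the network reproduces the GLM before any training.
W_h, b_h = rng.normal(size=(3, 8)), np.zeros(8)
v0 = np.zeros((8, 1))
warm_out = warm_started_net(X, beta, intercept, W_h, b_h, v0)
print(np.allclose(warm_out, glm_out))
```

Starting from the GLM solution means that subsequent training only has to learn what the GLM misses, such as non-linearities and interactions.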

Part 4: Model explainability and risk factor importance

  • Why explainability is always relevant.
  • Permutation importance: How to measure feature importance for any model?
  • From Individual Conditional Expectations to Partial Dependence: Studying feature effects for any model.
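Both techniques only need a prediction function, which is why they work for any model. The sketch below is a minimal NumPy version under simple assumptions: the function names and the toy model are made up for this example, importance is measured as the increase in loss after shuffling one feature, and the partial dependence curve is taken as the average of the individual conditional expectation (ICE) curves.

```python
import numpy as np

def permutation_importance(predict, X, y, loss, n_repeats=5, seed=0):
    """Mean increase in loss when a single feature column is randomly shuffled."""
    rng = np.random.default_rng(seed)
    base = loss(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the link between feature j and y
            increases.append(loss(y, predict(Xp)) - base)
        importances[j] = np.mean(increases)
    return importances

def ice_curves(predict, X, feature, grid):
    """One curve per individual: prediction as the chosen feature moves over the grid."""
    curves = np.empty((X.shape[0], len(grid)))
    for k, value in enumerate(grid):
        Xg = X.copy()
        Xg[:, feature] = value
        curves[:, k] = predict(Xg)
    return curves

def partial_dependence(predict, X, feature, grid):
    """Partial dependence is the average of the ICE curves."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)

# Toy model standing in for a fitted GLM or neural network; feature 2 is unused.
def predict(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2

def mse(y, p):
    return np.mean((y - p) ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = predict(X)

imp = permutation_importance(predict, X, y, mse)
grid = np.linspace(-2, 2, 5)
pd_curve = partial_dependence(predict, X, 0, grid)
print("importances:", np.round(imp, 2))
print("partial dependence of feature 0:", np.round(pd_curve, 2))
```

In this toy setting the unused feature gets zero importance, and the partial dependence curve of feature 0 is a straight line with slope 2, matching the model's linear effect.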