An Introduction to Neural Networks using Longitudinal Health Data

Organized by La Mobilière and Swiss Re Institute

Objectives

Introduce (simple) neural networks using longitudinal health data and highlight the importance of model explainability

In this 3-hour tutorial, Daniel and Michael will introduce neural networks on a synthetic health dataset with health risk factors such as BMI, blood pressure, and age to predict various health outcomes of individuals over time. The tutorial consists of four parts: creating the synthetic dataset, applying generalized linear models, transitioning to shallow and deep neural networks, and exploring model explainability and risk factor importance. All four parts are accompanied by ready-to-use Python scripts for the audience to explore during and after the tutorial.

Agenda

Part 1: Synthetic health dataset

Based on publicly available data and medical literature, we will first construct and provide a simplified but realistic function mapping individual health risk factors such as BMI, blood pressure, and age to one or several health outcomes, e.g. incidence rates of cardiovascular diseases or mortality rates. We will then simulate a longitudinal dataset from this function, as in the sketch below.
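A minimal simulation sketch is given below. The chosen risk factors, coefficients, and the log-hazard form of the "true" function are purely illustrative assumptions, not the actual function used in the tutorial.

```python
# Illustrative sketch: simulate a longitudinal dataset of yearly observations
# per individual, with a binary cardiovascular event driven by an assumed
# log-linear hazard in age, BMI, and systolic blood pressure.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

n_individuals, n_years = 1_000, 10
rows = []
for i in range(n_individuals):
    age0 = rng.uniform(30, 70)      # age at baseline
    bmi = rng.normal(26, 4)         # body mass index
    sbp = rng.normal(130, 15)       # systolic blood pressure
    for t in range(n_years):
        # Assumed "true" yearly hazard (coefficients are made up for illustration).
        log_rate = -9.0 + 0.08 * (age0 + t) + 0.05 * (bmi - 25) + 0.02 * (sbp - 120)
        p_event = 1.0 - np.exp(-np.exp(log_rate))   # yearly event probability
        event = rng.binomial(1, p_event)
        rows.append(dict(id=i, year=t, age=age0 + t, bmi=bmi, sbp=sbp, event=event))

df = pd.DataFrame(rows)
print(df.head())
```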

Part 2: Generalized linear models

Using the synthetic dataset, we will show how to set up generalized linear models to model, or essentially "reverse-engineer", the function that has been used to create the dataset (a minimal fitting sketch follows the list). We will very briefly go through the usual challenges:

  • Defining the "formula" of the GLM.
  • Linear, quadratic, and higher-order polynomial terms, or other types of functions.
  • How to select interactions?
  • How to treat missing values?
  • How to make use of the information provided by an earlier or later incidence?
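As one concrete candidate, the sketch below fits a logistic GLM with purely linear terms using statsmodels; it assumes the DataFrame `df` from the Part 1 sketch. The choice of formula, higher-order terms, and interactions is exactly the modelling question raised above.

```python
# Sketch: fit a GLM to "reverse-engineer" the simulated incidence function.
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One candidate formula with purely linear terms; quadratic terms or
# interactions (e.g. "event ~ age + bmi + sbp + age:bmi") are alternatives.
glm = smf.glm("event ~ age + bmi + sbp", data=df, family=sm.families.Binomial())
result = glm.fit()
print(result.summary())
```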

Part 3: Neural networks

In this part of the tutorial, we will show (and visualize where possible)

  • that neural networks with linear activation functions are equivalent to GLMs (illustrated in the sketch after this list),
  • how activation functions, network structures, network depths, etc. affect the family of functions that can be modelled by the neural network,
  • how GLMs can be used to "pre-train" neural networks.
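The sketch below contrasts a network with no hidden layer and a sigmoid output (the same model family as a logistic GLM) with a shallow network whose non-linear hidden layer enlarges the family of representable functions. It assumes `df` from the Part 1 sketch and that TensorFlow/Keras is available; layer sizes and activations are illustrative.

```python
# Sketch: "GLM as a network" vs. a shallow non-linear network.
import tensorflow as tf

# Features and binary target from the Part 1 DataFrame `df`
# (in practice one would standardise the features first).
X = df[["age", "bmi", "sbp"]].to_numpy(dtype="float32")
y = df["event"].to_numpy(dtype="float32")

# No hidden layer, sigmoid output: equivalent to logistic regression.
glm_net = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])

# One non-linear hidden layer: a strictly larger family of functions.
shallow_net = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

for net in (glm_net, shallow_net):
    net.compile(optimizer="adam", loss="binary_crossentropy")
    net.fit(X, y, epochs=5, batch_size=256, verbose=0)
```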

Part 4: Model explainability and risk factor importance

  • Why explainability is always relevant.
  • Permutation importance: How to measure feature importance for any model? (See the sketch after this list.)
  • From Individual Conditional Expectations to Partial Dependence: Studying feature effects for any model.
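Permutation importance needs only a fitted model and a metric: shuffle one feature at a time and measure how much the predictive loss degrades. The sketch below applies this idea to the (assumed) `shallow_net`, `X`, and `y` from the Part 3 sketch; the log-loss metric is one illustrative choice.

```python
# Sketch: model-agnostic permutation importance via loss degradation.
import numpy as np
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
baseline = log_loss(y, shallow_net.predict(X, verbose=0).ravel(), labels=[0, 1])

for j, name in enumerate(["age", "bmi", "sbp"]):
    X_perm = X.copy()
    # Shuffle column j to break its link with the outcome.
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    permuted = log_loss(y, shallow_net.predict(X_perm, verbose=0).ravel(), labels=[0, 1])
    print(f"{name}: loss increase = {permuted - baseline:.4f}")
```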
