Parameter Searching For Epidemiology Models Using Bayesian Optimization

This document discusses using Bayesian optimization to find optimal parameters for epidemiology models. It provides background on epidemiology modeling and challenges with traditional parameter fitting methods. It then demonstrates Bayesian optimization on a toy SIR model, finding parameters with better accuracy and less computation than other methods.

Parameter Searching for Epidemiology Models using Bayesian Optimization
Ryan Chrest
Sigma Xi
5/7/2021
Epidemiology Modeling Background
● A way to model disease spread within a population.
● Compartments, or “groups”, of the population as they become infected and recover.
● Nonlinear system of ODEs with parameters and initial conditions that define rates such as infection and recovery.
● How do we find parameters given epidemic data?
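For reference, the SIR system behind the toy example later in the deck can be written as below. This is a sketch that assumes the standard mass-action form, with β as the transmission rate and γ as the recovery rate; the slides do not show the equations, so the exact form used by the author is an assumption.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - \gamma I, \\
\frac{dR}{dt} &= \gamma I,
\end{aligned}
\qquad S(0) = S_0,\quad I(0) = I_0,\quad R(0) = R_0 .
```

Under this form the total population S + I + R stays constant, and 1/γ is the average infective period.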
Many ways to find “best” parameters

1. Mathematical derivation of the equations, numerically finding implicit solutions.

2. Best guess estimates from the meaning of the parameters
○ Such as the recovery rate of the SIR model: 1 / “average days of infection to recovery”
○ Using outside studies on the average days it takes to recover, say 5 days, the recovery rate would be estimated as ⅕.
○ Then using these estimates to fill in the implicit solutions.
3. Fitting the model to data with a loss function.
○ This is usually combined with (1 & 2) as well for a rough starting point in the search, but these estimates aren’t always available.
Loss function
● The Mean Squared Error is the average squared distance between the true value and predicted value.
● In the slide's plot, this distance is shown in green.
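A minimal sketch of the Mean Squared Error as a function. The function name and the three data points are made up for illustration and do not come from the slides.

```python
import numpy as np

def mean_squared_error(observed, predicted):
    """Average squared distance between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((observed - predicted) ** 2))

# Made-up numbers, for illustration only.
observed = [3.0, 8.0, 28.0]
predicted = [2.0, 10.0, 25.0]

# Squared errors are 1, 4 and 9, so the result is 14 / 3.
print(mean_squared_error(observed, predicted))
```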
Obstacles with fitting the model to data

1. The data is noisy, both from individual differences and error in reporting.

2. The loss function itself is usually extremely complex to minimize.

3. The evaluation of the model is computationally expensive.

○ We have to numerically solve the system of ODEs every time we try new parameters.

○ This has to be done repeatedly for each combination.
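The cost described above can be made concrete with a short sketch: every evaluation of the loss requires one numerical solve of the ODE system. The code assumes a mass-action SIR model and uses SciPy's `solve_ivp`. The data are synthetic (generated from the model plus noise), and the population size, parameter values and noise level are illustrative assumptions, not the boarding school data.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    """Right-hand side of the mass-action SIR system (assumed form)."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

def simulate_infected(beta, gamma, y0, t_eval):
    """Numerically solve the SIR system and return I(t) at the times in t_eval."""
    sol = solve_ivp(sir_rhs, (t_eval[0], t_eval[-1]), y0,
                    args=(beta, gamma), t_eval=t_eval, rtol=1e-8, atol=1e-8)
    return sol.y[1]

def loss(params, y0, t_eval, observed):
    """MSE between observed infected counts and the model's I(t). One ODE solve per call."""
    beta, gamma = params
    predicted = simulate_infected(beta, gamma, y0, t_eval)
    return float(np.mean((observed - predicted) ** 2))

# Synthetic stand-in data: hypothetical "true" parameters plus Gaussian noise.
y0 = [762.0, 1.0, 0.0]            # assumed initial S, I, R
t_eval = np.arange(0.0, 15.0)     # daily observations over 15 days
rng = np.random.default_rng(0)
true_params = (0.002, 0.45)       # illustrative values only
observed = simulate_infected(*true_params, y0, t_eval) + rng.normal(0, 5, t_eval.size)

print("loss at the generating parameters:", loss(true_params, y0, t_eval, observed))
print("loss at a poor guess:", loss((0.003, 0.30), y0, t_eval, observed))
```

Any search method has to call `loss` once per candidate parameter pair, which is why the number of evaluations matters.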

Searching the parameter space for solutions
1. Grid search, searching every possible combination of parameters.
○ Very large number of samples, extremely slow, inefficient, crude.
2. Gradient Descent methods of minimizing a function, convergence not guaranteed.
○ Lower number of samples, slow, very sensitive to initial parameters for the search in order to converge to a meaningful solution.
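As a rough illustration of the grid search option, the sketch below evaluates the loss at every point of a 21 x 21 grid, which already costs 441 ODE solves for two parameters. The model form, the synthetic data and the grid ranges are all assumptions chosen for illustration.

```python
import itertools
import numpy as np
from scipy.integrate import solve_ivp

def simulate_infected(beta, gamma, y0, t_eval):
    """Solve a mass-action SIR model (assumed form) and return I(t)."""
    def rhs(t, y):
        s, i, r = y
        return [-beta * s * i, beta * s * i - gamma * i, gamma * i]
    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), y0, t_eval=t_eval)
    return sol.y[1]

def loss(beta, gamma, y0, t_eval, observed):
    """MSE between observed and simulated infected counts."""
    predicted = simulate_infected(beta, gamma, y0, t_eval)
    return float(np.mean((observed - predicted) ** 2))

# Synthetic stand-in data with illustrative parameters (not the real outbreak data).
y0 = [762.0, 1.0, 0.0]
t_eval = np.arange(0.0, 15.0)
rng = np.random.default_rng(0)
observed = simulate_infected(0.002, 0.45, y0, t_eval) + rng.normal(0, 5, t_eval.size)

# Hypothetical search ranges; a finer grid or a third parameter multiplies the cost.
betas = np.linspace(0.001, 0.004, 21)
gammas = np.linspace(0.2, 0.8, 21)

# Evaluate every combination and keep the one with the smallest loss.
best_loss, best_beta, best_gamma = min(
    (loss(b, g, y0, t_eval, observed), b, g)
    for b, g in itertools.product(betas, gammas)
)

print(f"{len(betas) * len(gammas)} ODE solves")
print(f"best beta={best_beta:.5f}, gamma={best_gamma:.3f}, MSE={best_loss:.2f}")
```

The resolution of the answer is limited by the grid spacing, so accuracy and cost trade off directly.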
Toy example - The SIR Model and Influenza at an English Boarding School 1978
● Best guess estimate
○ Infective period is 2.1 days
○ Take 𝛄 = 1/2.1 = 0.476
○ Implicitly, 𝜷 = 0.002342
○ MSE of 700
Toy Example - Loss Function Search
● Running the gradient-based L-BFGS-B optimizer from each initial value (beta, gamma).
● The initial value space is noisy, with many local minima that cause L-BFGS-B to fail to converge.
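The experiment described above can be sketched by running SciPy's L-BFGS-B from several starting points and comparing where each run ends. The model form, synthetic data, bounds and starting points are assumptions for illustration; the slides do not give the grid of initial values that was used.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def simulate_infected(beta, gamma, y0, t_eval):
    """Solve a mass-action SIR model (assumed form) and return I(t)."""
    def rhs(t, y):
        s, i, r = y
        return [-beta * s * i, beta * s * i - gamma * i, gamma * i]
    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), y0,
                    t_eval=t_eval, rtol=1e-8, atol=1e-8)
    return sol.y[1]

def loss(params, y0, t_eval, observed):
    """MSE between observed and simulated infected counts."""
    beta, gamma = params
    predicted = simulate_infected(beta, gamma, y0, t_eval)
    return float(np.mean((observed - predicted) ** 2))

# Synthetic stand-in data with illustrative parameters (not the real outbreak data).
y0 = [762.0, 1.0, 0.0]
t_eval = np.arange(0.0, 15.0)
rng = np.random.default_rng(0)
observed = simulate_infected(0.002, 0.45, y0, t_eval) + rng.normal(0, 5, t_eval.size)

bounds = [(1e-4, 1e-2), (0.05, 1.5)]   # hypothetical box for (beta, gamma)
starts = [(0.001, 0.2), (0.003, 0.6), (0.005, 1.0), (0.008, 0.1)]

results = []
for x0 in starts:
    # Gradients are approximated by finite differences, so each step costs extra ODE solves.
    res = minimize(loss, x0, args=(y0, t_eval, observed),
                   method="L-BFGS-B", bounds=bounds)
    results.append((x0, res))
    print(f"start={x0} -> beta={res.x[0]:.5f}, gamma={res.x[1]:.3f}, "
          f"MSE={res.fun:.2f}, success={res.success}")
```

Different starting points may end at different parameter values and losses, which is the sensitivity to initial values that the slide refers to.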
What would an ideal, holistic search method look like?
● Good statistical properties
○ Confidence intervals
○ Allow for resampling
● Use of prior information if available
○ In case we do have a good best guess estimate.
● Less time for setup and searching
● A lower knowledge barrier
● More robust in producing reliable results with noisy reporting data
Bayesian Optimization fit to Toy Example

● 𝜷 = 0.00214
● 𝛄 = 0.42195
● MSE = 347.12
Citations

● Martcheva, M. An Introduction to Mathematical Epidemiology. Springer, 2015.

● https://github.com/scikit-optimize/scikit-optimize
This work was supported by the NSF Data Science Program with
Career Support and Connections to Industry, Award #1842386.