1 Introduction: Optimization
1.1 Optimization, Simulation, and Surrogate Modeling
- We will consider the interplay between
- mathematical models,
- numerical approximation,
- simulation,
- computer experiments, and
- field data
- Experimental design will play a key role in our developments, but not in the classical regression and response surface methodology sense
- We will work through challenging real-data/real-simulation examples that benefit from modern surrogate modeling methodology
- We will first consider the classical response surface methodology (RSM) approach and then move on to more modern approaches
- All approaches are based on surrogates
1.2 Surrogates
- Gathering data is expensive, and sometimes getting exactly the data you want is impossible or unethical
- Surrogate: substitute for the real thing
- In statistics, draws from predictive equations derived from a fitted model can act as a surrogate for the data-generating mechanism
- Benefits of the surrogate approach:
- A surrogate can be a cheaper way to explore relationships and entertain “what ifs?”
- Surrogates favor faithful yet pragmatic reproduction of dynamics over other uses of statistical models:
- interpretation,
- establishing causality, or
- identification
- Many numerical simulators are deterministic, whereas field observations are noisy or have measurement error
1.2.1 Costs of Simulation
- Computer simulations are generally cheaper (but not always!) than physical observation
- Some computer simulations can be just as expensive as field experimentation, but computer modeling is regarded as easier because:
- the experimental apparatus is better understood
- more aspects may be controlled
1.2.2 Mathematical Models and Meta-Models
- Use of mathematical models leveraging numerical solvers has been commonplace for some time
- Mathematical models became more complex, requiring more resources to simulate/solve numerically
- Practitioners increasingly relied on meta-models built off of limited simulation campaigns
1.2.3 Surrogates = Trained Meta-models
- Data collected via expensive computer evaluations were used to tune flexible functional forms, which could then be used in lieu of further simulation to
- save money or computational resources;
- cope with an inability to perform future runs (expired licenses, off-line or over-impacted supercomputers)
- Trained meta-models became known as surrogates
1.2.4 Computer Experiments
- Computer experiment: designing, running, and fitting meta-models
- Like an ordinary statistical experiment, except the data are generated by computer codes rather than physical or field observations, or surveys
- Surrogate modeling is statistical modeling of computer experiments
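
The three steps of a computer experiment (design, running, fitting) can be sketched in a few lines. This is a minimal illustration, not a recommended workflow: `expensive_simulator` is a hypothetical stand-in for a costly computer code, and the degree-5 polynomial is just one possible flexible functional form (an assumption made here for brevity; Gaussian processes are a common alternative).

```python
import numpy as np

def expensive_simulator(x):
    """Hypothetical stand-in for a costly deterministic computer code."""
    return np.sin(2.0 * np.pi * x) + 0.5 * x

# 1. Design: choose a small set of input locations in [0, 1].
X_design = np.linspace(0.0, 1.0, 12)

# 2. Running: evaluate the simulator once per design point.
y_design = expensive_simulator(X_design)

# 3. Fitting: train a flexible functional form (a degree-5 polynomial here).
coef = np.polyfit(X_design, y_design, deg=5)

def surrogate(x):
    """Cheap prediction used in lieu of further simulator runs."""
    return np.polyval(coef, x)

# Compare surrogate and simulator on a dense grid (only affordable in a toy).
X_new = np.linspace(0.0, 1.0, 101)
max_abs_error = np.max(np.abs(surrogate(X_new) - expensive_simulator(X_new)))
print(f"largest absolute surrogate error on the grid: {max_abs_error:.4f}")
```

Once trained, `surrogate` can be evaluated as often as needed without touching the simulator again. In a real study the final comparison would use a small held-out set of simulator runs rather than a dense grid.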
1.2.5 Limits of Mathematical Modeling
- Mathematical biologists, economists and others had reached the limit of equilibrium-based mathematical modeling with cute closed-form solutions
- Stochastic simulations replace deterministic solvers based on FEM, Navier–Stokes or Euler methods
- Agent-based simulation models are used to explore predator-prey (Lotka–Volterra) dynamics, spread of disease, management of inventory or patients in health insurance markets
- Consequence: the distinction between surrogate and statistical model is all but gone
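
To make the idea of a stochastic simulation concrete, the following sketch simulates predator-prey dynamics event by event with a Gillespie-style algorithm. It is an illustration under stated assumptions: the function name `simulate_predator_prey` and all rate constants (`birth`, `predation`, `death`) are hypothetical choices, not calibrated values.

```python
import random

def simulate_predator_prey(prey, predators, t_end, seed=0,
                           birth=1.0, predation=0.005, death=0.6):
    """Gillespie-style stochastic simulation of predator-prey dynamics.

    All rate constants are illustrative assumptions.
    Returns a list of (time, prey, predators) tuples, one per event.
    """
    rng = random.Random(seed)
    t = 0.0
    history = [(t, prey, predators)]
    while t < t_end:
        rates = [
            birth * prey,                  # a prey animal reproduces
            predation * prey * predators,  # a predator eats prey and reproduces
            death * predators,             # a predator dies
        ]
        total = sum(rates)
        if total == 0.0:                   # both populations are extinct
            break
        t += rng.expovariate(total)        # waiting time until the next event
        u = rng.random() * total           # select which event fires
        if u < rates[0]:
            prey += 1
        elif u < rates[0] + rates[1]:
            prey -= 1
            predators += 1
        else:
            predators -= 1
        history.append((t, prey, predators))
    return history

# Two runs with different seeds give different trajectories.
for seed in (1, 2):
    t, prey, predators = simulate_predator_prey(100, 50, t_end=2.0, seed=seed)[-1]
    print(f"seed={seed}: t={t:.2f}, prey={prey}, predators={predators}")
```

Unlike a deterministic solver, repeated runs with the same inputs but different seeds return different outputs, so the simulator output has to be treated as noisy data, just like field observations.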
1.2.6 Example: Why Computer Simulations are Necessary
- You can’t seed a real community with Ebola and watch what happens
- If there’s (real) field data, say on a historical epidemic, further experimentation may be almost entirely limited to the mathematical and computer modeling side
- Classical statistical methods offer little guidance
1.2.7 Simulation Requirements
- Simulation should
- enable rich diagnostics to help criticize the model
- help understand its sensitivity to inputs and other configurations
- provide the ability to optimize and
- refine both automatically and with expert intervention
- And it has to do all that while remaining computationally tractable
- One perspective is so-called response surface methods (RSMs):
- a poster child from industrial statistics’ heyday, well before information technology became a dominant industry
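
A classical RSM step can be sketched as follows: run a small designed experiment, fit a second-order polynomial by least squares, and solve for the stationary point of the fitted surface. Everything specific here is an assumption for illustration: `process_yield` is a hypothetical response with a known maximum, the design is a central composite design in coded units, and the noise level is arbitrary.

```python
import numpy as np

def process_yield(x1, x2):
    """Hypothetical process response with its maximum at (1.0, -0.5)."""
    return 80.0 - 2.0 * (x1 - 1.0) ** 2 - 3.0 * (x2 + 0.5) ** 2

# Central composite design in coded units: factorial, axial, and center runs.
a = np.sqrt(2.0)
design = np.array([
    [-1, -1], [1, -1], [-1, 1], [1, 1],
    [-a, 0], [a, 0], [0, -a], [0, a],
    [0, 0],
], dtype=float)
x1, x2 = design[:, 0], design[:, 1]

# Noisy observations, mimicking a physical experiment (assumed noise level).
rng = np.random.default_rng(0)
y = process_yield(x1, x2) + rng.normal(0.0, 0.05, len(design))

# Model matrix for y = b0 + b1 x1 + b2 x2 + b11 x1^2 + b22 x2^2 + b12 x1 x2.
M = np.column_stack([np.ones(len(design)), x1, x2, x1**2, x2**2, x1 * x2])
beta, *_ = np.linalg.lstsq(M, y, rcond=None)

# Write the fit as y = b0 + x'b + x'Bx; the gradient b + 2Bx vanishes at
# the stationary point x_s = -0.5 * B^{-1} b.
b = beta[1:3]
B = np.array([[beta[3], beta[5] / 2.0],
              [beta[5] / 2.0, beta[4]]])
x_stationary = -0.5 * np.linalg.solve(B, b)

print("estimated stationary point:", x_stationary)
print("eigenvalues of B:", np.linalg.eigvalsh(B))
```

If all eigenvalues of `B` are negative, the stationary point is a maximum of the fitted surface; mixed signs would indicate a saddle point, in which case further experimentation along a ridge is the usual next step.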
Goals
- How to choose models and optimizers for solving real-world problems
- How to use simulation to understand and improve processes
1.3 Jupyter Notebook
Note
- The Jupyter notebook for this lecture is available on GitHub in the Hyperparameter-Tuning-Cookbook repository