1  Introduction: Optimization

1.1 Optimization, Simulation, and Surrogate Modeling

  • We will consider the interplay between
    • mathematical models,
    • numerical approximation,
    • simulation,
    • computer experiments, and
    • field data
  • Experimental design will play a key role in our developments, but not in the classical regression and response surface methodology sense
  • We will work through challenging real-data/real-simulation examples that benefit from modern surrogate modeling methodology
  • We will consider the classical, response surface methodology (RSM) approach, and then move on to more modern approaches
  • All approaches are based on surrogates

1.2 Surrogates

  • Gathering data is expensive, and sometimes getting exactly the data you want is impossible or unethical
  • Surrogate: substitute for the real thing
  • In statistics, draws from predictive equations derived from a fitted model can act as a surrogate for the data-generating mechanism
  • Benefits of the surrogate approach:
    • A surrogate can be a cheaper way to explore relationships and entertain “what ifs?”
    • Surrogates favor faithful yet pragmatic reproduction of dynamics over other things statistical models are used for:
      • interpretation,
      • establishing causality, or
      • identification
    • Many numerical simulators are deterministic, whereas field observations are noisy or have measurement error

1.2.1 Costs of Simulation

  • Computer simulations are generally cheaper (but not always!) than physical observation
  • Some computer simulations can be just as expensive as field experimentation, but computer modeling is regarded as easier because:
    • the experimental apparatus is better understood, and
    • more aspects may be controlled

1.2.2 Mathematical Models and Meta-Models

  • Use of mathematical models leveraging numerical solvers has been commonplace for some time
  • Mathematical models became more complex, requiring more resources to simulate/solve numerically
  • Practitioners increasingly relied on meta-models built off of limited simulation campaigns

1.2.3 Surrogates = Trained Meta-models

  • Data collected via expensive computer evaluations were used to tune flexible functional forms, which could then be used in lieu of further simulation to
    • save money or computational resources;
    • cope with an inability to perform future runs (expired licenses, off-line or over-impacted supercomputers)
  • Trained meta-models became known as surrogates
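
The idea of training a meta-model on a limited simulation campaign can be sketched in a few lines. This is a minimal illustration, not a method prescribed by these notes: `expensive_simulator` is a hypothetical stand-in (a cheap toy function), and the degree-7 polynomial is just one simple choice of flexible functional form.

```python
import numpy as np

def expensive_simulator(x):
    """Hypothetical stand-in for a costly computer code (here a cheap toy function)."""
    return np.sin(2.0 * np.pi * x)

# A limited simulation campaign: 12 runs on an evenly spaced design in [0, 1].
x_design = np.linspace(0.0, 1.0, 12)
y_design = expensive_simulator(x_design)

# Train the meta-model: a degree-7 polynomial fitted by least squares.
coeffs = np.polyfit(x_design, y_design, deg=7)
surrogate = np.poly1d(coeffs)

# Use the trained surrogate in lieu of further simulation.
x_new = np.linspace(0.0, 1.0, 201)
y_pred = surrogate(x_new)

# For this toy case only, we can afford to compare against the simulator itself.
max_abs_error = float(np.max(np.abs(y_pred - expensive_simulator(x_new))))
print(f"max |surrogate - simulator| on [0, 1]: {max_abs_error:.2e}")
```

In a real study the final comparison would not be available on a dense grid, since each simulator call is costly; accuracy would instead be judged on a small set of held-out runs.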

1.2.4 Computer Experiments

  • Computer experiment: designing and running a simulation campaign, and fitting meta-models to the results
    • Like an ordinary statistical experiment, except the data are generated by computer codes rather than physical or field observations, or surveys
  • Surrogate modeling is statistical modeling of computer experiments

1.2.5 Limits of Mathematical Modeling

  • Mathematical biologists, economists and others had reached the limit of equilibrium-based mathematical modeling with cute closed-form solutions
  • Stochastic simulations replace deterministic solvers based on FEM, Navier–Stokes or Euler methods
  • Agent-based simulation models are used to explore predator-prey (Lotka–Volterra) dynamics, spread of disease, management of inventory or patients in health insurance markets
  • Consequence: the distinction between surrogate and statistical model is all but gone
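
To make the contrast with deterministic solvers concrete, here is a small sketch of a stochastic predator-prey simulation. It uses a Gillespie-style event simulation, which is a swapped-in technique chosen for brevity rather than a full agent-based model; the function name, rate constants, and initial populations are all assumptions made for illustration.

```python
import numpy as np

def simulate_predator_prey(prey, predators, t_end, rng,
                           birth=1.0, predation=0.02, death=1.0):
    """Gillespie-style stochastic simulation; returns the final (prey, predators)."""
    t = 0.0
    while True:
        # Event rates: prey birth, predation (prey becomes predator), predator death.
        rates = np.array([birth * prey,
                          predation * prey * predators,
                          death * predators])
        total = rates.sum()
        if total == 0.0:
            break  # both populations are extinct, so nothing more can happen
        t += rng.exponential(1.0 / total)  # waiting time until the next event
        if t > t_end:
            break
        event = rng.choice(3, p=rates / total)
        if event == 0:
            prey += 1
        elif event == 1:
            prey -= 1
            predators += 1
        else:
            predators -= 1
    return prey, predators

# Same inputs, different random seeds: the outputs typically differ between runs.
runs = [simulate_predator_prey(50, 20, t_end=2.0, rng=np.random.default_rng(seed))
        for seed in range(5)]
print(runs)
```

Because repeated runs at the same input give different outputs, the simulator output looks like noisy data, which is why modeling it is hard to distinguish from ordinary statistical modeling.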

1.2.6 Example: Why Computer Simulations are Necessary

  • You can’t seed a real community with Ebola and watch what happens
  • If there’s (real) field data, say on a historical epidemic, further experimentation may be almost entirely limited to the mathematical and computer modeling side
  • Classical statistical methods offer little guidance

1.2.7 Simulation Requirements

  • Simulation should
    • enable rich diagnostics to help criticize the models,
    • help us understand sensitivity to inputs and other configurations, and
    • provide the ability to optimize and refine, both automatically and with expert intervention
  • And it has to do all that while remaining computationally tractable
  • One perspective is so-called response surface methods (RSMs), a poster child from industrial statistics’ heyday, well before information technology became a dominant industry
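
As a rough sketch of the classical RSM idea, the snippet below fits a second-order polynomial to responses from a small factorial design and locates the stationary point of the fitted surface. The function `toy_response`, the 3 x 3 design, and the noise-free responses are assumptions made so that the result is easy to check; a real application would involve noisy measurements and sequential experimentation.

```python
import numpy as np

def toy_response(x1, x2):
    """Hypothetical process response; a real study would run an experiment here."""
    return 5.0 - (x1 - 0.3) ** 2 - 2.0 * (x2 + 0.2) ** 2

# 3 x 3 factorial design in coded units (-1, 0, +1).
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = (g.ravel() for g in np.meshgrid(levels, levels))
y = toy_response(x1, x2)

# Second-order model: y ~ b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b11, b22, b12 = beta

# Write the fit as y = b0 + b'x + x'Bx; the gradient b + 2Bx vanishes at the stationary point.
b = np.array([b1, b2])
B = np.array([[b11, b12 / 2.0],
              [b12 / 2.0, b22]])
x_stationary = np.linalg.solve(2.0 * B, -b)

# All eigenvalues of B negative means the stationary point is a maximum.
eigenvalues = np.linalg.eigvalsh(B)
print("stationary point:", x_stationary)
print("eigenvalues of B:", eigenvalues)
```

Since the toy response is itself quadratic and noise-free, the fitted surface recovers it exactly; with real data the stationary point is only an estimate and is usually refined with follow-up runs.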

Goals
  • How to choose models and optimizers for solving real-world problems
  • How to use simulation to understand and improve processes

1.3 Jupyter Notebook

Note