11  Reproducibility in SpotOptim

11.1 Introduction

SpotOptim provides full support for reproducible optimization runs through the seed parameter. This is essential for:

  • Scientific research: Ensuring experiments can be replicated
  • Debugging: Reproducing specific optimization behaviors
  • Benchmarking: Fair comparison between different configurations
  • Production: Consistent results in deployed applications

When you specify a seed, SpotOptim guarantees that running the same optimization multiple times will produce identical results. Without a seed, each run explores the search space differently, which can be useful for robustness testing.

11.2 Basic Usage

11.2.1 Making Optimization Reproducible

To ensure reproducible results, simply specify the seed parameter when creating the optimizer:

import numpy as np
from spotoptim import SpotOptim

def sphere(X):
    """Simple sphere function: f(x) = sum(x^2)"""
    return np.sum(X**2, axis=1)

# Reproducible optimization
optimizer = SpotOptim(
    fun=sphere,
    bounds=[(-5, 5), (-5, 5)],
    max_iter=30,
    n_initial=15,
    seed=42,  # This ensures reproducibility
    verbose=True
)

result = optimizer.optimize()
print(f"Best solution: {result.x}")
print(f"Best value: {result.fun}")
TensorBoard logging disabled
Initial best: f(x) = 5.542803
Iteration 1: New best f(x) = 0.001070
Iteration 2: New best f(x) = 0.000089
Iteration 3: New best f(x) = 0.000066
Iteration 4: New best f(x) = 0.000036
Iteration 5: New best f(x) = 0.000001
Iteration 6: New best f(x) = 0.000000
Iteration 7: f(x) = 0.000000
Iteration 8: f(x) = 0.000000
Iteration 9: f(x) = 0.000000
Iteration 10: f(x) = 0.000000
Iteration 11: f(x) = 0.000000
Iteration 12: f(x) = 0.000000
Iteration 13: f(x) = 0.000000
Iteration 14: f(x) = 0.000000
Iteration 15: New best f(x) = 0.000000
Best solution: [3.31436760e-04 4.18312302e-05]
Best value: 1.1160017787260647e-07

Key Point: Running this code multiple times will always produce the same result; within the same software environment this holds across days and machines (see Section 11.7 for caveats about differing environments).

11.2.2 Running Independent Experiments

If you don’t specify a seed, each optimization run will explore the search space differently:

# Non-reproducible: different results each time
optimizer = SpotOptim(
    fun=sphere,
    bounds=[(-5, 5), (-5, 5)],
    max_iter=30,
    n_initial=15
    # No seed specified
)

result = optimizer.optimize()
# Results will vary between runs

This is useful when you want to:

  • Explore different regions of the search space
  • Test the robustness of your results
  • Run multiple independent optimization attempts

11.3 Practical Examples

11.3.1 Example 1: Comparing Different Configurations

When comparing different optimizer settings, use the same seed for fair comparison:

import numpy as np
from spotoptim import SpotOptim

def rosenbrock(X):
    """Rosenbrock function"""
    x = X[:, 0]
    y = X[:, 1]
    return (1 - x)**2 + 100 * (y - x**2)**2

# Configuration 1: More initial points
opt1 = SpotOptim(
    fun=rosenbrock,
    bounds=[(-2, 2), (-2, 2)],
    max_iter=50,
    n_initial=20,
    seed=42  # Same seed for fair comparison
)
result1 = opt1.optimize()

# Configuration 2: Fewer initial points, more iterations
opt2 = SpotOptim(
    fun=rosenbrock,
    bounds=[(-2, 2), (-2, 2)],
    max_iter=50,
    n_initial=10,
    seed=42  # Same seed
)
result2 = opt2.optimize()

print(f"Config 1 (more initial): {result1.fun:.6f}")
print(f"Config 2 (fewer initial): {result2.fun:.6f}")
Config 1 (more initial): 0.002494
Config 2 (fewer initial): 0.011538

11.3.2 Example 2: Reproducible Research Experiment

For scientific papers or reports, always use a fixed seed and document it:

import numpy as np
from spotoptim import SpotOptim

def rastrigin(X):
    """Rastrigin function (multimodal)"""
    A = 10
    n = X.shape[1]
    return A * n + np.sum(X**2 - A * np.cos(2 * np.pi * X), axis=1)

# Documented seed for reproducibility
RANDOM_SEED = 12345

optimizer = SpotOptim(
    fun=rastrigin,
    bounds=[(-5.12, 5.12), (-5.12, 5.12), (-5.12, 5.12)],
    max_iter=100,
    n_initial=30,
    seed=RANDOM_SEED,
    verbose=True
)

result = optimizer.optimize()

print(f"\nExperiment Results (seed={RANDOM_SEED}):")
print(f"Best solution: {result.x}")
print(f"Best value: {result.fun}")
print(f"Iterations: {result.nit}")
print(f"Function evaluations: {result.nfev}")

# These results can now be cited in a paper
TensorBoard logging disabled
Initial best: f(x) = 20.392774
Iteration 1: f(x) = 31.367736
Iteration 2: f(x) = 30.822371
Iteration 3: f(x) = 28.761963
Iteration 4: f(x) = 25.355390
Iteration 5: f(x) = 22.087398
Iteration 6: f(x) = 23.609642
Iteration 7: f(x) = 27.024927
Iteration 8: f(x) = 23.154774
Iteration 9: New best f(x) = 8.927550
Iteration 10: New best f(x) = 8.899624
Iteration 11: f(x) = 19.831800
Iteration 12: f(x) = 19.323148
Iteration 13: New best f(x) = 8.692652
Iteration 14: New best f(x) = 8.244681
Iteration 15: f(x) = 8.404445
Iteration 16: f(x) = 17.268172
Iteration 17: f(x) = 17.250262
Iteration 18: f(x) = 16.989215
Iteration 19: f(x) = 22.061741
Iteration 20: New best f(x) = 8.227940
Iteration 21: New best f(x) = 2.732909
Iteration 22: New best f(x) = 2.236454
Iteration 23: New best f(x) = 2.001414
Iteration 24: f(x) = 2.001821
Iteration 25: f(x) = 2.002706
Iteration 26: f(x) = 24.677487
Iteration 27: f(x) = 25.445950
Iteration 28: f(x) = 22.323886
Iteration 29: New best f(x) = 1.990146
Iteration 30: New best f(x) = 1.989946
Iteration 31: f(x) = 30.157363
Iteration 32: f(x) = 13.980500
Iteration 33: f(x) = 26.257931
Iteration 34: f(x) = 26.086756
Iteration 35: f(x) = 16.955502
Iteration 36: f(x) = 86.774141
Iteration 37: f(x) = 42.005362
Iteration 38: New best f(x) = 1.989929
Iteration 39: f(x) = 1.989930
Iteration 40: New best f(x) = 1.989929
Iteration 41: f(x) = 69.717061
Iteration 42: New best f(x) = 1.989929
Iteration 43: f(x) = 66.932192
Iteration 44: f(x) = 50.905256
Iteration 45: f(x) = 1.989930
Iteration 46: f(x) = 46.800517
Iteration 47: f(x) = 67.429406
Iteration 48: f(x) = 60.775651
Iteration 49: f(x) = 1.989930
Iteration 50: f(x) = 1.989929
Iteration 51: f(x) = 1.989930
Iteration 52: f(x) = 1.989929
Iteration 53: f(x) = 1.989930
Iteration 54: f(x) = 1.989929
Iteration 55: f(x) = 1.989931
Iteration 56: f(x) = 1.989930
Iteration 57: f(x) = 1.989929
Iteration 58: f(x) = 16.986975
Iteration 59: f(x) = 14.520119
Iteration 60: f(x) = 86.774141
Iteration 61: f(x) = 1.989931
Iteration 62: f(x) = 76.830975
Iteration 63: f(x) = 59.126187
Iteration 64: f(x) = 1.989929
Iteration 65: f(x) = 13.047213
Iteration 66: New best f(x) = 1.989929
Iteration 67: New best f(x) = 1.989928
Iteration 68: f(x) = 1.989931
Iteration 69: f(x) = 1.989930
Iteration 70: f(x) = 1.989928

Experiment Results (seed=12345):
Best solution: [ 9.94876607e-01 -9.94866368e-01  1.89299459e-04]
Best value: 1.9899282459113223
Iterations: 70
Function evaluations: 100

11.3.3 Example 3: Multiple Independent Runs

To test robustness, run the same optimization with different seeds:

import numpy as np
from spotoptim import SpotOptim

def ackley(X):
    """Ackley function"""
    a = 20
    b = 0.2
    c = 2 * np.pi
    n = X.shape[1]
    
    sum_sq = np.sum(X**2, axis=1)
    sum_cos = np.sum(np.cos(c * X), axis=1)
    
    return -a * np.exp(-b * np.sqrt(sum_sq / n)) - np.exp(sum_cos / n) + a + np.e

# Run 5 independent optimizations
results = []
seeds = [42, 123, 456, 789, 1011]

for seed in seeds:
    optimizer = SpotOptim(
        fun=ackley,
        bounds=[(-5, 5), (-5, 5)],
        max_iter=40,
        n_initial=20,
        seed=seed,
        verbose=False
    )
    result = optimizer.optimize()
    results.append(result.fun)
    print(f"Run with seed {seed:4d}: f(x) = {result.fun:.6f}")

# Analyze robustness
print(f"\nBest result: {min(results):.6f}")
print(f"Worst result: {max(results):.6f}")
print(f"Mean: {np.mean(results):.6f}")
print(f"Std dev: {np.std(results):.6f}")
Run with seed   42: f(x) = 0.000907
Run with seed  123: f(x) = 0.001394
Run with seed  456: f(x) = 0.001941
Run with seed  789: f(x) = 0.000616
Run with seed 1011: f(x) = 0.003029

Best result: 0.000616
Worst result: 0.003029
Mean: 0.001578
Std dev: 0.000854

11.3.4 Example 4: Reproducible Initial Design

The seed ensures that even the initial design points are reproducible:

import numpy as np
from spotoptim import SpotOptim

def simple_quadratic(X):
    return np.sum((X - 1)**2, axis=1)

# Create two optimizers with same seed
opt1 = SpotOptim(
    fun=simple_quadratic,
    bounds=[(-5, 5), (-5, 5)],
    max_iter=25,
    n_initial=10,
    seed=999
)

opt2 = SpotOptim(
    fun=simple_quadratic,
    bounds=[(-5, 5), (-5, 5)],
    max_iter=25,
    n_initial=10,
    seed=999  # Same seed
)

# Run both optimizations
result1 = opt1.optimize()
result2 = opt2.optimize()

# Verify identical results
print("Initial design points are identical:", 
      np.allclose(opt1.X_[:10], opt2.X_[:10]))
print("All evaluated points are identical:", 
      np.allclose(opt1.X_, opt2.X_))
print("All function values are identical:", 
      np.allclose(opt1.y_, opt2.y_))
print("Best solutions are identical:", 
      np.allclose(result1.x, result2.x))
Initial design points are identical: True
All evaluated points are identical: True
All function values are identical: True
Best solutions are identical: True

11.3.5 Example 5: Custom Initial Design with Seed

Even when providing a custom initial design, the seed ensures reproducible subsequent iterations:

import numpy as np
from spotoptim import SpotOptim

def beale(X):
    """Beale function"""
    x = X[:, 0]
    y = X[:, 1]
    term1 = (1.5 - x + x * y)**2
    term2 = (2.25 - x + x * y**2)**2
    term3 = (2.625 - x + x * y**3)**2
    return term1 + term2 + term3

# Custom initial design (e.g., from previous knowledge)
X_start = np.array([
    [0.0, 0.0],
    [1.0, 1.0],
    [2.0, 2.0],
    [-1.0, -1.0]
])

# Run twice with same seed and initial design
opt1 = SpotOptim(
    fun=beale,
    bounds=[(-4.5, 4.5), (-4.5, 4.5)],
    max_iter=30,
    n_initial=10,
    seed=777
)
result1 = opt1.optimize(X0=X_start)

opt2 = SpotOptim(
    fun=beale,
    bounds=[(-4.5, 4.5), (-4.5, 4.5)],
    max_iter=30,
    n_initial=10,
    seed=777  # Same seed
)
result2 = opt2.optimize(X0=X_start)

print("Results are identical:", np.allclose(result1.x, result2.x))
print(f"Best value: {result1.fun:.6f}")
Results are identical: True
Best value: 3.201102

11.4 Advanced Topics

11.4.1 Seed and Noisy Functions

When optimizing noisy functions with repeated evaluations, SpotOptim's seed makes the optimizer's behavior reproducible. The seed does not reach randomness inside your objective function (see Section 11.7), so seed the noise generator separately:

import numpy as np
from spotoptim import SpotOptim

# Seed the noise generator separately: SpotOptim's seed does not
# control randomness inside the objective function
rng = np.random.default_rng(42)

def noisy_sphere(X):
    """Sphere function with Gaussian noise"""
    base = np.sum(X**2, axis=1)
    noise = rng.normal(0, 0.1, size=base.shape)
    return base + noise

optimizer = SpotOptim(
    fun=noisy_sphere,
    bounds=[(-5, 5), (-5, 5)],
    max_iter=40,
    n_initial=20,
    repeats_initial=3,  # 3 evaluations per point
    repeats_surrogate=2,
    seed=42  # Makes the optimizer's behavior reproducible
)

result = optimizer.optimize()
print(f"Best mean value: {optimizer.min_mean_y:.6f}")
print(f"Variance at best: {optimizer.min_var_y:.6f}")
Best mean value: 0.056456
Variance at best: 0.003927

Important: SpotOptim's seed makes the optimization loop deterministic; the noise itself repeats across script runs only because its generator is seeded explicitly.
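
To reproduce a noisy run exactly within a single session, re-create the seeded noise generator before each run: the generator is stateful, so a second optimize() drawing from the same generator would see a different noise stream. Below is a minimal sketch; run_once is a hypothetical helper, not part of SpotOptim:

import numpy as np
from spotoptim import SpotOptim

def run_once(noise_seed, optimizer_seed):
    # A fresh, seeded generator per run restarts the noise stream
    rng = np.random.default_rng(noise_seed)

    def noisy_sphere(X):
        return np.sum(X**2, axis=1) + rng.normal(0, 0.1, size=X.shape[0])

    opt = SpotOptim(
        fun=noisy_sphere,
        bounds=[(-5, 5), (-5, 5)],
        max_iter=40,
        n_initial=20,
        seed=optimizer_seed
    )
    return opt.optimize()

r1 = run_once(noise_seed=7, optimizer_seed=42)
r2 = run_once(noise_seed=7, optimizer_seed=42)
print("Identical noisy runs:", np.allclose(r1.x, r2.x))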

11.4.2 Different Seeds for Different Exploration

Use different seeds to explore different regions systematically:

import numpy as np
from spotoptim import SpotOptim

def griewank(X):
    """Griewank function"""
    sum_sq = np.sum(X**2 / 4000, axis=1)
    prod_cos = np.prod(np.cos(X / np.sqrt(np.arange(1, X.shape[1] + 1))), axis=1)
    return sum_sq - prod_cos + 1

# Systematic exploration with different seeds
best_overall = float('inf')
best_seed = None

for seed in range(10, 20):  # Seeds 10-19
    optimizer = SpotOptim(
        fun=griewank,
        bounds=[(-600, 600), (-600, 600)],
        max_iter=50,
        n_initial=25,
        seed=seed
    )
    result = optimizer.optimize()
    
    if result.fun < best_overall:
        best_overall = result.fun
        best_seed = seed
    
    print(f"Seed {seed}: f(x) = {result.fun:.6f}")

print(f"\nBest result with seed {best_seed}: {best_overall:.6f}")
Seed 10: f(x) = 4.597121
Seed 11: f(x) = 1.946733
Seed 12: f(x) = 6.030579
Seed 13: f(x) = 0.851170
Seed 14: f(x) = 0.154832
Seed 15: f(x) = 0.044427
Seed 16: f(x) = 8.343419
Seed 17: f(x) = 0.547492
Seed 18: f(x) = 2.121439
Seed 19: f(x) = 0.257388

Best result with seed 15: 0.044427

11.5 Best Practices

11.5.1 1. Always Use Seeds for Production Code

# Good: Reproducible
optimizer = SpotOptim(fun=objective, bounds=bounds, seed=42)

# Risky: Non-reproducible
optimizer = SpotOptim(fun=objective, bounds=bounds)

11.5.2 2. Document Your Seeds

# Configuration for experiment reported in Section 4.2
EXPERIMENT_SEED = 2024
MAX_ITERATIONS = 100

optimizer = SpotOptim(
    fun=my_objective,
    bounds=my_bounds,
    max_iter=MAX_ITERATIONS,
    seed=EXPERIMENT_SEED
)

11.5.3 3. Use Different Seeds for Different Experiments

# Different experiments should use different seeds
BASELINE_SEED = 100
EXPERIMENT_A_SEED = 200
EXPERIMENT_B_SEED = 300

11.5.4 4. Test Robustness Across Multiple Seeds

# Run the same optimization with multiple seeds
results = []
for seed in [42, 123, 456, 789, 1011]:
    optimizer = SpotOptim(fun=objective, bounds=bounds, seed=seed)
    result = optimizer.optimize()
    results.append(result.fun)

# Summarize the spread across seeds
print(f"mean={np.mean(results):.6f}, std={np.std(results):.6f}")

11.6 What the Seed Controls

The seed parameter ensures reproducibility by controlling:

  1. Initial Design Generation: Latin Hypercube Sampling produces the same initial points
  2. Surrogate Model: Gaussian Process random initialization is identical
  3. Acquisition Optimization: Differential evolution explores the same candidates
  4. Random Sampling: Any random exploration uses the same random numbers

This guarantees that the entire optimization pipeline is deterministic and reproducible.
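
For intuition, the sketch below threads a single seed through analogous components in a plain SciPy/NumPy stack. It is illustrative only and does not show SpotOptim's actual internals:

import numpy as np
from scipy.stats import qmc
from scipy.optimize import differential_evolution

SEED = 42

# 1. Initial design: seeded Latin Hypercube Sampling
sampler = qmc.LatinHypercube(d=2, seed=SEED)
X0 = qmc.scale(sampler.random(n=10), [-5, -5], [5, 5])

# 2. Surrogate model: e.g., scikit-learn's GaussianProcessRegressor
#    accepts a random_state argument (not shown here)

# 3. Acquisition optimization: seeded differential evolution,
#    applied here to a toy stand-in for the acquisition function
acquisition = lambda x: float(np.sum(np.asarray(x)**2))
res = differential_evolution(acquisition, [(-5, 5), (-5, 5)], seed=SEED)

# 4. Any further random exploration drawn from one seeded generator
rng = np.random.default_rng(SEED)
x_explore = rng.uniform(-5, 5, size=2)

print(X0[0], res.x, x_explore)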

11.7 Common Questions

Q: Can I use seed=0?
A: Yes, any integer (including 0) is a valid seed.

Q: Will different Python versions give the same results?
A: Generally yes, but minor numerical differences may occur due to underlying library changes. Use the same environment for exact reproducibility.

Q: Does the seed affect the objective function?
A: No, the seed only affects SpotOptim’s internal random processes. If your objective function has its own randomness, you’ll need to control that separately, as in the sketch below and in Section 11.4.1.
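
For example, suppose the objective jitters its output with Python's built-in random module (a hypothetical setup): SpotOptim's seed never touches it, so seed it yourself before each run you want to reproduce:

import random
import numpy as np

def jittery_objective(X):
    # random.random() is outside SpotOptim's control
    jitter = np.array([0.01 * random.random() for _ in range(X.shape[0])])
    return np.sum(X**2, axis=1) + jitter

# Control the objective's own randomness separately
random.seed(2024)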

Q: How do I choose a good seed value?
A: Any integer works. Common choices are 42, 123, or dates (e.g., 20241112). What matters is consistency, not the specific value.

11.8 Summary

  • Use seed parameter for reproducible optimization
  • Same seed → identical results (every time)
  • No seed → different results (random exploration)
  • Essential for research, debugging, and production
  • Document your seeds for transparency
  • Test robustness with multiple different seeds

11.9 Jupyter Notebook
