Simulated Annealing — Probabilistic Optimisation
Simulated annealing (SA) was introduced by Kirkpatrick, Gelatt, and Vecchi in 1983 as a method for solving combinatorial optimisation problems. The key insight comes from statistical mechanics: a system in thermal equilibrium at temperature T has its state energies distributed according to the Boltzmann distribution — higher-energy states are possible but less probable, with probability proportional to exp(-E/kT).
Applied to optimisation: at high temperature T, accept moves that make the objective worse with probability exp(-Δcost/T). As T decreases, only improvements are accepted. This controlled randomness escapes local optima that would trap pure greedy search.
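The acceptance rule above (often called the Metropolis criterion) can be sketched in a few lines. This is a minimal illustration, not the full algorithm — it just shows how the same cost increase goes from "usually accepted" to "almost never accepted" as T drops:

```python
import math, random

def accept(delta, T):
    """Metropolis acceptance: always take improvements (delta < 0),
    take worsening moves with probability exp(-delta/T)."""
    return delta < 0 or random.random() < math.exp(-delta / T)

# Probability of accepting a move that worsens cost by 5:
print(math.exp(-5 / 100.0))  # at T=100: ~0.95, explore freely
print(math.exp(-5 / 0.5))    # at T=0.5: ~5e-5, effectively greedy
```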
Simulated Annealing Algorithm
```python
import math, random

def simulated_annealing(initial_solution, neighbour_fn, cost_fn,
                        T_start=1000.0, T_end=0.1,
                        cooling_rate=0.995, iterations_per_temp=100):
    """
    General SA framework.
    neighbour_fn(solution) -> new solution (small random change)
    cost_fn(solution)      -> float (lower = better)
    """
    current = initial_solution
    current_cost = cost_fn(current)
    best = current
    best_cost = current_cost
    T = T_start
    while T > T_end:
        for _ in range(iterations_per_temp):
            candidate = neighbour_fn(current)
            candidate_cost = cost_fn(candidate)
            delta = candidate_cost - current_cost
            # Accept if better, or with probability exp(-delta/T) if worse
            if delta < 0 or random.random() < math.exp(-delta / T):
                current = candidate
                current_cost = candidate_cost
                if current_cost < best_cost:
                    best = current
                    best_cost = current_cost
        T *= cooling_rate
    return best, best_cost

# Minimise a simple function with many local minima (1-D Rastrigin)
def rastrigin(x):
    return x**2 + 10 - 10*math.cos(2*math.pi*x)

def neighbour(x):
    return x + random.uniform(-0.5, 0.5)

best_x, best_val = simulated_annealing(5.0, neighbour, rastrigin)
print(f'Best x: {best_x:.4f}, f(x): {best_val:.4f}')  # Near x=0, f=0
```
Cooling Schedules
The cooling schedule determines convergence quality vs speed:
Geometric cooling: T *= α (where α ≈ 0.9-0.99). Simple, widely used. Too fast → gets trapped. Too slow → computationally expensive.
Linear cooling: T -= ΔT. Rarely used — spends too much time at low temperatures.
Logarithmic cooling: T = T0/log(1+t). Theoretically guarantees convergence to global optimum but impractically slow.
Adaptive: Adjust cooling rate based on acceptance rate — target ~50% acceptance at each temperature.
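The practical difference between the schedules is easy to see numerically. A quick sketch comparing geometric and logarithmic cooling after the same number of steps (the constants T0, α, and the step count are illustrative choices, not prescribed values):

```python
import math

T0, alpha, steps = 1000.0, 0.995, 2000

# Geometric: exponential decay, reaches near-greedy temperatures quickly
T_geometric = T0 * alpha**steps

# Logarithmic: T = T0 / log(1 + t), still very hot after the same steps
T_log = T0 / math.log(1 + steps)

print(f"geometric after {steps} steps: T = {T_geometric:.4f}")
print(f"logarithmic after {steps} steps: T = {T_log:.1f}")
```

Geometric cooling is far below T = 1 after 2000 steps, while the logarithmic schedule is still above T = 100 — which is exactly why the convergence guarantee it carries is impractical to exploit.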
🎯 Key Takeaways
- SA accepts worse solutions with probability exp(-Δcost/T). High T: explore freely. Low T: only accept improvements.
- Cooling schedule is the critical parameter — too fast: trapped in local optima. Too slow: computationally expensive.
- Advantage over hill climbing: escapes local optima. Advantage over GA: simpler implementation, single solution.
- Guarantee: with logarithmic cooling, converges to global optimum — but impractically slow in practice.
- Applications: VLSI placement, TSP, protein folding, circuit routing, timetabling.
Interview Questions on This Topic
- Q: Why does simulated annealing escape local optima while hill climbing cannot?
- Q: What is the Boltzmann acceptance criterion and what does the temperature parameter control?
- Q: How does the cooling rate affect solution quality vs computation time?
Frequently Asked Questions
How do you choose the initial temperature?
Rule of thumb: T_start should be high enough that ~80% of random moves are accepted initially. To find it: sample random moves, compute average cost increase Δcost_avg, set T_start = -Δcost_avg / ln(0.8). This ensures the algorithm starts in 'exploration mode' rather than immediately hill-climbing.
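That rule of thumb translates directly into code. A sketch of the estimator, assuming the `rastrigin`/`neighbour` setup from the example above (the function name `estimate_T_start` and the sample count are illustrative):

```python
import math, random

def estimate_T_start(cost_fn, neighbour_fn, solution,
                     samples=100, p_accept=0.8):
    """Sample random moves, average the cost *increases*, and solve
    exp(-delta_avg / T) = p_accept for T, i.e. T = -delta_avg / ln(p_accept)."""
    increases = []
    s = solution
    for _ in range(samples):
        cand = neighbour_fn(s)
        delta = cost_fn(cand) - cost_fn(s)
        if delta > 0:
            increases.append(delta)
        s = cand  # random walk so samples are not all from one point
    if not increases:        # landscape never worsened in the sample
        return 1.0           # arbitrary fallback
    delta_avg = sum(increases) / len(increases)
    return -delta_avg / math.log(p_accept)

def rastrigin(x):
    return x**2 + 10 - 10*math.cos(2*math.pi*x)

def neighbour(x):
    return x + random.uniform(-0.5, 0.5)

print(estimate_T_start(rastrigin, neighbour, 5.0))
```

Since ln(0.8) ≈ -0.223, this amounts to T_start ≈ 4.5 × the average cost increase of a random move.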
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.