Genetic Algorithms: Evolution-Inspired Optimization
Genetic algorithms (GAs) are a family of metaheuristic optimization techniques inspired by natural selection. They maintain a population of candidate solutions that evolve over generations through selection, crossover, and mutation — converging toward optimal or near-optimal solutions for problems where gradient-based methods fail.
Core Mechanism
A genetic algorithm operates in a loop:
Genetic Algorithm Loop
======================
Initialize random population P(0)
           │
           ▼
┌──────────────────────────┐
│  Evaluate fitness f(x)   │◄──────────────┐
│  for each individual     │               │
└──────────┬───────────────┘               │
           ▼                               │
┌──────────────────────────┐               │
│        Selection         │               │
│ (tournament / roulette)  │               │
└──────────┬───────────────┘               │
           ▼                               │
┌──────────────────────────┐               │
│        Crossover         │               │
│ (single-point / uniform) │               │
└──────────┬───────────────┘               │
           ▼                               │
┌──────────────────────────┐               │
│         Mutation         │               │
│  (bit-flip / Gaussian)   │               │
└──────────┬───────────────┘               │
           ▼                               │
   New generation P(t+1) ──────────────────┘
   until convergence
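The loop above can be sketched in a few lines of Python on the classic OneMax problem (maximize the count of 1s in a bit string). All names and parameter values here are illustrative choices, not a reference implementation:

```python
import random

random.seed(0)

GENES, POP, GENERATIONS = 20, 30, 60

def fitness(ind):
    return sum(ind)  # OneMax: count the 1s

# Initialize random population P(0)
pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]

for _ in range(GENERATIONS):
    nxt = []
    while len(nxt) < POP:
        # Selection: tournament of size 3
        p1 = max(random.sample(pop, 3), key=fitness)
        p2 = max(random.sample(pop, 3), key=fitness)
        # Crossover: single-point, keep head of p1 and tail of p2
        cut = random.randrange(1, GENES)
        child = p1[:cut] + p2[cut:]
        # Mutation: flip each bit with small probability
        child = [g ^ 1 if random.random() < 0.02 else g for g in child]
        nxt.append(child)
    pop = nxt  # New generation P(t+1)

best = max(pop, key=fitness)
```

After 60 generations on this easy landscape, the best individual is at or near the all-ones string.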
Key Operators
Selection determines which individuals reproduce. Common strategies:
| Method | Mechanism | Pressure | Use Case |
|---|---|---|---|
| Tournament | Pick k individuals, select best | Adjustable via k | Most common, simple |
| Roulette Wheel | Probability proportional to fitness | High variance | Classic, less used now |
| Rank-based | Probability based on rank, not fitness | Moderate | Avoids premature convergence |
| Elitism | Top n pass directly to next gen | Preserves best | Almost always used |
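For concreteness, tournament selection is only a few lines. This toy sketch assumes list-encoded individuals whose fitness is simply the sum of their genes; the function name is our own:

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    # Draw k individuals uniformly at random; the fittest one wins.
    # Larger k means stronger selection pressure.
    return max(rng.sample(population, k), key=fitness)

random.seed(1)
pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(20)]
winner = tournament_select(pop, fitness=sum, k=3)
```

With k=1 this degenerates to uniform random selection; raising k tightens the pressure knob mentioned in the table.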
Crossover combines genetic material from two parents:
- Single-point: split at one index, swap tails
- Two-point: swap a segment between two cut points
- Uniform: each gene independently chosen from either parent (coin flip)
- Arithmetic: weighted average of parent genes (for real-valued encodings)
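A minimal sketch of the single-point and uniform variants on list-encoded parents (illustrative names, not a library API):

```python
import random

def single_point(p1, p2, rng=random):
    # Split both parents at one index and swap the tails.
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform(p1, p2, rng=random):
    # Each gene comes from either parent on an independent coin flip.
    pairs = [(a, b) if rng.random() < 0.5 else (b, a) for a, b in zip(p1, p2)]
    return [a for a, _ in pairs], [b for _, b in pairs]

random.seed(2)
c1, c2 = single_point([0] * 6, [1] * 6)
u1, u2 = uniform([0] * 6, [1] * 6)
```

Both operators conserve genes position-by-position: at every index, one child carries p1's gene and the other carries p2's.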
Mutation introduces random variation to maintain diversity:
- Bit-flip: invert a random bit (binary encoding)
- Gaussian: add noise to a gene (real-valued)
- Swap: exchange two positions (permutation problems like TSP)
- Scramble: randomly reorder a subsequence
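Three of these operators sketched in Python (function names are our own; each returns a new individual rather than mutating in place):

```python
import random

def bit_flip(ind, rate=0.05, rng=random):
    # Invert each bit independently with probability `rate` (binary encoding).
    return [g ^ 1 if rng.random() < rate else g for g in ind]

def gaussian_mutate(ind, sigma=0.1, rate=0.1, rng=random):
    # Add N(0, sigma) noise to each gene with probability `rate` (real-valued).
    return [g + rng.gauss(0, sigma) if rng.random() < rate else g for g in ind]

def swap_mutate(ind, rng=random):
    # Exchange two positions -- keeps a permutation valid (e.g. a TSP tour).
    i, j = rng.sample(range(len(ind)), 2)
    out = ind[:]
    out[i], out[j] = out[j], out[i]
    return out

random.seed(3)
tour = swap_mutate([3, 1, 4, 2, 5])
```

Note that swap mutation never creates duplicates, which is exactly why it is the safe choice for permutation encodings.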
Encoding Strategies
The choice of encoding determines which operators are natural:
| Encoding | Representation | Suitable For |
|---|---|---|
| Binary | [1,0,1,1,0,1] | Discrete optimization, knapsack |
| Real-valued | [0.42, 3.14, -1.7] | Continuous optimization, neural architecture |
| Permutation | [3,1,4,2,5] | TSP, scheduling, sequencing |
| Tree | Parse tree / AST | Genetic programming, symbolic regression |
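As a toy example of how an encoding ties to a fitness function, a binary chromosome maps directly onto the 0/1 knapsack problem from the table; the item values, weights, and capacity below are made up:

```python
# A binary chromosome for a 4-item knapsack: chrom[i] == 1 means "take item i".
values = [10, 40, 30, 50]
weights = [5, 4, 6, 3]
CAPACITY = 10

def knapsack_fitness(chrom):
    # Infeasible picks (over capacity) score zero; a common simple penalty.
    w = sum(wi for wi, g in zip(weights, chrom) if g)
    v = sum(vi for vi, g in zip(values, chrom) if g)
    return v if w <= CAPACITY else 0

score = knapsack_fitness([0, 1, 0, 1])  # take items 1 and 3: weight 7, value 90
```

Bit-flip mutation and any of the list crossovers above apply to this encoding unchanged; only the fitness function is problem-specific.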
Hyperparameter Tuning
Key hyperparameters and their typical ranges:
- Population size: 50–500 (larger for complex landscapes)
- Crossover rate: 0.6–0.9
- Mutation rate: 0.001–0.05 per gene (low to preserve good solutions)
- Elitism count: 1–5% of population
- Generations: 100–10,000 (with early stopping)
The balance between exploration (mutation, large population) and exploitation (selection pressure, crossover) is critical. Too much exploitation leads to premature convergence; too much exploration wastes compute.
GA vs Other Optimization Methods
| Method | Gradient Required | Global Search | Handles Constraints | Parallelizable | Convergence |
|---|---|---|---|---|---|
| Gradient Descent | Yes | No (local) | With penalties | Limited | Fast (convex) |
| Genetic Algorithm | No | Yes | Naturally | Highly | Moderate |
| Simulated Annealing | No | Yes | With penalties | Limited | Slow |
| Particle Swarm | No | Yes | With penalties | Highly | Moderate |
| Bayesian Optimization | No | Yes | Yes | Moderate | Fast (low-dim) |
| CMA-ES | No | Yes | Box constraints | Yes | Fast (continuous) |
Real-World Applications
Neural Architecture Search (NAS): GAs evolve neural network topologies — number of layers, units per layer, activation functions, skip connections. Google's AmoebaNet used evolutionary methods to discover architectures competitive with NAS-RL at lower compute cost.
Feature selection: encode feature subsets as binary chromosomes, with fitness given by model performance on a validation set. This searches combinations of features far more cheaply than exhaustive enumeration.
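A hedged sketch of that encoding: `validation_score` below is a stand-in for "train a model on the selected columns, score it on held-out data" — here a toy function that rewards two made-up informative features and penalizes subset size. A GA would evolve the masks; with only 6 features we can enumerate all of them to show what the encoding selects:

```python
# Features 0 and 3 are the (made-up) informative ones in this toy setup.
INFORMATIVE = {0, 3}

def validation_score(mask):
    chosen = {i for i, gene in enumerate(mask) if gene}
    hits = len(chosen & INFORMATIVE)
    return hits - 0.1 * len(chosen)  # accuracy proxy minus a size penalty

# Enumerate all 2^6 binary chromosomes (a GA would search, not enumerate).
all_masks = ([int(b) for b in f"{m:06b}"] for m in range(64))
best_mask = max(all_masks, key=validation_score)
```

The best mask selects exactly the two informative features, since any extra feature costs the size penalty without adding hits.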
Hyperparameter optimization: each individual encodes a hyperparameter configuration. More robust than grid search for high-dimensional, mixed-type parameter spaces.
Supply chain & scheduling: permutation-encoded GAs handle job-shop scheduling, vehicle routing (VRP), and warehouse layout optimization where exact solvers don't scale.
Game AI & procedural generation: evolve game-playing strategies, level layouts, or character behaviors. Useful when the fitness landscape is non-differentiable.
DEAP: A Python GA Framework
from deap import base, creator, tools, algorithms
import random

# Maximize a single-objective fitness (weights=(1.0,) means maximize)
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_bool, n=100)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

def eval_onemax(individual):
    return (sum(individual),)  # DEAP expects a tuple; maximize number of 1s

toolbox.register("evaluate", eval_onemax)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.02)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=300)
# eaSimple returns the final population and a logbook of statistics
result, log = algorithms.eaSimple(pop, toolbox,
                                  cxpb=0.7, mutpb=0.2,
                                  ngen=50, verbose=False)
best = tools.selBest(result, k=1)[0]
When to Use GAs
GAs shine when the search space is large, discontinuous, or non-differentiable, and when you need a "good enough" solution rather than a provably optimal one. They are not the right tool when gradient information is available and the problem is smooth — standard optimization will converge faster.
The key advantage is generality: GAs require only a fitness function and an encoding. No gradient, no Hessian, no convexity assumption. This makes them the go-to approach for combinatorial optimization, multi-objective problems (via NSGA-II), and any domain where the objective is a black-box simulation.