
Map of ML Algorithm Families

#machine-learning #artificial-intelligence #reinforcement-learning #deep-learning #python

Machine learning is far more than supervised classification. This guide covers every major algorithm family — from reinforcement learning to evolutionary computing — with intuitions, trade-offs, and practical pointers for when each shines.

Algorithm Families at a Glance

| Family | Learning Signal | Key Use Cases |
| --- | --- | --- |
| Supervised | Labeled examples (X → Y) | Classification, regression, forecasting |
| Unsupervised | No labels — find structure | Clustering, dimensionality reduction, anomaly detection |
| Self-Supervised | Pseudo-labels from data itself | Language models, image pre-training |
| Reinforcement Learning | Rewards from environment | Robotics, games, resource optimization |
| Evolutionary / Genetic | Fitness function + selection | Optimization, architecture search, scheduling |
| Probabilistic / Bayesian | Prior distributions + evidence | Uncertainty quantification, small data |
| Meta-Learning | Learning across tasks | Few-shot learning, AutoML |

Supervised Learning

The most widely deployed family. You provide labeled training data; the model learns the mapping.

Classification algorithms

| Algorithm | How it works | Best for |
| --- | --- | --- |
| Logistic Regression | Linear decision boundary via sigmoid | Binary classification, interpretable baselines |
| Decision Trees | Recursive feature splits | Interpretable rules, feature importance |
| Random Forest | Ensemble of decorrelated trees (bagging) | Tabular data, robust to overfitting |
| Gradient Boosting (XGBoost, LightGBM, CatBoost) | Sequential trees correcting errors | Tabular data competitions, production ML |
| SVM | Maximize margin between classes | Small-medium datasets, high-dimensional spaces |
| k-NN | Majority vote of nearest neighbors | Simple baselines, recommendation |
| Naive Bayes | Bayes' theorem + feature independence | Text classification, spam filtering |
| Neural Networks | Learned hierarchical features | Images, text, audio, multimodal |

Regression algorithms

Most classification algorithms have regression variants: Linear Regression, Ridge/Lasso, Decision Tree Regressor, Random Forest Regressor, Gradient Boosting Regressor, SVR, k-NN Regressor.
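To ground this, here is a minimal supervised baseline (an illustrative sketch, not part of the original guide): logistic regression on scikit-learn's built-in iris dataset, with a held-out test split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Labeled data: X (features) -> y (class labels)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the mapping on the training split, evaluate on held-out data
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(accuracy)
```

The same three-step pattern (split, fit, score) carries over to every algorithm in the tables above.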

Key insight: when to use what

                    Tabular data?
                    /           \
                  Yes            No
                  /                \
         < 100K rows?        Images/Text/Audio?
         /         \              |
       Yes          No           Yes
       /             \            |
  Logistic Reg    XGBoost/     Deep Learning
  Random Forest   LightGBM     (CNN, Transformer)
  (interpretable) (performance)

Gradient boosting (XGBoost, LightGBM, CatBoost) dominates tabular data. Deep learning wins on unstructured data. Don't use neural nets on tabular data unless you've already tried boosting.


Unsupervised Learning

No labels. The goal is to discover structure, patterns, or compressed representations.

Clustering

Group similar data points together.

| Algorithm | How it works | Strengths | Weaknesses |
| --- | --- | --- | --- |
| k-Means | Assign points to nearest centroid, update centroids | Fast, scalable | Must specify k, assumes spherical clusters |
| DBSCAN | Density-based — clusters are dense regions | Finds arbitrary shapes, detects outliers | Struggles with varying densities |
| HDBSCAN | Hierarchical DBSCAN | No need to specify eps, varying densities | Slower on very large datasets |
| Gaussian Mixture Models | Soft clustering via probability distributions | Probabilistic assignments, elliptical clusters | Must specify k, can overfit |
| Agglomerative | Bottom-up hierarchical merging | Dendrogram visualization, any distance metric | O(n³) memory, doesn't scale |
| Spectral Clustering | Graph Laplacian + k-Means on eigenvectors | Complex cluster shapes | Expensive for large n, must specify k |

For example, HDBSCAN in scikit-learn:

from sklearn.cluster import HDBSCAN
import numpy as np

# Toy data: two dense blobs plus a sprinkle of uniform noise
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)),
               rng.normal(8, 1, (200, 2)),
               rng.uniform(-4, 12, (20, 2))])

clusterer = HDBSCAN(min_cluster_size=15, min_samples=5)
labels = clusterer.fit_predict(X)
# labels == -1 means noise/outlier

Dimensionality Reduction

Compress high-dimensional data while preserving structure.

| Algorithm | Type | Best for |
| --- | --- | --- |
| PCA | Linear projection | Feature decorrelation, noise reduction, fast |
| t-SNE | Non-linear embedding | 2D/3D visualization of clusters |
| UMAP | Non-linear embedding | Faster than t-SNE, preserves global structure better |
| Autoencoders | Neural network compression | Learned representations, anomaly detection |
| Truncated SVD | Linear (sparse data) | Text data (LSA), recommender systems |
| ICA | Independent components | Signal separation (e.g., EEG, audio) |
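As a quick sketch of linear dimensionality reduction, PCA compressing scikit-learn's 64-dimensional digits down to two components (the dataset choice is illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)         # 1797 samples x 64 pixel features
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)

print(X_2d.shape)                           # (1797, 2)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```

The explained-variance ratio is the standard diagnostic for deciding how many components to keep.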

Anomaly Detection

Find data points that don't fit the normal pattern.

| Algorithm | Approach |
| --- | --- |
| Isolation Forest | Random partitioning — anomalies are isolated faster |
| Local Outlier Factor | Compare local density to neighbors |
| One-Class SVM | Learn a boundary around normal data |
| Autoencoders | High reconstruction error = anomaly |
| Statistical | Z-score, IQR, Mahalanobis distance |
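An Isolation Forest sketch with synthetic data (the contamination rate is a made-up illustration; tune it to your expected anomaly fraction):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, (300, 2))       # "normal" data
X = np.vstack([normal, [[8.0, 8.0]]])     # one obvious outlier appended

# contamination = expected fraction of anomalies in the data
iso = IsolationForest(contamination=0.01, random_state=0)
labels = iso.fit_predict(X)
print(labels[-1])  # -1 marks the appended point as an anomaly
```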

Self-Supervised Learning

The model creates its own labels from raw data — no manual annotation needed. This is how modern foundation models (LLMs, vision transformers) are pre-trained.

Techniques

| Technique | Domain | How it works |
| --- | --- | --- |
| Masked Language Modeling | NLP | Mask tokens, predict them (BERT) |
| Next Token Prediction | NLP | Predict the next word (GPT) |
| Masked Image Modeling | Vision | Mask patches, reconstruct them (MAE) |
| Contrastive Learning | Vision/Multi | Pull similar pairs close, push different pairs apart (SimCLR, CLIP) |
| DINO / DINOv2 | Vision | Self-distillation with no labels |
| JEPA | Vision | Predict latent representations, not pixels |

Why it matters

Self-supervised pre-training on massive unlabeled data → fine-tune on small labeled dataset. This is the dominant paradigm in modern AI: pre-train once, fine-tune for many tasks.
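The pseudo-label idea can be sketched in a few lines: hide part of the input and treat the hidden part as the training target (a toy masked-token example, not a real pre-training pipeline):

```python
import random

random.seed(0)
tokens = "self supervised learning creates its own labels".split()

# Pick a position, remember its token as the label, mask it in the input
mask_idx = random.randrange(len(tokens))
target = tokens[mask_idx]
masked = tokens.copy()
masked[mask_idx] = "[MASK]"

print(" ".join(masked), "->", target)
```

No human ever labeled anything here: the raw text supplies both input and target, which is why this scales to internet-sized corpora.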


Reinforcement Learning (RL)

An agent learns by interacting with an environment, receiving rewards, and adjusting its policy to maximize cumulative reward.

Core concepts

        ┌───────────────┐
        │  Environment  │
        └───┬───────┬───┘
   state s  │       │  reward r
            ▼       │
        ┌───────────┴───┐
        │     Agent     │
        │   π(a|s)      │──── action a ────►
        └───────────────┘

| Term | Meaning |
| --- | --- |
| State (s) | Current observation of the environment |
| Action (a) | What the agent does |
| Reward (r) | Scalar feedback signal |
| Policy (π) | Strategy: state → action mapping |
| Value function V(s) | Expected cumulative future reward from state s |
| Q-function Q(s,a) | Expected cumulative future reward from taking action a in state s |
| Episode | One complete run from start to terminal state |

Algorithm families

Value-based methods

Learn Q(s,a) and pick the action with highest Q.

| Algorithm | Key idea |
| --- | --- |
| Q-Learning | Off-policy, tabular, updates Q via Bellman equation |
| DQN (Deep Q-Network) | Q-Learning with neural net approximation, experience replay |
| Double DQN | Fixes overestimation bias in DQN |
| Dueling DQN | Separate value and advantage streams |
| Rainbow | Combines 6 DQN improvements into one agent |
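The tabular Q-Learning update can be demonstrated end to end on a toy problem (a hypothetical 5-state chain; states, rewards, and hyperparameters are all illustrative):

```python
import numpy as np

# 5-state chain: start at state 0, reward 1 for reaching state 4 (terminal)
n_states, n_actions = 5, 2       # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

for _ in range(500):             # episodes
    s = 0
    while s != 4:
        # epsilon-greedy action selection
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == 4 else 0.0
        # Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

policy = Q.argmax(axis=1)
print(policy[:4])  # greedy policy in states 0-3: always move right (action 1)
```

DQN replaces the Q table with a neural network, but the update rule in the commented line is the same.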

Policy-based methods

Learn the policy π(a|s) directly.

| Algorithm | Key idea |
| --- | --- |
| REINFORCE | Monte Carlo policy gradient — simple but high variance |
| PPO (Proximal Policy Optimization) | Clipped objective prevents destructive policy updates. The workhorse of modern RL |
| TRPO | Trust region constraint on policy updates |
| A2C / A3C | Actor (policy) + Critic (value) — reduces variance |
| SAC (Soft Actor-Critic) | Maximum entropy RL — encourages exploration |

Model-based methods

Learn a model of the environment and plan within it.

| Algorithm | Key idea |
| --- | --- |
| Dyna-Q | Interleave real experience with simulated experience |
| MuZero | Learned world model — mastered Chess, Go, Atari without knowing the rules |
| Dreamer (v3) | Learn a world model in latent space, imagine trajectories |
| MBPO | Model-based policy optimization with short rollouts |

RL applications

| Domain | Example |
| --- | --- |
| Games | AlphaGo, OpenAI Five (Dota 2), AlphaStar (StarCraft) |
| Robotics | Dexterous manipulation, locomotion, drone control |
| LLM alignment | RLHF — fine-tuning language models with human preferences |
| Resource management | Data center cooling (DeepMind), network routing |
| Finance | Portfolio optimization, order execution |
| Autonomous vehicles | Decision making in complex traffic |

Getting started with RL

import gymnasium as gym

# Classic control problem
env = gym.make("CartPole-v1")
obs, info = env.reset()

for _ in range(1000):
    action = env.action_space.sample()  # random policy
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()

Key libraries: Gymnasium (environments), Stable-Baselines3 (algorithms), CleanRL (single-file implementations), RLlib (scalable distributed RL), TorchRL (PyTorch-native).


Evolutionary & Genetic Algorithms

Inspired by biological evolution: population → fitness evaluation → selection → crossover → mutation → repeat.

Core concepts

Generation 0:  [A] [B] [C] [D] [E]     ← random population
                ↓ evaluate fitness
Fitness:        7   3   9   5   8
                ↓ selection (fittest survive)
Parents:       [A] [C] [E]
                ↓ crossover + mutation
Generation 1:  [AC'] [CE'] [CA'] [EA'] [EC']
                ↓ repeat...
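The loop above can be sketched directly. This toy solves OneMax (maximize the number of 1-bits in a string); the population size, selection cutoff, and mutation rate are made-up illustrative values:

```python
import random

random.seed(0)
GENES = 20

def fitness(ind):                # OneMax: count the 1-bits
    return sum(ind)

# Generation 0: random population
pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(30)]

for _ in range(50):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                       # selection: fittest survive
    children = []
    while len(children) < 20:
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, GENES)
        child = a[:cut] + b[cut:]            # one-point crossover
        if random.random() < 0.1:            # occasional bit-flip mutation
            i = random.randrange(GENES)
            child[i] = 1 - child[i]
        children.append(child)
    pop = parents + children                 # next generation

best = max(pop, key=fitness)
print(fitness(best))  # approaches the optimum of 20
```

No gradients anywhere: the only signal is the fitness function, which is exactly why this family handles non-differentiable objectives.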

Algorithm variants

| Algorithm | How it differs |
| --- | --- |
| Genetic Algorithm (GA) | Binary/discrete encoding, crossover + mutation |
| Genetic Programming (GP) | Evolves programs/trees rather than fixed-length vectors |
| Evolution Strategies (ES) | Continuous parameters, Gaussian perturbations, no crossover |
| CMA-ES | Covariance Matrix Adaptation — adapts the search distribution. Gold standard for continuous optimization |
| NEAT | Neuroevolution — evolves neural network topology and weights |
| Differential Evolution (DE) | Mutation via vector differences between population members |
| Particle Swarm Optimization (PSO) | Swarm intelligence — particles follow personal and global best |
| Ant Colony Optimization | Pheromone-based path optimization (combinatorial problems) |

When evolutionary beats gradient descent

  • Non-differentiable objectives — discrete structures, combinatorial problems
  • Multimodal landscapes — many local optima where gradient methods get stuck
  • Architecture search — evolving neural network structures (NAS)
  • Game AI — evolving strategies for NPCs
  • Scheduling/routing — job shop, vehicle routing, timetabling
  • Hardware design — circuit optimization, antenna design

Practical example: CMA-ES

import cma

# Minimize a function with CMA-ES
def objective(x):
    return sum((xi - 3)**2 for xi in x)  # minimum at [3, 3, 3, ...]

es = cma.CMAEvolutionStrategy([0] * 10, 0.5)  # 10-dim, initial sigma=0.5
es.optimize(objective)
print(es.result.xbest)  # → close to [3, 3, 3, ...]

Key libraries

| Library | Language | Focus |
| --- | --- | --- |
| DEAP | Python | General-purpose evolutionary computation |
| PyGAD | Python | Genetic algorithms, simple API |
| pycma | Python | CMA-ES reference implementation |
| Optuna | Python | Hyperparameter optimization (includes evolutionary samplers) |
| ECJ | Java | Mature evolutionary computation framework |
| Nevergrad | Python | Gradient-free optimization (Meta) |

Probabilistic & Bayesian Methods

Model uncertainty explicitly using probability distributions.

Key approaches

| Method | What it does |
| --- | --- |
| Bayesian Linear Regression | Linear regression with posterior distributions over weights |
| Gaussian Processes (GP) | Non-parametric — defines a distribution over functions. Perfect for small data with uncertainty |
| Bayesian Neural Networks | Neural nets with distributions over weights — quantifies prediction uncertainty |
| Variational Inference | Approximate intractable posteriors with simpler distributions |
| MCMC (Markov Chain Monte Carlo) | Sample from the posterior — Metropolis-Hastings, Hamiltonian MC, NUTS |
| Bayesian Optimization | Use a GP surrogate to optimize expensive black-box functions |
| Hidden Markov Models | Sequential data with hidden states — speech, genomics |
| Probabilistic Programming | Write models as programs, let the framework do inference |
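A Gaussian Process regression sketch showing the uncertainty output (toy sine data; the kernel and noise level are illustrative choices):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, (15, 1))                     # small dataset: 15 points
y = np.sin(X).ravel() + rng.normal(0, 0.05, 15)    # noisy sine

gp = GaussianProcessRegressor(kernel=RBF(), alpha=0.05**2).fit(X, y)
mean, std = gp.predict(np.array([[2.5]]), return_std=True)
print(mean[0], std[0])  # prediction plus an explicit uncertainty estimate
```

The per-point standard deviation is what most supervised methods cannot give you, and it is the ingredient Bayesian optimization exploits.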

Bayesian Optimization for hyperparameters

from skopt import gp_minimize
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

# Assumes X_train, y_train, X_val, y_val are already defined
def train_and_evaluate(params):
    learning_rate, n_estimators = params
    # Train model, return negated validation accuracy (gp_minimize minimizes)
    model = XGBClassifier(learning_rate=learning_rate,
                          n_estimators=int(n_estimators))
    model.fit(X_train, y_train)
    return -accuracy_score(y_val, model.predict(X_val))

result = gp_minimize(
    train_and_evaluate,
    [(1e-4, 1.0, "log-uniform"),  # learning_rate
     (50, 500)],                   # n_estimators
    n_calls=50,
)
print(result.x)  # best [learning_rate, n_estimators] found

Key libraries

| Library | Focus |
| --- | --- |
| PyMC | Probabilistic programming, MCMC/VI |
| Stan / CmdStanPy | High-performance MCMC (NUTS sampler) |
| NumPyro | JAX-based probabilistic programming |
| GPyTorch | Scalable Gaussian Processes on GPU |
| BoTorch | Bayesian optimization (built on GPyTorch) |
| scikit-optimize | Sequential model-based optimization |

Meta-Learning ("Learning to Learn")

Algorithms that improve their learning ability across tasks.

| Approach | How it works | Example |
| --- | --- | --- |
| MAML (Model-Agnostic Meta-Learning) | Learn initialization weights that adapt in few gradient steps | Few-shot image classification |
| Prototypical Networks | Classify by distance to class prototypes in embedding space | Few-shot learning |
| Matching Networks | Attention-based comparison to support set | One-shot learning |
| Neural Architecture Search (NAS) | Search for optimal network architecture | EfficientNet, DARTS |
| Learned Optimizers | Meta-learn the optimizer itself | L2L, VeLO |
| In-Context Learning | Large models learn from examples in the prompt | GPT-4, Claude |
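The prototype idea is simple enough to sketch with plain NumPy (hypothetical 2-D "embeddings"; a real system would get these from a learned encoder):

```python
import numpy as np

rng = np.random.default_rng(0)

# Few-shot setup: 3 classes, 5 support examples each, in a 2-D embedding space
support = {c: rng.normal(c * 3.0, 0.5, (5, 2)) for c in range(3)}
prototypes = {c: pts.mean(axis=0) for c, pts in support.items()}  # class means

# Classify a query by its nearest prototype
query = rng.normal(3.0, 0.5, 2)   # drawn from class 1 (center at 3)
pred = min(prototypes, key=lambda c: np.linalg.norm(query - prototypes[c]))
print(pred)  # → 1
```

With a good embedding space, five examples per class are enough, which is the whole point of few-shot learning.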

Hybrid & Emerging Approaches

Neuro-symbolic AI

Combine neural networks (pattern recognition) with symbolic reasoning (logic, rules).

  • Neural Theorem Proving — use neural nets to guide proof search
  • Differentiable Programming with Logic — DeepProbLog, Scallop
  • LLM + Knowledge Graphs — ground language model outputs in structured knowledge

Graph Neural Networks (GNNs)

Learn on graph-structured data directly.

| Variant | Mechanism |
| --- | --- |
| GCN | Aggregate neighbor features (spectral) |
| GraphSAGE | Sample and aggregate (inductive, scales) |
| GAT | Attention-weighted neighbor aggregation |
| GIN | Maximally powerful under WL isomorphism test |

Applications: molecule property prediction, social network analysis, fraud detection, recommendation systems.
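One GCN-style layer boils down to "normalize, aggregate neighbors, transform". A NumPy sketch with a made-up 4-node graph and toy weights:

```python
import numpy as np

# 4-node graph (a square): adjacency matrix and 2-D node features
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = np.arange(8, dtype=float).reshape(4, 2)

A_hat = A + np.eye(4)                           # add self-loops
D_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5) # symmetric degree normalization
W = np.full((2, 2), 0.1)                        # toy weight matrix

# H = ReLU(D^-1/2 (A + I) D^-1/2 X W): each node mixes its neighbors' features
H = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
print(H.shape)  # (4, 2)
```

Stacking such layers lets information flow further across the graph, one hop per layer.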

Diffusion Models

Learn to denoise data — the backbone of modern generative AI.

  1. Forward process: Gradually add noise to data
  2. Reverse process: Neural network learns to remove noise step by step
  3. Generation: Start from pure noise, denoise iteratively

Used in: Stable Diffusion, DALL-E 3, Sora (video), protein structure generation.
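The forward process has a convenient closed form, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, which a few NumPy lines make tangible (the schedule follows the common linear-β convention; treat the exact numbers as illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = np.ones(4)                          # a "clean" data point

betas = np.linspace(1e-4, 0.02, 1000)    # noise schedule over 1000 steps
alphas_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

# Jump straight to step t: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise
t = 999
noise = rng.normal(size=x0.shape)
x_t = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

print(np.sqrt(alphas_bar[t]))  # near 0: by the final step x_t is almost pure noise
```

Training teaches a network to predict the noise at each step; generation then runs the chain in reverse, starting from pure noise.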


Choosing the Right Family

| Your situation | Start here |
| --- | --- |
| Labeled tabular data | Gradient Boosting (XGBoost/LightGBM) |
| Labeled images/text/audio | Fine-tune a pre-trained model (transfer learning) |
| No labels, find groups | HDBSCAN or Gaussian Mixture Models |
| No labels, reduce dimensions | UMAP for visualization, PCA for preprocessing |
| Sequential decisions with rewards | PPO (policy-based RL) |
| Optimize non-differentiable objective | CMA-ES or Genetic Algorithm |
| Small data, need uncertainty | Gaussian Processes or Bayesian methods |
| Very few labeled examples | Meta-learning (Prototypical Networks) or in-context learning |
| Generate images/text/audio | Diffusion models, Transformers |
| Graph-structured data | GNN (GraphSAGE, GAT) |
