MCMC Statistics Cover

Welcome to the Markov Chain Monte Carlo (MCMC) Statistical Methods series, where we explore the theory and practice of probabilistic inference and MCMC sampling.

Articles

  1. What is Probability?
  2. Random Variables and Sampling
    • Probability Density Function and Expectation
    • Sampling Methods for Simple Distributions
    • Introduction to Common Basic Sampling Algorithms
  3. Monte Carlo Methods
    • Importance Sampling
    • Variance Reduction Techniques
  4. Understanding Markov Chains
    • What is a Markov Process
    • Stationary Distribution and Convergence
    • Constructing Simple State Transition Processes
  5. Introducing MCMC
    • Why do we need MCMC?
    • From Markov Chains to Sampling
    • Theory and Intuition
  6. Metropolis Algorithm Explained: Implementation & Intuition
    • The Core Dilemma: Intractable Normalization Constants
    • Random Walk Metropolis Explained
    • Performance in High-Dimensional Distributions
  7. The Metropolis-Hastings Algorithm: Breaking the Symmetry
    • Why do we need “asymmetric” proposals?
    • Derivation and intuition of the Hastings Correction
    • Practical Case: Solving boundary problems with Log-Normal proposals
  8. Gibbs Sampling Explained: The Wisdom of Divide and Conquer
    • High-dimensional dilemmas and the “Manhattan Walk” intuition
    • Mathematical principle: Brook’s Lemma
    • Python implementation for discrete and continuous distributions
  9. Deterministic Optimization Explained: The Mathematical Essence of Gradient Descent
    • Geometric Intuition of Convex vs. Non-Convex Optimization
    • Newton’s Method and Second-Order Approximation
    • Connection between Coordinate Descent and Gibbs Sampling
    • Pros and Cons of Steepest Descent
  10. Stochastic Optimization Explained: Simulated Annealing & Pincus Theorem
    • From Energy Minimization to Probability Maximization: The Physics of Annealing
    • High-Temp Exploration & Low-Temp Exploitation: Another Perspective on Metropolis
    • Pincus Theorem: Mathematical Proof of Convergence to Global Optimum
  11. Convergence Diagnostics
  12. Python in Practice: MCMC Modeling

Stochastic Optimization Explained: Simulated Annealing & Pincus Theorem

When an optimization problem is trapped in a maze of local optima, deterministic algorithms are often helpless. This article takes you into the world of stochastic optimization, exploring how the problem of finding minimum energy can be recast as finding maximum probability. We delve into the physical intuition and mathematical principles of the Simulated Annealing algorithm, demonstrate its elegant mechanism of ‘high-temperature exploration, low-temperature locking’ through dynamic visualization, and derive the Pincus Theorem in detail, proving mathematically why annealing can reach the global optimum. [Read More]
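Below is a minimal sketch of the idea, assuming a toy 1-D objective and a geometric cooling schedule; the function, constants, and names are illustrative, not the article’s actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    # A toy 1-D objective with many local minima (illustrative choice).
    return x**2 + 10 * np.sin(3 * x)

def simulated_annealing(x0, T0=5.0, alpha=0.95, n_iter=2000, step=0.5):
    # Metropolis moves under a geometric cooling schedule T <- alpha * T.
    x, T = x0, T0
    best_x, best_e = x, energy(x)
    for _ in range(n_iter):
        x_new = x + rng.normal(scale=step)          # random-walk proposal
        delta = energy(x_new) - energy(x)
        # Always accept downhill moves; accept uphill moves with prob exp(-delta/T).
        if delta < 0 or rng.random() < np.exp(-delta / T):
            x = x_new
            if energy(x) < best_e:
                best_x, best_e = x, energy(x)
        T *= alpha                                   # cool down: exploration -> exploitation
    return best_x, best_e

print(simulated_annealing(x0=4.0))
```

At high temperature almost every uphill move is accepted (exploration); as T shrinks, the sampler effectively locks onto the best basin it has found.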

Deterministic Optimization Explained: The Mathematical Essence of Gradient Descent

Deterministic optimization is the cornerstone for understanding modern MCMC algorithms (like HMC and Langevin dynamics). This article delves into three classic deterministic optimization strategies: Newton’s Method (a second-order perspective that exploits curvature), Coordinate Descent (the divide-and-conquer predecessor of Gibbs sampling), and Steepest Descent (greedy first-order exploration). Through mathematical derivation and Python visualization, we compare their behavior and convergence characteristics across different terrains (convex surfaces, narrow valleys, strong coupling). [Read More]
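For a quick taste of the ‘narrow valley’ comparison, here is a minimal sketch contrasting steepest descent with Newton’s method on an ill-conditioned quadratic; the matrix, step size, and iteration count are illustrative assumptions, not taken from the article.

```python
import numpy as np

# An ill-conditioned quadratic f(x) = 0.5 * x^T A x (a "narrow valley").
A = np.array([[10.0, 0.0],
              [0.0, 1.0]])

def grad(x):
    return A @ x          # gradient of the quadratic; the Hessian is A itself

x_sd = np.array([1.0, 1.0])      # steepest descent iterate
x_nt = np.array([1.0, 1.0])      # Newton iterate

for _ in range(20):
    x_sd = x_sd - 0.05 * grad(x_sd)                 # fixed step along the negative gradient
    x_nt = x_nt - np.linalg.solve(A, grad(x_nt))    # Newton step: rescale by the inverse Hessian

# Newton lands on the minimum (the origin) in one step;
# steepest descent crawls along the shallow direction of the valley.
print("steepest descent:", x_sd)
print("Newton's method: ", x_nt)
```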

Gibbs Sampling Explained: The Wisdom of Divide and Conquer

When a high-dimensional joint distribution is overwhelming, Gibbs sampling adopts a ‘divide and conquer’ strategy: using the full conditional distributions, it breaks a complex N-dimensional joint sampling problem into N simple 1-dimensional sampling steps. This article explains its intuition, its mathematical justification (Brook’s Lemma), and a Python implementation. [Read More]
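As a preview, here is a minimal Gibbs sampler for the textbook case of a bivariate Gaussian, where both full conditionals are available in closed form; the correlation value and loop structure are illustrative, not necessarily the article’s implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                    # correlation of the target bivariate normal (illustrative)
n_samples = 5_000

x, y = 0.0, 0.0
samples = np.empty((n_samples, 2))
for i in range(n_samples):
    # Full conditionals of a standard bivariate normal with correlation rho:
    #   x | y ~ N(rho * y, 1 - rho^2),   y | x ~ N(rho * x, 1 - rho^2)
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    samples[i] = (x, y)

print("sample correlation:", np.corrcoef(samples.T)[0, 1])   # should be close to rho
```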

The Metropolis-Hastings Algorithm: Breaking the Symmetry

The original Metropolis algorithm is limited to symmetric proposals, which often ‘hit walls’ at boundaries or get lost in high dimensions. The MH algorithm introduces the ‘Hastings Correction’, allowing asymmetric proposals (such as Langevin dynamics) while preserving detailed balance, which significantly improves efficiency. [Read More]
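To make the Hastings correction concrete, here is a minimal sketch of MH with a log-normal (asymmetric) proposal on a positive-support target; the Exponential(1) target and the step size sigma are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalised log-density of an Exponential(1) target on x > 0 (illustrative).
    return -x if x > 0 else -np.inf

def mh_lognormal(n_iter=10_000, sigma=0.5):
    # Metropolis-Hastings with a log-normal random-walk proposal, which never leaves (0, inf).
    x, chain = 1.0, []
    for _ in range(n_iter):
        x_new = x * np.exp(sigma * rng.normal())      # asymmetric proposal: q(x'|x) is log-normal
        # Hastings correction log q(x|x') - log q(x'|x) reduces to log(x_new) - log(x) here.
        log_alpha = log_target(x_new) - log_target(x) + np.log(x_new) - np.log(x)
        if np.log(rng.random()) < log_alpha:
            x = x_new
        chain.append(x)
    return np.array(chain)

samples = mh_lognormal()
print("estimated mean:", samples.mean())   # should be close to 1 for Exponential(1)
```

Because the proposal is multiplicative, it can never jump below zero, so the boundary never has to reject anything; the correction term accounts for the proposal’s asymmetry.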

Metropolis Algorithm Explained: Implementation & Intuition

The Metropolis algorithm is the cornerstone of MCMC. We delve into its strategy for handling unnormalized densities, from the random-walk mechanism to sampling a 2D correlated Gaussian, complete with a Python implementation and visual diagnostics. [Read More]
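Here is a minimal sketch of such a sampler, assuming a 2-D Gaussian target with correlation 0.8 and a Gaussian random-walk proposal; both choices are illustrative rather than the article’s exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalised log-density of a 2-D Gaussian with correlation 0.8 (illustrative target).
rho = 0.8
prec = np.linalg.inv(np.array([[1.0, rho],
                               [rho, 1.0]]))

def log_p_unnorm(x):
    return -0.5 * x @ prec @ x          # normalising constant deliberately omitted

def metropolis(n_iter=10_000, step=0.6):
    # Random-walk Metropolis: symmetric Gaussian proposal, accept with prob min(1, p(x')/p(x)).
    x = np.zeros(2)
    chain = np.empty((n_iter, 2))
    accepted = 0
    for i in range(n_iter):
        x_new = x + rng.normal(scale=step, size=2)
        if np.log(rng.random()) < log_p_unnorm(x_new) - log_p_unnorm(x):
            x, accepted = x_new, accepted + 1
        chain[i] = x
    return chain, accepted / n_iter

chain, acc_rate = metropolis()
print("acceptance rate:   ", acc_rate)
print("sample correlation:", np.corrcoef(chain.T)[0, 1])
```

Only ratios of the unnormalized density appear in the acceptance step, which is exactly why the unknown normalization constant never matters.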

Monte Carlo Sampling

Understand the core concepts of Monte Carlo methods: the Law of Large Numbers, rejection sampling, importance sampling, and variance reduction techniques (antithetic variates, control variates, stratified sampling). [Read More]
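As a small illustration of why importance sampling helps, here is a sketch estimating the tail probability P(X > 3) under a standard normal; the shifted proposal N(3, 1) is an illustrative choice, not the article’s specific example.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 100_000

# Rare-event example: estimate P(X > 3) for X ~ N(0, 1).
x_plain = rng.standard_normal(n)
plain = (x_plain > 3).mean()                 # plain Monte Carlo: almost every sample misses the event

x_is = rng.normal(loc=3.0, size=n)           # proposal q = N(3, 1), concentrated near the event
weights = norm.pdf(x_is) / norm.pdf(x_is, loc=3.0)
importance = ((x_is > 3) * weights).mean()   # importance-sampling estimate with the same budget

print("exact:              ", 1 - norm.cdf(3))
print("plain Monte Carlo:  ", plain)
print("importance sampling:", importance)
```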

Introduction to MCMC

The reason we need MCMC is that many distributions are known only in unnormalized form, making traditional sampling and integration methods ineffective. By constructing a Markov chain whose stationary distribution is the target, we can draw samples from the chain’s long-run trajectory: the long-term distribution of the trajectory ≈ the target distribution. [Read More]
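To make ‘long-term distribution of the trajectory ≈ target distribution’ concrete, here is a toy 3-state chain: we compute its stationary distribution directly, then recover it from the empirical frequencies of a long simulated trajectory. The transition matrix is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 3-state Markov chain; the transition matrix is an illustrative choice.
P = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])

# Stationary distribution pi: the left eigenvector of P for eigenvalue 1, normalised to sum to 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()

# Simulate a long trajectory; its empirical state frequencies approach pi.
state, counts = 0, np.zeros(3)
for _ in range(200_000):
    state = rng.choice(3, p=P[state])
    counts[state] += 1

print("stationary pi:        ", np.round(pi, 3))
print("empirical frequencies:", np.round(counts / counts.sum(), 3))
```

MCMC algorithms run this logic in reverse: instead of analyzing a given chain, they construct a chain whose stationary distribution is the target we want to sample from.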