What is PageRank?
PageRank is an algorithm used by Google to rank web pages. It treats the web as a directed graph where each page is a node and each hyperlink is an edge.
The Core Idea
The key insight is elegant: a page is important if important pages link to it. This is a recursive definition, which we resolve using linear algebra.
Given a graph of n pages, we define the PageRank vector r as the solution to:
Where M is the column-stochastic transition matrix and d ≈ 0.85 is the damping factor.
Power Iteration
We solve this iteratively — start with a uniform distribution, multiply by M repeatedly until convergence. This is exactly the stationary distribution of a random walk on the web graph.
Why It Works
The damping factor d models a user who follows links with probability d, and jumps to a random page with probability 1-d. This ensures the Markov chain is ergodic and has a unique stationary distribution.