How Google uses graph theory to rank the web
PageRank is an algorithm used by Google to rank web pages. It treats the web as a directed graph where each page is a node and each hyperlink is an edge.
The key insight is elegant: a page is important if important pages link to it. This is a recursive definition, which we resolve using linear algebra.
Given a graph of n pages, we define the PageRank vector r as the solution to:
$$
\mathbf{r} = d \cdot M\mathbf{r} + \frac{1-d}{n}\mathbf{1}
$$
Where M is the column-stochastic transition matrix and d ≈ 0.85 is the damping factor.
We solve this iteratively — start with a uniform distribution, multiply by M repeatedly until convergence. This is exactly the stationary distribution of a random walk on the web graph.
The damping factor d models a user who follows links with probability d, and jumps to a random page with probability 1-d. This ensures the Markov chain is ergodic and has a unique stationary distribution.