Markov chain: Limiting Distribution & Reversibility
Limiting Distribution
Theorem 1: For an irreducible aperiodic chain, \(\{X_{t}\}\), \begin{equation} p_{ik}(t) \to \pi_{k} = \frac{1}{m_{k}} \quad \text{as} \quad t \to \infty \tag{1} \label{eq:1} \end{equation}
for all states \(i\) and \(k\).
i.e., for an irreducible aperiodic chain, the limiting probability does not depend on the starting state and equals the stationary probability.
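Before the proof, here is a quick numerical illustration, a minimal sketch in Python with a made-up 3-state transition matrix: the rows of \(\mathbf{P}^t\) all converge to the same vector \(\vec{\pi}\).

```python
import numpy as np

# A made-up irreducible, aperiodic transition matrix on 3 states.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

# Raise P to a large power: every row of P^t converges to pi,
# so the limit no longer depends on the starting state i.
Pt = np.linalg.matrix_power(P, 100)
print(Pt)  # all three rows are (numerically) identical

# Compare with the stationary distribution: the left eigenvector
# of P for eigenvalue 1, normalised to sum to 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()
print(pi)  # matches each row of Pt
```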
Proof:
(a) If the chain \(\{X_t\}\) is transient.
By Theorem 2 in the second post of this series, \begin{equation} \lim\limits_{t \to \infty}p_{ik}(t) = 0 \end{equation} for all states \(i, k\).
For a transient chain, \(m_{k} = \infty\) for every state \(k\), so the right-hand side of Equation \eqref{eq:1} is \begin{equation} \frac{1}{m_k} = \frac{1}{\infty} = 0 \end{equation} which agrees with the limit above.
Therefore Equation \eqref{eq:1} holds for a transient chain.
(b) If the chain \(\{X_t\}\) is recurrent.
Using coupling: Let \(Z = (X, Y)\) be a coupled chain, where \(X = \{X_{t}\}\) and \(Y = \{Y_{t}\}\) are independent Markov chains, with the same state space \(\mathcal{S}\) and transition matrix \(\mathbf{P}\).
(i) Assume \(X\) is positive recurrent:
\(Z = \{Z_t = (X_{t}, Y_{t})\}\), with state space \(\mathcal{S} \times \mathcal{S}\), is a Markov chain with transition probabilities:
\begin{align} p_{(i,j)(k,l)} &= P(Z_{t + 1} = (k, l) | Z_{t} = (i, j)) \\ &= P(X_{t+1} =k, Y_{t+1} = l | X_{t} = i, Y_{t} = j) \\ &= P(X_{t+1} = k | X_{t} = i)P(Y_{t+1} = l | Y_{t} = j ) & \quad(\text{by independence})\\ &= p_{ik}p_{jl} \tag{2} \label{eq:2} \end{align}
Therefore \begin{equation} p_{(i,j)(k,l)}(t)= p_{ik}(t)p_{jl}(t) \end{equation}
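In matrix form, Equation \eqref{eq:2} says that the transition matrix of \(Z\) is the Kronecker product \(\mathbf{P} \otimes \mathbf{P}\). A minimal sketch (the 2-state matrix is made up):

```python
import numpy as np

# A made-up 2-state transition matrix.
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])

# Transition matrix of Z = (X, Y): p_{(i,j)(k,l)} = p_ik * p_jl,
# i.e. the Kronecker product of P with itself, indexed by S x S.
PZ = np.kron(P, P)
print(PZ)
print(PZ.sum(axis=1))  # every row sums to 1, so PZ is stochastic
```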
It can be shown that, for an irreducible and aperiodic chain, for any states \((i,j)\) and \((k,l)\) there exists some \(t\) such that \(p_{(i,j)(k,l)}(t) > 0\). Therefore \(Z\) is also irreducible.
We assumed the chain \(X\) is positive recurrent; since \(Y\) has the same state space and transition matrix, \(Y\) is positive recurrent too, and both have the same unique stationary distribution \(\vec{\pi}\). Then \(Z\) also has a stationary distribution,
\begin{align} \vec{v}^{\top} &= (v_{(i,j)} | (i, j) \in \mathcal{S} \times \mathcal{S}) \\ \text{with } \quad v_{(i,j)} &= P(X=i, Y=j)= \pi_{i}\pi_j \tag{3} \label{eq:3} \end{align}
So \(Z\) is positive recurrent as well.
Let \begin{equation} T = \min\{t \in \mathbb{N}^{+}| Z_{t} = (X_t, Y_{t}) = (s, s) \in \mathcal{S} \times \mathcal{S}\} \tag{4} \label{eq:4} \end{equation}
be the time of the first passage of \(Z\) to state \((s,s)\). Suppose \(X_{0} = i\) and \(Y_{0} = j\), then \(Z_{0} = (i, j)\).
We have \begin{align} p_{ik}(t) &= P(X_{t} = k | X_{0} = i) \\ &= P(X_{t} = k | X_{0} = i, Y_{0} = j) \quad \text{(by independence)}\\ &= P(X_{t} = k, T \leq t | X_{0} = i, Y_{0} = j) + P(X_t = k, T > t | X_{0} = i, Y_{0} = j) \\ &= P(Y_t = k, T \leq t | X_{0} = i, Y_{0} = j) + P(X_t = k, T > t | X_{0} = i, Y_{0} = j) \\ &\leq P(Y_t = k | X_{0} = i, Y_{0} = j) + P(T > t | X_{0} = i, Y_{0} = j) \\ &= P(Y_t = k | Y_{0} = j) + P(T > t | X_{0} = i, Y_{0} = j) \quad \text{(by independence)}\\ &= p_{jk}(t) + P(T > t | X_{0} = i, Y_{0} = j) \label{eq:5} \tag{5} \end{align}
We replace \(X_{t}\) with \(Y_{t}\) in the fourth equality because on the event \(T \leq t\), \(X_{t}\) and \(Y_{t}\) are identically distributed; see (Takahara 2017, p. 161). The inequality comes from the fact that \(P(A \cap B | C) \leq P(A | C)\).
So \begin{equation} p_{ik}(t) - p_{jk}(t) \leq P(T > t|X_0 = i, Y_0 = j) \end{equation}
If we started the derivation in \eqref{eq:5} from \(p_{jk}(t) = P(Y_{t} = k | Y_{0} = j)\), with the roles of \(X\) and \(Y\) interchanged, we would get:
\begin{equation} p_{jk}(t) - p_{ik}(t) \leq P(T > t|X_0 = i, Y_0 = j) \end{equation}
Hence,
\begin{equation} |p_{ik}(t) - p_{jk}(t)| \leq P(T > t|X_0 = i, Y_0 = j) \label{eq:6} \tag{6} \end{equation}
Because \(Z\) is recurrent, \(P(T < \infty | X_0 = i, Y_0 = j) = 1\). Then \(\lim\limits_{t \to \infty} P(T > t | X_0 = i, Y_0 = j) = 0\). By Equation \eqref{eq:6}, this means that
\begin{equation} |p_{ik}(t) - p_{jk}(t)| \to 0 \quad \text{as } t \to \infty \end{equation}
This tells us that, in the limit, the transition probabilities do not depend on the starting state.
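To make the bound in \eqref{eq:6} concrete, here is a simulation sketch: run two independent copies of a made-up chain from different starting states, record the first time the pair hits \((s, s)\) (the choice \(s = 0\) is arbitrary), and estimate the tail \(P(T > t)\) that dominates \(|p_{ik}(t) - p_{jk}(t)|\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up irreducible, aperiodic chain on 3 states.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

def step(state):
    """Take one transition of the chain from `state`."""
    return rng.choice(len(P), p=P[state])

def coupling_time(i, j, s=0, max_t=10_000):
    """First time t with (X_t, Y_t) = (s, s), starting from (i, j)."""
    x, y = i, j
    for t in range(1, max_t + 1):
        x, y = step(x), step(y)
        if x == s and y == s:
            return t
    return max_t  # truncated: (s, s) not reached (rare for this chain)

# Estimate P(T > t) from repeated independent runs.
times = np.array([coupling_time(i=0, j=2) for _ in range(5000)])
for t in (1, 5, 10, 20, 50):
    print(t, np.mean(times > t))  # decays to 0, forcing (6) to 0
```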
Remember from the previous post \(\pi_{k} = \sum\limits_{i}\pi_{i}p_{ik}(t)\). This allows us to find the limiting distribution.
\begin{align} \pi_{k} - p_{jk}(t) &= \sum_{i}\pi_{i}p_{ik}(t) - \underbrace{\sum_{i}\pi_{i}}_{=1} p_{jk}(t) \\ &= \sum_{i}\pi_{i}(p_{ik}(t) - p_{jk}(t)) \to 0 \quad \text{as} \quad t \to \infty \end{align}
(The interchange of limit and sum is justified by bounded convergence, since \(|\pi_{i}(p_{ik}(t) - p_{jk}(t))| \leq \pi_{i}\) and \(\sum_{i}\pi_{i} = 1\).)
Therefore,
\begin{equation} \lim_{t \to \infty} p_{ik}(t) = \pi_{k} \end{equation}
(ii) If the chain is null recurrent:
We want to show that \(\lim\limits_{t \to \infty}p_{ij}(t) = 0\), which equals \(1/m_{j}\) since \(m_{j} = \infty\) for a null recurrent state. See (Grimmett and Stirzaker 2020, Section 6.4, Proof of (20), p. 261).
$$\tag*{$\blacksquare$}$$
The Law of Large Numbers
We can now state the (strong) law of large numbers for the Markov chain.
If \(g\) is a bounded function, then with probability \(1\),
\begin{equation} \frac{1}{N} \sum_{t = 1}^{N}g(X_{t}) \overset{a.s.}{\longrightarrow} \mathbb{E}_{\vec{\pi}}(g) = \sum_{i}g(i)\pi_{i} \quad \text{as} \quad N \to \infty \label{eq:7} \tag{7} \end{equation}
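A simulation sketch of \eqref{eq:7}, reusing the made-up 3-state chain from the earlier sketches with an arbitrary bounded \(g\): the time average along a single long trajectory approaches the stationary expectation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up 3-state chain, as in the earlier sketches.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

g = np.array([1.0, -2.0, 5.0])  # an arbitrary bounded function g(i)

# Average g along one trajectory of length N.
N, x, total = 100_000, 0, 0.0
for _ in range(N):
    x = rng.choice(3, p=P[x])
    total += g[x]

print(total / N)  # time average (1/N) * sum_t g(X_t)
print(g @ pi)     # E_pi(g) = sum_i g(i) pi_i; the two agree closely
```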
Reversibility
Let \(\{X_{t}\}\) be an irreducible positive recurrent chain. Assume that \(X_{0}\) has the stationary distribution \(\vec{\pi}\), so that \(X_{t}\) has distribution \(\vec{\pi}\) for all \(t\).
Fix a time \(T > 0\) and define the reversed chain \(Y\) by \(Y_{t} = X_{T - t}\) for \(t \in \{0, \dots, T\}\).
Theorem 2: \(Y\) is a Markov chain with transition matrix \(\mathbf{Q}\) with, \begin{equation} q_{ij} = P(Y_{t+1} = j | Y_{t} = i) = \frac{\pi_{j}}{\pi_{i}}p_{ji} \label{eq:8} \tag{8} \end{equation}
Proof: It is clear from Equation \eqref{eq:8} that \(q_{ij} \geq 0\). Now, \begin{equation} \sum_{j}q_{ij} = \frac{1}{\pi_i} \sum_{j}\pi_{j}p_{ji} = \frac{\pi_{i}}{\pi_{i}} = 1 \end{equation} Hence \(\mathbf{Q}\) is a stochastic matrix.
We next want to show that \(Y\) satisfies the Markov property.
\begin{align} P(Y_{t+1} = i_{t+1} | Y_{t} = i_{t}, \dots, Y_{0} = i_{0}) &= \frac{P(Y_{t+1} = i_{t+1}, Y_{t} = i_{t}, \dots, Y_{0} = i_{0})}{P(Y_{t} = i_{t}, \dots, Y_{0} = i_{0})} \\ &= \frac{P(X_{T-t-1} = i_{t+1}, X_{T-t}=i_{t}, \dots, X_{T} = i_{0})}{P(X_{T-t}=i_{t}, \dots, X_{T}=i_{0})} \\ &= \frac{\pi_{i_{t+1}}p_{i_{t+1}i_{t}} p_{i_{t}i_{t-1}}\cdots p_{i_{1}i_{0}}}{\pi_{i_{t}}p_{i_{t}i_{t-1}} \cdots p_{i_{1}i_{0}}} \\ &= \frac{\pi_{i_{t+1}}p_{i_{t+1}i_{t}}}{\pi_{i_{t}}} \\ &= P(Y_{t+1}= i_{t + 1} | Y_{t} = i_{t}) \quad \text{(after some manipulation)} \end{align}
If we let \(i_{t+1} = j, i_{t} = i\), we get the desired \begin{equation} q_{ij} = \frac{\pi_{j}}{\pi_{i}}p_{ji} \tag*{$\blacksquare$} \end{equation}
The chain \(\{Y_t\}\) is called the time reversal of \(X\).
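In matrix form, \eqref{eq:8} reads \(\mathbf{Q} = \mathrm{diag}(\vec{\pi})^{-1}\mathbf{P}^{\top}\mathrm{diag}(\vec{\pi})\). A minimal sketch that builds \(\mathbf{Q}\) for the made-up chain used earlier and confirms the claims in the proof:

```python
import numpy as np

# Made-up 3-state chain, as in the earlier sketches.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

# Stationary distribution of P.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

# Time reversal: q_ij = (pi_j / pi_i) * p_ji.
Q = np.diag(1 / pi) @ P.T @ np.diag(pi)

print(Q.sum(axis=1))            # each row sums to 1: Q is stochastic
print(np.allclose(pi @ Q, pi))  # True: pi is stationary for Q as well
```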
Definition: A distribution \(\vec{\mu}\) and a transition matrix \(\mathbf{P}\) are in detailed balance if for all states \(i, j \in \mathcal{S}\),
\begin{equation} \mu_{i}p_{ij} = \mu_{j}p_{ji} \label{eq:9} \tag{9} \end{equation}
Definition: The chain \(\{X_t\}\) is called reversible if the transition matrices of \(\{X_t\}\) and \(\{Y_t\}\) are the same.
That is, \(q_{ij} = p_{ij}\). This means that, \begin{equation} \pi_{i}p_{ij} = \pi_{j}p_{ji} \end{equation}
for all states \(i,j\). In this case the transition matrix is in detailed balance with the stationary distribution, and \(\{X_{t}\}\) is said to be reversible in equilibrium.
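Birth-death chains (which only move between neighbouring states) are the classic reversible example. A minimal sketch checking \eqref{eq:9} for a made-up birth-death chain on \(\{0, 1, 2\}\):

```python
import numpy as np

# A made-up birth-death chain: transitions only between neighbours.
P = np.array([[0.7, 0.3, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.6, 0.4]])

# Stationary distribution of P.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

# Detailed balance: pi_i * p_ij == pi_j * p_ji for all i, j,
# i.e. the "probability flow" matrix is symmetric.
flows = pi[:, None] * P
print(np.allclose(flows, flows.T))  # True: the chain is reversible
```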
Theorem 3: Given an irreducible chain with transition matrix \(\mathbf{P}\), if a distribution \(\vec{v}\) satisfies detailed balance with \(\mathbf{P}\), then it is a stationary distribution.
Proof:
Suppose \(\vec{v}\) satisfies the detailed balance condition. Then \begin{align} v_{i}p_{ij} &= v_{j}p_{ji} \\ \sum_{i}v_{i}p_{ij} &= \sum_{i}v_{j}p_{ji} \\ &= v_{j}\sum_{i}p_{ji} \\ &= v_{j} \end{align}
\(\vec{v}\) is thus stationary (from the definition). $$\tag*{$\blacksquare$}$$
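As a numerical check of the theorem, we can construct \(\vec{v}\) for the birth-death chain above directly from the detailed balance recursion \(v_{i+1} = v_{i}\,p_{i,i+1}/p_{i+1,i}\), without solving for eigenvectors, and confirm that it is stationary:

```python
import numpy as np

# Same made-up birth-death chain as above.
P = np.array([[0.7, 0.3, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.6, 0.4]])

# Build v from detailed balance alone: v_{i+1} = v_i * p_{i,i+1} / p_{i+1,i}.
v = np.ones(3)
for i in range(2):
    v[i + 1] = v[i] * P[i, i + 1] / P[i + 1, i]
v = v / v.sum()  # normalise to a distribution

# By the theorem, v must be stationary: v P = v.
print(np.allclose(v @ P, v))  # True
```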
References
Grimmett, Geoffrey, and David Stirzaker. 2020. Probability and Random Processes. Oxford; New York: Oxford University Press.
Takahara, Glen. 2017. Stochastic Processes. https://mast.queensu.ca/~stat455/lecturenotes/set3.pdf.
Wasserman, Larry. 2004. All of Statistics: A Concise Course in Statistical Inference. Springer Texts in Statistics. New York: Springer. https://doi.org/10.1007/978-0-387-21736-9.