## Completing the square

- Category: Reference
- Published on 28 January 2012
- Written by Richard D. Morey

Several of the explanations on this website require a bit of algebra to understand. "Completing the square" refers to taking an expression of the form \(ax^2 + bx + c\) and, through algebraic manipulation, ending up with an expression of the form \(a(x + d)^2 + e\). Completing the square is a useful algebra trick to know, since it arises often when multiplying likelihoods by priors, especially in the context of the Normal distribution.
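As a quick illustration of the general form, matching coefficients in \(ax^2 + bx + c = a(x + d)^2 + e\) gives \(d = b/(2a)\) and \(e = c - b^2/(4a)\). The sketch below is not part of the original derivation; the function name `complete_square` and the example coefficients are made up for illustration.

```python
def complete_square(a, b, c):
    """Return (d, e) such that a*x**2 + b*x + c == a*(x + d)**2 + e."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic")
    d = b / (2 * a)           # shift inside the square
    e = c - b**2 / (4 * a)    # constant left over outside the square
    return d, e


# Hypothetical example: 2x^2 + 12x + 5 = 2(x + 3)^2 - 13
d, e = complete_square(2.0, 12.0, 5.0)
print(d, e)  # 3.0 -13.0
```

Expanding \(2(x + 3)^2 - 13\) by hand recovers \(2x^2 + 12x + 5\), which is a convenient way to check the result.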

In this reference tutorial, we will demonstrate how to complete the square in both univariate (scalar) and multivariate (matrix) contexts. Incidentally, this article also contains the derivation of the posterior distribution of the mean of the normal distribution when the variance is known in both univariate and multivariate contexts.

## Completing the square: univariate

### Preliminaries

Consider the case of observing \(N\) independent samples from a \(\mbox{Normal}(\mu,\sigma^2)\) distribution. Suppose we know the true value of \(\sigma^2\), and we are interested in determining the posterior distribution of \(\mu\). It is conventional to place a conjugate normal prior on \(\mu\). Our model is:

\(\begin{eqnarray}y_i&\stackrel{iid}{\sim}&\mbox{Normal}(\mu,\sigma^2)\\\mu&\sim&\mbox{Normal}(\mu_0,\tau^2)\end{eqnarray}\)

In Bayesian statistics, the posterior is proportional to the likelihood times the prior. Because all observations are independent, our likelihood is the product of \(N\) Normal density functions: one for each \(y_i\). The prior then provides another Normal density function term. After simplifying and dropping terms that are not functions of \(\mu\), we end up with a posterior distribution for \(\mu\) proportional to

\(\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N(y_i - \mu)^2\right\}\exp\left\{-\frac{1}{2\tau^2}(\mu - \mu_0)^2\right\}\)

The first term is the likelihood, and the second term is the prior. Because we desire a posterior that is a simple function of \(\mu\), we need to gather all the terms that include \(\mu\) together; as the posterior is written above, \(\mu\) is scattered across \(N+1\) terms. Completing the square is the trick that will allow us to gather all the \(\mu\) terms into one.
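To make the two factors concrete, here is a minimal sketch that evaluates the logarithm of the unnormalized posterior above for a given \(\mu\). The function name `log_posterior_unnorm` and the data and hyperparameter values are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def log_posterior_unnorm(mu, y, sigma2, mu0, tau2):
    """Log of (likelihood * prior), up to an additive constant in mu."""
    # Likelihood term: one squared deviation per observation.
    log_lik = -sum((yi - mu) ** 2 for yi in y) / (2 * sigma2)
    # Prior term: a single squared deviation from the prior mean.
    log_prior = -((mu - mu0) ** 2) / (2 * tau2)
    return log_lik + log_prior


y = [1.0, 2.0, 3.0]  # hypothetical data
print(log_posterior_unnorm(2.0, y, sigma2=1.0, mu0=0.0, tau2=4.0))  # -1.5
```

Here \(\mu\) appears in every one of the \(N + 1 = 4\) squared terms, which is exactly the scattering that completing the square removes.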

The first step is to expand the squares containing \(\mu\). This yields

\(\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N\left(y_i^2 - 2\mu y_i + \mu^2\right)\right\}\exp\left\{-\frac{1}{2\tau^2}\left(\mu^2 - 2\mu\mu_0 + \mu_0^2\right)\right\}\)

### Completing the square

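Next, we combine the two exponents and collect the \(\mu^2\) terms and the \(\mu\) terms, dropping any term that does not involve \(\mu\) (such terms are absorbed into the proportionality constant). To make the notation easier to follow, let \(a = \frac{N}{\sigma^2} + \frac{1}{\tau^2}\) and \(b = \frac{N}{\sigma^2}\bar{y} + \frac{1}{\tau^2}\mu_0\), where \(\bar{y} = \frac{1}{N}\sum_{i=1}^N y_i\) is the sample mean. Using the new, simplified notation, the exponent becomes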

\(-\frac{1}{2}\left(a\mu^2 - 2b\mu\right)\)
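One way to convince yourself that nothing involving \(\mu\) was lost in this step is to check numerically that the full exponent and the shortened one differ only by a constant. The following is a sketch under assumed, made-up data values; `quad_coeffs`, `full_exponent`, and `short_exponent` are illustrative names, not part of the original article.

```python
def quad_coeffs(y, sigma2, mu0, tau2):
    """Coefficients a and b as defined in the text."""
    n = len(y)
    ybar = sum(y) / n
    a = n / sigma2 + 1 / tau2
    b = n * ybar / sigma2 + mu0 / tau2
    return a, b


y = [1.0, 2.0, 3.0]                  # hypothetical data
sigma2, mu0, tau2 = 1.0, 0.0, 4.0    # hypothetical known variance and prior
a, b = quad_coeffs(y, sigma2, mu0, tau2)


def full_exponent(mu):
    # Exponent of likelihood times prior, before any terms are dropped.
    return (-sum((yi - mu) ** 2 for yi in y) / (2 * sigma2)
            - (mu - mu0) ** 2 / (2 * tau2))


def short_exponent(mu):
    # Exponent after collecting terms: -(1/2)(a mu^2 - 2 b mu).
    return -0.5 * (a * mu**2 - 2 * b * mu)


# The difference should be the same for every mu (the dropped constant).
diffs = [full_exponent(mu) - short_exponent(mu) for mu in (-1.0, 0.0, 2.5)]
print(a, b, diffs)
```

If the collected coefficients were wrong, the differences would change with \(\mu\) instead of staying fixed.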

We can move the coefficient \(a\) on \(\mu\) outside the parentheses:

\(-\frac{a}{2}\left(\mu^2 - 2\frac{b}{a}\mu\right)\)

We now add and subtract the same value inside the parentheses. This doesn't change the value at all, since the terms sum to 0:

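\(-\frac{a}{2}\left(\mu^2 - 2\frac{b}{a}\mu + \frac{b^2}{a^2} \color{red}{- \frac{b^2}{a^2}}\right)\)

The subtracted term (shown in red) does not involve \(\mu\). Multiplied by \(-\frac{a}{2}\), it contributes only the constant \(\frac{b^2}{2a}\) to the exponent, which is absorbed into the proportionality constant. Dropping it leaves

\(-\frac{a}{2}\left(\mu^2 - 2\frac{b}{a}\mu + \frac{b^2}{a^2}\right)\)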

The terms within the parentheses are of the form \(x^2 - 2xc + c^2\), which, from the rules learned in algebra, can be simplified to \((x - c)^2\). Applying this to our terms, we obtain:

\(-\frac{a}{2}\left(\mu - \frac{b}{a}\right)^2\)
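The completed square differs from the expression we started with only by the constant \(\frac{b^2}{2a}\) that was dropped along the way. A small numerical check of that claim, using arbitrary illustrative values for \(a\) and \(b\) (they are assumptions, not values from the article):

```python
a, b = 3.25, 6.0  # hypothetical coefficients, with a > 0


def before(mu):
    # Exponent before completing the square.
    return -0.5 * (a * mu**2 - 2 * b * mu)


def after(mu):
    # Exponent after completing the square.
    return -0.5 * a * (mu - b / a) ** 2


expected_gap = b**2 / (2 * a)  # the constant that was dropped
gaps = [before(mu) - after(mu) for mu in (-3.0, 0.0, 1.0, 4.5)]
print(expected_gap, gaps)
```

Every entry of `gaps` should agree with `expected_gap` up to floating-point rounding, regardless of the value of \(\mu\).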

We have thus completed the square. We are not done, however: this was only the portion of the posterior distribution that was in the exponent. Replacing the terms in the exponent yields:

\(\exp\left\{-\frac{a}{2}\left(\mu - \frac{b}{a}\right)^2\right\}\)

By completing the square, we have revealed that the posterior distribution of \(\mu\) has the form of a normal distribution with a mean of \(b/a\) and a variance of \(1/a\), or

\(\mu\mid y \sim \mbox{Normal}\left(\mu_n, \sigma^2_n\right)\)

where

\(\begin{eqnarray}\sigma^2_n = \frac{1}{a}&=&\left(\frac{N}{\sigma^2}+\frac{1}{\tau^2}\right)^{-1},\\\mu_n = \frac{b}{a}&=&\sigma^2_n\left(\frac{N}{\sigma^2}\bar{y} + \frac{1}{\tau^2}\mu_0\right).\end{eqnarray}\)
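The closed-form result can be compared against a brute-force approximation of the posterior on a grid. The sketch below assumes made-up data and hyperparameters; the function name `normal_posterior` and the grid settings are illustrative choices rather than anything prescribed by the derivation.

```python
import math


def normal_posterior(y, sigma2, mu0, tau2):
    """Posterior mean and variance of mu when sigma2 is known."""
    n = len(y)
    ybar = sum(y) / n
    a = n / sigma2 + 1 / tau2             # posterior precision
    b = n * ybar / sigma2 + mu0 / tau2
    return b / a, 1 / a                   # (mu_n, sigma2_n)


y = [1.0, 2.0, 3.0]                       # hypothetical data
sigma2, mu0, tau2 = 1.0, 0.0, 4.0         # hypothetical known variance and prior
mu_n, var_n = normal_posterior(y, sigma2, mu0, tau2)

# Brute-force check: weight each grid point by likelihood * prior.
grid = [-10 + 0.001 * i for i in range(20001)]
weights = [
    math.exp(-sum((yi - m) ** 2 for yi in y) / (2 * sigma2)
             - (m - mu0) ** 2 / (2 * tau2))
    for m in grid
]
total = sum(weights)
grid_mean = sum(m * w for m, w in zip(grid, weights)) / total
grid_var = sum((m - grid_mean) ** 2 * w for m, w in zip(grid, weights)) / total

print(mu_n, grid_mean)
print(var_n, grid_var)
```

The grid-based mean and variance should match \(\mu_n\) and \(\sigma^2_n\) closely. Note that \(\mu_n\) is a precision-weighted average of the sample mean \(\bar{y}\) and the prior mean \(\mu_0\), so it always lies between the two.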
