Page 6 - Implementing Bayesian Inference Using PHP: Part 2
Algebraic cleverness
While the first article in this series discussed building intelligent Web applications through conditional probability, this Bayesian inference article examines how you can use Bayes methods to solve parameter estimation problems. Relevant concepts are explained in the context of Web survey analysis using PHP and JPGraph. (This intermediate-level article was first published by IBM developerWorks on April 12, 2004 at http://www.ibm.com/developerWorks).
When estimating p using maximum likelihood analysis, you use the binomial formula to compute the likelihood of p:
$$P(k/n \mid p_i) = {}_{n}C_{k} \; p_i^{k} \, (1 - p_i)^{n - k}$$
The computation involved keeping the k and n parameters fixed while varying p_i and then seeing which value of p_i maximized the likelihood of the results k/n. If you examine the above equation, you should note that the value of nCk remains constant as you vary p_i. This implies that you can drop this term from the equation without affecting the shape of the likelihood distribution or the MLE. To confirm this, you can modify the likelihood graphing code so that the line computing the likelihood uses the binomial formula without the combinations term.
When you do this, you get the following graph:
Figure 2. The likelihood distribution graph (reduced formula)
Note that the likelihood value at the MLE is smaller than before, but that 0.20 is still the MLE of p. Because the likelihood distribution is not a probability distribution, these reduced values are immaterial; the shape and the location of the maximum are all that really matter. From now on, you can use this reduced formula to compute the MLE of p:
$$P(k/n \mid p_i) \propto p_i^{k} \, (1 - p_i)^{n - k}$$
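As a quick check that dropping the combinations term leaves the MLE unchanged, the following minimal PHP sketch evaluates both formulas over a grid of candidate p values. The function names and the sample values n = 5 and k = 1 are assumptions chosen for illustration (picked only so that k/n = 0.20); they are not taken from the article's graphing code.

```php
<?php
// Illustrative sketch only: function names and sample data are assumptions.

// Number of combinations nCk, computed iteratively to avoid large factorials.
function combinations(int $n, int $k): float {
    $result = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        $result *= ($n - $k + $i) / $i;
    }
    return $result;
}

// Full binomial likelihood: nCk * p^k * (1 - p)^(n - k).
function fullLikelihood(int $n, int $k, float $p): float {
    return combinations($n, $k) * pow($p, $k) * pow(1 - $p, $n - $k);
}

// Reduced likelihood: the same formula without the constant nCk term.
function reducedLikelihood(int $n, int $k, float $p): float {
    return pow($p, $k) * pow(1 - $p, $n - $k);
}

// Return the candidate p on an evenly spaced grid that maximizes $fn.
function gridMle(callable $fn, int $n, int $k, int $steps = 20): float {
    $bestP = 0.0;
    $bestValue = -INF;
    for ($i = 0; $i <= $steps; $i++) {
        $p = $i / $steps;
        $value = $fn($n, $k, $p);
        if ($value > $bestValue) {
            $bestValue = $value;
            $bestP = $p;
        }
    }
    return $bestP;
}

$n = 5; // hypothetical number of trials
$k = 1; // hypothetical number of successes

$mleFull = gridMle('fullLikelihood', $n, $k);
$mleReduced = gridMle('reducedLikelihood', $n, $k);

printf("MLE (full formula):    %.2f\n", $mleFull);
printf("MLE (reduced formula): %.2f\n", $mleReduced);
```

With these sample values both lines report 0.20: the two curves differ only by the constant factor nCk, so they peak at the same p.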
Another bit of algebraic cleverness involves eliminating the exponents and multiplications by taking the logarithm of the right-hand side. When you do so, you obtain the log likelihood formula (commonly denoted with a capital L):
$$L = k \log(p_i) + (n - k) \log(1 - p_i)$$
Note that taking the log of a formula with exponents in it changes the exponents into multipliers. Also, terms that were multiplied are now added. It is often easier to find the point where the derivative equals 0 (and thus the MLE) with the log version of the formula.
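To make the derivative step concrete, here is a short worked derivation using standard calculus. It writes p for the candidate value and uses the hat notation for the MLE, which is an addition not used elsewhere in this article:

```latex
\begin{align*}
L(p) &= k \log p + (n - k) \log(1 - p) \\
\frac{dL}{dp} &= \frac{k}{p} - \frac{n - k}{1 - p} = 0 \\
k (1 - p) &= (n - k) \, p \\
\hat{p} &= \frac{k}{n}
\end{align*}
```

In other words, the log likelihood peaks at the observed proportion k/n, so any data with k/n = 0.20 yields an MLE of 0.20.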
To convince yourself that the log likelihood formula can be used to derive the MLE of p, you can modify the likelihood graphing code so that it computes the log likelihood instead. When you do this, you get the following graph:
Figure 3. The likelihood distribution graph (log likelihood formula)
As you can see, the shape has changed somewhat but the MLE of p is still 0.20. Note also that the first graphed point does not start at 0 because the log of 0 is negative infinity, which cannot be plotted. The simple solution I adopted was to start plotting with a p value of 0.05 instead of 0.
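The log likelihood computation and the p = 0.05 starting point can be sketched in PHP as follows. This is not the article's graphing code: the function name, the grid, and the sample values n = 5 and k = 1 are assumptions chosen so that k/n = 0.20.

```php
<?php
// Illustrative sketch only: names and sample data are assumptions.

// Log likelihood: k * log(p) + (n - k) * log(1 - p).
function logLikelihood(int $n, int $k, float $p): float {
    return $k * log($p) + ($n - $k) * log(1 - $p);
}

$n = 5; // hypothetical number of trials
$k = 1; // hypothetical number of successes

// PHP's log(0) evaluates to -INF, so p = 0 cannot be placed on a graph.
$logOfZero = log(0);
printf("log(0) is infinite: %s\n", is_infinite($logOfZero) ? 'yes' : 'no');

// Evaluate the log likelihood on a grid from 0.05 to 0.95 in steps of 0.05,
// skipping the endpoints 0 and 1 where a logarithm of 0 would appear.
$bestP = 0.0;
$bestValue = -INF;
for ($i = 1; $i <= 19; $i++) {
    $p = $i / 20;
    $value = logLikelihood($n, $k, $p);
    if ($value > $bestValue) {
        $bestValue = $value;
        $bestP = $p;
    }
}

printf("MLE (log likelihood formula): %.2f\n", $bestP);
```

Because the logarithm is a strictly increasing function, the grid point that maximizes the log likelihood is the same one that maximizes the likelihood itself.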
It was necessary to show you these alternate formulas for computing the likelihood of p because statisticians often switch between them in different contexts based on mathematical convenience. The log likelihood version is especially important because logarithms appear throughout logistic regression, which is often used to analyze multivariate surveys and experimental data that have a binary dependent measure that one wants to predict and explain. Logistic regression uses maximum likelihood techniques to estimate θ (theta) on the basis of several explanatory variables (shown here for n explanatory variables):
$$\theta = P(Y = 1 \mid x_{i1}, \ldots, x_{in})$$
Logistic regression is used for estimation, prediction, and modeling purposes and is an important technique you should learn if you want to design and analyze multivariate binary surveys.
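To connect the log likelihood idea to logistic regression, here is a minimal PHP sketch of the quantity that a maximum likelihood routine would maximize. It assumes the standard logistic model, in which theta is the logistic function of a linear combination of the explanatory variables; the function names, coefficients, and survey rows are all hypothetical, and the search for the best coefficients is omitted.

```php
<?php
// Illustrative sketch only: the coefficients and survey rows are made up.

// Standard logistic model: theta = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))).
function predictTheta(array $coefficients, array $x): float {
    $z = $coefficients[0]; // intercept b0
    foreach ($x as $j => $value) {
        $z += $coefficients[$j + 1] * $value;
    }
    return 1.0 / (1.0 + exp(-$z));
}

// Log likelihood of binary outcomes y (0 or 1) under the given coefficients.
function logisticLogLikelihood(array $coefficients, array $rows, array $y): float {
    $total = 0.0;
    foreach ($rows as $i => $x) {
        $theta = predictTheta($coefficients, $x);
        $total += $y[$i] * log($theta) + (1 - $y[$i]) * log(1 - $theta);
    }
    return $total;
}

// Hypothetical survey: two binary explanatory variables per respondent.
$rows = [[1, 0], [0, 1], [1, 1], [0, 0]];
$y = [1, 0, 1, 0]; // hypothetical binary dependent measure

$guessA = [0.0, 0.0, 0.0];  // predicts theta = 0.5 for every respondent
$guessB = [-1.0, 2.0, 0.0]; // gives weight to the first variable

$llA = logisticLogLikelihood($guessA, $rows, $y);
$llB = logisticLogLikelihood($guessB, $rows, $y);

printf("Log likelihood, guess A: %.4f\n", $llA);
printf("Log likelihood, guess B: %.4f\n", $llB);
```

The coefficient set with the higher (less negative) log likelihood fits the observed answers better; an estimation routine would keep adjusting the coefficients until this value stops improving.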