Implementing Bayesian Inference Using PHP: Part 2

Bayes estimators

While the first article in this series discussed building intelligent Web applications through conditional probability, this Bayesian inference article examines how you can use Bayes methods to solve parameter estimation problems. Relevant concepts are explained in the context of Web survey analysis using PHP and JPGraph. (This intermediate-level article was first published by IBM developerWorks on April 12, 2004 at http://www.ibm.com/developerWorks).

A Bayes estimator combines information from a prior parameter estimate P(θ) and a likelihood parameter estimate P(R | θ) to arrive at a posterior parameter estimate P(θ | R). In the Bayes parameter estimation formula below, R stands for "results" and θ stands for "parameter":

P(θ | R) = P(R | θ) P(θ) / P(R)

In the specific case of a simple binary survey, the sample results can be expressed as the number of success events k divided by the total number of events n:

R = k/n

The Bayes parameter estimation formula for poll data looks like this:

P(θ | k/n) = P(k/n | θ) * P(θ) / P(k/n)

Recall that the denominator term P(k/n) plays only a normalizing role. It does not depend on θ, so you can ignore it for the purposes of understanding how to compute the posterior distribution, which is proportional to the likelihood times the prior:

P(θ | k/n) ∝ P(k/n | θ) * P(θ)
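
To see why the normalizing term can be set aside, here is a minimal sketch that multiplies likelihood by prior over a small grid of candidate parameter values and then rescales the products so they sum to 1. The function name gridPosterior, the five grid points, the flat prior, and the survey result of 7 successes out of 10 are all assumptions made for illustration; they do not come from the article's own code.

```php
<?php
// Posterior over a grid of candidate parameter values.
// Hypothetical helper: the name and signature are illustrative only.
function gridPosterior(int $k, int $n, array $thetas, array $prior): array
{
    $unnormalized = [];
    foreach ($thetas as $i => $theta) {
        // Likelihood up to a constant: nCk does not depend on theta,
        // so it cancels when the products are rescaled below.
        $likelihood = pow($theta, $k) * pow(1.0 - $theta, $n - $k);
        $unnormalized[$i] = $likelihood * $prior[$i];
    }

    // Dividing by the sum plays the role of the normalizing term P(k/n).
    $total = array_sum($unnormalized);
    $posterior = [];
    foreach ($unnormalized as $i => $value) {
        $posterior[$i] = $value / $total;
    }
    return $posterior;
}

// Assumed grid of candidate values and an assumed flat prior over them.
$thetas = [0.1, 0.3, 0.5, 0.7, 0.9];
$prior = array_fill(0, count($thetas), 1.0 / count($thetas));

// Assumed survey result: 7 successes out of 10 responses.
$posterior = gridPosterior(7, 10, $thetas, $prior);
foreach ($thetas as $i => $theta) {
    printf("theta = %.1f  posterior = %.4f\n", $theta, $posterior[$i]);
}
```

Because the rescaling step divides every product by the same total, the ranking of the candidate values is already fixed by likelihood times prior alone. A grid of five points is far too coarse for real analysis; it only keeps the arithmetic easy to follow.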

In the last few sections, I have shown you how the likelihood term P(k/n | θ) in the above formula can be computed using maximum likelihood techniques, in particular the binomial formula for computing the probability of various values of θ (where p is replaced by θ, the generic term denoting a parameter):

P(k/n | θ) = nCk * θ^k * (1 - θ)^(n - k)
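
The binomial likelihood can be sketched in a few lines of PHP. The function names binomialCoefficient and binomialLikelihood and the sample values (7 successes in 10 responses) are hypothetical and chosen only for illustration; they are not taken from the article's own code package.

```php
<?php
// Binomial coefficient nCk, built up multiplicatively so that
// large factorials are never computed directly.
function binomialCoefficient(int $n, int $k): float
{
    if ($k < 0 || $k > $n) {
        return 0.0;
    }
    $k = min($k, $n - $k); // use the symmetry nCk = nC(n-k)
    $result = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        // Multiply first, then divide: each intermediate value is an integer.
        $result = $result * ($n - $k + $i) / $i;
    }
    return $result;
}

// Likelihood of k successes in n trials for a given parameter value:
// nCk * theta^k * (1 - theta)^(n - k)
function binomialLikelihood(int $k, int $n, float $theta): float
{
    return binomialCoefficient($n, $k)
        * pow($theta, $k)
        * pow(1.0 - $theta, $n - $k);
}

// Assumed survey result: 7 successes out of 10 responses.
$k = 7;
$n = 10;
foreach ([0.1, 0.3, 0.5, 0.7, 0.9] as $theta) {
    printf("theta = %.1f  likelihood = %.4f\n", $theta, binomialLikelihood($k, $n, $theta));
}
```

Among the candidate values tried here, the likelihood is largest at 0.7, which matches the sample proportion k/n.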

Now that you know how to compute the likelihood term in Bayes' equation, how can you compute the prior term P(θ)?

The key to computing P(θ) is to first recognize that θ represents the probability of a success event (like a 1-coded response) and as such can only take on values in the 0 to 1 range. Each value of θ in this range will have a different probability of occurrence associated with it. The parameter θ can assume an infinite number of values between 0 and 1, which means that you need to represent it with a continuous probability distribution (like the normal distribution) as opposed to a discrete probability distribution (like the binomial distribution).

In the case of a simple binary survey, the beta distribution is the appropriate continuous distribution to use to represent P(θ) because:

The domain of your probability distribution function is between 0 and 1, and

The outcomes of your survey arise from a Bernoulli process.

A Bernoulli process:

consists of a series of independent, dichotomous trials where the possible events occurring on each trial are labeled "success" and "failure", p is the probability of success on a given trial, and p remains unchanged from trial to trial. -- Winkler and Hayes, Statistics: Probability, Inference and Decision, p. 204.
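
The definition above can be made concrete with a short simulation. This is only a sketch: the function name simulateBernoulliProcess, the success probability of 0.6, the sample size of 1000, and the fixed seed are all assumptions chosen for illustration.

```php
<?php
// Simulate n independent, dichotomous trials in which the success
// probability p stays the same from trial to trial.
// Hypothetical helper: the name and signature are illustrative only.
function simulateBernoulliProcess(int $n, float $p): array
{
    $trials = [];
    for ($i = 0; $i < $n; $i++) {
        // mt_rand() / mt_getrandmax() is a uniform draw between 0 and 1.
        $draw = mt_rand() / mt_getrandmax();
        $trials[] = ($draw < $p) ? 1 : 0; // 1 = "success", 0 = "failure"
    }
    return $trials;
}

mt_srand(42);   // fixed seed so repeated runs give the same sequence
$p = 0.6;       // assumed true probability of a 1-coded response
$n = 1000;      // assumed number of survey responses

$responses = simulateBernoulliProcess($n, $p);
$k = array_sum($responses); // number of success events

printf("k = %d, n = %d, k/n = %.3f\n", $k, $n, $k / $n);
```

With a reasonably large n, the sample proportion k/n should land close to the assumed p, which is why k/n serves as the estimate of the parameter.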

The process that generates the observed response distribution for a particular binary question in the survey can legitimately be viewed as a Bernoulli process as Winkler and Hayes defined it. When the data arise from a Bernoulli process, the beta distribution is the natural choice for representing uncertainty about the parameter p (estimated using k/n). I'm ready now to discuss the beta distribution and the critical role it plays in computing the posterior parameter estimate P(θ | R).