Implementing Bayesian Inference Using PHP: Part 2

What is parameter estimation?

While the first article in this series discussed building intelligent Web applications through conditional probability, this article examines how you can use Bayesian methods to solve parameter estimation problems. Relevant concepts are explained in the context of Web survey analysis using PHP and JPGraph. (This intermediate-level article was first published by IBM developerWorks on April 12, 2004 at http://www.ibm.com/developerWorks.)

Parameter estimation refers to the process of using sample data to estimate the value of a population parameter (for example, the mean, variance, or proportion) or a model parameter (for example, a weight in a regression equation). Surveys are often used for the purposes of parameter estimation.

A consumer survey might ask about past spending decisions in various retail sectors. The purpose might be to estimate the parameter values to use in a mathematical model describing the complete consumer spending profile of the sampled population. (The surveys I use in this article are not so complex; I am more concerned with understanding the basic concepts involved.) To start, I want to focus on a simple binary survey (a poll).

The most common inference technique for estimating the "true value" of a parameter is the maximum likelihood technique. The manner in which a maximum likelihood estimator (MLE) is computed depends upon:

The level of measurement used (nominal, ordinal, interval, ratio) to record data.

The theoretical sampling distribution that best describes the response distribution (normal, binomial, Poisson, exponential).

In the case of a simple binary survey, the level of measurement is nominal because you use a category-based scale with two levels to measure participant responses to the question.

In what follows, you use the binomial sampling distribution to model the response distribution obtained from a simple binary survey. You could also use a hypergeometric distribution if the fact that you are sampling without replacement is a concern (for example, when sampling from a small, finite population, as in an employee survey).
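To make the difference between the two sampling distributions concrete, here is a minimal sketch that computes both probabilities for the same poll result. The function names (choose, binomialPmf, hypergeometricPmf) and the example numbers are hypothetical, chosen only for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helpers for illustration; they are not taken from the article's own code.

// Number of ways to choose $k items from $n, built up iteratively
// so that large factorials are never computed directly.
function choose(int $n, int $k): float
{
    if ($k < 0 || $k > $n) {
        return 0.0;
    }
    $k = min($k, $n - $k);
    $result = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        $result *= ($n - $k + $i) / $i;
    }
    return $result;
}

// Binomial: probability of $k "yes" answers among $n respondents
// when each respondent independently says "yes" with probability $p.
function binomialPmf(int $k, int $n, float $p): float
{
    return choose($n, $k) * pow($p, $k) * pow(1.0 - $p, $n - $k);
}

// Hypergeometric: probability of $k "yes" answers in a sample of $n drawn
// without replacement from a population of $N containing $K "yes" members.
function hypergeometricPmf(int $k, int $n, int $K, int $N): float
{
    return choose($K, $k) * choose($N - $K, $n - $k) / choose($N, $n);
}

// Assumed example data: 7 of 10 respondents say "yes".
printf("binomial (p = 0.6):            %.4f\n", binomialPmf(7, 10, 0.6));
// Same result when the 10 respondents come from a staff of 50, of whom 30 would say "yes".
printf("hypergeometric (K=30, N=50):   %.4f\n", hypergeometricPmf(7, 10, 30, 50));
```

As the population size grows relative to the sample, the hypergeometric probabilities approach the binomial ones, which is why the binomial model is usually adequate for large populations.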

Now that you know the level of measurement and the appropriate sampling distribution, we can turn to the specifics of the maximum likelihood technique used to estimate population or model parameters. This discussion of maximum likelihood estimation is relevant to Bayesian inference because it demonstrates another way to compute the likelihood term in Bayes' equation: using probability distributions to compute the likelihood (and prior) terms. In the previous article, I computed the likelihood term by dividing the joint distribution by the relevant marginal distribution. Here, I will demonstrate how to compute the likelihood term using a theoretical sampling distribution.
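As a rough sketch of what computing a likelihood from a sampling distribution can look like, the code below evaluates the binomial likelihood of a poll result over a grid of candidate proportions and picks the value with the highest likelihood. The function names (binomialLikelihood, gridMle), the grid size, and the data are assumptions made for illustration; this is not the article's own implementation, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical illustration: likelihood of a binary survey result under the binomial model.

// Likelihood of observing $k "yes" answers in $n responses
// if the true proportion of "yes" answers were $p.
function binomialLikelihood(int $k, int $n, float $p): float
{
    // Binomial coefficient C(n, k), built up iteratively.
    $coef = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        $coef *= ($n - $k + $i) / $i;
    }
    return $coef * pow($p, $k) * pow(1.0 - $p, $n - $k);
}

// Evaluate the likelihood at $steps + 1 evenly spaced candidate values of p
// in [0, 1] and return the candidate with the highest likelihood.
function gridMle(int $k, int $n, int $steps = 100): float
{
    $bestP = 0.0;
    $bestLikelihood = -1.0;
    for ($j = 0; $j <= $steps; $j++) {
        $p = $j / $steps;
        $likelihood = binomialLikelihood($k, $n, $p);
        if ($likelihood > $bestLikelihood) {
            $bestLikelihood = $likelihood;
            $bestP = $p;
        }
    }
    return (float) $bestP;
}

// Assumed example data: 7 "yes" answers out of 10 responses.
$k = 7;
$n = 10;
printf("grid estimate: %.2f, closed form k/n: %.2f\n", gridMle($k, $n), $k / $n);
```

For the binomial model the maximum likelihood estimate has the closed form k/n, so the grid search is only there to show the likelihood being computed for each candidate value; the same per-candidate likelihoods are what Bayes' formula multiplies by the prior.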

Maximum likelihood techniques are also relevant because they are the main competition to Bayesian inference in parameter-estimation contexts.