HomePHP Page 2 - Implement Bayesian inference using PHP, Part 1

Conditional probability - PHP

Have you ever wanted to build an intelligent Web application? Paul Meagher shows how to do it using conditional probability. (This intermediate-level article was first published by IBM developerWorks, March 16, 2004, at http://www.ibm.com/developerWorks).

A conditional probability refers to the probability of observing an event A given that you have observed a separate event B. The mathematical shorthand for expressing this idea is:

P(A | B)

Imagine that A refers to "customer buys product A" and B refers to "customer buys product B". P(A | B) would then read as the "probability that a customer will buy product A given that they have bought product B." If A tends to occur when B occurs, then knowing that B has occurred allows you to assign a higher probability to A's occurrence than in a situation in which you did not know that B occurred.

More generally, if A and B systematically co-vary in some way, then P(A | B) will not be equal to P(A). Conversely, if A and B are independent events, then P(A | B) would be expected to equal P(A).

The need to compute a conditional probability thus arises any time you think the occurence of some event has a bearing on the probability of another event's occurring.

The most basic and intuitive method for computing P(A | B) is the set enumeration method. Using this method, P(A | B) can be computed by counting the number of times A and B occur together {A & B} and dividing by the number of times B occurs {B}:

P(A | B) = {A & B} / {B}

If you observe that 12 customers to date bought product B and of those 12, 10 also bought product A, then P(A | B) would be estimated at 10/12 or 0.833. In other words, the probability of a customer buying product A given that they have purchased product B can be estimated at 83 percent by using a method that involves enumerating relative frequencies of A and B events from the data gathered to date.

You can compute a conditional probability using the set enumeration method with the following PHP code:

Listing 1. Computing conditional probability using set enumeration

<?php

/** * Returns conditional probability of $A given $B and $Data. * $Data is an indexed array. Each element of the $Data array * consists of an A measurement and B measurment on a sample * item. */ function getConditionalProbabilty($A, $B, $Data) { $NumAB = 0; $NumB = 0; $NumData = count($Data); for ($i=0; $i < $NumData; $i++) { if (in_array($B, $Data[$i])) { $NumB++; if (in_array($A, $Data[$i])) { $NumAB++; } } } return $NumAB / $NumB; }