A **Probability Distribution** is a table or graph that shows the probability with which a given random variable takes each of its possible values.

The following concepts are useful to understand probability distributions:

- If Event A can occur in *p* possible ways and Event B can occur in *q* possible ways, then both A and B can occur in *p* x *q* ways.
- The number of different ways that a set of objects can be selected, without regard to order, is called *Combination*. The number of combinations of *n* objects taken *r* at a time is given by *nCr = n! / ((n - r)! r!)*
- The number of different ways that a set of objects can be arranged in order is called *Permutation*. The number of permutations of *n* objects taken *r* at a time is given by *nPr = n! / (n - r)!*

EX> Compute 9!

FUNCTION factorial(p_n IN NUMBER) RETURN NUMBER IS

BEGIN

IF p_n IS NULL OR p_n < 0 THEN

RAISE_APPLICATION_ERROR(-20000, 'Invalid Input Value');

ELSIF p_n <= 1 THEN

RETURN 1;

ELSE

RETURN factorial(p_n-1) * p_n;

END IF;

END;

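select factorial(9) from dual;

I was curious to see how far I could push this function: the maximum value of *n* was 83.7 with NUMBER types, and 84.7 when I changed the input parameter and return type to BINARY_DOUBLE (factorial2 in the output below). Since the recursion subtracts 1 until the argument drops to 1 or below, a non-integer input such as 83.7 yields the product 83.7 x 82.7 x ... x 1.7 rather than a true factorial.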

SQL> select factorial(83.7) from dual;

FACTORIAL(83.7)

---------------

9.642E+125

SQL> select factorial(83.71) from dual;

FACTORIAL(83.71)

---------------

~

SQL> select factorial2(84.7) from dual;

FACTORIAL2(84.7)

----------------

8.167E+127

SQL> select factorial2(84.71) from dual;

FACTORIAL2(84.71)

-----------------

Inf

EX> Compute the number of combinations of 9 objects taken 3 at a time.

select factorial(9)/(factorial(9-3) * factorial(3)) from dual;

EX> Compute the number of different ways of arranging 9 objects taken 3 at a time.

select factorial(9)/factorial(9-3) from dual;
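The two queries above can be wrapped into reusable helpers. The following is only a minimal sketch: the names `permutations` and `combinations` are hypothetical (they are not part of the original post), and the product form n x (n-1) x ... x (n-r+1) is swapped in for the factorial ratio so that large values of *n* do not overflow. It assumes integer inputs and a client such as SQL*Plus with server output enabled.

```sql
SET SERVEROUTPUT ON

DECLARE
  -- Hypothetical helper: nPr computed as n * (n-1) * ... * (n-r+1),
  -- which avoids evaluating n! and (n-r)! separately.
  FUNCTION permutations(p_n IN PLS_INTEGER, p_r IN PLS_INTEGER) RETURN NUMBER IS
    l_result NUMBER := 1;
  BEGIN
    IF p_n IS NULL OR p_r IS NULL OR p_r < 0 OR p_r > p_n THEN
      RAISE_APPLICATION_ERROR(-20000, 'Invalid Input Value');
    END IF;
    FOR i IN 0 .. p_r - 1 LOOP
      l_result := l_result * (p_n - i);
    END LOOP;
    RETURN l_result;
  END permutations;

  -- Hypothetical helper: nCr = nPr / r!, where r! is the same as rPr.
  FUNCTION combinations(p_n IN PLS_INTEGER, p_r IN PLS_INTEGER) RETURN NUMBER IS
  BEGIN
    RETURN permutations(p_n, p_r) / permutations(p_r, p_r);
  END combinations;
BEGIN
  DBMS_OUTPUT.PUT_LINE('9P3 = ' || permutations(9, 3));  -- 9 * 8 * 7 = 504
  DBMS_OUTPUT.PUT_LINE('9C3 = ' || combinations(9, 3));  -- 504 / 3! = 84
END;
/
```

The results, 504 and 84, agree with what the factorial-based queries return for 9 objects taken 3 at a time.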

**Discrete Probability Distributions**

- The *discrete probability distribution* is a table that lists the discrete values (outcomes) of an experiment with the relative frequency (a.k.a. probability) of each outcome.

Example: Tossing a coin two times gives you the combinations (H,H), (H,T), (T,H), (T,T) and hence, the following tuples for (#Heads, Frequency, Relative_Frequency):

(0, 1, 1/4=0.25), (1, 2, 2/4=0.5), (2, 1, 1/4=0.25).

This is the probability distribution for # heads after flipping a coin twice.

- *Mean* or *Expected value* of the discrete probability distribution: $\mu = \sum_{i=1}^{n} x_{i} \, P(x_{i})$. For the coin example, $\mu = 0 \cdot 0.25 + 1 \cdot 0.5 + 2 \cdot 0.25 = 1$
- *Variance* of the discrete probability distribution: $\sigma^{2} = \sum_{i=1}^{n} (x_{i} - \mu)^{2} \, P(x_{i})$
- *Standard deviation* is the square root of the variance
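To make the mean, variance and standard deviation concrete, here is a small sketch that evaluates them for the two-toss coin distribution. The inline `dist` and `m` subqueries are illustrative names only; the three (value, probability) rows are the tuples listed above.

```sql
WITH dist AS (
  -- (number of heads, probability) for two coin tosses
  SELECT 0 AS x, 0.25 AS p FROM dual UNION ALL
  SELECT 1,      0.50      FROM dual UNION ALL
  SELECT 2,      0.25      FROM dual
),
m AS (
  -- mean: sum of x * P(x)
  SELECT SUM(x * p) AS mu FROM dist
)
SELECT m.mu,
       SUM(POWER(d.x - m.mu, 2) * d.p)       AS variance,  -- sum of (x - mu)^2 * P(x)
       SQRT(SUM(POWER(d.x - m.mu, 2) * d.p)) AS std_dev
FROM dist d CROSS JOIN m
GROUP BY m.mu;
```

For this distribution the mean is 1, the variance is 1 x 0.25 + 0 x 0.5 + 1 x 0.25 = 0.5, and the standard deviation is the square root of 0.5, roughly 0.707.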

*Binomial Probability Distribution*

A *binomial* or *Bernoulli* experiment is one that consists of a fixed number of trials, each independent of the others, each with only two possible outcomes (success or failure), and with a fixed probability of success in every trial. The Bernoulli process counts the number of successes over a given number of attempts; in other words, the random variable for a Binomial distribution is the number of successes over a given number of attempts.

- The probability of *r* successes in *n* trials with probability of success *p* and probability of failure *q* is given by $P(r, n) = \frac{n!}{(n - r)!\, r!}\, p^{r} q^{(n - r)}$
- The binomial probability distribution is a table of (r, P(r, n)) which can be subsequently graphed, as discussed in this example

EX> With the probability of rain on any given day taken as 0.4 (the value used in the queries below), the probability that it will rain exactly 2 days over the next 7 days is $P(2, 7) = \frac{7!}{(7 - 2)!\, 2!}\, 0.4^{2}\, 0.6^{(7 - 2)}$, which can be computed using

select factorial(7) * power(0.4,2) * power(0.6,(7-2))/

(factorial(7-2) * factorial(2)) p_2_7

from dual;

The probability that it will rain *at least* 6 days over the next 7 days is P(r >= 6) = P(6,7)+P(7,7), computed using

select (factorial(7) * power(0.4,6) * power(0.6,(7-6))/

(factorial(7-6) * factorial(6))) +

(factorial(7) * power(0.4,7) * power(0.6,(7-7))/

(factorial(7-7) * factorial(7))) p_r_ge_6

from dual;

Finally, the probability that it will rain no more than 2 days over the next 7 days is P(r <= 2) = P(0,7) + P(1,7) + P(2,7)
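The probability of rain on no more than 2 of the next 7 days, P(r <= 2) = P(0,7) + P(1,7) + P(2,7), can be sketched in the same style. To keep the query self-contained, the binomial coefficients 7C0 = 1, 7C1 = 7 and 7C2 = 21 are written out as literals here instead of calling factorial; the alias `p_r_le_2` is just an illustrative name.

```sql
SELECT      POWER(0.6, 7)                       -- P(0,7): 7C0 = 1
     + 7  * POWER(0.4, 1) * POWER(0.6, 6)       -- P(1,7): 7C1 = 7
     + 21 * POWER(0.4, 2) * POWER(0.6, 5)       -- P(2,7): 7C2 = 21
       AS p_r_le_2
FROM dual;
```

The three terms are 0.0279936, 0.1306368 and 0.2612736, so the sum is 0.419904, or about a 42% chance.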

- The *mean* of a binomial distribution is *μ = np*
- The *variance* is *σ² = npq*, so the *standard deviation* is *σ = √(npq)*
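As a quick illustration, assuming the rain example's values n = 7, p = 0.4 and q = 0.6, the mean, the variance npq and the standard deviation √(npq) can be evaluated directly:

```sql
SELECT 7 * 0.4             AS mu,        -- np
       7 * 0.4 * 0.6       AS variance,  -- npq
       SQRT(7 * 0.4 * 0.6) AS std_dev    -- square root of npq
FROM dual;
```

This gives a mean of 2.8 rainy days, a variance of 1.68 and a standard deviation of roughly 1.296.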

Excel provides *BINOMDIST(r, n, p, cumulative)* for this computation. p is the probability of success; set cumulative=TRUE if you want the probability of r or fewer successes, and cumulative=FALSE if you want exactly r successes. Here is the PL/SQL version:

FUNCTION binomdist(r NUMBER, n NUMBER, p NUMBER, cumulative BOOLEAN DEFAULT FALSE) RETURN NUMBER IS

ri NUMBER;

ret NUMBER;

fn NUMBER;

BEGIN

ret := 0;

fn := factorial(n);

-- Walk from r down to 0; when cumulative is FALSE only the first (ri = r) term is added

FOR ri IN REVERSE 0..r LOOP

ret := ret + (fn * power(p, ri) * power((1-p),(n - ri)))/

(factorial(n - ri) * factorial(ri));

IF NOT cumulative THEN

EXIT;

END IF;

END LOOP;

RETURN ret;

END binomdist;
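Because binomdist calls factorial(n), it inherits that function's upper limit on *n*. One possible workaround, shown here only as a sketch, is to build each term from the previous one using P(k, n) = P(k-1, n) x (n - k + 1) / k x p / (1 - p). The name `binomdist_iter` is hypothetical (not from the original post), and the sketch assumes integer r and n with 0 <= r <= n and 0 <= p < 1.

```sql
SET SERVEROUTPUT ON

DECLARE
  -- Hypothetical variant of binomdist: no factorial calls.
  FUNCTION binomdist_iter(r PLS_INTEGER, n PLS_INTEGER, p NUMBER,
                          cumulative BOOLEAN DEFAULT FALSE) RETURN NUMBER IS
    term NUMBER := POWER(1 - p, n);  -- P(0, n) = q^n
    ret  NUMBER := term;             -- running sum P(0, n) + ... + P(k, n)
  BEGIN
    FOR k IN 1 .. r LOOP
      -- P(k, n) = P(k-1, n) * (n - k + 1) / k * p / (1 - p)
      term := term * (n - k + 1) / k * p / (1 - p);
      ret  := ret + term;
    END LOOP;
    IF cumulative THEN
      RETURN ret;   -- r or fewer successes
    ELSE
      RETURN term;  -- exactly r successes
    END IF;
  END binomdist_iter;
BEGIN
  -- exactly 2 rainy days out of 7
  DBMS_OUTPUT.PUT_LINE('P(r = 2)  = ' || ROUND(binomdist_iter(2, 7, 0.4), 6));
  -- at least 6 rainy days out of 7: 1 - P(r <= 5)
  DBMS_OUTPUT.PUT_LINE('P(r >= 6) = ' || ROUND(1 - binomdist_iter(5, 7, 0.4, TRUE), 6));
END;
/
```

With the rain example's values the two results should come out near 0.2613 and 0.0188, in line with P(2,7) and P(6,7) + P(7,7) from the formula.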

*Poisson Probability Distribution*

The random variable for the Poisson distribution is the number of occurrences of an event over a measurable metric (time, space). In a Poisson process, the (measured) *mean* number of occurrences of an event is the same for each interval of measurement, and the number of occurrences in a particular interval is independent of the number of occurrences in other intervals.

- The probability of exactly *r* occurrences over a given interval is given by $P(r) = \frac{\mu^{r} e^{-\mu}}{r!}$
- The *variance* of the Poisson distribution is the same as the (observed) mean.
- A *goodness of fit* test helps verify if a given dataset fits the Poisson distribution

EX> With a mean of 25 customers per interval (the value used in the query below), the probability that exactly 31 customers will walk into the coffee shop is $P(31) = \frac{25^{31} e^{-25}}{31!}$, which can be computed using

select power(25,31) * exp(-25)/factorial(31) p_31

from dual;

Just as we saw in the Binomial distribution, the probability that no more than 31 customers will walk into the coffee shop is P(r <= 31) = P(0)+P(1)+..+P(31). Conversely, the probability that *at least* 31 customers will walk into the coffee shop is P(r >= 31) = 1 - P(r < 31). This leads to the need for a function similar to POISSON(r, μ, cumulative) in Excel, where cumulative = FALSE computes the probability of exactly r occurrences, and cumulative = TRUE computes the probability of r or fewer.

FUNCTION poissondist(r NUMBER, mu NUMBER,

cumulative BOOLEAN DEFAULT FALSE) RETURN NUMBER IS

ri NUMBER;

ret NUMBER;

BEGIN

ret := 0;

FOR ri IN REVERSE 0..r LOOP

ret := ret + (power(mu, ri) * exp(-mu)/factorial(ri));

IF NOT cumulative THEN

EXIT;

END IF;

END LOOP;

RETURN ret;

END poissondist;
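The same term-by-term idea works for the Poisson case, since P(k) = P(k-1) x μ / k. The block below is a sketch under that assumption. `poissondist_iter` is a hypothetical name (not from the original post), r is assumed to be a non-negative integer, and the mean of 25 is taken from the coffee shop query.

```sql
SET SERVEROUTPUT ON

DECLARE
  -- Hypothetical variant of poissondist: no factorial calls.
  FUNCTION poissondist_iter(r PLS_INTEGER, mu NUMBER,
                            cumulative BOOLEAN DEFAULT FALSE) RETURN NUMBER IS
    term NUMBER := EXP(-mu);  -- P(0) = e^(-mu)
    ret  NUMBER := term;      -- running sum P(0) + ... + P(k)
  BEGIN
    FOR k IN 1 .. r LOOP
      term := term * mu / k;  -- P(k) = P(k-1) * mu / k
      ret  := ret + term;
    END LOOP;
    IF cumulative THEN
      RETURN ret;   -- r or fewer occurrences
    ELSE
      RETURN term;  -- exactly r occurrences
    END IF;
  END poissondist_iter;
BEGIN
  -- exactly 31 customers when the mean is 25
  DBMS_OUTPUT.PUT_LINE('P(r = 31)  = ' || ROUND(poissondist_iter(31, 25), 6));
  -- at least 31 customers: 1 - P(r <= 30)
  DBMS_OUTPUT.PUT_LINE('P(r >= 31) = ' || ROUND(1 - poissondist_iter(30, 25, TRUE), 6));
END;
/
```

Note that "at least 31" is computed as 1 minus the cumulative probability up to 30, matching P(r >= 31) = 1 - P(r < 31).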

*Poisson approximation* - a Poisson distribution can be used to approximate a Binomial distribution if the number of trials (in the binomial experiment) is >= 20 and the probability of success *p* is <= 5%.
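The approximation uses μ = np as the Poisson mean. As an illustrative check with made-up values n = 100, p = 0.02 (so μ = 2) and r = 3, the two probabilities can be compared side by side. The binomial coefficient 100C3 = 161700 and 3! = 6 are written as literals so the query does not depend on any helper function.

```sql
SELECT 161700 * POWER(0.02, 3) * POWER(0.98, 97) AS binomial_p3,  -- 100C3 * p^3 * q^97
       POWER(2, 3) * EXP(-2) / 6                 AS poisson_p3    -- mu^3 * e^(-mu) / 3!
FROM dual;
```

Both values come out close to 0.18, which is the kind of agreement the rule of thumb above is meant to capture.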
