Common probability distributions
Distributions based on Bernoulli trials
Several distributions arise from considering a sequence of independent so-called “Bernoulli trials”: random experiments with a binary outcome. While the coin toss is the prototypical example, a wide range of situations can be described as Bernoulli trials: clinical outcomes, device function, manufacturing defects, and so on. For the purposes of probability calculations, outcomes are encoded as 1, called a ‘success’, and 0, called a ‘failure’.
Bernoulli distribution. A random variable $X$ has the Bernoulli distribution with parameter $p \in (0, 1)$, written $X \sim \mathrm{Bernoulli}(p)$, if it takes the value 1 with probability $p$ and the value 0 with probability $1 - p$; its PMF is $P(X = x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
Geometric distribution. Imagine now a sequence of independent Bernoulli trials with success probability $p$, and let $X$ count the number of failures before the first success. Then $X$ has the geometric distribution, $X \sim \mathrm{Geometric}(p)$, with PMF $P(X = x) = (1 - p)^x p$ for $x = 0, 1, 2, \dots$: the first $x$ trials fail and the $(x + 1)$st succeeds.
Binomial distribution. Let $X$ count the number of successes in $n$ independent Bernoulli trials with success probability $p$. Then $X \sim \mathrm{Binomial}(n, p)$ with PMF $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for $x = 0, 1, \dots, n$; the binomial coefficient counts the arrangements of $x$ successes among the $n$ trials.
The binomial PMF is clearly non-negative for any $x$ in the support, and it sums to one by the binomial theorem: $\sum_{x=0}^{n} \binom{n}{x} p^x (1 - p)^{n - x} = (p + (1 - p))^n = 1$.
Negative binomial. If now $X$ counts the number of trials needed to obtain $r$ successes, then $X$ has the negative binomial distribution, $X \sim \mathrm{NegBinom}(r, p)$, with PMF $P(X = x) = \binom{x - 1}{r - 1} p^r (1 - p)^{x - r}$ for $x = r, r + 1, \dots$: the final trial is a success, and the remaining $r - 1$ successes may fall anywhere among the first $x - 1$ trials.
There is an alternate form of the negative binomial that arises from counting the number of failures, akin to the geometric distribution, rather than the number of trials. If $Y = X - r$ is the number of failures incurred before the $r$th success, then $P(Y = y) = \binom{y + r - 1}{y} p^r (1 - p)^y$ for $y = 0, 1, 2, \dots$.
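As a quick sanity check, the two parametrizations assign the same probability to the same event when $x = y + r$. A minimal sketch using only the standard library (the parameter values are arbitrary choices for illustration):

```python
from math import comb

def nb_trials_pmf(x, r, p):
    # P(X = x): the r-th success occurs on trial x
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

def nb_failures_pmf(y, r, p):
    # P(Y = y): y failures occur before the r-th success
    return comb(y + r - 1, y) * p**r * (1 - p)**y

r, p = 10, 0.36  # arbitrary illustrative values
for x in range(r, r + 5):
    assert abs(nb_trials_pmf(x, r, p) - nb_failures_pmf(x - r, r, p)) < 1e-15
```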
About 36% of people in the US have blood type A positive. Consider a blood drive in which donors participate independently of blood type and are representative of the general population. If each donor is regarded as a Bernoulli trial with 'success' meaning blood type A+, then the donors form a sequence of independent Bernoulli trials with success probability $p = 0.36$.
The following questions can be answered using distributions based on Bernoulli trials.
- What is the probability that the first 5 donors are not A+?
- What is the probability that more than 5 donors have blood drawn before an A+ donor has blood drawn?
- What is the probability that of the first 20 donors, 10 are A+?
- What is the probability that it takes 30 donors to obtain 10 A+ samples?
The answers are as follows:
- (i) Here, consider $X$ to be the number of donors before the first A+ donor; then $X \sim \mathrm{Geometric}(0.36)$. The first 5 donors are all non-A+ exactly when $X \geq 5$, so $P(X \geq 5) = (0.64)^5 \approx 0.1074$.
- (ii) Let $X$ remain as in (i). Then $P(X > 5) = (0.64)^6 \approx 0.0687$.
- (iii) Now let $Y$ record the number of A+ donors out of the first 20. Then $Y \sim \mathrm{Binomial}(20, 0.36)$, so $P(Y = 10) = \binom{20}{10}(0.36)^{10}(0.64)^{10} \approx 0.0779$.
- (iv) Now let $Z$ record the number of donors until 10 A+ samples are obtained. Then $Z \sim \mathrm{NegBinom}(10, 0.36)$, where the first parametrization is used. Then $P(Z = 30) = \binom{29}{9}(0.36)^{10}(0.64)^{20} \approx 0.0487$.
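The four calculations are easy to reproduce with the standard library; a sketch, assuming the PMFs above with $p = 0.36$:

```python
from math import comb

p = 0.36      # P(a donor is A+)
q = 1 - p

p1 = q**5                            # (i)  first 5 donors not A+
p2 = q**6                            # (ii) more than 5 non-A+ donors before the first A+
p3 = comb(20, 10) * p**10 * q**10    # (iii) exactly 10 A+ among the first 20
p4 = comb(29, 9) * p**10 * q**20     # (iv) 10th A+ sample arrives with donor 30
print(round(p1, 4), round(p2, 4), round(p3, 4), round(p4, 4))
# prints 0.1074 0.0687 0.0779 0.0487
```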
Multinomial distribution. The multinomial generalizes the binomial to trials with $k$ possible outcomes. Suppose each of $n$ independent trials results in outcome $i$ with probability $p_i$, where $p_1 + \cdots + p_k = 1$, and let $X_i$ count the trials resulting in outcome $i$. Then $P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$ for nonnegative integers $x_1, \dots, x_k$ with $x_1 + \cdots + x_k = n$.
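The PMF is straightforward to implement; a sketch using only the standard library. With $k = 2$ outcomes it reduces to the binomial PMF, which gives a convenient check:

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    # n! / (x1! ... xk!) * p1^x1 * ... * pk^xk
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    return coef * prod(p**x for p, x in zip(probs, counts))

# with k = 2 the multinomial reduces to Binomial(n, p)
assert abs(multinomial_pmf([10, 10], [0.36, 0.64])
           - comb(20, 10) * 0.36**10 * 0.64**10) < 1e-15
```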
The percentages of the US population with each of the 8 blood types are given below.
Type | Frequency |
---|---|
O+ | 37.4% |
O− | 6.6% |
A+ | 35.7% |
A− | 6.3% |
B+ | 8.5% |
B− | 1.5% |
AB+ | 3.4% |
AB− | 0.6% |
What is the probability of observing 4 donors of a particular blood type, together with specified counts of each of the other types, among a fixed number of independent donors? Questions of this form are answered by the multinomial PMF, with the table frequencies as the outcome probabilities.
The above PMFs convey the distribution of probabilities across outcomes, but what are “typical” values that one is likely to observe for these random variables? There are several so-called measures of center, including: the mode or value with largest mass/density; the median or ‘middle’ value with equal mass/density above and below; and the mean or average of values in the support weighted by density/mass.
The mean is known, formally, as the expected value or simply the 'expectation' of a random variable, and is defined to be $E(X) = \sum_x x\, P(X = x)$ in the discrete case and $E(X) = \int_{-\infty}^{\infty} x\, f(x)\, dx$ in the continuous case, whenever the sum or integral exists.
Show that:

- If $X \sim \mathrm{Bernoulli}(p)$, then $E(X) = p$.
- If $X \sim \mathrm{Binomial}(n, p)$, then $E(X) = np$.
- If $X \sim \mathrm{Geometric}(p)$, then $E(X) = \frac{1 - p}{p}$.
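These expectations can be checked numerically before proving them; a sketch, with $p$ and $n$ chosen arbitrarily (the geometric series is truncated, which is harmless since its tail is geometrically small):

```python
from math import comb

p, n = 0.36, 20  # arbitrary illustrative values

# Bernoulli: E(X) = 0 * (1 - p) + 1 * p = p
assert abs((0 * (1 - p) + 1 * p) - p) < 1e-15

# Binomial: sum of x * P(X = x) over the support equals n * p
binom_mean = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
assert abs(binom_mean - n * p) < 1e-9

# Geometric (failures before first success): truncated series approximates (1 - p) / p
geom_mean = sum(x * (1 - p)**x * p for x in range(1000))
assert abs(geom_mean - (1 - p) / p) < 1e-9
```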
Poisson distribution. Consider a binomial probability in the limit where the number of trials $n \to \infty$ and the success probability $p \to 0$ with the mean $np = \lambda$ held fixed. The binomial PMF converges to the Poisson PMF: $P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x = 0, 1, 2, \dots$.
The Poisson distribution is often used to model count data. Suppose you’re recording the number of visits to your website each day for a year, and the average number of daily visits is 32.7, so you decide to model the random variable $X$, the number of visits on a given day, as $X \sim \mathrm{Poisson}(32.7)$.
- Find the following probabilities according to your model: $P(X = 0)$, $P(X < 20)$, $P(X > 40)$, and $P(X > 100)$.
- If you assume visits on each day are independent, how many days would you expect to observe no visits in a year? Under 20 visits? Over 40 visits? Over 100 visits?
- If your year of data shows 30 days with over 100 visits, do you think the Poisson is a good model?
We’ll work through the solution in class.
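As a preview, the model's probabilities can be sketched with the standard library; the PMF is accumulated iteratively to avoid overflow in $\lambda^x / x!$, and the 365-day year is an assumption:

```python
from math import exp

lam = 32.7  # average daily visits

def pois_pmf(x):
    # P(X = x) = lam^x e^(-lam) / x!, built up factor by factor to avoid overflow
    p = exp(-lam)
    for k in range(1, x + 1):
        p *= lam / k
    return p

p_none     = pois_pmf(0)                               # P(X = 0)
p_under_20 = sum(pois_pmf(x) for x in range(20))       # P(X < 20)
p_over_40  = 1 - sum(pois_pmf(x) for x in range(41))   # P(X > 40)
p_over_100 = 1 - sum(pois_pmf(x) for x in range(101))  # P(X > 100)

# expected counts of such days in a 365-day year
print(365 * p_none, 365 * p_under_20, 365 * p_over_40, 365 * p_over_100)
```

Note that $P(X > 100)$ is astronomically small under this model, so a year with 30 such days would sit badly with the Poisson assumption.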
Basic continuous distributions
Here we’ll look at a few elementary continuous distributions.
Uniform distribution. The continuous uniform distribution corresponds to drawing a real number at random from an interval $[a, b]$, with every subinterval of equal length equally likely. Its PDF is constant on the interval: $f(x) = \frac{1}{b - a}$ for $a \leq x \leq b$ (and zero otherwise), written $X \sim \mathrm{Uniform}(a, b)$.
Another way of looking at the matter of endpoints is that since $X$ is continuous, $P(X = a) = P(X = b) = 0$, so it makes no difference whether the interval is taken to be open or closed.
The expectation of a uniform random variable is the midpoint of the interval: $E(X) = \int_a^b \frac{x}{b - a}\, dx = \frac{b^2 - a^2}{2(b - a)} = \frac{a + b}{2}$.
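A midpoint Riemann sum confirms the midpoint formula; a minimal sketch with arbitrarily chosen endpoints:

```python
# numerically integrate x / (b - a) over [a, b] with a midpoint rule
a, b = 2.0, 5.0   # arbitrary illustrative endpoints
n = 100_000
dx = (b - a) / n
mean = sum((a + (i + 0.5) * dx) / (b - a) * dx for i in range(n))
assert abs(mean - (a + b) / 2) < 1e-9
```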
Exponential distribution. The exponential distribution is given by the PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$ (and zero otherwise), where $\lambda > 0$ is a rate parameter.
The CDF is $F(x) = \int_0^x \lambda e^{-\lambda t}\, dt = 1 - e^{-\lambda x}$ for $x \geq 0$.
The exponential distribution is often used to model waiting times and failure times.
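The closed-form CDF can be checked against a numerical integral of the PDF; a sketch with an assumed rate and waiting time:

```python
from math import exp

lam = 0.5   # assumed rate parameter
t = 3.0     # assumed waiting time of interest

# midpoint Riemann sum of the PDF over [0, t] vs. the closed-form CDF
n = 100_000
dx = t / n
integral = sum(lam * exp(-lam * (i + 0.5) * dx) * dx for i in range(n))
assert abs(integral - (1 - exp(-lam * t))) < 1e-8
```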
The Gaussian distribution
The Gaussian or normal distribution is of central importance in statistical inference, and arises in relation to averages. Here we’ll develop the density in the standard case (no parameters) and then introduce the center and scale parameters through a simple linear transformation. We’ll start with an important calculus result.
Theorem (Gaussian integral). $\int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \sqrt{2\pi}$.
Let $I = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx$. Then $I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\, dx\, dy$; converting to polar coordinates $(x, y) = (\rho\cos\theta, \rho\sin\theta)$ gives $I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-\rho^2/2}\,\rho\, d\rho\, d\theta = 2\pi$, so $I = \sqrt{2\pi}$.
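The theorem is also easy to check numerically; a sketch using a midpoint rule over a wide interval (the tails beyond $\pm 10$ are negligible):

```python
from math import exp, pi, sqrt

n = 200_000
lo, hi = -10.0, 10.0
dx = (hi - lo) / n
# midpoint Riemann sum of exp(-x^2 / 2)
total = sum(exp(-(lo + (i + 0.5) * dx) ** 2 / 2) * dx for i in range(n))
assert abs(total - sqrt(2 * pi)) < 1e-6
```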
Gaussian distribution. The standard Gaussian or normal distribution is given by the PDF $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$; by the theorem above, $\varphi$ integrates to one.
As a remark, special notation is reserved for the standard normal/Gaussian because it is used so frequently: $\varphi$ denotes the PDF and $\Phi$ denotes the CDF.
If $X$ has the standard Gaussian distribution, then $E(X) = \int_{-\infty}^{\infty} x\,\varphi(x)\, dx = 0$: the integrand $x\varphi(x)$ is an odd function, so the contributions from the positive and negative halves of the real line cancel.
This argument leverages the fact, immediate from the functional form of $\varphi$, that the density is symmetric about zero: $\varphi(-x) = \varphi(x)$.
Now let $Y = \mu + \sigma X$ for constants $\mu$ and $\sigma > 0$. The CDF of $Y$ is $P(Y \leq y) = P\!\left(X \leq \frac{y - \mu}{\sigma}\right) = \Phi\!\left(\frac{y - \mu}{\sigma}\right)$, so differentiating gives the density $f_Y(y) = \frac{1}{\sigma}\varphi\!\left(\frac{y - \mu}{\sigma}\right)$; this is the Gaussian distribution with parameters $\mu$ and $\sigma$, written $Y \sim N(\mu, \sigma^2)$.
Notice that $E(Y) = \mu + \sigma E(X) = \mu$, so $\mu$ is a location (center) parameter, while $\sigma$ rescales the spread of the distribution.
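A quick simulation illustrates the location-scale relationship; a sketch with arbitrarily assumed values of $\mu$ and $\sigma$:

```python
import random

random.seed(7)                 # for reproducibility
mu, sigma = 1.5, 2.0           # arbitrary illustrative parameters
n = 200_000

# transform standard Gaussian draws: Y = mu + sigma * X
ys = [mu + sigma * random.gauss(0.0, 1.0) for _ in range(n)]
sample_mean = sum(ys) / n
assert abs(sample_mean - mu) < 0.05   # well within sampling error of mu
```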
More expectations
Theorem. If $X$ is a random variable with PDF/PMF $f$ and $g$ is a function, then $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\, dx$ in the continuous case and $E[g(X)] = \sum_x g(x) f(x)$ in the discrete case, provided the expectation exists.
The proof requires some advanced techniques, so we’ll skip it. Hogg & Craig provide a sketch in the discrete case and otherwise defer to references. Some texts simply state this as a definition.
Corollaries. If $X$ is a random variable and the expectations below exist, then:

- (i) For any constant $c$, $E(c) = c$.
- (ii) For any constant $c$ and function $g$ defined for all $x$ in the support of $X$, $E[c\, g(X)] = c\, E[g(X)]$.
- (iii) For any constants $c_1, c_2$ and functions $g_1, g_2$ defined for all $x$ in the support of $X$, $E[c_1 g_1(X) + c_2 g_2(X)] = c_1 E[g_1(X)] + c_2 E[g_2(X)]$.
- (iv) For any constants $a, b$, $E(aX + b) = a\, E(X) + b$.
Notice that (ii) and (iv) are special cases of (iii), so it suffices to prove (i) and (iii). Essentially, these properties follow directly from properties of integration/summation.
For (i), in the discrete case: $E(c) = \sum_x c\, f(x) = c \sum_x f(x) = c \cdot 1 = c$.
In the continuous case, the argument is essentially the same: $E(c) = \int_{-\infty}^{\infty} c\, f(x)\, dx = c \int_{-\infty}^{\infty} f(x)\, dx = c$.
For (iii), let $g(x) = c_1 g_1(x) + c_2 g_2(x)$. In the discrete case: $E[g(X)] = \sum_x \left(c_1 g_1(x) + c_2 g_2(x)\right) f(x) = c_1 \sum_x g_1(x) f(x) + c_2 \sum_x g_2(x) f(x) = c_1 E[g_1(X)] + c_2 E[g_2(X)]$.
In the continuous case: $E[g(X)] = \int \left(c_1 g_1(x) + c_2 g_2(x)\right) f(x)\, dx = c_1 \int g_1(x) f(x)\, dx + c_2 \int g_2(x) f(x)\, dx = c_1 E[g_1(X)] + c_2 E[g_2(X)]$.
To obtain (ii), set $c_1 = c$, $g_1 = g$, and $c_2 = 0$ in (iii).
To obtain (iv), set $c_1 = a$, $g_1(x) = x$, $c_2 = b$, and $g_2(x) = 1$ in (iii); then apply (i) to get $E(aX + b) = a\, E(X) + b\, E(1) = a\, E(X) + b$.
These corollaries can often ease calculations. For example, it is immediate that for any random variable whose expectation exists, $E\left(X - E(X)\right) = E(X) - E(X) = 0$.
The variance of a random variable is defined as the expectation $\mathrm{var}(X) = E\left[(X - E(X))^2\right]$, the average squared deviation from the mean.
If $X \sim \mathrm{Bernoulli}(p)$, then $E(X) = p$, and a direct calculation from the definition gives $\mathrm{var}(X) = (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1 - p)$.
While one could calculate the variance directly, as in the example above, this is often a cumbersome calculation in more complex cases. Instead, by the corollary, and noting that $(X - E(X))^2 = X^2 - 2E(X)X + E(X)^2$, one obtains the shortcut formula $\mathrm{var}(X) = E(X^2) - 2E(X)E(X) + E(X)^2 = E(X^2) - E(X)^2$.
If $X \sim \mathrm{Uniform}(a, b)$, then $E(X) = \frac{a + b}{2}$ and $E(X^2) = \int_a^b \frac{x^2}{b - a}\, dx = \frac{a^2 + ab + b^2}{3}$.
Then, by the variance formula: $\mathrm{var}(X) = \frac{a^2 + ab + b^2}{3} - \left(\frac{a + b}{2}\right)^2 = \frac{(b - a)^2}{12}$.
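Both routes to the variance can be compared numerically for a uniform variable; a sketch with assumed endpoints, using a midpoint grid (the closed form $(b - a)^2/12$ is the standard uniform variance):

```python
# compare E[(X - mu)^2] with E(X^2) - E(X)^2 for Uniform(a, b)
a, b = 2.0, 5.0    # assumed endpoints
n = 100_000
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]     # midpoint grid
w = dx / (b - a)                                # f(x) dx, constant density
mu = sum(x * w for x in xs)
var_def = sum((x - mu) ** 2 * w for x in xs)    # definition of variance
var_short = sum(x * x * w for x in xs) - mu**2  # shortcut formula
assert abs(var_def - var_short) < 1e-9
assert abs(var_def - (b - a) ** 2 / 12) < 1e-6
```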