A random variable (RV) is a function that maps each outcome in a sample space $\Omega$ to a number (e.g., in $\mathbb{R}$), i.e., $X: \Omega \to \mathbb{R}$.
A discrete random variable takes on a finite (or countably infinite) number of values.
Suppose $X$ is the number of heads in 3 tosses of a fair coin.
$x$, a realization of $X$, can take on the values 0, 1, 2, 3.
Distribution of a Discrete Random Variable
The distribution of a discrete RV is the collection of its values and the probabilities associated with those values.
The probability distribution for $X$ is as follows:
$x$           0     1     2     3
$P(X = x)$    1/8   3/8   3/8   1/8
For the distribution to be well-defined, we need that $0 \le P(X = x_i) \le 1$ for every value $x_i$ and $\sum_i P(X = x_i) = 1$.
Example Discrete Distribution of $X$
Expectation of a Random Variable
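Let $x_1, \ldots, x_k$ be realizations of $X$ with corresponding probabilities $P(X = x_1), \ldots, P(X = x_k)$. The expected value of $X$ is the sum of each $x_i$ multiplied by its corresponding probability:
$$E(X) = \sum_{i=1}^{k} x_i P(X = x_i)$$
Sometimes, $\mu$ may be used in place of the notation $E(X)$, and $E(X)$ may be written $\mathbb{E}[X]$.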
Calculating an Expectation
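Returning to our coin tossing example,
$$E(X) = 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) = \tfrac{12}{8} = 1.5$$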
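As a quick check of the coin-tossing calculation, the expectation can be computed in R as a probability-weighted sum. This is a minimal sketch; the object names `x`, `p`, and `ex` are chosen here for illustration and are not from the original slides.

```r
# Values of X (number of heads in 3 tosses of a fair coin) and their probabilities
x <- 0:3
p <- c(1, 3, 3, 1) / 8

# A well-defined distribution has probabilities that sum to 1
total_prob <- sum(p)

# Expected value: each value multiplied by its probability, then summed
ex <- sum(x * p)
ex
```

The same pattern works for any discrete RV with finitely many values: supply the values and their probabilities as two vectors of equal length.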
Linearity of Expectation
The expectation operator is linear in that:
$E(aX) = aE(X)$, for a constant $a$
$E(X + b) = E(X) + b$, for a constant $b$
This turns out to be a very useful property. Intuitively, this follows from the expectation being a summation (or integration) operation.
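Linearity can be checked numerically on the coin-tossing distribution. The sketch below compares $E(aX + b)$, computed directly from the distribution, with $aE(X) + b$; the constants `a = 2` and `b = 5` are arbitrary values picked for illustration.

```r
# Distribution of X = number of heads in 3 tosses of a fair coin
x <- 0:3
p <- c(1, 3, 3, 1) / 8

# Arbitrary constants (illustrative assumptions)
a <- 2
b <- 5

# Left side: expectation of the transformed variable aX + b
lhs <- sum((a * x + b) * p)

# Right side: transform the expectation of X instead
rhs <- a * sum(x * p) + b

c(lhs = lhs, rhs = rhs)
```

Both sides agree because multiplying by a constant and adding a constant pass through the sum that defines the expectation.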
Variance of a Random Variable
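Let $x_1, \ldots, x_k$ be realizations of $X$ with corresponding probabilities $P(X = x_1), \ldots, P(X = x_k)$ and expected value $\mu$. Then the variance of $X$, $\text{Var}(X)$ (or $\sigma^2$), is
$$\text{Var}(X) = \sum_{i=1}^{k} (x_i - \mu)^2 P(X = x_i)$$
The standard deviation of $X$, written $\text{SD}(X)$ (or $\sigma$), is just the square root of the variance.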
Calculating the Variance of a Random Variable
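Again returning to our coin tossing example,
$$\text{Var}(X) = (0 - 1.5)^2\left(\tfrac{1}{8}\right) + (1 - 1.5)^2\left(\tfrac{3}{8}\right) + (2 - 1.5)^2\left(\tfrac{3}{8}\right) + (3 - 1.5)^2\left(\tfrac{1}{8}\right) = \tfrac{3}{4}$$
The standard deviation is $\sqrt{3/4} \approx 0.866$.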
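The variance and standard deviation of the coin-tossing RV can be reproduced in R. This is a small sketch using illustrative object names (`mu`, `var_x`, `sd_x`).

```r
# Distribution of X = number of heads in 3 tosses of a fair coin
x <- 0:3
p <- c(1, 3, 3, 1) / 8

# Expected value
mu <- sum(x * p)

# Variance: probability-weighted squared distance from the mean
var_x <- sum((x - mu)^2 * p)

# Standard deviation: square root of the variance
sd_x <- sqrt(var_x)

c(variance = var_x, sd = sd_x)
```

Note that R's built-in `var()` and `sd()` estimate these quantities from a sample of data, so they are not the right tools when the distribution itself is known.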
Variance and Expectation
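Note that $\text{Var}(X)$ is the expected squared distance of a given realization $x$ from the RV $X$'s expected value $E(X)$, so
$$\text{Var}(X) = E\left[(X - E(X))^2\right]$$
As noted before, if we define $\mu = E(X)$, then we have
$$\text{Var}(X) = E\left[(X - \mu)^2\right]$$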
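Writing the variance as an expectation, with $\mu = E(X)$, and applying linearity gives a shortcut that is often easier to compute. The derivation below is a standard identity, sketched here for reference, followed by a check on the coin-tossing example.

```latex
\begin{aligned}
\text{Var}(X) &= E\left[(X - \mu)^2\right] \\
              &= E\left[X^2 - 2\mu X + \mu^2\right] \\
              &= E(X^2) - 2\mu E(X) + \mu^2 \\
              &= E(X^2) - \mu^2 \\[6pt]
\text{Coin example: } E(X^2) &= 0^2\left(\tfrac{1}{8}\right) + 1^2\left(\tfrac{3}{8}\right)
                               + 2^2\left(\tfrac{3}{8}\right) + 3^2\left(\tfrac{1}{8}\right) = 3, \\
\text{Var}(X) &= 3 - (1.5)^2 = 0.75
\end{aligned}
```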
Covariance of Two Random Variables
For two RVs, $X$ and $Y$, their covariance, $\text{Cov}(X, Y)$, measures the degree to which the two RVs vary together:
$$\text{Cov}(X, Y) = E\left[(X - E(X))(Y - E(Y))\right]$$
The covariance is tied to another notion, correlation, which we will discuss later in a module on regression analysis.
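For a concrete case, take $X$ to be the number of heads in 3 tosses of a fair coin and $Y = 3 - X$ to be the number of tails (an example constructed here, not taken from the slides). Using the standard definition $\text{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))]$, the sketch below computes the covariance exactly. Because $Y$ is determined by $X$, each pair $(x, 3 - x)$ carries the probability of $x$.

```r
# X = number of heads in 3 tosses; Y = number of tails = 3 - X (illustrative choice)
x <- 0:3
p <- c(1, 3, 3, 1) / 8
y <- 3 - x

# Expected values of X and Y
ex <- sum(x * p)
ey <- sum(y * p)

# Covariance: expected product of the deviations from the means
cov_xy <- sum((x - ex) * (y - ey) * p)
cov_xy
```

The result is negative because more heads always means fewer tails; here it equals the negative of the variance of $X$.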
Binomial Random Variables
A specific type of discrete RV is a binomial RV.
$X$ is a binomial RV when it represents the number of successes in $n$ independent replications of an experiment where
Each replicate has two possible outcomes: success or failure
The probability of success $p$ in each replicate is constant
A binomial RV takes on values $0, 1, 2, \ldots, n$. We use the shorthand $X \sim \text{Bin}(n, p)$ to say that $X$ follows a binomial distribution with $n$ trials and success probability $p$.
For example, the number of heads in 3 tosses of a fair coin is a binomial RV with parameters $n = 3$ and $p = 0.5$.
For a binomial RV with parameters $n$ and $p$, $E(X) = np$ and $\text{SD}(X) = \sqrt{np(1 - p)}$.
The Binomial Coefficient
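The binomial coefficient $\binom{n}{k}$ is the number of ways to choose $k$ items from a set of size $n$, where the order of the choice is ignored.
Mathematically,
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$
For any integer $m \ge 1$, $m! = m(m - 1)(m - 2) \cdots (1)$, and by convention $0! = 1$.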
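In R, the base functions `choose()` and `factorial()` compute these quantities directly. The sketch below counts the ways to get 2 heads in 3 tosses; the values `n = 3` and `k = 2` are chosen for illustration.

```r
# Illustrative values: choose k = 2 items from a set of size n = 3
n <- 3
k <- 2

# Built-in binomial coefficient
coef_builtin <- choose(n, k)

# The same quantity from the factorial formula n! / (k! (n - k)!)
coef_formula <- factorial(n) / (factorial(k) * factorial(n - k))

# By convention, 0! equals 1
zero_fact <- factorial(0)

c(builtin = coef_builtin, formula = coef_formula, zero_factorial = zero_fact)
```

For large `n`, `choose()` is preferable to the factorial formula, since the factorials themselves overflow long before the coefficient does.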
Formula for the Binomial Distribution
Let $X$ be the number of successes in $n$ trials, then
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k = 0, 1, \ldots, n$$
Parameters of the distribution:
$n$ = number of trials
$p$ = probability of success
Calculating Binomial Probabilities in R
The function dbinom() is used to calculate $P(X = k)$.
dbinom(k, n, p): $P(X = k)$
The function pbinom() is used to calculate $P(X \le k)$ or $P(X > k)$.
pbinom(k, n, p): $P(X \le k)$
pbinom(k, n, p, lower.tail = FALSE): $P(X > k)$
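Applied to the coin-tossing example ($n = 3$, $p = 0.5$), these functions reproduce the distribution given earlier. The sketch below also evaluates the binomial formula by hand for $k = 2$ as a cross-check; the object names are illustrative.

```r
n <- 3
p <- 0.5

# P(X = k) for k = 0, 1, 2, 3
pmf <- dbinom(0:3, n, p)

# The same probability for k = 2, written out from the binomial formula
by_hand <- choose(n, 2) * p^2 * (1 - p)^(n - 2)

# P(X <= 1) and P(X > 1); together they must sum to 1
lower <- pbinom(1, n, p)
upper <- pbinom(1, n, p, lower.tail = FALSE)

pmf
c(by_hand = by_hand, lower = lower, upper = upper)
```

Keep in mind that `lower.tail = FALSE` gives the strict inequality $P(X > k)$, so $P(X \ge k)$ requires passing `k - 1`.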
Continuous Random Variables
A discrete random variable takes on a finite (or countably infinite) number of values.
Number of heads in $n$ coin tosses
Number of people who’ve had chicken pox in a random sample
A continuous random variable takes on any value in an interval.
Height in a population
Blood pressure in a population
Discrete RVs are counted, continuous RVs are measured.
Probabilities from Continuous Distributions
Two important features of continuous distributions:
The total area under the density curve is 1.
The probability that a variable has a value within a specified interval is the area under the curve over that interval.
When working with continuous random variables, probability is found for intervals of values rather than individual values.
Formally, the probability that a continuous RV takes on any single individual value is zero, that is, $P(X = x) = 0$.
Thus, $P(a < X < b)$ is equivalent to $P(a \le X \le b)$.
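To make the "area under the curve" idea concrete, the sketch below uses the standard normal density `dnorm()` purely as a convenient example of a continuous distribution (an illustrative choice). It computes an interval probability two ways, then shows the probability shrinking toward zero as the interval collapses onto a single point.

```r
# Interval endpoints (illustrative)
a <- -1
b <- 1

# Probability as the area under the density curve between a and b
area <- integrate(dnorm, lower = a, upper = b)$value

# The same probability as a difference of cumulative probabilities
diff_cdf <- pnorm(b) - pnorm(a)

# Probability of ever-narrower intervals around the single value x = 1
eps <- c(0.1, 0.01, 0.001)
shrinking <- pnorm(1 + eps) - pnorm(1 - eps)

c(area = area, diff_cdf = diff_cdf)
shrinking
```

Because a single point contributes no area, including or excluding the endpoints of an interval does not change its probability.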
The “Empirical Rule” for the Normal Distribution
According to the “empirical rule,” for any normal distribution,
approximately 68% of the data are within 1 SD of the mean
approximately 95% of the data are within 2 SDs of the mean
approximately 99.7% of the data are within 3 SDs of the mean
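The three percentages can be verified with `pnorm()`. The sketch below does so for the standard normal and then for a normal distribution with mean 1500 and SD 300 (the SAT parameters used in the `pnorm()` calls later in these notes), illustrating that the rule does not depend on the particular mean and SD.

```r
# Number of SDs from the mean
k <- 1:3

# Standard normal: P(-k < Z < k)
within_std <- pnorm(k) - pnorm(-k)

# Normal with mean 1500 and SD 300: P(mu - k*sigma < X < mu + k*sigma)
mu <- 1500
sigma <- 300
within_sat <- pnorm(mu + k * sigma, mu, sigma) - pnorm(mu - k * sigma, mu, sigma)

round(within_std, 4)
round(within_sat, 4)
```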
An Example of Using a Normal Distribution
Assume that the distributions of test scores on the SAT and ACT are normal with means $\mu_{SAT} = 1500$, $\mu_{ACT} = 21$ and variances $\sigma^2_{SAT} = 300^2$, $\sigma^2_{ACT} = 5^2$.
Suppose that one student scores an 1800 on the SAT (Student A) and another student scores a 24 on the ACT (Student B). Which student performed better?
Standard Normal Distribution
A standard normal distribution is defined as a normal distribution with mean 0 and variance 1. It is often denoted as $N(0, 1)$.
Any normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$ can be transformed into a standard normal random variable $Z$:
$$Z = \frac{X - \mu}{\sigma}$$
Example of Using a Normal Distribution
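SAT scores are $N(\mu = 1500, \sigma = 300)$; ACT scores are $N(\mu = 21, \sigma = 5)$.
$x_A = 1800$ is the score of Student A; $x_B = 24$ is the score of Student B.
$$z_A = \frac{1800 - 1500}{300} = 1, \qquad z_B = \frac{24 - 21}{5} = 0.6$$
Student A scored further above the mean in SD units, so Student A performed better relative to other test takers.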
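The comparison can be carried out in R by standardizing each score. The SAT parameters (mean 1500, SD 300) match the `pnorm()` calls used later in these notes; the ACT parameters (mean 21, SD 5) are an assumption here, taken to match the textbook version of this example.

```r
# SAT: mean 1500, SD 300; ACT: mean 21, SD 5 (ACT values are an assumption)
z_a <- (1800 - 1500) / 300   # Student A's z-score
z_b <- (24 - 21) / 5         # Student B's z-score

# Percentile ranks: area to the left of each z-score
pct <- pnorm(c(A = z_a, B = z_b))

c(z_a = z_a, z_b = z_b)
round(pct, 4)
```

Under these assumptions Student A sits at roughly the 84th percentile and Student B at roughly the 73rd, so Student A performed better relative to peers.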
Calculating Probabilities from Normal Distributions
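What is the percentile rank for a student who scores an 1800 on the SAT for a year in which the scores are $N(\mu = 1500, \sigma = 300)$?
Calculate a $z$-score. If $X \sim N(\mu = 1500, \sigma = 300)$,
$$z = \frac{x - \mu}{\sigma} = \frac{1800 - 1500}{300} = 1$$
pnorm(z) gives the area (i.e., probability) to the left of $z$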
pnorm(1)
[1] 0.8413447
Alternatively, let R do the work…
pnorm(1800, 1500, 300)
[1] 0.8413447
Calculating Probabilities from Normal Distributions
What score on the SAT would put a student in the 99th percentile?
Identify the $z$-value. qnorm(p) calculates the value $z$ (a quantile) such that for a $Z \sim N(0, 1)$, $P(Z \le z) = p$.
qnorm(0.99)
[1] 2.326348
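If $X \sim N(\mu = 1500, \sigma = 300)$, then $z = (x - 1500)/300$, and $x = 1500 + 300z$, so $x = 1500 + 300(2.326) \approx 2198$.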
Alternatively, let R do the work …
qnorm(0.99, 1500, 300)
[1] 2197.904
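The two routes to the 99th percentile can be confirmed to agree, and `pnorm()` can be used to undo `qnorm()` as a sanity check. A short sketch, with illustrative object names:

```r
# Route 1: standard normal quantile, then rescale by hand (x = mu + z * sigma)
z99 <- qnorm(0.99)
x_by_hand <- 1500 + z99 * 300

# Route 2: let qnorm() handle the mean and SD
x_direct <- qnorm(0.99, 1500, 300)

# Round trip: the area to the left of that score should be 0.99
back <- pnorm(x_direct, 1500, 300)

c(by_hand = x_by_hand, direct = x_direct, round_trip = back)
```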
Words of Warning…
“Everyone is sure of this [that errors are normally distributed]…since the experimentalists believe that it is a mathematical theorem, and the mathematicians that it is an experimentally determined fact.” –Poincaré (1912)
“Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise.” –Tukey (1962)
The Poisson Distribution
The Poisson distribution is used to calculate probabilities for rare events that accumulate over time (though it is often used more broadly for counts).
It is used most often in settings where events happen at a rate $\lambda$ per unit of population and per unit time, such as the annual incidence of a disease in a population.
Typical example: for children ages 0-14, the incidence rate of acute lymphocytic leukemia (ALL) was about 30 diagnosed cases per million children per year in the decade 2000-2010.
It is always important to note and understand the units.
Example: Outbreaks of Childhood Leukemia
Fortunately, childhood cancers are rare.
For children ages 0-14, the incidence rate of acute lymphocytic leukemia (ALL) was approximately 30 diagnosed cases per million children per year in the decade from 2000-2010. Given that about 20% of the US population are in this age range, we may ask
What is the incidence rate of ALL over a 5-year period?
In a small city of 75,000 people, what is the probability of observing exactly 8 cases of ALL over a 5-year period?
In the small city, what is the probability of observing 8 or more cases of ALL over a 5-year period?
Poisson Distribution
Suppose events occur over a fixed time window in such a way that
The probability an event occurs in an interval is proportional to the length of the interval.
Events occur independently at a rate $\lambda$ per unit of time.
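Then the probability of exactly $k$ events in one unit of time is
$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$
The mean is $\lambda$, i.e., $E(X) = \lambda$.
The standard deviation is $\sqrt{\lambda}$, i.e., $\text{SD}(X) = \sqrt{\lambda}$.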
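The probability of exactly $k$ events in $t$ units of time is
$$P(X = k) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}, \quad k = 0, 1, 2, \ldots$$
The mean is $\lambda t$, i.e., $E(X) = \lambda t$.
The standard deviation is $\sqrt{\lambda t}$, i.e., $\text{SD}(X) = \sqrt{\lambda t}$.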
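The Poisson formula for $t$ units of time can be compared with R's `dpois()`, which takes the overall expected count $\lambda t$ as its `lambda` argument. In the sketch below the rate, time window, and count are hypothetical values chosen only for illustration; the mean and SD are approximated by summing over a long but finite range of counts.

```r
# Hypothetical inputs (not from the slides)
lambda <- 2    # events per unit of time
t_units <- 3   # number of time units
k <- 4         # number of events of interest

# Probability from the formula exp(-lambda t) (lambda t)^k / k!
by_hand <- exp(-lambda * t_units) * (lambda * t_units)^k / factorial(k)

# Same probability from dpois(), with expected count lambda * t
from_r <- dpois(k, lambda * t_units)

# Mean and SD via a truncated sum over counts 0..200 (remaining tail is negligible here)
grid <- 0:200
probs <- dpois(grid, lambda * t_units)
mean_x <- sum(grid * probs)
sd_x <- sqrt(sum((grid - mean_x)^2 * probs))

c(by_hand = by_hand, from_r = from_r, mean = mean_x, sd = sd_x)
```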
Figure: Poisson distribution for an example rate $\lambda$
Childhood Leukemia Incidence
See Example 3.37 of Vu and Harrington (2020) for details.
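What is the incidence rate of ALL over a 5-year period?
Incidence rate of ALL in a year is 30 cases per 1,000,000 children: $30 / 1{,}000{,}000 = 0.00003$ cases per child per year.
Incidence rate in a 5-year period is (5)(30) per 1,000,000 children: $150 / 1{,}000{,}000 = 0.00015$ cases per child per 5 years.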
…and what about a city of size 75,000?
In a small city of 75,000 people, what is the probability of observing exactly 8 cases of ALL over a 5-year period?
In a city of 75,000 people, about $(0.20)(75{,}000) = 15{,}000$ children will be age 0-14.
The five-year rate of new cases for the whole city would be $\lambda = (0.00015)(15{,}000) = 2.25$.
What is the probability of 8 cases over 5 years?
$$P(X = 8) = \frac{e^{-2.25} (2.25)^8}{8!} \approx 0.0017$$
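The chain of unit conversions leading to the city's five-year expected count can be written out step by step in R, which makes the units easier to track. The object names below are illustrative.

```r
# Incidence: 30 cases per 1,000,000 children per year
rate_per_child_year <- 30 / 1e6

# Length of the observation window, in years
years <- 5

# About 20% of a city of 75,000 people are children ages 0-14
children <- 0.20 * 75000

# Expected number of new cases in the city over 5 years
lambda <- rate_per_child_year * years * children

# Probability of exactly 8 cases, from the Poisson formula
p8 <- exp(-lambda) * lambda^8 / factorial(8)

c(children = children, lambda = lambda, p8 = p8)
```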
In a small city, what is the probability of observing 8 or more ALL cases in a 5-year period?
This is easiest to calculate in R. To do so, suppose $X \sim \text{Pois}(\lambda)$; then dpois(k, lambda) gives $P(X = k)$.
dpois(8, lambda = 2.25)
[1] 0.001717027
…but what is the probability of 8 or more cases?
In a small city, what is the probability of observing 8 or more ALL cases in a 5-year period?
Would 8 or more cases be a rare event? Suppose $X \sim \text{Pois}(2.25)$ and calculate $P(X \ge 8)$.
Compute $P(X \ge 8) = 1 - P(X \le 7)$ using ppois(k, lambda), which gives $P(X \le k)$
1 - ppois(7, lambda = 2.25)
[1] 0.002267088
Or compute $P(X > 7)$ directly as ppois(k, lambda, lower.tail = FALSE)
ppois(7, lambda = 2.25, lower.tail = FALSE)
[1] 0.002267088
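A common slip with upper-tail probabilities is being off by one, because `lower.tail = FALSE` returns the strict inequality $P(X > k)$. The sketch below cross-checks $P(X \ge 8)$ against a direct sum of `dpois()` terms and shows what passing 8 instead of 7 would compute; the object names are illustrative.

```r
lambda <- 2.25

# P(X >= 8) = P(X > 7), using the upper tail
upper_direct <- ppois(7, lambda, lower.tail = FALSE)

# Same quantity as 1 minus the sum of P(X = 0), ..., P(X = 7)
upper_sum <- 1 - sum(dpois(0:7, lambda))

# Passing 8 instead gives P(X > 8) = P(X >= 9), which omits P(X = 8)
off_by_one <- ppois(8, lambda, lower.tail = FALSE)

c(upper_direct = upper_direct, upper_sum = upper_sum, off_by_one = off_by_one)
```

The gap between the first and third values is exactly $P(X = 8)$, the quantity returned by `dpois(8, lambda)`.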
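Summary Table of Distributions
                     Binomial             Normal                 Poisson
Parameters           $n$, $p$             $\mu$, $\sigma$        $\lambda$
Possible values      $0, 1, \ldots, n$    $(-\infty, \infty)$    $0, 1, 2, \ldots$
Mean                 $np$                 $\mu$                  $\lambda$
Standard Deviation   $\sqrt{np(1-p)}$     $\sigma$               $\sqrt{\lambda}$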
References
Poincaré, Henri. 1912. Calcul Des Probabilités. Vol. 1. Gauthier-Villars.