Lab 8
Probability (I)
Random Variable
A “random variable” can be viewed as a numerical variable with known possible values, though the actual value remains unknown until the corresponding experiment is completed. Since numerical variables can be either discrete or continuous, random variables can also be classified as discrete or continuous.
Discrete Random Variable
A discrete random variable can take on countably distinct values. For example, it might yield an integer between 1 and 6 (as in the outcome of rolling a die) or an integer from 1 to infinity (such as the number of car accidents in the U.S. in a given year).
Box model
A box model is a useful way to model an experiment with a discrete random variable. Suppose we have a box containing three balls marked with 1, five balls marked with 2, and two balls marked with 3. We draw a ball with replacement, meaning:
- Extract a ball.
- Check the number on it.
- Put the ball back into the box.
- Repeat the process.
Due to this “replacement” condition, the theoretical probability of drawing a 1, 2, or 3 remains constant throughout the experiment. If we define the random variable \(X\) as the number marked on the drawn ball, the corresponding discrete probability distribution is:
| x | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| p(x) | 3/10 | 5/10 | 2/10 | 10/10=1 |
The shape of the distribution can be illustrated:
Now, we can use the \(\texttt{\color{brown}{sample()}}\) function in \(\textbf{R}\) to implement the experiment using the box model, which involves assuming the model and simulating the process of drawing balls from the box. The \(\texttt{\color{brown}{sample()}}\) function accepts three arguments:
- The first argument accepts the ball setting in a box model;
- The second argument is the number of balls we want to draw;
- The third argument is to set the replacement condition.
Note that each time you run the code, the set of drawn balls will differ due to the randomness of the experiment. Now, we can summarize the observed balls in the sample data using a frequency table as follows:
With the large number of the sample, the frequency table closely aligns with the theoretical probability distribution. This illustrates the concept of long-run probability—the idea that the relative frequency of an event (e.g., \(X=1\), \(X=2\), or \(X=3\)) converges toward the theoretical probability with repeated trials. We can further verify this by examining the trajectories of relative frequencies. For this purpose, we use the \(\texttt{\color{brown}{cumsum()}}\) function, which generates a vector of cumulative sums of events from subsets with sample sizes ranging from 1 to 2000.
We can observe that the convergences of relative frequencies to the theoretical probabilities.
Mean and variance of discrete probability distribution
As learned, the mean and variance of a discrete probability distribution can be calculated by: \[\begin{align*} \mu =& \sum x p(x)\\ \sigma^2 = & \sum (x-\mu)^2p(x), \end{align*}\] where the summation is taken over all possible values of \(X\). Using R, we can easily compute these quantities. For example, with our box model:
We can also calculate these values empirically using the data we generated.
Again, observe the empirical mean and variance are close to their theoretical counterparts.
Binomial distribution
The binomial probability distribution is one of the most widely used discrete distributions. Consider an experiment with two possible outcomes: success or failure, each with a fixed probability of success \(p\). If this experiment is repeated \(n\) times, the total number of successes can be modeled as a binomial random variable \(X\), which can take any integer value from 0 (no success) to \(n\) (all successes).
Recall that the probability of observing exactly \(x\) successes is given by: \[
p(x)=\frac{n!}{x!(n-x)!}p^x(1-p)^{n-x}.
\]
Example: Curry’s free throw shootings
Stephen Curry is renowned for his three-point shooting, but he also holds the highest career free-throw percentage in NBA history (91%). Suppose we expect him to attempt 7 free throws in the Warriors’ game against the Pelicans on 03/28/2025.
We can model this scenario using the binomial probability distribution with \(p=0.91\) (success probability) and \(n=7\) (number of attempts). The probability that he makes exactly 5 out of 7 free throws can be calculated as:
While this direct computation is doable in R, it may not be convenient to write out all the codes manually. Instead, we can use the built-in function \(\texttt{\color{brown}{dbinom()}}\) to simplify the process. The function \(\texttt{\color{brown}{dbinom()}}\) accepts the three arguments:
- The first argument is the observed \(x\) value;
- The second and third arguments are the two parameters \(n\) and \(p\).
In addition, we can calculate the probability of achieving five or fewer successful shots using the function \(\texttt{\color{brown}{pbinom()}}\).
Lastly, the probability the number of successes is more than five shots becomes:
Note that the probability above was obtained using the complement rule.
We can also check the shape of the distribution as well:
For a binomial distribution, we can easily determine the mean and variance using the two parameters \(n\) and \(p\): \[ \mu=np\text{ and }\sigma^2=np(1-p). \]
Lab Questions

Shaquille O’Neal is a legendary NBA player, renowned for being one of the most dominant forces in the history of the league, particularly due to his impact on the game and his prowess in the paint. However, he was also infamous for his low free throw percentage of 52.7% throughout his career. Opposing teams often resorted to intentionally fouling him to give him more free throw attempts, a strategy known as “Hack-a-Shaq.” For the following questions, suppose he attempted 10 free throws in a game.
- What are the two parameters \(n\) and \(p\)?
- What is the probability that O’Neal made 7 free throws among 10 attempts?
- What is the probability that O’Neal made at least 7 free throws among 10 attempts?
- Calculate the mean and variance of the number of successful shots among 10 attempts.
Click HERE to submit your answers.