In probability, a trial is any procedure with an uncertain result, such as flipping a coin, rolling a die, or drawing a card. Each possible result of a trial is called an outcome.
An event is a collection of one or more outcomes, representing scenarios we're interested in, such as rolling an even number or drawing a red card. Events are the things whose probabilities we calculate, and they are typically denoted with letters such as A, so that the "probability of an event A" is written P(A).
All possible outcomes from a single trial form the sample space, denoted U.
The overall probability of the sample space, denoted P(U), is 1. This expresses the idea that if you perform a trial, something must happen.
Theoretical probability is calculated based on reasoning or mathematical principles: it's what we expect to happen. When outcomes are equally likely, the probability of an event is given by

P(A) = n(A) / n(U)

where n(A) is the number of outcomes in event A, and n(U) is the total number of outcomes in the sample space.
Experimental probability (or relative frequency) is found by actually conducting trials and observing outcomes. The relative frequency is calculated by:

relative frequency of A = (number of trials in which A occurred) / (total number of trials)
While theoretical probability tells us what's expected, experimental probability tells us what's observed.
For an event A with probability P(A), the expected number of occurrences of A after n trials is given by

expected number of occurrences = P(A) × n

This is another way of saying that for every n trials, A will happen an average of P(A) × n times.
The complement of an event A, denoted A′, is the event that A does not happen. Since A either happens or it doesn't, exactly one of A and A′ must happen for each trial:

P(A) + P(A′) = 1
This expresses the idea that the probability of the entire outcome space is 1.
A sample space diagram is a table representation for scenarios where two probability events occur side by side, and we're interested in certain combinations of those events.
For example, consider a game where a player rolls a die and then flips a coin. If the coin lands on heads, we add 1 to their die roll, and if it lands on tails we double their die roll. The sample space diagram lists the score for every combination of die roll and coin flip:

          1    2    3    4    5    6
Heads     2    3    4    5    6    7
Tails     2    4    6    8   10   12

Then we can answer questions like:

What is the probability of scoring higher than 6?

To do this, we simply count the number of cells that are greater than 6 and divide by the total number of cells. In this case the qualifying cells are 7, 8, 10, and 12, so the answer is

P(score > 6) = 4/12 = 1/3
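The counting argument above can be checked by enumerating the whole sample space in a short script. This is a sketch in Python; the scoring rules (heads adds 1, tails doubles) are the ones from the example.

```python
from fractions import Fraction

# Build the sample space: one score for each (die roll, coin flip) combination.
scores = []
for roll in range(1, 7):
    scores.append(roll + 1)   # heads: add 1 to the die roll
    scores.append(roll * 2)   # tails: double the die roll

# Theoretical probability: favourable cells / total cells.
p = Fraction(sum(1 for s in scores if s > 6), len(scores))
print(p)  # 1/3
```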
A Venn diagram is a visual tool used to illustrate relationships between sets or events. Each event is represented by a circle, and overlaps between circles represent shared outcomes. All circles lie within the larger universe U, which is the whole sample space.
Venn diagrams are often filled in with numbers representing the number of samples in each category.
The intersection is the event where both A and B occur simultaneously, denoted A∩B.
The union is the event that at least one of A or B occurs. The union is denoted A∪B and has probability

P(A∪B) = P(A) + P(B) − P(A∩B)

This formula is sometimes referred to as the inclusion-exclusion rule. It is often rearranged in the form

P(A∩B) = P(A) + P(B) − P(A∪B)
Events are mutually exclusive if they cannot both occur at once. In this case, the intersection probability is zero:

P(A∩B) = 0

And therefore

P(A∪B) = P(A) + P(B)
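The inclusion-exclusion rule can be verified by direct enumeration. In this Python sketch, the events A = "the roll is even" and B = "the roll is greater than 4" are invented for illustration; they are not mutually exclusive, since a roll of 6 is in both.

```python
from fractions import Fraction

U = set(range(1, 7))                 # sample space: one die roll
A = {r for r in U if r % 2 == 0}     # event A: roll is even -> {2, 4, 6}
B = {r for r in U if r > 4}          # event B: roll is greater than 4 -> {5, 6}

def P(event):
    """Theoretical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(U))

# Inclusion-exclusion: P(A∪B) = P(A) + P(B) − P(A∩B)
lhs = P(A | B)
rhs = P(A) + P(B) - P(A & B)
print(lhs, rhs)  # 2/3 2/3
```

Counting directly, A∪B = {2, 4, 5, 6}, which is 4 of the 6 equally likely outcomes, matching the formula.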
Conditional probability is the probability of event A happening given we already know event B has occurred. It's calculated by taking the probability that both events occur, divided by the probability of the known event B:

P(A|B) = P(A∩B) / P(B)

Notice that we can rearrange this formula to get a general formula for the probability of both events occurring together:

P(A∩B) = P(A|B) × P(B)
If two events A and B are independent, then knowing whether or not one happened gives no information on whether or not the other happened, and

P(A|B) = P(A)

Rearranging the conditional probability formula gives the fact that for independent A and B,

P(A∩B) = P(A) × P(B)
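Both formulas can be checked by enumerating a two-step trial. In this Python sketch, the scenario (roll two dice; A = "first die is even", B = "second die is greater than 4") is invented for illustration; A and B are independent because they depend on different dice.

```python
from fractions import Fraction

# Sample space: all 36 ordered pairs from rolling two dice.
U = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def P(event):
    """Theoretical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for o in U if event(o)), len(U))

def A(o): return o[0] % 2 == 0   # event A: first die is even
def B(o): return o[1] > 4        # event B: second die is greater than 4

p_both = P(lambda o: A(o) and B(o))   # P(A∩B)

# Conditional probability: P(A|B) = P(A∩B) / P(B)
print(p_both / P(B))             # 1/2, the same as P(A)

# Independence: P(A∩B) = P(A) × P(B)
print(p_both == P(A) * P(B))     # True
```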
In probability, a selection is the action of choosing one or more items from a set or group. We can perform selections either with replacement or without replacement.
In selection with replacement, each chosen item is returned to the original group before the next choice, keeping the probabilities constant across selections.
In selection without replacement, the chosen items are removed from the group, causing probabilities to change after each pick because the number of available items decreases.
This difference significantly impacts how probabilities are calculated, especially in problems involving multiple selections.
A tree diagram is a map of what can happen, one step at a time. We use them in probability scenarios with multiple steps, mostly when the result of one step affects the probability of the next step.
To explain, consider this example: We have a bag with 3 red marbles and 2 green marbles. You draw one marble from the bag, put it in your pocket, and then draw a second marble from the bag.
The diagram starts at the point where nothing has happened yet, and the branches show the different possibilities for what can happen next.
There are two branches leaving from the start: one goes to "G" (green), the other to "R" (red). They are labelled with the probability of the first marble being that color. Since there are 3 red marbles and a total of 5 marbles in the bag, the "R" branch has probability 3/5.
Now the tree has split into two possibilities, and for each of those there are two possibilities for what can happen next. But those probabilities depend on what the first marble was, because if we drew a green marble there is only 1 green marble left.
The probabilities on the next four branches depend on where you already are in the tree. For example, the 1/4 in the top right is the probability that the second marble is green given that the first marble was green.
The probability of drawing two green marbles can be found by multiplying the branches that lead from "start" to two greens:

P(GG) = 2/5 × 1/4 = 1/10
The probability of drawing two different color marbles (in any order) requires adding the probability of RG and the probability of GR:

P(RG) + P(GR) = 3/5 × 2/4 + 2/5 × 3/4 = 3/10 + 3/10 = 3/5
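The tree-diagram answers can be confirmed by listing every ordered pair of draws without replacement. This Python sketch uses the bag from the example (3 red, 2 green); `permutations` treats the five marbles as distinct objects, which is exactly what "without replacement" means.

```python
from fractions import Fraction
from itertools import permutations

# Bag from the example: 3 red and 2 green marbles; draw two without replacement.
bag = ["R", "R", "R", "G", "G"]
draws = list(permutations(bag, 2))   # all 20 ordered draws of two distinct marbles

def P(event):
    """Theoretical probability: favourable draws / total draws."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

print(P(lambda d: d == ("G", "G")))   # 1/10  (two greens)
print(P(lambda d: d[0] != d[1]))      # 3/5   (two different colors)
```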
A Markov Chain is a probability model for a scenario that transitions between different states at fixed time intervals.
An example is a fluorescent molecule filmed frame by frame under a microscope. It has three states:
On: glowing
Off: temporarily dark
Bleached: permanently dead, will never glow again
At each frame, the molecule can transition between states, and those transitions have fixed probabilities:
If the molecule is on, there's an 80% chance it stays on, a 15% chance it turns off, and a 5% chance it bleaches.
If the molecule is off, there's a 60% chance it stays off, and a 40% chance it turns back on.
If the molecule is bleached, it's off forever, so there's a 100% probability it stays in the bleached state.
We represent this system with a transition diagram:
Since a transition diagram is just a special type of graph, we can write a weight matrix for it.
The transition matrix is a square matrix of order n×n, where n is the number of states. In the ith row and jth column of the matrix we put the probability of going from state j to state i. Let's revisit our example from earlier:
Let's call the states 1 for on, 2 for off, and 3 for bleached. The transition matrix is 3×3, so something like:

T = [ ·   ·   · ]
    [ ·   ·   · ]
    [ ·   ·   · ]
Now let's fill in the first column by looking at what can happen from the "On" state, since we said "On" was column 1.
There's a 0.8 chance of staying in "On", so in row 1 we put 0.8.
There's a probability of 0.15 that the molecule moves to "Off", which is row 2, so we put 0.15 there.
Finally, we put 0.05 in row 3, because the probability of going from 1→3 (which is "On" to "Bleached") is 0.05.
If we repeat that process for the other states, we find the other columns:

T = [ 0.8    0.4    0 ]
    [ 0.15   0.6    0 ]
    [ 0.05   0      1 ]
Once we have the transition matrix, we can use it to see how the state evolves at each step. Imagine the molecule starts in the "On" state, which we said was the first row/column. We can represent this state with the vector

[ 1 ]
[ 0 ]
[ 0 ]

which just says "there's a 100% chance the molecule is in the 'On' state right now". We call this vector the initial state vector, denoted s0.
One frame later, the state evolves according to the transition matrix:

T s0 = [ 0.8  ]
       [ 0.15 ]
       [ 0.05 ]
We call this vector s1, the state vector after 1 time step.
In general, the state vector after n steps is sn, and its entries give the probability of being in each state at that time.
Just as the first time step multiplies the state vector by the transition matrix, every subsequent step multiplies by the transition matrix again. Starting from state s0, we take n steps to get to sn, which means we multiply by T a total of n times:

sn = T^n s0
In our molecule example, if we want to find s100 we calculate using technology:

s100 = T^100 s0
This tells us that if a molecule starts in the "On" state, 100 time steps later there's a 97.4% chance it's in the "bleached" state. This makes sense because once in the bleached state, a molecule will stay in the bleached state. So in the long term, the likelihood of being in the bleached state should approach 100%.
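The "using technology" step can be reproduced without a CAS by multiplying by the transition matrix 100 times. This is a sketch in plain Python, using the matrix T and starting state from the example; the function name `step` is just a label chosen here.

```python
# Transition matrix from the example: column j holds the probabilities
# of leaving state j. States: 0 = On, 1 = Off, 2 = Bleached.
T = [[0.8,  0.4, 0.0],
     [0.15, 0.6, 0.0],
     [0.05, 0.0, 1.0]]

def step(T, s):
    """One time step: multiply the state vector by the transition matrix."""
    return [sum(T[i][j] * s[j] for j in range(3)) for i in range(3)]

s = [1.0, 0.0, 0.0]      # s0: molecule starts in the "On" state
for _ in range(100):     # apply T a total of 100 times to get s100
    s = step(T, s)

print(round(s[2], 3))    # probability of being bleached after 100 frames
```

The printed value is the 97.4% quoted above, and the three entries of `s` still sum to 1, as a state vector must.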
Some transition matrices have eigenvectors that are themselves state vectors. As a reminder, an eigenvector s with eigenvalue λ satisfies

T s = λ s

But the probabilities in s must add to 1, and the same must be true for T s. That forces λ = 1, so any such eigenvector of a transition matrix has eigenvalue 1:

T s = s
That means that when the transition matrix is applied to this state, the state does not change. We therefore call this state the steady state.
Finding eigenvectors is usually hard, but because we know the eigenvalue is 1, it's easier for transition matrices. For example, consider the transition matrix

T = [ 0.6   0.8 ]
    [ 0.4   0.2 ]
With eigenvalue 1, the eigenvector (x, y) we are looking for satisfies

[ 0.6   0.8 ] [ x ]   [ x ]
[ 0.4   0.2 ] [ y ] = [ y ]
Both rows give the same result: 0.8y = 0.4x ⇒ x = 2y. But we also know that x + y = 1, so 3y = 1, which gives y = 1/3 and x = 2/3.
With a calculator, the fastest way to find the steady state is just to compute a large power of the transition matrix:

T^100 ≈ [ 2/3   2/3 ]
        [ 1/3   1/3 ]
The columns will be the same if a steady state exists, so we just read off one column and find the steady state is (2/3, 1/3).
For any starting state (p, 1−p) (the entries have to sum to 1), the state after 100 steps is

T^100 [ p   ] ≈ [ 2/3 ]
      [ 1−p ]   [ 1/3 ]

regardless of p.
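The large-power shortcut is easy to check numerically. This Python sketch uses the 2×2 transition matrix from this example and computes T^100 by repeated multiplication; the helper name `matmul` is chosen here for illustration.

```python
# 2x2 transition matrix from the steady-state example (columns sum to 1).
T = [[0.6, 0.8],
     [0.4, 0.2]]

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Compute T^100 starting from the identity matrix.
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(100):
    P = matmul(P, T)

# Both columns have converged to the steady state (2/3, 1/3).
print([round(P[i][0], 6) for i in range(2)])  # [0.666667, 0.333333]
```

The convergence is fast here because the other eigenvalue of T is −0.2, and (−0.2)^100 is vanishingly small.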