Topics
When we want to make a claim using statistics, we need sufficient evidence. Flipping a coin and getting 3 heads in a row is not strong evidence that it is biased, but 100 in a row is.
Whatever data we have, we start by assuming that they are produced by random chance alone. We call this the null hypothesis, which we write H0. In the coin flip example, the null hypothesis is H0: the coin is fair.
An alternative hypothesis, denoted H1, is the idea that something "fishy" is going on. In the coin flip example, this could be H1: the coin is biased towards heads.
It's important (both in exams and real life) to assume the null hypothesis is true unless you have good evidence against it.
Writing down the null and alternative hypotheses can be hard, but you can think of H0 as a neutral assumption, and H1 as something we need evidence to prove.
Does listening to music while studying hurt test performance?
Null hypothesis H0: we assume it makes no difference: the average scores of students with and without music are the same.
Alternative hypothesis H1: students who listen to music do worse: they have a lower average test score.
Does drinking an energy drink improve reaction time?
Null hypothesis H0: we assume it makes no difference: the average reaction times with and without energy drinks are the same.
Alternative hypothesis H1: drinking an energy drink lowers the mean reaction time.
Once we have our null and alternative hypotheses, we use our data as evidence against the null hypothesis.
Let's take the coin flip example, and start by assuming the null hypothesis: the coin is fair. That means each time I flip it, I have a 1/2 probability of getting heads. If the coin gives 10 heads in a row, the probability is (1/2)^10 = 1/1024 ≈ 0.000977.
This number is the probability of the data we observed, assuming the null hypothesis is true. The smaller it is, the stronger the evidence against the null hypothesis.
We call this the p-value. The smaller the p-value, the stronger the evidence for the alternative hypothesis. If the p-value is less than the significance level α, we reject the null hypothesis and conclude that there is sufficient evidence for the alternative hypothesis.
p-value: The probability of getting results at least as surprising as the observation if the null hypothesis were true.
Significance level (α): The cutoff we choose in advance. If the p-value is below α, we reject the null hypothesis.
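The coin flip calculation can be checked in a few lines of Python. This is only a minimal sketch: the significance level α = 0.05 is an assumed value chosen for illustration, not one given in the example.

```python
from fractions import Fraction

# Probability of heads on a single flip, assuming H0 (the coin is fair).
p_heads = Fraction(1, 2)

# Probability of 10 heads in a row under H0.
p_value = p_heads ** 10

# Assumed significance level, chosen before looking at the data.
alpha = 0.05

# Reject H0 only if the p-value falls below the cutoff.
reject_h0 = float(p_value) < alpha
print(p_value, float(p_value), reject_h0)
```

Using `Fraction` keeps the exact value 1/1024 visible before it is converted to a decimal for the comparison with α.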
A χ² goodness of fit test compares actual frequencies to the frequencies that would be expected under the null hypothesis. The bigger the relative difference between actual and expected values, the smaller the p-value it returns.
For example, imagine a 5 kilometer race where the number of racers finishing in certain time brackets is recorded, and compared to what is expected based on historical data:
Notice that the expected and observed frequencies both add up to 146. They must always be the same.
The null hypothesis for this test is that the observed frequencies do fit the expected distribution.
The alternative hypothesis is that the observed frequencies do not fit the expected distribution.
To perform a χ² goodness of fit test, you use your calculator:
Enter in L1 the observed frequencies
Enter in L2 the expected frequencies
Find the χ2 GOF-Test on your calculator, with
Observed: L1
Expected: L2
df: n−1, where n is the number of categories (2 in our case)
The calculator returns the following:
χ2≈9.24
p≈0.00986
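For readers curious about what the calculator is doing, here is a rough sketch in Python. The frequencies are hypothetical (they are not the race data), the function names are made up for illustration, and the p-value shortcut exp(−χ²/2) is only valid when there are exactly 2 degrees of freedom, as in the example above.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if sum(observed) != sum(expected):
        raise ValueError("observed and expected totals must match")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df2(chi2):
    """Upper-tail probability of a chi-square distribution with df = 2 only."""
    return math.exp(-chi2 / 2)

# Hypothetical frequencies for 3 categories (df = 3 - 1 = 2).
observed = [30, 50, 20]
expected = [25, 50, 25]

chi2 = chi_square_statistic(observed, expected)
print(chi2, p_value_df2(chi2))

# Plugging in the calculator's statistic gives a p-value close to 0.00986.
print(p_value_df2(9.24))
```

For any other number of degrees of freedom the p-value has no such simple formula, which is why the calculator (or a statistics library) is normally used.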
The degrees of freedom in a dataset is the number of values that can change while keeping the total sum constant. If there are n values in a list, the number of degrees of freedom is n−1.
The degrees of freedom are important because with more values, there will naturally be more total variation between actual and expected values. The calculator needs to account for this.
The critical value for a χ² test is a threshold we are given, against which we compare the value of χ² for our data. If our χ² is larger than the critical value, we reject H0.
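As an illustration of the critical value approach, the sketch below compares the χ² value from the race example with a critical value. It assumes a 5% significance level (not stated in the example) and relies on the fact that, for 2 degrees of freedom only, the critical value equals −2 ln(α). In an exam the critical value is given to you.

```python
import math

alpha = 0.05   # assumed significance level
chi2 = 9.24    # chi-square statistic from the race example

# For df = 2 only, the critical value c solves exp(-c / 2) = alpha.
critical_value = -2 * math.log(alpha)

# Reject H0 when the statistic exceeds the critical value.
reject_h0 = chi2 > critical_value
print(round(critical_value, 2), reject_h0)
```

Comparing χ² with the critical value and comparing the p-value with α always lead to the same decision.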
A χ² test can also be used to test whether categorical variables are related. For example, does favorite movie depend on gender? It works by measuring how far the observed data are from what we would expect if the variables were not related (H0).
In a χ² test for independence:
The null hypothesis H0 is that the categories are independent (not related).
The alternative hypothesis H1 is that the categories are not independent (they are related).
On a calculator:
Enter the observed frequencies in a matrix (table)
Enter the expected frequencies in a separate matrix, or leave that matrix empty if they are not given.
Navigate to χ²-Test on your calculator, and select the observed and expected matrices. If the expected matrix is empty, the calculator will fill in the expected values itself.
The calculator returns the χ² value and the p-value.
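When the expected matrix is left empty, the calculator works out each expected frequency as (row total × column total) / grand total. The sketch below shows that calculation for a hypothetical 2 × 3 table; the data and function names are invented for illustration. The degrees of freedom for a table are (rows − 1) × (columns − 1), and the p-value line uses the exp(−χ²/2) shortcut, which only applies when that number is 2.

```python
import math

def expected_frequencies(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a table."""
    expected = expected_frequencies(observed)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical table: rows are two groups, columns are three movie genres.
observed = [[20, 30, 50],
            [30, 30, 40]]

chi2, df = chi_square_independence(observed)
# The shortcut below is valid for df = 2 only; otherwise use a calculator.
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(expected_frequencies(observed), chi2, df, p_value)
```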
A t-test is a technique that tests whether two means are significantly different. It works by measuring how different the mean of a sample is from another mean, and comparing that difference to the variance in the sample.
The null hypothesis is that the two means are equal, H0: μ=μ0.
We can have any of the following alternative hypotheses:
H1: μ<μ0 - testing whether our sample has a lower mean than what we're comparing it to
H1: μ>μ0 - testing whether our sample has a higher mean than what we're comparing it to
H1: μ≠μ0 - testing whether our sample has a different mean than what we're comparing it to
The first two alternative hypotheses are called one-tailed because they test for a difference in a specific direction (one mean greater or smaller than the other). The third is called two-tailed because it tests for a difference in either direction.
We can use a t-test to determine whether
patients at a certain hospital have a significantly faster (lower) mean recovery time than the national average. This is a one-tailed test.
trout in lake A have a significantly different mean weight than trout in lake B. This is a two-tailed test.
We can perform a t-test for a single sample against a known mean on a calculator:
Enter the sample data into a list.
Navigate to T-Test on a calculator.
Select "DATA" and enter the name of the list where sample is stored.
Select the tail type depending on what our alternative hypothesis is (μ0 is the population mean):
≠μ0 for a change in mean
<μ0 for a decrease in mean
>μ0 for an increase in mean
Hit calculate, and interpret the p-value as usual.
Assumptions: This test assumes that the data are independent, randomly sampled, and approximately normally distributed. IB questions will specifically ask you to state the assumption of normally distributed variables.
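The t-value the calculator reports for a single sample is the difference between the sample mean and μ0, divided by the standard error s/√n. Here is a minimal sketch of that statistic in Python, using hypothetical recovery times and a hypothetical population mean. It stops at the t-value and degrees of freedom; turning these into a p-value needs the t-distribution, which is what the calculator provides.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t-statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical recovery times (days) and a hypothetical national average.
sample = [12, 14, 16, 18, 20]
mu0 = 18

t, df = one_sample_t(sample, mu0)
print(t, df)
```

A negative t-value means the sample mean is below μ0, which is the direction a "<μ0" alternative hypothesis is looking for.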
To compare the means of two samples using a t-test, we use a calculator:
Enter each sample in its own list.
Navigate to 2-SampTTest.
Select "Data", then enter the names of the lists containing the samples.
Select the tail type depending on what our alternative hypothesis is:
≠μ2 for different means
<μ2 for first list mean smaller than second
>μ2 for first list mean greater than second
Set "Pooled" to true.
The calculator reports the t-value and p-value, which we interpret as usual.
Assumptions: This test assumes that the data are approximately normally distributed, and that both samples have the same variance. IB questions will specifically ask you to state these assumptions.
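Setting "Pooled" to true means the two sample variances are combined into a single estimate, which is where the equal variance assumption comes in. The sketch below shows the standard pooled t-statistic on hypothetical trout weights; the data and the function name are illustrative only, and the p-value is again left to the calculator.

```python
import math
import statistics

def pooled_two_sample_t(sample1, sample2):
    """Return the pooled t-statistic and degrees of freedom for H0: mu1 = mu2."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical trout weights (kg) from two lakes.
lake_a = [1, 2, 3, 4, 5]
lake_b = [3, 4, 5, 6, 7]

t, df = pooled_two_sample_t(lake_a, lake_b)
print(t, df)
```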