In this lesson, we learn about different ways to visualize frequency data, including tables, histograms, and cumulative frequency diagrams.
Datasets can be represented in frequency tables, with one row listing the distinct values in the data and a second row giving each value's frequency, that is, the number of times it appears.
The mean of frequency data can be calculated using the formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i$$
If we take the above
Note that $f_i$ is the frequency of the value $x_i$, so $n = \sum_{i=1}^{k} f_i$ is just the total number of data points.
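As a rough illustration of this formula, the sketch below computes the mean of a small frequency table. The function name `frequency_mean` and the table values are made up for this example and do not come from the lesson.

```python
def frequency_mean(values, frequencies):
    """Mean of frequency data: sum of f_i * x_i divided by n = sum of f_i."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    n = sum(frequencies)  # total number of data points
    if n == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(values, frequencies)) / n


# Hypothetical table: 1 appears twice, 2 appears three times, 3 appears five times.
values = [1, 2, 3]
frequencies = [2, 3, 5]
print(frequency_mean(values, frequencies))  # (2*1 + 3*2 + 5*3) / 10
```

This gives the same result as listing all ten data points and averaging them directly.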
When data is continuous, we cannot have a column per possible value, as there are infinitely many.
Instead, we use a grouped frequency table to break up the data into specific intervals.
If all the intervals have equal size, then the modal class is the interval in which the most values fall.
We can also estimate the mean of grouped data by treating it as a discrete frequency table whose values are the mid-interval values, that is, the average of the lower and upper bounds of each interval.
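The modal class and the estimated mean can be sketched as follows. The intervals and frequencies are invented for illustration, the helper names are hypothetical, and the modal class lookup assumes all intervals have equal width, as stated above.

```python
def mid_interval_values(intervals):
    """Average of the lower and upper bound of each (lower, upper) interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]


def estimated_mean(intervals, frequencies):
    """Estimate the mean by treating each mid-interval value as the data value."""
    n = sum(frequencies)  # total number of data points
    mids = mid_interval_values(intervals)
    return sum(f * m for m, f in zip(mids, frequencies)) / n


def modal_class(intervals, frequencies):
    """Interval with the highest frequency (assumes equal-width intervals)."""
    return intervals[frequencies.index(max(frequencies))]


# Hypothetical grouped frequency table with equal-width intervals.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 9, 5, 2]

print(modal_class(intervals, frequencies))
print(estimated_mean(intervals, frequencies))
```

The result is only an estimate, because the exact values inside each interval are unknown.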
Grouped frequency tables with equal-width intervals can also be turned into histograms, which resemble bar graphs: draw a rectangle over each interval, with its base spanning the interval and its height equal to the frequency.
Cumulative frequency graphs are a powerful visual representation of continuous data.
The value of y at each point x on the curve represents the number of data points less than x.
We start with a grouped frequency table, and add a row for cumulative frequency, which is the number of items in an interval and all previous (lower) intervals. To plot the diagram, we make a point from each column. The x-coordinates are the upper bound of each interval, and the y-coordinates are the cumulative frequency.
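A minimal sketch of this construction is shown below, using made-up intervals and frequencies. The function name `cumulative_frequency_points` is hypothetical.

```python
from itertools import accumulate


def cumulative_frequency_points(intervals, frequencies):
    """Return (upper bound, cumulative frequency) pairs, one per interval."""
    cumulative = list(accumulate(frequencies))  # running totals of the frequencies
    upper_bounds = [upper for _, upper in intervals]
    return list(zip(upper_bounds, cumulative))


# Hypothetical grouped frequency table.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 9, 5, 2]

for x, y in cumulative_frequency_points(intervals, frequencies):
    print(x, y)
```

The last cumulative frequency always equals the total number of data points. Many diagrams also include a starting point at the lower bound of the first interval with cumulative frequency 0.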
Cumulative frequency diagrams can be used to find medians, quartiles, and percentiles.
In the same way that the first quartile, $Q_1$, is the value greater than a quarter (25%) of data values, the $k$th percentile is the value greater than $k\%$ of the data values.
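To read these values off the diagram, find the cumulative frequency below on the y-axis, where the max is the total frequency, then read across to the curve and down to the x-axis:
$Q_1$: $0.25 \times$ the max
Median: $0.5 \times$ the max
$Q_3$: $0.75 \times$ the max
$k$th percentile: $\frac{k}{100} \times$ the max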
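Reading a percentile off the diagram can be approximated in code. The sketch below assumes the plotted points are joined by straight line segments and that the list starts at the lowest bound with cumulative frequency 0. A smooth hand-drawn curve may give slightly different values. The function name and the data are hypothetical.

```python
def estimate_percentile(points, k):
    """Estimate the kth percentile from (x, cumulative frequency) points.

    Assumes points are sorted by x, start at cumulative frequency 0,
    and are joined by straight line segments.
    """
    total = points[-1][1]     # the max cumulative frequency
    target = k / 100 * total  # height to read across from on the y-axis
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Linear interpolation along the segment containing the target.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("k must be between 0 and 100")


# Hypothetical cumulative frequency points, including the starting point (0, 0).
points = [(0, 0), (10, 4), (20, 13), (30, 18), (40, 20)]

print("Q1:", estimate_percentile(points, 25))
print("Median:", estimate_percentile(points, 50))
print("Q3:", estimate_percentile(points, 75))
```

Here the median corresponds to a cumulative frequency of 10, which falls on the segment between x = 10 and x = 20.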