    Linear Regression

    Regressions of y on x, regressions of x on y, the correlation coefficient r, the rank correlation coefficient r_s, and extrapolation and interpolation of data.

    Key Skills

    Plotting approximate best fit line
    SL 4.4

    Best fit lines can also be drawn approximately by eye. We start by finding the mean of the x-values and the mean of the y-values, giving the point (x̄, ȳ). We then place a ruler on this point and adjust its slope until the line fits the data reasonably well.
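
As a small worked illustration of finding the mean point, here is a minimal Python sketch. The function name `mean_point` and the five data pairs are made up for this example and do not come from the lesson.

```python
def mean_point(xs, ys):
    """Return the mean point (x-bar, y-bar) that the ruler should pass through."""
    return sum(xs) / len(xs), sum(ys) / len(ys)


# Hypothetical data set, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(mean_point(xs, ys))  # (3.0, 4.0)
```

Any line drawn by eye through (3, 4) is then a candidate best fit line for this data; only its slope remains to be judged.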



    Regression line y on x
    SL 4.4

    Linear regression is a statistical method used to model the relationship between two variables when data is given as pairs of points (x,y). We fit a straight line (called the regression line) that minimizes the sum of the squared vertical distances from the points to the line.



    The general equation of the regression line is:

    ​
    y=ax+b
    ​

    where ​a​ is the slope and ​b​ is the ​y​-intercept.


    The values of ​a​ and ​b​ can be found using a calculator:

    • Use Stat>Edit to enter the x- and y-values into L1 and L2.

    • Then, press Stat, right arrow to the CALC menu, and select 4:LinReg(ax+b).
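
For readers who want to see what the calculator is doing, the sketch below computes a and b with the standard least-squares formulas. It assumes the usual y-on-x least-squares line; the function name `linear_regression` and the data are hypothetical and not taken from the lesson.

```python
def linear_regression(xs, ys):
    """Return (a, b) for the least-squares regression line y = ax + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx          # slope
    b = y_bar - a * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


# Hypothetical data set, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = linear_regression(xs, ys)
print(f"y = {a:.3f}x + {b:.3f}")
```

For this data the slope is 0.6 and the intercept is 2.2, which should agree with the values a calculator reports for the same lists.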

    Pearson's Product-Moment Correlation Coefficient
    SL 4.4

    Pearson's product-moment correlation coefficient, denoted by ​r, measures the strength and direction of a linear relationship between two numerical variables ​x​ and ​y. Its value always lies between ​−1​ and ​+1:

    • ​r=+1: perfect positive linear relationship

    • ​r=−1: perfect negative linear relationship

    • ​r=0: no linear relationship

    A positive value means ​y​ generally increases as ​x​ increases; a negative value means ​y​ generally decreases as ​x​ increases. The closer ​r​ is to ​±1, the stronger the linear relationship.


    If you press MODE, scroll to STAT DIAGNOSTICS, highlight ON, and press ENTER, the calculator will report Pearson's coefficient alongside the regression line whenever you perform a linear regression.
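
The value of r can also be checked by hand. The following is a minimal sketch using the standard formula r = S_xy / sqrt(S_xx · S_yy); the function name `pearson_r` and the data are assumptions made for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's product-moment correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data set, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys), 3))
```

Points that lie exactly on a line with positive slope, such as (1, 2), (2, 4), (3, 6), give r = +1 up to rounding, while the scattered data above gives a positive value below 1.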

    Predicting y from x
    SL 4.4

    Once we have a regression line ​y=ax+b, we can use it to predict ​y​ by plugging in a value of ​x.
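
As a short illustration, the sketch below substitutes a value of x into a regression line. The coefficients a = 0.6 and b = 2.2 and the function name `predict_y` are hypothetical, used only to show the substitution.

```python
def predict_y(x, a, b):
    """Predict y from x using the regression line y = ax + b."""
    return a * x + b


# Hypothetical regression line y = 0.6x + 2.2.
a, b = 0.6, 2.2

prediction = predict_y(3.5, a, b)
print(round(prediction, 2))
```

Here the prediction is 0.6 × 3.5 + 2.2 = 4.3.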

    Danger of extrapolation
    SL 4.4

    When using a regression line to predict ​y​ from ​x, we need to be aware of the danger of extrapolation. This occurs when we try to predict ​y​ for a value of ​x​ far outside the range of ​x​ values in our data. For such an ​x, we cannot trust that the relationship is the same.
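
One simple way to guard against this is to compare the chosen x with the range of the data before predicting. The sketch below is a simplification: it flags any x outside the observed range, not only values far outside it, and the function name `is_extrapolation` and the data are made up for this example.

```python
def is_extrapolation(x, xs):
    """Return True if x lies outside the range of the observed x-values."""
    return x < min(xs) or x > max(xs)


# Hypothetical observed x-values.
xs = [1, 2, 3, 4, 5]

for x in (3.5, 10):
    kind = "extrapolation" if is_extrapolation(x, xs) else "interpolation"
    print(f"x = {x}: {kind}")
```

A prediction at x = 3.5 is an interpolation for this data, while a prediction at x = 10 is an extrapolation and should be treated with caution.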

    Limitations of predicting x from y
    SL 4.4

    While it is possible to use a regression line ​y=ax+b​ to predict ​x​ with

    ​
    x=(y−b)/a,
    ​

    this is not a reliable process. The best fit line is chosen to minimize the differences between the actual y-values and the predicted y-values, so the differences between the actual and predicted x-values may be much larger.
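
To see the difference numerically, the sketch below compares two ways of predicting x from y: rearranging the y-on-x line, and fitting a separate regression of x on y. The helper `fit`, the data, and the chosen value y = 5 are assumptions for illustration only.

```python
def fit(us, vs):
    """Least-squares regression of v on u: return (slope, intercept)."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    slope = s_uv / s_uu
    return slope, v_bar - slope * u_bar


# Hypothetical data set, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit(xs, ys)   # regression of y on x: y = ax + b
c, d = fit(ys, xs)   # regression of x on y: x = cy + d

y = 5
x_inverted = (y - b) / a   # rearranged y-on-x line
x_direct = c * y + d       # x-on-y regression line

print(round(x_inverted, 3), round(x_direct, 3))
```

For this data the rearranged line gives x ≈ 4.667 while the x-on-y regression gives x = 4, so the two approaches disagree unless the points lie exactly on a line.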