\setcounter{ExampleCounter}{1}
Now that we can build the regression line, we want to know what we can do with it, and how we should interpret it.
\subsection{Making Predictions}
The regression line gives a predicted $y$ value, $\hat{y}$, for each given $x$ value (within a reasonable range). Of course, this is only a prediction, and we expect the actual value to differ somewhat from it.
\begin{example}{Predicting}
The house price data (size $x$ in square feet, price $y$ in thousands of dollars) led to the following regression line:
\[\hat{y} = 0.0992x + 160.194.\]
\begin{enumerate}[(a)]
\item Predict the price of a home with 2700 square feet.\\
\[\hat{y} = 0.0992(2700) + 160.194 = 428.034\] or \$428,034.
\item Predict the price of a home with 4500 square feet.\\
\[\hat{y} = 0.0992(4500) + 160.194 = 606.594\] or \$606,594.
\end{enumerate}
\end{example}
\subsection{Interpreting the Predicted Value $\hat{y}$}
Which of those answers do you think will be a better prediction?
\begin{itemize}
\item The first one is called an \textbf{interpolation}, because it makes a prediction for an $x$ value \emph{inside} the range of the data.
\item The second one is called an \textbf{extrapolation}, because it makes a prediction for an $x$ value \emph{outside} the range of the data.
\item Generally, interpolations are more reliable than extrapolations.
\end{itemize}
\vfill
\pagebreak
\paragraph{More precisely:} $\hat{y}$ is what we expect the \emph{average} $y$ value to be for all the data points with a particular $x$ value. In the example above, we expect that the average price for homes with 2700 square feet will be \$428,034.
\begin{example}{Predicting}
The data on students' third test and final exam led to the following equation for the least-squares regression line:
\[\hat{y} = 4.827x - 175.513.\]
What final exam score would you predict for a student who scored 60 on the third test?\\
Our prediction is the same as the average score we expect for students who meet that criterion:
\[\hat{y} = 4.827(60) - 175.513 = 114.107\] out of 200 points.
\end{example}
\paragraph{Note:} Not every $x$ value that you can plug into the regression equation is a meaningful one. For instance, you could try predicting the final exam score of a student who got a 90 on the third test (even though the third test scores can only go up to 80), and the equation will dutifully give you a value. Just note that that value is meaningless; you need to use common sense when making predictions.
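\paragraph{To see this:} Plugging the impossible third test score of 90 into the regression line from the example above gives
\[\hat{y} = 4.827(90) - 175.513 = 258.917,\]
which is more than the 200 points available on the final exam. The equation produces a number, but not one that could be an actual score.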
\subsection{Interpreting the Slope}
\begin{itemize}
\item If the $x$ values of two data points differ by 1, we predict that their $y$ values will differ by the amount of the slope.
\item If the $x$ values of two data points differ by 2, we predict that their $y$ values will differ by twice the amount of the slope.
\item etc.
\end{itemize}
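\paragraph{Why this works:} As a sketch, write the regression line as $\hat{y} = bx + a$, where $b$ (the slope) and $a$ (the intercept) are names assumed here only for illustration. For two $x$ values $x_1$ and $x_2$, the predicted values differ by
\[\hat{y}_2 - \hat{y}_1 = (bx_2 + a) - (bx_1 + a) = b(x_2 - x_1).\]
The intercept cancels, so the predicted difference depends only on the slope and on the difference in $x$.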
\vfill
\pagebreak
\begin{example}{Using Slope}
Two houses differ in size by 300 square feet. How much would you expect their prices to differ?\\
All we have to do is multiply the difference in $x$ (difference in size) by the slope:
\[0.0992(300) = 29.76\]
Therefore, we expect these two houses to differ in price by \$29,760.
\end{example}
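\paragraph{Check:} As a quick check, take two sizes that are 300 square feet apart, say 2700 and 3000 square feet (these sizes are picked only for illustration), and use the house price line $\hat{y} = 0.0992x + 160.194$:
\[0.0992(3000) + 160.194 = 457.794, \qquad 0.0992(2700) + 160.194 = 428.034,\]
\[457.794 - 428.034 = 29.76.\]
This agrees with multiplying the slope by 300, and any other pair of sizes 300 square feet apart gives the same predicted difference.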
\paragraph{Note:} The slope doesn't mean that if $x$ \emph{changes} by 1, we expect $y$ to \emph{change} by the amount of the slope; it means that if we look at two different data points, then we can predict the difference in their $y$ values based on the difference in their $x$ values.
For example, if we developed a regression model to predict a person's height based on their weight, we couldn't say that if they lost weight, they'd suddenly shrink.
\begin{example}{Making Predictions}
At the final exam in a statistics class, the professor asks each student to indicate how many hours he or she studied for the exam. After grading the exam, the professor computes the least-squares regression line for predicting the final exam score from the number of hours studied. The equation of the line is $\hat{y} = 50 + 5x$.
\begin{enumerate}[(a)]
\item Antoine studied for 6 hours. What do you predict his exam score to be?
\[\hat{y} = 50 + 5(6) = 80\]
\item Emma studied for 3 hours longer than Jeremy did. How much higher do you predict Emma's score to be?
\[5(3) = 15\]
\end{enumerate}
\end{example}
\vfill
\pagebreak
\subsection{Interpreting the Intercept}
The $y$ intercept, mathematically, is the predicted $y$ value of a data point whose $x$ value is 0.
\begin{itemize}
\item Realistically, the $y$ intercept is only meaningful if a value of 0 for $x$ is feasible.
\item If only positive values or only negative values are meaningful for $x$, the $y$ intercept is not meaningful in context; it just makes the equation work.
\end{itemize}
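\paragraph{In symbols:} As a sketch, write the regression line as $\hat{y} = bx + a$, where $b$ and $a$ are names assumed here for the slope and the intercept. Setting $x = 0$ gives
\[\hat{y} = b(0) + a = a.\]
For instance, for the study-time line $\hat{y} = 50 + 5x$ from the earlier example,
\[\hat{y} = 50 + 5(0) = 50,\]
the predicted score for a student who studied 0 hours. Assuming that some students could report 0 hours, this is a case where the intercept can be read in context.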
\begin{example}{Interpreting the Intercept}
For each of the following scenarios, decide whether or not the $y$ intercept is meaningful in context.
\begin{enumerate}[(a)]
\item The house price example.
No, because it doesn't make sense to talk about a house with 0 square feet (or negative square feet).
\item The test score example.
No, because while it would be possible for a student to get a score of 0 on the third test, that intercept would predict that their final exam score would be $-175.513$, which is meaningless.
\item The least-squares regression line is $\hat{y}=1.98+0.039x$, where $x$ is the temperature in a freezer in degrees Fahrenheit, and $y$ is the time it takes to freeze a certain amount of water into ice.
Yes, this could be meaningful, since $0^\circ$F is a feasible freezer temperature: temperatures on the Fahrenheit scale can be positive or negative.
\item The least-squares regression line is $\hat{y}=-13.586+4.340x$, where $x$ represents the age of an elementary school student and $y$ represents the score on a standardized test.
No, because newborns are not in elementary school.
\end{enumerate}
\end{example}
\vfill
\pagebreak
\begin{example}{Linear Regression}
The following table lists the heights (in inches) and weights (in pounds) of 14 NFL quarterbacks in the 2009 season.
\begin{center}
\begin{tabular}{l c c}
Name & Height & Weight\\
\hline
Peyton Manning & 77 & 230\\
Tom Brady & 76 & 225\\
Ben Roethlisberger & 77 & 241\\
Drew Brees & 72 & 209\\
Eli Manning & 76 & 225\\
Carson Palmer & 77 & 235\\
Philip Rivers & 77 & 228\\
Kurt Warner & 74 & 214\\
Donovan McNabb & 74 & 240\\
Jay Cutler & 75 & 233\\
Tony Romo & 74 & 225\\
Matt Ryan & 76 & 220\\
Brett Favre & 74 & 222\\
Kyle Orton & 76 & 225\\
\end{tabular}
\end{center}
\begin{enumerate}[(a)]
\item Compute the regression line for predicting weight from height.
Using the calculator:
\[\hat{y} = 3.2723x - 20.0206\]
\item Calculate $r$, the correlation coefficient.
Again, using the calculator:
\[r=0.5628\]
\item Do you think this linear regression model is going to be an accurate one?
Not really: $r = 0.5628$ is not very close to 1, so the linear association is only moderate.
\item Is it possible to interpret the $y$-intercept?
No. A height of 0 inches is not feasible, and the intercept would give a negative weight.
\item If two quarterbacks differ in height by two inches, by how much would you expect their weight to differ?
\[3.2723(2) = 6.5446\] or about 6.5 pounds.
\item Predict the weight of a quarterback who is 74.5 inches tall.
\[\hat{y} = 3.2723(74.5) - 20.0206 \approx 223.8\ \mathrm{lb}\]
\item Does Tom Brady weigh more or less than the weight predicted by the regression line, based on his height?
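Less. For his height of 76 inches, the regression line predicts
\[\hat{y} = 3.2723(76) - 20.0206 \approx 228.7\]
pounds, and he actually weighs 225 pounds.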
\end{enumerate}
\end{example}
\vfill
\pagebreak
\begin{example}{Linear Regression}
A blood pressure measurement consists of two numbers: the systolic pressure, which is the maximum pressure taken when the heart is contracting, and the diastolic pressure, which is the minimum pressure taken at the beginning of the heartbeat. Blood pressures were measured (in millimeters of mercury, mmHg) for a sample of 16 adults.
\begin{center}
\begin{tabular}{l | c c c c c c c c}
Systolic & 134 & 115 & 113 & 123 & 119 & 118 & 130 & 116\\
Diastolic & 87 & 83 & 77 & 77 & 69 & 88 & 76 & 70\\
\\
\hline
\\
Systolic & 133 & 112 & 107 & 110 & 108 & 105 & 157 & 154\\
Diastolic & 91 & 75 & 71 & 74 & 69 & 66 & 103 & 94
\end{tabular}
\end{center}
\begin{enumerate}[(a)]
\item Calculate $r$, the correlation coefficient.
Using the calculator: \[r=0.8568\]
\item Do you think there is a strong linear association?
The $r$ value is above 0.8, so yes.
\item Compute the regression line for predicting the diastolic pressure from the systolic pressure.
\[\hat{y} = 0.5748x+9.1828\]
\item Is it possible to interpret the $y$-intercept?
No. A systolic pressure of 0 mmHg is not a feasible value for a patient.
\item If the systolic pressures of two patients differ by 10 mmHg, by how much would you predict their diastolic pressures will differ?
\[0.5748(10) = 5.748\] or about 5.7 mmHg.
\item Predict the diastolic pressure for a patient whose systolic pressure is 125 mmHg.
\[\hat{y} = 0.5748(125) + 9.1828 \approx 81.0\] mmHg.
\end{enumerate}
\end{example}
\vfill
\pagebreak