When talking about univariate data, or one-variable data, you would discuss the shape, center, and spread of a distribution when making histograms and dot plots.
On a scatterplot, it's harder to talk about the shape. Center and spread are confusing too, because each variable has its own. Perhaps the QB salary is very spread out while the total salary is not so spread out, which makes it hard to give one description of the spread.
Instead, you're going to describe the form, direction, and strength of the scatterplot.
For the form, we look for a pattern. Is the pattern linear, or do the data show a curve? Do they start low, then peak, then end low? Or do they start low and end high? If they curve, do they rise quickly and then tail off? There's a lot to look at.
When discussing form, you will most likely describe a scatterplot as linear or non-linear.
|Forms of a Scatterplot|
|Linear: The scatterplot is approximating a line.|
|Non-Linear: The data points follow a curve.|
|No Association: Data points resemble a cloud and there is no clear pattern.|
In addition, it is important to consider outliers or clusters when looking at the form. If the scatterplot was essentially linear but had one outlier, we would want to note that. Likewise, if the data formed clusters within the scatterplot, that would be important to note.
The direction refers to how the y-axis variable responds as you move to the right along the x-axis. The two main directions a scatterplot can have are positive and negative.
|Direction of a Scatterplot|
|Positive: The variables go in the same direction (both increase or both decrease together)|
|Negative: The variables go in opposite directions (one variable increases while the other variable decreases)|
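As a rough illustration, direction can also be checked numerically. The sketch below is not part of the original tutorial: it assumes the sign of the sample covariance is an acceptable stand-in for direction, and the function name `direction` and the data values are hypothetical.

```python
def direction(xs, ys):
    """Classify direction from the sign of the sample covariance.

    Assumption: a positive covariance means y tends to rise as x rises,
    and a negative covariance means y tends to fall as x rises.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means, divided by n - 1
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    if cov > 0:
        return "positive"
    if cov < 0:
        return "negative"
    return "none"


# Hypothetical data, invented purely for illustration
x_values = [1, 2, 3, 4, 5]
y_values = [52, 60, 61, 70, 78]
print(direction(x_values, y_values))  # prints "positive"
```

A check like this only summarizes the overall trend; it is still worth looking at the plot itself, since a curved pattern can rise in one region and fall in another.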
The strength is how closely the two variables are associated with some line or curve. How well do the points follow that indicated form? How well do these points stack up on a line? The strength can be described as strong, moderate, or weak.
|Strengths of a Scatterplot|
|Strong: The scatterplot closely resembles the form. The data points are tightly clustered around either a line or a curve.|
|Moderate: The data points are less tightly clustered around a line or curve, but the direction is still clear.|
|Weak: The data points are much more spread out and the direction may be less clear.|
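For a linear form, one common numeric summary of strength is Pearson's correlation coefficient, which this tutorial does not name. The sketch below computes it by hand and maps it to the three labels above. The cutoffs 0.8 and 0.5 are assumptions chosen for illustration, not standard definitions, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def strength(r, strong=0.8, moderate=0.5):
    """Label strength from |r|. The cutoff values are illustrative assumptions."""
    size = abs(r)
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    return "weak"


# Hypothetical, perfectly linear data, invented for illustration
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(strength(r))  # prints "strong"
```

This measure only captures how closely points follow a line. Points that follow a curve tightly can still produce a small value, so check the form first.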
|Form, Direction, and Strength of a Scatterplot|
(1970 vs 1980 Seafood Prices)
The form is fairly linear. One point sits a little below the line suggested by the rest of the data points, and there also appears to be an outlier on the high side.
The direction is positive, which means that as the 1970 price increases, so does the 1980 price. That's not surprising, because you would expect the items that were less expensive in 1970 to also be less expensive in 1980, and the items that were more expensive in 1970 to be more expensive in 1980.
The strength is very strong: the points lie very close to a line, so the 1980 price is fairly predictable from the 1970 price.
Source: Adapted from Sophia tutorial by Jonathan Osters.