This tutorial covers data analysis. You will learn about the four key features of a data set: shape, center, spread, and outliers.
Data analysis is what we do once we've collected our data.
Data Analysis
The understanding of the key features of a set of data - shape, center, spread, and outliers.
In this lesson, we will use data analysis to identify the trends, or key features, of a data set. Four components of data analysis are key: shape, center, spread, and outliers.
Shape is a qualitative notion: it tells us where most of the points lie in the distribution.
Shape
The qualitative description of the clustering of data points in a certain location when the data are graphed.
For instance, picture a graph with a single hump and a long, low stretch trailing off to the right. You would say that most of the data points are in the hump, where the line is highest on the y-axis.
There are not many data points on the far right side, in what we'd call the tail of the graph.
Shapes can be skewed either to the left or to the right. The direction of the skew is the side on which the tail lies.
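Shape is described qualitatively, but the direction of a skew can also be checked with a number. The sketch below is one possible approach, not part of the original lesson: it computes the moment-based skewness (the average cubed z-score), which is positive when the tail is on the right and negative when the tail is on the left. The function name `skewness` and both data sets are hypothetical, invented only for illustration.

```python
import statistics


def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores.

    Positive suggests a tail on the right (skewed right);
    negative suggests a tail on the left (skewed left).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


# Hypothetical data: most values sit in a hump near 3, with one far to the right.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 12]

# Mirror image of the same data, so the tail is on the left instead.
left_tailed = [-x for x in right_tailed]

print("right-tailed skewness:", skewness(right_tailed))
print("left-tailed skewness:", skewness(left_tailed))
```

A value near zero would suggest a roughly symmetric shape. Graphing the data is still the most direct way to judge shape.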
Center is essentially what it sounds like: it's wherever the middle is.
Center
The “middle” of the data set. There are many measures of center.
There are a couple of different ways to measure center.
On a graph of the data, the different measurements of the middle can point to different spots.
Which one is the correct measure of center? They are all different measures, and each can be correct in different situations.
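To make the idea of several measures of center concrete, here is a minimal sketch using three common ones: the mean, the median, and the mode. The lesson does not say which measures its graph showed, so this choice is an assumption, and the data set is hypothetical.

```python
import statistics

# Hypothetical data set, invented for illustration only.
values = [2, 3, 3, 4, 5, 5, 5, 6, 21]

mean = statistics.mean(values)      # arithmetic average: 54 / 9 = 6
median = statistics.median(values)  # middle value of the sorted data: 5
mode = statistics.mode(values)      # most frequent value: 5

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

The single large value, 21, pulls the mean above the median. That is one reason different measures of center can disagree on the same data while each remains a valid description of the middle.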
Spread is a numerical value that describes how spread out the data points are.
Spread
The numerical description of how close the numbers are to the center.
Just as with center, there are several different measures of spread.
Two measures applied to the same data would give different, and equally correct, measurements of the spread.
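As an illustration, the sketch below computes three common measures of spread for one data set: the range, the interquartile range, and the standard deviation. The lesson does not name the measures it had in mind, so these are assumptions, and the data set is hypothetical.

```python
import statistics

# Hypothetical data set, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: distance between the largest and smallest values (9 - 2 = 7).
data_range = max(values) - min(values)

# Interquartile range: width of the middle half of the data.
# statistics.quantiles with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Population standard deviation: typical distance from the mean (2.0 here).
std_dev = statistics.pstdev(values)

print("range:", data_range)
print("interquartile range:", iqr)
print("standard deviation:", std_dev)
```

All three numbers describe the same data, and each answers a slightly different question about how far the values sit from the center.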
Outliers are important to look for.
Outliers
Points in a data set that are so high or so low as to be unusual, given the rest of the values.
Outliers are not simply the highest or lowest numbers in a data set. They lie very far above the next-highest number or very far below the next-lowest number.
Suppose that a small class took an exam.
Some students did very well on this test. In fact, most students scored in the 80s or the 90s. However, one person scored only 46. That 46 would be considered an outlier because it is so much lower than the rest of the pack.
Outliers are important data points because they are so high or low that they would be considered unusual.
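One common way to turn "unusually high or low" into a test is the 1.5 × IQR rule, which flags any value more than 1.5 interquartile ranges below the first quartile or above the third quartile. The lesson does not prescribe this rule, so treat it as one possible convention. The scores below are hypothetical, chosen only to match the description above (mostly 80s and 90s, with one 46); they are not the class's actual scores.

```python
import statistics

# Hypothetical exam scores: NOT the real class data, just a stand-in
# matching the description "mostly 80s and 90s, with one 46".
scores = [46, 81, 84, 85, 88, 90, 92, 95, 97]

# Quartiles of the data and the interquartile range.
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Fences: values outside this interval are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [s for s in scores if s < lower_fence or s > upper_fence]

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

With these stand-in scores, only the 46 falls outside the fences. The highest score, 97, is not flagged, which matches the point that an outlier is more than just the highest or lowest value.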
Data analysis consists of clearly describing four key elements: shape, center, spread, and outliers, if there are any. Shape has some standard descriptions, such as skewed to the left and skewed to the right. Center and spread each have several different measures, which are typically numbers. Outliers are values so far above or below the rest of the data set that they would be considered unusual.
Thank you and good luck!
Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS