This tutorial introduces distributions: what they are, the common kinds, and how to choose one for your data.
A distribution is a way to visually show the values a variable takes and how often each of those values occurs.
Distribution
A way to visually display the values a variable takes and how often it takes each value.
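There are many kinds of distributions, including frequency tables, pie charts, bar graphs, dot plots, histograms, stem-and-leaf plots, and time series plots.
Don't worry about the terminology for now, because each of these is covered in its own tutorial.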
One way of showing a distribution is a frequency table. Take a set of pool balls: two are yellow, two are blue, two are red, et cetera, all the way down to the one black ball.
A frequency table shows how often the variable color takes each value, such as yellow.
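The frequency table described above can be sketched in a few lines of Python. This is only an illustration: it assumes a standard 15-ball pool set, so the colors beyond yellow, blue, red, and black are an assumption, and the names `balls` and `frequency` are made up for the example.

```python
from collections import Counter

# Assumed data: a standard 15-ball pool set, in which each of seven
# colors appears twice (one solid, one striped) and black appears once.
colors = ["yellow", "blue", "red", "purple", "orange", "green", "maroon"]
balls = colors * 2 + ["black"]

# Counter tallies how many times the variable "color" takes each value.
frequency = Counter(balls)

# Print the frequency table, one row per color.
print(f"{'Color':<8}Frequency")
for color, count in frequency.items():
    print(f"{color:<8}{count}")
```

Each row pairs a value of the variable with how often it occurs, which is exactly what the definition of a distribution asks for.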
Some distributions, like pie charts and bar graphs, are for qualitative data. The variable values in these distributions are categories.
There are also distributions that can be used for quantitative data. One very simple plot to make is called a dot plot, which stacks one dot per observation above its value on a number line. You may also find a histogram, a stem-and-leaf plot, and a time series plot.
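To make the idea of stacking dots concrete, here is a minimal text-only sketch of a dot plot. The function name `dot_plot` and the `scores` data are hypothetical, and the sketch assumes small, single-digit whole-number values so that the columns line up.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one mark per observation, stacked above its value."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    tallest = max(counts.values())
    rows = []
    # Build the plot from the top row down; a value gets a mark on a row
    # only if it occurs at least that many times.
    for level in range(tallest, 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in axis)
        rows.append(row.rstrip())
    # The bottom row is the number line itself.
    rows.append(" ".join(str(v) for v in axis))
    return "\n".join(rows)

# Hypothetical data, for illustration only.
scores = [3, 4, 4, 5, 5, 5, 6, 6, 8]
print(dot_plot(scores))
```

Because every observation becomes its own mark, this display keeps all of the data visible, which works well only when there are few values.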
Finally, a distribution might be described by some mathematical rule. For example, the heights of people might be described by a distribution that is single-peaked. This is a mathematical rule called the normal distribution. Or you might have something that follows the Poisson distribution. It's good to know that there are distributions that do, in fact, follow mathematical rules, and are not strictly data driven.
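As a rough illustration of what "described by a mathematical rule" means, the sketch below evaluates both rules mentioned above. The parameters (a mean height of 170 cm, a standard deviation of 10 cm, and an average of 2 events) are assumed purely for the example and do not come from this tutorial. The helper `poisson_pmf` is a hypothetical name.

```python
import math
from statistics import NormalDist

# Assumed, illustrative parameters: mean 170 cm, standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# The normal rule has a single peak at the mean and is symmetric around it.
for h in (150, 160, 170, 180, 190):
    print(f"height {h} cm: density {heights.pdf(h):.4f}")

def poisson_pmf(k, rate):
    """Probability of exactly k events when the average count is `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)

# Assumed average of 2 events per interval.
for k in range(6):
    print(f"{k} events: probability {poisson_pmf(k, 2):.4f}")
```

In both cases the values come from a formula rather than from counting observations, which is the sense in which these distributions are not strictly data driven.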
Why are there so many different kinds of distributions? The point of a distribution is to make the data, which may be a large and unwieldy data set, simpler to understand, both for yourself and for your readers. Different kinds of distributions therefore lend themselves better to different kinds of data sets.
A dot plot is better for small data sets whose values are close together, whereas certain other distributions are better for larger data sets. A histogram is better than a dot plot when the data are very spread out.
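One way to see why a histogram copes better with spread-out data is that it groups values into intervals (bins) instead of giving every value its own column. The sketch below is a minimal illustration, assuming non-negative whole-number data. The function name `histogram_counts`, the bin width of 20, and the data are all hypothetical.

```python
def histogram_counts(values, bin_width):
    """Count how many observations fall in each bin of the given width."""
    first = (min(values) // bin_width) * bin_width
    last = (max(values) // bin_width) * bin_width
    # Start every bin at zero so empty bins still appear in the result.
    counts = {left: 0 for left in range(first, last + 1, bin_width)}
    for v in values:
        counts[(v // bin_width) * bin_width] += 1
    return counts

# Hypothetical spread-out data, for illustration only.
data = [2, 7, 15, 18, 21, 33, 34, 38, 47, 52, 58, 71, 96]
width = 20
for left, count in histogram_counts(data, width).items():
    label = f"{left}-{left + width - 1}"
    print(f"{label:<6}{'#' * count}")
```

A dot plot of the same data would need a number line running from 2 to 96 with mostly empty columns, while the binned version summarizes it in five rows.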
You can determine which kind of distribution to use based on the kind of data you have.
Each distribution has a situation for which it is ideal. The data will tell us which distribution to use.
There are many types of distributions. The point of all of them is to display your data visually so the reader can quickly grasp what a large data set is saying. Some distributions show every observation or data point and some show only summaries, so you can match the distribution type to the data set. Each type of distribution discussed here can be explored further in its own tutorial.
Thank you and good luck!
Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS