Two important characteristics of data are its central tendency and its variation. Central tendency is measured by the data’s mean, median and mode. Let’s explore more about the median.
Overview: What is the median?
The mean is often referred to as the mathematical center of the data. The mode is the most frequently occurring value in the data set. The median, much like the median strip that runs down the middle of a road, is described as the physical center of the data.
To identify the median value in your data set, you must first sort the data from high to low or low to high. For example, if your data set is 3, 7, 2, 9, 10, 8, 5, you can’t state that your median is 9 just because it sits in the middle of the list with three values before it and three after it.
First sort the data in numerical order so it reads 2, 3, 5, 7, 8, 9, 10. Now select the number in the middle, so that half of the remaining values are below it and half are above it. The middle, or median, value is 7, not 9.
With an odd number of values in your data set, the median is the center value. If you have an even number of values, the median is the average of the two middle values. For example, suppose we had a data set of 3, 7, 2, 9, 10, 8, 5, 8. Our first step would be to sort the data so it reads 2, 3, 5, 7, 8, 8, 9, 10. Since there is an even number of values, the two middle values are 7 and 8. The median is then the average of 7 and 8, or 7.5.
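The sort-then-pick-the-middle procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `median` is our own choice, and for real work Python’s standard library already provides `statistics.median`.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)   # step 1: sort the data low to high
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd count: the single center value
        return ordered[mid]
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median([3, 7, 2, 9, 10, 8, 5])       # 7
even_result = median([3, 7, 2, 9, 10, 8, 5, 8])   # 7.5
```

The function sorts a copy of the data, so the caller’s list is left in its original order.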
The advantage of the median is that, unlike the mean, it is not heavily affected by outliers. For example, if you had a data set of 3, 7, 2, 9, 10, 8, 5, 8, 150, the mean would be 22.44, compared with 6.5 for the same data without the 150. The median of the sorted data 2, 3, 5, 7, 8, 8, 9, 10, 150 would be 8, which is not significantly different from our previous example where the median was 7.5.
It is recommended that you compute both the mean and the median for your data. If there is a significant difference between them, look for an extreme value and find out what caused it. For example, if your mean is 22.44 but your median is 8, you might look for an unusually high number, since the mean appears to have been pulled toward the high side.
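As a rough sketch of this check, the snippet below computes both measures with Python’s standard `statistics` module and flags a large gap between them. The function name `compare_center` and the factor-of-two threshold are assumptions made for illustration only, not a standard rule, and the comparison assumes all values are positive.

```python
from statistics import mean, median


def compare_center(values, ratio=2.0):
    """Return (mean, median, flagged) for a list of positive numbers.

    `ratio` is a hypothetical rule of thumb: flag the data when one
    measure is more than `ratio` times the other.
    """
    m = mean(values)
    med = median(values)
    flagged = m > ratio * med or med > ratio * m
    return m, med, flagged


with_outlier = [3, 7, 2, 9, 10, 8, 5, 8, 150]
without_outlier = [3, 7, 2, 9, 10, 8, 5, 8]

m1, med1, flag1 = compare_center(with_outlier)     # mean about 22.44, median 8
m2, med2, flag2 = compare_center(without_outlier)  # mean 6.5, median 7.5

if flag1:
    # A mean far above the median suggests an unusually high value
    print("Check the largest value:", max(with_outlier))
```

What counts as a significant difference depends on your data, so treat the threshold as a starting point to tune rather than a fixed cutoff.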
An industry example of the median
The Manager of Shipping was concerned when he noticed the average delivery time for the month was significantly higher than in previous months. He sensed something was wrong because he had been computing both the mean and the median of his delivery times.
Since the mean was now much larger than the median, he knew to look for an unusually long delivery time. It didn’t take him long to realize that a single shipment had been delivered to the wrong location, and that this accounted for the big change in the average delivery time. While he continued to report the average delivery times, he tended to rely more on the median times to offset any negative impact on his monthly report from an unusual circumstance.
Frequently Asked Questions (FAQ) about the median
Is the mean a better measure of central tendency than the median?
Not really, since the mean can be skewed in one direction or the other by an unusually high or low value in your data. An outlier has little impact on the median.
Do I have to sort my data before determining the median value?
Yes. Your data must be sorted either high to low or low to high before determining which is the center value.
If I have an even number of data values, how do I compute the median value?
You just calculate the average of the two middle values.