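This review covers the following topics: what statistics is, populations and samples, measures of central tendency (the mean, the median, and the mode), and quartiles.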
Although statistics is not a heavy topic in elementary and secondary school, it is playing an ever-increasing role in our society. What does it mean when they say that the average SAT score is 1200? How do they figure out the approval ratings of President Clinton? Did they go out and survey all of the people living in the United States? These are just a few examples of where statistics enters our lives.
Now, a formal definition of statistics. Statistics is a branch of mathematics that involves the collection, organization, interpretation, and presentation of data (information). The goal is to make some sort of inference from the data that you have collected (e.g., more than half of the class spent one hour on a math homework assignment).
In a statistical study, you can choose to obtain data either from a population (all of the people in the group) or from a sample (a subset of the people in the group, ideally chosen at random). Sampling is especially important when it is impracticable to obtain data from the whole population (e.g., everyone living in the United States).
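To make the difference between a population and a sample concrete, here is a minimal Python sketch. The homework times are made up for illustration (they are not data from this review), and `random.sample` is just one simple way to pick a sample in which every student is equally likely to be chosen.

```python
import random

# Hypothetical population: minutes spent on math homework by each of 30 students.
population = [35, 60, 45, 90, 20, 75, 60, 50, 65, 40,
              80, 55, 70, 30, 60, 95, 45, 65, 50, 85,
              25, 60, 70, 55, 40, 100, 65, 75, 50, 60]

def share_at_least_an_hour(times):
    """Fraction of the given times that are 60 minutes or more."""
    return sum(1 for t in times if t >= 60) / len(times)

random.seed(1998)  # fixed seed so the sketch gives the same sample every run

# Draw a simple random sample of 10 students, without replacement.
sample = random.sample(population, 10)

population_share = share_at_least_an_hour(population)
sample_share = share_at_least_an_hour(sample)

print("Share of the whole class at or above one hour:", round(population_share, 2))
print("Share of the sample at or above one hour:     ", round(sample_share, 2))
```

The sample share will usually be close to, but not exactly equal to, the population share. That gap is the price of asking only some of the people instead of all of them.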
A measure of central tendency is a single number that represents the value around which a group of numbers tends to cluster. The three most widely used measures of central tendency are the mean, the median, and the mode.
The Mean
The arithmetic mean, as its name implies, is the average of a set of numbers. Just add up all of the values and divide by the number of values you have added. For example, if Johnny got a 70 on his first test, a 79 on his second test, and a 76 on his third test, what is the mean of the three tests?
Mean = (70 + 79 + 76) / 3 = 225 / 3 = 75
Note that since all three values are very close to one another, it is not a surprise that the mean is very close to each of the values.
However, if some of the data are extremely large (or extremely small), the mean will be pulled away from the true central tendency. For example, suppose Johnny got a 22 on his last test instead of a 76. The mean of these three tests would then be:
Mean = (70 + 79 + 22) / 3 = 171 / 3 = 57
Does this mean that Johnny didn't do well in this class? Of course not; he only messed up one of the three tests. The "outlier" distorted the data. In this scenario, the mean did not give a useful picture of the central tendency of the data.
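The two calculations above can be checked with a short Python sketch. The function name `mean` and the variable names are my own choices for illustration.

```python
def mean(values):
    """Arithmetic mean: add up the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)

original_scores = [70, 79, 76]  # Johnny's three test scores
with_outlier = [70, 79, 22]     # the same tests, with a 22 on the last one

print(mean(original_scores))  # 75.0
print(mean(with_outlier))     # 57.0
```

A single low score drags the mean down by 18 points, even though the other two scores did not change.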
The Median
The median of a set of data is the middle value after the data values are arranged in increasing order. To find the median of a set of numbers, arrange them in increasing order and pick the middle number. If the total number of values (n) is odd, the median is the (n+1)/2 th number. If n is even, the median is the average of the two middle numbers; in other words, it is equal to ( n/2 th number + (n+2)/2 th number ) / 2. For example, if Mary gets the following scores on her homework assignments: 98, 90, 75, 85, 100, 50, 80, what can we say about the median of these homework scores?
Arrange the scores in increasing order
50, 75, 80, 85, 90, 98, 100
Since there are seven (an odd number of) values, the median is the middle entry, which is the fourth entry, and thus the median is 85. Note that three of the seven scores are lower than 85 and three of the seven scores are higher than 85. Another way of looking at the median is to remember that it divides a set of numbers into two groups that have the same number of values.
Suppose Mary got another homework assignment graded, and she received a 65.
Arrange the scores in increasing order
50, 65, 75, 80, 85, 90, 98, 100
Since there are now eight (even number of) values, the median is the average of the two middle entries, which are the fourth and the fifth entries, and thus the median is equal to (80 + 85) / 2, or 82.5. Note that four of the eight scores are lower than 82.5 and that four of the eight scores are higher than 82.5.
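Here is a minimal Python sketch of the odd/even rule, applied to Mary's homework scores. The function name `median` is my own; note that Python lists count positions from 0, so the (n+1)/2 th number sits at index n // 2 when n is odd.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the (n+1)/2 th number, which is index n // 2 when counting from 0.
        return ordered[middle]
    # Even n: average of the n/2 th and (n+2)/2 th numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2

homework = [98, 90, 75, 85, 100, 50, 80]

print(median(homework))         # 85
print(median(homework + [65]))  # 82.5
```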
The Mode
The mode of a set of data is the number that appears most often in the set. For example, suppose Steve asks 10 different people how many pins they can knock down with a bowling ball and obtains the following data:
6, 8, 6, 5, 9, 6, 4, 8, 5, 10
Since the number 6 occurs three times, and none of the other numbers occur more often than the number 6 (8 occurs twice, 5 occurs twice, 4 occurs once, 9 occurs once, and 10 occurs once), the mode of this set of numbers is 6.
Sometimes a set of numbers might contain more than one mode. Consider the set of numbers 8, 5, 9, 4, 8, 5.
The numbers 5 and 8 both appear twice in the set, and no other number appears more often. Both of them are modes of this set of data.
Also note that if all of the data appear only once in the set, then there is no mode. The set 2, 3, 5, 7, 11, 13, 17, 19 has no mode.
* The mode is the only measure of central tendency that might take more than one value as the answer. The other two measures (mean and median) always give a single value.
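The three cases above (one mode, two modes, no mode) can be sketched in Python as follows. The function name `modes` is my own, and returning an empty list for "no mode" is a choice made to match the convention used in this review. Python's built-in `statistics.multimode` behaves differently in that last case: it returns every value when each appears only once.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequent values, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value appears exactly once, so by this review's convention there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([6, 8, 6, 5, 9, 6, 4, 8, 5, 10]))  # [6]
print(modes([8, 5, 9, 4, 8, 5]))               # [5, 8]
print(modes([2, 3, 5, 7, 11, 13, 17, 19]))     # []
```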
A Trick for Remembering Mean, Median, and Mode
meAn = Average
Quartiles
Quartiles are numbers that separate a set of scores arranged in increasing order into four groups that contain the same number of scores.
Since there are 25 scores, the median of these scores is the 13th term in the set, which is 83; this is the second, or middle, quartile (Q2). The lower quartile (Q1) is the median of the twelve scores to the left of 83; hence, Q1 = (70 + 73) / 2 = 71.5. The upper quartile (Q3) is the median of the twelve scores to the right of 83; hence, Q3 = (90 + 92) / 2 = 91.
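The same "median of each half" procedure can be sketched in Python. Because the full list of 25 scores is not reproduced here, the example below uses Mary's seven homework scores from the median section instead. The function names are my own, and the sketch assumes the convention used above: when the number of scores is odd, the median itself is left out of both halves. Other textbooks and software use slightly different conventions and may give slightly different quartiles.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to split the data into halves")
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the median itself so both halves have the same size.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

print(quartiles([98, 90, 75, 85, 100, 50, 80]))  # (75, 85, 98)
```

With seven scores, the median (85) is the fourth value, Q1 is the median of the three scores below it (50, 75, 80), and Q3 is the median of the three scores above it (90, 98, 100).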
© July 1998 by Danny Chan