Office of Institutional Resources and Research
Definitions and Descriptions of Statistics
Standard deviation (sd) - A measure of the variability or scatter of test scores. It is especially useful because of its relationship to the normal curve, a relationship that holds best when the number of students is relatively large and the scores are roughly normally distributed. On a normal curve, the interval extending one sd on each side of the mean includes approximately 68% of the students, two sds on each side includes about 95%, and three sds on each side includes about 99.7%.
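The normal-curve percentages quoted above can be checked with a short Python sketch. It uses only the standard library; the helper name `share_within` is hypothetical and chosen for illustration.

```python
from statistics import NormalDist

def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k sds of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Report the share of students expected within 1, 2 and 3 sds of the mean,
# assuming the scores follow a normal curve.
for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {share_within(k):.1%}")
```

Real classes rarely follow the normal curve exactly, so these figures are best read as approximations.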
Kuder-Richardson internal consistency formula number 20 (KR-20) - Used to compute the reliability estimate. A reliability coefficient of this type indicates how consistently the items on the test measure the same thing, and therefore the extent to which students would be expected to receive similar scores on an equivalent test. Values of the KR-20 estimate normally range between 0.0000 and +1.0000 (a negative value is possible but rare, and usually signals a problem with the items or the answer key). A value close to +1.0000 indicates that the test exhibits a high degree of reliability. Estimates should be interpreted cautiously if large numbers of students are unable to complete the test within the allotted time. For a typical 50-minute classroom examination covering related subject matter, a reliability coefficient of at least .70 is desirable. Reliability can be improved by revising items in light of the item analysis data. Lengthening the test (when this is practical) will also increase reliability, particularly in the case of short examinations.
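As an illustration, the standard KR-20 formula, (k / (k - 1)) * (1 - sum(p*q) / variance of total scores), can be sketched in Python as follows. This is a minimal sketch, not necessarily the exact computation behind the reported value. It assumes items are scored 1 (correct) or 0 (incorrect) and that the variance of total scores is the population variance. The function names and the response data are hypothetical. The second function applies the Spearman-Brown formula, a common way to project the effect of lengthening a test, on the assumption that the added items are comparable in quality to the existing ones.

```python
from statistics import pvariance

def kr20(responses: list[list[int]]) -> float:
    """KR-20 for a students-by-items matrix of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)  # assumption: population variance
    if n_items < 2 or total_var == 0:
        raise ValueError("need at least 2 items and some spread in total scores")
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)

def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when the test is made length_factor times as long."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical data: 5 students (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3))            # approximately 0.8
print(round(spearman_brown(0.60, 2.0), 3))  # doubling a test with reliability .60
```

In the second call, doubling a test whose reliability is .60 projects a reliability of about .75, which illustrates why lengthening helps most for short examinations.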
Standard error of measurement - An estimate of the probable extent of error in test scores. It is the standard deviation of the scores that an individual student would obtain if tested many times (with no new learning between testings and no memory of the questions), and it is interpreted in the same manner as a standard deviation. The more reliable and error-free the test, the smaller the standard error. Because it is expressed in the same units as the test scores, the standard error of measurement is especially useful when evaluating differences among students or assigning grades.
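The standard error of measurement is commonly computed as sd * sqrt(1 - reliability). The Python sketch below assumes that formula. The function name and the example values (an sd of 10 points, a reliability of .84, an observed score of 75) are hypothetical.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = sd * sqrt(1 - reliability), in the same units as the test scores."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical test: sd of 10 points and a reliability estimate of .84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
print(round(sem, 2))  # 4.0

# Band of one SEM on each side of a hypothetical observed score of 75.
observed = 75.0
print(round(observed - sem, 1), round(observed + sem, 1))  # 71.0 79.0
```

Under these assumptions, two students whose scores differ by less than a few points may not differ meaningfully, which is worth keeping in mind near grade cutoffs.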
Point Biserial - The item's discrimination index. The point biserial correlation between the item score (correct or incorrect) and the total test score ranges from -1.000 to +1.000. A positive value indicates that students scoring higher on the exam were more likely to answer the item correctly. Generally, a value of at least .20 is desirable. However, be aware that very easy or very difficult items tend to have low point biserial correlations, and items that 60% to 80% of students answer correctly generally have higher values.
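A minimal Python sketch of the point biserial calculation follows. It uses the textbook form (M1 - M0) / sd * sqrt(p * q), where M1 and M0 are the mean total scores of students who answered the item correctly and incorrectly. It assumes the total score includes the item itself and that sd is the population standard deviation; some item analyses exclude the item from the total, which this sketch does not do. The function name and the data are hypothetical.

```python
import math
from statistics import mean, pstdev

def point_biserial(item: list[int], totals: list[float]) -> float:
    """Correlation between a 0/1 item score and the total test score."""
    correct = [t for x, t in zip(item, totals) if x == 1]
    incorrect = [t for x, t in zip(item, totals) if x == 0]
    if not correct or not incorrect:
        raise ValueError("undefined when everyone or no one answers correctly")
    sd = pstdev(totals)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("undefined when all total scores are equal")
    p = len(correct) / len(item)  # proportion answering correctly
    return (mean(correct) - mean(incorrect)) / sd * math.sqrt(p * (1 - p))

# Hypothetical data: one item's 0/1 scores and the same five students' totals.
item = [1, 1, 1, 0, 0]
totals = [4, 3, 2, 1, 0]
print(round(point_biserial(item, totals), 3))  # 0.866
```

The value is high here because every student who answered the item correctly outscored every student who missed it.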
T Score - The student's score on this test relative to the group tested, expressed on a scale with a mean of 50 and a standard deviation of 10. A score of 50 corresponds to the average score, and a score of 60 means the student scored one standard deviation above the mean. T scores are helpful when several tests are averaged over a grading period to establish a grade, because they place every test on the same scale.
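The conversion described above amounts to T = 50 + 10 * (raw score - mean) / sd. The Python sketch below assumes the population standard deviation of the group is used; whether a given report uses the population or the sample form is not stated here. The function name and the scores are hypothetical.

```python
from statistics import mean, pstdev

def t_scores(raw_scores: list[float]) -> list[float]:
    """Rescale raw scores to a mean of 50 and a standard deviation of 10."""
    group_mean = mean(raw_scores)
    group_sd = pstdev(raw_scores)  # assumption: population standard deviation
    if group_sd == 0:
        raise ValueError("T scores are undefined when all scores are equal")
    return [50 + 10 * (x - group_mean) / group_sd for x in raw_scores]

# Hypothetical raw scores for three students.
raw = [60, 70, 80]
print([round(t, 1) for t in t_scores(raw)])
```

The student at the group mean (70) receives a T score of 50, and the other two fall equally far below and above it.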
P Value - The item's difficulty index: the proportion of students answering the item correctly. Values range from 0.00 to 1.00, and a higher value indicates an easier item.
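The difficulty index can be computed directly from scored responses. The Python sketch below assumes items are scored 1 (correct) or 0 (incorrect); the function name and the data are hypothetical.

```python
def item_difficulty(item_scores: list[int]) -> float:
    """Proportion of students answering the item correctly (the P value)."""
    if not item_scores:
        raise ValueError("no responses for this item")
    return sum(item_scores) / len(item_scores)

# Hypothetical data: 5 students (rows) by 3 items (columns).
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
# zip(*responses) regroups the matrix by item (column) rather than by student.
p_values = [item_difficulty(list(column)) for column in zip(*responses)]
print(p_values)  # [1.0, 0.6, 0.2]
```

In this made-up example the first item is answered correctly by everyone and the third by only one student in five.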