What Makes a Good Test?
There are three basic elements to look for when judging the quality of a psychological test — reliability, validity, and standardization.
RELIABILITY is a measure of the test’s consistency. A useful test is consistent over time. As an analogy, think of a bathroom scale. If it gives you one weight the first time you step on it and a different weight when you step on it a moment later, it is not reliable. Similarly, if an IQ test yields a score of 95 for an individual today and 130 next week, it is not reliable. Reliability can also be a measure of a test’s internal consistency. All of the items (questions) on a test should be measuring the same thing; from a statistical standpoint, the items should correlate with each other. Good tests have reliability coefficients that range from a low of about .65 to above .90 (the theoretical maximum is 1.00).
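As a rough illustration of the two kinds of consistency described above, the sketch below computes a test-retest correlation and an internal-consistency coefficient. It assumes Pearson's r for the test-retest figure and Cronbach's alpha for internal consistency; both are common choices, but the text above does not name a specific statistic. All scores and function names are made up for the example.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item,
    each list containing one score per person."""
    k = len(item_scores)
    # Total score for each person across all items
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var / variance(totals))


# Hypothetical data: five people tested twice (test-retest reliability)
first_testing = [95, 110, 102, 88, 120]
second_testing = [97, 108, 105, 90, 118]

# Hypothetical data: the same five people answering three items
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 5],
]

print("test-retest r:", round(pearson_r(first_testing, second_testing), 2))
print("Cronbach's alpha:", round(cronbach_alpha(items), 2))
```

With real data, each coefficient would be compared against the range quoted above (roughly .65 to above .90).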
VALIDITY is a measure of a test’s usefulness. Scores on the test should be related to some other behavior, reflective of personality, ability, or interest. For instance, a person who scores high on an IQ test would be expected to do well in school or on jobs requiring intelligence. A person who scores high on a scale of depression should be diagnosed as depressed by the mental health professionals who assess that person. A validity coefficient reflects the degree to which such relationships exist. Most tests have validity coefficients (correlations) of up to .30 with “real world” behavior. This is not a high correlation, and it emphasizes the need to use tests in conjunction with other information. Relatively low correlations mean that some people may score high on a scale of schizophrenia without being schizophrenic, and some people may score high on an IQ test and yet not do well in school. Higher values do occur: correlations as high as .50 are seen between IQ and academic performance.
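One common way to see why a correlation of .30 counts as modest is to square it. Under the standard statistical interpretation (not stated in the text above), the squared correlation approximates the proportion of variance the test scores share with the outcome. The short sketch below applies that to the two coefficients quoted above; the function name is hypothetical.

```python
def shared_variance(r):
    """Proportion of variance shared between test and criterion (r squared)."""
    return r ** 2


# The two validity coefficients mentioned in the text
for r in (0.30, 0.50):
    print(f"r = {r:.2f}: about {shared_variance(r):.0%} of variance shared")
```

Even at the higher figure, most of the variation in the outcome is unaccounted for by the test alone, which is why tests are used alongside other information.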
Jonathan Rich, Ph.D.
© 2024 Psychological Testing. All Rights Reserved.