Are IQ Tests Biased?
Richard Niolon, Ph.D.
08/05

 
What is test bias?
Cleary (1968) offers that a test is biased when "the criterion score predicted from the common regression line is consistently too high or too low for members of the subgroup." Thus, bias is a difference in the accuracy of predictions about performance based on scores. Studies of the SAT as a predictor of first-year college grades showed that it overpredicted success for African American students, so while it was biased by this definition, it was not harmful, as it did not lead to screening out African Americans as potential students.
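Cleary's definition can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the group labels "A" and "B", the scores, and the criterion values are all invented, and the function names are hypothetical. It fits one common regression line to everyone, then averages actual minus predicted criterion within each group. A mean error below zero means the common line predicts too high for that group (overprediction); above zero means it predicts too low.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_prediction_error(records):
    """Return {group: mean(actual - predicted)} using the common line.

    records is a list of (group, test_score, criterion) tuples.
    """
    slope, intercept = fit_line([r[1] for r in records],
                                [r[2] for r in records])
    errors = {}
    for group, score, criterion in records:
        predicted = intercept + slope * score
        errors.setdefault(group, []).append(criterion - predicted)
    return {group: mean(errs) for group, errs in errors.items()}

# Invented data for illustration only: (group, test score, first-year GPA)
records = [
    ("A", 400, 2.2), ("A", 500, 2.7), ("A", 600, 3.2), ("A", 700, 3.7),
    ("B", 400, 1.8), ("B", 500, 2.3), ("B", 600, 2.8), ("B", 700, 3.3),
]
errors = mean_prediction_error(records)
for group, err in sorted(errors.items()):
    print(group, round(err, 2))
```

In this made-up data the common line consistently overpredicts for group B and underpredicts for group A, which is bias in Cleary's sense even though the test relates to the criterion with the same slope in both groups.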

Of course, this isn't the only definition of bias. Many argue that simple differences in the scores obtained themselves mark some bias when the differences are reliably produced between one group of people and another, even if there is evidence that the differences are "real." Thorndike and others in this camp argue for using "biased" tests, but with different cutoff scores for different groups. Mercer proposed that this be based on an analysis of the culture of the home.

Some skip all this, claiming that to assess test bias is pointless, as the tests measure what they do, and reliably so. Thus, the issue is user bias. Throwing out tests is like saying, "Sometimes bad drivers cause car accidents. We should get rid of all cars."

In Chapter 19, Sattler discusses some of the issues in IQ testing and bias. He lists the following arguments made against IQ tests:
1) IQ tests are culturally biased since they show differences between minority groups
 
Sattler argues that this has not been clearly shown. Where one minority group shows lower scores, the differences could be real. This could indicate a poorer educational system (differences in educational opportunities, poverty, neighborhoods, home life…), but this doesn't mean the test is biased. Tests, further, should not be abandoned, as they can be used to assess the impact of interventions, and spot deficiencies in teaching different groups. Further, returning to "judgment calls" would introduce even more bias.

When you look at mean score differences between groups on the WISC-R, there may be real differences, especially when SES confounds the data. Some recall studies showing that African Americans score 15 points lower on IQ tests than Caucasians, but when SES is controlled this drops to 5 points or less. This is to say that being poor or rich may have more of an impact on your IQ, and perhaps your intelligence (whatever that is), than your ethnicity.

Other efforts look at predictive validity. Most IQ tests predict performance on achievement tests very well. But, if achievement tests are biased too, then we would expect high predictive validity and this wouldn't rule out bias. However, some argue that if our culture does value some skills over others, then the test is still an accurate predictor of a person's ability to succeed in our culture. Thus, IQ and achievement tests could be culturally biased and heavily so, but their reflection of the dominant culture's values is desirable. To design truly "culture free" tests would be to design tests that don't measure anything.

Other efforts look at construct validity. The factor structure of the WAIS-III holds up with African American, Caucasian, and Hispanic children; thus, it is measuring the same thing in each child. Now, whether that is what you mean when you say intelligence or not is another question…

Other efforts focus upon test item bias. A CBS documentary singled out the item "What would you do if a child much smaller than you tried to pick a fight with you?" The documentary offered this as a culturally laden item from the WISC.

Rankings of items from easy to hard placed this item at number 42 for black children and 47 for white children; that is, black children got it "right" more often than white children. The percentage of children answering the question correctly was 73% for black children and 71% for white children. Thus, "eyeball evaluations" of item bias may not actually match the data. Only two items showed a significant difference on the WISC-R, and one was dropped from the WISC-III. For the WAIS-III, items were given to African American, Hispanic, and White people, and any items not answered equally by all three groups were dropped. Thus, by the time of its publication, the test contained no items that differentiated between blacks and whites.
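The kind of check described here, ranking items from easy to hard within each group and comparing the ranks, can be sketched in a few lines of Python. This is a rough screen under stated assumptions, not the procedure any test publisher actually used: the item names, the proportions correct, and the flagging threshold are all invented for illustration, and the function names are hypothetical.

```python
def difficulty_ranks(p_correct):
    """Rank items from easiest (rank 1) to hardest by proportion correct."""
    ordered = sorted(p_correct, key=lambda item: p_correct[item], reverse=True)
    return {item: rank for rank, item in enumerate(ordered, start=1)}

def rank_gaps(p_group1, p_group2):
    """Return {item: rank in group 1 minus rank in group 2}.

    A gap of 0 means the item sits at the same relative difficulty
    in both groups; a large gap marks the item for a closer look.
    """
    r1 = difficulty_ranks(p_group1)
    r2 = difficulty_ranks(p_group2)
    return {item: r1[item] - r2[item] for item in p_group1}

# Invented proportions correct for four hypothetical items
group1 = {"item_a": 0.90, "item_b": 0.73, "item_c": 0.60, "item_d": 0.40}
group2 = {"item_a": 0.88, "item_b": 0.71, "item_c": 0.45, "item_d": 0.50}

gaps = rank_gaps(group1, group2)
# The threshold of 1 is arbitrary and chosen only for this tiny example
flagged = [item for item, gap in gaps.items() if abs(gap) >= 1]
print(gaps)
print(flagged)
```

The point of working from the data rather than from the wording of an item is the same one made above: an item that looks culturally loaded may sit at nearly the same difficulty in both groups, while an item nobody suspected may not.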

This eyeballing is provocative though. Take the CBS item. Is this to say that black people teach their children that it is acceptable to hurt weaker people? Be careful about thinking in "culturally sensitive" ways that aren't.

2) National norms are unfair, since Caucasian kids are compared to Caucasian kids and African American or Hispanic kids are compared to Caucasian kids.
 
Sattler argues that the norms reflect societal levels of performance, and African Americans and Hispanics are represented in the norm sample at percentages equal to their share of the whole population. How would we alter this? Collect norms for different ethnic groups? What about people of mixed ancestry (which is really everybody)? How black is black, how white is white? Does rural, suburban, or city life make a difference?

Another issue is whether minority children who score lower need more or special education to compensate. Sattler raises the self-fulfilling prophecy problem: teachers who learn that children scored lower expect less from them, and so the children are less challenged to achieve. Lower scoring children get less experienced teachers (as the more experienced teachers work with the "bright" children), less one on one time and attention in class, and may not even be offered some opportunities for education and coursework.

This plays out in the children themselves ("effort optimism"), but if a minority child scores low, might not this reflect real deficits and the need for real intervention? I had a case where a child floored the test, and the worker and family described her as "slow and a little immature," but definitely not retarded. Does she need help she isn't getting?

Sattler also discusses differences in norms. The correct answer to the wallet question is "return it," and some say this is a cultural value that is consistent with religious, legal, and moral codes of conduct. If a different culture doesn't hold that value, then a difference in scores is not an issue of test bias but of cultural differences. The thing is, this value is one espoused by American culture. A test of the skills and basic levels of performance required for our society, used to predict that performance, should, to do its job, assess understanding of and adherence to prevailing values.

3) Minorities may not be culturally ready to take the test
 
Minority children may not appreciate the demands, achievement stimuli, time pressures, competitive edge required… and may not see the test in the same way. Sattler says there may be something to this, but we need data for it.

The effort optimism argument is salient. Gardner quotes a study in which African American students were told they would do more poorly, and they did, but when not told this, they performed the same as white kids.

Some argue that instructions like "Work as quickly as you can" may mean little to groups who don't view time as the mainstream culture does, or who don't see the value in rushing. While we have tried to remove all the culturally laden objects in the test, the act of verbally explaining some ideas might be a culturally practiced skill common to middle class parents and children. Prompting, "Yes, you do eat them, but they are also fruits," may also not be seen as a sign that you should change your thinking with some groups, but only as an alternative.

4) White people testing African Americans results in depression of scores
 
Sattler says the data doesn't support this in 25 of 29 published studies.


So are IQ tests biased? It depends. The answer is likely "No" if you limit interpretations to IQ scores and what they are shown to be, but "Yes" if you extend interpretations to "intelligence," whatever that is.