A remarkable similarity between IQ and SAT scores across ethnic groups

Chuck recently published IQ estimates for almost 30 ethnic groups and subgroups among the 10-year-old US children of the ABCD study. The post was an astounding hit. However, a few commenters complained that the sample sizes of some subgroups were small. I responded that if one could replicate the values and the rank order, one would have more confidence in these estimates. And this is exactly what we did here (full results available).

To replicate the ABCD analysis, we examined the SAT/ACT data from the NPSAS:UG using the NCES DataLab. Two recent waves were available for ethnicity comparisons, NPSAS:16 and NPSAS:20, with total sample sizes of 122,030 and 145,490, respectively. These samples are representative of the American college population. DataLab reports only the weighted sample sizes for subgroups, so we estimated the “real” subgroup sizes by multiplying each weighted size by the ratio of the unweighted to the weighted total sample size. This assumes similar weights across groups: if the weights of the ethnic groups differ markedly from each other, the sample size estimates will not be accurate.
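As a minimal sketch of the rescaling just described, the snippet below turns a weighted subgroup count into an estimated unweighted count. The function name and the weighted figures in the example are hypothetical and only illustrate the arithmetic; the unweighted total of 122,030 is the NPSAS:16 figure quoted above.

```python
def estimate_unweighted_n(weighted_subgroup_n, weighted_total_n, unweighted_total_n):
    """Rescale a weighted subgroup count to an approximate unweighted count.

    Assumes every group carries roughly the same average weight, which is
    exactly the caveat noted in the text.
    """
    ratio = unweighted_total_n / weighted_total_n
    return weighted_subgroup_n * ratio


# Hypothetical weighted counts, for illustration only (not taken from NPSAS).
weighted_total = 17_000_000
weighted_subgroup = 170_000

n_est = estimate_unweighted_n(weighted_subgroup, weighted_total, 122_030)
print(round(n_est))
```

If a subgroup was oversampled or undersampled relative to the others, its true unweighted count will differ from this estimate.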

The College Board used the ACT/SAT concordance to compute the composite SAT: “First based on agency-reported SAT scores; if not available, then the institution-reported SAT score was used. If neither was available, then an agency-reported ACT score converted to an SAT score was used; if not available, then an institution-reported ACT score converted to an SAT score was used.” While we refer to the scores as SAT scores, they represent a combination of both SAT and ACT scores.

The sample we used was restricted to US-born students. The SAT variable is a composite of the SAT math and reading-writing sections or, alternatively, the ACT composite converted to SAT-metric scores. The SAT scores were converted to an IQ scale using the SD of the total group (the only estimate that was available). The table below displays the IQ estimates from SAT scores, the IQ scores from the ABCD NIH Toolbox battery, and their differences.

Group	Averaged SAT (IQ-metric)	Averaged SAT (N-est.)	ABCD IQ (NIH-IQ)	ABCD IQ (N)	SAT-ABCD IQ Difference (SAT-2016)	SAT-ABCD IQ Difference (SAT-2020)	SAT-ABCD IQ Difference (SAT-average)
White & Asian Indian	109.42	323	109.62	44		-0.20	-0.20
Chinese	108.55	2469	111.32	81	-3.80	-1.81	-2.77
White & Chinese	108.16	804	105.16	77	5.06	1.99	3.00
Korean & Japanese	107.52	1198	110.05	33	-2.57	-2.50	-2.53
Asian Indian	107.11	1742	102.42	53	3.44	5.55	4.69
Mixed Asian	104.92	1208
White & Korean/Japanese	104.28	1435	106.65	78	0.25	-3.57	-2.37
White & Vietnamese	102.02	295
Vietnamese	101.57	1473	98.69	24	-0.90	6.00	2.88
White & Filipino	100.96	921	105.07	60	-3.74	-4.27	-4.11
White	100.00	131370	100.00	5858	0.00	0.00	0.00
Filipino	98.88	1447	103.53	51	-2.05	-7.12	-4.65
White & Pacific Islander	98.45	1491	99.66	25	0.81	-2.09	-1.21
Other Asian	98.41	2137	102.46	52	-4.68	-3.48	-4.05
White & Native American	96.74	3360	95.63	144	3.11	0.12	1.11
White Cuban	95.79	993	91.67	151	5.24	2.47	4.12
NH Black & White	94.68	6033	91.63	418	4.46	2.37	3.05
White Puerto Rican	94.37	3415	90.98	133	3.90	2.78	3.39
Other Hispanic	93.14	10567	91.29	518	1.79	1.91	1.85
Pacific Islander	92.96	912	96.06	17	-3.61	-2.76	-3.10
Mixed Hispanic	92.30	4750
White Mexican	91.90	19596	91.78	775	0.32	-0.10	0.12
Native American	91.47	1802	89.22	39	2.99	1.50	2.25
Other Cuban	90.91	170	89.69	30	1.80	0.12	1.22
Black & Other Puerto R.	90.85	1273	87.69	90	3.55	2.82	3.16
NH Black immigrant	90.46	5824	89.74	110	-0.14	1.96	0.72
Other Mexican	90.39	3763	88.79	460	1.43	1.80	1.60
USA Black	89.16	23974	82.98	1499	6.05	6.37	6.18
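
The conversion of SAT scores to the IQ metric described before the table can be sketched as follows. This is an illustration under assumptions, not our exact procedure: it takes the White mean as the reference point (the table sets White to 100) and divides by the total-group SD. The function name and the SAT values in the example are hypothetical.

```python
def sat_to_iq(group_mean, reference_mean, total_sd):
    """Express a group's mean SAT on an IQ scale (mean 100, SD 15).

    reference_mean: mean SAT of the reference group (assumed here to be White).
    total_sd: SD of SAT scores in the total group.
    """
    return 100 + 15 * (group_mean - reference_mean) / total_sd


# Hypothetical SAT means and SD, for illustration only.
print(sat_to_iq(group_mean=1000, reference_mean=1100, total_sd=200))
```

Because the total-group SD is larger than a within-group SD would be, gaps expressed this way are somewhat compressed.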

High rank-order similarity

There was a remarkably high rank-order correlation between IQ and SAT scores across the 25 subgroups with available comparisons (11 of them mixed-race groups): 0.92, 0.92 and 0.94 using SAT-2016, SAT-2020 and the SAT averaged across years, respectively. The similarity of IQ and SAT scores in the IQ metric is noticeable but less spectacular, as seen in the last columns under “SAT-ABCD IQ Difference”. Comparing either SAT-2016 or SAT-2020 scores with the ABCD IQ scores yields similar results, although a few subgroups (Vietnamese and Filipino) showed large SAT changes over time. The IQ and SAT scores typically differ by 2-4 points on the IQ scale. Black Americans are the exception, as they show a 6-point difference in both 2016 and 2020. As a result, the black-white gap is smaller in the SAT than in the NIH-IQ. The “overperformance” of blacks in the SAT may be due to attrition, i.e., dropping out of college. However, the black-white mixed-race group scores almost exactly midway between blacks and whites, as it does in the ABCD and other data sets.
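
For readers who want to check a rank-order correlation themselves, here is a minimal pure-Python sketch of Spearman's coefficient (Pearson correlation computed on ranks, with tied values given their average rank). The helper names are ours, and the example uses only five rows of the table above, so it will not reproduce the coefficients reported for all 25 subgroups.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Five rows from the table: Chinese, Asian Indian, White, Filipino, USA Black.
sat_iq = [108.55, 107.11, 100.00, 98.88, 89.16]
abcd_iq = [111.32, 102.42, 100.00, 103.53, 82.98]
print(spearman(sat_iq, abcd_iq))
```

With so few groups, swapping the order of two of them moves the coefficient a lot, which is one reason the full set of subgroups is more informative.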

The similarity between the IQ and SAT gap estimates and correlations is remarkable despite the difference in age: the ABCD sampled 9- to 10-year-old children in 2016, while the NPSAS:UG sampled undergraduate students around the same years. Another factor that could cause estimation discrepancies, and explain why the observed correlation falls below 1, is the small sample size of some ABCD subgroups.

Taken together, the similarity in correlations and standardized gaps between the SAT and IQ might be tentatively explained by their similar psychometric properties. However, such similarity reveals nothing about the absolute mean values: correlation deals with rank ordering, not with mean changes. This explains why SAT scores could go down during a period when IQ scores went up (Jensen, 1998, p. 322; Williams, 2013, p. 4). In our data, there was a huge score inflation between 2016 and 2020, which is still consistent with our finding of a high group correlation in both years.

Myths about the SAT

One possible explanation for the present result is that the SAT is a good proxy for IQ owing to their high correlation (Frey & Detterman, 2004; Beaujean et al., 2006; Koenig et al., 2008; Marks, 2022, Table 5). A common myth among critics is that the SAT primarily measures privilege. A meta-analysis of College Board data by Sackett et al. (2009, Table 4) found that, within a given school’s applicant pool, SES did not attenuate the correlation between SAT and grades (r=0.47), whereas the correlation between grades and SES was modest (r=0.19) and nearly disappeared once the SAT was controlled for (r=0.05). The respective correlations in their meta-analysis of independent studies were 0.36, 0.09 and 0.00. Another myth is that the SAT loses predictive power at high score levels. Jensen (1998, pp. 289, 336) and Kuncel & Hezlett (2010) reviewed evidence that the GPA-SAT relationship is linear and that the SAT predicts greater professional accomplishments within the top 1% of cognitive ability. Wai et al. (2005) found that higher quartiles of SAT scores within the top 1%, measured at age 13, predicted occupational accomplishments 20 years later. Wai (2013, Table 3) showed that higher SAT scores within the top 1% predicted higher wealth among billionaires and CEOs.
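
To make “controlled for” concrete, here is a sketch of the standard first-order partial correlation formula. It is an illustration only and not necessarily the exact procedure of Sackett et al. The grade-SES (0.19) and grade-SAT (0.47) correlations are the ones quoted above, but the SES-SAT correlation is not given in this post, so the value used below is a hypothetical placeholder.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after partialling z out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_grade_ses = 0.19  # quoted above
r_grade_sat = 0.47  # quoted above
r_ses_sat = 0.30    # HYPOTHETICAL value, chosen only to illustrate the formula

# Grade-SES correlation once SAT is held constant.
print(round(partial_corr(r_grade_ses, r_grade_sat, r_ses_sat), 3))
# Grade-SAT correlation once SES is held constant.
print(round(partial_corr(r_grade_sat, r_grade_ses, r_ses_sat), 3))
```

The general point survives any reasonable choice of the placeholder: when SES relates to grades mostly through its overlap with the SAT, removing the SAT leaves little SES-grade correlation, while removing SES barely dents the SAT-grade correlation.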

Robust to coaching effect

Another strong feature of the SAT is that coaching effects are quite modest. Powers & Rock (1998, Table 5) examined the 1995-96 administration of the SAT I and found gains of 6-8 and 13-18 points on the SAT-V and SAT-M, respectively, when discounting outlier estimates. They applied different methods to control for self-selection bias, including propensity matching, instrumental variable selection, ANCOVA, repeated measures, the Belson model and the Heckman model. Finally, they noted that their estimates are similar to those of an earlier meta-analysis of 48 studies, which reported 9 and 19 points for the SAT-V and SAT-M. Using the NELS:88, Briggs (2001, Table 5) estimated the coaching effect on SAT-math and SAT-verbal to be 14-15 and 6-8 points, respectively, when controlling for PSAT scores, GPA and SES as well as motivation and preparation activities. The result does not change after applying the Heckman model.

However, the effects depend on various factors. Briggs (2004, p. 223) noted that “coached students are more socioeconomically advantaged and more extrinsically motivated to take the SAT than uncoached students.” This time, he re-analyzed a subsample of the NELS:88 and reported new findings (pp. 226-229). The coaching effect on the SAT-V is stronger among high-SES than among middle-SES students (25 vs. 12 points). Coaching has a negative effect for Hispanics (-18 points), which is 31 points less than the effect for whites (13 points). The coaching effect is about 26 points for students who score very high on the NELS but turns negative for students who score below average on the NELS. The largest coaching effect is about 40 points, for students preparing for the SAT-M with commercial coaching and a private tutor, and about 18 points for students preparing for the SAT-V with a commercial course and computer software. Overall, the effect of commercial coaching is about 3-20 points on the SAT-V and 10-28 points on the SAT-M.

Domingue & Briggs (2009, Tables 2, 6-7) reported interesting results when they analyzed the SAT data in the ELS:02, using propensity score matching and other adjustments for PSAT differences and for the clustering of students at the school level. Coaching in the top SES quartile yields 15 and 9 points on the SAT-M and SAT-V, while coaching in the lower SES quartiles yields 5 and 2 points, respectively. The coaching effect is higher among Asians, by 5 and 14 points for the SAT-M and SAT-V. Among blacks, it is higher for the SAT-M by 15 points and lower for the SAT-V by 18 points. Overall, coaching produced gains of 11-22 points on the SAT-M and 6-8 points on the SAT-V. More recent studies are needed to confirm whether the surge in SAT scores after 2017 is due to coaching.

Stability and comparability of predictive validities

An important question is the comparability of predictive validities across races and over time. Marini et al. (2019, Table 4) found that the 2017 SAT correlates well with first-year GPA for all subgroups after correction for multivariate range restriction, although the correlation is lower for blacks (0.44), intermediate for whites (0.50) and higher for Asians (0.55). Similar corrected correlations were reported by Beard & Marini (2018, Table 8) for the 2013 SAT: 0.45, 0.53 and 0.53 for blacks, Asians and whites, respectively. The numbers for the 2012 SAT (Beard & Marini, 2015, Table 8) and the 2011 SAT (Patterson & Mattern, 2013, Table 8) were almost identical, at 0.47, 0.52, 0.54 and 0.47, 0.52, 0.55, respectively. Patterson & Mattern (2012, Table 8) reported correlations of 0.47, 0.51, 0.53 for the 2009 SAT, Patterson & Mattern (2011, Table 8) reported 0.47, 0.52, 0.54 for the 2008 SAT, and Mattern et al. (2008, Table 2) reported 0.47, 0.48, 0.53 for the 2006 SAT, again for blacks, Asians and whites, respectively. If there was any trend over time, it is that the correlation among Asians increased slightly. With regard to differential validities, Dalliard did raise concerns about the lack of proper adjustment (e.g., college selectivity, omitted variables) and the lack of follow-up years after the 2017 revision. However, Jensen (1980, p. 486) reported an old study by Cleary (1968), which found an average correlation of SAT-V and SAT-M with GPA of 0.47 for blacks and 0.47 for whites. These old numbers are not strikingly different from recent ones, despite multiple revisions of the SAT over time and despite the massive SAT decline between 1963 and 1980 (Murray & Herrnstein, 1992, Figure 1).

Earlier, I reviewed numerous studies on the cultural fairness of IQ tests, all using DIF techniques to detect bias. Typically, the DIFs were small and multidirectional for both IQ and SAT. The most recent DIF study on the SAT is from the 2017 SAT® Suite of Assessments Technical Manual Appendixes (Table A-5.16). The number and size of DIFs are negligible with respect to ethnic groups (blacks, Hispanics, Asians). However, it is not clear which methods were applied.

More education widens the SAT gap and IQ gap

Yet another similar pattern is worth mentioning. A final analysis compares the SAT scores by ethnic group across each of the nine levels of parental education: no high school, high school, vocational training, associate’s degree, some college, bachelor’s degree, master’s degree, PhD (professional) and PhD (scholarship). Earlier, I examined 8 data sets and found that the black-white IQ gap increases as education level increases, consistent with the hypothesis that more education does not reduce the IQ gap. This pattern was replicated in the SAT scores, for both years. The Asian-white gap may increase with education level, but the Hispanic-white gap does not. It is striking that Asians whose parents did not complete high school score 100 points above blacks with PhD parents.

The interval bars display the 95% confidence intervals.

References.

Beard, J., & Marini, J. P. (2015). Validity of the SAT for predicting first-year grades: 2012 SAT validity sample (College Board Statistical Report 2015-2). New York: College Board.
Beard, J., & Marini, J. (2018). Validity of the SAT® for Predicting First-Year Grades: 2013 SAT Validity Sample. College Board.
Beaujean, A. A., Firmin, M. W., Knoop, A. J., Michonski, J. D., Berry, T. P., & Lowrie, R. E. (2006). Validation of the Frey and Detterman (2004) IQ prediction equations using the Reynolds Intellectual Assessment Scales. Personality and Individual Differences, 41(2), 353-357.
Briggs, D. C. (2001). The effect of admissions test preparation: Evidence from NELS: 88. Chance, 14(1), 10-18.
Briggs, D. C. (2004). Evaluating SAT coaching: Gains, effects and self-selection. In Rethinking the SAT: Perspectives based on the November 2001 conference at the University of California, Santa Barbara, ed. by R. Zwick, 217-234. New York: Routledge Falmer.
Domingue, B., & Briggs, D. C. (2009). Using linear regression and propensity score matching to estimate the effect of coaching on the SAT. Multiple Linear Regression Viewpoints, 35(1), 12-29.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? The relationship between the scholastic assessment test and general cognitive ability. Psychological Science, 15(6), 373-378.
Koenig, K. A., Frey, M. C., & Detterman, D. K. (2008). ACT and general cognitive ability. Intelligence, 36(2), 153-160.
Kuncel, N. R., & Hezlett, S. A. (2010). Fact and fiction in cognitive ability testing for admissions and hiring decisions. Current Directions in Psychological Science, 19(6), 339-345.
Marini, J. P., Westrick, P. A., Young, L., Ng, H., Shmueli, D., & Shaw, E. J. (2019). Differential Validity and Prediction of the SAT®: Examining First-Year Grades and Retention to the Second Year. College Board.
Marks, G. N. (2022). Cognitive ability has powerful, widespread and robust effects on social stratification: Evidence from the 1979 and 1997 US National Longitudinal Surveys of Youth. Intelligence, 94, 101686.
Mattern, K. D., Patterson, B. F., Shaw, E. J., Kobrin, J. L., & Barbuti, S. M. (2008). Differential Validity and Prediction of the SAT. (College Board Research Report 2008-4). New York: College Board.
Patterson, B. F., & Mattern, K. D. (2011). Validity of the SAT for Predicting First-Year Grades: 2008 SAT Validity Sample. Statistical Report No. 2011-5. College Board.
Patterson, B. F., & Mattern, K. D. (2012). Validity of the SAT® for Predicting First-Year Grades: 2009 SAT Validity Sample. Statistical Report No. 2012-2. College Board.
Patterson, B. F., & Mattern, K. D. (2013). Validity of the SAT® for Predicting First-Year Grades: 2011 SAT Validity Sample. Statistical Report 2013-3. College Board.
Powers, D. E., & Rock, D. A. (1998). Effects of coaching on SAT® I: Reasoning scores. ETS Research Report Series, 1998(2), i-17.
Sackett, P. R., Kuncel, N. R., Arneson, J. J., Cooper, S. R., & Waters, S. D. (2009). Does socioeconomic status explain the relationship between admissions tests and post-secondary academic performance? Psychological Bulletin, 135(1), 1.
Wai, J. (2013). Investigating America’s elite: Cognitive ability, education, and sex differences. Intelligence, 41(4), 203-211.
Wai, J., Lubinski, D., & Benbow, C. P. (2005). Creativity and occupational accomplishments among intellectually precocious youths: an age 13 to age 33 longitudinal study. Journal of Educational Psychology, 97(3), 484.
