Preparation
Pioneered by
Stanley Kaplan in 1946 with a 64-hour course, SAT
preparation has become a highly lucrative field. Many companies and organizations offer test preparation in the form of books, classes, online courses, and tutoring. The test preparation industry began almost simultaneously with the introduction of university entrance exams in the U.S. and flourished from the start. Test-preparation scams are a genuine problem for parents and students. In general, East Asian Americans, especially
Korean Americans, are the most likely to take private SAT preparation courses while
African Americans typically rely more on one-on-one tutoring for
remedial learning. Nevertheless,
the College Board maintains that the SAT is essentially uncoachable, and research by the College Board and the National Association for College Admission Counseling suggests that tutoring courses result in an average increase of about 20 points on the math section and 10 points on the verbal section. Indeed, researchers have shown time and again that preparation courses tend to offer at best a modest boost to test scores. Statisticians Ben Domingue and Derek C. Briggs examined data from the Education Longitudinal Survey of 2002 and found that the effects of coaching were only statistically significant for mathematics; moreover, coaching had a greater effect on some students than others, especially those who had taken rigorous courses and those of high socioeconomic status. A 2012
systematic literature review estimated a coaching effect of 23 and 32 points for the math and verbal tests, respectively. Meanwhile, a 2011 study found the effects of one-on-one tutoring to be minimal among all ethnic groups. Among East Asian Americans, Korean Americans are more likely to take SAT prep courses than
Chinese Americans, taking full advantage of their church communities and ethnic economy. The College Board announced a partnership with the non-profit organization
Khan Academy to offer free test-preparation materials starting in the 2015–16 academic year to help level the playing field for students from low-income families. The College Board also offers a test called the Preliminary SAT/National Merit Scholarship Qualifying Test (
PSAT/NMSQT), and there is some evidence that taking the PSAT at least once can help students do better on the SAT; moreover, as with the SAT, top scorers on the PSAT can earn scholarships.
Sleep hygiene also matters, as the quality of sleep in the days leading up to the exam can affect performance. Moreover, it has been shown that later school start times (8:30 am rather than 7:30 am), which better suit the shifted circadian rhythm of teenagers, can raise SAT scores enough to change the tier of the colleges and universities a student might be admitted to. In the wake of the COVID-19 pandemic, a large number of American colleges and universities decided to make standardized test scores
optional for prospective students. Nevertheless, many students still chose to take the SAT and to enroll in preparation programs, which continued to be profitable.
Predictive validity and powers
In 2009, education researchers Richard C. Atkinson and Saul Geiser from the
University of California (UC) system argued that high school GPA is better than the SAT at predicting college grades regardless of high school type or quality. In its 2020 report, the UC academic senate found that the SAT was better than high school GPA at predicting first-year GPA, and just as good as high school GPA at predicting undergraduate GPA, first-year retention, and graduation. This predictive validity was found to hold across demographic groups, with the report noting that standardized test scores were actually "better predictors of success for students who are underrepresented minority students (URMs), who are first-generation, or whose families are low-income." A series of College Board reports point to similar predictive validity across demographic groups. However, a month after the UC academic senate report, Saul Geiser rejected its findings as "spurious" because they omitted student demographics, arguing that once high school GPA is combined with demographics in the prediction, the SAT adds little.
Li Cai, a
UCLA professor who directs the
National Center for Research on Evaluation, Standards, and Student Testing, indicated that the UC Academic Senate did include student demographics, but used a different, simpler model that the public could understand, and that the discriminatory impacts of the SAT are compensated for during the admissions process.
Jesse Rothstein, a
UC Berkeley professor of public policy and economics, countered Li's claim, arguing that the UC academic senate had overestimated the value of the SAT. According to Rothstein, UC admissions policies did not "compensate" for group differences in test scores. However, by analyzing their own institutional data,
Brown,
Yale, and
Dartmouth universities reached the conclusion that SAT scores were more reliable predictors of collegiate success than GPA. Furthermore, the scores allowed them to identify
more potentially qualified students from disadvantaged backgrounds than they otherwise would. A 2019 study with a sample size of around a quarter of a million students suggested that together, SAT scores and high-school GPA offered an excellent predictor of freshman collegiate GPA and second-year retention. Furthermore, an admissions officer who failed to take average SAT scores into account would risk overestimating the future performance of a student from a low-scoring school and underestimating that of a student from a high-scoring school. Psychometricians Thomas R. Coyle and David R. Pillow showed in 2008 that the SAT predicts college GPA even after removing the general factor of intelligence (
g), with which it is highly correlated. Like other standardized tests such as the ACT or the GRE, the SAT is a traditional method for assessing the academic aptitude of students who have had vastly different educational experiences, and as such it focuses on the common material that students could reasonably be expected to have encountered over the course of their studies. Accordingly, the mathematics section contains no material above the
precalculus level, for instance. Psychologist
Raymond Cattell referred to this as testing for "historical" rather than "current"
crystallized intelligence. Psychologist
Scott Barry Kaufman further noted that the SAT can only measure a snapshot of a person's performance at a particular moment in time. Educational psychologists Jonathan Wai, David Lubinski, and Camilla Benbow observed that one way to increase the predictive validity of the SAT is by assessing the student's
spatial reasoning ability, as the SAT at present does not contain any questions to that effect. Spatial reasoning skills are important for success in STEM. A 2006 study led by psychometrician
Robert Sternberg found that the ability of SAT scores and high-school GPAs to predict collegiate performance could further be enhanced by additional assessments of analytical,
creative, and practical thinking. Psychologist
Nancy Etcoff observed that since better looking students were more likely to receive higher marks, standardized tests such as the SAT could help ensure fairness. Experimental psychologist Meredith Frey noted that while advances in education research and
neuroscience can help incrementally improve the ability to predict scholastic achievement in the future, the SAT or other standardized tests likely will remain a valuable tool to build upon. By comparison, South Korea's College Scholastic Ability Test (
CSAT) and Finland's
Matriculation Examination are both longer, tougher, and count for more towards the admissibility of a student to university. In many countries around the world, exams, including university entrance exams, are the sole deciding factor of admission; school grades are simply irrelevant. In an article from 2012, educational psychologist Jonathan Wai argued that the SAT was too easy to be useful to the most competitive of colleges and universities, whose applicants typically had brilliant high-school GPAs and standardized test scores. Admissions officers therefore had the burden of differentiating the top scorers from one another, not knowing whether or not the students' perfect or near-perfect scores truly reflected their scholastic aptitudes. He suggested that the College Board make the SAT more difficult, which would raise the measurement ceiling of the test, allowing the top schools to identify the best and brightest among the applicants. At that time, the College Board was already working on making the SAT tougher. After realizing the June 2018 test was easier than usual, the College Board made adjustments resulting in lower-than-expected scores, prompting complaints from the students, though some agreed this was to ensure fairness. In its analysis of the incident, the Princeton Review supported the idea of curving grades, but pointed out that the test was incapable of distinguishing students in the 86th percentile (650 points) or higher in mathematics. The Princeton Review also noted that this particular curve was unusual in that it offered no cushion against careless or last-minute mistakes for high-achieving students. The Review posted a similar blog post for the SAT of August 2019, when a similar incident happened and the College Board responded in the same manner, noting, "A student who misses two questions on an easier test should not get as good a score as a student who misses two questions on a hard test. Equating takes care of that issue." 
It also cautioned students against retaking the SAT immediately, for they might be disappointed again, and recommended that instead, they give themselves some "leeway" before trying again.
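The "equating" the Princeton Review refers to can be illustrated with linear equating, one standard textbook method; the College Board's actual procedure is more elaborate, and all numbers below are hypothetical:

```python
def linear_equate(raw, mean_new, sd_new, mean_ref, sd_ref):
    """Map a raw score on a new test form onto a reference form's scale
    so that equal standardized positions receive equal scaled scores."""
    return mean_ref + (raw - mean_new) * sd_ref / sd_new

# Hypothetical forms: the new form is easier (higher mean), so the same
# raw score converts to a lower score on the reference scale.
print(linear_equate(55, mean_new=40, sd_new=10, mean_ref=35, sd_ref=12))  # 53.0
print(linear_equate(40, mean_new=40, sd_new=10, mean_ref=35, sd_ref=12))  # 35.0
```

This is why two students who each miss two questions can end up with different scaled scores: the conversion depends on the difficulty of the form they took.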
Recognition
The College Board claims that outside of the United States, the SAT is considered for university admissions in approximately 70 countries, as of the 2023–24 academic year.
Association with general cognitive ability
In a 2000 study, psychometrician Ann M. Gallagher and her colleagues found that only the top students made use of
intuitive reasoning in solving problems encountered on the mathematics section of the SAT. Frey and Detterman (2004) investigated associations of SAT scores with intelligence test scores. Using an estimate of
general mental ability, or
g, based on the
Armed Services Vocational Aptitude Battery, they found SAT scores to be highly correlated with
g (r=.82, .857 when adjusted for non-linearity) in their sample, taken from a 1979 national probability survey. Additionally, they investigated the correlation between SAT results, using the revised and recentered form of the test, and scores on the
Raven's Advanced Progressive Matrices, a test of
fluid intelligence (reasoning), this time using a non-random sample. They found that the correlation of SAT results with scores on the Raven's Advanced Progressive Matrices was .483; they estimated that this correlation would have been about 0.72 were it not for the
restriction of ability range in the sample. They also noted that there appeared to be a
ceiling effect on the Raven's scores which may have suppressed the correlation. Beaujean and colleagues (2006) have reached similar conclusions to those reached by Frey and Detterman. Because the SAT is strongly correlated with general intelligence, it can be used as a proxy to measure intelligence, especially when the time-consuming traditional methods of assessment are unavailable. For decades many critics have accused designers of the verbal SAT of cultural bias as an explanation for the disparity in scores between poorer and wealthier test-takers, with the biggest critics coming from the University of California system. A famous example of this perceived bias in the SAT I was the
oarsman–
regatta analogy question, which is no longer part of the exam. The object of the question was to find the pair of terms that had the relationship most similar to the relationship between "runner" and "marathon". The correct answer was "oarsman" and "regatta". The choice of the correct answer was thought to have presupposed students' familiarity with
rowing, a sport popular with the wealthy. However, for psychometricians, analogy questions are a useful tool to gauge the mental abilities of students, for, even if the meaning of two words are unclear, a student with sufficiently strong analytical thinking skills should still be able to identify their relationships.
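The range-restriction adjustment described above can be sketched with Thorndike's Case II formula for direct range restriction; the SD ratio below is an assumed value chosen to reproduce the reported magnitude, not a figure taken from Frey and Detterman's paper:

```python
import math

def correct_range_restriction(r_restricted, sd_ratio):
    """Thorndike's Case II correction for direct range restriction.

    r_restricted -- correlation observed in the range-restricted sample
    sd_ratio     -- SD(unrestricted) / SD(restricted) on the selection variable
    """
    r, u = r_restricted, sd_ratio
    return (r * u) / math.sqrt(1 + r * r * (u * u - 1))

# Hypothetical illustration: an observed r of .483 combined with an assumed
# SD ratio of about 1.88 (chosen here for illustration) corrects to ~.72,
# the magnitude reported above.
print(round(correct_range_restriction(0.483, 1.88), 2))  # 0.72
print(correct_range_restriction(0.483, 1.0))             # sd_ratio of 1 leaves r unchanged
```

The correction grows with the SD ratio because selecting a narrow slice of ability (here, a non-random college sample) mechanically attenuates the observed correlation.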
Association with college or university majors and rankings
In 2010, physicists Stephen Hsu and James Schombert of the University of Oregon examined five years of student records at their school and discovered that the academic standing of students majoring in mathematics or physics (but not biology, English, sociology, or history) was strongly dependent on SAT mathematics scores. Students with SAT mathematics scores below 600 were highly unlikely to excel as a mathematics or physics major. Nevertheless, they found no such pattern for the SAT verbal score, or for the combined SAT verbal and mathematics score, with the other aforementioned subjects. In 2015, educational psychologist Jonathan Wai of Duke University analyzed average test scores from the
Army General Classification Test in 1946 (10,000 students), the Selective Service College Qualification Test in 1952 (38,420),
Project Talent in the early 1970s (400,000), the
Graduate Record Examination between 2002 and 2005 (over 1.2 million), and the SAT Math and Verbal in 2014 (1.6 million). Wai identified one consistent pattern: those with the highest test scores tended to pick the physical sciences and engineering as their majors while those with the lowest were more likely to choose education and agriculture. (See figure below.) A 2020 paper by Laura H. Gunn and her colleagues examining data from 1389 institutions across the United States revealed strong positive correlations between the average SAT percentiles of incoming students and the shares of graduates majoring in STEM and the social sciences. On the other hand, they found negative correlations between the former and the shares of graduates in psychology, theology, law enforcement, and recreation and fitness. Various researchers have established that average SAT or ACT scores and college ranking in the
U.S. News & World Report are highly correlated, almost 0.9. Between the 1980s and the 2010s, the U.S. population grew while universities and colleges did not expand their capacities as substantially. As a result, admissions rates fell considerably, meaning it has become more difficult to get admitted to a school whose alumni include one's parents. On top of that, high-scoring students nowadays are much more likely to leave their hometowns in pursuit of higher education at prestigious institutions. Consequently, standardized tests, such as the SAT, are a more reliable measure of selectivity than admissions rates. Still, when Michael J. Petrilli and Pedro Enamorado analyzed the SAT composite scores (math and verbal) of the incoming freshman classes of 1985 and 2016 at the top universities and liberal arts colleges in the United States, they found that the median scores of new students increased by 93 points for their sample, from 1216 to 1309. In particular, fourteen institutions saw an increase of at least 150 points, including the University of Notre Dame (from 1290 to 1440, or 150 points) and Elon College (from 952 to 1192, or 240 points).
Association with types of schooling
While there seems to be evidence that private schools tend to produce students who do better on standardized tests such as the ACT or the SAT, Keven Duncan and Jonathan Sandy showed, using data from the
National Longitudinal Surveys of Youth, that when student characteristics, such as age,
race, and
sex (7%),
family background (45%),
school quality (26%), and other factors were taken into account, the advantage of private schools diminished by 78%. The researchers concluded that students attending private schools already had the attributes associated with high scores on their own.
Association with educational and societal standings and outcomes
(Figures: 1995 SAT scores by family income and by parental education.)
Research from the University of California system published in 2001, analyzing data on their undergraduates from Fall 1996 through Fall 1999, inclusive, found that the SAT II was the single best predictor of collegiate success in the sense of freshman GPA, followed by high-school GPA, and finally the SAT I. After controlling for family income and parental education, the already low ability of the SAT to measure aptitude and college readiness fell sharply, while the more substantial predictive power of high-school GPA and the SAT II remained undiminished (and even slightly increased). The University of California system required both the SAT I and the SAT II from applicants during the four academic years of the study. This analysis was heavily publicized but is contradicted by many studies. SAT scores are correlated with family income; this finding has been replicated and shown to hold across racial or ethnic groups and for both sexes. In addition, the correlation is only significant between biological families, not adoptive ones, suggesting that this might be due to
genetic heritage, not economic wealth. According to the College Board, in 2019, 56% of the test takers had parents with a university degree, 27% parents with no more than a high-school diploma, and about 9% who did not graduate from high school. (8% did not respond to the question.) In their 2018 analysis of data from the
National Longitudinal Surveys of the
Bureau of Labor Statistics, economists Adam Blandin, Christopher Herrington, and Aaron Steelman concluded that family structure played an important role in determining educational outcomes in general and SAT scores in particular. Families with only one parent and no degrees were designated 1L, those with two parents but no degrees 2L, and those with two parents and at least one degree between them 2H. Children from 2H families held a significant advantage over those from 1L families, and this gap grew between 1990 and 2010. Because the median SAT composite scores (verbal and mathematics) for 2H families grew by 20 points while those of 1L families fell by one point, the gap between them increased by 21 points, or a fifth of one standard deviation.
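The gap arithmetic above can be checked directly. Note that the implied standard deviation of roughly 105 points is inferred here from the quoted figures, not taken from the study:

```python
# Back-of-the-envelope check of the 2H-vs-1L gap arithmetic.
growth_2h = 20           # change in median composite score, 2H families
growth_1l = -1           # change in median composite score, 1L families
gap = growth_2h - growth_1l
sd = 105                 # inferred: 21 points is a fifth of this SD (assumed)
effect_size = gap / sd
print(gap, round(effect_size, 2))  # 21 0.2
```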
Sex differences
In performance
In 2013, the American College Testing Board released a report stating that boys outperformed girls on the mathematics section of the test, a significant gap that has persisted for over 35 years. As of 2015, boys on average earned 32 points more than girls on the SAT mathematics section. Among those scoring in the 700–800 range, the male-to-female ratio was 1.6:1. In 2014, psychologist Stephen Ceci and his collaborators found that boys did better than girls across the percentiles. For example, a girl scoring in the top 10% of her sex would only be in the top 20% among the boys. In 2010, psychologist Jonathan Wai and his colleagues showed, by analyzing data from three decades involving 1.6 million intellectually gifted seventh graders from the Duke University Talent Identification Program (TIP), that in the 1980s the gender gap in the mathematics section of the SAT among students scoring in the top 0.01% was 13.5:1 in favor of boys but dropped to 3.8:1 by the 1990s. This ratio is similar to that observed for the ACT mathematics and science scores between the early 1990s and the late 2000s. Sex differences in SAT mathematics scores began making themselves apparent at the level of 400 points and above. Greater male variability has been found in body weight, height, and cognitive abilities across cultures, leading to a larger number of males in the lowest and highest distributions of testing. Consequently, a higher number of males are found in both the upper and lower extremes of the performance distributions of the mathematics sections of standardized tests such as the SAT, resulting in the observed gender discrepancy. Paradoxically, this is at odds with the tendency of girls to earn higher classroom grades than boys.
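The greater-male-variability point can be sketched numerically: with identical means, a modestly larger male standard deviation already produces a male-skewed far tail. All parameters below are hypothetical, chosen for illustration rather than taken from the studies cited above.

```python
import math

def tail_fraction(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical distributions: identical means, male SD 10% larger.
mean, sd_f, sd_m = 500.0, 100.0, 110.0
cutoff = 700.0  # two female standard deviations above the common mean

male_share = tail_fraction((cutoff - mean) / sd_m)
female_share = tail_fraction((cutoff - mean) / sd_f)
ratio = male_share / female_share
print(round(ratio, 2))  # a modest variance gap already skews the far tail male
```

Moving the cutoff further out makes the ratio grow, which is why variance differences matter most at the extremes of the distribution.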
In strategizing
Mathematical problems on the SAT can be broadly categorized into two groups: conventional and unconventional. Conventional problems can be handled routinely via familiar formulas or algorithms, while unconventional ones require more creative thought, either to make unusual use of familiar methods of solution or to come up with the specific insights necessary for solving them. In 2000, ETS psychometrician Ann M. Gallagher and her colleagues analyzed how students handled disclosed SAT mathematics questions in self-reports. They found that for both sexes, the most favored approach was to use formulas or algorithms learned in class. When that failed, however, males were more likely than females to identify the suitable methods of solution. Previous research suggested that males were more likely to explore unusual paths to solution whereas females tended to stick to what they had learned in class, and that females were more likely to identify the appropriate approaches if doing so required nothing more than mastery of classroom materials.
In confidence
Older versions of the SAT asked students how confident they were in their mathematical aptitude and verbal reasoning ability, specifically, whether or not they believed they were in the top 10%. Devin G. Pope analyzed data on over four million test takers from the late 1990s to the early 2000s and found that high scorers were more likely to be confident they were in the top 10%, with the top scorers reporting the highest levels of confidence. But there were some noticeable gaps between the sexes. Men tended to be much more confident in their mathematical aptitude than women. For example, among those who scored 700 on the mathematics section, 67% of men answered that they believed they were in the top 10% whereas only 56% of women did the same. Women, on the other hand, were slightly more confident in their verbal reasoning ability than men.
In glucose metabolism
Cognitive neuroscientists
Richard Haier and
Camilla Persson Benbow employed positron emission tomography (
PET) scans to investigate the rate of
glucose metabolism among students who have taken the SAT. They found that among men, those with higher SAT mathematics scores exhibited higher rates of glucose metabolism in the
temporal lobes than those with lower scores, contradicting the brain-efficiency hypothesis. This trend, however, was not found among women, for whom the researchers could not find any cortical regions associated with mathematical reasoning. Both sexes scored the same on average in their sample and had the same rates of cortical glucose metabolism overall. According to Haier and Benbow, this is evidence for the structural differences of the brain between the sexes.
Association with race and ethnicity
A 2001
meta-analysis of the results of 6,246,729 participants tested for cognitive ability or aptitude found a difference in average scores between black and white students of around 1.0
standard deviation, with comparable results for the SAT (2.4 million test takers). Similarly, on average, Hispanic and Amerindian students perform on the order of one standard deviation lower on the SAT than white and Asian students. Mathematics appears to be the more difficult part of the exam. In 2013, Asian Americans as a group scored 0.38 standard deviations higher than whites in the mathematics section. Some researchers believe that the difference in scores is closely related to the overall achievement gap in American society between students of different racial groups. This gap may be explainable in part by the fact that students of disadvantaged racial groups tend to go to schools that provide lower educational quality. This view is supported by evidence that the black-white gap is higher in cities and neighborhoods that are more racially segregated. Other research cites poorer minority proficiency in key coursework relevant to the SAT (English and math), as well as peer pressure against students who try to focus on their schoolwork ("
acting white"). Cultural issues are also evident among black students in wealthier households, with high achieving parents.
John Ogbu, a Nigerian-American professor of anthropology, concluded that instead of looking to their parents as role models, black youth chose other models like rappers and did not make an effort to be good students. One set of studies has reported differential item functioning, namely, that some test questions function differently based on the racial group of the test taker, reflecting differences between groups in the ability to understand certain test questions or to acquire the knowledge required to answer them. In 2003, Freedle published data showing that black students have had a slight advantage on the verbal questions that are labeled as difficult on the SAT, whereas white and Asian students tended to have a slight advantage on questions labeled as easy. Freedle argued that these findings suggest that "easy" test items use vocabulary that is easier to understand for white middle-class students than for minorities, who often use a different language in the home environment, whereas the difficult items use complex language learned only through lectures and textbooks, giving both student groups equal opportunities to acquire it. The study was severely criticized by the ETS board, but the findings were replicated in a subsequent study by Santelices and Wilson in 2010. There is no evidence that SAT scores systematically underestimate the future performance of minority students. However, the predictive validity of the SAT has been shown to depend on the dominant ethnic and racial composition of the college. Some studies have also shown that African-American students under-perform in college relative to their white peers with the same SAT scores; researchers have argued that this is likely because white students tend to benefit from social advantages outside of the educational environment (for example, high parental involvement in their education, inclusion in campus academic activities, positive bias from same-race teachers and peers) which result in better grades.
Christopher Jencks concludes that as a group, African Americans have been harmed by the introduction of standardized entrance exams such as the SAT. This, according to him, is not because the tests themselves are flawed, but because of labeling bias and selection bias; the tests measure the skills that African Americans are less likely to develop in their socialization, rather than the skills they are more likely to develop. Furthermore, standardized entrance exams are often labeled as tests of general ability, rather than of certain aspects of ability. Thus, a situation is produced in which African-American ability is consistently underestimated within the education and workplace environments, contributing in turn to selection bias against them which exacerbates underachievement. 2020 was the year in which education worldwide was
disrupted by the COVID-19 pandemic, and the performance of students in the United States on standardized tests, such as the SAT, suffered accordingly. Yet the gaps persisted. According to the College Board, in 2020, while 83% of Asian students met the benchmark of college readiness in reading and writing and 80% in mathematics, only 44% and 21% of black students did so in those respective categories. Among whites, 79% met the benchmark for reading and writing and 59% for mathematics. For Hispanics and Latinos, the numbers were 53% and 30%, respectively. (See figure below.) However, in 2021, in the wake of the COVID-19 pandemic and the
optional status of the SAT at many colleges and universities, only 1.5 million students took the test.
Use in non-collegiate contexts
By high-IQ societies
Certain
high IQ societies, like
Mensa,
Intertel, the Prometheus Society and the
Triple Nine Society, use scores from certain years as one of their admission tests. For instance, Intertel accepts scores (verbal and math combined) of at least 1300 on tests taken through January 1994; the Triple Nine Society accepts scores of 1450 or greater on SAT tests taken before April 1995, and scores of at least 1520 on tests taken between April 1995 and February 2005. Mensa accepts qualifying SAT scores earned on or before January 31, 1994.
By researchers
Because it is strongly correlated with general intelligence, the SAT has often been used as a proxy to measure intelligence by researchers, especially since 2004. A growing body of research indicates that SAT scores can predict individual success decades into the future, for example in terms of income and occupational achievements. A longitudinal study published in 2005 by educational psychologists Jonathan Wai, David Lubinski, and Camilla Benbow suggests that among the intellectually precocious (the top 1%), those with higher scores in the mathematics section of the SAT at the age of 12 were more likely to earn a PhD in the
STEM fields, to have a publication, to register a patent, or to secure university tenure. Consequently, the notion that beyond a certain point, differences in cognitive ability as measured by standardized tests such as the SAT cease to matter is gainsaid by the evidence. In the 2010 paper which showed that the sex gap in SAT mathematics scores had dropped dramatically between the early 1980s and the early 1990s but had persisted for the next two decades or so, Wai and his colleagues argued that "sex differences in abilities in the extreme right tail should not be dismissed as no longer part of the explanation for the dearth of women in math-intensive fields of science." Independent studies using
regression discontinuity reveal that students scoring just above or below the cutoff for admissions to exam or
magnet schools in the United States eventually earned statistically equal SAT or
AP scores and subsequently attended similarly prestigious universities and colleges. In other words, attending these schools has no real effect on the students' academic performance.
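The regression-discontinuity logic behind these studies can be sketched with simulated data (all numbers hypothetical): students just above and just below an admission cutoff are compared, and because admission itself adds nothing in this simulation, the fitted jump in the later outcome at the cutoff comes out near zero.

```python
import random

random.seed(1)

def linfit(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

cutoff = 600  # hypothetical admission cutoff on an entrance exam
entry = [random.gauss(600, 40) for _ in range(50_000)]
# The later outcome (e.g. an SAT score) tracks underlying ability plus noise;
# crossing the cutoff itself adds nothing, mirroring the reported null effect.
outcome = [e + random.gauss(0, 30) for e in entry]

window = 25  # compare students within 25 points of the cutoff
left = [(e, y) for e, y in zip(entry, outcome) if cutoff - window <= e < cutoff]
right = [(e, y) for e, y in zip(entry, outcome) if cutoff <= e < cutoff + window]

# Fit a line on each side and measure the discontinuity at the cutoff.
a_l, b_l = linfit(*zip(*left))
a_r, b_r = linfit(*zip(*right))
jump = (a_r + b_r * cutoff) - (a_l + b_l * cutoff)
print(round(jump, 1))  # close to zero: no admission effect to detect
```

If admission did raise later scores, the simulated jump would recover that effect; the point of the design is that students on either side of the cutoff are otherwise nearly identical.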
By employers
Cognitive ability is correlated with job training outcomes and job performance. As such, some employers rely on SAT scores to assess the suitability of a prospective recruit. Major companies and corporations have spent princely sums on learning how to avoid hiring errors and have decided that standardized test scores are a valuable tool in deciding whether or not a person is fit for the job. In some cases, a company might need to hire someone to handle proprietary materials of its own making, such as computer software. But since the ability to work with such materials cannot be assessed via external certification, it makes sense for such a firm to rely on something that is a proxy for general intelligence. Nevertheless, some other top employers, such as
Google, have eschewed the use of SAT or other standardized test scores unless the potential employee is a recent graduate. Google's
Laszlo Bock explained to
The New York Times, "We found that they don't predict anything." Educational psychologist Jonathan Wai suggested this might be due to the inability of the SAT to differentiate the intellectual capacities of those at the extreme right end of the distribution of intelligence. Wai told
The New York Times, "Today the SAT is actually too easy, and that's why Google doesn't see a correlation. Every single person they get through the door is a super-high scorer."