
The GRE: A test that fails


Every fall, seniors in the US take the Graduate Record Examination (GRE), and their scores are submitted along with their applications to grad school. Many professors, particularly those in physics departments, believe that the GRE is an important predictor of future success in grad school, and as a result many admissions committees employ score cutoffs in the early stages of their selection process. However, past and recent studies have shown that there is little correlation between GRE scores and future graduate school success.

The most recent study of this type was published in Nature Jobs. The authors, Casey Miller and Keivan Stassun, show that there are strong correlations between GRE scores and race/gender, with minorities and (US) white women scoring lower than their (US) white male counterparts. They conclude, "In simple terms, the GRE is a better indicator of sex and skin colour than of ability and ultimate success."

Here's the key figure from their article:




As the chair of the admissions committee for two years while I was at Caltech, and having served on admissions committees for the past five years, I can attest that these offsets and correlations persist for the Physics GRE as well. Indeed, over the past 20 years, applications to the Caltech Astronomy graduate program show a persistent 80-point offset between male and female applicants from the US.
GRE scores of male (black) and female (red) applicants to the Caltech Astronomy graduate program. The histograms have been normalized such that the peak bin is unity, for clarity. The dashed lines indicate the median score for male and female applicants (740 and 660, respectively).
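For readers unfamiliar with the normalization described in the caption, here is a minimal sketch of how a peak-normalized histogram and a median offset can be computed. The function name, bin edges, and scores below are invented for illustration. They are not the Caltech applicant data, and the scores were chosen only so that the medians match the values quoted in the caption.

```python
from statistics import median

def peak_normalized_histogram(scores, bin_edges):
    """Count scores in half-open bins [lo, hi), then scale so the tallest bin is 1."""
    counts = [0] * (len(bin_edges) - 1)
    for s in scores:
        for i in range(len(counts)):
            if bin_edges[i] <= s < bin_edges[i + 1]:
                counts[i] += 1
                break  # scores outside all bins are simply ignored
    peak = max(counts)
    return [c / peak for c in counts] if peak else [0.0] * len(counts)

# Hypothetical scores, made up for this example only.
group_a = [620, 700, 740, 740, 760, 800, 880]
group_b = [540, 600, 660, 660, 680, 720, 800]
edges = [500, 600, 700, 800, 900]  # assumed 100-point bins

hist_a = peak_normalized_histogram(group_a, edges)
hist_b = peak_normalized_histogram(group_b, edges)

# The offset between the two dashed (median) lines.
offset = median(group_a) - median(group_b)
print(hist_a, hist_b, offset)
```

Scaling each histogram by its own peak makes the two shapes easy to compare even when one group has far more applicants than the other, which is presumably why the figure was drawn that way.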
There are two possible explanations for this offset: either men and women are fundamentally different in their physics abilities, or GRE scores are not testing what we think they're testing. There are many reasons to favor the latter over the former. First, the male/female divide does not persist for applicants from all countries. For example, the gap does not appear for Chinese and Indian applicants, nor does it show up in applications from most Eastern European countries. Indeed, women from China and India score significantly and consistently higher than US women on the Physics GRE. (While I have hard data for the male/female offset in US grad applicants, I don't have access to my original dataset. Thus, this last point is based on my recollection of top applicants/scores.)

Another reason is that there are no scientific studies to back the assertion of male/female cognitive differences, and certainly none to explain the observed differences in GRE scores. This is assuming that GRE scores are indicative of the cognitive abilities that matter for success in graduate school, which is highly debatable, as explained in the Nature article cited above, and in other studies. Here's a figure showing the relationship (or lack thereof) between GRE Physics scores and performance in graduate course work in the Harvard Physics program, based on a study from 1996:


Then why are GRE scores such strong predictors of gender (and race/ethnicity)? The reason has been known to psychology researchers for decades: it is the phenomenon of stereotype threat (or identity threat). The basic idea is that if the identity of a test-taker corresponds to a group of people who are stereotypically "bad" at the skills being tested, they will subconsciously experience additional stress to not conform to that stereotype. If the student taking the Physics GRE is a US woman, she is from a society that has explicitly or implicitly taught her that women are poor at math and physics. As a result, when she sits down to take the exam, she is not only aiming for a good score for her own grad school admission prospects, but she is also under pressure not to perform as poorly as society expects her to. This additional stress has been demonstrated to cause deleterious physiological and cognitive effects in test takers.

But before you are tempted to attribute stereotype threat to a weakness inherent to racial and gender minorities, note that the phenomenon can be triggered in white men. From the abstract of Aronson et al. (1999):
The two experiments reported in this paper demonstrate that stereotype threat is a general phenomenon that can be experienced by members of any group depending on context. In Experiment 1, White males with high math SAT scores took a difficult math test. In one condition, students were given information suggesting that Asians typically outperform other students in math. Moreover, the students in this condition were told that the study was designed to identify the nature and scope of differences in performance between Asians and other groups in mathematics. In a second control condition there was no mention of Asians, only information suggesting that the task was designed to assess mathematical ability. Participants in the first condition performed significantly worse than students in the control condition. Experiment 2 replicated this finding but also showed moderation by identification with mathematics; only those students who were highly identified with mathematics performed more poorly under stereotype threat. These studies show that stereotype threat can undermine performance of any individual who has a strong identity in a domain when context highlights stereotypes suggestive of relatively poor performance in that domain.
If you are interested in learning more about identity threat, I encourage you to read Claude Steele's clear and entertaining book on the subject, Whistling Vivaldi, and/or check out this compendium of over 300 peer-reviewed journal articles.

There's not much more to say about the GRE. It is a deeply flawed metric for assessing future success in graduate school, and in my opinion as an astronomy professor, it should be dropped from the admissions process entirely. We at Harvard have downgraded the importance of GRE test scores in our admissions process, and the quality and diversity of our admitted students have increased as a result. Other schools around the country are doing or considering the same. I'll conclude with the conclusion of Miller & Stassun:
Let us be frank: we believe that many STEM faculty members on admissions committees and upper-level administrators hold a deep-seated and unfounded belief that these test scores are good measures of ability, of potential for doing well in graduate school and of long-term potential as a scientist, and that students who score poorly on standardized exams are not likely to become PhD-level scientists. These assumptions are false. 
This is not a call to admit unqualified students in the name of social good. This is a call to acknowledge that the typical weight given to GRE scores in admissions is disproportionate. If we diminish reliance on GRE and instead augment current admissions practices with proven markers of achievement, such as grit and diligence [link added by blogger], we will make our PhD programmes more inclusive and will more efficiently identify applicants with potential for long-term success as researchers. Isn't that what graduate school is about?

Comments

nancy john said…
Another interesting thing about the GRE: most colleges don't care! When I called one of my prospective graduate schools, they said that the GRE is usually the last thing they look at when deciding whether or not to accept you.

