
The GRE: A test that fails

Every fall, seniors in the US take the Graduate Record Examination (GRE), and their scores are submitted along with their applications to grad school. Many professors, particularly those in physics departments, believe that the GRE is an important predictor of future success in grad school, and as a result many admissions committees employ score cutoffs in the early stages of their selection process. However, past and recent studies have shown that there is little correlation between GRE scores and future graduate school success.

The most recent study of this type was published in Nature Jobs. The authors, Casey Miller and Keivan Stassun, show that there are strong correlations between GRE scores and race/gender, with minorities and (US) white women scoring lower than their white male (US) counterparts. They conclude, "In simple terms, the GRE is a better indicator of sex and skin colour than of ability and ultimate success."

Here's the key figure from their article:

As the chair of the admissions committee for two years while I was at Caltech, and having served on admissions committees for the past five years, I can attest that these offsets and correlations persist for the Physics GRE as well. Indeed, over the past 20 years, applications to the Caltech Astronomy graduate program show a persistent 80-point offset between male and female applicants from the US.
GRE scores of male (black) and female (red) applicants to the Caltech Astronomy graduate program. The histograms have been normalized such that the peak bin is unity, for clarity. The dashed lines indicate the median score for male and female applicants (740 and 660, respectively).
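For readers unfamiliar with the normalization mentioned in the figure caption above, here is a minimal sketch of how a peak-normalized histogram and a median offset could be computed. The function name, the bin edges, and the score lists are all hypothetical: the scores are made-up placeholder values chosen only so that their medians match the 740 and 660 quoted in the caption, and they are not the Caltech applicant data.

```python
from statistics import median


def peak_normalized_histogram(scores, bin_edges):
    """Count scores per bin, then divide by the largest count so the peak bin equals 1."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for s in scores:
        for i in range(n_bins):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            # Bins are half-open [lo, hi); the last bin also includes its right edge.
            if lo <= s < hi or (i == n_bins - 1 and s == hi):
                counts[i] += 1
                break
    peak = max(counts)
    if peak == 0:
        return [0.0] * n_bins
    return [c / peak for c in counts]


# Placeholder scores for illustration only (NOT real applicant data).
group_a = [600, 650, 700, 740, 740, 780, 820, 900]
group_b = [520, 580, 640, 660, 660, 700, 760, 840]

# Assumed binning: 100-point bins from 500 to 1000.
bin_edges = list(range(500, 1001, 100))

hist_a = peak_normalized_histogram(group_a, bin_edges)
hist_b = peak_normalized_histogram(group_b, bin_edges)

# The offset between the two groups is the difference of their medians.
median_offset = median(group_a) - median(group_b)

print(hist_a)
print(hist_b)
print(median_offset)
```

Normalizing each histogram to its own peak lets two groups of very different sizes be compared on the same axes, which is presumably why the caption says it was done "for clarity." The medians are computed from the raw scores and are unaffected by the normalization.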
There are two possible reasons for this offset: either men and women are fundamentally different in their physics abilities, or GRE scores are not testing what we think they're testing. There are many reasons to favor the latter over the former. First, the male/female divide does not persist for applicants from all countries. For example, the gap does not appear among Chinese and Indian applicants, nor does it show up in applications from most Eastern European countries. Indeed, women from China and India score significantly and consistently higher than US women on the Physics GRE. (While I have hard data for the male/female offset in US grad applicants, I don't have access to my original dataset. Thus, this last point is based on my recollection of top applicants/scores.)

Another reason is that there are no scientific studies to back the assertion of male/female cognitive differences, and certainly none to explain the observed differences in GRE scores. This is assuming that GRE scores are indicative of the cognitive abilities that matter for success in graduate school, which is highly debatable, as explained in the Nature article cited above, and in other studies. Here's a figure showing the relationship (or lack thereof) between GRE Physics scores and performance in graduate course work in the Harvard Physics program, based on a study from 1996:

Then why are GRE scores such strong predictors of gender (and race/ethnicity)? The reason has been known to psychology researchers for decades, and it is known as the phenomenon of stereotype threat (or identity threat). The basic idea is that if the identity of a test-taker corresponds to a group of people who are stereotypically "bad" at the skills being tested, they will subconsciously experience additional stress to not conform to that stereotype. If the student taking the Physics GRE is a US woman, she is from a society that has explicitly or implicitly taught her that women are poor at math and physics. As a result, when she sits down to take the exam, she is not only aiming for a good score for her own grad school admission prospects, but she is also under pressure not to perform as poorly as society expects her to. This additional stress has been demonstrated to cause deleterious physiological and cognitive effects in test takers.

But before you are tempted to attribute stereotype threat to a weakness inherent to racial and gender minorities, note that the phenomenon can be triggered in white men. From the abstract of Aronson et al. (1999):
The two experiments reported in this paper demonstrate that stereotype threat is a general phenomenon that can be experienced by members of any group depending on context. In Experiment 1, White males with high math SAT scores took a difficult math test. In one condition, students were given information suggesting that Asians typically outperform other students in math. Moreover, the students in this condition were told that the study was designed to identify the nature and scope of differences in performance between Asians and other groups in mathematics. In a second control condition there was no mention of Asians, only information suggesting that the task was designed to assess mathematical ability. Participants in the first condition performed significantly worse than students in the control condition. Experiment 2 replicated this finding but also showed moderation by identification with mathematics; only those students who were highly identified with mathematics performed more poorly under stereotype threat. These studies show that stereotype threat can undermine performance of any individual who has a strong identity in a domain when context highlights stereotypes suggestive of relatively poor performance in that domain.
If you are interested in learning more about identity threat, I encourage you to read Claude Steele's clear and entertaining book on the subject, Whistling Vivaldi, and/or check out this compendium of over 300 peer-reviewed journal articles.

There's not much more to say about the GRE. It is a deeply flawed metric for assessing future success in graduate school, and in my opinion as an astronomy professor, it should be dropped from the admissions process entirely. We at Harvard have downgraded the importance of GRE test scores in our admissions process, and the quality and diversity of our admitted students have increased as a result. Other schools around the country are doing or considering the same. I'll conclude with the conclusion of Miller & Stassun:
Let us be frank: we believe that many STEM faculty members on admissions committees and upper-level administrators hold a deep-seated and unfounded belief that these test scores are good measures of ability, of potential for doing well in graduate school and of long-term potential as a scientist, and that students who score poorly on standardized exams are not likely to become PhD-level scientists. These assumptions are false. 
This is not a call to admit unqualified students in the name of social good. This is a call to acknowledge that the typical weight given to GRE scores in admissions is disproportionate. If we diminish reliance on GRE and instead augment current admissions practices with proven markers of achievement, such as grit and diligence [link added by blogger], we will make our PhD programmes more inclusive and will more efficiently identify applicants with potential for long-term success as researchers. Isn't that what graduate school is about?


nancy john said…
Another interesting thing about the GRE. Most colleges don't care! When I called one of my prospective graduate schools they said that the GRE is usually the last thing they look at when deciding whether or not to accept you.

