It is commonly perceived that increasing the incentives will improve performance, but increased incentives often come with increased pressure. Taking final exams in school, undergoing the last round of a job interview, giving a speech, and answering questions at a press conference are some examples of situations with high stakes.
In many instances, principals (evaluators), whether in a competitive or non-competitive setting, use a one-shot process to gather information or evaluate the agent. This process is likely to induce pressure since agents understand that they will not have the opportunity to repeat the process – or that doing so will be costly.
In education, increased incentives often come in the form of increased test stakes – the weight attached to examinable material. For example, in many countries, national exams taken at the end of school determine entry into university. How these exams are structured – one-shot exams versus a number of small tests – might have consequences for how students perform. Moreover, when there is scope to choose, some students may choose their educational or career path depending on what structure lies ahead.
Pressure and academic performance
Even though the stakes associated with various tests are the same for all students, their effects and the pressure felt may not be homogeneous across all groups. While more pressure might enhance the academic performance of some students, the reverse might be the case for others.
Research in psychology has shown that increased pressure is potentially harmful to people’s capacity to exhibit their ‘true’ capability (Baumeister 1984). The effects of a change in exam pressure might be especially important for certain types of students – girls versus boys, perhaps, or low- versus high-performing students.
How boys and girls respond to exam pressure
In recent research (Azmat et al. 2015), we explore differences in girls’ and boys’ responses to changes in exam pressure in Spain. We use detailed information, covering a period of 12 years, on several cohorts of students whom we follow throughout their time in school, from age 12 to 18. Over the course of the academic year, students take four different types of tests, including national exams, with varying stakes. See Figure 1 for details on the exam structure.
Figure 1. Evaluation system in the school
Notes: This is the evaluation system used in each subject. Low-Stakes is the test that counts for approximately 2.5% of the final grade. Medium-Stakes is the test that counts for approximately 11% of the final grade. High-Stakes is the test that counts for approximately 27% of the final grade. Super-High-Stakes is the national exam, Selectividad, taken at the end of Level 6, which counts for 50% of the university entry test score.
Within school, test stakes range from as little as 2.5% of the final grade to as much as 27%. In their final year of school, students also take national exams in addition to the usual tests. The university entry grade is determined half by the final-year coursework grades and half by the national exams, which makes the national exams even higher in stakes than any school test.
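The equal weighting of coursework and national exams can be written out as simple arithmetic. The sketch below is only an illustration: the function name, the 0-10 grading scale and the example grades are assumptions, not details taken from the study.

```python
def university_entry_score(coursework_grade: float, national_exam_grade: float) -> float:
    """Combine final-year coursework and national exam grades with equal weights."""
    # Assumption: both grades are expressed on the same scale (0-10 here).
    return 0.5 * coursework_grade + 0.5 * national_exam_grade


# Hypothetical student: strong coursework, weaker national exam.
score = university_entry_score(coursework_grade=8.0, national_exam_grade=6.0)
print(score)
```

Under this weighting, each point gained or lost in the national exams moves the entry score by half a point, whereas a single school test carries at most 27% of one subject's final grade.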
We find that girls and boys react differently to increases in exam pressure, as defined by the level of the stakes at hand.
- In particular, girls outperform boys in all school tests, but their advantage is largest on tests with low stakes. It shrinks, and even disappears, as the stakes increase.
Gender differences in academic attainment and achievement have been widely documented by researchers in the economics of education (Goldin et al. 2006). For example, in university attainment, the gender gap has closed and, in many countries, it has even reversed to the extent that more women graduate than men. Among 24-35 year olds in OECD countries in 2010, 42% of women compared with only 33% of men had tertiary education.
But in educational achievement, at both school and university, economists are puzzled by the patterns. Girls tend to outperform boys in classroom tests taken in school or university, but in aptitude and achievement exams, the advantage disappears and often reverses. An important distinguishing feature of these different types of assessments is the extent of their stakes: exams typically count most for progression and for the likelihood of a young person going into further education or getting a job.
We find that in all academic years, girls perform significantly better than boys in classroom tests, but in the national exams, boys perform slightly better than girls.
- In particular, girls outperform boys by almost 0.2 standard deviations in low-stakes tests but by only 0.1 standard deviations in high-stakes tests.
- Moreover, in the national exams, when stakes are even higher, the gap is reversed. Boys outperform girls by 0.02 standard deviations, although this difference is not statistically significant.
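Gaps expressed in standard deviations can be read as a difference in group means divided by the spread of scores. The sketch below shows one way such a gap might be computed. It assumes standardisation against the pooled sample of all students, which may differ from the procedure used in the study, and the scores are made-up numbers for illustration only.

```python
from statistics import mean, pstdev


def standardized_gap(girls, boys):
    """Return the girls-minus-boys difference in means, in pooled standard deviations."""
    # Assumption: the spread is measured over all students taken together.
    pooled_sd = pstdev(list(girls) + list(boys))
    return (mean(girls) - mean(boys)) / pooled_sd


# Hypothetical test scores, not data from the study.
girls_scores = [6.0, 8.0]
boys_scores = [5.0, 7.0]
print(standardized_gap(girls_scores, boys_scores))
```

A positive value means that girls score higher on average and a negative value means that boys do, so a reversal of the gap shows up as a change of sign.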
The findings persist over time, as well as within and between academic years.
Looking across different subjects, we see that the gender difference in performance between low- and high-stakes tests is always present. But it is especially important in science subjects, which traditionally have few women studying them beyond compulsory education. In particular, we see that in high-stakes tests, while girls outperform boys in arts and humanities subjects (0.25 standard deviations), they significantly underperform relative to male students in science subjects (-0.10 standard deviations). The effect of this on the final grade in the more technical subjects is ‘cushioned’ by the fact that girls perform as well as boys in the low-stakes exams.
Recent studies have shown that women underperform compared with men in competitive environments, and others have shown that women shy away from competitive environments (Niederle and Vesterlund 2011). In our setting, the pressure is not competitive in nature, because the rewards are independent of the performance of others. Instead, the gaps in performance result from the pressure that arises from variation in the size of the test stakes. Moreover, because all tests are compulsory, students cannot shy away from tests of varying stakes.
Our results suggest that changing assessment methods homogeneously across all students may change the gender balance of academic results. These effects may be exacerbated once students are given a choice about which subjects or degrees they want to pursue. Until a certain age all students are obliged to take a certain set of courses, but after that they can choose. These choices are likely to be influenced by previous performance, as well as the anticipated pressure.
This is important not only for educational outcomes but for the labour market too. Looking at degree programmes or across occupations, there is still a great deal of sorting by gender. Men and women tend to self-select into certain courses and jobs, which can have significant consequences for wages and provides an important explanation for the continued existence of big gender wage gaps. For instance, increased pressure in the selection process will lead to a workforce with a higher tolerance for pressure, but this might come at the expense of other relevant skills. This type of scenario is discussed in Gneezy and List (2013).
In a recent article in the Washington Post, economist Peter Arcidiacono of Duke University is quoted highlighting the issue of gender differences in degree subjects that are traditionally predictive of high salaries later in life: “STEM [science, technology, engineering, and mathematics] majors, as with economics, begin with few women enrolling and end with even fewer graduating. This leaky pipeline has been somewhat puzzling, because women enter college just as prepared as men in math and science.”
To understand why only 29% of bachelor’s degrees in economics in the US are awarded to women, Harvard economics professor Claudia Goldin studied the academic records of students in a research institute. She found that women who received an A in an introductory economics course were actually more likely than men with an A to major in economics. But when women received a poorer grade, they were less likely than men to choose economics as their major. Men who received a B were as likely to major in economics as men with an A, while women with a B were only half as likely to major in economics as women with an A.
One possible explanation for this puzzle is that some degrees (and some jobs) entail more pressure than others. Young people with a low tolerance for pressure will avoid degrees or firms that reward tolerance for pressure. It might also be that pressure in the selection process leads to candidates with low tolerance for pressure opting out.
References
Azmat, G, C Calsamiglia and N Iriberri (2015), “Gender Differences in Response to Big Stakes”, CEP Discussion Paper No. 1314 (http://cep.lse.ac.uk/pubs/download/dp1314.pdf) and forthcoming in the Journal of the European Economic Association.
Baumeister, R F (1984), “Choking Under Pressure: Self-Consciousness and Paradoxical Effects of Incentives on Skillful Performance”, Journal of Personality and Social Psychology, 46 (3): 610-20.
Gneezy, U and J List (2013), The Why Axis: Hidden Motives and the Undiscovered Economics of Everyday Life, New York: Public Affairs.
Goldin, C, L Katz and I Kuziemko (2006), “The Homecoming of American College Women: The Reversal of the Gender Gap in College”, Journal of Economic Perspectives, 20 (4): 133-56.
Niederle, M and L Vesterlund (2011), “Gender and Competition”, Annual Review of Economics, 3: 601-30.