Entering labour markets often involves navigating a series of selection processes designed to identify the most qualified candidates. In many fields, including government jobs, these selection mechanisms frequently utilise multiple-choice tests as a primary assessment tool. Multiple-choice tests serve as a standardised method to efficiently evaluate candidates’ knowledge and skills.
Despite their popularity, multiple-choice tests face significant criticism. Broadly, they are criticised for potentially introducing biases that affect test-taker performance. Factors such as risk aversion and varying levels of confidence can influence how individuals approach these tests, often leading to disparities in scores that do not accurately reflect true knowledge or capability. Their prevalence in high-stakes scenarios underscores the importance of studying the design and impact of multiple-choice tests.
Risk aversion, confidence, and gender
Extensive research on these criticisms suggests that the scoring system – specifically, penalising incorrect answers more than omitted ones – may favour risk-seeking individuals and disadvantage risk-averse candidates. This is because eliminating some incorrect answers and making an ‘educated guess’ becomes a risky proposition when incorrect answers carry a heavier penalty.
But how does gender factor into this equation? An extensive body of literature has shown that men and women can exhibit different behavioural traits that impact labour market outcomes (Shurchkov and Eckel 2018 provide a comprehensive overview). These gendered traits, particularly attitudes towards risk, have been found to potentially influence how men and women perform on multiple-choice tests (Croson and Gneezy 2009, Filippin and Crosetto 2016, Exley and Kessler 2022, Exley and Nielsen 2022). Research shows that men are typically less hesitant than women to offer a guess on multiple-choice test items. Two main factors drive this willingness to guess.
First, men tend to display a greater propensity for risk-taking compared to women. This attribute makes them more likely to adopt guessing strategies for questions they are uncertain about. Second, studies suggest that men may express higher levels of confidence in their knowledge or ability to deduce correct answers, leading them to take more risks by guessing.
Most of the existing research on the topic has concentrated on analysing the performance of ‘average’ candidates, yielding important insights into overall trends. The topic has also been discussed on Vox (e.g. Key et al. 2016, Iriberri 2019). Less attention has been paid to the tails of the distribution – the most and least successful candidates. These groups provide us with the opportunity to analyse whether and how gendered risk attitudes change along the distribution of performance variables, as well as their impact on outcomes for both top and bottom performers.
In recent work (Diez-Rituerto et al. 2024), we explore gender differences in the willingness to guess using data from the Spanish medical profession, where entry is governed by a highly regulated system. This process begins with a multiple-choice test that grants medical graduates access to intern positions. Given the significance of this test, students spend an average of 10 months preparing for it, in addition to the six years required to complete their degree.
In this context of highly skilled and trained individuals, we revisit gender differences in the willingness to guess. Specifically, we study the impact of a change in the number of alternative answers per question that occurred from 2014, when each question had five alternative answers, to 2015, when the number of alternatives was reduced to four. As the scoring rule remained constant across the years, the reduction in the number of available options increased the expected score from a random guess from a negative value (before) to zero (after).
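The arithmetic behind this shift can be sketched as follows. This is an illustration rather than the exam's official scoring rule: it assumes a correct answer is normalised to one point, and it infers the penalty from the statement above that a random guess among four alternatives has an expected score of zero. The function names are hypothetical.

```python
from fractions import Fraction


def penalty_for_zero_ev(n_options):
    """Penalty per wrong answer that makes a random guess worth zero,
    assuming a correct answer earns 1 point (a normalisation, not a
    figure taken from the exam rules)."""
    return Fraction(1, n_options - 1)


def expected_guess_score(n_options, penalty, reward=Fraction(1)):
    """Expected score of a uniformly random guess among n_options."""
    p_correct = Fraction(1, n_options)
    return p_correct * reward - (1 - p_correct) * penalty


# Penalty inferred from the post-change test (four alternatives, zero expected score).
penalty = penalty_for_zero_ev(4)

# The same penalty applied to the earlier five-alternative format.
ev_five_options = expected_guess_score(5, penalty)
ev_four_options = expected_guess_score(4, penalty)

print(penalty, ev_five_options, ev_four_options)  # 1/3 -1/15 0
```

Under these assumptions, a blind guess cost one-fifteenth of a point in expectation when there were five alternatives and nothing when there were four. The same function also covers an educated guess: ruling out one of four alternatives leaves three, and `expected_guess_score(3, penalty)` is positive (1/9).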
This change has significant implications for test takers’ behaviours. The primary expectation is that this change will positively influence test takers’ willingness to guess, as a random guess now has a higher probability of success. Considering gender differences, it is expected that before the change, women would answer fewer questions than men due to their higher risk aversion. However, it is also expected that the change would affect men and women differently, with women increasing the number of questions they answer by more than men.
Heterogeneous gender effects at the extremes of the distribution
We tested these hypotheses through two distinct analyses. First, we examined the average effects and, as initially expected, found that the reduction in the number of alternatives positively influenced candidates’ willingness to guess, resulting in an increase of nine answered questions per test taker. However, surprisingly, and contrary to existing literature, we found no evidence that women answered fewer questions than men before the change or that the change affected men and women differently.
To reconcile this unexpected result with previous findings, we conducted a heterogeneity analysis using a quantile regression approach along the distribution of performance variables. This analysis revealed a more nuanced picture. We discovered that women tend to omit more questions than men, but only among those who answered most of the questions (50th percentile and above). Conversely, among test takers who answered fewer questions (20th percentile and below), women omitted fewer questions than men.
Moreover, the reduction in the number of alternative answers affected women more strongly, increasing their willingness to guess by more than men’s. This effect was only present among those answering most of the questions. Consequently, after the change, the gender gap in answering questions was eliminated among those who answered most questions.
However, the initial gender gap among these high-answering individuals was already small (1 to 1.2 questions). In addition, the increased willingness to guess led to both a rise in incorrectly answered questions and a slight decrease in correctly answered questions (both statistically insignificant). As a result, the elimination of the gender gap did not translate into significant changes in other performance measures, such as the proportion of correct answers and overall test scores.
Figure 1 Gender gaps in the number of answered questions along the distribution of participants
From a policy perspective, our results support the existing recommendation to reduce or even eliminate penalties for incorrect answers on multiple-choice tests. We observed no gender differential effects of reducing the number of alternatives – effectively reducing the penalty – in the lower part of the performance distribution. More importantly, since selection processes typically focus on candidates in the upper part of the performance distribution, where individuals are best positioned to be selected, reducing penalties to achieve a zero expected value for random guessing helps prevent gender disparities in answering patterns. This change levels the playing field for men and women, promoting fairness in the selection process.
References
Croson, R, and U Gneezy (2009), “Gender differences in preferences”, Journal of Economic Literature 47(2): 448–74.
Diez-Rituerto, M, J Gardeazabal, N Iriberri, and P Rey-Biel (2024), “Gender differences in willingness to guess revisited: Heterogeneity in a high stakes professional setting”.
Exley, C, and K Nielsen (2022), “The gender gap in confidence: Expected but not accounted for”, SSRN 4352381.
Exley, C L, and J B Kessler (2022), “The gender gap in self-promotion”, Quarterly Journal of Economics 137(3): 1345–81.
Filippin, A, and P Crosetto (2016), “A reconsideration of gender differences in risk attitudes”, Management Science 62(11): 3138–60.
Iriberri, N (2019), “Girls, boys and multiple choice”, VoxEU.org, 12 April.
Key, J, K Krishna, and P Akyol (2016), “Precision versus bias in multiple choice exams”, VoxEU.org, 24 August.
Shurchkov, O, and C C Eckel (2018), “Gender differences in behavioral traits and labor market outcomes”, in S L Averett, L M Argys, and S D Hoffman (eds.), Oxford Handbook on Women and the Economy, 481–512.