[Proposal] Why do (wo)men quit competition after failure? A machine learning approach

“There’s no such thing
as competition
to find our way
we lose control”
(Daft Punk, Beyond)

A. Motivation

Women are underrepresented at every level of the corporate ladder. A growing experimental literature suggests that women are more likely than men both to avoid competition and to drop out of competitive environments after experiencing disappointment. These laboratory measures also appear to predict career choices and may therefore partly explain the gender gap.

We study the role of perceived unfairness in explaining gender differences in the willingness to compete again after losing a competition. We ask whether men and women differ in how they respond to losing or winning a competition and whether these differences increase or decrease in the presence of unfair conditions.

Moreover, we aim to understand which women and men drive the gender gap. For this purpose, we run a post-experiment survey to elicit a large number of socio-economic characteristics and personality traits. We estimate heterogeneous treatment effects using machine learning methods that allow us to control for a large number of covariates.

In summary, our study aims at answering three questions:

  • I) Do men and women differ in how they respond to losing or winning a competition?

  • II) Do these gender differences increase or decrease in the presence of unfair conditions?

  • III) What are the mechanisms and characteristics driving any of the observed differences?

B. Methodology

B.1 Experimental Design

Using an online experiment (Prolific), we investigate whether there are gender differences in the willingness to compete again after winning or losing a competition in both fair and unfair tournaments. In a between-subject design with three treatments, we vary the nature of the competition: neutral (fair), unfair, and unfair with feedback about whether the outcome of the competition (winning or losing) was deserved or not.

In our experiment participants are randomly assigned to one of three treatments. Each treatment consists of three effort tasks (plus an intermediate task in which we elicit beliefs about own abilities) and concludes with a questionnaire.

1. Neutral treatment

Each player plays three main tasks of 90 seconds each. Only one randomly chosen task is payoff-relevant. In task 1, participants perform a real-effort task in which they count the number of zeros (0) in ten tables consisting of zeros (0) and ones (1). They are paid a piece rate of 0.15 pounds per table they solve correctly. In task 2, participants work on the same task but are paid according to a tournament rate that pays 0.30 pounds if the participant’s score (the number of tables they solve correctly) exceeds the score of another randomly selected player who has already played the task. Before task 3 is played, we ask players to consider their performance in task 2 and guess their rank among 100 other participants in task 2.

We call this task the “Guessing task”. This task is incentivized. We pay a base payment of 0.50 pounds, with a penalty of 0.02 pounds times the absolute difference between the true rank and the stated (guessed) rank.
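As an illustration of the incentive rule for the “Guessing task”, the following minimal Python sketch computes the payment from the stated base (0.50 pounds) and penalty (0.02 pounds per rank of error). The function name and the use of pence as the unit are our own choices for illustration. Note that the rule as written yields a negative amount when the guess is off by more than 25 ranks; how such cases are handled is not specified here, so the sketch simply returns the raw value.

```python
def guessing_task_payment_pence(true_rank, guessed_rank,
                                base_pence=50, penalty_pence=2):
    """Payment for the guessing task, in pence.

    Implements: base payment minus penalty times |true rank - guessed rank|.
    Working in integer pence avoids floating-point rounding.
    Assumption: no floor is applied, because the text does not state one.
    """
    return base_pence - penalty_pence * abs(true_rank - guessed_rank)


# Example: true rank 40, guessed rank 35 -> error of 5 ranks.
payment = guessing_task_payment_pence(40, 35)
print(f"Payment: {payment / 100:.2f} pounds")
```

A perfect guess earns the full 0.50 pounds, and each rank of error costs 0.02 pounds regardless of its direction.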

Subsequently, participants receive neutral feedback, i.e., “you won/lost in the tournament”. In task 3, participants work on the same task, but beforehand they choose between the piece rate and the tournament payment. If they choose the tournament, their scores are matched with the scores of other players who already played the task (different from the opponent in task 2). This guarantees that participants’ decisions in task 3 do not impose an externality on the earnings of others.

2. Unfair treatment

This treatment is analogous to the “Neutral Treatment” except that, before completing task 2, participants are told that in 50% of the cases the winner of the tournament will be the player with the higher score (i.e., the actual winner), while in the remaining 50% of the cases the winner will be chosen at random. This means that there is a 25% chance (0.5 × 0.5) that the player with the higher score will lose undeservedly. Feedback is the same as in the neutral treatment. Since no feedback is given about which of the two scenarios has occurred, participants do not know whether they won/lost deservedly or undeservedly.
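The 25% figure can be checked with a short calculation. The sketch below is in Python and uses exact fractions. It assumes that the random draw picks each of the two players with equal probability and that scores are not tied; the function and key names are hypothetical.

```python
from fractions import Fraction


def unfair_outcome_probabilities(p_merit=Fraction(1, 2),
                                 p_random_picks_higher=Fraction(1, 2)):
    """Outcome probabilities for the player with the higher score.

    p_merit: probability that the winner is decided by score.
    p_random_picks_higher: probability that a random draw selects the
        higher scorer (assumed 1/2, i.e. a fair coin between two players).
    """
    p_random = 1 - p_merit
    wins_by_merit = p_merit
    wins_by_luck = p_random * p_random_picks_higher
    loses_undeservedly = p_random * (1 - p_random_picks_higher)
    return {
        "higher_scorer_wins": wins_by_merit + wins_by_luck,
        "higher_scorer_loses": loses_undeservedly,
    }


probs = unfair_outcome_probabilities()
for outcome, p in probs.items():
    print(outcome, p)
```

Under these assumptions the higher scorer wins with probability 3/4 and loses with probability 1/4. By symmetry, the lower scorer wins undeservedly with probability 1/4.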

The tournament in task 3 (if chosen) is a fair (neutral) tournament. This keeps the treatments as similar as possible and prevents preferences for a fair (neutral) competitive environment from playing a role in the task 3 decision.

This treatment reproduces a feature of many real-world situations, in which individuals do not always know whether they won or lost because someone had an unfair advantage.

3. Unfair feedback treatment

This treatment is analogous to the “Unfair Treatment” except that players are now told whether they won/lost the tournament in task 2 deservedly or undeservedly. Participants also receive feedback about their true rank in the “Guessing task”.

B.2 Machine Learning

The experiment ends with a questionnaire covering willingness to take risks, beliefs about whether men or women are better at the zero-counting task, socio-economic variables, family background, athletic/sport experience, perceptions of unfairness and competitive attitudes.

We explore treatment effect heterogeneity, relaxing the assumption that treatment effects are the same among individuals with different characteristics. This is important because the policy implications of our analysis can then be tailored to specific subgroups of the general population.

In order to handle the high-dimensional nature of the collected data, we perform statistical analyses using machine learning techniques. In particular, we will use a generalized version of the random forest algorithm combined with penalized regression (lasso) estimation.
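To make the idea of heterogeneous treatment effects concrete, the sketch below shows only the penalized-regression ingredient, and in a deliberately simplified form: a lasso fitted by coordinate descent on treatment-by-covariate interactions. It is not the generalized random forest, and it is not our analysis code. The data are simulated, the outcome is a stylised continuous variable, and all names (x1, x2, x3, lam, the effect sizes) are hypothetical. Python and the standard library are used purely for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*sum((y - b0 - X b)^2) + lam*sum(|b|)
    by cyclic coordinate descent; the intercept b0 is not penalised."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(resid) / n  # intercept update
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = (sum(row[j] * r for row, r in zip(X, resid)) / n
                   + col_sq[j] * beta[j])
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [r - delta * row[j] for row, r in zip(X, resid)]
                beta[j] = new_bj
    return intercept, beta


def simulate(n=400, seed=0):
    """Simulated data: the treatment effect varies with x1 only."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        t = rng.randint(0, 1)                      # random assignment
        x = [rng.gauss(0, 1) for _ in range(3)]    # hypothetical covariates
        outcome = 0.4 * t + 0.5 * t * x[0] + rng.gauss(0, 0.5)
        X.append([t] + x + [t * xi for xi in x])   # main effects + interactions
        y.append(outcome)
    return X, y


NAMES = ["T", "x1", "x2", "x3", "T*x1", "T*x2", "T*x3"]
X, y = simulate()
intercept, beta = lasso_cd(X, y, lam=0.05)
coefs = dict(zip(NAMES, beta))
for name in NAMES:
    print(f"{name:5s} {coefs[name]: .3f}")
```

In this simulation the penalty should keep a clearly positive coefficient on the T*x1 interaction and shrink the irrelevant interactions towards zero. In that way a sparse method can point to the characteristics along which a treatment effect varies. The retained coefficients are biased towards zero by the penalty, so they are best read as a selection device rather than as final estimates.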

Sample size and costs

Our target sample size is between 1200 and 1800 participants. Conditional on having lost the previous tournament, we aim to have 200-300 individuals per treatment, equally split by gender. With 1200 (1800) participants in total, we will have 600 (900) men and 600 (900) women, i.e., 200 (300) men and 200 (300) women per treatment. Supposing that roughly half of the individuals lose, we will have 100 (150) men and 100 (150) women who lost per treatment.

We select our sample size using power calculations and the results of two pilot studies we have already run. Accounting for sample design and clustering, we estimate the minimum detectable effect size for the main outcomes.
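The cell sizes above follow from simple arithmetic, sketched here in Python. The function name and dictionary keys are hypothetical, and the 50% share of losers is the rough assumption stated in the text, not an observed quantity.

```python
def cell_sizes(total, n_treatments=3, n_genders=2, loss_share=0.5):
    """Expected cell sizes for a design with equal allocation across
    treatments and genders. loss_share is the assumed share of losers."""
    per_treatment = total // n_treatments
    per_treatment_gender = per_treatment // n_genders
    losers_per_treatment_gender = int(round(per_treatment_gender * loss_share))
    return {
        "per_treatment": per_treatment,
        "per_treatment_gender": per_treatment_gender,
        "losers_per_treatment": losers_per_treatment_gender * n_genders,
        "losers_per_treatment_gender": losers_per_treatment_gender,
    }


for total in (1200, 1800):
    print(total, cell_sizes(total))
```

For 1200 participants this gives 400 per treatment, 200 per treatment and gender, and 100 losers per treatment and gender; for 1800 participants the corresponding figures are 600, 300 and 150.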

  1. Power analysis for unconditional raw gender gaps within treatments:

Using Fisher’s exact test with power=80%, alpha=0.05, and assuming 400 (600) participants within treatment, the minimum detectable effect size is roughly 15 (10) percentage points.

Previous studies (Buser & Yuan, 2019) report a gender gap of around 14 percentage points in a neutral treatment. With 600 participants per treatment we would be able to detect a gap of this size; with 400 participants it lies close to the minimum detectable effect.

  2. Power analysis for raw gender gaps conditional on having lost within treatments:

Using Fisher’s exact test with power=80%, alpha=0.05, and assuming 200 (300) participants within treatment, the minimum detectable effect size is 20 (15) percentage points. In our pilot experiments we see a gender gap of roughly 30 percentage points. With the sample size specified we would be able to detect it.
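As a rough cross-check of the order of magnitude of these minimum detectable effects, the sketch below uses the standard normal approximation for comparing two proportions. This is a simpler technique than Fisher’s exact test and is not the calculation behind the figures above. It assumes an equal split by gender within treatment and a baseline proportion of 0.5, which is the most conservative case; both the function name and the baseline are our assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mde_two_proportions(n_per_group, alpha=0.05, power=0.80, baseline=0.5):
    """Approximate minimum detectable difference between two proportions
    (two-sided test, equal group sizes, normal approximation).

    Assumption: both groups have variance baseline * (1 - baseline).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    se = sqrt(2 * baseline * (1 - baseline) / n_per_group)
    return (z_alpha + z_power) * se


# Participants within treatment, split equally between men and women.
for n_treatment in (400, 600, 200, 300):
    mde = mde_two_proportions(n_treatment / 2)
    print(f"{n_treatment} per treatment: about {100 * mde:.1f} percentage points")
```

Under these assumptions the approximation gives about 14, 11, 20 and 16 percentage points for 400, 600, 200 and 300 participants within treatment, which is in the same range as the figures reported above. Exact tests are typically somewhat more conservative, and a baseline further from 0.5 would lower the minimum detectable effect.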

Based on the two pilot experiments we have already run on Prolific and the target sample size of 1800 participants, we estimate that our experiment will cost £6000 in total.

Pre-registration of our experiment will be available shortly on the AEA RCT registry https://www.socialscienceregistry.org/ under the title “The effect of losing a competition: the role of gender, fairness and feedback”.

We will make the data and R code available on our websites as well as on the publishing journal’s platform after peer-reviewed publication of the study.

Thank you for reading, we look forward to your feedback!

“First principle:
never to let one’s self be
beaten down by persons
or by events” (Marie Curie)