Quantitative Methods (2022)—Course Assignment

Format
In this take-home assignment, you will analyze and interpret the results of a 2×2 experiment.
Send your assignment as a PDF document via email to Zhihao Hu. The PDF should include (where applicable) the Stata commands, the Stata output, and your interpretations.
The assignment is due on June 1st.

Exercise A: Analyzing a 2×2 Experiment
When employers make hiring decisions, they have to predict applicants’ performance on the basis of various characteristics. Typical characteristics include the applicants’ level of education or their grades. However, previous literature shows that hiring decisions are also affected by demographic characteristics such as race and gender (Bertrand & Duflo, 2016; Bertrand & Mullainathan, 2004). One possible explanation for such discrimination is that recruiters falsely believe that performance differs by demographic characteristics and make hiring decisions in line with these false beliefs.

In this exercise you want to explore where such false beliefs could stem from. You suspect that a (hitherto unexamined) cognitive mechanism might be at work. Specifically, you had the following idea: Recruiters often lack information about average performance differences between demographic groups, such as between men and women (i.e., they do not know whether such differences exist and, if so, which group performs better). Thus, to form beliefs about possible performance differences, they use a heuristic based on easily observable information: the composition of top performers (e.g., the gender composition of top executives, of top researchers at ESCP, or of top chess players).

However, many professions are unbalanced across demographic characteristics. For instance, in the U.S. 85% of all civil engineers are male, whereas 90% of all nurses are female. In such unbalanced samples, the majority group is more strongly represented at each point of the performance distribution. Imagine, for instance, that the workers in a certain profession are 80% men and 20% women. If performance follows the same normal distribution for men and women, there will be four times as many men as women at each point of the distribution, as shown in the following graph:

[Graph not reproduced here: performance distributions of men and women, with a magnifying glass over the top end of the distribution.]

As such, there will also be more men at the salient top end of the distribution (where the magnifying glass is). In the absence of performance differences, we would for instance expect to see on average 1 woman and 4 men in the top 5. When people base their judgment on the composition of top performers in an unbalanced sample, they may forget to take the base rate (how many men and women are in the full sample) into account and falsely believe that the minority group (women in the example above) performs worse, even when no such difference exists. You want to test whether this bias may be the reason for the discrimination described above and thus influences employers’ hiring decisions. You formulate the following moderation/interaction hypothesis:

Receiving information about the top performers decreases the likelihood of hiring a minority candidate more for unbalanced than for balanced samples. In fact, in the absence of performance differences, you do not expect any effect of information for a balanced sample. However, because the reaction to information in the balanced sample might depend on people’s beliefs before receiving any information, you do not formulate an explicit hypothesis in this regard and instead focus on testing the moderation.
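
The base-rate argument behind this hypothesis can be checked numerically. The following Python sketch is not part of the assignment (which is to be solved in Stata). It is a minimal illustration, assuming that performance is drawn from the same normal distribution for everyone, so that the number of women in the top 5 follows a hypergeometric distribution. All function names are hypothetical.

```python
import random
from math import comb


def expected_minority_in_top(n_minority, n_total, k):
    """Expected number of minority members among the top k performers
    when performance is unrelated to group membership."""
    return k * n_minority / n_total


def prob_minority_in_top(n_minority, n_total, k, m):
    """Hypergeometric probability that exactly m of the top k are minority members."""
    n_majority = n_total - n_minority
    return comb(n_minority, m) * comb(n_majority, k - m) / comb(n_total, k)


def simulate_mean_minority_in_top(n_minority, n_total, k, draws=5000, seed=1):
    """Monte Carlo check: draw every score from the same normal distribution
    and count minority members among the k best scores (assumed setup)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(draws):
        # (score, is_minority) pairs; the first n_minority people are the minority
        scores = [(rng.gauss(0.0, 1.0), i < n_minority) for i in range(n_total)]
        top = sorted(scores, reverse=True)[:k]
        total += sum(is_minority for _, is_minority in top)
    return total / draws


if __name__ == "__main__":
    for label, women in (("balanced", 50), ("unbalanced", 20)):
        print(label,
              "expected women in top 5:", expected_minority_in_top(women, 100, 5),
              "P(no woman in top 5):", round(prob_minority_in_top(women, 100, 5, 0), 3))
```

Under these assumptions, the unbalanced pool (20 women out of 100) yields an expected 1 woman in the top 5, and no woman at all in roughly 32% of cases, compared with roughly 3% for the balanced pool. A top-5 list without women is therefore weak evidence of a performance gap when the pool is unbalanced.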

To test your hypothesis you set up two separate experiments:

1. In the first (online) experiment you simply let 400 participants do a real-effort (string reversal) task to collect their performance data matched to demographic characteristics. You find that there is no performance difference by gender. These participants form the “candidate pool” of the second experiment.

2. In the second (online) experiment, you recruit a new set of 2,929 participants. After an introduction, you show each participant one randomly drawn pair of candidates consisting of one man and one woman. For each candidate, participants receive information on four characteristics: gender, age, education, and ethnicity. Participants then have to decide whom to “hire” and are paid according to the real past performance of the hired candidate. Each participant is assigned to one of four experimental conditions. In particular, before participants make the hiring decision, you experimentally manipulate two factors (following a 2×2 design):

a) You allocate each participant a sub-pool of 100 candidates that is either gender balanced (50 men : 50 women) or gender unbalanced (80 men : 20 women).
b) For their respective sub-pool, participants either receive information on the composition of top performers (number of women among the top 5) or not.

The dependent variable is the participants’ binary hiring decision (1 = hire a woman, 0 = hire a man). In this assignment your main task is to empirically assess whether your hypothesis holds, that is, whether providing people with information on the top performers reduces their likelihood of hiring a woman more for the unbalanced than for the balanced sample.
You will find the respective (true) data file “hiring data” in your assignment folder. The variables are coded as follows:

Variable name | Meaning | Coding
hire_wom | Did the participant decide to hire the woman or the man candidate? | 1 = participant hired a woman; 0 = participant hired a man
info (1st manipulation) | Did the participant receive information regarding the gender composition of the top 5 performers or not? | 1 = yes, the participant received info; 0 = no, the participant received no info
unbalanced (2nd manipulation) | Did the participant hire a candidate from a balanced or unbalanced subject pool? | 1 = the participant hired from an unbalanced pool; 0 = the participant hired from a balanced pool
age | participant age | in years
female | participant gender | 1 = female; 0 = male
ethnicity | categorical variable representing different ethnicities | “White” = White ethnicity; “Black” = Black ethnicity; “Asian” = Asian ethnicity; “Other” = Other ethnicity (e.g., Hawaiian or Native Indian)
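
With this coding, the moderation hypothesis can be written as a linear probability model. The notation below is a sketch only: the symbols β0 to β3 and ε are assumptions introduced here for illustration and do not appear in the assignment.

```latex
% Linear probability model for the 2x2 design (illustrative notation)
\[
\text{hire\_wom}_i = \beta_0 + \beta_1\,\text{info}_i + \beta_2\,\text{unbalanced}_i
  + \beta_3\,(\text{info}_i \times \text{unbalanced}_i) + \varepsilon_i
\]
% The interaction coefficient is a difference of two differences in hiring rates:
\[
\beta_3 = \bigl[\,E(\text{hire\_wom} \mid \text{info}=1,\ \text{unbalanced}=1)
               - E(\text{hire\_wom} \mid \text{info}=0,\ \text{unbalanced}=1)\bigr]
        - \bigl[\,E(\text{hire\_wom} \mid \text{info}=1,\ \text{unbalanced}=0)
               - E(\text{hire\_wom} \mid \text{info}=0,\ \text{unbalanced}=0)\bigr]
\]
% Moderation hypothesis: information lowers the hiring rate for women
% more in the unbalanced than in the balanced pool.
\[
H_1:\ \beta_3 < 0
\]
```

Because the two factors are binary and fully crossed, this model has one parameter per experimental condition, so each coefficient corresponds to a comparison of condition-level hiring rates.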

Tasks
1. You see that one of your independent variables, ethnicity, is a string rather than a numeric variable. Transform it into four 0-1 dummy variables.

2. To get a first feeling for the data, provide the descriptive statistics for the variables hire_wom, unbalanced, info, age, female, and the four ethnicity dummies. How many of the variables are dummy variables? What is the range of the continuous variable(s)?

3. Next, display the correlation matrix of the same 9 variables, including significance levels, and mark each correlation that is significant at the 5% level with an asterisk. State whether age and hiring a woman are significantly correlated and, if so, in which direction.

4. Provide a histogram of age with 20 bins and assess (by looking at it; there is no need for statistical tests) whether the variable age was normally distributed amongst participants. (Tip: use the drop-down menu for graphics as we did in class for the pie chart.)

5. To check whether the randomization of participants to experimental conditions (unbalanced and info) worked, please do a test of balance on the demographic variables age and female. You can use either approach we have learned: (1) doing one set of t-tests for each of the two manipulations OR (2) doing two sets of regressions, one for each demographic variable. What do the results suggest at the 5% vs. the 10% significance level?

6. Now, to examine possible main effects of your experimental conditions, run an OLS regression (see footnote 1) with robust standard errors in which you use only unbalanced and info to predict the probability to hire a woman (for this and the following parts you do not need to add control variables; see footnote 2). Carefully interpret the results, specifically:
a) How should the coefficients of your two explanatory variables be interpreted (i.e. what type of group comparisons do they measure)? In your opinion, do these coefficients reflect meaningful information in the present context?
b) Are your two explanatory variables significant and if yes, in what way?
c) How high is the R-squared? What does this number mean?

7. Next, you want to get to the core of your research question and examine whether there is an interaction (moderation) effect of your two explanatory variables unbalanced and info. To this end:
a) Provide the regression output of the same model as in (6.), but now including the interaction term.
b) What are now the coefficients of unbalanced and info? How does the interpretation of each coefficient change from the regression without the interaction term?
c) Is the interaction term significant? What is its interpretation?

8. Next, you are interested in assessing whether participants who were in the condition with both an unbalanced sample and information were significantly less likely to hire a woman than participants who received neither of the two treatments. Run the relevant analysis via OLS regression. Display the regression results and interpret them.

9. Can the coefficients of info, unbalanced, and the interaction term in the previous exercises be interpreted causally or are they merely reflecting correlations? Explain your answer.

10. Write a brief conclusion regarding your initial question: Does receiving information regarding the top performers of an unbalanced group lead to discrimination against minority groups or are people able to correctly adjust for base rates (i.e. correctly account for sample imbalance)?

1 If you wonder why you can use OLS for a binary dependent variable, you may want to consider this recent methodological article: Gomila, R. (2020). Logistic or linear? Estimating causal effects of experimental treatments on binary outcomes using regression analysis. Journal of Experimental Psychology: General, 1-27.

2 In a paper you would add a robustness check including controls, especially if participant characteristics are not balanced across experimental groups. For the present dataset, note that adding controls does not significantly change the results (you can try this out if you wish). To make the exercise a bit easier you may thus omit them.