Simpson demonstrated that a statistical relationship observed within a population, i.e. [...]. Other names include the Simpson-Yule effect, reversal paradox or amalgamation paradox.

Examination of the disaggregated data reveals few decision-making units that show statistically significant departures from expected frequencies of female admissions, and about as many units appear to favor women as to favor men. If the data are properly pooled, taking into account the autonomy of departmental decision making, thus correcting for the tendency of women to apply to graduate departments that are more difficult for applicants of either sex to enter, there is a small but statistically significant bias in favor of women.

While it seems as if women are rejected more often overall, women are actually rejected less often at the departmental level.

Examples in HR

Simpson's paradox can easily occur in organizational or human resources settings as well. Let me run you through two illustrated examples that I simulated. Assume you run a company of 1000 employees and you have asked all of them to fill out a Big Five personality survey. Should you select more neurotic people to improve your overall company performance? Or are you discriminating against emotionally stable, non-neurotic employees when it comes to salary?

Taking a closer look at the subgroups in your data, however, you might find very different relationships. Similarly, splitting the employees by education level makes clear that there is a relationship between neuroticism and education level that may explain the earlier association with salary. More educated employees receive higher salaries and, within these groups, neuroticism is actually related to lower yearly income.

Solving the paradox

Kievit and colleagues argue that Simpson's paradox may occur in a wide variety of research designs, methods, and questions, particularly within the social and medical sciences.
The paradox may be prevented from occurring altogether by more rigorous research design: testing mechanisms in longitudinal or intervention studies. However, this is not always feasible. Alternatively, the researchers propose that data visualization may help to recognize the patterns and subgroups and thereby diagnose paradoxes. To this end, Kievit and Epskamp have developed a tool to facilitate the detection of hitherto undetected patterns of association in existing datasets. It is written in R, a language specifically tailored for a wide variety of statistical analyses, which makes it very suitable for integration into the regular analysis workflow. Finally, its code is open source and can be extended and improved upon depending on the nature of the data being studied.

One example of application is provided in the paper, for a dataset on coffee and neuroticism. A regression analysis would suggest a significant positive association between coffee and neuroticism overall. However, when the detection algorithm of the R package is applied, a different picture appears: the analysis shows that there are three latent clusters present and that the purported positive relationship only holds for one cluster, whereas it is negative in the others.
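
To make the reversal concrete, here is a minimal Python sketch of the kind of check described above: fit one regression slope on the pooled data, then one per subgroup, and compare the signs. It is not the R tool by Kievit and Epskamp and it does not detect latent clusters; it assumes the grouping (here, education level) is already known. All numbers, the names `GROUPS`, `WITHIN_SLOPE` and `OFFSETS`, and the helper `ols_slope` are hypothetical and chosen only for illustration, loosely following the neuroticism and salary example.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical, hand-made numbers (not the article's simulated data):
# each education level gets a typical neuroticism score and a base salary.
GROUPS = {
    "low": (2.0, 30_000),
    "medium": (3.0, 50_000),
    "high": (4.0, 70_000),
}
WITHIN_SLOPE = -5_000  # assumed salary change per neuroticism point inside a group
OFFSETS = [-0.5, -0.25, 0.0, 0.25, 0.5]  # spread of scores around the group centre

# One row per employee: (education level, neuroticism, salary).
rows = [
    (level, centre + d, base + WITHIN_SLOPE * d)
    for level, (centre, base) in GROUPS.items()
    for d in OFFSETS
]

# Slope when education level is ignored.
pooled_slope = ols_slope([n for _, n, _ in rows], [s for _, _, s in rows])

# Slope inside each education level.
group_slopes = {
    level: ols_slope(
        [n for g, n, _ in rows if g == level],
        [s for g, _, s in rows if g == level],
    )
    for level in GROUPS
}

print(f"pooled slope: {pooled_slope:+.0f}")
for level, slope in group_slopes.items():
    print(f"{level:>6} slope: {slope:+.0f}")
```

With these made-up numbers the pooled slope comes out positive while every within-group slope is negative, which is the sign flip the paradox describes. On real data the subgroups are often not known in advance, which is the harder problem a cluster-detection approach is meant to address.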
