Bias in, symbolic compliance out? GPT's reliance on gender and race in strategic evaluations

STRATEGIC MANAGEMENT JOURNAL · 2026


Journal lists: Renmin University A · FT50 · UTD24 · ABS 4*

Tristan L. Botelho · Yale University (corresponding author)
Qingyang (Iris) Wang · Yale University

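Summary

The study examines whether GPT relies on gender and race information when evaluating startup pitches. It finds that GPT does not systematically give lower scores to underrepresented minorities but avoids ranking them last, and that this symbolic compliance does not actually eliminate inequality.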

Abstract

Research summary

Organizations are increasingly using large language models (LLMs) to support strategic evaluations. We examine whether and how these systems rely on gender and race. We asked GPT to evaluate identical startup pitches varying only the founder's name, shaping gender and race perceptions. Across 26,000 evaluations, GPT did not systematically assign lower scores to underrepresented minorities but avoided ranking them last without increasing winning likelihoods. To explain these patterns, we conducted “Second Opinion” experiments where GPT evaluated pitches alongside inputs simulating human bias. GPT more readily corrected explicit, identity‐based bias than bias framed as neutral business critiques, with corrections limited in magnitude. We theorize these findings reflect symbolic compliance: LLMs suppress overt discrimination without substantively altering evaluative logic, allowing inequality to persist in AI‐supported strategic evaluations.

Managerial summary

Large language models (LLMs), like OpenAI's ChatGPT, are increasingly used in strategic evaluations (e.g., hiring, pitches). We examine whether and how these models exhibit gender and racial biases in their evaluations of startup pitches, where we only varied founder names (shaping gender and race perceptions). Across multiple experiments, we find that GPT evaluators did not systematically assign lower scores to underrepresented minorities, primarily by reducing their likelihood of being ranked last. However, this behavior reflects a symbolic effort to avoid overt discrimination rather than a deeper fairness commitment. While LLMs may not reproduce historical and societal biases in overt form, their ability to correct them remains limited. These results highlight the need for implementing bias mitigation measures before integrating LLMs into high‐stakes strategic evaluation processes.

Artificial intelligence · Organizational behavior · Strategic management · Gender and racial bias · Large language models
