AI Summary of Peer-Reviewed Research
This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. See the full disclosure below.
Publication Signals show what we were able to verify about where this research was published. Verification level: MODERATE — core publication signals for this source were verified. Publication Signals reflect the source's verifiable credentials, not the quality of the research.
- ✔ Peer-reviewed source
- ✔ Published in indexed journal
- ✔ No retraction or integrity flags
Key findings from this study
- The study found that all 22 evaluated LLMs consistently favored female-named candidates across 70 professions despite identical qualifications between male and female CVs.
- The researchers demonstrate that adding explicit gender fields to CVs further increased the preference for female applicants beyond name-based gender signaling.
- The authors report that most models exhibited substantial positional bias, selecting candidates listed first in the prompt regardless of gender or qualifications.
Overview
This study examines gender and positional biases exhibited by large language models when evaluating professional candidates based on résumés or curricula vitae. The investigation involved 22 leading LLMs tasked with selecting candidates from pairs of profession-matched CVs differing only in gendered name cues. The experimental design employed systematic name-swapping to isolate preferences stemming from gender signals rather than qualifications. Additional experimental conditions manipulated gender cues through explicit gender fields, pronoun preferences, and gender-neutral identifiers to assess sensitivity to contextual variations. The research addresses the growing deployment of LLMs in automated hiring processes, where fairness and transparency are critical concerns. By mapping algorithmic choices across 70 professions, the work contributes to ongoing discourse about ethical artificial intelligence deployment in high-stakes decision-making contexts. The methodology builds on established controlled résumé experimental designs used to detect demographic bias in human evaluators, adapting this approach to evaluate algorithmic behavior in candidate selection tasks.
Methods and approach
The study generated synthetic CVs for 70 target professions using varied prompts to avoid spurious effects from individual prompt structures. Each LLM received a job description paired with two profession-matched CVs bearing either male or female first names but otherwise identical qualifications. Every CV pair was presented twice with names swapped to ensure observed preferences originated from gendered name cues rather than content differences. The experimental design included multiple conditions to test sensitivity to gender signals: adding explicit gender fields, appending gender-congruent pronouns, and replacing gendered names with neutral identifiers like Candidate A and Candidate B. Counterbalancing techniques assigned genders to neutral identifiers to detect positional preferences. Additional experiments evaluated CVs in isolation rather than pairwise comparison to assess rating differences. The investigation systematically varied candidate position within prompts to measure positional bias effects on selection outcomes.
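The core of this design can be illustrated with a short sketch. The snippet below is not the authors' code; the prompt wording, candidate names, and the `query_llm` stub are hypothetical, but the counterbalancing logic follows the description above: the same CV body is shown in both name orders, so any consistent choice must come from the gendered name cue or from list position rather than from CV content.

```python
import itertools

def query_llm(prompt: str) -> str:
    """Stand-in for a real model call; assumed to return the chosen name."""
    raise NotImplementedError  # hypothetical; wire up a real API here

def build_trials(job: str, cv_text: str, male_name: str, female_name: str):
    """Build both orderings of one profession-matched CV pair.

    Presenting identical CVs under swapped names means a gender
    preference shows up as the same name winning in both orderings,
    while a positional bias shows up as whichever name is listed
    first winning in both orderings."""
    trials = []
    for first, second in itertools.permutations([male_name, female_name]):
        prompt = (
            f"Job description: {job}\n\n"
            f"Candidate 1 ({first}):\n{cv_text}\n\n"
            f"Candidate 2 ({second}):\n{cv_text}\n\n"
            "Which candidate is better suited? Answer with the name only."
        )
        trials.append({"first": first, "second": second, "prompt": prompt})
    return trials

trials = build_trials("Registered Nurse", "<identical CV body>", "James", "Emily")
# Two trials: (James first, Emily second) and (Emily first, James second).
```

The gender-neutral condition described above corresponds to replacing the names with "Candidate A" and "Candidate B" and counterbalancing which underlying profile each label points to, which separates label-position effects from gender effects.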
Results
All 22 LLMs consistently favored female-named candidates across 70 professions despite equalized professional qualifications between genders. Adding explicit gender fields to CVs further increased preference for female applicants beyond name-based signaling alone. When gendered names were replaced with gender-neutral identifiers, several models displayed slight preference for selecting Candidate A. Counterbalancing gender assignment between neutral identifiers eliminated gender preference and resulted in selection parity. Individual CV ratings assigned slightly higher average scores to female CVs, but the effect size was negligible compared to pairwise comparison tasks. Including preferred pronouns next to candidate names slightly increased selection odds for that candidate. Most models exhibited substantial positional bias, favoring candidates listed first in the prompt regardless of gender or qualifications. The positional effect represented a major confounding factor in selection decisions across evaluated models.
Implications
The findings raise substantial concerns about deploying LLMs in autonomous hiring decisions where algorithmic fairness is paramount. Consistent female preference across all models suggests systematic bias rather than random variation, indicating potential problems with training data or reinforcement learning from human feedback processes. The substantial positional bias compounds fairness concerns by introducing an additional non-merit-based selection factor that could systematically disadvantage candidates based on presentation order. Organizations adopting LLM-based screening tools must account for these biases through rigorous testing and validation before deployment in consequential contexts. The divergence between pairwise comparison tasks and individual rating tasks suggests task framing significantly influences bias expression in LLM outputs. These results question whether LLMs apply principled reasoning consistently across decision contexts or rely on superficial heuristics sensitive to contextual manipulation. The interaction between multiple bias sources—gender cues, pronouns, and position—complicates bias mitigation strategies. Researchers and practitioners must consider compound effects when designing fairness interventions for LLM-based decision systems in employment and other high-stakes domains.
Scope and limitations
This summary is based on the study abstract and available metadata. It does not include a full analysis of the complete paper, supplementary materials, or underlying datasets unless explicitly stated. Findings should be interpreted in the context of the original publication.
Disclosure
- Research title: Gender and positional biases in LLM-based hiring decisions: evidence from comparative CV/résumé evaluations
- Authors: David Rozado
- Publication date: 2026-02-17
- DOI: https://doi.org/10.7717/peerj-cs.3628
- OpenAlex record: View
- Image credit: Photo by cottonbro studio on Pexels (Source • License)
- Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.