Generating the language of AI harms: mapping guardrails using critical code studies

Image: A person wearing glasses sits at a desk in a modern office, with a city skyline visible through the windows, viewing multiple computer monitors displaying code and digital interfaces.
Image Credit: Photo by Blackcreek Corporate on Unsplash (Source, License)

AI Summary of Peer-Reviewed Research

This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. See full disclosure ↓

AI & Society · 2026-03-10 · Peer-reviewed · View original paper ↗
Publication Signals (MODERATE). Publication Signals show what we were able to verify about where this research was published; they reflect the source’s verifiable credentials, not the quality of the research. Core publication signals for this source were verified:
  • ✔ Peer-reviewed source
  • ✔ Published in indexed journal
  • ✔ No retraction or integrity flags

Overview

This study applies critical code studies methodologies to examine guardrails in large language models as sociotechnical control mechanisms. The research investigates how foundation models from four major organizations—Anthropic, DeepSeek, Meta, and OpenAI—implement content moderation through guardrails, focusing on both general-purpose models and public API moderation tools. The analysis treats guardrails as computational and linguistic artifacts that simultaneously encode technical constraints and ideological positions, examining how these systems regulate conversational possibilities through filtering and promotion mechanisms.
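
The study's source material is not reproduced in this summary, but one concrete instance of the "public API moderation tools" it examines is OpenAI's moderation endpoint. The sketch below is a hypothetical illustration, not code from the paper: it assumes the current OpenAI Python SDK, an API key in the environment, and an invented input string, and it shows the kind of computational artifact under analysis, a call that returns per-category flags and scores that downstream systems use to filter conversations.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Screen a piece of text against the provider's moderation categories.
response = client.moderations.create(
    model="omni-moderation-latest",  # publicly documented moderation model
    input="Example user message to screen before generation.",
)

result = response.results[0]
print(result.flagged)          # True if any category crossed its threshold
print(result.categories)       # per-category booleans (harassment, violence, ...)
print(result.category_scores)  # the underlying scores those booleans derive from
```

The category names, the thresholds behind `flagged`, and the documentation describing them are the sort of filter criteria and policy language the study reads as jointly technical and ideological.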

Methods and approach

The investigation analyzes multiple documentation types and technical artifacts, including endpoint documentation, code examples, technical reports, model architectures, training dataset content, and methodology research papers. The study employs critical code studies as an analytical framework to deconstruct guardrail implementations across the four organizations. This approach treats code not merely as functional instruction but as a site of ideological encoding and social control, examining how computational structures embedded in guardrails shape conversational boundaries and possibilities. The analysis examines technical implementation details alongside the linguistic and ideological dimensions with which they are co-constructed.

Key findings

The examination reveals that guardrails function as dual mechanisms that simultaneously enforce technical constraints and regulate discourse through language. The study maps how guardrail architectures vary across organizations while identifying consistent patterns in how filtering mechanisms establish conversational boundaries. Analysis of documentation and implementation patterns shows that certain conversations are systematically delimited while others are promoted through technical design choices. The research demonstrates that guardrails operate as conversational interfaces that encode specific ideological positions through their technical construction, making the invisible edges of LLM systems visible through examination of policy language, filter criteria, and moderation logic.
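
The paper does not publish any provider's filter logic; as a purely hypothetical sketch (the categories, thresholds, and `filter_decision` function below are invented for illustration), moderation logic of this kind often comes down to threshold choices, and where those cut-offs sit is the kind of encoded design decision the analysis treats as ideological rather than neutral.

```python
# Hypothetical guardrail filter: turn per-category moderation scores into a
# decision about whether a conversation is blocked, flagged, or allowed.
THRESHOLDS = {"harassment": 0.5, "violence": 0.4, "self-harm": 0.2}

def filter_decision(category_scores: dict[str, float]) -> str:
    """Return 'block', 'flag', or 'allow' given category scores in [0, 1]."""
    exceeded = {cat: score for cat, score in category_scores.items()
                if score >= THRESHOLDS.get(cat, 1.0)}
    if any(score >= 0.8 for score in exceeded.values()):
        return "block"   # the conversation is delimited outright
    if exceeded:
        return "flag"    # routed for extra scrutiny rather than refused
    return "allow"       # the conversation proceeds unimpeded

print(filter_decision({"harassment": 0.9, "violence": 0.1}))    # -> block
print(filter_decision({"harassment": 0.1, "self-harm": 0.05}))  # -> allow
```

Changing a single number in `THRESHOLDS` changes which conversations are delimited and which are promoted, which is why the study treats such criteria as consequential design choices rather than neutral implementation details.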

Implications

The findings establish that guardrails constitute more than technical safeguards; they represent sites where computational and natural language systems jointly produce regulatory effects on discourse. Understanding guardrails through critical code studies reveals how technical architecture and policy language are inseparable, with code functioning as both encoder and decoder of ideological constraints. This analysis challenges the notion that large-scale AI systems are necessarily impenetrable, demonstrating that systematic examination of guardrails, documentation, and technical reports offers a way to understand how these systems' regulatory logics operate.

Scope and limitations

This summary is based on the study abstract and available metadata. It does not include a full analysis of the complete paper, supplementary materials, or underlying datasets unless explicitly stated. Findings should be interpreted in the context of the original publication.

Disclosure

  • Research title: Generating the language of AI harms: mapping guardrails using critical code studies
  • Authors: Sarah Ciston
  • Institutions: Academy of Media Arts Cologne, Center for Advanced Internet Studies
  • Publication date: 2026-03-10
  • DOI: https://doi.org/10.1007/s00146-026-02922-0
  • OpenAlex record: View
  • PDF: Download
  • Image credit: Photo by Blackcreek Corporate on Unsplash (Source, License)
  • Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.
