AI Summary of Peer-Reviewed Research
This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. See the full disclosure below.
Publication Signals show what we were able to verify about where this research was published. Rating: STRONG. We verified multiple publication signals for this source, including independently confirmed credentials. Publication Signals reflect the source's verifiable credentials, not the quality of the research.
- ✔ Peer-reviewed source
- ✔ Published in indexed journal
- ✔ No retraction or integrity flags
Key findings from this study
- The study found that consensus-based classification across multiple LLMs reduces manual annotation effort while producing fewer errors than individual human annotators.
- The authors report that modern open-source language models perform well enough for literature filtration, removing the cost barrier associated with commercial models.
- The researchers demonstrate that human-supervised AI classification via interactive visual analytics maintains researcher control while accelerating systematic review corpus screening.
Overview
Systematic literature reviews require extensive manual effort to filter large publication corpora retrieved through keyword searches. This work presents a pipeline leveraging multiple large language models to classify candidate papers through consensus voting. A supervised human-in-the-loop interface enables real-time inspection and modification of model outputs throughout the filtration process.
Methods and approach
The pipeline employs multiple state-of-the-art LLMs from mid-2024 and fall 2025, both open-source and commercial variants, to classify papers against descriptive prompts. Model outputs undergo consensus-based decision-making to determine relevance. The LLMSurver visual analytics web interface provides human supervision and interactive control. Evaluation used ground-truth data from an existing systematic review containing 8,323 candidate papers.
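The consensus step described above can be sketched roughly as a majority vote over per-model labels, with ties escalated to the human reviewer. This is an illustrative assumption about how such voting might work, not the paper's actual implementation; the model names, labels, and tie-handling policy here are hypothetical.

```python
from collections import Counter

def consensus_vote(votes: dict[str, str]) -> str:
    """Return the majority label across model votes; flag ties for human review."""
    counts = Counter(votes.values())
    top_two = counts.most_common(2)
    # A tie between the two most frequent labels is escalated to the human in the loop
    if len(top_two) > 1 and top_two[0][1] == top_two[1][1]:
        return "needs-review"
    return top_two[0][0]

# Hypothetical votes from three models on one candidate paper
votes = {"model_a": "relevant", "model_b": "relevant", "model_c": "irrelevant"}
print(consensus_vote(votes))  # relevant
```

In a real pipeline the human-in-the-loop interface would surface the "needs-review" cases for inspection, which matches the supervised workflow the paper describes.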
Results
The proposed approach reduced manual annotation effort significantly while achieving lower error rates than single human annotators. Modern open-source models demonstrated sufficient performance for the classification task, eliminating cost barriers associated with commercial model dependencies. The consensus scheme across multiple models improved classification reliability compared to individual model outputs.
Implications
Human-AI collaboration frameworks can accelerate systematic literature review workflows without sacrificing quality or reducing reviewer oversight. The accessibility of open-source models for this task expands adoption potential across under-resourced research institutions and teams. The visual analytics interface provides a template for implementing supervised LLM applications in academic knowledge synthesis processes.
Scope and limitations
This summary is based on the study abstract and available metadata. It does not include a full analysis of the complete paper, supplementary materials, or underlying datasets unless explicitly stated. Findings should be interpreted in the context of the original publication.
Disclosure
- Research title: Leveraging LLMs for semi-automatic corpus filtration in systematic literature reviews
- Authors: Lucas Joos, Daniel A. Keim, Maximilian T. Fischer
- Institutions: University of Konstanz
- Publication date: 2026-02-16
- DOI: https://doi.org/10.1016/j.cag.2026.104537
- OpenAlex record: View
- Image credit: Photo by Elen Sher on Unsplash
- Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.