AI Summary of Peer-Reviewed Research
This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. See full disclosure ↓
Publication Signals show what we were able to verify about where this research was published. MODERATE: core publication signals for this source were verified. Publication Signals reflect the source's verifiable credentials, not the quality of the research.
- ✔ Peer-reviewed source
- ✔ No retraction or integrity flags
Overview
This systematic review examines machine learning methodologies applied to software fault prediction across 45 peer-reviewed articles published between 2023 and 2025. The review synthesizes recent developments addressing software reliability enhancement through predictive modeling, encompassing influencing factors, prediction techniques, benchmark datasets, evaluation frameworks, and existing limitations in current approaches.
Methods and approach
A structured literature review was conducted across IEEE, Springer, ACM, and ScienceDirect digital libraries, examining articles published within the 2023-2025 timeframe. The review synthesized findings across multiple dimensions: factors influencing fault prediction accuracy, algorithmic approaches to prediction, software metrics and datasets utilized for model training, evaluation metrics employed for performance assessment, model selection criteria, and identified challenges in the field.
Key Findings
Support Vector Machine and Random Forest algorithms demonstrated superior performance across multiple evaluation metrics including accuracy, precision, recall, and F1-score. Public datasets from the PROMISE repository and NASA Metric Data Program were identified as widely utilized resources, with product metrics-based datasets contributing substantially to improved model performance. The review identified these established benchmarks as foundational to comparative analysis across studies and reproducibility of results.
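As an illustration of the evaluation framework the review describes, the sketch below computes accuracy, precision, recall, and F1-score for a hypothetical fault-prediction classifier. The labels are invented for demonstration (1 = faulty module, 0 = clean); this is not code or data from the reviewed studies.

```python
# Minimal sketch: the four evaluation metrics named in the review,
# computed from a confusion matrix. All labels below are illustrative.

def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary fault labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical predictions for eight software modules.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
# → {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In practice, the reviewed studies would train models such as Support Vector Machine or Random Forest on PROMISE or NASA MDP datasets and report these same metrics for comparison.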
Implications
The findings indicate that algorithm selection remains a critical determinant of fault prediction efficacy, with ensemble methods and margin-based approaches showing consistent superiority. Standardized datasets enable comparative evaluation and knowledge accumulation within the field, though the reliance on specific repositories suggests potential limitations in dataset diversity and domain coverage. The prevalence of product metrics in benchmark datasets indicates established conventions in measurement practices, though this standardization may constrain exploration of alternative metric spaces.
Scope and limitations
This summary is based on the study abstract and available metadata. It does not include a full analysis of the complete paper, supplementary materials, or underlying datasets unless explicitly stated. Findings should be interpreted in the context of the original publication.
Disclosure
- Research title: Machine Learning Approaches in Software Fault Prediction: A Review
- Authors: Ruchika Aggarwal, Kamaljit Kaur
- Publication date: 2026-01-28
- DOI: https://doi.org/10.7759/s44389-026-00021-7
- Image credit: Photo by Techivation on Unsplash
- Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.


