What the study found
The study found that a pre-course aptitude test may help predict how first-year computer science students perform on an introductory programming assessment. The authors report that the Random Forest Regressor showed consistent performance on training and testing data, while the Random Forest Classifier performed less well on the hold-out test set.
Why the authors say this matters
The authors say this matters because early prediction could help identify students who may need support from the start of a programming course. They also conclude that their approach offers a foundation for future targeted support interventions in introductory programming modules.
What the researchers tested
The researchers collected data from 285 first-year computer science undergraduates. They developed a pre-course aptitude test that gathered information on students’ backgrounds, prior experiences, perceived confidence, and their likelihood of holding appropriate mental models, meaning internal representations of core programming concepts. They then trained regression and classification models, refined selected Random Forest models with Sequential Feature Selection, and validated them on a hold-out test set.
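The workflow described above (train a Random Forest, narrow the inputs with Sequential Feature Selection, then score on a hold-out set) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of scikit-learn, the synthetic data, the eight unnamed features, the 80/20 split, and every hyperparameter are assumptions made for the example. Only the cohort size of 285 comes from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for the aptitude-test responses (ASSUMPTION: the real
# features and their encoding are not given in the summary).
n_students, n_features = 285, 8
X = rng.random((n_students, n_features))

# Hypothetical target: an assessment score scaled to [0, 1], driven by the
# first two features plus noise.
y = np.clip(0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.1, n_students), 0, 1)

# Hold out 20% of students for final validation (ASSUMPTION: split size).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Forward Sequential Feature Selection wrapped around a small forest,
# followed by the final Random Forest Regressor on the selected features.
selector = SequentialFeatureSelector(
    RandomForestRegressor(n_estimators=30, random_state=0),
    n_features_to_select=3,
    direction="forward",
    cv=3,
)
model = Pipeline([
    ("select", selector),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
])
model.fit(X_train, y_train)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


train_rmse = rmse(y_train, model.predict(X_train))
test_rmse = rmse(y_test, model.predict(X_test))
test_mae = float(mean_absolute_error(y_test, model.predict(X_test)))

print("selected feature mask:", model.named_steps["select"].get_support())
print(f"train RMSE={train_rmse:.4f}  test RMSE={test_rmse:.4f}  test MAE={test_mae:.4f}")
```

Putting the selector inside the pipeline means feature selection only ever sees the training split, so the hold-out scores are not influenced by the test data.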
What worked and what didn't
The Random Forest Classifier achieved good performance during training, with AUC 0.8688, F1 0.8353, and accuracy 0.7450, but performance dropped on the hold-out test set to AUC 0.7670, F1 0.7020, and accuracy 0.7020. The authors describe this as moderate overfitting, likely due to class imbalance and limited data. The Random Forest Regressor was more stable, with training RMSE 0.1616 and MAE 0.1209, and testing RMSE 0.1713 and MAE 0.1396, though the authors note there is still a sizeable margin of error.
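To make the size of the train-to-test change concrete, the figures reported above can be compared directly. The short script below uses only the numbers quoted in this summary; the dictionary layout and the helper name `train_test_gaps` are illustrative choices, not anything taken from the study.

```python
# Reported (train, test) values from the summary above.
reported = {
    "classifier": {
        "AUC": (0.8688, 0.7670),
        "F1": (0.8353, 0.7020),
        "accuracy": (0.7450, 0.7020),
    },
    "regressor": {
        "RMSE": (0.1616, 0.1713),
        "MAE": (0.1209, 0.1396),
    },
}


def train_test_gaps(metrics):
    """Return test minus train for each metric, rounded to 4 decimals."""
    return {name: round(test - train, 4) for name, (train, test) in metrics.items()}


classifier_gaps = train_test_gaps(reported["classifier"])
regressor_gaps = train_test_gaps(reported["regressor"])

# Higher is better for the classifier metrics, so negative gaps mean a drop.
print(classifier_gaps)  # {'AUC': -0.1018, 'F1': -0.1333, 'accuracy': -0.043}
# Lower is better for the error metrics, so positive gaps mean more error.
print(regressor_gaps)   # {'RMSE': 0.0097, 'MAE': 0.0187}
```

The classifier loses roughly 0.10 AUC and 0.13 F1 on unseen data, while the regressor's errors grow by less than 0.02, which is consistent with the authors' description of the regressor as the more stable model.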
What to keep in mind
The abstract says the classifier showed moderate overfitting and suggests this may be linked to class imbalance and the small amount of data available. The regression model appears more consistent, but the authors still note a sizeable margin of error. The abstract does not describe other limitations beyond these points.
Key points
- A pre-course aptitude test was developed to predict students' results on their first introductory programming assessment.
- The study used data from 285 first-year computer science students.
- The Random Forest Classifier performed better in training than on the hold-out test set.
- The Random Forest Regressor showed similar performance on training and testing data.
- The authors suggest the regression approach could help identify students needing extra support.
- The abstract says the classifier’s weaker test performance may reflect class imbalance and limited data.
Disclosure
- Research title: Pre-course aptitude test predicted introductory programming results moderately
- Authors: Oliver Kerr, Linden J. Ball, Nicky Danino
- Institutions: Leeds Trinity University, University of Lancashire, University of Lancashire
- Publication date: 2026-03-30
- DOI: 10.1145/3806059