What the study found
The study identified 172 unique open datasets used in 204 publications, drawn from conference papers in learning analytics, educational data mining, and AI in education published from 2020 to 2024. The authors describe this as the most comprehensive collection and analysis of open educational datasets to date.
Why the authors say this matters
The authors say open datasets support reproducibility, collaboration, and trust in research findings, and can also benefit authors through greater visibility, credibility, and citation potential. The study suggests its findings and checklist may help researchers publish data and support broader adoption of open data practices.
What the researchers tested
The researchers conducted a systematic survey of publicly available datasets published alongside research papers. They manually examined 1,125 papers from three flagship conferences: LAK, EDM, and AIED, covering the years 2020 to 2024.
What worked and what didn't
The survey found 172 unique datasets in 204 publications. Of those datasets, 143 had not been captured in any prior survey of open data in learning analytics. The authors also categorized the datasets and analyzed their context, analytical methods, use, and other properties. They produced an annotated inventory of the datasets, along with the 8-item PRACTICE guidelines and a checklist for publishing data.
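To put the reported counts in proportion, the short sketch below derives a few shares from them. It is illustrative arithmetic only: the variable names and the `share` helper are hypothetical, and the last figure assumes the 204 publications are a subset of the 1,125 reviewed papers, which the summary implies but does not state outright.

```python
# Counts as reported in the summary of the survey.
papers_reviewed = 1125
publications_with_open_data = 204
unique_datasets = 172
datasets_new_to_surveys = 143


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Datasets that earlier surveys had already captured.
previously_captured = unique_datasets - datasets_new_to_surveys

# Share of the identified datasets that no prior survey had captured.
new_share = share(datasets_new_to_surveys, unique_datasets)

# Assumption: the 204 publications are a subset of the 1,125 reviewed papers.
open_data_share = share(publications_with_open_data, papers_reviewed)

print(f"Datasets already in earlier surveys: {previously_captured}")
print(f"Share of datasets new to surveys: {new_share}%")
print(f"Share of reviewed papers using an open dataset: {open_data_share}%")
```

Under that assumption, roughly five in six of the identified datasets were new to the survey literature, while fewer than one in five of the reviewed papers used an open dataset.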
What to keep in mind
The abstract does not describe detailed limitations of the survey; it notes only that the availability of open datasets and the practices around them had been unclear in the field. The findings are based on papers from three conferences over a five-year period, so they apply only to that set of publications.
Key points
- The survey identified 172 unique open datasets across 204 publications.
- The researchers manually reviewed 1,125 papers from LAK, EDM, and AIED from 2020 to 2024.
- The authors say 143 of the datasets were not included in earlier surveys of open learning analytics data.
- The paper includes an annotated inventory of the datasets and their related publications.
- The authors provide 8-item PRACTICE guidelines and a checklist for publishing data.
Disclosure
- Research title: Survey maps open datasets in learning analytics research
- Authors: Valdemar Švábenský, Brendan Flanagan, Erwin Daniel López Zapata, Atsushi Shimada
- Institutions: Masaryk University, Kyoto University, Kyushu University
- Publication date: 2026-02-26
- DOI: 10.1145/3798096