What the study found
The study presents a large, longitudinal dataset from DerStandard, an Austrian newspaper platform, covering user activity from 2013 to 2022. It includes over 75 million user comments, more than 400 million votes, and metadata on articles and user interactions.
Why the authors say this matters
The authors say the dataset enables research on discussion dynamics, network structures, and semantic analysis in German, a mid-resourced language. They also state that it is a reusable resource for computational social science and related fields while preserving user privacy.
What the researchers tested
The researchers assembled and released structured conversation threads, explicit up- and downvotes on comments, and editorial topic labels from the DerStandard news forum. Persistent identifiers were anonymized with salted hash functions. Raw comment texts were not shared publicly; pre-computed vector representations from a state-of-the-art embedding model were released in their place.
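To illustrate what anonymizing identifiers with a salted hash can look like, here is a minimal Python sketch. The abstract does not name the hash function or say how the salt was handled, so the use of SHA-256, the helper names make_salt and pseudonymize, and the example user IDs are all assumptions made for illustration, not the authors' procedure.

```python
import hashlib
import secrets


def make_salt(n_bytes: int = 32) -> bytes:
    """Generate a random secret salt (hypothetical helper)."""
    return secrets.token_bytes(n_bytes)


def pseudonymize(identifier: str, salt: bytes) -> str:
    """Return the salted SHA-256 digest of an identifier as a hex string.

    SHA-256 is an assumption; the abstract only says "salted hash functions".
    """
    return hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()


# One salt is reused for the whole dataset so that the same user always
# maps to the same pseudonym, which keeps threads and votes linkable.
salt = make_salt()

# Hypothetical user IDs, not taken from the dataset.
user_ids = ["user_1001", "user_1002", "user_1001"]
pseudonyms = [pseudonymize(u, salt) for u in user_ids]
```

In a scheme like this, the same user keeps one stable pseudonym across comments and votes, while anyone without the salt cannot recompute the mapping by hashing candidate IDs. Keeping the salt secret, or discarding it after processing, is what separates this from a plain unsalted hash.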
What worked and what didn't
The dataset contains detailed metadata and interaction data across ten years, including comment threads, votes, and topic labels. The abstract does not report comparative performance results or tests of specific analyses; it describes the dataset as enabling further research.
What to keep in mind
The abstract does not describe analytical findings from the dataset itself. Raw comment text is not publicly available, and the summary provided here is limited to the information stated in the title and abstract.
Key points
- The dataset covers DerStandard user activity from 2013 to 2022.
- It includes over 75 million comments and more than 400 million votes.
- Conversation threads, vote data, and editorial topic labels are included in the release.
- User identifiers were anonymized, and raw comment texts were not shared publicly.
- The authors say the resource supports research in German-language discourse and computational social science.
Disclosure
- Research title: DerStandard dataset spans ten years of comments and votes
- Authors: Emma Fraxanet, Vicenç Gómez, Andreas Kaltenbrunner, Max Pellert
- Institutions: Universitat Pompeu Fabra, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Universitat Oberta de Catalunya
- Publication date: 2026-04-27