AI Summary of Peer-Reviewed Research

This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article.

Publishing process signals: MODERATE (reflects the venue and review process).

DerStandard dataset spans ten years of comments and votes

Research area: Data science, Metadata, Identifier

What the study found

The study presents a large, longitudinal dataset from DerStandard, an Austrian newspaper platform, covering user activity from 2013 to 2022. It includes over 75 million user comments, more than 400 million votes, and metadata on articles and user interactions.

Why the authors say this matters

The authors say the dataset enables research on discussion dynamics, network structures, and semantic analysis in German, a mid-resourced language. They also state that it is a reusable resource for computational social science and related fields while preserving user privacy.

What the researchers tested

The researchers assembled and released structured conversation threads, explicit up- and downvotes on comments, and editorial topic labels from the DerStandard news forum. Persistent identifiers were anonymized with salted hash functions, raw comment texts were not publicly shared, and pre-computed vector representations from a state-of-the-art embedding model were released instead.
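
The anonymization step described above can be illustrated with a minimal sketch. The abstract only states that salted hash functions were used; the choice of SHA-256, the salt length, the function name `pseudonymize`, and the sample identifiers below are all assumptions made for illustration, not the authors' actual procedure.

```python
import hashlib
import secrets


def pseudonymize(user_id: str, salt: bytes) -> str:
    """Return a stable pseudonym for a persistent identifier.

    Hypothetical helper: SHA-256 over salt + identifier is an assumed
    construction, not the one documented for the dataset.
    """
    return hashlib.sha256(salt + user_id.encode("utf-8")).hexdigest()


# A random secret salt, generated once and kept private (assumed setup).
# Without the salt, pseudonyms cannot be recomputed from known identifiers.
salt = secrets.token_bytes(32)

# The same identifier always maps to the same pseudonym under one salt,
# so threads and votes can still be linked to a consistent (anonymous) user.
alias_a = pseudonymize("user_12345", salt)
alias_b = pseudonymize("user_12345", salt)

# A different identifier yields a different pseudonym.
alias_c = pseudonymize("user_67890", salt)

# A different salt yields unrelated pseudonyms for the same identifier.
other_alias = pseudonymize("user_12345", secrets.token_bytes(32))
```

Under these assumptions, the pseudonyms preserve who-replied-to-whom and who-voted-on-what relationships across the dataset, while the original identifiers cannot be recovered without the secret salt.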

What worked and what didn't

The dataset contains detailed metadata and interaction data across ten years, including comment threads, votes, and topic labels. The abstract does not report comparative performance results or tests of specific analyses; it describes the dataset as enabling further research.

What to keep in mind

The abstract does not describe analytical findings from the dataset itself. Raw comment text is not publicly available, and the summary provided here is limited to the information stated in the title and abstract.

Key points

  • The dataset covers DerStandard user activity from 2013 to 2022.
  • It includes over 75 million comments and more than 400 million votes.
  • Conversation threads, vote data, and editorial topic labels are included in the release.
  • User identifiers were anonymized, and raw comment texts were not shared publicly.
  • The authors say the resource supports research in German-language discourse and computational social science.

Disclosure

Research title:
DerStandard dataset spans ten years of comments and votes
Authors:
Emma Fraxanet, Vicenç Gómez, Andreas Kaltenbrunner, Max Pellert
Institutions:
Universitat Pompeu Fabra, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Universitat Oberta de Catalunya
Publication date:
2026-04-27
AI provenance: This post was generated by OpenAI. The original authors did not write or review this post.