AI Summary of Peer-Reviewed Research

This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. [See full disclosure ↓]

Publishing process signals: MODERATE — reflects the venue and review process.

Mamba-based compressor matches or exceeds standard tools on scientific data

Research area: Computer Science, Data compression, Software

What the study found

BOA Constrictor is a new lossless neural compressor based on the Mamba state space model that achieved competitive compression on structured scientific datasets. The authors report that it matched or exceeded LZMA, ZSTD, and ZLIB at maximum compression on several high-energy physics datasets.

Why the authors say this matters

The authors say this matters because petabyte-scale data from high-energy physics experiments creates major storage challenges. They conclude that BOA is a first step toward improving compression for next-generation scientific data.

What the researchers tested

The researchers tested a pseudo-streaming lossless compressor called Bytewise Online Autoregressive (BOA) Constrictor on multiple scientific datasets. These included ATLAS Open Data in HDF5 format, simulated particle collision records in HepMC v3, CMS Open Data in NanoAOD format, computational fluid dynamics data, and CAMELS cosmology data.

What worked and what didn't

BOA achieved an effective compression ratio of 7.23× on ATLAS Open Data and 9.13× on HepMC v3 when the model size was included, outperforming the next-best traditional algorithm in those cases. On CMS Open Data, it obtained comparable or improved effective compression ratios, within 5% of the next-best traditional algorithm. It also reached 1.61× on computational fluid dynamics data and up to 1.53× on CAMELS cosmology datasets, while its throughput in this proof-of-principle implementation was described as not yet competitive with optimized algorithms such as ZSTD or LZMA.
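The "effective" ratios above charge the model's own size against the compressed output, so a neural compressor only wins if its savings outweigh the cost of shipping the model. A minimal sketch of that accounting (the exact formula is an assumption; the paper only states that model size was counted, and the example numbers are illustrative, not from the study):

```python
def effective_compression_ratio(original_bytes: int,
                                compressed_bytes: int,
                                model_bytes: int) -> float:
    """Original size divided by compressed size plus the size of the
    model needed to decompress — the penalized ratio described above."""
    return original_bytes / (compressed_bytes + model_bytes)

# Illustrative (hypothetical) numbers: a 10 GiB dataset compressed to
# 1.3 GiB by a compressor whose model weighs 0.08 GiB.
GIB = 2**30
ratio = effective_compression_ratio(10 * GIB, int(1.3 * GIB), int(0.08 * GIB))
print(f"{ratio:.2f}x")  # ~7.25x: the model's size slightly dilutes the raw ratio
```

Under this accounting, a large model can erase a good raw compression ratio on small datasets, which is why the metric matters for petabyte-scale archives where the model cost amortizes away.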

What to keep in mind

The abstract describes a proof-of-principle implementation, so the reported compression throughput is limited. It also notes that BOA performed best on high-entropy float32 payloads, and that the model size was counted against the effective compression ratio.

Key points

  • BOA Constrictor is a lossless neural compressor built on the Mamba state space model.
  • It matched or exceeded LZMA, ZSTD, and ZLIB at maximum compression on several HEP datasets.
  • Effective compression ratios including model size were 7.23× on ATLAS Open Data and 9.13× on HepMC v3.
  • On CMS Open Data, BOA was within 5% of the next-best traditional algorithm.
  • Throughput in the proof-of-principle implementation was not yet competitive with optimized compressors such as ZSTD or LZMA.

Disclosure

Research title:
Mamba-based compressor matches or exceeds standard tools on scientific data
Authors:
Akshat Gupta, C. Doglioni, Thomas Joseph Elliott
Institutions:
University of Manchester
Publication date:
2026-04-24
OpenAlex record:
View
AI provenance: This post was generated by OpenAI. The original authors did not write or review this post.