Logarithmic-Time Internal Pattern Matching Queries in Compressed and Dynamic Texts

Close-up of a dark text editor or IDE displaying C programming code with typedef declarations and comments, with syntax highlighting in green and blue text.
Image Credit: Photo by Patrick Martin on Unsplash

AI Summary of Peer-Reviewed Research

This page presents an AI-generated summary of a published research paper. The original authors did not write or review this article. See the Disclosure section below.

Theory of Computing Systems · 2026-02-23 · Peer-reviewed
Publication Signals show what we were able to verify about where this research was published. MODERATE: Core publication signals for this source were verified. Publication Signals reflect the source's verifiable credentials, not the quality of the research.
  • ✔ Peer-reviewed source
  • ✔ Published in indexed journal
  • ✔ No retraction or integrity flags

Overview

An Internal Pattern Matching (IPM) query takes two fragments X and Y of a source text T, subject to the length constraint |Y| < 2|X|, and reports all exact occurrences of X within Y. Kociumaka, Radoszewski, Rytter, and Waleń showed that such queries can be answered in constant time using a linear-space data structure. This paper extends IPM queries to compressed string representations and dynamically updatable texts, giving a logarithmic-time query algorithm that works in both settings.
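To make the query definition concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it scans Y naively and exists only to show what an IPM query returns. The function names `ipm_query` and `as_progression` and the (start, length) encoding of fragments are assumptions made for this illustration. The constraint |Y| < 2|X| is what guarantees, by a standard periodicity argument, that the occurrences form an arithmetic progression, so the answer has a constant-size description.

```python
def ipm_query(text, x_start, x_len, y_start, y_len):
    """Naive IPM query: start positions (in text) of occurrences of
    X = text[x_start : x_start + x_len] inside
    Y = text[y_start : y_start + y_len].
    Illustrative brute force only, not the logarithmic-time algorithm."""
    if not y_len < 2 * x_len:
        raise ValueError("IPM queries require |Y| < 2|X|")
    x = text[x_start:x_start + x_len]
    hits = []
    # Try every alignment of X that fits entirely inside Y.
    for i in range(y_start, y_start + y_len - x_len + 1):
        if text[i:i + x_len] == x:
            hits.append(i)
    return hits


def as_progression(hits):
    """Summarise sorted hits as (first, step, count).
    Under |Y| < 2|X| the hits always form an arithmetic progression."""
    if len(hits) <= 1:
        return (hits[0] if hits else None, 0, len(hits))
    step = hits[1] - hits[0]
    assert all(b - a == step for a, b in zip(hits, hits[1:]))
    return (hits[0], step, len(hits))


text = "abababab"
hits = ipm_query(text, 0, 4, 0, 7)   # X = "abab", Y = "abababa"
print(hits)                          # -> [0, 2]
print(as_progression(hits))          # -> (0, 2, 2)
```

The brute force takes time proportional to |X|·|Y|, which is the cost the data structures discussed here avoid.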

Methods and approach

The approach is a query algorithm that runs on any balanced recompression-based run-length straight-line program (RLSLP), a grammar-based compressed representation of the text. It requires no preprocessing of the underlying RLSLP, so it can be applied directly on top of dynamic string data structures that support fully persistent updates. The algorithm reaches logarithmic query time by exploiting structural properties of these straight-line programs, and it works unchanged on RLSLPs of optimal size.
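For readers unfamiliar with RLSLPs, the following Python sketch shows the three kinds of grammar rules (a single character, a concatenation of two symbols, and a run of k copies of one symbol) together with random access by descending from the root. The class and function names are hypothetical, and the sketch neither builds a grammar by recompression nor enforces balance; it only illustrates the representation that the query algorithm operates on.

```python
class Terminal:
    """Rule A -> c for a single character c."""
    def __init__(self, char):
        self.char = char
        self.length = 1


class Concat:
    """Rule A -> B C (concatenation of two symbols)."""
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.length = left.length + right.length


class Run:
    """Rule A -> B^k (k copies of one symbol), the run-length extension."""
    def __init__(self, child, count):
        self.child, self.count = child, count
        self.length = child.length * count


def char_at(symbol, i):
    """Character at position i of the expansion of `symbol`.
    Each loop iteration moves one level down the grammar, so the cost is
    proportional to the grammar's depth."""
    if not 0 <= i < symbol.length:
        raise IndexError(i)
    while not isinstance(symbol, Terminal):
        if isinstance(symbol, Concat):
            if i < symbol.left.length:
                symbol = symbol.left
            else:
                i -= symbol.left.length
                symbol = symbol.right
        else:  # Run: every copy is identical, so reduce i modulo its length
            i %= symbol.child.length
            symbol = symbol.child
    return symbol.char


def expand(symbol):
    """Fully decompress a symbol (for checking small examples only)."""
    if isinstance(symbol, Terminal):
        return symbol.char
    if isinstance(symbol, Concat):
        return expand(symbol.left) + expand(symbol.right)
    return expand(symbol.child) * symbol.count


a, b = Terminal("a"), Terminal("b")
root = Concat(Run(Concat(a, b), 4), a)   # represents (ab)^4 a
print(expand(root))                      # -> ababababa
print(char_at(root, 3))                  # -> b
```

In a balanced grammar the depth is logarithmic in the text length, which is why operations that walk down from the root, like `char_at` here, fit within a logarithmic time budget.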

Key Findings

The primary result is an O(log n)-time algorithm for answering IPM queries on any balanced recompression-based RLSLP, where n is the text length. When instantiated with the optimal RLSLP construction, the combined data structure occupies O(δ log(n log σ / (δ log n))) space, where δ is the substring complexity of the text and σ is the alphabet size. The method is also compatible with dynamic string structures whose persistent updates take logarithmic time with high probability, so IPM queries extend to mutable texts without weakening these asymptotic guarantees.
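The space bound is easier to read with numbers plugged in. The sketch below assumes the standard definition of substring complexity, δ = max over k of d_k / k, where d_k is the number of distinct length-k substrings of the text, and evaluates the expression inside the O(·) with constants ignored. The function names are hypothetical, the brute-force computation of δ is only suitable for tiny inputs, and clamping the logarithm at 1 is an assumption made here to keep the value meaningful when the ratio is small.

```python
import math


def substring_complexity(text):
    """delta = max_k d_k / k, with d_k the number of distinct substrings
    of length k. Brute force, for small illustrative inputs only."""
    n = len(text)
    best = 0.0
    for k in range(1, n + 1):
        distinct = len({text[i:i + k] for i in range(n - k + 1)})
        best = max(best, distinct / k)
    return best


def space_bound(n, sigma, delta):
    """Value of delta * log(n log sigma / (delta log n)), ignoring the
    constant hidden by O(.). Assumes n >= 2 and sigma >= 2; the log is
    clamped at 1 (an assumption of this sketch)."""
    if n < 2 or sigma < 2:
        raise ValueError("need n >= 2 and sigma >= 2")
    ratio = n * math.log2(sigma) / (delta * math.log2(n))
    return delta * max(1.0, math.log2(ratio))


text = "ab" * 8                        # highly repetitive, n = 16
delta = substring_complexity(text)
print(delta)                           # -> 2.0
print(space_bound(len(text), 2, delta))  # -> 2.0
```

For repetitive texts δ stays small while n grows, so the bound grows only logarithmically in n, which is the regime where compressed representations pay off.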

Implications

The results establish that internal pattern matching on compressed text costs only a logarithmic factor over the constant-time queries available on uncompressed text. A logarithmic bound remains practical for the text lengths typically encountered in applications that require compressed storage. The extension to dynamic strings enables IPM queries on evolving texts while preserving the space efficiency of compressed representations, which broadens applicability to scenarios with both memory constraints and text modifications.

The technique's independence from explicit RLSLP preprocessing indicates potential for generalization to other straight-line program variants or compression schemes. The compatibility with existing optimal compression constructions suggests that further improvements may derive from advances in either query algorithms or compression methodology rather than fundamental trade-offs between these components. These results contribute to the broader understanding of query efficiency achievable on compressed and dynamic text structures.

Disclosure

  • Research title: Logarithmic-Time Internal Pattern Matching Queries in Compressed and Dynamic Texts
  • Authors: Anouk Duyster, Tomasz Kociumaka
  • Publication date: 2026-02-23
  • DOI: https://doi.org/10.1007/s00224-026-10266-x
  • Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.
