Shibata, H., Ishihata, M., & Inenaga, S. (2024). Packed Acyclic Deterministic Finite Automata. arXiv preprint arXiv:2410.07602.
This paper introduces a new data structure called Packed Acyclic Deterministic Finite Automata (PADFA), designed to improve the efficiency of pattern searching in large dictionaries. The authors aim to demonstrate that PADFA outperforms traditional data structures such as tries and minimal ADFAs in both speed and memory usage.
The authors propose a method for constructing PADFA from existing ADFA structures by employing techniques like Symmetric Centroid Path Decomposition (SymCPD) to extract heavy paths, which are then stored as packed strings. The remaining light edges are organized using Biased Search Trees (BST) for efficient access. The authors theoretically analyze the time and space complexity of PADFA and compare it with existing approaches. They also conduct experiments on real-world datasets to evaluate the practical performance of PADFA.
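The heavy/light split can be illustrated with a small sketch. The summary does not spell out the rule for choosing heavy edges, so the code below assumes the usual definition of symmetric centroid decomposition on a DAG: an edge (u, v) is heavy when u and v agree on both floor(log2) of the number of paths from the source and floor(log2) of the number of paths to a sink. The function name `heavy_edges` and the dictionary-based automaton encoding are illustrative assumptions, not the paper's implementation, and the packing of heavy paths and the search trees over light edges are left out.

```python
def heavy_edges(dag, source):
    """Return the heavy edges of a single-source DAG.

    dag maps each node to {label: child}. An edge (u, label, v) is treated
    as heavy when u and v share floor(log2) of both path counts (assumed
    symmetric-centroid rule, see the lead-in).
    """
    # Post-order DFS: every child is appended before its parent.
    order, seen = [], set()

    def visit(u):
        if u in seen:
            return
        seen.add(u)
        for v in dag.get(u, {}).values():
            visit(v)
        order.append(u)

    visit(source)

    # sigma[u]: number of paths from u to a sink (a node with no out-edges).
    sigma = {}
    for u in order:
        children = dag.get(u, {})
        sigma[u] = sum(sigma[v] for v in children.values()) if children else 1

    # pi[u]: number of paths from the source to u, pushed in topological order.
    pi = {u: 0 for u in order}
    pi[source] = 1
    for u in reversed(order):
        for v in dag.get(u, {}).values():
            pi[v] += pi[u]

    def lg(n):
        # floor(log2(n)) for a positive integer n
        return n.bit_length() - 1

    return {
        (u, label, v)
        for u in order
        for label, v in dag.get(u, {}).items()
        if lg(pi[u]) == lg(pi[v]) and lg(sigma[u]) == lg(sigma[v])
    }


# Toy automaton accepting "abce" and "abde" (hypothetical example data).
dag = {0: {"a": 1}, 1: {"b": 2}, 2: {"c": 3, "d": 3}, 3: {"e": 4}, 4: {}}
print(sorted(heavy_edges(dag, 0)))
```

In this toy automaton the heavy edges form the paths "ab" and "e", which are the parts a PADFA-style structure would store as packed strings, while the two edges out of state 2 remain light and would be handled by the per-node search structure.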
The authors conclude that PADFA advances pattern searching by combining packed strings with heavy path decomposition. Its speed and memory efficiency make it a promising alternative to tries and minimal ADFAs in applications that involve pattern searching.
This research contributes to the field of string algorithms and data structures by introducing a novel and efficient approach for pattern searching. The proposed PADFA structure has the potential to improve the performance of various applications that rely on efficient pattern matching, such as information retrieval, natural language processing, and bioinformatics.
The paper focuses primarily on the theoretical analysis and empirical evaluation of PADFA for pattern searching in dictionaries. Future research could explore applying PADFA to related problems, such as more general pattern matching and regular expression matching. Investigating how PADFA can be updated dynamically as a dictionary evolves could further enhance its practical applicability.