Very Fast Pattern Matching for Highly Repetitive Text
D. Kulp
Department of Computer Science
University of Canterbury
Abstract
This paper describes two searching methods for locating longest string matches in source texts of low entropy. A modification of the Boyer-Moore scanning algorithm and a statistical method that searches for less likely symbols are presented. Both algorithms have been implemented as part of the searching strategy for an LZ77-type encoder. Experimental results are included.
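As a rough illustration of the idea of searching for less likely symbols, the following minimal sketch scans for the pattern symbol that is rarest in the text and verifies the full pattern only around those hits. It is not the method presented in this paper: the function name `find_rarest_first`, the choice to estimate frequencies from the text itself, and the exact-match (rather than longest-match) setting are all assumptions made for the example.

```python
from collections import Counter


def find_rarest_first(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    if not pattern:
        return 0
    if len(pattern) > len(text):
        return -1

    # Assumption: symbol frequencies are estimated from the text itself.
    freq = Counter(text)

    # Offset within the pattern of the symbol that is least frequent in the text.
    k = min(range(len(pattern)), key=lambda i: freq[pattern[i]])
    rare = pattern[k]

    # Scan only for the rare symbol, then verify the whole pattern around each hit.
    pos = text.find(rare, k)
    while pos != -1:
        start = pos - k
        if start + len(pattern) > len(text):
            break  # later hits would also run past the end of the text
        if text.startswith(pattern, start):
            return start
        pos = text.find(rare, pos + 1)
    return -1
```

On a highly repetitive input such as a long run of one symbol with a single different symbol in it, most text positions are never compared against the pattern, because the scan jumps directly between occurrences of the rare symbol.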