Computer Science and Software Engineering

TR-COSC 01/93

Very Fast Pattern Matching for Highly Repetitive Text

D. Kulp
Department of Computer Science
University of Canterbury

Abstract

This paper describes two methods for locating the longest string matches in low-entropy source texts: a modification of the Boyer-Moore scanning algorithm, and a statistical method that searches for the less likely symbols. Both algorithms have been implemented as part of the searching strategy for an LZ77-type encoder. Experimental results are included.
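
For orientation, the longest-match problem that an LZ77-type encoder has to solve at each position can be sketched with a brute-force baseline. This is not either of the two methods described in this paper; it only illustrates the task they are meant to speed up. The function name `longest_match` and the `window` and `max_len` defaults are illustrative assumptions, not values taken from the paper.

```python
def longest_match(data: bytes, pos: int, window: int = 4096, max_len: int = 18):
    """Brute-force baseline: find the longest earlier match for data[pos:].

    Returns (offset, length), where offset is the distance back from pos
    to the start of the match. Returns (0, 0) when no match exists.
    window and max_len are assumed limits chosen for illustration.
    """
    best_off, best_len = 0, 0
    start = max(0, pos - window)           # oldest position still in the window
    limit = min(max_len, len(data) - pos)  # cannot match past the end of input
    for cand in range(start, pos):
        length = 0
        # The match may run past pos (overlap), as LZ77 decoders allow.
        while length < limit and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:              # ties keep the earliest candidate
            best_off, best_len = pos - cand, length
    return best_off, best_len
```

On highly repetitive input this baseline does the most work, because almost every candidate position matches for many symbols before failing. That cost is what motivates faster search strategies for low-entropy text.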