Ubiquitous Pattern Matching and Its Applications (Biology, Security, Multimedia)
Prof. Wojciech Szpankowski
Dept of Computer Science, Purdue University
Fri Feb 24 15:10:00 NZDT 2006 in Room 031, MSCS
Abstract
Pattern matching comes in many flavors. In the string matching problem, for a given string one counts the number of its occurrences in a text. In the subsequence pattern matching, also known as the hidden word problem, we search for a given subsequence rather than a string. Finally, in the repetitive pattern matching we aim at determining how long it takes for a prefix of a text to reappear somewhere in the text. We apply probabilistic and analytic tools of combinatorics and analysis of algorithms to discover general laws of pattern occurrences. An immediate consequence of our results is the possibility to set thresholds at which the appearance of a pattern in given text starts being `meaningful'.
In this talk, we first discuss some applications of string matching methodology to biological sequence analysis; in particular, to the problem of finding weak signals and avoiding artifacts. We then use the approach of hidden words to construct a reliable threshold for the intrusion detection in detecting anomalies. Then, we present a video compression scheme based on a two-dimensional self-repetitive pattern matching (i.e., a lossy extension of the Lempel-Ziv scheme). We conclude this talk with a novel application of the repetitive pattern matching to an error-resilient Lempel-Ziv77 scheme.
View past or future seminars; or view the CSSESS Home Page.