Computer Science and Software Engineering

CSSE Seminar Series (CSSESS)

Seminar

~ The statistical behaviour of words over time scales from seconds to centuries ~


Speaker
Prof. Janet Pierrehumbert

Institute
Northwestern University, Evanston, USA

Time & Place
15:10hrs, Friday, 30 July, in Room 031, Erskine Building

All are welcome

Abstract

Words are a workhorse of document retrieval, spam filtering, dialogue systems, and many other areas of speech and language engineering. Exploiting word statistics is challenging, because the set of information units defined by the words in the lexicon is extremely large compared to feasible sample sizes. By Zipf's law, the rank-frequency distribution approximately obeys a power law, so that rare word types dominate the inventory. Worse yet, the inventory is non-stationary. In language use by humans, law-like patterns reflecting word statistics nevertheless are found at time scales from seconds to centuries. Analysing data from the Internet, historical records, and behavioural experiments, we show how effects at different scales may be characterized, and what they mean for processing, prediction, and dynamics.

Biography

Janet Pierrehumbert is Professor of Linguistics at Northwestern University. She received her Ph.D. in Linguistics from MIT, minoring in EECS, and was a Member of the Technical Staff in Linguistics and AI Research at AT&T Bell Laboratories until joining the Northwestern faculty in 1989. Pierrehumbert's research combines experiments and computational models to explore the structure and dynamics of language systems. Her model of English intonation has been widely applied in dialogue systems. Recent research focuses on statistical modeling of words as a nexus of meaning, social identity, processing, and change. Her honours and awards include an NSF Faculty Award for Women Scientists and Engineers, a Guggenheim Fellowship, and membership in the American Academy of Arts and Sciences. Pierrehumbert is presently at the University of Canterbury as an Erskine Fellow and a visitor to the New Zealand Institute of Language, Brain, and Behaviour.
