Computational Stylistics and Applications
Abstract
Whereas most traditional research in natural language processing and information retrieval has focused on analyzing the "topic" of a text ("what" it says), there is also much important and useful information carried in the "style" of a text ("how" it says it). Potential areas of application include author identification and profiling, determining a text's purpose or the feeling it evokes, and determining social relationships implicit in a text. Style differs from topic in that (a) the textual features that realize style are typically very diffuse over a text, not being tightly related by syntactic relations, and (b) a given feature will typically be indicative of multiple stylistic 'dimensions' at once. Hence great attention must be paid to the empirical question of effective feature design.
In this talk I will describe our methods for stylistic text classification, which apply modern machine learning methods to novel textual features derived from principles of functional linguistics. The features are conditional frequencies of a variety of functional lexical and phrasal items in a text. Support vector machines are used for text classification, and the resulting models are then analyzed. Our goals in this research are twofold: (i) to attain accurate classification of documents by stylistic differences, and (ii) to gain insight into the linguistic nature of the stylistic classes being analyzed. I will present recent results on several stylistic classification problems, including determining the sentiment (positive or negative) of a text and analyzing variation in rhetorical style among scientific articles.
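To make the idea of conditional-frequency features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the feature set used in this research: the three-category lexicon, the tokenizer, and the function names `conditional_frequencies` and `feature_vector` are all hypothetical. Each feature is the frequency of a function word relative to the total occurrences of its functional category in the text.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (category -> function words); an assumption for
# illustration only, not the lexicon used in the research described above.
LEXICON = {
    "pronoun": ["i", "we", "you", "it", "they"],
    "conjunction": ["and", "but", "or", "because"],
    "modal": ["can", "may", "must", "should", "will"],
}


def conditional_frequencies(text):
    """Map (category, word) to count(word) / count(all words of that category)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    features = {}
    for category, words in LEXICON.items():
        total = sum(counts[w] for w in words)
        for w in words:
            # Frequency of the word given that some word of its category
            # occurred; 0.0 when the category never appears in the text.
            features[(category, w)] = counts[w] / total if total else 0.0
    return features


def feature_vector(text):
    """Return the features as a list in a fixed (sorted-key) order."""
    feats = conditional_frequencies(text)
    return [feats[key] for key in sorted(feats)]
```

Vectors of this kind, one per document, could then be passed to any support vector machine implementation (for example, scikit-learn's `LinearSVC`), and the learned feature weights inspected to see which function words separate the stylistic classes.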
This research is partially supported by the NSF and the Binational Science Foundation, and has been carried out in collaboration with several colleagues and students.
Dr. Shlomo Argamon is Associate Professor of Computer Science at the
Illinois Institute of Technology, which he joined in 2002. He previously held academic
positions in Israel at Bar-Ilan University, where he held a Fulbright Postdoctoral Fellowship (1994-96), and at
the Jerusalem College of Technology. Dr. Argamon received his B.S. (1988) in Applied Mathematics from Carnegie-Mellon University,
and his M.Phil. (1991) and Ph.D. (1994) in Computer Science from Yale
University, where he was a Hertz Foundation Fellow. His current
research interests lie mainly in the use of machine learning methods
to aid in functional analysis of natural language, with particular
focus on questions of style. During his career, Dr. Argamon has
worked on a variety of problems in experimental machine learning,
ranging from robotic map-learning, to theory revision, to natural
language processing, and has over 50 papers published in these areas.