High-level Features for Learning Subjective Language across Domains
In this paper, we study features for analyzing subjective content in documents. To that end, we present and evaluate a novel method based on the level of abstraction of nouns. By comparing state-of-the-art features and the level of abstraction of nouns between three annotated corpora and texts downloaded from Wikipedia and web blogs, we show that data sets for classifying opinionated texts can be built automatically from the web at the document level. Moreover, we report accuracy levels of 96.5% within domains and 74.5% across domains.