Large Scale Book Annotation with Social Tags
Keywords: Machine Learning, Text Annotation, Social Tags
We describe work on large-scale automatic annotation of the full texts of books with social tags. Our task consisted of assigning tags to the full texts of works of fiction and evaluating them against tags assigned by humans. We compared Boosting and Relevance Models (RM) methods to explore how they differ, primarily in terms of scalability and also in annotation quality. We extended beyond the set of 50 tags used in earlier work to sets ranging up to 10,000 tags. We show that an RM-based algorithm scales significantly better than a Boosting-based algorithm when dealing with large sets of tags.