SKILL: A System for Skill Identification and Normalization
Named Entity Recognition (NER) and Named Entity Normalization (NEN) refer, respectively, to recognizing entity mentions in raw text and mapping them to known entities. From the perspective of recruitment innovation, characterizing and normalizing professional skills makes human capital data more meaningful both commercially and socially. Accurate and detailed normalization of skills is key to the predictive analysis of labor market dynamics. Such analytics help bridge the skills gap between employers and candidates by matching the right talent to the right job and by identifying in-demand skills for workforce training programs. They can also serve the social goal of providing more job opportunities to the community. In this paper, we propose an automated approach to skill entity recognition and optimal normalization. The proposed system has two components: 1) skills taxonomy generation, which uses the vocational skill-related sections of resumes together with Wikipedia categories to define and develop a taxonomy of professional skills; and 2) skills tagging, which leverages properties of semantic word vectors to recognize and normalize relevant skills in input text. In a sampling-based end-user evaluation, the current system attains 91% accuracy on the taxonomy generation task and 82% accuracy on the skills tagging task. The beta version of the system is currently used in various big data and business intelligence applications for workforce analytics and career track projections at CareerBuilder.
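To make the skills tagging idea concrete, the following minimal sketch shows one way a raw phrase could be normalized to a taxonomy entry using word vectors. It is an illustration under stated assumptions, not the system described in this paper: the abstract does not specify the matching procedure, so the sketch assumes averaged word vectors, cosine similarity, and a fixed acceptance threshold. The vectors, the taxonomy entries, and the names `WORD_VECTORS`, `TAXONOMY`, `phrase_vector`, and `normalize_skill` are all hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load vectors trained on a large corpus.
WORD_VECTORS = {
    "java": [0.9, 0.1, 0.0],
    "programming": [0.8, 0.2, 0.1],
    "nursing": [0.0, 0.9, 0.2],
    "patient": [0.1, 0.8, 0.3],
    "care": [0.1, 0.7, 0.4],
}

# Hypothetical taxonomy: normalized skill name -> representative vector.
TAXONOMY = {
    "Java": [0.9, 0.1, 0.0],
    "Nursing": [0.0, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def phrase_vector(phrase):
    """Average the vectors of the known tokens in a phrase (None if none are known)."""
    vecs = [WORD_VECTORS[t] for t in phrase.lower().split() if t in WORD_VECTORS]
    if not vecs:
        return None
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def normalize_skill(phrase, threshold=0.8):
    """Return the closest taxonomy skill, or None if nothing is similar enough.

    The threshold value is an assumption chosen for this toy data.
    """
    vec = phrase_vector(phrase)
    if vec is None:
        return None
    best_name, best_score = max(
        ((name, cosine(vec, skill_vec)) for name, skill_vec in TAXONOMY.items()),
        key=lambda pair: pair[1],
    )
    return best_name if best_score >= threshold else None


if __name__ == "__main__":
    for raw in ["Java programming", "patient care", "underwater basket weaving"]:
        print(raw, "->", normalize_skill(raw))
```

In this sketch, a phrase with no known tokens, or one whose best match falls below the threshold, is left untagged rather than forced onto the nearest skill, which trades some recall for fewer spurious normalizations.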