TTC was presented at RANLP 2011 (Recent Advances in Natural Language Processing), held on September 12-14, 2011, in Hissar, Bulgaria, with the paper "Knowledge-Poor Approach to Shallow Parsing: Contribution of Unsupervised Part-of-Speech Induction" by Marie Guégan and Claude de Loupy (SYLLABS).
Abstract:
Natural language processing tasks often rely on part-of-speech (POS) tagging as a preprocessing step. However, it is not clear how the absence of any part-of-speech tagger should hamper the development of other natural language processing tools. In this paper we investigate the contribution of fully unsupervised part-of-speech induction to a common natural language processing task. We focus on the supervised English shallow parsing task and compare systems relying either on POS induction, on POS tagging, or on lexical features only as a baseline. Our experiments on the English CoNLL'2000 dataset show a significant benefit from POS induction over the baseline, with performances close to those obtained with a traditional POS tagger. Results demonstrate a great potential of POS induction for shallow parsing which could be applied to resource-scarce languages.
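
To make the three compared settings concrete, here is a minimal Python sketch of how per-token features for a shallow parser might differ between them. It is not the authors' system: the function name `token_features`, the feature keys, the context window of one token, and the cluster ids are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def token_features(words: List[str], i: int,
                   labels: Optional[List[str]] = None) -> Dict[str, str]:
    """Build features for the token at position i (hypothetical helper).

    labels is None for the lexical-only baseline. Otherwise it holds one
    label per token: a POS tag from a tagger, or an induced cluster id.
    """
    # Lexical features: the current word and its immediate neighbours.
    feats = {"w0": words[i].lower()}
    feats["w-1"] = words[i - 1].lower() if i > 0 else "<s>"
    feats["w+1"] = words[i + 1].lower() if i + 1 < len(words) else "</s>"
    if labels is not None:
        # Label features over the same window.
        feats["t0"] = labels[i]
        feats["t-1"] = labels[i - 1] if i > 0 else "<s>"
        feats["t+1"] = labels[i + 1] if i + 1 < len(labels) else "</s>"
    return feats


words = ["He", "reckons", "the", "deficit"]
pos_tags = ["PRP", "VBZ", "DT", "NN"]   # as produced by a supervised tagger
clusters = ["C7", "C2", "C1", "C4"]     # made-up ids standing in for induced classes

baseline = [token_features(words, i) for i in range(len(words))]
tagged = [token_features(words, i, pos_tags) for i in range(len(words))]
induced = [token_features(words, i, clusters) for i in range(len(words))]
```

In this sketch the tagger-based and induction-based settings share the same feature template and differ only in where the labels come from, so a chunker trained on `induced` needs no annotated POS data.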




