A Unified POS Tagging Architecture and its Application to Greek
|Authors:||Harris Papageorgiou; Prokopis Prokopidis; Voula Giouli; Stelios Piperidis|
|Book title:||Proceedings of the 2nd Language Resources and Evaluation Conference|
|Organization:||European Language Resources Association|
This paper proposes a flexible, unified tagging architecture that could be incorporated into applications such as information extraction, cross-language information retrieval, term extraction, and summarization, while also providing an essential component for subsequent syntactic processing or lexicographical work. A feature-based, multi-tiered approach to part-of-speech tagging (the FBT tagger) is introduced. FBT is a variant of the well-known transformation-based learning paradigm that aims to improve tagging quality for highly inflective languages such as Greek. In addition, a large-scale experiment on Greek is conducted, and results are presented for a variety of text genres, including financial reports, newswires, press releases, and technical manuals. Finally, the adopted evaluation methodology is discussed.
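For readers unfamiliar with the transformation-based learning paradigm mentioned in the abstract, the following is a minimal sketch of its tagging step only: each token receives an initial tag from a lexicon, and an ordered list of contextual rules then rewrites tags. It is not the FBT tagger itself. The `Rule` type, the toy lexicon, the tag names, and the single rule are all hypothetical, and an English example is used purely for readability. In a real system the rules would be learned from an annotated corpus rather than written by hand.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: change from_tag to to_tag when the previous tag is prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(tokens, lexicon, default="NOUN"):
    """Assign each token its lexicon tag, falling back to a default tag."""
    return [lexicon.get(tok.lower(), default) for tok in tokens]


def apply_rules(tags, rules):
    """Apply each rule in order, scanning left to right.

    Changes take effect immediately, so a later position sees the
    already-corrected tag of its left neighbour.
    """
    tags = list(tags)  # work on a copy; the baseline stays untouched
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy data, invented for illustration only.
LEXICON = {"the": "DET", "can": "VERB", "rusts": "VERB"}
RULES = [Rule(from_tag="VERB", to_tag="NOUN", prev_tag="DET")]

tokens = ["The", "can", "rusts"]
baseline = initial_tags(tokens, LEXICON)
corrected = apply_rules(baseline, RULES)
print(list(zip(tokens, baseline, corrected)))
```

Here the lexicon first tags "can" as a verb, and the rule retags it as a noun because it follows a determiner. For a highly inflective language, the tags would presumably carry morphological features as well, which this sketch does not model.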