Expected results and impacts:
The TTC project develops generic methods and tools for the automatic extraction and alignment of terminologies, together with adaptors for specific domains and languages, in order to break the lexical acquisition bottleneck in both statistical and rule-based machine translation.
It will also develop or adapt tools for gathering and managing comparable corpora and for managing terminologies. In particular, a topical web crawler and an open terminology platform will be developed. The platform will allow users to:
- create thematic corpora from a few clues (such as terms or documents from a specific domain);
- extract monolingual terminology from such corpora;
- create a comparable corpus in a target language from a corpus in a source language;
- align bilingual terminologies;
- choose which tools to apply for terminology extraction;
- expand a given corpus;
- export monolingual or bilingual terminologies so that they can easily be used in automatic and semi-automatic translation tools.
The impact of the translation tools and methods resulting from the TTC project will be evaluated in four application domains:
- Computer Assisted Translation (CAT) tools;
- Machine Translation (MT) tools;
- Terminology management tools.