Multimodal multilingual information processing for automatic subtitle generation: Resources, Methods and System Architecture
Authors: Stelios Piperidis; Iason Demiros; Prokopis Prokopidis
Book title: Proceedings of the Workshop on E-Tools and Translation of the 5th International Languages and the Media Conference and Exhibition
In view of the expansion of digital television and the increasing demand to manipulate audiovisual content, tools that produce subtitles in a multilingual setting have become indispensable for the subtitling industry. Operating in this setting, the MUSA project aims at the development of a system that combines speech recognition, advanced text analysis, and machine translation to help generate multilingual subtitles: a system that converts audio streams into text transcriptions, condenses the content to meet the spatio-temporal constraints of the subtitling process, and produces draft translations in two languages.

The architecture of the MUSA multilingual subtitle production line includes the following functional blocks:

-- an English automatic speech recognition (ASR) subsystem for the transcription of English audio streams into text, including speech/non-speech separation, speaker turn detection, and alignment of the audio signal with an existing transcript

-- a subtitling subsystem producing English subtitles from English audio transcriptions, aiming to provide maximum comprehension while complying with spatio-temporal constraints and linguistic parameters

-- a multilingual translation subsystem integrating translation memories, machine translation, and terminological banks, for translating subtitles from English into Greek and French

Three languages are currently supported: English as both source and target language for subtitle generation, and French and Greek as subtitle translation target languages. However, special care has been taken during the system design phase to ensure that the system architecture remains open, so that new languages can easily be added.
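The three-stage production line described above (transcription, condensation to subtitle constraints, translation) could be sketched as follows. All function names, the character limit, and the toy condensation and translation logic are illustrative assumptions for exposition only, not the MUSA implementation, which relies on full ASR, linguistic compression, and TM/MT components.

```python
# Hypothetical sketch of the MUSA-style subtitle pipeline: ASR -> condensation -> MT.
# Every name and rule here is an assumption, not the project's actual code.

def transcribe(audio_segments):
    """ASR stage stand-in: keeps only speech segments and assumes each
    segment already carries its transcript text."""
    return [seg["text"] for seg in audio_segments if seg["is_speech"]]

def condense(transcript, max_chars=37):
    """Condensation stand-in: naive truncation to a single subtitle line
    (a real system compresses linguistically rather than truncating).
    37 characters is a common per-line subtitling limit, assumed here."""
    if len(transcript) <= max_chars:
        return transcript
    return transcript[: max_chars - 3].rstrip() + "..."

def translate(subtitle, target):
    """MT stage stand-in: a toy lookup table in place of translation
    memories, machine translation, and terminological banks."""
    toy_memory = {
        ("Hello world", "fr"): "Bonjour le monde",
        ("Hello world", "el"): "Geia sou kosme",
    }
    return toy_memory.get((subtitle, target), subtitle)

def subtitle_pipeline(audio_segments, targets=("fr", "el")):
    """Runs the three functional blocks in sequence and returns one
    subtitle track per target language."""
    subtitles = [condense(text) for text in transcribe(audio_segments)]
    return {tgt: [translate(s, tgt) for s in subtitles] for tgt in targets}
```

For example, feeding the pipeline one speech segment and one non-speech segment yields a French and a Greek subtitle track derived from the speech segment only.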