email: baghbani.hamed [at] ut.ac.ir
Crowd Post-Editing Tool for Machine Translation Outputs
In recent years, technological advances in machine translation have produced systems of higher quality. At the same time, the relationship between human and machine has become an increasingly important part of the translation industry. One example of this relationship is post-editing, in which a human corrects and edits the text produced by machine translation software. In this research, we improve the FaraazinBar post-editing tool. FaraazinBar is a plug-in for Microsoft Office Word that adds post-editing capabilities to MS Office. The research consists of two main parts. First, we assess post-editing with the FaraazinBar tool on the output of three different types of machine translation systems and compare it with manual post-editing. Second, we present a crowd-sourcing model for creating parallel data from users' post-editing data; in particular, we present a quality-estimation model to validate the quality of users' post-editing data. The experiments conducted in this study likewise have two parts: post-editing tests in terms of productivity and quality improvement, and an evaluation of the quality-estimation model. The first part of the experiments shows that using FaraazinBar improves the speed of the post-editing process in addition to improving the speed of translation. These experiments also show that statistical machine translation systems yield greater improvements in post-editing quality and time than both neural and rule-based machine translation systems, and that the human translator can also reach a better translation quality. The second part of the experiments shows that the proposed method has a correlation of 0.91 with the Translation Edit Rate (TER) measure, which indicates that the proposed method is reliable.
Keywords: post-editing, machine translation, post-editing tool, translation quality validation, quality estimation.
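To make the evaluation idea concrete, the following is a minimal Python sketch of how a quality-estimation score could be compared against an edit-rate measure. It is an illustration under stated assumptions, not the method used in this work: `simple_ter` is a simplified word-level edit rate (insertions, deletions, and substitutions only, without the block-shift operation of full TER), and the function names `word_edit_distance`, `simple_ter`, and `pearson` are hypothetical.

```python
from math import sqrt


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simple_ter(mt_output, post_edited):
    """Simplified edit rate: edits / reference length.

    Assumption: unlike full TER, block shifts are not modelled.
    """
    hyp, ref = mt_output.split(), post_edited.split()
    if not ref:
        raise ValueError("post-edited reference must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative sentence pair (made up, not data from the study):
# one missing word against a six-word post-edited reference.
rate = simple_ter("the cat sat on mat", "the cat sat on the mat")
```

Given a list of quality-estimation scores and the corresponding edit rates for the same segments, `pearson(scores, rates)` would give a correlation figure of the kind reported above. A production setup would more likely rely on an established TER implementation.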