Automated Building of Sentence-Level Parallel Corpus and Chinese-Hungarian Dictionary

Zhang, Yidi; Liu, Zhongxiu

Student Work

Decades of work in natural language processing have gone into the automated building of parallel corpora and bilingual dictionaries. However, few studies have addressed pairs of high-density character-based languages and medium-density word-based languages, owing to the lack of resources and to fundamental linguistic differences. In this paper, we describe a methodology for creating a sentence-level parallel corpus and an automatically built bilingual dictionary between Chinese (a high-density character-based language) and Hungarian (a medium-density word-based language). This method may be applied to create a Chinese-Hungarian bilingual dictionary for the Sztaki Dictionary project [http://szotar.sztaki.hu/].

This report represents the work of one or more WPI undergraduate students submitted to the faculty as evidence of completion of a degree requirement. WPI routinely publishes these reports on its website without editorial or peer review.
