How to build an accurate translation engine?

A few months ago I came up with my own approach for translating text (computer characters) from any source language to a destination language. I implemented it in Lua (for desktop users) and as a C++ class (for native access), so that I can embed it in a web browser and similar hosts. I am wondering whether something better already exists for this in C++ or Lua.

Mine sometimes fails to translate grammar correctly, or even to apply its rules correctly. Before building it I thought my approach would be the best way to get this done, but it is taking far too long now, and I am afraid it may turn out to be the wrong implementation. I would now like to look at other approaches and compare them with mine.

I have used Google Translate and similar services, but using them is not my goal. I am building a translation engine (like Google's or others'), where someone can supply their own dictionary and create rules.
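
To make the idea concrete, here is a minimal sketch of what a "dictionary plus rules" engine could look like. It is not the actual implementation discussed in this question: the names `Engine`, `Entry`, `SwapRule`, `addWord`, `addSwapRule` and `translate`, the tag labels, and the toy English-to-Spanish entries are all assumptions made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy engine: word-for-word dictionary lookup followed by
// reordering rules that swap adjacent words based on user-assigned tags.
struct Entry {
    std::string target;  // translated word
    std::string tag;     // user-assigned word class, e.g. "ADJ" or "NOUN"
};

struct SwapRule {
    std::string first;   // tag of the left word
    std::string second;  // tag of the right word
};

class Engine {
public:
    void addWord(const std::string& source, const std::string& target,
                 const std::string& tag) {
        dictionary_[source] = Entry{target, tag};
    }

    void addSwapRule(const std::string& first, const std::string& second) {
        rules_.push_back(SwapRule{first, second});
    }

    std::string translate(const std::string& sentence) const {
        // Step 1: split on whitespace and look each word up.
        std::vector<Entry> words;
        std::istringstream in(sentence);
        std::string token;
        while (in >> token) {
            auto it = dictionary_.find(token);
            // Unknown words pass through unchanged with the tag "UNK".
            words.push_back(it != dictionary_.end() ? it->second
                                                    : Entry{token, "UNK"});
        }
        // Step 2: apply each rule once, scanning left to right.
        for (const SwapRule& rule : rules_) {
            for (std::size_t i = 0; i + 1 < words.size(); ++i) {
                if (words[i].tag == rule.first &&
                    words[i + 1].tag == rule.second) {
                    std::swap(words[i], words[i + 1]);
                    ++i;  // do not re-examine the pair just swapped
                }
            }
        }
        // Step 3: join the translated words with single spaces.
        std::string out;
        for (std::size_t i = 0; i < words.size(); ++i) {
            if (i > 0) out += ' ';
            out += words[i].target;
        }
        return out;
    }

private:
    std::map<std::string, Entry> dictionary_;
    std::vector<SwapRule> rules_;
};
```

With the entries `the → el (DET)`, `red → rojo (ADJ)`, `car → coche (NOUN)` and one rule `addSwapRule("ADJ", "NOUN")`, calling `translate("the red car")` yields `el coche rojo`. A sketch like this also shows where the grammar problems come from: a word-for-word lookup knows nothing about agreement, inflection, or words with more than one meaning, so every such case needs yet another hand-written rule.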

Is there an existing translation framework or library (OpenCog or Moses, for example) for translating from a source language to a destination language, for example Arabic to Chinese or English to Japanese? Or what else are Google and others using?

Any suggestions would be appreciated.

Thanks in advance.


Did you take a look at the Google Translator Toolkit API? By analyzing its features you can get a glimpse of what it implements and what you may need in order to develop your own translation framework (a lot of work, by the way).

Creating/Uploading translation documents

Full list of supported source and target languages

http://www.leniel.net/2010/12/playing-google-translator-toolkit-api.html

More to the stack:

Free/open-source machine translation systems and tools

GNU gettext

TinyTM - Open-Source Translation Memory


Moses is a pretty good open-source translation library for C++. cdec represents the current state of the art (but requires context-free grammars for both the source and the target language). Both require large amounts of training data, i.e. parallel corpora.
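
To illustrate what "parallel corpora" means and why a lot of data is needed, here is a toy sketch that guesses a word's translation from sentence-aligned pairs using the Dice coefficient over co-occurrence counts. This is not how Moses or cdec are trained; the function names `wordsOf` and `guessTranslation` and the `SentencePair` alias are hypothetical, and the technique is only a simple stand-in for real word alignment.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One aligned example: (source sentence, target sentence).
using SentencePair = std::pair<std::string, std::string>;

// Returns the distinct whitespace-separated words of a sentence.
static std::set<std::string> wordsOf(const std::string& sentence) {
    std::set<std::string> words;
    std::istringstream in(sentence);
    std::string word;
    while (in >> word) words.insert(word);
    return words;
}

// Hypothetical helper: returns the target word with the highest Dice score
//   2 * count(source and target together) / (count(source) + count(target))
// counted over sentence pairs, or "" if sourceWord never occurs.
std::string guessTranslation(const std::vector<SentencePair>& corpus,
                             const std::string& sourceWord) {
    int sourceCount = 0;                     // pairs containing sourceWord
    std::map<std::string, int> targetCount;  // pairs containing each target word
    std::map<std::string, int> jointCount;   // pairs containing both

    for (const SentencePair& pair : corpus) {
        const std::set<std::string> source = wordsOf(pair.first);
        const std::set<std::string> target = wordsOf(pair.second);
        const bool hasSource = source.count(sourceWord) > 0;
        if (hasSource) ++sourceCount;
        for (const std::string& t : target) {
            ++targetCount[t];
            if (hasSource) ++jointCount[t];
        }
    }

    std::string best;
    double bestScore = 0.0;
    for (const auto& entry : jointCount) {
        const double score =
            2.0 * entry.second / (sourceCount + targetCount[entry.first]);
        // Ties keep the earlier (alphabetically smaller) candidate.
        if (score > bestScore) {
            bestScore = score;
            best = entry.first;
        }
    }
    return best;
}
```

Given the three pairs ("the house", "la casa"), ("the car", "el coche") and ("a house", "una casa"), `guessTranslation(corpus, "house")` returns `casa`, because the two words always appear together. The same three pairs are nowhere near enough to sort out "the", which is one reason real systems need very large corpora.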

When you've finished, run to your university and demand a PhD.


I hate to discourage you, but you are trying to single-handedly solve the problem of Machine Translation. MT systems like Systran have been developed by teams of scientists and engineers for decades and they are still far from perfect.
