I am researching an algorithm to translate English <-> Chinese and English <-> Japanese. In short:
- A library will perform the morphological and part-of-speech analysis of each sentence: whether each word is a noun, verb, adjective, and so on, and with what degree of certainty.
- A graphical interface will let a person send text to the algorithm and receive text back from it.
- The algorithm will understand the text and the grammar of each language through associations between words and their meanings. For example, nouns will be associated with other words and with "real" objects; verbs will either be deduced by the algorithm as it reads text or be implemented directly in the code.
- The meaning of words will come from a dictionary.
- Text that the algorithm reads can be treated either as knowledge to learn from or as text to translate. The text sent to the algorithm will itself state which of the two it is.
- To make learning easier, the text may contain metadata such as bold or italic formatting.
- The translation will be done using the grammar of each language and the meaning of each word.
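
As a rough illustration of the first steps in the list above, here is a minimal Python sketch. Everything in it is an assumption made for illustration and not the final design: the `Token` type, the `TOY_TAGS` table that stands in for the analysis library, the `TOY_DICTIONARY` entries, the `knowledge:` / `translate:` prefix used to mark how a text should be read, and the function names. It only tags words and looks up their meanings; it does not yet apply any grammar, so it is not a translator.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str          # part of speech, e.g. "noun" or "verb"
    certainty: float  # confidence of the tag, between 0 and 1


# Hypothetical stand-in for the analysis library: word -> (tag, certainty).
TOY_TAGS = {
    "the": ("determiner", 0.99),
    "cat": ("noun", 0.98),
    "eats": ("verb", 0.95),
    "fish": ("noun", 0.80),
}

# Hypothetical toy dictionary: (English word, tag) -> Japanese word.
TOY_DICTIONARY = {
    ("cat", "noun"): "猫",
    ("eats", "verb"): "食べる",
    ("fish", "noun"): "魚",
}


def parse_request(raw):
    """Split 'mode: text' into its mode marker and the text itself."""
    mode, _, text = raw.partition(":")
    mode = mode.strip().lower()
    if mode not in ("knowledge", "translate"):
        raise ValueError(f"unknown mode: {mode!r}")
    return mode, text.strip()


def analyze(sentence):
    """Tag every word; words missing from the toy table get certainty 0."""
    tokens = []
    for word in sentence.lower().split():
        pos, certainty = TOY_TAGS.get(word, ("unknown", 0.0))
        tokens.append(Token(word, pos, certainty))
    return tokens


def lookup_meanings(tokens, min_certainty=0.5):
    """Pair each confidently tagged word with its dictionary entry, if any."""
    return [
        (t.text, TOY_DICTIONARY.get((t.text, t.pos)))
        for t in tokens
        if t.certainty >= min_certainty
    ]


mode, text = parse_request("translate: The cat eats fish")
tokens = analyze(text)
meanings = lookup_meanings(tokens)
```

Keying the dictionary on both the word and its tag is one possible way to keep, for example, the noun "fish" separate from the verb "fish". The certainty threshold lets uncertain tags be skipped before any grammar step runs.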
The amount I am asking for will be spent on equipment (a $5k PC) and on my wages for 12 months of full-time development.
When development finishes, I will create a site that uses this algorithm to translate text.
The progress of the project will be updated on this page.