How neural machine translation works
Neural machine translation is organized in a fundamentally different way from the previous approach. In terms of the Vauquois triangle, it can be described as follows.
If we add more colors or shapes, we can also increase the number of dimensions, so that each point represents a different object and the distance between two points reflects how similar the objects are.
The basic idea is that this also works for word embeddings. Instead of shapes there are words, and the space is much larger (for example, we use 800 dimensions), but words can be represented in such a space with the same properties as shapes.
Consequently, words with common properties and attributes will be located close to each other. For example, you can imagine that part of speech is one dimension, grammatical gender (if any) is another, positive or negative meaning may be a third, and so on.
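To make the idea of "close" words concrete, here is a minimal sketch. It assumes a tiny hand-made embedding table with three dimensions instead of 800. The words, the numbers, and the choice of cosine similarity as the closeness measure are all assumptions made for illustration and are not taken from any real model.

```python
import math

# Hypothetical embedding table: each word maps to a vector.
# Real systems learn hundreds of dimensions; these values are invented.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.3],
    "dog": [0.8, 0.2, 0.35],
    "run": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other word in the table that is most similar to `word`."""
    query = EMBEDDINGS[word]
    others = (w for w in EMBEDDINGS if w != word)
    return max(others, key=lambda w: cosine_similarity(query, EMBEDDINGS[w]))


print(nearest("cat"))
```

In this toy table the two nouns end up closer to each other than either is to the verb, which is the property the paragraph above describes.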
We do not know exactly how these embeddings are formed. In another article (https://doctranslator.com/), we will analyze embeddings in more detail, but the idea itself is as simple as arranging shapes in space.
Let's go back to the translation process. The second step is as follows. At this stage, a complete sequence is formed with a focus on the "source context", after which the target words are generated one by one, using:
The "target context", which is formed together with the previous word and provides some information about the state of the translation process.
A weighted "source context", which is a mixture of the different "source contexts" produced by a particular model called the Attention Model. We will analyze what it is in another article. In short, the Attention Model selects which source words to use for translation at each stage of the process.
The previously translated word, which word embedding converts into a vector that the decoder can process.
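The mixing of source contexts can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not say how relevance is scored, so the sketch uses a dot product followed by a softmax, which is one common choice. The function names and the two-dimensional vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(target_context, source_contexts):
    """Mix the source contexts, weighted by relevance to the target context."""
    # Assumed relevance score: dot product of target and source context.
    scores = [
        sum(t * s for t, s in zip(target_context, src))
        for src in source_contexts
    ]
    weights = softmax(scores)
    dim = len(source_contexts[0])
    mixed = [
        sum(w * src[i] for w, src in zip(weights, source_contexts))
        for i in range(dim)
    ]
    return weights, mixed


# Three invented source contexts, one per source word.
source_contexts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
target_context = [2.0, 0.0]
weights, weighted_source_context = attend(target_context, source_contexts)
print(weights, weighted_source_context)
```

The weights show which source word the decoder is "looking at" for the current step. Here the first source context receives the largest weight because it points in the same direction as the target context.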
The translation is complete once the decoder has generated the last word of the sentence.
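The word-by-word generation and its stopping condition can be sketched as a simple loop. The sketch assumes that the end of the sentence is signalled by a special marker, which is a common convention but is not stated in this article. The `toy_next_word` function is a hypothetical stand-in for the real decoder: it looks at the previous word only and ignores the target and source contexts.

```python
END = "</s>"  # hypothetical end-of-sentence marker


def toy_next_word(previous_word):
    """Stand-in for the decoder: a fixed lookup instead of a neural network."""
    table = {"<s>": "the", "the": "cat", "cat": "sleeps", "sleeps": END}
    return table.get(previous_word, END)


def decode(next_word, start="<s>", max_len=20):
    """Generate words one by one until the end marker or the length cap."""
    output = []
    previous = start
    for _ in range(max_len):  # the cap guards against never reaching END
        word = next_word(previous)
        if word == END:
            break
        output.append(word)
        previous = word  # the generated word feeds the next step
    return output


print(decode(toy_next_word))
```

A real decoder would replace `toy_next_word` with a model that also takes the target context and the weighted source context as input; the loop structure stays the same.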
The whole process is undoubtedly very mysterious, and we will need several articles to examine its individual parts. The main thing to remember is that the operations in neural machine translation follow the same sequence as in rule-based machine translation, but the nature of the operations and of the objects being processed is completely different. These differences begin with converting words into vectors by embedding them in tables. Understanding this point is enough to follow what is happening in the following examples: Wiki.