Joey NMT Transformer: the input format of the model differs slightly from the classical Transformer.
It relies on byte pair encoding (BPE) to split words into sub-words according to their frequency in the training corpus.
BPE model: a sub-word segmentation algorithm that encodes rare and unknown words as sequences of sub-word units.
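To make the merge procedure concrete, here is a toy sketch of BPE learning in Python, following the reference algorithm from Sennrich et al. (2016); the word counts below are made up for illustration and are not taken from the actual corpus.

```python
import re
import collections

def get_stats(vocab):
    """Count how often each adjacent pair of symbols occurs in the vocabulary."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[symbols[i], symbols[i + 1]] += freq
    return pairs

def merge_vocab(pair, v_in):
    """Replace every occurrence of the chosen pair with its merged symbol."""
    v_out = {}
    bigram = re.escape(' '.join(pair))
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    for word in v_in:
        v_out[pattern.sub(''.join(pair), word)] = v_in[word]
    return v_out

# Hypothetical word frequencies; '</w>' marks the end of a word.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}

for _ in range(10):                   # number of merge operations to learn
    pairs = get_stats(vocab)
    best = max(pairs, key=pairs.get)  # the most frequent pair is merged next
    vocab = merge_vocab(best, vocab)
    print(best)                       # prints the merge learned at each step
```

In practice a library such as subword-nmt or sentencepiece is used instead of this toy loop.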
Dataset: English (source), Korean (target).
See the Joey NMT repository: https://github.com/joeynmt/joeynmt
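As a rough illustration of how such a parallel corpus might be prepared before training, the sketch below learns a separate BPE model per language with the subword-nmt package and segments both sides; the file names (train.en, train.ko, bpe.codes.*) and the number of merge operations are assumptions for this sketch, not part of the original setup.

```python
# Minimal preprocessing sketch (not the official Joey NMT pipeline), assuming
# `pip install subword-nmt` and hypothetical files train.en (source) and
# train.ko (target).
import codecs
from subword_nmt.learn_bpe import learn_bpe
from subword_nmt.apply_bpe import BPE

for lang in ("en", "ko"):
    # Learn the BPE merge operations from this language's training text.
    with codecs.open(f"train.{lang}", encoding="utf-8") as infile, \
         codecs.open(f"bpe.codes.{lang}", "w", encoding="utf-8") as codes:
        learn_bpe(infile, codes, num_symbols=10000)

    # Apply the learned merges, writing the sub-word-segmented corpus.
    bpe = BPE(codecs.open(f"bpe.codes.{lang}", encoding="utf-8"))
    with codecs.open(f"train.{lang}", encoding="utf-8") as src, \
         codecs.open(f"train.bpe.{lang}", "w", encoding="utf-8") as out:
        for line in src:
            out.write(bpe.process_line(line))
```

The segmented files are then referenced from a Joey NMT YAML configuration, and training is launched with `python3 -m joeynmt train <config>.yaml`, as described in the repository's README.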