A comprehensive collection of syllable-aware tokenizers optimized for Burmese-English NLP tasks, developed by DatarrX.
Note: Official training data for the myX series.