I already know that NLLB supports translation between 200 languages, but I want to improve the quality of EN-DE translation, so I want to fine-tune NLLB for this task. I have no experience with this model and would appreciate some help.
Hmmm…
EN↔DE fine-tuning is way simpler than adding a totally new language.
You definitely shouldn’t delete or reorder any language codes. Both English and German already exist in the tokenizer, so you don’t need to touch that part at all. Just train the model further on your parallel EN-DE corpus.
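Here's a minimal sketch of what that continued training can look like with the Transformers Seq2SeqTrainer. The toy in-memory corpus, output directory, and hyperparameters are placeholders you'd replace with your own data and settings:

```python
# Minimal sketch of continued EN->DE fine-tuning of NLLB.
# The toy corpus, output_dir, and hyperparameters are placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(
    model_name, src_lang="eng_Latn", tgt_lang="deu_Latn"
)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy parallel corpus; swap in your real EN-DE data here.
raw = Dataset.from_dict({
    "en": ["How are you?", "The weather is nice today."],
    "de": ["Wie geht es dir?", "Das Wetter ist heute schön."],
})

def preprocess(batch):
    # text_target is tokenized with the tgt_lang ("deu_Latn") set above.
    return tokenizer(
        batch["en"], text_target=batch["de"],
        max_length=128, truncation=True,
    )

tokenized = raw.map(preprocess, batched=True, remove_columns=["en", "de"])

args = Seq2SeqTrainingArguments(
    output_dir="nllb-en-de-ft",      # placeholder path
    per_device_train_batch_size=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    logging_steps=50,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```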
I put together an updated 2025 tutorial that fixes the outdated code from earlier guides. You can ignore the tokenizer-expansion parts and focus on the data loading + training loop in the Colab:
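To sanity-check translations before and after training, you can generate with the target language forced as the first decoded token. A small self-contained example, using the base checkpoint as a stand-in for your fine-tuned one:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"  # swap for your fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force decoding into German; NLLB language codes are special tokens in the tokenizer.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```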
It's on my Medium, which I'm not allowed to link in this comment, so please check my profile.