
Unbabel says new TowerLLM AI model beats OpenAI's GPT-4 at translation

Betty Brown
Jun 06, 2024

How does TowerLLM's performance compare to GPT-4o and other models in translating between English and six commonly spoken European and Asian languages?

Socrates

According to Unbabel, its new TowerLLM model outperforms OpenAI's GPT-4o and other commercially available AI systems at translating between English and six widely spoken European and Asian languages. Unbabel benchmarked TowerLLM against GPT-4o, GPT-4, GPT-3.5, and competing models from Google and DeepL, and reports that TowerLLM achieved higher accuracy in almost every case. Its largest margin was on English-Korean translation, where it beat OpenAI's best models by about 1.5%; the main exception was English-German, where GPT-4 and GPT-4o were slightly better, by a fraction of a percentage point. TowerLLM was also tested on translating documents from professional domains such as finance, medicine, law, and technical writing, where it reportedly scored between 1% and 2% higher than OpenAI's best models.

It's important to note that Unbabel's results have not been independently verified. If they hold up, they would suggest that GPT-4, long regarded as the top performer on most language benchmarks, is beginning to be overtaken by newer AI systems trained with different methods.

Richard42 Garcia
Jun 06, 2024

What specific AI model has Unbabel developed that reportedly outperforms OpenAI's GPT-4o in language translation?

Socrates

The model Unbabel has developed is called TowerLLM. The company says it was designed, trained, and optimized specifically for translation, and claims it is the best-performing translation LLM commercially available: according to Unbabel, it delivers better translation quality than leading general-purpose LLMs such as GPT-4o and GPT-3.5 while consistently outperforming Google and DeepL. Unbabel attributes this to TowerLLM being built from scratch to be multilingual.

Sarah$
Jun 06, 2024

How was TowerLLM trained to perform multilingual tasks and on what types of datasets was it trained?

Socrates

TowerLLM was trained for multilingual tasks in two steps. First, it was pretrained on a large public dataset of multilingual text. Unlike the training sets of most other LLMs, which skew heavily toward English, this corpus was deliberately built to include a large share of multilingual text, which Unbabel says lets the model reason better across multiple languages than some competing open-source AI models.

Next, TowerLLM was fine-tuned on a carefully curated dataset of high-quality translations between language pairs. To curate that dataset, Unbabel used another of its AI models, COMETKiwi, which is trained to assess translation quality. This quality-filtered fine-tuning step was central to the model's reported translation performance.
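COMETKiwi is publicly available through Unbabel's open-source unbabel-comet Python package, so the curation step is easy to sketch. The snippet below is a minimal illustration of quality-based filtering, not Unbabel's actual pipeline: it uses the public wmt22-cometkiwi-da checkpoint (which requires accepting its license on Hugging Face), and the score threshold and sample pairs are assumptions made for the example.

```python
# pip install unbabel-comet
# Illustrative sketch only: filtering candidate translation pairs with a
# quality-estimation model, in the spirit of the curation step described
# above. Checkpoint, threshold, and data are assumptions, not Unbabel's
# actual pipeline.
from comet import download_model, load_from_checkpoint

# COMETKiwi is reference-free: it scores a translation from the source
# sentence and the translation alone, with no human reference needed.
ckpt_path = download_model("Unbabel/wmt22-cometkiwi-da")
qe_model = load_from_checkpoint(ckpt_path)

candidate_pairs = [
    {"src": "The agreement enters into force next month.",
     "mt": "Der Vertrag tritt nächsten Monat in Kraft."},
    {"src": "Please restart the device before updating.",
     "mt": "Bitte das Gerät neu starten, bevor aktualisieren."},  # awkward MT
]

# Scores from the -da models fall roughly in [0, 1]; higher is better.
output = qe_model.predict(candidate_pairs, batch_size=8, gpus=0)

QUALITY_THRESHOLD = 0.8  # assumed cutoff, chosen for illustration
curated = [pair for pair, score in zip(candidate_pairs, output.scores)
           if score >= QUALITY_THRESHOLD]

print(f"kept {len(curated)} of {len(candidate_pairs)} pairs")
```

A production pipeline would presumably score millions of candidate pairs this way and keep only the highest-scoring ones for fine-tuning, which matches the role the article describes for COMETKiwi.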

In summary, TowerLLM was first trained on a large multilingual dataset and then fine-tuned on a quality-curated set of translations, a combination Unbabel credits for its strong translation performance.
