
RWS TrainAI Study Finds Claude Sonnet, GPT and Gemini Pro Lead in Synthetic Data Generation
MAIDENHEAD, England, April 29, 2025 — When it comes to large language models (LLMs) and their ability to generate sentences and conversations, Claude Sonnet, GPT and Gemini Pro come out on top, according to TrainAI’s latest LLM benchmarking study.
Unlike typical automated LLM benchmarks that assess performance on closed questions, TrainAI’s LLM Synthetic Data Generation Study used human expert evaluators to test the ability of popular LLMs to generate sentences and conversations, assessing their general natural language processing (NLP) skills across a variety of languages.
“We conducted this study because reports suggest that the largest companies behind today’s state-of-the-art LLMs are running out of data to train their newest models,” explains Tomáš Burkert, TrainAI’s technical solutions lead on the benchmarking project. “Companies like OpenAI, Anthropic and Google are exploring the use of synthetic data generated by the LLMs themselves (as opposed to humans) to train and fine-tune their AI models. We wanted to explore the potential impact of using LLMs to generate training and fine-tuning data for AI.”
Nine LLMs were tested on six data generation tasks varying in complexity, across eight carefully selected languages with varying representation. For each language, three native-speaking language specialists evaluated the LLM-generated outputs against specific criteria (such as grammar and naturalness). Overall, 38,000 sentences were generated, 115,000 annotations were submitted, and 250,000 ratings on a scale from 1 (very poor) to 5 (very good) were provided by 27 linguists across the globe.
“Because AI is built for humans, we chose humans – not AI – to evaluate LLM performance. Our study found that no single model outperformed the rest when generating synthetic data across languages and tasks, but some models performed better than others on key criteria like language proficiency, instruction adherence, creativity, speed and cost,” said Vasagi Kothandapani, President of Enterprise Services at RWS. “The study underscores the importance of assessing the strengths and limitations of multiple LLMs for specific AI use cases or applications. Only then can genuine value and positive business impact be realized.”
A copy of TrainAI’s LLM Synthetic Data Generation Study is available for download.
About RWS
RWS Holdings plc is a unique, world-leading provider of technology-enabled language, content and intellectual property services. Through content transformation and multilingual data analysis, our combination of AI-enabled technology and human expertise helps our clients to grow by ensuring they are understood anywhere, in any language. Our purpose is unlocking global understanding. By combining cultural understanding, client understanding and technical understanding, our services and technology help our clients acquire and retain customers, deliver engaging user experiences, maintain compliance and gain actionable insights into their data and content.
Source: RWS