China’s AI Models Take the Lead in Hugging Face’s Open LLM Leaderboard
In a surprising turn of events, China's Qwen model has taken the top spot in Hugging Face's Open LLM Leaderboard v2, ahead of its US competitors. The Qwen2-72B-Instruct model, developed by Alibaba, achieved an average score of 40 points, making it the only model on the leaderboard to reach that mark.
The Open LLM Leaderboard is a benchmarking platform that evaluates the performance of open-source language models from around the world. Version 2 of the leaderboard uses six benchmarks to assess the models: MMLU-Pro, GPQA, MuSR, MATH, IFEval, and BBH. Together, these benchmarks test knowledge, reasoning over short and long contexts, complex mathematics, and the ability to follow human instructions.
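To make the idea of an "average score" concrete, here is a minimal sketch of how per-benchmark results could be combined into one number and used to rank models. It assumes a plain arithmetic mean over the six benchmarks; the leaderboard's actual scoring pipeline may normalize or weight results differently. The model names and scores are invented placeholders, not real leaderboard data, and the function names are hypothetical.

```python
from statistics import mean

# The six benchmarks named in the article.
BENCHMARKS = ("MMLU-Pro", "GPQA", "MuSR", "MATH", "IFEval", "BBH")


def average_score(scores: dict[str, float]) -> float:
    """Return the plain mean of the six benchmark scores (an assumption)."""
    missing = [b for b in BENCHMARKS if b not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return mean(scores[b] for b in BENCHMARKS)


def rank_models(results: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Sort models by average score, highest first."""
    averaged = ((name, average_score(s)) for name, s in results.items())
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


# Placeholder data for illustration only; these are not real results.
placeholder_results = {
    "model-a": {"MMLU-Pro": 50.0, "GPQA": 10.0, "MuSR": 20.0,
                "MATH": 30.0, "IFEval": 80.0, "BBH": 50.0},
    "model-b": {"MMLU-Pro": 30.0, "GPQA": 10.0, "MuSR": 10.0,
                "MATH": 10.0, "IFEval": 60.0, "BBH": 30.0},
}

if __name__ == "__main__":
    for name, avg in rank_models(placeholder_results):
        print(f"{name}: {avg:.2f}")
```

A single averaged figure makes thousands of models easy to compare, but it can hide large gaps between individual benchmarks, so the per-benchmark scores are still worth reading.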
More than 7,500 models were evaluated, and Qwen2-72B-Instruct emerged as the clear winner. The model's performance has left many in the AI community stunned, especially given that it is an open-source model.
The competition was fierce, with models from major US companies such as Microsoft and Meta making it into the top 10. Even so, Qwen models occupied three of the top 10 spots.
The results of the Open LLM Leaderboard v2 are a testament to the rapid progress being made in the field of artificial intelligence. As AI models continue to improve, we can expect to see even more innovative applications in the future.
Smaug-72B, which ranked 9th in the Open LLM Leaderboard v2, was previously the top model in version 1 of the leaderboard. It was created by fine-tuning Qwen-72B, which itself ranked 3rd in the Open LLM Leaderboard v2.
For more information on the Open LLM Leaderboard v2 and its results, see the Open LLM Leaderboard 2 Space, hosted on Hugging Face by open-llm-leaderboard.
About the Author
This article was written by a journalist from LLM Reporter, an online news publication focused on the latest news and updates on the large language modelling ecosystem.