Published: April 16, 2026
Updated: April 18, 2026
Written by: Lusiana Liu

Corali-Lang Benchmark: Detailed Analysis

QWEN 3 NEXT 80B THINKING

The answers from Qwen 3 Next 80B Thinking show that the model is still very weak at reading and applying patterns. This can be seen in answers such as “Driveang”, “Hideang”, “Scaread”, “Lieang”, “Riseang”, “Liveang”, “Moveang”, “Liang”, and “Tiread”. Meanwhile, answers such as “Finishad”, “Relaxang”, “Restang”, and “Safe” show that the model is not only weak at understanding context but also guesses and answers inconsistently. Only 8 questions were consistently correct, which is less than 50% of the data, so the model also struggles to understand the data itself.
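For readers who want to reproduce a "consistently correct" count like the one above, here is a minimal sketch of one way it could be computed. It assumes each question was asked several times and that an answer key exists. The function name `consistently_correct`, the question ids, and all sample answers are hypothetical placeholders, not data or scoring rules from the Corali-Lang benchmark.

```python
from typing import Dict, List


def consistently_correct(runs: Dict[str, List[str]], key: Dict[str, str]) -> List[str]:
    """Return the ids of questions whose answer matched the key in every run."""
    return [
        qid
        for qid, answers in runs.items()
        # A question counts only if it has at least one run and all runs match.
        if answers and all(a.strip().lower() == key[qid].strip().lower() for a in answers)
    ]


# Hypothetical placeholder data, NOT taken from the benchmark.
key = {"q1": "alpha", "q2": "beta", "q3": "gamma"}
runs = {
    "q1": ["alpha", "alpha", "alpha"],  # correct in every run
    "q2": ["beta", "betta", "beta"],    # correct in some runs only
    "q3": ["delta", "delta", "delta"],  # consistent, but wrong
}

stable = consistently_correct(runs, key)
share = len(stable) / len(key)
print(stable, f"{share:.0%}")
```

Counting a question only when every run matches separates stable knowledge from lucky guesses, which is the distinction the paragraph draws between wrong pattern application and inconsistent answers.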

Categories: AI Benchmark

Tags: AI Platform, Claude Haiku 4.5, Claude Opus 4.6, Claude Sonnet 4.6, DeepSeek V3.2, Gemini 3 Flash Preview, Gemini 3.1 Flash-Lite Preview, Gemini 3.1 Pro Preview, Gemma 4 26B A4B, Gemma 4 31B, GLM-5, GPT-5.4, GPT-5.4 mini, GPT-5.4 nano, Kaggle, Kaggle Benchmark, Kaggle Competitions, Qwen 3 Next 80B Instruct, Qwen 3 Next 80B Thinking
