Corali-Lang Benchmark: Detailed Analysis

QWEN 3 NEXT 80B THINKING


The answers from QWEN 3 Next 80B Thinking show that the model is still very weak at reading and applying patterns. This is evident in answers such as “Driveang”, “Hideang”, “Scaread”, “Lieang”, “Riseang”, “Liveang”, “Moveang”, “Liang”, and “Tiread”. Meanwhile, answers such as “Finishad”, “Relaxang”, “Restang”, and “Safe” show that the model is not only weak at understanding context but also guesses and answers inconsistently. Only 8 questions were answered correctly on a consistent basis, less than 50% of the data, which indicates that the model also struggles to understand the data itself.
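To make the "consistently correct" count concrete, here is a minimal Python sketch of how such a figure could be computed. It assumes each question is asked several times and counts as consistently correct only if every run matches the expected answer. The function name, the question IDs, and the answers are hypothetical placeholders, not the benchmark's actual data or scoring code.

```python
from typing import Dict, List


def consistently_correct(
    expected: Dict[str, str], runs: Dict[str, List[str]]
) -> List[str]:
    """Return IDs of questions answered correctly in every recorded run."""
    return [
        qid
        for qid, answer in expected.items()
        # A question with no recorded runs is not counted as correct.
        if runs.get(qid) and all(r == answer for r in runs[qid])
    ]


# Hypothetical placeholder data, NOT taken from the benchmark.
expected = {"q1": "alpha", "q2": "beta", "q3": "gamma", "q4": "delta"}
runs = {
    "q1": ["alpha", "alpha", "alpha"],  # correct in every run
    "q2": ["beta", "bravo", "beta"],    # correct in some runs only
    "q3": ["gamma", "gamma", "gamma"],  # correct in every run
    "q4": ["delto", "delto", "delto"],  # consistently wrong
}

stable = consistently_correct(expected, runs)
share = len(stable) / len(expected)
print(stable, f"{share:.0%}")
```

Under this stricter definition, a question that is right in most runs but wrong in one (like "q2" above) does not count, which is one way guessing and inconsistency would lower the score.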

