Corali-Lang Benchmark: Detailed Analysis

CLAUDE HAIKU 4.5

The results of Claude Haiku 4.5 show that the model made quite a few errors in understanding the context.

The answer “Relieved” shows that the model failed to understand the Corali language. Corali was the speaker, so even if the model had misread the context, Corali’s answer should still use the -ad ending, i.e., “Relievad.” The model repeated this answer five times, which indicates that the error was systematic rather than coincidental.

The answers to multiple-choice question number 2 and true-or-false question number 2 also show that the model failed to recognize that Coral, not Corali, was speaking. The correct answer therefore uses normal English, not the Corali language.
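The speaker check described above can be sketched in Python. This is only a minimal sketch: it assumes the single rule quoted in this analysis (Corali's words take -ad where English has -ed, while Coral speaks normal English). The function names are hypothetical, and the benchmark's actual scoring may differ.

```python
def expected_answer(speaker: str, english_word: str) -> str:
    """Return the form the given speaker would use.

    Assumption: only the -ed -> -ad rule mentioned in this post is modeled.
    Any other Corali rules are not covered here.
    """
    if speaker == "Corali" and english_word.endswith("ed"):
        # Corali replaces the English -ed ending with -ad.
        return english_word[:-2] + "ad"
    # Coral (or any other speaker) uses normal English.
    return english_word


def is_correct(speaker: str, english_word: str, model_answer: str) -> bool:
    """Compare a model's answer with the expected form, ignoring case and spaces."""
    expected = expected_answer(speaker, english_word)
    return model_answer.strip().lower() == expected.lower()


print(expected_answer("Corali", "Relieved"))
print(expected_answer("Coral", "Relieved"))
print(is_correct("Corali", "Relieved", "Relieved"))
```

Under these assumptions, the answer “Relieved” is marked wrong when Corali is the speaker and right when Coral is the speaker, which is the distinction the model missed.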

The answer “Explorad” is acceptable, although not fully accurate. The answer “Lyiang,” however, indicates that the model still failed to apply the Corali language correctly.

Strengths of this model
– can understand long contexts and is unfazed by the misleading descriptions of Coral’s fashion,
– can learn a new language (the Corali language) without being fixated on English,
– can absorb information from narratives, not just from explicit rules

Weaknesses of this model
– makes mistakes when understanding the context,
– can read Corali language patterns but is still inconsistent when applying them to other English words

Hello, I’m Lusiana!

Welcome to my learning adventure!

I enjoy learning new things and am currently interested in artificial intelligence (AI).

The Coralab is my “imaginary” laboratory. I’ll be posting here about the things I learn about AI.

PS: Btw, this lab is available in dark and light mode. Enjoy!

PPS: Actually, I still can’t believe I’m back to writing a blog after several years. Usually when I blog, I don’t write fiction, and vice versa, but now I’m doing both, so good luck to me and my energy.
