GPT-5.4
GPT-5.4 made the same error as Claude Sonnet 4.6 and Claude Opus 4.6 on story question number 4, which points to a weakness in understanding context.
That is not all: it is also weak at applying the Corali language. It still gives “Driving” as an answer, the plain English form that Coral would have said, even though Corali was the one speaking in question number 1.
Its answer to story question number 5 is also incorrect: in the sentence “I’ve… there is nothing inside,” filling the blank with either “Finded” or “Finding” is wrong. For story question number 4, many answers are still “Relieved” instead of “Relievad,” which shows the model’s inconsistency in applying the Corali language.
The strengths of this model
– can understand long contexts, unfazed by the misleading descriptions of Coral’s fashion,
– can learn a new language (the Corali language) without being fixated on English,
– can absorb data from narratives, not just from explicit rules
The weaknesses of this model
– makes mistakes when understanding the context,
– can read Corali language patterns but is still inconsistent when applying them to other English words