When tested with a classic psychological assessment, advanced AI models experienced a total breakdown in focus. A new PNAS Nexus study suggests these systems lack the human-like executive control necessary to override automatic responses and maintain complex goals.
Anthropic sonnet 3.5 is an old mid tier model. GPT 4o is also multiple generations ago.
Newer models handle this much better. Not claiming sentience or anything.