When tested with a classic psychological assessment, advanced AI models experienced a total breakdown in focus. A new PNAS Nexus study suggests these systems lack the human-like executive control necessary to override automatic responses and maintain complex goals.
It’s not “bested” by the LLM, though. A mathematician used the LLM as a tool to disprove a conjecture. Subtract the mathematicians from the process and the LLM would not have successfully completed the task. It would be more accurate to say a mathematician with an LLM was able to best a mathematician who did not have an LLM. Which is cool, but we don’t need to pretend the LLM is not a tool but something that “understands” math like a mathematician.
You’re confusing the Olympiad with the Erdős conjecture. This is just really not true: they just asked it and it found a solution, and the mathematician then used its solution as inspiration to create a better one. It still essentially did it on its own, and they certainly do the Olympiad on their own.
The description that the LLM did it on its own is subjective at best. I’ll just leave it at that. Have a good one.
I don’t see how it’s subjective.