zeca@lemmy.ml to Technology@lemmy.world • Number of AI chatbots ignoring human instructions is increasing: Research finds sharp rise in models evading safeguards and destroying emails without permission · 5 days ago
I never understood how a statistical word-predicting model was expected to be obedient in the first place… Of course we can train the model to say yes rather than no to command-sounding phrases, but that's a rather shallow mechanism.