• BJW@lemmus.org
    17 hours ago

    I, of course, disagree. Humans on the whole are ignorant and the average person is insufferably ill-informed. You are proving to be the exception. In general, however, people suck at answering questions on subjects in which they’re not experts.

    I have read that study, but I believe it to be flawed, and I’m not alone. Please read the counterargument that hallucination is a structural phenomenon of estimation itself.

    This reframes hallucination as structural misalignment between loss minimization and human-acceptable outputs, and hence estimation errors induced by miscalibration.

    In other words, it is a problem that can be solved.

    https://arxiv.org/abs/2509.21473?hl=en-US

    Edit: Here’s an approach to solving it, in fact:

    Our work demonstrates that targeted, high-quality SFT data teaching meta-cognitive skills can effectively reduce hallucination without preference/RL pipelines, providing mechanistic insights and a practical path toward more reliable AI systems.

    Inducing Epistemological Humility in Large Language Models https://arxiv.org/abs/2603.17504?hl=en-US

    • dandi8@fedia.io
      17 hours ago

      Well, I suppose we can at least agree to disagree.

      I have seen so much incoherent but confident nonsense produced by LLMs (mainly by frontier models trying to do even basic software development) that I would not be able to say in good conscience that thought was involved. Junior developers would have done better. The experience definitely fits the behavior of a word predictor, though.

      Having seen what LLMs claim about software development, my stance is that absolutely no one should trust at face value what these models output. They’re Dunning-Kruger machines.

      As for producing new ideas, these models are as creative as a random number generator. Coincidentally, that’s what is responsible for faking their creativity (the “temperature” parameter).
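For anyone unfamiliar with the parameter, here is a minimal sketch of what "temperature" usually means at sampling time: the model's raw scores (logits) are divided by the temperature before the softmax, and a random number generator then picks a token from the resulting distribution. The function names and the toy logits are made up for illustration, and treating a temperature of zero as greedy decoding is an assumption of this sketch; real inference stacks differ in the details.

```python
import math
import random


def token_probabilities(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_token(logits, temperature, rng):
    """Pick a token index; rng is a random.Random instance."""
    if temperature <= 0:
        # Assumption of this sketch: zero temperature means greedy decoding,
        # i.e. always return the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = token_probabilities(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
rng = random.Random(0)
print(token_probabilities(logits, 0.1))   # sharply peaked on the top token
print(token_probabilities(logits, 10.0))  # close to uniform
print(sample_token(logits, 1.0, rng))
```

Low temperatures make the top-scoring token almost certain, while high temperatures flatten the distribution so less likely tokens get picked more often. The randomness comes entirely from the generator passed in as `rng`.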

      I guess that’s all I feel like saying in this particular thread.

      • BJW@lemmus.org
        16 hours ago

        That we can.

        At the company where I’ve been the lead developer for fifteen years, sentiment is split down the middle - half think as I do, half think as you do. In (nearly) every instance where one of the opposing developers shows me nonsense, it’s been easy to identify the cause: a lazy prompt with insufficient context. Garbage in, garbage out.

        Having seen the results of the US elections, I don’t think anyone should trust humans. Yet here we are.

        As for temperature, yes, I’m aware of the parameter. The human equivalent would be genetic mutation, although we can’t alter ours on the fly.

        Thank you for the civil discussion. Until next we butt heads in the threads 👋