• Hackworth@lemmy.world
    5 months ago

    On the other hand, the human participant scored 67 percent, while GPT-3.5 scored 50 percent, and ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time.

    The current gap is 54% versus 67%, not 54% versus 100%.