Don't Tell JD

IvanOverdrive@lemm.ee · 3 months ago

j4k3@lemmy.world · 3 months ago

This is a particularly poor query, as “JD” is likely a single token that appears in an enormous range of unrelated contexts across the training corpora.

It isn’t an error on the part of the LLM. The error lies in misunderstanding the tool and how it works, and in the question’s lack of specificity.

If one were to run the prompt with the logits visible, it would clearly show that the LLM had no clue what was being asked. The problem is that the tools shown to the public are massively oversimplified for the sake of general use. Even the most advanced libraries and programs available to the public right now are based on simplified, basic example implementations. If you read the Transformers library’s introduction page, it clearly lays out that the tool is in no way a comprehensive or complete implementation, and yet it is the central library that all tools in the public space are built on.
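To give a rough idea of what “logits visible” would reveal, here is a minimal sketch in plain Python. The logit values are made up for illustration and do not come from any real model or from the prompt in the post; the helper names are hypothetical too. It only shows how a nearly flat set of logits turns into a high-entropy (uncertain) next-token distribution, while a sharply peaked set does not.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; higher means the model is less sure."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical logits over four candidate next tokens (illustrative only).
confident = [9.0, 1.0, 0.5, 0.2]   # one candidate clearly dominates
uncertain = [1.1, 1.0, 0.9, 1.0]   # nearly flat: the model has "no clue"

for name, logits in [("confident", confident), ("uncertain", uncertain)]:
    probs = softmax(logits)
    print(name, [round(p, 3) for p in probs], round(entropy_bits(probs), 2))
```

With four candidates the entropy tops out at 2 bits, so a value close to 2 means the model is effectively guessing among them.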

So friend, I’d counter that the tool is not crazy or unpredictable. It is extremely complex, and in its simplified form it can be difficult to understand what has gone wrong. If the public were handed the true complexity, only the most advanced devs would ever figure it out in the first place.