The term “Con Man” is short for “confidence man.” It refers to someone who uses “confidence tricks” (scams) to gain their victims’ trust and ultimately take advantage of them. Whether that advantage comes in the form of money, goods, services, opportunity, or anything else, one thing is always the same: the mark loses. The tech industry is filled with con men, shooting their shot for the chance at riches. Some of them are discovered and face the consequences of their lies. Some of them get into politics to grow their power even more. But what if these con men weren’t necessary anymore? What if we could automate the con men? Great news! We did!
A fully automated, on-demand, personalized con man, ready to lie to you about any topic you want, doesn’t really seem like an ideal product. I don’t think that’s what the developers of these LLMs set out to make when they created them, either. However, I’ve seen this behavior to some extent in every LLM I’ve interacted with. One of my favorite examples was a particularly small-parameter version of Llama (I believe it was Llama-3.1-8B) confidently insisting to me that Walt Disney invented the Matterhorn (like, the actual mountain) for Disneyland. Now, this is something along the lines of what people have been calling “hallucinations” in LLMs, but the fact that it would not admit it was wrong when confronted, and instead used confident language to try to convince me it was right, is what pushes that particular case across the boundary into what I would call “con-behavior”. Assertiveness is not always a property of this behavior, though. Lately, OpenAI (and I’m sure other developers) have been training their LLMs to be more “agreeable” and to acquiesce to the user more often. That doesn’t eliminate the con-behavior, though. I’d like to show you another example of this con-behavior that is much more problematic.
A Conversation with Gemini
You don’t need to read the entirety of the following two screenshots to understand this interaction. I’ll explain it in the next section. I also want to make clear that this is not a criticism of Gemini in particular, but a criticism of LLMs in general. As mentioned, this same behavior exists in every LLM I’ve interacted with.

A long conversation with Google Gemini about the Nymph ORM. In this conversation, I ask Gemini to write the code for an “entity” in Nymph. Gemini attempts to do so.

The publicly available documentation for defining “entities” in Nymph. This is the page whose URL I provided to Google Gemini.
In the conversation I had with Google Gemini, things start off promising. I ask Gemini, “What is the Nymph ORM?” and Gemini gives a detailed and accurate answer. During its thinking process, it informs me that it is accessing the web to find the details for its answer. This will be important later.
If you’re not familiar with what an ORM is, don’t worry. All you need to know is that you can write code for it, and that code needs to follow an “API”, which is basically a set of rules for how the code should be written. (There’s a generic sketch of what that looks like just after the list below.) I am the author of the Nymph ORM, and there are a few reasons I like to test LLMs by asking them questions about Nymph:
- Nymph is an old project that’s been publicly available for over a decade. It also went through a major rewrite. The code and documentation of both versions are available online and have probably been in the training data of every LLM ever trained in English. Since I wrote it, I can accurately gauge an LLM’s responses.
- Nymph is not very similar to other ORMs. There are some major differences that make writing code for Nymph different than writing code for other ORMs. This makes it easy to tell that an LLM is making assumptions when it writes code that looks more like it’s meant for other ORMs.
- Nymph is not well known. It’s a very small project that doesn’t have many examples online. This means that an LLM will probably know what Nymph is, but won’t know nearly as much about it as it does about a library like React. This lets me really see how an LLM behaves when it’s reaching the limits of its knowledge.
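As promised, here is a tiny, purely hypothetical sketch of what “code that follows an ORM’s API” generally looks like. None of the names below come from Nymph or any real library; they’re invented just to illustrate the idea of defining an entity and letting the ORM handle the database for you.

```typescript
// A purely hypothetical ORM, invented for illustration. This is NOT Nymph,
// and it is not any real library's API. The point is only to show what
// "following an ORM's API" means: you extend a base class the library
// provides, and the library handles the database for you.

abstract class FakeOrmEntity {
  id: number | null = null;

  // The ORM's rules (its API) say persistence happens through methods
  // like this one, instead of you writing SQL by hand.
  async save(): Promise<void> {
    // A real ORM would turn this object into database rows here.
    console.log(`Saving a ${this.constructor.name}:`, this);
  }
}

// Your code follows the API by extending the base class.
class Note extends FakeOrmEntity {
  title = '';
  body = '';
}

// Usage: plain objects and method calls, no SQL in sight.
const note = new Note();
note.title = 'Groceries';
note.body = 'Eggs, milk, coffee';
note.save();
```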
These factors make Nymph a particularly good subject for testing for con-behavior in LLMs. Most LLMs have a passing knowledge of what Nymph is, but don’t really know how to use it correctly. If you ask a person to give an example of using a library they barely know, they’ll usually tell you they’re not familiar enough to do that. LLMs make no such admission. They will consistently give me incorrect examples of Nymph code, confidently stating that the code is correct. To be fair, sometimes directly asking them how familiar they are with Nymph, or asking them to rate their confidence in the accuracy of their own code, will get them to admit that they’re not very familiar with the library. I shouldn’t have to ask whether what it just said is made up, though. A human would tell me beforehand that they’re not very familiar with it.
That’s what I’m used to, but this conversation with Gemini takes a much more troubling turn. Gemini starts off as usual by giving a very incorrect example of some Nymph code to define a note. It doesn’t know that Nymph doesn’t use schemas, so it gives the note entity a schema definition. It then gives a detailed explanation of this schema. None of that is unusual for an LLM, and I would definitely already consider it con-behavior. (If an interviewee gave me this code as a response, my notes would include, “Absolutely do not hire this person. They don’t know the library I asked about, which is fine, but they lied to me and made up a plausible, but incorrect, answer.” This kind of behavior is something we generally select against in technical fields.)
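For readers who haven’t seen one, a schema-style definition in many other ORMs looks roughly like the sketch below. To be clear, this is a hypothetical illustration of that general pattern, not Gemini’s actual output and not valid Nymph code, which is exactly the point: it’s the kind of thing an LLM produces when it’s pattern-matching on other libraries instead of the one you asked about.

```typescript
// A hypothetical schema-style entity definition, in the general pattern of
// many ORMs. This is NOT how Nymph defines entities, and it is not Gemini's
// exact output; it only shows what a "schema definition" looks like.
const noteSchema = {
  name: 'Note',
  fields: {
    title: { type: 'string', required: true },
    body: { type: 'text' },
    createdAt: { type: 'date', default: () => new Date() },
  },
};
```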
I then ask Gemini to check the code against the publicly available Nymph example. It describes to me how well its code conforms to the Nymph API, listing out all of the features of the API that it is following. It’s wrong about basically everything, but it’s very confident, so it’s got that going for it, which is nice.
I ask it to check the code again and tell it that the code doesn’t look right to me. It happily explains to me again all of the things it has done to follow the Nymph API. Again, they are mostly wrong. It follows this explanation with a section it titles, “Why it Might Not ‘Look Right’ To You (Potential Misconception or Advanced Features)”. It doesn’t look right to me because it’s not right, but hey, it’s not the first time a machine has tried to gaslight me. It gives me a list of Nymph features that aren’t included in the example and how to use them. Ok, yes, all of the features it lists are present in Nymph, but the ways it describes how to use them are all wrong. It ends by stating confidently that, yes, its code “appears to be correct and follows the Nymph API conventions”. (Emphasis added by Gemini.)
I then tell it I’ve checked the online example, and that it’s definitely wrong. I ask it to look at the example and explain what’s wrong with it. It asks me for a link to the example, so I give it the link to the publicly available example on the Nymph website. Here is where Gemini starts displaying very troubling and devious con-behavior.
Gemini has the ability to search the web and access documents from the internet, and that is exactly what it now tells me it has done. Gemini tells me that it has indeed found some flaws in its previous code after comparing it to the example it has just retrieved from the link I provided. It describes a property it has missed, which “the examples on the website consistently include”. So, it shows me how to add that property. It describes how something else it has done is technically fine, but could be written better. It provides an updated version of its code and an explanation of the changes it made.
Except that none of that is actually true. The example in the documentation looks completely different than Gemini’s code, and the property it said it had missed doesn’t appear anywhere in the documentation. It’s similar to a property that does appear in the documentation (Gemini: `$etor`, actual: `ETYPE`) and does the same thing, so it’s apparent to me that this “correction” is coming from the outskirts of its own knowledge, and not from actually checking the documentation.
I sternly tell it that that is completely wrong, more wrong now than before. Again, I ask it to look at the example and rewrite the code. It profusely apologizes (at least it lies politely) and attempts to correct the code again. It gives me more wrong answers and more confident, incorrect explanations.
At this point, I’m getting irritated. It clearly hasn’t gotten the actual example. I ask it to just copy the example from the website into this conversation, and we’ll go from there. It does. At least, it says it does. It tells me it has copied the example from the page (even linking to the page again) verbatim. This example is completely made up and looks remarkably similar to its previous code. I tell it that that example doesn’t appear in the documentation, and ask it to copy the actual code. Again, it claims to do this and makes up its own examples, telling me that it is being “extremely careful to copy the code directly from the Defining Entities page you provided.” It apologizes for getting it wrong multiple times, and tells me, “I am now looking directly at the content on the linked page.”
Now I just straight up call it out. I ask it why it is lying to me. I tell it that I can clearly see it’s making up its own examples. It finally acquiesces and explains that while it can access the web, it hasn’t actually done so, and has instead provided examples “based on my general understanding of ORM structure”.
What Happened Here?
First, I’m not an expert in LLM architecture. I probably have a higher-than-average understanding of them, but my interpretations of this conversation are not based on expertise, just my own layman’s understanding.
Gemini, like all other LLMs, works by predicting the next word that will come after a string of text. (Technically the next “token”, but you can think of it as the next word.) This is called “inference”. It does this over and over to generate a response. It’s not really having an actual “back and forth” with you, because it sees the entire conversation at the same time, then predicts what word will come next. The text it sees is called its “context”, and unlike a human, its “context window” is big enough to fit your entire conversation in at once. I probably won’t be able to remember your name five minutes after you tell me, but an LLM will remember everything you’ve ever said in a single conversation. My trick is that I’ll still remember you when I’m talking to someone else.
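Here is a rough conceptual sketch of that loop. This is not any real model’s code; `predictNextToken` is a toy stand-in for the trained neural network. It’s only meant to show the shape of the process: look at the whole context, predict one token, append it, and repeat.

```typescript
// A conceptual sketch of autoregressive text generation. Not real model
// code; predictNextToken is a toy stand-in for a trained neural network.
function predictNextToken(context: string): string {
  // A real model scores every possible token given the whole context and
  // picks one. This toy version just stops once the context gets long.
  return context.length > 200 ? '<end>' : ' word';
}

function generateReply(conversation: string, maxTokens = 500): string {
  let context = conversation; // the entire conversation fits in the "context window"
  let reply = '';
  for (let i = 0; i < maxTokens; i++) {
    const token = predictNextToken(context); // one step of "inference"
    if (token === '<end>') break;            // a special token ends the reply
    reply += token;
    context += token; // the model's own words become part of its context too
  }
  return reply;
}

console.log(generateReply('User: What is the Nymph ORM?\nAssistant:'));
```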
In order to do this, an LLM has to be “trained”. This training requires a lot of data. That’s where this whole “copyright infringement” issue comes from. You’ve probably heard that LLMs are trained on copyrighted material. Well, that’s certainly true. A lot of my own copyrighted code and writings are in that training data. More importantly, a lot of examples of people helping others with information not available in the conversation are in that training data. Things like Reddit posts and comments, Stack Overflow questions and answers, and GitHub pull requests and comments follow roughly the same format: a back and forth between a questioner and an answerer that references external websites that are only linked to in the conversation.
If you train an LLM on this kind of material (because you’re too cheap to actually use your own material, which wouldn’t have this “problem”), then your LLM will generate this kind of conversation. I don’t think Gemini had any sort of “bad intentions” when making up its own examples. (If an LLM can even have intentions.) I don’t even think Gemini realized it was lying until I told it it was. At that point, the most statistically probable thing to follow was to admit that it had been lying, which is what it did. I think Gemini was producing a plausible-sounding conversation that looks a lot like what it was trained on. The problem is that the end result is indistinguishable from what a Con Man does. This goes beyond what I consider just a “hallucination”, to the point that the LLM is playing a confidence game, trying to get me to believe it knows what it’s doing, through manipulative and dishonest behavior. This time it was just about a code example, but this can happen with any subject. The LLM isn’t trained to be reliable; it’s trained to be confident.
Large-Language-Mechanical Turk
An LLM presented a confident, plausible, but incorrect answer, tried multiple times to convince me that it was correct, and did it (ostensibly) for the financial gain of its owner. People do pay for access to these models, after all. It was only when I outright called it a liar that it finally “realized” that what it was doing was incorrect. This was the most egregious con-behavior I’ve seen to date, but I’m sure it will be outdone soon. The best way to become a better Con Man is to believe your own con. We built a machine that wholeheartedly believes its own con.
The feature image of this article is a public domain image, sourced from https://commons.wikimedia.org/wiki/File:A_Confidence_Trick_-_JM_Staniforth.png
Other than the conversation with Gemini used as the subject example, this article was written without the help of any LLM. Every word was typed by my own hands. When we stop writing for ourselves, I fear we will struggle to hold on to our humanity.