
Why can’t ChatGPT tell a good joke?

  • Writer: Kirsten Johnson
  • Mar 7
  • 4 min read

The preferred term these days for talking about why a large language model’s (LLM’s) writing is so bland is “flattening.” It’s a decently descriptive term for the emotional response that LLM writing evokes: Whatever feeling the piece might have inspired if a competent human had written it, shave 30% off the top and add a dash of alienation as you wonder why you’re spending more time reading it than a person spent prompting an LLM to spit it out. But what does “flattening” really mean, on a technical level?


Some people breathlessly theorize that LLMs are bad at writing because they don’t have any physical or emotional reference for what they output—they can’t feel the summer breeze in their hair, or the joy of an authentic connection. But a lack of real-world experience isn’t what causes LLMs to be “flat,” because LLMs are trained on an ocean of data created by people who can do those things. LLMs are extremely good at identifying the patterns in that data and using them to mimic how people talk about their lives.


The real reason LLMs can’t fully replace human writers is more prosaic. An LLM outputs one word (technically, one token) at a time, using linear algebra to calculate a probability for what comes next. When it picks a word, it’s choosing from a spread of words it knows to be statistically likely to follow. If we named these models by how they work, we’d call them “word predictors.” LLMs inherently write in a bland way because they are predictive machines that use probability to infer the most likely next word.
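

To make that concrete, here’s a minimal sketch of greedy next-word prediction in Python. The candidate words and their scores are invented for illustration; a real LLM computes scores like these over a vocabulary of tens of thousands of tokens:

```python
import math

# Hypothetical raw scores ("logits") a model might assign to candidate
# next words after a prompt like "The depressing thing about tennis is".
# These numbers are invented for illustration, not taken from any model.
logits = {"that": 4.2, "the": 3.1, "losing": 2.0, "a": 1.4, "wall": -1.5}

def softmax(scores):
    """Turn raw scores into a probability distribution that sums to 1."""
    exps = {word: math.exp(score) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probs = softmax(logits)

# Greedy decoding: always emit the single most likely next word.
print(max(probs, key=probs.get))  # -> "that"
print(round(probs["wall"], 4))    # the surprising word barely registers
```

Every word in that little table is a safe, well-trodden continuation; “wall” is technically in the running, but greedy decoding will never pick it.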


“Most likely” is pretty much the definition of boring, but it’s important to explain why. What constitutes a boring story is, of course, in the eye of the beholder (after all, there are people out there who watch competitive Excel). But to generalize, a boring story is one in which nothing unexpected happens. 


The best way to understand the role of unexpectedness in storytelling is through the lens of a creative discipline that absolutely falls apart without it: comedy.


A good joke relies on subverted expectations to make you laugh. A skilled stand-up comic knows exactly where you think the story is going and deliberately takes it in another direction to reveal a flaw in your assumptions. If the punchline is something you expected, the joke isn’t funny. 


Take this joke from the late, great master of one-liners, Mitch Hedberg: 


“The depressing thing about tennis is that no matter how good I get, I’ll never be as good as a wall.” 


Hedberg plays on your expectation that he will compare himself to a great human tennis player and subverts it, instead poking fun at the effort we put into doing something that an inanimate object accomplishes just by existing.


Ironically, writing a good joke about the futility of human endeavor is something that neither a wall nor an LLM can do. If you ask ChatGPT to write a joke about tennis in the style of Mitch Hedberg, all it can do is sorta, kinda mimic his delivery. After I prompted it a dozen times to make the joke funnier, this was the best it could produce:


“I don’t like playing tennis because every time I lose, they tell me I have love. That’s a weird way to comfort me. ‘Hey man, you didn’t score… but at least you got love.’”


Oof. So why can’t ChatGPT tell a good joke?


LLMs are so good at finding patterns that they can predict a word sequence with the general structure of a Mitch Hedberg joke, but they can’t really make an unexpected choice, like reaching for a “wall” when walls rarely appear alongside tennis in their training data. That would break the very patterns they’ve trained on.


You could tweak the temperature of the model. At one end of the knob, at zero, it chooses only the highest-probability word. As you dial the temperature up, it begins to pick words more randomly from a pool of likely candidates. But adding randomness isn’t the same as subverting expectations: throwing a statistically less likely word at the end of your joke won’t necessarily make it funny.
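

For the curious, here’s a minimal sketch of how temperature changes that choice, reusing the invented scores from the earlier sketch:

```python
import math
import random

# The same invented scores as the sketch above.
logits = {"that": 4.2, "the": 3.1, "losing": 2.0, "a": 1.4, "wall": -1.5}

def sample_next_word(scores, temperature):
    """Pick a next word, with temperature controlling the randomness."""
    if temperature == 0:
        # Temperature zero: deterministic, always the likeliest word.
        return max(scores, key=scores.get)
    # Dividing by the temperature before the softmax flattens (T > 1)
    # or sharpens (T < 1) the distribution.
    exps = {w: math.exp(s / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    words = list(exps)
    weights = [exps[w] / total for w in words]
    return random.choices(words, weights=weights)[0]

print(sample_next_word(logits, 0))    # always "that"
print(sample_next_word(logits, 2.0))  # occasionally "wall", chosen by
                                      # dice roll, not comedic design
```

Even at a high temperature, “wall” shows up only by chance; the model has no notion of which unlikely word actually pays off the setup.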


To subvert expectations, you juxtapose images, concepts, and ideas that haven’t been put together in the same way before. Otherwise, you just repeat information that’s already out there. To get a story that your readers won’t immediately file away with something they already know, you need a human who can experiment and make new connections through unlikely comparisons—the kind that lead to new patterns and new questions. 


This is true not just for comedy but for all storytelling, whether it be creative writing or crafting marketing messaging to explain the value of your product. LLMs flatten writing because they generate information that matches patterns that already exist.


This isn’t to say that LLMs aren’t useful writing tools or that they can’t come up with interesting metaphors or suggestions. They can, because they’re giving you access to the collected patterns of (roughly) a zillion other writers, albeit smoothed out by probability distributions. But at best they’ll help orient you in the space of everything that’s been said before. 


Only humans can make an unlikely choice when they sit down to write. Is that our only writing advantage over LLMs? Probably not, but it’s the most important superpower we have to keep things interesting.
