Introduction/disclaimer/semantics
This is a margin note. I stick comments and other stuff in here.

Let me first say that this piece is filled with *lots* of opinions and anecdotes about things I've seen that I can't find every single source for. I do make sure to link articles I think are interesting and relevant. While researching, I went off on a lot of tangents; I did my best to rein them in and try to stay focused while writing. The original draft was double the length of what I've ultimately shared here.
Not every use case of AI-generated text is represented. I created the list of "storytelling" categories based on what I've organically come across. I do think it's worth examining all of the other categories of writing that are missing from my analysis - I can think of non-Reddit social media comments, event flyers, legal documents, etc. Even in novel cases I can't yet imagine, I'm confident that patterns in AI-generated text are human-detectable.
You'll notice that OpenAI's ChatGPT is the service I mention most often here. ChatGPT has been one of the top 10 most-visited websites for at least the past year, according to Similarweb; it has become the generic trademark for the type of "artificial intelligence" LLM applications I'm talking about, which are chatbots and personal assistants. When I talk about ChatGPT, I'm really talking about all of them.
(Anecdotally, at the time of editing this in Spring 2026, I see generated text more commonly described as "gen-AI" with less emphasis on the app that generated it.)
Some literature about ChatGPT
Existence of academic material on a topic gives it credibility, so I was relieved to find papers on AI-generated writing when I started working on this in 2025. Here are two papers I found particularly interesting.

In other words: "yay, I'm not the only one seeing this AND it's confirmed by research!"
Does ChatGPT make the grade?
Brady, J., Kuvalja, M., Rodrigues, A., & Hughes, S. (2024). Does ChatGPT make the grade? https://doi.org/10.17863/CAM.106034

This is timely, exploratory research published by Cambridge University Press & Assessment in England. Research work began in March 2023, shortly after OpenAI released ChatGPT in late 2022, when its popularity really exploded among students.
Question: "How do students use ChatGPT in essay writing?" The participants, all undergrad students, were instructed to use ChatGPT (their choice of GPT-3.5 or GPT-4) to write a 1500 to 2000-word essay that appears to have been written by a human and meets their standards for a good grade.
I find early impressions of the technology interesting and still relevant. It turns out that editing, rewriting, fact-checking and correctly citing the text spat out by AI to make a decent paper takes more effort than just writing it the normal way, according to the participant who had incorporated the largest amount of generated text into their work. (I've seen the same said by software developers who are forced to use AI in their work.) Tone was also mentioned as a shortcoming. The generated text had a "style" that was not appropriate for the essay.
Delving into PubMed records: Some terms in medical writing have drastically changed after the arrival of ChatGPT
Matsui, K. (2024). Delving into PubMed records: Some terms in medical writing have drastically changed after the arrival of ChatGPT. https://doi.org/10.1101/2024.05.14.24307373

Not all of OpenAI's users/customers seem to be conscious enough of the quality of the output to correct vocabulary for contextual appropriateness, unless using the word "delve" in a medical journal was already par for the course. We already know ChatGPT mathematically favors some words over others, but the word lists I've seen floating around on social media are woefully short.
Kentaro Matsui of the National Center of Neurology and Psychiatry in Japan assembled a list of 142 verbs, adjectives, adverbs and nouns potentially associated with AI usage (and a bunch of common academic phrases as a control group), and analyzed 24 years of PubMed records for all of them, tracking increases or decreases in their frequency.
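The frequency-tracking approach Matsui used can be sketched in a few lines of Python. To be clear, this is my own illustration, not his actual code or methodology in detail, and the candidate words below are a tiny hypothetical sample, not his list of 142:

```python
from collections import Counter

# Hypothetical sample of candidate words (NOT Matsui's actual list).
CANDIDATE_WORDS = {"delve", "underscore", "intricate", "pivotal"}

def yearly_frequency(abstracts, words=CANDIDATE_WORDS):
    """Return occurrences per 10,000 words for each candidate word,
    so frequencies are comparable across years of different corpus sizes."""
    counts = Counter()
    total = 0
    for text in abstracts:
        tokens = text.lower().split()
        total += len(tokens)
        for tok in tokens:
            tok = tok.strip(".,;:()\"'")  # crude de-punctuation
            if tok in words:
                counts[tok] += 1
    if total == 0:
        return {}
    return {w: counts[w] / total * 10_000 for w in words}

# Toy usage: one "year" of abstracts at a time.
corpus_2024 = ["We delve into the intricate and pivotal results of the trial."]
print(yearly_frequency(corpus_2024))
```

Run something like this over each year's worth of abstracts and compare the numbers across years; a sudden jump in 2024 is the kind of signal the paper describes.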
Matsui found that the usage of some words noticeably increased in 2024, suggesting that more medical researchers and scientists were using the service to assist in writing their papers. (He also found that some control phrases fluctuated in usage too, which is a natural consequence of language naturally changing over time.) However, the presence of one or two of these words, without further investigation, does not warrant an accusation of using ChatGPT; sometimes it really is a live human who consciously selected the word. This will be relevant later.
If anything, if someone claims that seeing "delve" and "game-changer" everywhere is just an instance of frequency illusion (a cognitive bias), this paper is a fair rebuttal to that.
It's foolish to ask AI to detect AI
There is a little irony in feeding text into a chatbot and asking it to identify whether a chatbot had generated any of that text.
Which is not what those tools do, actually. AI detectors only count occurrences of language patterns typically seen in generated text and return a percentage indicating how much of it might be generated...
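For illustration, here's that pattern-counting approach reduced to its most naive form. The phrase list is entirely made up by me, and real detectors use far more sophisticated statistical and model-based features; the point is only that it's counting, not "understanding":

```python
import re

# Hypothetical sample of patterns associated with generated text
# (illustrative only; not any real detector's list).
SUSPECT_PHRASES = [
    r"\bdelve\b", r"\bgame-changer\b", r"\btapestry\b",
    r"\bit is important to note\b",
]

def naive_ai_score(text: str) -> float:
    """Return the percentage of sentences containing at least one suspect phrase."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    if not sentences:
        return 0.0
    flagged = sum(
        1 for s in sentences
        if any(re.search(p, s, re.IGNORECASE) for p in SUSPECT_PHRASES)
    )
    return 100.0 * flagged / len(sentences)
```

A score like this is exactly as trustworthy as its phrase list, which is part of why false positives are inevitable.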
And there is, unfortunately, a non-zero chance that these tools will return false positives. Turnitin caught a lot of flak when it launched its detector in 2023 with an unacceptably high rate of false positives. Other tools weren't any better and tended to flag ESL students' writing, thus many academic institutions (like The University of Kansas) wisely told their instructors to not use detectors - there are other ways of detecting AI usage, such as comparing papers to the students' past work.
Companies like Turnitin continue to develop and improve their detectors, but AI is still (and hopefully, will always be) detectable by people. The fact that we still refer to human-compiled lists of ChatGPT's favorite words and sentences affirms this. The tells will be there.
We need to renew our confidence in our innate ability to detect patterns. The human brain is *made* for this task. How else could we see patterns if our brains weren't already constantly looking for them?
The easiest pattern to spot: content mill spam
I used to write "content" for websites, along with optimizing that content for search engines (SEO). It was only one of about a billion other tasks I was doing at the time, but I still took it seriously enough to go to workshops, watch Matt Cutts' videos about Google Search updates with animal codenames like Penguin and Panda, and spend hours fixing things like file names and breadcrumbs.

IDK where else to put this lmao: Another step to reward high-quality websites by Matt Cutts, April 2012

If anyone wanted their website to be successful, i.e. have high traffic that could potentially convert into sales and ad revenue, then they needed to have content and backlinks, and they needed a lot of both.

Backlinks: websites that link "back" to yours. Google Search thinks a website with 100 other websites linking to it must provide greater "value," like a political candidate with 100 endorsements (very simplified description). Bad actors sold backlinks.
So in 2023 I initially couldn't tell that some articles were AI-generated - it had to be pointed out to me - because they were written in the content mill "style." There are rules for structure, word count, keyword density, whatever, which all of their writers follow. The editorial standards are low and the writers are severely underpaid (see What Are Content Mills? Everything New Writers Should Know, Semrush.com, Sept. 2023) so the output is not always great quality.
Content writing in that era was hard in general. No matter what you're paid, there's only so much you can say about an unremarkable pizza place (and of course it's the one you've never personally visited) short of describing every item on their menu when the target word count is a whopping 400 words lol

While doing some light research to refresh my memory on the state of the Internet in the early 2010s, I found this very good article by an ex-spam writer: True confessions: I wrote for an Internet content mill by Chris Stokel-Walker, July 13, 2015(!). That article contains examples of work that is reminiscent of the wearisome AI slop on trivial subjects we see today, like:
NY traffic is famous the world over, not just because of its size, scale and seriousness, but also because it injects a personality into this brash, populous city. The best way to travel around is often to walk, simply because you get to see the sights better. Many locals do this. But being a pedestrian does not avoid the situation altogether: one wrong step and you’re likely to raise the hackles of a passing taxi driver, who will let you know exactly what you are doing wrong.
AI is very good at replicating patterns like writing rules. Content writers and AI alike are also very good at imparting a sense of enthusiasm for their topic, and both can produce far more of it than the average person probably ever wants to read.
The Internet is swamped with a decade and a half's worth of these formulaically-written blogs, honed to laser-like accuracy all to please Google and Bing. Consider also the decade and a half's worth of hundreds of millions of product listings on online stores and e-commerce hubs like Amazon (which have their own rules for SEO!). What percentage of the scraped web content fed to LLMs is product pages that have been SEO'd to hell and blogs full of posts by underpaid writers? It has to be a significant amount, right?
Someday, the development of LLMs could stall as more and more of the material fed to their models is in and of itself generated (AI models collapse when trained on recursively generated data in Nature, 2024). Content spam, human-produced or not, is almost assuredly moving up that date.
The death of new information
Tech companies put a lot of effort into getting their customers to use their new toys. People woke up with new apps on their devices they never asked for, socmed giants shuffled UI around so users would accidentally tap on that damned sparkle emoji out of muscle memory, new Windows laptops have a dedicated Copilot key. When initiated, the AI assistant's home screen could be mistaken for a search engine homepage. This is reduction of friction at its finest. If only cancelling subscriptions could be so easy.
Some don't need much convincing beyond that. It seems that people who willingly turn to AI believe they are incompetent in their field of study/work (Students Who Lack Academic Confidence More Likely to Use Generative AI for School by Ashley Mowreader, Sept. 2025).

The actual report this article covers has evidence of AI usage scattered all over it, which I know is supremely annoying, sorry

I can see why someone might feel more comfortable generating text that contains all the "right" terms and puts them in a technically correct order. Most of the feeling would stem from insecurity in one's own ability, but I think there's a fear of rejection in there as well.
The spread of AI as a writing tool is particularly insidious because people use those assistants to generate even social media comments. Far more people are on social media today than in the early 2010s when webspam was rampant, and algorithms now place this spam right on their infinitely-scrolling feed. People read them, reply to them, engage with them. AI-generated comments are actually a little harder to spot because the text is so short, but there are some tells such as emoji usage.

One such social media "marketing tool" is ReplyAgent.ai.
But surely, one might say, there's nothing wrong with using an AI assistant to simply polish up a social media comment or email - the point still gets across. The problem is that those "polished" comments get read by the receiving party. AI doesn't "polish" so much as it paves over anything unique to one person's idiolect, or their use of language. Exposure to flattened, sanded-off dialogue can influence the reader's own idiolect. To put it another way, we already know language changes over time and that it can vary greatly even within a geographic area, so it follows that AI, in its increasing frequency as the author of texts we read every day, is also changing language. Since the advent of ChatGPT in 2022, there's already been a measurable reduction in linguistic diversity online (The Shrinking Landscape of Linguistic Diversity in the Age of Large Language Models, Sourati, Zhivar, et al., 2025). Personally I've noticed that some generated text seems to read as if the LLM has a smaller vocabulary, but I'm not sure whether this is because of the requested task (a social media post vs. an entire research paper) or the algorithm itself.
Suppose half of a class of 20 college students has been generating their discussion posts with ChatGPT.

Based on a true story

The other half are forced to read and respond to those posts. Recall that Stokel-Walker wrote three pieces about New York traffic which were so alike that one could hot-swap paragraphs without impacting readability. In similar fashion, all ten of ChatGPT's responses to the same two discussion questions will also be alike. The responding student struggles to compose a reply because there are no unique ideas in any of those AI-generated posts that could spin out into a new branch of conversation, so the responses tend to be agreements and praise ("I like how you said..." "I totally agree about...") and will inevitably include partial quotes of the generated text. Now, run the entire thread through an AI-powered AI detector. How many false positives will we get?
I wanted to talk more about the kind of people who use AI assistants when writing. But breaking down who uses AI and why doesn't change the fact that I no longer have energy to expend on giving every single online stranger the benefit of the doubt. I have no power over what's already done, so what's the need? It's far easier to not trust anyone until proven otherwise. Yet, I know it's not a good idea to withdraw my voice from a space that needs all the help it can get in fighting off homogenization of words and the individual identity expressed therein.
If left alone long enough, we might end up a society of people with degraded empathy, who can't relate to each other as individuals, and likewise will not receive the same individual level of care and respect that they themselves deserve. There's already so much between us, impeding our ability to connect to others: the decline of third places, the proliferation of hostile tech, threats to our right to have a conversation without surveillance, accessibility of technology itself, etc. etc. etc. Seriously, it's OK, we don't need any more barriers.
Please don't construe the part about the lack of unique ideas as a plea for originality or uniqueness. Rather, I think we are made up of everything we've experienced in the past, everything we're presently interested in, and everything we want to be in the future; it's this internal coalition of sorts that I really want to hear from. Generative AI cannot imitate any of that, in pieces or together or otherwise. It is my hope that, if people allow me to hear them out, then my reply (and by extension, my self as a person) will also be heard with full cognizance.
Tells of genAI writing
Tells in words
This section, and all "tells" sections after, are heavily inspired by/reference the WikiProject document Signs of AI writing

Don't just look for one word, like "delve," or any other word that has been singled out as a marker of AI. Lots of generated texts *don't* contain "delve," and real people also use that word. I actually haven't seen "delve" in a long time anyway.
Consider the whole sentence, paragraph or post, rather than the word itself. A train of thought runs behind every human-written sentence. There is always a reason they chose to phrase an idea in the way that they did; even their emotional state while writing has an effect. Style and tone should also vary depending on the target audience or context.
When peppy marketing language such as game-changer, cutting-edge, level up, elevate, enhance, evolution, revolution, next generation, leverage, transform, unlock, unravel... is used in a situation that doesn't warrant that type of language, the entire piece is made suspect.
Value is exaggerated to the point of comedy and seems to lose sight of its own subject. A vacation home becomes a "world." The assembly of a polyester shirt in a sweatshop, normally not mentioned by the fashion company, is simultaneously highlighted and obfuscated with "crafted." A group of scientists come together as a "symphony" to share a "tapestry" of research. The product might be whimsical, magical, unique, captivating, a testament, an intersection or a journey. Otherwise, it's a perfect blend of two or more abstract nouns.
Tells in emoji
Compared to language, emojis haven't been around for very long, so usage differs between age groups, location, language and more.

At the time of writing this section, I actually can't find many AI-generated posts with lots of emojis in them, so I wonder if they're getting caught in spam filters.
Regardless, I don't think AI has quite figured out how to use emoji yet. Over time, people have developed a sort of emoji grammar: they're used as (PDF) beat gestures, the same way we use hand and body gestures to accentuate parts of our speech. Emojis also convey additional information that may indicate a tone or completely change the meaning of a message. Conversely, AI will use emojis to reiterate something already present in the message (typically a noun), or as a literal illustration or piece of decorative clipart to reinforce or help sell a certain tone or emotion. Both people and AI will use emojis as bullet points in lists.
Bottom line: If you want comfortable, reliable dumbbells that support consistent fitness routines and don’t quit after a few washes or workouts, these earn all 5 stars. Simple, strong, and built to be used — not just admired. Would absolutely reorder and recommend! 🎽👏🔥🏋️✅
(To be fair, a lot of legitimate human-written attempts to include emoji in newsletters, etc. come across as cringe too, either because the writer doesn't have a grasp of emoji grammar or their inclusion is unnecessary to begin with. I'm not sure why they even bother sometimes?)
Tells in grammar and structure
Remember, none of these are technically incorrect - there's just a lot that a lot of AI-generated writing has in common

AI text generators are incapable of making grammatical errors. They will never misplace a comma or use a semicolon incorrectly. But they also largely avoid some finer rules of punctuation, such as instances where the Oxford comma is optional.
As I wrote earlier, AI is very good at following rules, so it can replicate the structure of a content mill-sourced blog post very well. Sometimes it will bold keywords/product names; this is carried over from SEO. Bullet-point lists like a list of features, etc. are also influenced by SEO. Both human writers and AI will use italics for emphasis. Remember that AI may be more superficial and liberal with its usage.
Despite ranking high in originality when checked by Turnitin, AI-generated text is full of clichés and corny alliteration. Metaphors are not lingered upon for long.
Some posts, especially ones that are press releases or official communications, might begin or end with a made-up slogan. If there's a lot of posts, there might even be a new slogan for each one.
Toward the end of persuasive posts, if written in the first person, there tends to be an interjection like "Trust us/me," "Believe me," or a disclaimer phrase like "But seriously," "But don't take my word for it."
A post that is intended to attract attention or generate conversation might have a question tacked onto the end that sounds like a reader discussion question from the end of a book.
Regarding endashes and emdashes
Endashes ( – ) and emdashes (—) are often touted as one of the most obvious and most accurate tells of AI-generated/modified text. Because of the emdash's shape in particular, they're easy to spot while scanning the text. However, I think they're too common (in certain spaces) to be used as a tell on their own.
Consider the en/emdash as a tell only if: it's paired with a slogan (which in itself can be a tell); the post is found somewhere people typically wouldn't draft messages in a word processor that auto-converts hyphens to dashes, such as YouTube comments or a live chatroom; or there are other tells present.
I think claiming dashes as an AI tell comes from an assumption that only *professional* writers bother using them, or that it's too much trouble to insert them. It's a reflection of the LLM's training data, which includes books. In reality, many people draft their writing in word processors that can auto-convert text (I'm doing this right now in Obsidian! I set up a plugin and rules specifically for this); or they'll type — or Alt+0151 or any number of key combinations specific to non-English or Apple keyboards.

ok well now I'm writing directly into my code editor so there's regular hyphens scattered all over the place and I just can't be bothered to fix that.
Something I've seen before, but that has been popping up less often lately: the emdash (—) is incorrectly used as if it were an endash ( – ), with one space on each side. The emdash is supposed to touch the adjacent words, without spaces. This is stickler-level attention to detail and probably relevant only if you suspect AI in an environment that has a very high standard for writing/editing, e.g. a print book from a traditional publisher.
You can also look for smart (curly) quotation marks and apostrophes. Some AI text generators use them and some don't, so keep context in mind, too. Smart and straight quotes/apostrophes mixed together could indicate that the writer drafted the text in a word processor and made further edits in a text field immediately before posting, or that they combined their own writing with generative AI. Again, this is not hard to do, so it's best to look for other tells.
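If you wanted to scan for these typographic tells mechanically, a quick helper (my own sketch, not an existing tool) might look like this:

```python
# Flags the typographic tells discussed above: a spaced em dash,
# and smart + straight quotes mixed in the same text.
def typographic_tells(text: str) -> list[str]:
    tells = []
    if " \u2014 " in text:  # em dash (—) with a space on each side
        tells.append("spaced em dash")
    has_smart = any(c in text for c in "\u2018\u2019\u201c\u201d")  # ‘ ’ “ ”
    has_straight = "'" in text or '"' in text
    if has_smart and has_straight:
        tells.append("mixed smart and straight quotes")
    return tells
```

As said above, none of these flags means AI on its own; they're just cheap to check before looking for other tells.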
Marketing with AI on Reddit
This gets a special section because it's too long for a margin note on the page of examples. Word-of-mouth marketing on social media requires visibility and volume. Offloading the labor of writing to AI allows the marketer to manage a greater number of accounts in a shorter amount of time.
I swear I've also seen it called "baking" an account, but maybe that was a typo of "making"? It still makes sense in my head tho

"Warming up" an account with real-sounding interactions protects it from getting automatically flagged as spam. With age and a comment history, the account is less likely to be outright banned upon making its first post. Warming up has been around since long before ChatGPT, so I won't get any further into it.
Some accounts will name-drop products in comments. Because direct inline links can also be flagged, they can only mention the brand name and hope someone looks it up. ChatGPT has no sense of what feels natural vs. awkward, so the name-drop ends up looking way more obvious.