How a lawyer can spot AI-generated legal documents

The use of AI chatbots to churn out content is exploding across the internet. It is tearing through schools and news organizations, and it is landing in the laps of lawyers who may be tasked with actually reading a great deal of it to do their jobs. Even search engines are becoming saturated, polluted, essentially, with automatically generated content, and it is hard to see this stopping any time soon.

This is a new dynamic in the legal profession, and clients who want to become more efficient in their work may be relying on generative AI a little too much.

While these technologies offer convenience, they also present significant challenges and risks for legal professionals. At uLawPractice, we have identified several key indicators and strategies for addressing AI-generated legal content that may come across your desk.

Overused Words & Predictable Structure

One of the things that stands out most when determining whether writing was produced by an AI chatbot is how repetitively these models structure their output.

Once you've seen a few dozen articles written by AI, a pattern emerges: scroll down to the bottom of the article and read the last paragraph. Oftentimes it will begin like this: "In conclusion,..."

Many human authors also open their final paragraph with "in conclusion," but right now AI tends to overuse the phrase. This may change in the future, but for the moment it is one of the more predictable tells.
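
For the technically inclined, a crude version of this check can even be automated. The sketch below is a minimal example, assuming plain-text input and an illustrative (not exhaustive) list of stock closing phrases; it simply tests whether a document's final paragraph opens with one of them.

```python
# Minimal sketch: flag documents whose final paragraph opens with a
# stock closing phrase. The phrase list is illustrative, not exhaustive.
STOCK_CLOSERS = ("in conclusion", "in summary", "overall,")

def flags_stock_closer(text: str) -> bool:
    # Split on blank lines to find paragraphs; take the last non-empty one.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    if not paragraphs:
        return False
    return paragraphs[-1].lower().startswith(STOCK_CLOSERS)

sample = "The parties dispute liability.\n\nIn conclusion, the factors above suggest..."
print(flags_stock_closer(sample))  # True
```

A hit proves nothing on its own, of course; plenty of human writers use the same phrase, so treat it as one signal among several.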

Strange Formatting

AI-generated content frequently contains lengthy sentences with multiple clauses. These can seem unnatural compared to typical human writing.

Even when such long, multi-clause sentences are grammatically correct, they can be an indicator that an entire report or draft was copied verbatim from an AI chat log.

Another dead giveaway is the presence of unusual formatting anomalies. When producing quotes or headings, ChatGPT can sometimes insert characters such as the pound sign (#). In many programming languages this character marks a 'comment', text addressed to programmers that the computer ignores when a script is run. In Markdown, a # at the start of a line creates a heading, which is one likely reason these characters escape into the output when an AI bot responds to a prompt to create content.

As you might imagine, these characters are not something a person working in an ordinary word processor would type, so their presence can strongly suggest that someone copied AI-generated text without making the effort to remove them.
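
A simple scan can surface this residue automatically. The sketch below is a minimal example using only Python's standard re module; the patterns are illustrative assumptions covering a few common Markdown markers, not a complete list.

```python
import re

# Illustrative patterns for Markdown residue that sometimes survives a
# copy-paste from a chatbot: heading markers, bold markers, list bullets.
MARKDOWN_RESIDUE = [
    re.compile(r"^#{1,6}\s", re.MULTILINE),   # '# Heading' lines
    re.compile(r"\*\*[^*\n]+\*\*"),           # '**bold**' spans
    re.compile(r"^\s*[-*]\s", re.MULTILINE),  # '- bullet' lines
]

def markdown_residue_count(text: str) -> int:
    # Count how many residue patterns match anywhere in the document.
    return sum(len(p.findall(text)) for p in MARKDOWN_RESIDUE)

draft = "## Summary of Findings\nThe court held that **liability** attaches."
print(markdown_residue_count(draft))  # 2
```

Note that bullets and asterisks do appear in legitimate human documents, so a non-zero count is an invitation to look closer, not a verdict.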

Inconsistent Writing Style

If you have read something penned by a person before, you get a sense of their style of writing. ChatGPT and similar large language models, such as Google's Gemini, tend to produce text in a very particular formal and structured manner.

Sometimes this structure is followed so strictly that it defies the stylistic decisions the author would have made had they actually written the piece themselves.

Comparing a draft of legal research, for example, against someone's previous communications can help establish a baseline for what their writing is actually like. Discrepancies from that baseline are often revealed by a simple side-by-side comparison.
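
Such a comparison can even be roughed out numerically. The sketch below is a minimal example using only Python's standard library; the two metrics, average sentence length and vocabulary richness, are common stylometric stand-ins chosen for illustration, not a validated test.

```python
import re

def style_profile(text: str) -> dict:
    # Two coarse stylometric features: average sentence length (in words)
    # and type-token ratio (distinct words divided by total words).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
    }

known = style_profile("Short note. Will call you tomorrow. See the attached file.")
draft = style_profile(
    "Furthermore, it is important to note that the aforementioned "
    "considerations, taken together, warrant careful deliberation."
)
print(known)   # baseline from writing you know is the author's
print(draft)   # a large gap between profiles invites a closer look
```

As with the other checks, a mismatch is a prompt for human judgment, not proof of AI authorship.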

Generic Information

Have you ever noticed that AI-generated writing can, at first glance, seem to be professionally written, informative, and authoritative? Yet, as you continue to read paragraph after paragraph, you realize nothing substantial is actually being said?

ChatGPT and similar models are trained on vast datasets, resulting in output that can be overly general and lacking in specificity. Legal documents created by AI might contain broad summaries that fail to address the nuanced and case-specific details necessary for accurate legal advice. This lack of depth can be a clear indicator of AI involvement.

Impersonal Writing

While impersonal writing may simply be a marker of an author properly removing themselves from the work, it is also a hallmark of AI usage, since chatbots typically do not inject personalized details into their replies unless prompted to.

As time wears on, it is becoming harder and harder to tell whether content was generated by AI. But we believe the indicators discussed here make it a little easier to spot when an LLM is responsible for a piece of writing. Lawyers will need to keep an eye out for these warning signs, since it would be quite a mistake to place too much confidence in literature produced by LLMs, especially if it changes your frame of reference for how to do your job.