We Tested 5 AI Detectors. Here’s What Happened.

AI writing is on the rise. With large language models (LLMs) such as ChatGPT, Gemini, and Claude widely available, people are using AI to write essays, blogs, emails, and even poetry.

This technology makes such writing easy, but it has also raised questions about originality and authenticity, especially in educational and professional settings.

The uptick in AI writing tools has, in turn, contributed to a wave of AI detection tools that aim to determine whether a text was written by a human or a machine. With so many detectors claiming to know “how to detect AI writing,” we had to wonder: how reliable and consistent can we expect these tools to be?

So we ran the same passage through five popular AI detection tools and compared what they said. For the same unedited text, the scores ranged from 20% to 99%.

Methodology – How We Designed the Test

The Paragraph We Used

We asked ChatGPT to write a paragraph on climate change and left it untouched to simulate a typical AI-generated output. Here’s the raw, unedited version:

“Climate change is one of the most pressing global challenges facing humanity today. Rising temperatures, melting ice caps, and extreme weather events are becoming more frequent and severe. Scientists agree that human activities, particularly the burning of fossil fuels, are the primary drivers of this phenomenon. If left unaddressed, climate change will have devastating consequences for ecosystems, economies, and communities around the world. Urgent action is needed to reduce emissions and transition to sustainable energy sources.”

The Five AI Detection Tools We Tested

We submitted this paragraph to five well-known detectors:

Walter Writes AI Detector
Designed for writers, students, and marketers. Offers a free AI detection checker with solid accuracy and a user-friendly interface.
Originality.ai
Paid tool targeting academic integrity and SEO. Popular with educators and professional editors.
GPTZero
Free to use. Widely adopted by teachers and schools, but often criticized for false positives and inconsistent results.
Writer.com AI Detector
Built into Writer’s enterprise suite. Focused on business writing and team content governance.
Undetectable AI
More than a detector—also rewords content to bypass detection. Mixed reputation due to its aggressive rewriting model.

Results – What Each AI Detection Tool Found

Side-by-Side Comparison Table

Detector	AI Probability (%)	Verdict
Walter Writes AI	99%	AI-generated
Originality.ai	98%	AI-generated
GPTZero	54%	Possibly AI
Writer.com	20%	Likely Human
Undetectable AI	38%	Possibly Human

Observations & Interpretation

Walter Writes AI Detector was the most decisive, marking the paragraph as AI-generated with a 99% score. Originality.ai was close behind at 98%. But here’s where it gets interesting: GPTZero, a tool widely used in schools, gave only a “Possibly AI” rating (54%). Writer.com (20%) and Undetectable AI (38%) scored the paragraph lower still, implying it could be human-written.

So why the discrepancy? Our guess is that the tools weigh different signals. Some seem to draw inferences from the surface consistency of the writing, while others seem to rely on mechanical, statistical measures.
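To put a number on that disagreement, here is a minimal Python sketch that summarizes the five scores from the table above. The `summarize` helper is our own illustration, not part of any detector, and the choice of summary statistics is an assumption about what is useful to look at.

```python
from statistics import mean, pstdev

# AI-probability scores (%) for the unedited paragraph, copied from the table above.
scores = {
    "Walter Writes AI": 99,
    "Originality.ai": 98,
    "GPTZero": 54,
    "Writer.com": 20,
    "Undetectable AI": 38,
}


def summarize(scores):
    """Return simple spread statistics for a mapping of detector name to score."""
    values = list(scores.values())
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),  # gap between strictest and most lenient tool
        "mean": mean(values),
        "stdev": pstdev(values),  # population standard deviation of the five scores
    }


summary = summarize(scores)
print(summary)
```

A range of 79 percentage points on identical input suggests the tools are not measuring the same thing, which is why a single detector's score should be read with caution.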

→ To learn more about how AI detectors work, read here.

What Happens After Editing?

Light Humanization Test

We gave the same paragraph a light edit—changing some phrasing, breaking up sentences, and making it sound more like natural speech. Here’s the revised version:

“Climate change is a growing threat we can no longer ignore. From rising temperatures to more intense storms, the evidence is everywhere. Experts overwhelmingly agree that burning fossil fuels is a major cause. Without action, we risk serious damage to our environment, economy, and way of life. Now is the time to shift toward cleaner energy and long-term sustainability.”

We reran this new version through all five tools. Here’s how they reacted:

Detector	AI Probability (%)	Verdict
Walter Writes AI	96%	AI-generated
Originality.ai	89%	AI-generated
GPTZero	21%	Likely Human
Writer.com	10%	Human
Undetectable AI	15%	Human

Walter Writes AI still caught it (96%), and so did Originality.ai (89%). Despite the softer phrasing, both apparently picked up that the tone and pacing still echoed AI tendencies. The other three scores dropped sharply, and those tools misclassified the content as human-written.

→ Read more about what the AI Detection Score means.

Walter Writes AI Humanizer Test – Can You Fool All 5?

Here’s where it gets interesting. We ran the same climate change paragraph through Walter Writes AI’s Humanizer, using default settings for natural tone, varied pacing, and humanlike flow. Here’s the humanized output:

Walter Writes AI Humanized Version
“Climate change is changing our world, and the trends we are noticing are hotter summers, more intense storms, along with pristine ice melting away. Scientists are correlating many human activities, such as fossil fuel burning, to these last changes. If we hold steady and choose to follow the trends, it appears that we will have significant effects, and the implications of climate change will not only affect our environment but also our economies and our lifestyles. We must lead and act and demand a clean tomorrow.”

New Results – Same Paragraph, Walter Writes AI Humanized

Detector	AI Probability (%)	Verdict
Walter Writes AI	2%	Human
Originality.ai	4%	Human
GPTZero	9%	Likely Human
Writer.com	3%	Human
Undetectable AI	6%	Human

Success across the board. All five detectors now identify the paragraph as human-written, despite it being derived from an original AI draft. Walter Writes AI’s Humanizer didn’t just paraphrase—it restructured the flow, varied the sentence length, and infused a more human tone, which fooled even the toughest detectors.

→ Before sharing AI-written material, make sure to run it through a trusted humanizer.

Why AI Content Gets Flagged in the First Place

Even when it’s well-formed, AI content typically has a familiar cadence. Detectors look for clues such as:

Repetitive sentence length and format
Heavy use of transitions like “Furthermore” and “In conclusion”
Excessive use of the passive voice
Absence of a personal touch, emotion, or storytelling

These patterns read as signals of algorithm-generated text, even when the content seems polished.
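Two of these clues are easy to approximate in code. The sketch below is a toy heuristic of our own, not how any of the tested detectors actually works: `transition_rate` and `first_person_rate` are hypothetical helper names, and the word lists are assumptions (only “Furthermore” and “In conclusion” come from the text above).

```python
import re

# Assumed word lists for illustration; real detectors do not publish theirs.
TRANSITIONS = ("furthermore", "moreover", "additionally", "in conclusion")
FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def transition_rate(text):
    """Fraction of sentences that open with a stock transition phrase."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if s.lower().startswith(TRANSITIONS))
    return hits / len(sentences)


def first_person_rate(text):
    """Fraction of words that are first-person pronouns, a rough proxy for a personal touch."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(1 for w in words if w in FIRST_PERSON) / len(words)


sample = "Furthermore, it rains. The sky is grey. In conclusion, stay home. We agree."
print(transition_rate(sample), first_person_rate(sample))
```

A high transition rate combined with a low first-person rate would match the pattern described above, though on its own it proves nothing about who wrote the text.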

Detection Metrics: Perplexity & Burstiness

Most AI detection tools use two metrics:

Perplexity: A measure of how predictable the word choices in a text are. The lower the perplexity, the more predictable the text, and the more likely a detector is to treat it as AI-written.

Burstiness: A measure of variation in sentence length and complexity. Human writing tends to vary from sentence to sentence, while AI writing tends to be more uniform.

Both metrics help flag text that looks suspicious. The trouble is that each tool handles these metrics differently, which leads to different verdicts on the same text.
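Both metrics can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the formula any specific detector uses: `perplexity` takes per-token probabilities that a language model would normally supply (here you pass them in by hand), and `burstiness` uses the coefficient of variation of sentence lengths as one simple proxy for variation.

```python
import math
import re
from statistics import mean, pstdev


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability of the tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0 or p > 1 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def burstiness(text):
    """Std. deviation of sentence lengths (in words) divided by their mean."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return pstdev(lengths) / mean(lengths)


# Hypothetical probabilities: a model that finds every token fairly likely
# yields low perplexity; a surprised model yields high perplexity.
print(perplexity([0.5, 0.5]))
print(perplexity([0.05, 0.1, 0.02]))

# Equal-length sentences have zero burstiness; uneven ones score higher.
print(burstiness("One two three. Four five six."))
print(burstiness("Stop. This sentence is considerably longer than the first one."))
```

Because the probabilities depend entirely on which language model scores the text, two detectors can compute very different perplexity values for the same paragraph, which is consistent with the spread seen in our tables.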

How Walter Writes AI Helps You Beat AI Detection

Refining Text the Right Way

Unlike basic paraphrasing tools, Walter Writes AI Humanizer transforms content by adjusting tone, pacing, and sentence variety—without distorting your message.

It’s especially useful for:

Students avoiding AI flags on essays
Bloggers writing SEO content that feels authentic
Marketers refining copy for email or web
Job seekers polishing their cover letters

Tested & Proven in This Experiment

In our test, Walter Writes AI Detector was both consistent and accurate on the unedited and lightly edited paragraphs. More impressively, its Humanizer produced text that all five tools, including Walter Writes AI Detector itself, scored as human-written.

Conclusion – The Truth About AI Detection Checkers

Are AI detection checkers perfect? Not at all. But they do offer helpful signals when used thoughtfully. Writers, students, and educators should view them as part of a bigger toolbox, not a final verdict.

The safest path? Write smarter.

Use AI tools like ChatGPT for a draft, then humanize it with Walter Writes AI to avoid flags and improve clarity.

FAQ – About AI Detection Checkers

Q1: Are AI detectors always accurate?

No. As our test shows, results vary significantly between tools and writing styles.

Q2: Can you trick an AI detector?

Basic paraphrasing might help, but sophisticated tools like Walter Writes AI can reword content while improving quality.

Q3: Which AI detector is best?

In our experiment, Walter Writes AI Detector stood out for accuracy and consistency.

Q4: What’s the safest way to use AI writing?

Refine and humanize your AI output before publishing or submitting. Walter Writes offers tools built for just that.