A Turing test on Turing tests
TL;DR: A new tool to detect AI-generated text.
Text generators and text detectors play a game of 😾cat and mouse🐁.
🤖Artificial Intelligence tools, and especially GPT models, are replacing human jobs.
👨‍💻Even bad programmers will be replaced soon.
There were already several AI plagiarism detection tools out there.
Now OpenAI, the company behind the popular ChatGPT model, has released its own.
The full blog post is here:
Let's try it:
(I wrote this from my own experience, without reading ChatGPT's explanation.)
How to detect AI text generated
Artificial Intelligence can generate texts using pre-trained models.
These models are based on public domain texts and can create knowledge by predicting relations between concepts.
Synthetic texts generated by Artificial Intelligence give facts without quoting the sources.
Sometimes they are not very deep in the explanations unless we require refinement.
A simple way to detect generated texts is by comparing several known generators and asking them to write the desired text.
Another hint would be to add some grammar mistakes to see if the tool understands its dataset since only humans make mistakes.
If you read many texts generated by Artificial Intelligence you will notice they use the same grammar patterns.
You can instruct the generators to target other audiences, use active voices, etc.
You can then measure the words' distance, common words, etc.
After you compare it with a predicted threshold, you can check how confident we are the text was not generated by a human being.
I wrote the above text alone. After polishing it with writing assistant tools like HemingwayApp and Grammarly, I put it into the brand-new tool here.
The result: The classifier considers the text to be unlikely AI-generated. 😺
I passed the Turing test. 😺🧠
The scale is as follows:
"Very unlikely to be AI-generated" corresponds to a classifier threshold of <0.1. About 5% of human-written text and 2% of AI-generated text from our challenge set has this label.
"Unlikely to be AI-generated" corresponds to a classifier threshold between 0.1 and 0.45. About 15% of human-written and 10% of AI-generated text from our challenge set has this label.
"Unclear if it is AI written" corresponds to a classifier threshold between 0.45 and 0.9. About 50% of human-written text and 34% of AI-generated text from our challenge set has this label.
"Possibly AI-generated" corresponds to a classifier threshold between 0.9 and 0.98. About 21% of human-written text and 28% of AI-generated text from our challenge set has this label.
"Likely AI-generated" corresponds to a classifier threshold >0.98. About 9% of human-written text and 26% of AI-generated text from our challenge set has this label.
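The scale above can be sketched as a small lookup. This is only an illustration of the published thresholds, not the tool's actual code: the function names are hypothetical, the handling of scores that fall exactly on a boundary (0.45 and 0.9) is my assumption, and the 50/50 prior in the second function is also an assumption, since the real mix of human and AI texts is unknown.

```python
# Hypothetical helper names; thresholds and rates are copied from the scale above.

def label_for_score(score: float) -> str:
    """Map a classifier score in [0, 1] to the label from the published scale."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 0.10:
        return "very unlikely AI-generated"
    if score < 0.45:  # assumption: 0.45 itself belongs to the next band
        return "unlikely AI-generated"
    if score < 0.90:  # assumption: 0.9 itself belongs to the next band
        return "unclear if it is AI-generated"
    if score <= 0.98:  # the scale says "likely" starts above 0.98
        return "possibly AI-generated"
    return "likely AI-generated"


# Share of (human-written, AI-generated) challenge-set texts that get each label.
RATES = {
    "very unlikely AI-generated": (0.05, 0.02),
    "unlikely AI-generated": (0.15, 0.10),
    "unclear if it is AI-generated": (0.50, 0.34),
    "possibly AI-generated": (0.21, 0.28),
    "likely AI-generated": (0.09, 0.26),
}


def probability_ai_given_label(label: str, prior_ai: float = 0.5) -> float:
    """Bayes' rule: how likely a text is AI-generated once we know its label.

    prior_ai is an assumed share of AI-generated texts among everything checked.
    """
    human_rate, ai_rate = RATES[label]
    ai = ai_rate * prior_ai
    human = human_rate * (1.0 - prior_ai)
    return ai / (ai + human)


if __name__ == "__main__":
    for label in RATES:
        print(f"{label}: {probability_ai_given_label(label):.2f}")
```

With an even prior, "possibly AI-generated" gives 0.28 / (0.28 + 0.21), roughly 0.57, which is barely better than a coin flip. Only the extreme labels carry much information.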
ME: Explain how to detect AI text generated
ChatGPT: Detection of AI-generated text can be challenging, but there are several methods that can be used:
Pattern recognition: AI-generated text often exhibits patterns in sentence structure, language use, and word choice that can be distinctive compared to human-written text.
Statistical analysis: AI-generated text can be analyzed for certain statistical features such as character and word frequency, and n-gram distribution that are different from human-written text.
Human evaluation: In some cases, human experts can review the text and determine if it was written by a human or generated by AI based on their experience and expertise.
Use of AI models: AI models can be trained to identify AI-generated text using a labeled dataset of human-written and AI-generated text. The model can then be used to identify AI-generated text in new, unseen text.
It's worth noting that AI technology continues to evolve, and new methods for generating text are being developed, so these methods may not always be foolproof.
Nice explanation. Let's use the tool again.
Result: The classifier considers the text to be possibly AI-generated.😂
This label corresponds to a classifier score between 0.9 and 0.98.
That was expected.
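The "statistical analysis" idea from ChatGPT's answer (word frequency and n-gram distribution) can be sketched with the standard library. This is a toy illustration, not what OpenAI's classifier does: the function names, the chosen features, and the tokenizer are all my assumptions, and the numbers only become meaningful when you compare them across many human-written and generated texts.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens: list[str], n: int = 2) -> Counter:
    """Count every run of n consecutive words."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def text_features(text: str, n: int = 2) -> dict:
    """Compute a few simple statistics that can be compared between texts."""
    tokens = tokenize(text)
    words = Counter(tokens)
    ngrams = ngram_counts(tokens, n)
    total_ngrams = sum(ngrams.values())
    repeated = sum(count for count in ngrams.values() if count > 1)
    return {
        "tokens": len(tokens),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(words) / len(tokens) if tokens else 0.0,
        "top_words": words.most_common(5),
        # Share of n-gram occurrences that belong to a repeated n-gram.
        "repeated_ngram_share": repeated / total_ngrams if total_ngrams else 0.0,
    }


if __name__ == "__main__":
    sample = "AI-generated text often exhibits patterns in sentence structure."
    print(text_features(sample))
```

A real detector would feed features like these (or, more likely, a language model's own token probabilities) into a trained classifier and then compare the score against a threshold.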
Let's copy and paste this article.
Natural language generation (NLG) is a software process that produces natural language output. A widely-cited survey of NLG methods describes NLG as "the subfield of artificial intelligence and computational linguistics that is concerned with the construction of computer systems than can produce understandable texts in English or other human languages from some underlying non-linguistic representation of information".
While it is widely agreed that the output of any NLG process is text, there is some disagreement about whether the inputs of an NLG system need to be non-linguistic. Common applications of NLG methods include the production of various reports, for example weather and patient reports; image captions; and chatbots.
Automated NLG can be compared to the process humans use when they turn ideas into writing or speech. Psycholinguists prefer the term language production for this process, which can also be described in mathematical terms, or modeled in a computer for psychological research. NLG systems can also be compared to translators of artificial computer languages, such as decompilers or transpilers, which also produce human-readable code generated from an intermediate representation. Human languages tend to be considerably more complex and allow for much more ambiguity and variety of expression than programming languages, which makes NLG more challenging.
Result: The classifier considers the text to be unclear if it is AI-generated. 📚
This label corresponds to a classifier score between 0.45 and 0.9.
Looking at the page's history, we see several human editors and not many bots.
My conclusion is that ChatGPT was partially trained on this text, so the classifier is measuring overfitting.
I took the text generated by ChatGPT and edited it myself with my editor tools.
Detection of AI-generated text can be challenging, but you can use several methods.
You can also use AI models by training to identify AI-generated text using a labeled dataset of human-written and AI-generated text.
You can use pattern recognition since AI-generated text often exhibits patterns in sentence structure, language use, and word choice that can be distinctive compared to human-written text.
Besides, there is human evaluation where in some cases, human experts can review the text and determine if it was written by a human based on their experience and expertise.
You can also use a tool called statistical analysis to check AI-generated text for certain statistical features such as character and word frequency, and n-gram distribution that are different from human-written text.
You can identify AI-generated text in new and unseen text using the model.
Finally, AI technology is evolving quickly, and there are new methods for generating text which may not always be foolproof.
Result: The classifier considers the text to be possibly AI-generated. 👿
I was not able to fool the system.
I used the tool to check many of my articles:
ChatGPT: The Surprising Teacher of a +25 Year Senior Programmer
Result: The classifier considers the text to be unlikely AI-generated. 😃
Explain in 5 Levels of Difficulty: ChatGPT
Result: The classifier considers the text to be unclear if it is AI-generated.
It is accurate. I "talked" to ChatGPT to get some ideas. 😳
Why GitHub Copilot is not a Threat to your Job
Result: The classifier considers the text to be very unlikely AI-generated. 😃
☝️ You can try your own articles!
This is yet another amazing tool by OpenAI.
The Official Page has some very interesting details on the accuracy, determinism, and evaluation.