For text, this is usually done to prevent alleged plagiarism, often by detecting telltale signs that a text was AI-generated, such as repeated wording or hallucinations. Detectors are often used by teachers marking their students' work, usually on an ad hoc basis. Following the release of
ChatGPT and similar generative AI text software, many educational establishments have issued policies against the use of AI by students. AI text detection software is also used by those assessing job applicants, as well as by online search engines.

Current detectors can be unreliable and have incorrectly marked work written by humans as originating from AI.
MIT Technology Review said that the technology "struggled to pick up ChatGPT-generated text that had been slightly rearranged by humans and obfuscated by a paraphrasing tool". AI text detection software has also been shown to discriminate against non-native speakers of English. In at least one case, students who had been accused of using AI on the basis of such software were cleared of any wrongdoing after a thorough investigation.

In April 2023,
Cambridge University and other members of the
Russell Group of universities in the United Kingdom opted out of Turnitin's AI text detection tool after expressing concerns that it was unreliable. The University of Texas at Austin opted out of the system six months later.

In May 2023, a professor at
Texas A&M University–Commerce used ChatGPT to ask whether his students' submissions had been written by it, and ChatGPT claimed that they had. On that basis he threatened to fail the class, even though ChatGPT is not able to detect AI-generated writing. No students were prevented from graduating because of the issue, and all but one student (who admitted to using the software) were cleared of the accusation of having used ChatGPT in their work.

In July 2023, a paper titled "GPT detectors are biased against non-native English writers" was released, reporting that GPT detectors discriminate against non-native English authors. The paper evaluated seven GPT detectors on essays written by non-native English speakers and by students in the United States; the essays by non-native English speakers were falsely flagged as AI-generated at an average rate of 61.3%.

An article by Thomas Germain, published on Gizmodo in June 2024, reported job losses among freelance writers and journalists caused by AI text detection software mistakenly classifying their work as AI-generated.

In September 2024, Common Sense Media reported that generative AI detectors had a 20% false positive rate for Black students, compared with 10% for Latino students and 7% for White students.

To improve the reliability of AI text detection, researchers have explored
digital watermarking techniques. A 2023 paper titled "A Watermark for Large Language Models" presents a method to embed imperceptible watermarks into text generated by large language models (LLMs).
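The scheme described in that paper is often summarized as a "green list" watermark: before each token is sampled, the previous token seeds a pseudo-random partition of the vocabulary into a "green" and a "red" list, green-list tokens receive a small logit boost during generation, and a detector that knows the seeding scheme counts how many tokens fall on their green lists and tests that count against chance. The sketch below is a minimal illustration of that idea in Python, not the authors' implementation; the constants, hashing choice, and helper names are assumptions made for the example.

```python
import hashlib
import math
import random

GAMMA = 0.5   # assumed fraction of the vocabulary placed on the "green" list
DELTA = 2.0   # assumed logit bias added to green-list tokens during generation


def green_list(prev_token: int, vocab_size: int) -> set[int]:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(GAMMA * vocab_size)])


def bias_logits(logits: list[float], prev_token: int) -> list[float]:
    """Generation side: add DELTA to the logits of green-list tokens before sampling."""
    greens = green_list(prev_token, len(logits))
    return [x + DELTA if i in greens else x for i, x in enumerate(logits)]


def watermark_z_score(tokens: list[int], vocab_size: int) -> float:
    """Detection side: count tokens that fall on their green list and return a
    z-score against the null hypothesis that each token is green with
    probability GAMMA (i.e. that the text is unwatermarked)."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0  # too short to test
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab_size)
    )
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

A detector built this way would flag text whose z-score exceeds some chosen threshold (for example 4); because the bias is soft, heavy paraphrasing replaces green tokens and weakens the signal, which is the robustness limitation discussed below.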
This watermarking approach allows content to be flagged as AI-generated with a high level of accuracy, even when the text is slightly paraphrased or modified. The watermark is designed to be imperceptible to casual readers, preserving readability, while providing a detectable signal to those using specialized tools. However, while promising, watermarking faces challenges in remaining robust under adversarial transformations and in ensuring compatibility across different LLMs.

== Anti text detection ==