Introduction
Disclaimer: around 50% of this post is AI-generated content. You know you were going to ask.
For most of us, writing a short article on a topic in which we have no expertise is a daunting task. The thought of sitting down and trying to write something authentic and cohesive in less than two minutes sends a shiver down the spine. Computers, however, treat such articles as simple work and can produce them with incredible speed.
Here are several techniques that can help you determine whether an article was written by an AI or a human; these methods can help you make sure your content was not lifted from bots or other automated programs.
Sentiment and emotion expressions
Sentiment analysis is the process of determining whether text (or other media) is positive, negative, or neutral. The purpose of sentiment analysis is to determine how people feel about a topic or brand. It's usually conducted on social media, but can also be used in political campaigns and marketing studies.
Sentiment analysis is a Natural Language Processing (NLP) task and is typically done using machine learning techniques.
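To make the positive/negative/neutral idea concrete, here is a minimal sketch that labels text by counting words from two tiny word lists. The word lists and the function name sentiment are made up for illustration; a real system would more likely use a trained machine learning model than a hand-written lexicon.

```python
import re

# Tiny illustrative lexicons (assumptions for this sketch, not a real resource).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}


def sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive word adds one to the score, each negative word subtracts one.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in ("I love this great phone", "The meeting is at noon"):
        print(sample, "->", sentiment(sample))
```

A word-counting approach like this misses negation and sarcasm, which is one reason machine learning models are usually preferred.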
Length
Length can be a useful indicator of whether you're dealing with a human or an AI. Humans tend to write longer sentences and paragraphs, while machines are more likely to make their point quickly and succinctly.
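If you want to put numbers on this, the sketch below computes average sentence and paragraph length for a piece of text. It assumes sentences end in ".", "!" or "?" and that paragraphs are separated by blank lines; the function name length_stats is made up for this example, and the results are only a rough signal, not a verdict.

```python
import re


def length_stats(text: str) -> dict:
    """Return average words per sentence and sentences per paragraph."""
    # Paragraphs: chunks separated by one or more blank lines (an assumption).
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: naive split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"\w+", text)
    return {
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "sentences_per_paragraph": (
            len(sentences) / len(paragraphs) if paragraphs else 0.0
        ),
    }


if __name__ == "__main__":
    sample = "One two three. Four five.\n\nSix seven eight nine."
    print(length_stats(sample))
```

Comparing these averages between a text you know a person wrote and one you suspect is machine-written is a simple way to try out the idea.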
Cohesion and complexity
Cohesion is the degree to which a text or discourse is unified. Complexity is the degree to which a text or discourse is difficult to understand. Both of these properties can be roughly measured with very simple methods, such as counting sentences (x-axis) and words used (y-axis) and comparing texts on those two axes.
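The counting approach can be sketched in a few lines: each text becomes one (sentences, words) point that you could place on the two axes described above. The function name text_point and the sample texts are made up for illustration, and the sentence splitting is deliberately naive.

```python
import re


def text_point(text: str) -> tuple:
    """Return (sentence_count, word_count), i.e. an (x, y) point for one text."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"\w+", text)
    return len(sentences), len(words)


if __name__ == "__main__":
    # Hypothetical sample texts, used only to show the output shape.
    samples = {
        "sample_a": "Short one. Another short one.",
        "sample_b": "This single sentence keeps going for quite a while before it stops.",
    }
    for name, text in samples.items():
        x, y = text_point(text)
        print(f"{name}: sentences={x}, words={y}")
```

Two counts are a very coarse proxy for cohesion and complexity, so treat the resulting points as a starting place for comparison rather than a measurement.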
Spoken language is different than written language
Spoken language is more informal, shorter (as in the length of sentences), and less complex than written language.
Spoken language is also more repetitive and emotional compared to written language.
This can make it difficult for a machine learning model that was trained on text to understand spoken words correctly. But even if you train your model on both audio and text, you're still going to have problems with certain types of speech (e.g., accents).
Early detection web sites exist
http://gltr.io/dist/index.html - rough, but offers basic analysis
https://huggingface.co/openai-detector/ - another tool; it scores how confident it is that the content you submit is human-generated vs. computer-generated
https://originality.ai/ (paid site) - more polished and designed for business use rather than academic settings
Many sites exist for teachers to check for plagiarism (for example, https://papersowl.com/free-plagiarism-checker); some of these will be adapted to spot AI-generated content
There are also tools that rewrite content so it doesn't appear to come from a computer. In short, this will be a cat-and-mouse game where AI continually learns how to appear more and more human (some argue that the mere presence of these scoring sites actually helps the AI tools appear more human).