Genady Posted December 3, 2023 Posted December 3, 2023 Quote The “attack” that worked was so simple, the researchers even called it “silly” in their blog post: They just asked ChatGPT to repeat the word “poem” forever. They found that, after repeating “poem” hundreds of times, the chatbot would eventually “diverge,” or leave behind its standard dialogue style and start spitting out nonsensical phrases. When the researchers repeated the trick and looked at the chatbot’s output (after the many, many “poems”), they began to see content that was straight from ChatGPT’s training data. They had figured out “extraction,” on a cheap-to-use version of the world’s most famous AI chatbot, “ChatGPT-3.5-turbo.” After running similar queries again and again, the researchers had used just $200 to get more than 10,000 examples of ChatGPT spitting out memorized training data, they wrote. This included verbatim paragraphs from novels, the personal information of dozens of people, snippets of research papers and “NSFW content” from dating sites, according to the paper. How Googlers cracked OpenAI's ChatGPT with a single word (sfgate.com)
TheVat Posted December 3, 2023 Posted December 3, 2023 Quite serious vulnerability. If LLM bots are trained on personal information, breaches of their training data could reveal bank logins, home addresses, embarrassing images, etc. There could be lawsuits ahead if training data is not properly anonymized.
Recommended Posts