My Poetry Style Defeats Your AI Security Style
The Register: Researchers find hole in AI guardrails by using strings like =coffee. “Large language models frequently ship with ‘guardrails’ designed to catch malicious input and harmful output. But if you use the right word or phrase in your prompt, you can defeat these restrictions.”
ChatGPT introduces parental controls
https://web.brid.gy/r/https://nerds.xyz/2025/09/chatgpt-openai-parental-controls/
UC Riverside: UCR researchers fortify AI against rogue rewiring. “…researchers at the University of California, Riverside, have developed a method to preserve AI safeguards even when open-source AI models are stripped down to run on lower-power devices.”
https://rbfirehose.com/2025/09/09/uc-riverside-ucr-researchers-fortify-ai-against-rogue-rewiring/
After teen suicide, OpenAI claims it is “helping people when they need it most” https://arstechni.ca/RSBe #attentionmechanism #crisisintervention #AIandmentalhealth #contentmoderation #suicideprevention #transformermodels #AIhallucination #machinelearning #AIpaternalism #AIassistants #AIregulation #AIsafeguards #mentalhealth #AIalignment #AIbehavior #AIethics #AIsafety #chatbots #ChatGPT #Biz&IT #GPT-4o #openai #GPT-5 #AI
via @conorperkins@mastodon.social
@conorperkins@threads.net
More details: https://deadline.com/2024/08/ai-protection-bill-california-update-1236061416/
#ArtificialIntelligence #GenerativeAI #DigitalReplicas #actors #politicians #contracts #laws #legislation #CourtDecisions #JudicialRulings #AI #AIsafeguards #safeguards #UnionStrong #SAGAFTRAstrong #union #SAGAFTRA
#AISafeguards Are Pretty Easy to Bypass
https://www.pcmag.com/news/ai-safeguards-are-pretty-easy-to-bypass