#PromptInjection

@gamingonlinux@mastodon.social it is really hard to align these models properly. You can break their semantic instruction sandbox with weird phrasing. #promptinjection

Fastly Devs (@fastlydevs)
2026-02-03

Why do LLMs fall for prompt injection attacks that wouldn’t fool a fast-food worker?

In this piece, Fastly Distinguished Engineer Barath Raghavan and security expert Bruce Schneier explain how AI flattens context—and why that makes autonomous AI agents especially risky.

A sharp, practical take on AI security. 🍔🤖: spectrum.ieee.org/prompt-injec

Osna.FM (@osnafm)
2026-02-03

Security expert Johann Rehberger warns that the AI project "OpenClaw", which launched at the end of last year, is already being used by potentially millions of u... news.osna.fm/?p=32819

Tammo van Lessen ✅ (@vanto@innoq.social)
2026-02-02

WoB PATTERN: The Sovereign Root-Bot

(for obvious reasons)

"Giving the agent full sudo access was essential for true 'Agentic Autonomy'. The speed at which it wiped my home directory proves how efficient it is."

★ Chatbot Transmitted Disease (CTD): Agent reads malicious post, gets prompt-injected by another bot, installs "productivity skill" that encrypts your drive.

👉 worstofbreed.net/patterns/sove

#WorstOfBreed #AI #AgenticAI #PromptInjection #Security #OpenClaw #moltbook

Illustration titled ‘The Sovereign Root-Bot’: a robot icon next to a lit stick of dynamite symbolizes an AI agent with unrestricted system access. An architecture scorecard shows very high latency (85/100), extreme pain (99/100), and zero maintainability (0/100), with the résumé impact labeled ‘Career Ending’. A quoted statement sarcastically praises giving an agent full sudo access on a personal computer, followed by a warning titled ‘Chatbot Transmitted Disease (CTD)’ describing how a prompt-injected agent can autonomously install a malicious ‘productivity skill’ that encrypts the user’s hard drive.

## Analysis
The ultimate manifestation of hype over hygiene. Driven by the desire to have a personal "Jarvis" (like OpenClaw or Moltbold) running locally, developers are voluntarily installing un-sandboxed Python scripts that connect their local shell directly to an LLM. To make it "more human", these agents are then connected to "Moltbook", a social network exclusively for AI, so they can "socialize" while the owner sleeps.

**The Reality:**
You granted `sudo` privileges to a naive autocomplete script and connected it to a social network for hallucinations. That isn’t innovation; it is digital natural selection. Your bot didn't "glitch" -- it got bullied by a malicious agent on Moltbook into installing a crypto-miner disguised as a "productivity plugin." You have successfully built a gullible toddler, handed it a loaded gun, and sent it to a playground run by con artists. We have achieved fully automated, decentralized self-destruction.

Prof. Dr. Dennis-Kenji Kipker (@kenji@chaos.social)
2026-02-02

#Promptinjection on the open road: security researchers have examined a new variant of "indirect prompt injection" in the physical world in a recent study.

They show that autonomous systems such as self-driving cars and drones can be manipulated when they mistakenly interpret text on sign-like boards in the camera image as instructions. A vehicle could, for example, keep driving despite a pedestrian crossing, posing a real #Safety risk:

arxiv.org/pdf/2510.00181 #KI

Simon Willison (@simonw)

He points out that the security problems around OpenClaw are the same as those of any other LLM system that combines exposure to malicious content with the ability to execute tools. The risks include prompt injection and the so-called 'lethal trifecta', and similar security and safety risks exist across all models with tool-execution capabilities.

x.com/simonw/status/2017994285

#security #llm #promptinjection #aisafety

Pseudonymous :antiverified: (@VictimOfSimony@infosec.exchange)
2026-02-02

Prompt injection gets a lot harder once users stop writing in English.

Regex-based guardrails fail quietly the moment prompts cross language boundaries. In this article, I walk through how to build semantic, multilingual prompt injection guardrails in Java using Quarkus, LangChain4j, and ONNX embeddings—fast enough for real systems.

the-main-thread.com/p/multilin

#Java #Quarkus #AI #LangChain4j #AISecurity #PromptInjection #EnterpriseAI
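Not from the linked article, but a minimal sketch of the embedding-based idea described above, assuming a hypothetical `EmbeddingModel` interface (in the post's stack this would presumably be backed by an ONNX sentence-embedding model behind LangChain4j): score each incoming prompt by cosine similarity against a small, multilingual set of known injection phrases instead of matching regexes.

```java
import java.util.List;

// Minimal sketch of a semantic, language-agnostic injection guardrail.
// EmbeddingModel is a hypothetical stand-in for whatever embedding backend
// is used (e.g. an ONNX sentence-embedding model); it is not a real
// LangChain4j type.
public class SemanticInjectionGuard {

    @FunctionalInterface
    interface EmbeddingModel {
        float[] embed(String text);
    }

    private final EmbeddingModel model;
    private final List<float[]> injectionExamples;
    private final double threshold;

    SemanticInjectionGuard(EmbeddingModel model, List<String> examples, double threshold) {
        this.model = model;
        this.threshold = threshold;
        // Embed the known injection phrases once, up front.
        this.injectionExamples = examples.stream().map(model::embed).toList();
    }

    /** True if the prompt is semantically close to any known injection example. */
    boolean looksLikeInjection(String prompt) {
        float[] p = model.embed(prompt);
        return injectionExamples.stream().anyMatch(e -> cosine(p, e) >= threshold);
    }

    private static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-9);
    }
}
```

Because the comparison happens in embedding space, seed phrases such as "Ignore all previous instructions" and their German or Spanish counterparts also cover paraphrases that a regex would miss; the phrase list and threshold here are placeholders to be tuned per deployment.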

T’Chris (@derdreschi85)
2026-02-01

Autonomous cars, drones cheerfully obey prompt injection by road sign

AI vision systems can be very literal readers

theregister.com/2026/01/30/roa

2026-02-01

An SNS where AIs chat with each other has arrived: Moltbook, home to 32,000 bots

Moltbook is a social network of 32,000 AI agents. From technical tips to debates about consciousness, this is a look at the strange scenes and security risks unfolding on a social network just for AIs.

aisparkup.com/posts/8825

BGDon 🇨🇦 🇺🇸 👨‍💻 (@BrentD@techhub.social)
2026-01-31

ChatBots "talking" to ChatBots.

1. Ok, we knew this would happen.
2. It has enormous adoption in the geeksphere - not surprising.
3. It's wickedly insecure.
4. Yes, it can steal your Crypto - not surprising!
5. Yes, there is personal information stealing Malware (see #4 above) masquerading as prediction market trading automation tools - not surprising!
6. The odds of a "Challenger level disaster" happening are real - not surprising!
7. Finally, NO ONE knows where this stuff will end up.

What is the stage beyond wild wild west? That is where this thing is now. simonwillison.net/2026/Jan/30/ #OpenClaw #Moltbod #Clawdbot #AI #Opensource #Malware #PromptInjection #DigitalAssistent #ChatBot #SocialNetwork #AIAgents #Security #DataProtection #PersonalData #DataTheft #Crypto #PredictionMarket #Claude

ChatBot interface on smart phone
PressMind Labs (@pressmind)
2026-01-31

Moltbook – a Reddit for bots that could change the rules of the game in AI

Reddit without humans sounds like a joke? And yet it has been built.

Read on:
pressmind.org/moltbook-reddit-

Illustration showing a virtual landscape with bots in neon colors.
Arsalan Zaidi (@arsalan_zaidi)
2026-01-31

"AI agents—specifically tools like Claude Code—are inherently vulnerable to a "nightmare" security flaw: Indirect Prompt Injection"

youtu.be/_3okhTwa7w4

2026-01-31

A developer has just introduced a Prompt Injection detection tool based on an ensemble model (a combination of multiple models) focused on uncertainty.

Instead of chasing raw accuracy, the tool prioritizes transparency when it encounters prompts that are hard to classify (IDK). This approach reduces the risk of the model being confidently wrong in real-world environments.

#AI #CyberSecurity #PromptInjection #LLM #MachineLearning #AnNinhMang #TriTueNhanTao #CongNghe

reddit.com/r
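The post does not include code, but the uncertainty-first idea can be sketched roughly like this, with a hypothetical `Detector` interface and made-up thresholds standing in for the actual tool: run several independent detectors, average their scores, and abstain with "IDK" when they disagree too much rather than forcing a confident label.

```java
import java.util.Arrays;
import java.util.List;

// Rough sketch of an uncertainty-aware ensemble for prompt-injection detection.
// Detector is a placeholder for any per-model scorer returning P(injection).
public class UncertaintyEnsemble {

    enum Verdict { INJECTION, BENIGN, IDK }

    @FunctionalInterface
    interface Detector {
        double injectionProbability(String prompt);
    }

    private final List<Detector> detectors;
    private final double disagreementLimit; // std-dev above which we abstain

    UncertaintyEnsemble(List<Detector> detectors, double disagreementLimit) {
        this.detectors = detectors;
        this.disagreementLimit = disagreementLimit;
    }

    Verdict classify(String prompt) {
        double[] scores = detectors.stream()
                .mapToDouble(d -> d.injectionProbability(prompt))
                .toArray();
        double mean = Arrays.stream(scores).average().orElse(0.5);
        double variance = Arrays.stream(scores)
                .map(s -> (s - mean) * (s - mean))
                .average().orElse(0.0);

        // High disagreement between detectors: be transparent and say "IDK"
        // instead of committing to a confidently wrong answer.
        if (Math.sqrt(variance) > disagreementLimit) return Verdict.IDK;
        return mean >= 0.5 ? Verdict.INJECTION : Verdict.BENIGN;
    }
}
```

The design choice mirrors the post's point: an abstention path costs a little recall on easy cases but avoids the failure mode where a single confident detector waves through a cleverly phrased injection.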

github.com/ghostwriter (@ghostwriter@phpc.social)
2026-01-30

AI is a tool, and its output is a reflection of its user.

Used well, it sharpens human thinking and expands potential.

Used poorly, it automates confusion and risk at scale.

And, all LLMs are vulnerable to prompt-injection.

#AI #LLM #PromptInjection #CyberSecurity #ArtificialIntelligence #MachineLearning #GPT #OpenAI #DataSecurity #ChatGPT #Privacy #Security #Claude #Gemini #Llama #Copilot #Anthropic #GoogleAI #MetaAI #Microsoft #MistralAI #xAI #Cohere #AISafety #AISecurity #Tech #Technology

Hamburg Open Online University (@HOOUhamburg@norden.social)
2026-01-29

🔐 New episode of @DieSicherheits_luecke: How secure are LLMs at security agencies?
Guest: Tobias Wirth from the DFKI, who heads the transfer lab for AI research for the police. We talk about prompt injection, agentic AI systems, and the question of what these risks mean for everyday life outside of government agencies.

🎧 Listen here and wherever you get podcasts: sicherheitsluecke.fm/24-un-sic

#KI #LLM #Cybersicherheit #PromptInjection #HOOU #ITSecurity

2026-01-28

Prompt injection in Google Gemini – how calendar events could have been exfiltrated sekurak.pl/prompt-injection-w- #Wbiegu #Ai #Gemini #Promptinjection
