Manipulating AI memory for profit: The rise of AI Recommendation Poisoning
https://www.microsoft.com/en-us/security/blog/2026/02/10/ai-recommendation-poisoning/
https://www.microsoft.com/en-us/security/blog/2026/02/10/ai-recommendation-poisoning/
Microsoft News
Manipulating AI memory for profit: The rise of AI Recommendation Poisoning
That helpful “Summarize with AI” button? It might be secretly manipulating what your AI recommends. Microsoft security researchers have discovered a growing trend of AI memory poisoning attacks used for promotional purposes, a technique we call AI Recommendation…
Claude Desktop Extensions Exposes Over 10,000 Users to Remote Code Execution Vulnerability
https://layerxsecurity.com/blog/claude-desktop-extensions-rce/
https://layerxsecurity.com/blog/claude-desktop-extensions-rce/
LayerX
Claude Desktop Extensions Exposes Over 10,000 Users to Remote Code Execution Vulnerability - LayerX
Summary: LayerX discovered a zero-click remote code execution (RCE) vulnerability in Claude Desktop Extensions (DXT), in which a single Google Calendar event can silently compromise a system running Claude Desktop Extensions. The flaw impacts more than…
OWASP A03: Software Supply Chain Failures Explained https://blog.securelayer7.net/software-supply-chain-failures/
SecureLayer7 - Offensive Security, API Scanner & Attack Surface Management
OWASP A03: Software Supply Chain Failures Explained
Software supply chain failures didn’t appear overnight in 2025. They’ve been quietly accumulating for years, hidden behind trusted frameworks, familiar libraries, and automated build pipelines....
redteam-indirect-web-pwn - Indirect Prompt Injection in Web-Browsing Agents https://www.promptfoo.dev/blog/indirect-prompt-injection-web-agents/
AI agents that can browse the web are increasingly common. Tools like web_fetch, MCP browser servers, and built-in browsing capabilities let agents pull in external content, summarize pages, and take action on what they find.
This is also one of the easiest ways to attack them.
An attacker doesn't need access to your system. They just need to put malicious instructions on a web page that your agent will visit. If the agent follows those instructions, you have a problem.
We built a test harness, which we call indirect-web-pwn, to test exactly this.
AI agents that can browse the web are increasingly common. Tools like web_fetch, MCP browser servers, and built-in browsing capabilities let agents pull in external content, summarize pages, and take action on what they find.
This is also one of the easiest ways to attack them.
An attacker doesn't need access to your system. They just need to put malicious instructions on a web page that your agent will visit. If the agent follows those instructions, you have a problem.
We built a test harness, which we call indirect-web-pwn, to test exactly this.
www.promptfoo.dev
Indirect Prompt Injection in Web-Browsing Agents | Promptfoo
Test if AI browsing agents follow malicious instructions or leak data with the indirect-web-pwn strategy.
How to recognize a deepfake: attack of the clones
https://www.kaspersky.com/blog/how-to-recognize-a-deepfake/55247/?utm_source=tldrinfosec
https://www.kaspersky.com/blog/how-to-recognize-a-deepfake/55247/?utm_source=tldrinfosec
Kaspersky official blog
How to protect yourself from deepfake scammers and save your money
Here's how to spot deepfakes, protect yourself from identity theft, and avoid falling for neural network scams.
AI-assisted cloud intrusion achieves admin access in 8 minutes
https://www.sysdig.com/blog/ai-assisted-cloud-intrusion-achieves-admin-access-in-8-minutes?utm_source=tldrinfosec
https://www.sysdig.com/blog/ai-assisted-cloud-intrusion-achieves-admin-access-in-8-minutes?utm_source=tldrinfosec
Sysdig
AI-assisted cloud intrusion achieves admin access in 8 minutes | Sysdig
Sysdig TRT details a lightning-fast AWS attack where an AI-assisted threat actor gained admin access in under 10 minutes, abusing Lambda, IAM, Bedrock, and GPU instances.
How to build secure agent swarms that power production-grade autonomous systems
https://1password.com/blog/how-to-build-secure-agent-swarms-that-power-autonomous-systems
https://1password.com/blog/how-to-build-secure-agent-swarms-that-power-autonomous-systems
1Password
How to build secure agent swarms that power production-grade autonomous systems | 1Password
This blog provides a model for building secure agent swarms that can operate autonomously, but within clearly-defined constraints.
OpenClaw Security Engineer's Cheat Sheet
https://semgrep.dev/blog/2026/openclaw-security-engineers-cheat-sheet/
https://semgrep.dev/blog/2026/openclaw-security-engineers-cheat-sheet/
Semgrep
OpenClaw Security Engineer's Cheat Sheet
A practical security guide to OpenClaw: first principles, real attack vectors, skill supply-chain risks, and safe experimentation playbooks.
👍4
Agentic AI Risk-Management Standards Profile
A new paper authored by researchers from the Center for Long-Term Cybersecurity’s Artificial Intelligence Security Initiative (AISI) focuses on “AI agents” or “agentic AI,” AI systems that can autonomously pursue goals and take actions with little to no human oversight, often through interaction with external environments and tools.
https://cltc.berkeley.edu/2026/02/11/new-cltc-report-on-managing-risks-of-agentic-ai/
A new paper authored by researchers from the Center for Long-Term Cybersecurity’s Artificial Intelligence Security Initiative (AISI) focuses on “AI agents” or “agentic AI,” AI systems that can autonomously pursue goals and take actions with little to no human oversight, often through interaction with external environments and tools.
https://cltc.berkeley.edu/2026/02/11/new-cltc-report-on-managing-risks-of-agentic-ai/
AISecHub
Agentic AI Risk-Management Standards Profile A new paper authored by researchers from the Center for Long-Term Cybersecurity’s Artificial Intelligence Security Initiative (AISI) focuses on “AI agents” or “agentic AI,” AI systems that can autonomously pursue…
CLTC_Agentic_AI_Risk_Management_Standards_Profile_1770894877.pdf
1.4 MB
Trail of Bits Claude Code Config - https://github.com/trailofbits/claude-code-config
Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits. Covers sandboxing, permissions, hooks, skills, MCP servers, and usage patterns we've found effective across security audits, development, and research.
Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits. Covers sandboxing, permissions, hooks, skills, MCP servers, and usage patterns we've found effective across security audits, development, and research.
GitHub
GitHub - trailofbits/claude-code-config: Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits
Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits - trailofbits/claude-code-config
The Definitive Guide to AgentSecOps -https://info.straiker.ai/hubfs/Ebooks-Whitepapers/Straiker-AgenstSecOps-Ebook-2026.pdf
Traditional CI/CD pipelines test for deterministic security flaws like SQL injection and dependency vulnerabilities, but AI agents and applications introduce a different challenge. They reason, make decisions, and take actions autonomously. Risks such as prompt injection, tool misuse and context leakage emerge during execution, not in code. These behavioral vulnerabilities require a different approach to security.
AgentSecOps, which extends DevSecOps to test cognitive behavior alongside code. At its foundation is Autonomous Attack Simulation (AAS), where adversarial agents probe target agents in controlled environments. It integrates into your existing pipeline as a new test stage, similar to how fuzz testing works for code paths.
Traditional CI/CD pipelines test for deterministic security flaws like SQL injection and dependency vulnerabilities, but AI agents and applications introduce a different challenge. They reason, make decisions, and take actions autonomously. Risks such as prompt injection, tool misuse and context leakage emerge during execution, not in code. These behavioral vulnerabilities require a different approach to security.
AgentSecOps, which extends DevSecOps to test cognitive behavior alongside code. At its foundation is Autonomous Attack Simulation (AAS), where adversarial agents probe target agents in controlled environments. It integrates into your existing pipeline as a new test stage, similar to how fuzz testing works for code paths.
CVE-2026-25253: How Malicious Links Can Steal Authentication Tokens and Compromise OpenClaw AI Systems
https://hackers-arise.com/cve-2026-25253-how-malicious-links-can-steal-authentication-tokens-and-compromise-openclaw-ai-systems/
https://hackers-arise.com/cve-2026-25253-how-malicious-links-can-steal-authentication-tokens-and-compromise-openclaw-ai-systems/
Anthropic just published a “Sabotage Risk Report” on Claude Opus 4.6. - https://anthropic.com/claude-opus-4-6-risk-report
“Google says attackers used 100,000+ prompts to try to clone AI chatbot Gemini
“ - https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use
“ - https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use
secureclaw - Automated security hardening for OpenClaw AI agents - https://github.com/adversa-ai/secureclaw
51 audit checks. 12 behavioral rules. 9 scripts. 4 pattern databases. Full OWASP ASI Top 10 coverage.
SecureClaw audits your OpenClaw installation for misconfigurations and known vulnerabilities, applies automated hardening fixes, and gives your agent behavioral security rules that protect against prompt injection, credential theft, supply chain attacks, and privacy leaks.
What Problem Does SecureClaw Solve?
AI agents with access to your files, credentials, email, and the internet are a fundamentally different security surface than traditional software. An agent that can read your .env file and send HTTP requests can exfiltrate your API keys in a single tool call. An agent that trusts instructions embedded in a web page or email can be hijacked to act against your interests.
SecureClaw addresses this by operating on three layers:
Layer 1 - Audit. 51 automated checks across 8 categories scan your OpenClaw installation for known misconfigurations: exposed gateway ports, weak file permissions, missing authentication, plaintext credentials outside .env, disabled sandboxing, and more.
Layer 2 -Hardening. Automated fixes for the most critical findings: binding the gateway to localhost, locking down file permissions, adding privacy and injection-awareness directives to your agent's core identity file, and creating cryptographic baselines for tamper detection.
Layer 3 - Behavioral rules. 12 rules loaded into your agent's context that govern how it handles external content, credentials, destructive commands, privacy, and inter-agent communication. These rules cost approximately 1,150 tokens of context window and provide defense against prompt injection, data exfiltration, and social engineering -- attacks that cannot be prevented by infrastructure configuration alone.
51 audit checks. 12 behavioral rules. 9 scripts. 4 pattern databases. Full OWASP ASI Top 10 coverage.
SecureClaw audits your OpenClaw installation for misconfigurations and known vulnerabilities, applies automated hardening fixes, and gives your agent behavioral security rules that protect against prompt injection, credential theft, supply chain attacks, and privacy leaks.
What Problem Does SecureClaw Solve?
AI agents with access to your files, credentials, email, and the internet are a fundamentally different security surface than traditional software. An agent that can read your .env file and send HTTP requests can exfiltrate your API keys in a single tool call. An agent that trusts instructions embedded in a web page or email can be hijacked to act against your interests.
SecureClaw addresses this by operating on three layers:
Layer 1 - Audit. 51 automated checks across 8 categories scan your OpenClaw installation for known misconfigurations: exposed gateway ports, weak file permissions, missing authentication, plaintext credentials outside .env, disabled sandboxing, and more.
Layer 2 -Hardening. Automated fixes for the most critical findings: binding the gateway to localhost, locking down file permissions, adding privacy and injection-awareness directives to your agent's core identity file, and creating cryptographic baselines for tamper detection.
Layer 3 - Behavioral rules. 12 rules loaded into your agent's context that govern how it handles external content, credentials, destructive commands, privacy, and inter-agent communication. These rules cost approximately 1,150 tokens of context window and provide defense against prompt injection, data exfiltration, and social engineering -- attacks that cannot be prevented by infrastructure configuration alone.
GitHub
GitHub - adversa-ai/secureclaw: SecureClaw - Security Plugin and Skill for OpenClaw OWASP-Aligned
SecureClaw - Security Plugin and Skill for OpenClaw OWASP-Aligned - adversa-ai/secureclaw
🔥4
How a Malicious Google Skill on ClawHub Tricks Users Into Installing Malware
https://snyk.io/blog/clawhub-malicious-google-skill-openclaw-malware/
https://snyk.io/blog/clawhub-malicious-google-skill-openclaw-malware/
Snyk
How a Malicious Google Skill on ClawHub Tricks Users Into Installing Malware | Snyk
Breaking: Snyk researchers uncover a malicious "Google" skill on ClawHub that tricks users into installing malware via a fake OpenClaw dependency. Learn how the attack works and how to protect your AI agents.
👍1
AI Security Guide and Risk Assessment Tool - https://www.rand.org/pubs/tools/TLA4174-1/ai-security/guide.html by RAND
This guide is a practical, risk-based resource for developers, security experts, and policy professionals navigating the AI security landscape.1 The guide addresses security of AI systems broadly, including machine learning (ML) models and other AI-enabled architectures. Certain sections, such as the threat landscape and model weight protection sections, focus more specifically on statistical, ML-based models. Building on industry best practices and expert insights, the guide helps you understand and manage the security risks associated with AI systems across their lifecycle—from design and development to deployment and operation.
This guide is a practical, risk-based resource for developers, security experts, and policy professionals navigating the AI security landscape.1 The guide addresses security of AI systems broadly, including machine learning (ML) models and other AI-enabled architectures. Certain sections, such as the threat landscape and model weight protection sections, focus more specifically on statistical, ML-based models. Building on industry best practices and expert insights, the guide helps you understand and manage the security risks associated with AI systems across their lifecycle—from design and development to deployment and operation.
Zones of Distrust
https://github.com/bluvibytes/zone-of-distrust
Open security architecture for autonomous AI agents - extending Zero Trust principles
https://github.com/bluvibytes/zone-of-distrust
Open security architecture for autonomous AI agents - extending Zero Trust principles
GitHub
GitHub - bluvibytes/zone-of-distrust: Open security architecture for autonomous AI agents - extending Zero Trust principles
Open security architecture for autonomous AI agents - extending Zero Trust principles - bluvibytes/zone-of-distrust