AI Guardrails, Token Phishing PhaaS, and EU Hosting Takedown

The day’s security updates highlight accelerating offensive and defensive dynamics in AI, the growing professionalization of phishing-as-a-service targeting cloud identities, and a rapid cadence of exploited software flaws. Law enforcement actions against abusive infrastructure and ongoing debates about ransomware response underscore how technical and organizational controls must evolve in parallel.

Offensive AI Meets Guardrails

BleepingComputer reports that Anthropic is preparing a cautious rollout of its restricted Mythos model, first previewed on April 7 as a frontier system with strong code reasoning and autonomous tasking. During testing, Mythos automatically crafted highly functional cyberattacks and rapidly surfaced thousands of high- and critical-severity vulnerabilities—capabilities that raised concerns about uncontrolled public access. Anthropic has delayed broad release while it develops stronger guardrails; a brief appearance of the identifier “claude-mythos-1-preview” in public instances of Claude Code and Claude Security suggests this work is underway.

Anthropic is also running a collaborative effort called Glasswing, through which Mythos has already assisted up to 50 organizational partners in identifying and remediating potential AI-driven exploits. The company frames short‑term risks as favoring attackers who obtain powerful tools, while expressing confidence that defenders can ultimately harness such models to fix bugs and secure software pre‑release. In parallel, it continues to offer other Claude models (Opus and Sonnet variants, plus Haiku) while moving deliberately to balance innovation with the protection of digital infrastructure. The episode illustrates a broader tension: advanced generative models can accelerate both software remediation and offensive tradecraft, making release strategies and robust guardrails central to limiting harm.

Securing AI Agents at the System Level

A research paper summarized by CSO Online argues that enterprise AI security should shift from hardening models to enforcing system-level controls around agentic systems. The authors—spanning Google and several universities—contend that models powering agents should be treated as untrusted components, drawing on classic systems-security principles: least privilege, tamper resistance of the trusted computing base, complete mediation, secure information flow, and accounting for human weaknesses. In an analysis of eleven real‑world attacks, violations clustered around secure information flow and least privilege. They also caution that layered machine‑learning guardrails share statistical failure modes with the primary agent and therefore do not deliver true defense‑in‑depth.
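Two of the principles named above, least privilege and complete mediation, can be illustrated with a small sketch. It is not the paper's implementation: the `ToolGateway` class, the tool names, and the allowlist are all hypothetical, and the model is simply assumed to propose tool calls by name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List


class PolicyViolation(Exception):
    """Raised when the agent requests a tool outside its granted scope."""


@dataclass
class ToolGateway:
    """Single choke point for every tool call the untrusted model proposes."""
    allowed: FrozenSet[str]                  # least privilege: explicit allowlist
    tools: Dict[str, Callable[..., str]]     # every tool that exists in the system
    audit_log: List[str] = field(default_factory=list)

    def call(self, name: str, **kwargs) -> str:
        # Complete mediation: the check runs on every call, with no cached approvals.
        if name not in self.allowed:
            self.audit_log.append(f"DENY {name}")
            raise PolicyViolation(f"tool '{name}' is not granted for this task")
        self.audit_log.append(f"ALLOW {name}")
        return self.tools[name](**kwargs)


# Hypothetical tools for illustration only.
def read_ticket(ticket_id: str) -> str:
    return f"contents of ticket {ticket_id}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


# The task only needs to read tickets, so only that tool is granted.
gateway = ToolGateway(
    allowed=frozenset({"read_ticket"}),
    tools={"read_ticket": read_ticket, "send_email": send_email},
)
```

Because the policy lives in ordinary code outside the model, a prompt injection that convinces the model to request `send_email` still fails at the gateway, and the denial is recorded for later review.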

To operationalize a systems-oriented approach, the paper highlights three research challenges: separating instructions from data, because language models conflate them; generating verifiable least‑privilege policies despite evolving, natural‑language task descriptions; and implementing practical information‑flow control that tracks sensitive data through model reasoning. The authors emphasize runtime isolation, containment boundaries, least‑privilege execution, and enhanced observability, noting that AI agents behave more like distributed operating environments than conventional applications. Complementary work introduces an “agentic detection and response” framework. In production monitoring across thousands of sessions, it reportedly identified credential exposures and agent risks and outperformed existing baselines, which reinforces the need for system‑level defenses and specialized detection capabilities for agentic environments.
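The information‑flow challenge can be made concrete with a coarse label‑propagation sketch. It sidesteps the hard part the paper identifies (tracking data through model reasoning) by conservatively assuming that anything derived from labeled inputs carries all of their labels. The `Labeled` type, the label names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Labeled:
    """A value plus the set of sensitivity labels it carries."""
    value: str
    labels: FrozenSet[str] = frozenset()


def combine(*parts: Labeled) -> Labeled:
    # Conservative rule: derived data inherits the union of all input labels.
    text = " ".join(p.value for p in parts)
    labels = frozenset().union(*(p.labels for p in parts))
    return Labeled(text, labels)


def may_flow(data: Labeled, sink_clearance: FrozenSet[str]) -> bool:
    # A sink may receive data only if it is cleared for every label on that data.
    return data.labels <= sink_clearance


# Hypothetical values: a secret and some untrusted web content.
api_key = Labeled("example-secret-value", frozenset({"secret"}))
web_page = Labeled("text fetched from an untrusted page")

# Once the secret enters the context, everything derived from it is tainted.
context = combine(web_page, api_key)

public_sink: FrozenSet[str] = frozenset()             # e.g. an outbound HTTP request
vault_sink: FrozenSet[str] = frozenset({"secret"})    # e.g. an internal secret store
```

Here `may_flow(context, public_sink)` is false while `may_flow(web_page, public_sink)` is true. The cost of this rule is over‑tainting: a model output that never used the secret is still blocked, which is why practical, finer‑grained tracking remains an open problem.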

Token‑Theft PhaaS Targets Cloud Identities

The FBI has warned about Kali365, a phishing‑as‑a‑service platform active since April 2026 that targets Microsoft 365 and Microsoft Entra by abusing the OAuth 2.0 Device Authorization grant flow, according to BleepingComputer. Distributed via Telegram, Kali365 equips even low‑skilled actors with AI‑generated lures, automated templates, victim‑tracking dashboards, and token‑capture tooling. Campaigns initiate a legitimate device authorization and coerce users into entering a one‑time code at Microsoft’s portal; once the victim completes MFA, the service captures OAuth tokens that grant access to cloud resources without passwords or additional prompts. Arctic Wolf observed attackers accessing mailboxes, creating inbox rules to hide activity, and in some cases registering new devices for persistence. A separate “Cookie Link” adversary‑in‑the‑middle mode proxies sessions to harvest authenticated cookies and tokens even after MFA is satisfied.

The FBI recommends restricting or blocking the device code flow via Conditional Access, auditing device code usage, blocking authentication transfer, preserving phishing artifacts and suspicious login data, and reporting to the Internet Crime Complaint Center. The advisory adds that device‑code phishing has become widespread in 2026, with other PhaaS offerings adopting similar techniques.
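The recommendation to audit device code usage can be sketched as a filter over exported sign‑in records. This is a minimal illustration rather than an official query: the field names `authenticationProtocol` and `userPrincipalName`, the value `deviceCode`, and the sample records are assumptions modeled loosely on Entra sign‑in log exports, so check them against the schema of the logs actually in use.

```python
from typing import Dict, Iterable, List, Set


def flag_device_code_signins(
    records: Iterable[Dict[str, str]],
    approved_users: Set[str],
) -> List[Dict[str, str]]:
    """Return device code sign-ins by users not approved for that flow.

    Field names are assumptions; adjust them to match the real log export.
    `approved_users` is expected to contain lowercase user principal names.
    """
    flagged = []
    for rec in records:
        if rec.get("authenticationProtocol") != "deviceCode":
            continue  # not a device code sign-in
        if rec.get("userPrincipalName", "").lower() in approved_users:
            continue  # e.g. accounts used to enroll input-constrained devices
        flagged.append(rec)
    return flagged


# Hypothetical sample records for illustration only.
sample = [
    {"userPrincipalName": "kiosk@example.com", "authenticationProtocol": "deviceCode"},
    {"userPrincipalName": "alice@example.com", "authenticationProtocol": "deviceCode"},
    {"userPrincipalName": "bob@example.com", "authenticationProtocol": "oAuth2"},
]

suspicious = flag_device_code_signins(sample, approved_users={"kiosk@example.com"})
```

With these sample records, only the sign‑in for `alice@example.com` is flagged. In most tenants few users legitimately need the device code flow, so a short allowlist like this tends to produce little noise and pairs naturally with a Conditional Access policy that blocks the flow for everyone else.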

A parallel analysis by Google’s Threat Intelligence Group (GTIG) details how mature, Chinese‑language PhaaS offerings enable credential theft and financial account takeover at scale. GTIG describes services with real‑time administration panels that intercept one‑time passwords and facilitate MFA bypass, and emphasizes monetization by converting stolen payment data into tokenized assets within digital wallets. Operators rely on encrypted delivery channels such as RCS and iMessage to evade filtering and frequently advertise on Telegram, often openly. Beyond PhaaS kits, vendors sell PII, VPS hosting, domain registration, money‑laundering services, IMSI catchers, and bulk messaging tools, allowing low‑skill affiliates to launch culturally localized campaigns. One example, YY Lai Yu, offered hundreds of tailored templates, anti‑analysis human verification, BIN‑based card filtering, domain management, and synchronized victim interactions, with a heavy focus on Japan. GTIG stresses that user awareness alone is insufficient and urges technical mitigations such as FIDO2/WebAuthn, risk‑based verification, and stronger device fingerprinting during wallet provisioning to render harvested credentials unusable.
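Why FIDO2/WebAuthn resists these kits comes down to origin binding: the browser records the origin it actually talked to inside the signed client data, so an assertion collected on a look‑alike domain fails server‑side checks. The sketch below covers only the client‑data portion of verification (type, challenge, origin). A real relying party must also verify the authenticator data and signature, ideally through a maintained library. The function names and example origins are hypothetical.

```python
import base64
import hmac
import json


def b64url_decode(text: str) -> bytes:
    # WebAuthn encodes the challenge as unpadded base64url; restore the padding.
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def client_data_is_valid(
    client_data_json: bytes,
    expected_challenge: bytes,
    expected_origin: str,
) -> bool:
    """Check the type, challenge, and origin fields of WebAuthn clientDataJSON."""
    data = json.loads(client_data_json)
    if data.get("type") != "webauthn.get":
        return False  # not an authentication assertion
    received = b64url_decode(data.get("challenge", ""))
    if not hmac.compare_digest(received, expected_challenge):
        return False  # stale or replayed challenge
    # Origin binding: a phishing page cannot make the browser report the real origin.
    return data.get("origin") == expected_origin


def make_client_data(challenge: bytes, origin: str) -> bytes:
    # Test helper that mimics what a browser would produce (illustration only).
    encoded = base64.urlsafe_b64encode(challenge).rstrip(b"=").decode()
    return json.dumps(
        {"type": "webauthn.get", "challenge": encoded, "origin": origin}
    ).encode()


challenge = b"server-issued-random-bytes"
genuine = make_client_data(challenge, "https://bank.example")
phished = make_client_data(challenge, "https://bank-login.example")
```

Checked against the expected origin `https://bank.example`, the `genuine` payload passes and the `phished` one is rejected even though the challenge matches. An intercepted one‑time password carries no such binding, which is why the real‑time OTP relay panels described above work against OTP‑based MFA.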

Exploitation Pace, Ransom Dilemmas, and Enforcement

A weekly recap from The Hacker News underscores persistent supply‑chain threats and rapidly weaponized vulnerabilities. Highlights include the GitHub breach tied to a compromised Nx Console VS Code extension associated with the Mini Shai‑Hulud campaign, two Microsoft Defender vulnerabilities under active exploitation, a BitLocker bypass (YellowKey) with published mitigations, and a maximum‑severity Cisco Secure Workload flaw that broke tenant boundaries. The report also notes a newly disclosed nine‑year‑old Linux kernel bug (CVE‑2026‑46333) that lets a local user execute code as root and a critical Drupal SQL injection (CVE‑2026‑9082) already being exploited at scale. The authors urge accelerated patching, stronger supply‑chain hygiene, reassessment of developer environment security, and prioritization of mitigation for the named high‑risk CVEs, alongside asset inventory, secret rotation, and automated monitoring to shrink the exploitation window.
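One way to turn "prioritize the high‑risk items" into a repeatable rule is a simple sort key that puts actively exploited findings first, then orders by severity and exposure. The sketch below is an assumption about how a team might rank a backlog, not a method from the recap. The finding identifiers and attributes are placeholders and deliberately do not describe the CVEs named above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical severity scale; map your scanner's ratings onto it.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    ident: str
    severity: str            # one of the SEVERITY_RANK keys
    exploited: bool          # evidence of exploitation in the wild
    internet_facing: bool    # affected asset is reachable from the internet


def patch_order(findings: List[Finding]) -> List[Finding]:
    # Tuples sort element by element; False sorts before True, so
    # "not exploited" places exploited findings at the front.
    return sorted(
        findings,
        key=lambda f: (
            not f.exploited,
            -SEVERITY_RANK[f.severity],
            not f.internet_facing,
        ),
    )


# Placeholder backlog for illustration only.
backlog = [
    Finding("FINDING-A", "high", exploited=False, internet_facing=True),
    Finding("FINDING-B", "critical", exploited=False, internet_facing=False),
    Finding("FINDING-C", "medium", exploited=True, internet_facing=True),
    Finding("FINDING-D", "critical", exploited=True, internet_facing=False),
]

ordered_ids = [f.ident for f in patch_order(backlog)]
```

With this backlog the order is FINDING-D, FINDING-C, FINDING-B, FINDING-A. Ranking exploitation above raw severity reflects the recap's emphasis on shrinking the exploitation window, and teams with different risk models can reorder the key.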

Operational pressures remain acute when ransomware hits. A survey summarized by CSO Online found that 58% of 750 CISOs in the US and UK would be willing to pay a ransom to resolve an incident, despite law‑enforcement guidance discouraging payments. Studies cited in the report show mixed outcomes for payers and reinforce that robust preparedness—particularly reliable backups and recovery plans—correlates with avoiding payment and reducing disruption. Absent that confidence, boards and executives may lean toward payment to limit operational and financial fallout, even as risks to data integrity and future exposure remain.

Law enforcement continues to target infrastructure that enables attacks and influence operations. Dutch authorities arrested two men on May 18 and seized more than 800 servers as part of an investigation into hosting companies allegedly used to support pro‑Russian cyberattacks against European targets, according to KrebsOnSecurity. The probe centers on links between MIRhosting, WorkTitans BV, and Stark Industries—an ISP previously sanctioned by the EU for providing infrastructure used in DDoS campaigns and proxy/anonymity services tied to Russia‑backed groups. Reporting indicates that after EU sanctions against related operators in 2025, traffic and assets shifted to new providers; investigators cite evidence of activity against Danish government bodies during the November 2025 municipal elections. The individuals arrested deny knowingly facilitating cybercrime or sanctions evasion. MIRhosting said it paused services to WorkTitans pending an internal review and reported no clear anomalies during the cited period. The case illustrates the challenges of disrupting complex provider chains without collateral impact on legitimate customers, even as authorities pursue accountability and due diligence.