
AI Platforms Harden Security as Clouds Boost Data and Compute
Coverage: 07 May 2026 (UTC)
Cloud and security vendors advanced AI platforms, data services, and identity controls while researchers disclosed critical flaws across popular AI frameworks and enterprise edge devices. Several updates focus on building security into model lifecycles and agent operations, and multiple high‑severity advisories underscore the need to patch quickly and reduce exposure. Infrastructure announcements expand GPU, compute, and storage options, and policy efforts aim to tighten safety evaluations for frontier models.
AI Platforms Add Speed, Transactions, and Built‑In Model Security
Google announced general availability of Gemini 3.1 Flash‑Lite on Gemini Enterprise, positioning it as the fastest, most cost‑efficient model in the Gemini 3 series for ultra‑low‑latency, high‑volume tasks. The model is aimed at agentic workloads such as tool calling and orchestration and is cited for cross‑industry adoption: engineering teams powering IDE assistants, customer service operators deploying text agents at scale, creative platforms enhancing multimodal pipelines, and financial/data teams driving real‑time research and triage. Reported results include ~60% lower costs versus “thinking”‑tier models on similar token mixes, ~1.8s p95 full‑reply latency, sub‑second p95 for classifiers and tool calls, and ~99.6% success rates under heavy concurrent load.
AWS introduced a preview of AgentCore payments for Amazon Bedrock, enabling autonomous agents to transact for services and data. Built with Coinbase and Stripe, the capability implements the x402 payment flow, authenticates wallets, executes stablecoin payments, and returns proofs without breaking the agent’s reasoning loop. Spending limits and governance are enforced at the infrastructure layer with full observability in existing logs, metrics, and traces. Through the AgentCore Gateway, the Coinbase x402 Bazaar MCP server exposes over 10,000 x402 endpoints that agents can discover and pay autonomously; the preview is available in US East (N. Virginia), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney).
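The idea of enforcing spending limits outside the agent's reasoning loop can be illustrated with a small sketch. This is not the AgentCore API: the `SpendingGuard` class, its limits, and the `BudgetExceeded` exception are hypothetical names invented here, and the amounts are arbitrary. It only shows the general pattern of authorizing each payment against a per-payment cap and a session budget before any funds move.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a requested payment would break a configured limit."""


class SpendingGuard:
    """Hypothetical per-session budget check, applied before a payment is sent."""

    def __init__(self, session_limit: str, per_payment_limit: str):
        # Decimal avoids binary floating-point rounding on currency amounts.
        self.session_limit = Decimal(session_limit)
        self.per_payment_limit = Decimal(per_payment_limit)
        self.spent = Decimal("0")

    def authorize(self, amount: str) -> Decimal:
        """Record the payment if allowed and return the remaining session budget."""
        value = Decimal(amount)
        if value <= 0:
            raise ValueError("amount must be positive")
        if value > self.per_payment_limit:
            raise BudgetExceeded(f"{value} exceeds the per-payment limit")
        if self.spent + value > self.session_limit:
            raise BudgetExceeded(f"{value} would exceed the session limit")
        self.spent += value
        return self.session_limit - self.spent
```

Keeping the check in a component the model cannot rewrite means a manipulated prompt can request a payment but cannot raise its own ceiling.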
Nutanix and Palo Alto Networks unveiled an integration that embeds Prisma AIRS AI Model Security and AI Red Teaming directly in the Nutanix Enterprise AI platform. Models are scanned at registry check‑in; dependencies are analyzed for known CVEs and license compliance; and provenance, formats, and execution paths are validated to catch deserialization exploits, embedded backdoors, unsafe file formats, and unauthorized code execution before deployment. The red teaming layer onboards LLMs, applications, or agents in minutes and drives tests via documented APIs integrated into CI/CD, drawing on 50+ techniques and a library of 750+ adversarial attacks. A single dashboard in the Nutanix environment consolidates visibility and records for compliance while reducing operational overhead for data science and ML engineering teams.
Exploitable Flaws in AI Toolchains and Network Appliances
Researchers disclosed the “Bleeding Llama” vulnerability in Ollama, tracked as CVE‑2026‑7482, an out‑of‑bounds heap read in the model quantization pipeline when loading GGUF files. Because many Ollama instances expose a REST API without authentication or bind to 0.0.0.0, unauthenticated attackers can upload a malicious model and, in as few as three API requests, trigger memory disclosure; they can then exfiltrate leaked process memory via the push API. Exposed secrets can include system prompts, conversation histories, environment variables, API keys, tokens, code, and customer data. A patched build is available; operators should update to 0.17.1, place an authentication proxy or gateway in front of instances, restrict network bindings and firewall access, rotate credentials, and harden local deployments (Ollama flaw).
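As a rough aid for the "restrict network bindings" step, the sketch below classifies a bind string as loopback-only or exposed. It assumes the address is written in a `host:port` style (for example, the value an operator might set in Ollama's `OLLAMA_HOST` environment variable); the function name `bind_is_exposed` is hypothetical, and the check says nothing about firewalls or proxies in front of the host.

```python
import ipaddress


def bind_is_exposed(bind: str) -> bool:
    """Return True if a 'host[:port]' bind string listens beyond loopback."""
    host = bind.strip()
    # Strip an optional URL scheme, e.g. "http://0.0.0.0:11434".
    if "://" in host:
        host = host.split("://", 1)[1]
    if host.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:11434".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # "host:port" form; bare IPv6 literals contain more than one colon.
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Unparseable or unknown hostname: flag it so an operator reviews it.
        return True
    return not addr.is_loopback
```

A wildcard such as `0.0.0.0` or `::` is reported as exposed, while `127.0.0.1` and `::1` are not. Hostnames other than `localhost` are conservatively flagged rather than resolved.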
Oasis Security detailed a critical Cline Kanban server issue (CVSS 9.7) stemming from missing Origin validation and absent session authentication on three WebSocket endpoints exposed on localhost. Any webpage visited while the local server is running can connect to endpoints that reveal runtime state and provide raw bidirectional terminal access, enabling command execution as if typed by the user. The risk is amplified by a default bypass‑permissions flag that lets the agent modify the filesystem without per‑action approval. Users should update to v0.1.66 and disable the bypass flag, with researchers noting a broader “localhost as trust boundary” anti‑pattern across similar tools (Cline Kanban).
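The two missing checks named above, Origin validation and session authentication, can be sketched as a pure function applied during the WebSocket handshake. This is an illustration of the general defense, not Cline's actual patch: the allowlist contents, the port, and the function names are assumptions, and how the session token reaches the server (query parameter, cookie, or subprotocol) is left open.

```python
import hmac
from urllib.parse import urlsplit

# Hypothetical allowlist: only the local UI served on this scheme, host, and port.
ALLOWED_ORIGINS = {("http", "localhost", 3000), ("http", "127.0.0.1", 3000)}
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_allowed(origin, allowed=ALLOWED_ORIGINS) -> bool:
    """Accept only an Origin header that exactly matches an allowlisted origin."""
    if not origin:
        return False  # Missing Origin is rejected rather than trusted.
    parts = urlsplit(origin)
    try:
        port = parts.port
    except ValueError:
        return False  # Malformed port.
    if port is None:
        port = DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port) in allowed


def handshake_allowed(origin, supplied_token: str, expected_token: str) -> bool:
    """Require both a trusted Origin and a matching per-session secret."""
    token_ok = hmac.compare_digest(
        supplied_token.encode("utf-8"), expected_token.encode("utf-8")
    )
    return origin_allowed(origin) and token_ok
```

Checking the Origin stops arbitrary webpages from connecting, and the per-session token covers clients that can forge or omit the header, so listening on localhost is no longer treated as proof of trust.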
Microsoft published a technical advisory, using Semantic Kernel as a case study, on prompt‑driven host compromise paths in AI agents. Two critical vulnerabilities were disclosed: CVE‑2026‑26030 (a Python in‑memory vector store filter that constructed a lambda via unsafe string interpolation, enabling os.system() execution) and CVE‑2026‑25592 (a .NET SessionsPythonPlugin helper exposed to the model, allowing writes to sensitive paths and persistence). The fixes add AST and function allowlists, dangerous‑attribute blocks, and name‑node restrictions, remove the helper from the model’s view, and introduce canonicalized path validation, directory allowlisting, and hardened I/O boundaries. Customers should upgrade semantic‑kernel (Python) to 1.39.4+ and the .NET SDK to 1.71.0+ and treat any model‑influenced parameter as attacker‑controlled (Semantic Kernel).
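To make the AST-allowlist idea concrete, here is a minimal sketch of validating a model-supplied filter expression before evaluating it. It is not the Semantic Kernel fix itself: the node allowlist, the permitted field names `category` and `price`, and the functions `validate_filter` and `run_filter` are assumptions chosen for illustration. The point is that calls, attribute access, and unknown names are rejected at parse time instead of being interpolated into a lambda.

```python
import ast

# Only comparisons, boolean logic, constants, and plain names are permitted.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Name, ast.Load, ast.Constant,
)
ALLOWED_NAMES = {"category", "price"}  # hypothetical record fields


def validate_filter(expr: str) -> ast.Expression:
    """Parse expr and reject any syntax or name outside the allowlists."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            # Calls, attributes, subscripts, lambdas, etc. all land here.
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            raise ValueError(f"disallowed name: {node.id}")
    return tree


def run_filter(expr: str, record: dict) -> bool:
    """Evaluate a validated filter against one record, with no builtins exposed."""
    code = compile(validate_filter(expr), "<filter>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(record)))
```

With this shape, `run_filter("price < 10 and category == 'book'", record)` evaluates normally, while an input such as `__import__('os').system('id')` raises `ValueError` before anything executes.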
Palo Alto Networks warned of limited in‑the‑wild exploitation of a critical PAN‑OS buffer overflow in the User‑ID Authentication Portal (CVE‑2026‑0300), an out‑of‑bounds write enabling unauthenticated, root‑level code execution on affected PA‑Series and VM‑Series firewalls when reachable from untrusted networks. Impact depends on whether the portal is enabled and ports 6081/6082 are exposed. The vulnerability received a CVSS of 9.3 for internet‑exposed deployments and was added to CISA’s KEV catalog. Immediate mitigations include restricting portal access to trusted ranges, disabling Captive Portal if unnecessary, turning off Response Pages on untrusted interfaces, and using Threat ID 510019 (content version 9097‑10022) where supported, followed by prompt installation of patched PAN‑OS builds (PAN-OS flaw).
Security researchers also reported a dozen critical issues in the vm2 Node.js sandbox that enable sandbox escapes, prototype pollution, allowlist bypasses, and arbitrary code execution on the host. Multiple CVEs (including CVE‑2026‑24118, CVE‑2026‑24781, CVE‑2026‑43997, CVE‑2026‑44005) affect releases through 3.11.1, with patches culminating in v3.11.2. Users should upgrade, review dependency trees, and audit configurations; where immediate updates are not possible, restrict inputs and apply defense‑in‑depth controls (vm2 flaws).
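For the "review dependency trees" step, a short sketch can flag vm2 copies older than the patched release in an npm lockfile. It assumes a `package-lock.json` in the v2/v3 layout, where installed packages appear under a top-level `packages` object keyed by paths such as `node_modules/vm2`; the function names are hypothetical, and pre-release tags are ignored for simplicity.

```python
import json

PATCHED = (3, 11, 2)  # first vm2 release the advisory lists as patched


def parse_version(version: str) -> tuple:
    """Turn '3.11.1' or 'v3.11.1-beta' into (3, 11, 1); pre-release tags are dropped."""
    core = version.lstrip("v").split("-", 1)[0].split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def vulnerable_vm2(lockfile_text: str) -> list:
    """Return sorted (path, version) pairs for vm2 installs below PATCHED."""
    lock = json.loads(lockfile_text)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Match top-level and nested installs, but not names like 'vm2-helper'.
        if path == "node_modules/vm2" or path.endswith("/node_modules/vm2"):
            version = meta.get("version", "")
            if version and parse_version(version) < PATCHED:
                hits.append((path, version))
    return sorted(hits)
```

Nested matches matter here because vm2 is often pulled in transitively, so a project can be affected without listing the package in its own dependencies.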
Cloud Data and Compute Performance Gains
AWS expanded concurrency scaling in Amazon Redshift to include high‑volume data ingestion via COPY from Amazon S3, now GA across commercial Regions and AWS GovCloud (US). The feature enables parallel loading of Parquet and ORC files and concurrent ingestion of multiple files, reducing queuing and contention with read workloads. Scaling is automatic for Redshift Serverless and configurable for provisioned clusters, delivering higher throughput without manual resizing and helping real‑time analytics and continuous ETL pipelines maintain responsiveness during peaks.
Google introduced a new Bigtable tier that integrates a hot in‑memory layer with SSD and HDD in a single cluster. Using RDMA to bypass CPU on reads, Google reports sub‑millisecond read latency, ~10× higher point‑read throughput per dollar, and resilience to hotspots up to 120,000 QPS on a single row. Bigtable automatically promotes hot rows into memory and evicts cold data while preserving consistency; memory‑enabled application profiles and policies support either automatic promotion or selective routing. The in‑memory tier is available in Bigtable Enterprise Plus, which also reduces p50 SSD latencies below 2 ms.
AWS broadened GPU access with EC2 G7e instances in Europe (London), powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and engineered for up to 2.3× inference throughput versus G6e, and made EC2 G6 with NVIDIA L4 GPUs available in the AWS European Sovereign Cloud (Germany) for customers with strict data residency and compliance needs.
Memory‑focused and network‑intensive compute options also expanded. AWS rolled out EC2 X8i instances to Europe (Ireland) and Asia Pacific (Mumbai), featuring Intel Xeon 6 processors, SAP certification, up to 6 TB memory, and up to 3.3× higher memory bandwidth with application‑level gains versus X2i. In Europe (Ireland), new Graviton4‑based M8gn/M8gb instances provide up to 600 Gbps networking or up to 300 Gbps EBS bandwidth, targeting high‑performance file systems, caches, analytics, databases, and NoSQL.
For general‑purpose throughput, AWS introduced M8idn/M8idb instances with custom sixth‑generation Intel Xeon processors and Nitro, claiming up to 43% better compute performance per vCPU than M6idn. M8idn delivers up to 600 Gbps networking, while M8idb offers up to 300 Gbps EBS bandwidth and low‑latency local NVMe for I/O‑intensive workloads. To accelerate AI/ML training cluster readiness, AWS added AMI‑based node lifecycle configuration to SageMaker HyperPod for Slurm, shipping pre‑baked images (Docker, Enroot, Pyxis) and operational settings to cut cluster creation time, with optional extension scripts for customization.
Governance, Identity, and the AI Defense Posture
Microsoft marked World Passkey Day with progress updates and roadmap items to accelerate passwordless adoption. Working with the FIDO Alliance, the company cites an estimated 5 billion passkeys worldwide and hundreds of millions of daily passkey sign‑ins across Microsoft consumer services. Internally, phishing‑resistant authentication now covers 99.6% of users and devices. Product updates include synced passkeys and unified profiles in Entra ID, device‑bound passkeys via Windows Hello (GA late May 2026), passkeys for Entra External ID (GA late May 2026), preview of passkey‑preferred authentication, and passkey saving/sync in Microsoft Password Manager. Entra ID account recovery is GA with government ID verification and biometric checks, and security questions will be removed as a reset option starting January 2027 (Passkey Day).
AWS published its April security roundup spanning identity, AI security, cryptography, detection/response, and data protection, with guidance on ABAC using IAM Identity Center session tags and Microsoft Entra ID, securing agentic AI with the Model Context Protocol and governance principles, and migration to hybrid post‑quantum TLS for Secrets Manager clients. The digest also covers CloudHSM cross‑Region DR cloning, Threat Technique Catalog updates, automated forensics, OCSF transformations, multicloud security via AWS Security Hub Extended, notable April CVEs, and 16 new samples/workshops for hands‑on validation (AWS digest).
Palo Alto Networks described accelerating offensive capabilities from frontier AI models and launched a Frontier AI Defense initiative combining early model access, offensive simulation, Unit 42 consulting, and a partner alliance. The company highlights large‑scale vulnerability discovery, chaining of lower‑severity issues into high‑impact exploit paths, attack timelines compressed to as little as 25 minutes before exfiltration, and an expanding unsupervised attack surface as agents proliferate.
The Department of Commerce’s CAISI (under NIST) signed agreements with Google DeepMind, Microsoft, and xAI to pre‑test frontier AI models before release, expanding a program that already includes Anthropic and OpenAI. The effort aims to strengthen security‑by‑design through early access, continuous evaluation, and cross‑sector collaboration, amid reports of a forthcoming executive order to establish a model vetting system (CAISI accords).