Cloud Controls, Patch Actions, and Threat Activity

Cloud providers emphasized tighter control planes, private connectivity, and operational visibility, while security teams faced urgent patching and hardening guidance amid active campaigns. Multiple AWS releases expanded governance, observability, and performance options, and Google detailed efficiency gains for LLM serving on Kubernetes. Concurrently, advisories for NGINX and Fortinet environments, along with new research on ransomware tooling and regional campaigns, underscored the need for rapid remediation and layered defenses.

Governance and Private Connectivity in the Cloud

Amazon EKS introduced customer-routed control plane egress, enabling outbound Kubernetes API server traffic (such as admission webhook callbacks, OIDC provider lookups, and aggregated API calls) to traverse a customer’s VPC and follow customer-controlled routing and security groups. Administrators can set controlPlaneEgressMode to CUSTOMER_ROUTED when creating or updating clusters and enforce this using the eks:controlPlaneEgressMode IAM condition key with AWS Organizations SCPs. The capability is available at no additional cost across all AWS Regions that offer EKS, giving teams stronger visibility and closer alignment with data perimeter and compliance requirements.
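As a rough illustration of the enforcement pattern described for EKS, the following Python sketch builds a service control policy document that denies cluster creation or configuration updates unless the requested egress mode is CUSTOMER_ROUTED. The condition key and value come from the announcement as summarized here; the action list, the statement ID, and the helper name build_egress_scp are assumptions for illustration and should be checked against the EKS documentation before use.

```python
import json


def build_egress_scp(required_mode: str = "CUSTOMER_ROUTED") -> dict:
    """Build an SCP document that denies EKS changes using another egress mode.

    Hypothetical helper: the function name, Sid, and action list are
    illustrative assumptions, not taken from AWS documentation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RequireCustomerRoutedControlPlaneEgress",
                "Effect": "Deny",
                # Assumption: these are the EKS actions that carry the setting.
                "Action": ["eks:CreateCluster", "eks:UpdateClusterConfig"],
                "Resource": "*",
                "Condition": {
                    # Condition key and value as named in the announcement.
                    "StringNotEquals": {
                        "eks:controlPlaneEgressMode": required_mode
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_egress_scp(), indent=2))
```

One design point to verify: negated IAM operators such as StringNotEquals also match when the key is absent from the request, so a statement shaped like this would deny requests that omit the setting entirely. Whether that is desirable for update calls unrelated to egress depends on how EKS populates the key.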

Amazon MQ for RabbitMQ added native private networking connectivity so brokers can reach resources inside a customer VPC without public exposure. Implemented with Amazon VPC Lattice, AWS Resource Access Manager, and AWS PrivateLink, the feature replaces prior Network Load Balancer and NAT Gateway workarounds, reducing operational overhead and improving security posture. It supports common patterns such as authenticating to private identity providers, federating with other brokers, and communicating with self-hosted RabbitMQ, and is available in all Regions where VPC Lattice is supported.

AWS SOC reports are now published in machine-readable OSCAL format alongside PDFs for the Spring 2026 SOC 1 and SOC 2 package, which covers 188 services for the period from April 1, 2025 through March 31, 2026. Accessible via AWS Artifact, the OSCAL package is intended to support automated security and compliance workflows, reducing manual processing and accelerating integration. AWS states it is the first major cloud provider to publish key compliance reports in NIST’s Open Security Controls Assessment Language and encourages customers to share use cases and feedback.

Observability, Optimization, and Resilience for Workloads

SageMaker AI introduced detailed observability for inference endpoints, consolidating token-level metrics—Time to First Token, inter-token latency, queue depth, tokens per second—with infrastructure indicators like GPU utilization, component copy counts, autoscaling events, and cold start breakdowns. Signals surface in a pre-built SageMaker AI Insights dashboard in Amazon CloudWatch and are published as OpenTelemetry-native metrics, enabling correlation of latency spikes with resource conditions and faster diagnosis. A regional PromQL endpoint and Grafana template support external platforms, with availability across multiple Regions in the Americas, Europe, and Asia Pacific.
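For readers less familiar with the token-level metrics named above, the following Python sketch shows how Time to First Token, inter-token latency, and tokens per second are commonly derived from per-token arrival timestamps. These are generic definitions under stated assumptions; the source does not describe how SageMaker AI computes its published metrics, and the names TokenTimings and summarize_stream are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TokenTimings:
    ttft_s: float                      # Time to First Token, in seconds
    mean_inter_token_latency_s: float  # average gap between consecutive tokens
    tokens_per_second: float           # tokens divided by total request time


def summarize_stream(request_start_s: float,
                     token_times_s: Sequence[float]) -> TokenTimings:
    """Derive token-level latency metrics from arrival timestamps.

    Assumes timestamps are in seconds, on the same clock as the request
    start, and sorted in arrival order.
    """
    if not token_times_s:
        raise ValueError("need at least one token timestamp")
    ttft = token_times_s[0] - request_start_s
    # Gaps between each token and the one before it.
    gaps = [later - earlier
            for earlier, later in zip(token_times_s, token_times_s[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    total = token_times_s[-1] - request_start_s
    tps = len(token_times_s) / total if total > 0 else 0.0
    return TokenTimings(ttft, mean_gap, tps)
```

Separating the first-token delay from the steady-state gap matters for diagnosis: a high first-token time with normal gaps points toward queueing or cold starts, while growing gaps suggest contention during generation.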

Compute Optimizer expanded its EBS volume rightsizing recommendations by incorporating two no-cost CloudWatch metrics—VolumeIOPSExceededCheck and VolumeThroughputExceededCheck—capturing per-minute attempts to exceed a volume’s provisioned IOPS or throughput. Applied to EBS volumes on Nitro-based EC2 instances (excluding standard and Multi-Attach volumes), these signals help identify bursty high-I/O workloads and balance cost with required performance. The enhancement is available in all Regions where Compute Optimizer operates, except AWS GovCloud (US) and China Regions, and is accessible in the Compute Optimizer console.
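To make the idea of per-minute exceeded checks concrete, here is a minimal Python sketch that summarizes a series of such datapoints as the share of minutes in which a volume tried to exceed its provisioned limit. It assumes each datapoint is 0 or 1 for one minute; the 5% threshold and the function names are hypothetical, and this is not Compute Optimizer's actual recommendation logic, which the source does not describe.

```python
from typing import Sequence


def exceeded_ratio(per_minute_flags: Sequence[int]) -> float:
    """Fraction of minutes in which the volume tried to exceed its limit.

    Assumes one datapoint per minute, where any value above 0 means the
    provisioned IOPS or throughput was exceeded during that minute.
    """
    if not per_minute_flags:
        return 0.0
    flagged = sum(1 for flag in per_minute_flags if flag > 0)
    return flagged / len(per_minute_flags)


def looks_bursty(iops_flags: Sequence[int],
                 throughput_flags: Sequence[int],
                 threshold: float = 0.05) -> bool:
    """Flag a volume when either metric is exceeded often enough.

    The 5% default threshold is an illustrative assumption.
    """
    return (exceeded_ratio(iops_flags) >= threshold
            or exceeded_ratio(throughput_flags) >= threshold)
```

A volume that trips the check in only a handful of minutes may be fine at its current size, while one that trips it regularly is a candidate for more provisioned IOPS or throughput.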

Amazon MSK Express brokers now support Intelligent Rebalancing on existing clusters, enabled by default and at no additional charge in Regions where Express is offered. The service continuously monitors resource usage and imbalance using MSK heuristics, automatically reassigns partitions, and scales brokers to maximize utilization and performance. AWS reports up to 180x faster operations compared to rebalancing on Standard brokers. The feature is designed to maintain availability so that producers and consumers can keep running during adjustments, which reduces manual partition management and reliance on third-party tools.

Compute, Databases, and Multimodal AI at Scale

EC2 G7 instances reached general availability, offering capacity optimized for AI inference and graphics workloads. Accelerated by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, G7 delivers up to 4.6x improved AI inference and up to 2.1x improved graphics performance over G6, supports up to eight 32 GB GPUs per instance, integrates custom Intel Xeon 6 processors, and provides up to 700 Gbps via EFA. Available in US East (Ohio) and US West (Oregon) as On-Demand, Savings Plans, or Spot, the instances target use cases including translation, vision analysis, speech recognition, recommender systems, rendering, and large-scale data processing.

Amazon RDS for SQL Server increased General Purpose (gp3) volume limits to 64 TiB per volume, up to 80,000 provisioned IOPS, and up to 2,000 MiB/s throughput, with up to three gp3 or io2 volumes per DB instance and total instance storage up to 256 TiB. The changes allow consolidation of demanding SQL Server workloads and simplify storage management while aligning with mission-critical performance needs. Pricing remains based on storage and any provisioned IOPS and throughput beyond baseline.

Ministral-3-14B from Mistral AI is now available in Amazon SageMaker JumpStart, bringing a 14B-parameter multimodal model designed for efficient edge deployment with frontier-class capabilities. It supports visual analysis, structured JSON outputs, and native function calls for agentic behaviors, with multilingual coverage across dozens of languages. Customers can deploy directly from SageMaker Studio or via the SageMaker Python SDK, streamlining experimentation and production for assistants, autonomous agents, and vision-enabled applications.

GKE Ray Serve performance improvements—co-developed by Google Cloud and Anyscale—deliver up to 5x throughput and up to 8x lower latency for LLM serving without losing developer ergonomics. Enhancements include built-in HAProxy for internal routing, a direct token streaming architecture that bypasses ingress routers, and a v2 Ray executor backend for vLLM that moves Ray out of the data plane for asynchronous scheduling aligned with vLLM optimizations. Benchmarks on A4 VMs with NVIDIA HGX B200 and a compact Gemma 4 E2B model show scaling behavior approaching native vLLM, while the GKE Ray Operator add-on eases deployment and operations across accelerators.

Patch Imperatives and Active Threats

F5 NGINX products received patches for two critical RCE vulnerabilities disclosed in June 2026. CVE-2026-42530 (use-after-free in ngx_http_v3_module) can be triggered by crafted HTTP/3 sessions to reopen a QPACK encoder stream, and CVE-2026-42055 (heap-based overflow in ngx_http_proxy_v2_module and ngx_http_grpc_module) can be triggered when proxying HTTP/2 with certain directive combinations and oversized large_client_header_buffers. F5 lists fixed and impacted versions across NGINX Open Source, NGINX Plus, and related offerings, and recommends disabling HTTP/3 for the first flaw and removing ignore_invalid_headers off or reducing large_client_header_buffers below 2 MB for the second. No active exploitation of these specific flaws was reported.
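The mitigations above can be turned into a quick configuration check. The Python sketch below scans NGINX configuration text for the three conditions mentioned: ignore_invalid_headers off, a large_client_header_buffers size of 2 MB or more, and HTTP/3 being enabled. It is a simplified illustration with several assumptions: it treats 2 MB as 2 * 1024 * 1024 bytes, does not follow include directives, strips comments naively, and uses the hypothetical function name find_risky_settings. It does not determine whether a given build is actually vulnerable; F5's advisory remains the authority on affected versions.

```python
import re

TWO_MB = 2 * 1024 ** 2  # assumption: "2 MB" read as 2 MiB, matching NGINX's "m" unit
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2}


def find_risky_settings(config_text: str) -> list[str]:
    """Return findings for settings tied to the mitigations in the advisory."""
    # Naive comment stripping; a '#' inside a quoted string would be mishandled.
    text = re.sub(r"#[^\n]*", "", config_text)
    findings: list[str] = []

    if re.search(r"\bignore_invalid_headers\s+off\s*;", text):
        findings.append("ignore_invalid_headers off")

    # Directive form: large_client_header_buffers <number> <size>;
    pattern = r"\blarge_client_header_buffers\s+\d+\s+(\d+)([kKmM]?)\s*;"
    for match in re.finditer(pattern, text):
        size_bytes = int(match.group(1)) * _UNITS[match.group(2).lower()]
        if size_bytes >= TWO_MB:
            size_text = match.group(1) + match.group(2)
            findings.append(f"large_client_header_buffers size {size_text}")

    # HTTP/3 is enabled via the 'quic' listen parameter or 'http3 on;'.
    if (re.search(r"\bhttp3\s+on\s*;", text)
            or re.search(r"\blisten\b[^;]*\bquic\b", text)):
        findings.append("HTTP/3 enabled")

    return findings
```

Running it against the output of nginx -T, which prints the fully expanded configuration, would work around the missing include handling.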

A CISA alert urged organizations to harden Fortinet devices after global reports of credential exposure involving roughly 74,000 Fortinet systems, including FortiGate firewalls and SSL VPN gateways, in activity referred to as FortiBleed. Recommended steps include terminating active SSL VPN and administrative sessions, resetting VPN and admin passwords with strong policies, enforcing PBKDF2 storage and removing weaker legacy hashes, reviewing logs for lateral movement and suspicious changes, enforcing phishing-resistant MFA for remote access and admin accounts, restricting public administration access, and disabling unnecessary accounts.

CloudSEK’s analysis of Operation Escaneo detailed a campaign targeting critical infrastructure in Mexico, with activity in Ecuador and Portugal, focusing on government, tax authorities, utilities, transport, telecoms, and banks. Attackers tuned exploits for Fortinet FortiOS SSL-VPN and Ivanti Connect Secure, alongside GhostCat, EternalBlue, Zerologon, and Log4Shell, and used a reconnaissance engine (Kimera) to scan and triage targets. Persistence and exfiltration leveraged Neo-reGeorg, Chisel, and a GRE tunnel on a compromised Cisco router; logs showed 3,708 Chisel sessions over 13 days. Inside networks, intruders accessed SAP and Oracle systems, extracting personal records, AD maps, live SSL key streams, and service-account hashes.

ESET research examined the Gentlemen ransomware operation’s EDR-disabling framework, GentleKiller, which exists in at least eight variants abusing vulnerable or malicious drivers. The operators standardize defense evasion using commercial packers (Enigma/Themida), impersonated vendor metadata, and copied invalid signatures; they also integrate external tools such as HexKiller, ThrottleBlood, and HavocKiller, and tie an OxideHarvest credential stealer to an affiliate. Victimology is globally distributed, with opportunistic targeting including FortiGate misconfigurations, and staging markers and rapid BYOVD adoption reflect a centrally provided, agile EDR-killer capability.

An NCSC warning highlighted that about 75% of incidents affecting UK critical national infrastructure from June 2025 to May 2026 were attributable to or linked with hostile state actors such as Russia, China, and Iran, with roughly 200 incidents handled in that period. The NCSC framed defenses across far, mid, and near contested spaces and cautioned that adversaries are increasingly exploiting cloud services and open-source supply chains, forecasting that AI-enabled attacks against known legacy flaws are likely by 2028. The call to action emphasized continuous capability, performance, sensing, and response over static risk treatment.

A CSO report documented a six-wave campaign abusing Google Ads, GitLab Pages, and the claude.ai shared-chat feature to socially engineer developers into executing malicious commands. Operators created 92 malicious hostnames, impersonated AI and developer brands (e.g., ChatGPT Codex, Perplexity, JetBrains, Cursor IDE, Claude), and shifted from GitLab-hosted pages to weaponized persistent claude.ai share URLs, continuing to drive traffic via sponsored search. Targeting developers raised the payoff because engineering endpoints hold credentials, tokens, source code, and CI/CD access, while reputation-based defenses struggled against the “trust stacking” of legitimate domains and services.