
All news with #kubernetes tag

35 articles

Azure enables seamless cross-cluster networking for AKS

🚀 Microsoft announces the public preview of cross-cluster networking for Azure Kubernetes Fleet Manager, bringing transparent east‑west multi-cluster connectivity powered by Advanced Container Networking Services. Built on Cilium and KubeFleet, this managed capability extends the Kubernetes networking model across clusters to enable direct pod-to-pod communication, policy enforcement, and observability while preserving cluster isolation. The managed approach reduces operational overhead for multi-cluster fleets and supports resilient, global, and shared‑services architectures.

PCPJack Campaign Removes TeamPCP Artifacts from Cloud

🔒 Security researchers uncovered PCPJack, a credential‑theft framework that targets exposed cloud infrastructure and removes artifacts tied to TeamPCP. SentinelOne reports PCPJack worms through services to harvest credentials from Docker, Kubernetes, Redis, MongoDB, RayML and vulnerable web apps. Unlike many cloud campaigns it omits crypto‑mining and actively removes TeamPCP miner code, indicating monetization through credential theft, resale, fraud or extortion.

GKE Active Buffer reduces Kubernetes scale-out latency

⚡ Active Buffer is a GKE preview that implements the Kubernetes CapacityBuffer API to remove scale-out latency by keeping spare node capacity warm. It replaces manual 'balloon' pod hacks and costly over-provisioning with a declarative resource the Cluster Autoscaler treats as pending demand, so critical pods can land instantly. Buffers can be sized by fixed replicas, percentage of deployments, or resource limits.
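The three sizing modes can be illustrated with a toy calculation. This is a sketch only: the function and parameter names are hypothetical, the rounding rules are assumptions, and it does not reproduce the CapacityBuffer API schema or the Cluster Autoscaler's actual logic.

```python
import math

def buffer_slots(fixed=None, percentage=None, workload_replicas=0,
                 cpu_limit_millicores=None, pod_cpu_millicores=None):
    """Return how many pod-sized slots a buffer would keep warm (toy model)."""
    if fixed is not None:
        # Mode 1: a fixed number of replicas.
        return fixed
    if percentage is not None:
        # Mode 2: a percentage of an existing workload's replicas.
        # Rounding up is an assumption made for this sketch.
        return math.ceil(workload_replicas * percentage / 100)
    if cpu_limit_millicores is not None and pod_cpu_millicores:
        # Mode 3: as many pod-sized slots as fit under a resource limit.
        return cpu_limit_millicores // pod_cpu_millicores
    raise ValueError("no sizing mode given")

print(buffer_slots(fixed=3))
print(buffer_slots(percentage=10, workload_replicas=25))
print(buffer_slots(cpu_limit_millicores=4000, pod_cpu_millicores=500))
```

The point of the declarative model is that the autoscaler treats these slots as pending demand, so the node capacity exists before the real pods arrive.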

Cloud SQL Powers Manhattan Associates' AI Supply Chain

🚀 Manhattan Associates modernized its Manhattan Active SaaS platform by migrating from legacy Oracle and DB2 to Google Cloud databases. Cloud SQL and BigQuery now power core transactions and real-time analytics, enabling over a billion API calls per day with average responses under 150 ms. Containerized microservices on GKE, Pub/Sub streaming, and managed observability deliver automated failover, cross-region recovery, and faster feature delivery. The shift reduced manual scaling and licensing overhead while boosting operational agility and resilience.

Red Hat OpenShift on Google Cloud: Migration Updates

🔔 Google Cloud announced integrations and product updates to simplify running Red Hat OpenShift on its platform, including Google Cloud Cluster Services for OpenShift, a guided console cluster-creation experience, and the general availability of OpenShift Virtualization on OpenShift Dedicated. The updates emphasize cost optimization via custom machine types, Hyperdisk, and Axion processors, joint engineering with Red Hat, and configuration validation through Workload Manager to help migrate and modernize clusters. Supported integrations and middleware plugins aim to preserve OpenShift-native architecture while enabling selective adoption of managed Google services.

One-line Kubernetes fix reclaimed 600 hours for Atlantis

🔧 Cloudflare engineers traced repeated 30-minute Atlantis restarts to Kubernetes recursively changing file ownership on a large PersistentVolume. The default pod securityContext behavior (fsGroup combined with fsGroupChangePolicy: Always) caused kubelet to run an expensive recursive chgrp across millions of files, creating a bottleneck at volume mount time. After confirming that file group ownership would remain stable, the team set fsGroupChangePolicy: OnRootMismatch, and restarts dropped to ~30 seconds. That single-line change recovered roughly 50 engineering hours per month (about 600 hours per year).
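For readers who want to see where the setting lives, here is a minimal sketch of a pod manifest built as a Python dict and printed as JSON (kubectl accepts JSON manifests). The pod name, image, claim name, mount path, and fsGroup value are placeholders invented for illustration; only the fsGroupChangePolicy field and its two values come from the summary above.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "atlantis-example"},  # hypothetical name
    "spec": {
        "securityContext": {
            "fsGroup": 1000,  # placeholder group ID
            # With "Always" (the default), ownership is changed recursively
            # on every mount. "OnRootMismatch" skips the recursive pass when
            # the volume root already has the expected ownership.
            "fsGroupChangePolicy": "OnRootMismatch",
        },
        "containers": [{
            "name": "atlantis",
            "image": "example.invalid/atlantis:placeholder",  # placeholder
            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
        }],
        "volumes": [{
            "name": "data",
            "persistentVolumeClaim": {"claimName": "atlantis-data"},  # placeholder
        }],
    },
}

print(json.dumps(pod, indent=2))

# The savings figure is simple arithmetic: 50 hours per month over 12 months.
hours_per_year = 50 * 12
print(hours_per_year)
```

As the summary notes, the change is only safe when file group ownership inside the volume is known to stay stable.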

DRA: Dynamic Resource Allocation for Kubernetes Devices

⚡ DRA (Dynamic Resource Allocation) modernizes Kubernetes device management by replacing static Device Plugins with a request-based model built on ResourceSlice and ResourceClaim. It enables granular, attribute-based requests such as minimum VRAM, specific hardware models, or PCIe locality, and abstracts hardware via DeviceClass so the scheduler can match workloads to suitable devices. NVIDIA contributed a GPU driver and Google donated a TPU driver, and DRA is generally available in GKE. This reduces manual node pinning and improves utilization for LLM and AI workloads.
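The request-based idea can be sketched as a toy matcher. This is not the Kubernetes scheduler or the DRA API: real drivers publish devices in ResourceSlice objects and claims carry their own selector syntax. The class names, attribute keys, and device inventory below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Device:
    name: str
    node: str
    attributes: dict = field(default_factory=dict)

@dataclass
class Claim:
    device_class: str
    min_vram_gib: int = 0
    model: Optional[str] = None

def matches(device, claim):
    """True if the device satisfies every constraint in the claim."""
    attrs = device.attributes
    if attrs.get("class") != claim.device_class:
        return False
    if attrs.get("vram_gib", 0) < claim.min_vram_gib:
        return False
    if claim.model is not None and attrs.get("model") != claim.model:
        return False
    return True

def allocate(devices, claim):
    """Return the first matching device, or None if nothing fits."""
    for device in devices:
        if matches(device, claim):
            return device
    return None

inventory = [
    Device("gpu-0", "node-a", {"class": "gpu", "vram_gib": 16, "model": "small"}),
    Device("gpu-1", "node-b", {"class": "gpu", "vram_gib": 80, "model": "large"}),
]

chosen = allocate(inventory, Claim(device_class="gpu", min_vram_gib=40))
print(chosen.name if chosen else None)
```

The workload states what it needs (at least 40 GiB here) instead of naming a node, which is what removes the manual pinning mentioned above.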

Kubernetes as AI Infrastructure: llm-d Joins CNCF Sandbox

🚀 Google Cloud and partners announced that llm-d has been accepted into the CNCF Sandbox to promote open, accelerator-agnostic standards for distributed LLM inference. As a founding contributor alongside Red Hat, IBM Research, CoreWeave, and NVIDIA, Google emphasizes running any model on any accelerator in any cloud without vendor lock-in. GKE Inference Gateway now integrates the llm-d Endpoint Picker (EPP) to enable model-aware routing that optimizes for KV-cache hits, inflight requests, and queue depth, yielding concrete production gains in Vertex AI tests. Complementary work on the Kubernetes LeaderWorkerSet (LWS) API and vLLM extensions for Cloud TPUs targets scalable multi-node orchestration and up to 5x throughput improvements.

GKE and OSS Innovation Highlights at KubeCon EU 2026

🚀 Google Cloud previews GKE and open-source innovations at KubeCon Europe 2026, focusing on making Kubernetes the best platform for AI and agentic workloads. Autopilot compute classes can now be enabled per workload on Standard clusters, and GKE Cluster Autoscaler will be open-sourced to advance vendor-neutral provisioning. GKE is certified for the CNCF Kubernetes AI Conformance program, and projects like llm-d, DRA drivers for TPUs, and DRANET aim to standardize inference and resource management. Features such as the Model Context Protocol, Kubernetes Agent Sandbox, and GKE Pod Snapshots target secure, fast startup and manageability for agents.

TeamPCP Deploys Iran-Targeted Wiper via Kubernetes

🧨 The TeamPCP group is deploying a geopolitically targeted wiper that seeks out Iranian systems and either destroys host data or implants a persistent backdoor on Kubernetes nodes. Aikido researchers link the campaign to the earlier CanisterWorm and Trivy supply-chain incidents, noting identical C2 infrastructure and the same /tmp/pglog drop path. When Iran indicators (timezone/locale) and Kubernetes are detected, the malware creates a privileged DaemonSet named host-provisioner-iran that mounts the host root and runs Alpine containers called "kamikaze" to delete top-level directories and force a reboot. If Kubernetes is present but the host is not identified as Iranian, it deploys host-provisioner-std to write a Python backdoor and install it as a systemd service; variants also propagate via SSH or unauthenticated Docker APIs.
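On the defensive side, the traits described above (a privileged DaemonSet that mounts the host root) can be checked mechanically. The following is a minimal sketch that inspects a DaemonSet manifest already loaded as a Python dict. The function name and finding strings are hypothetical, and it is a starting point for review, not a complete detection rule.

```python
# Names reported in the summary above. Matching on names alone is weak,
# so the structural checks below matter more.
REPORTED_NAMES = {"host-provisioner-iran", "host-provisioner-std"}

def daemonset_findings(manifest):
    """Return a list of reasons a DaemonSet manifest (a dict) looks risky."""
    if manifest.get("kind") != "DaemonSet":
        return []
    findings = []
    name = (manifest.get("metadata") or {}).get("name", "")
    if name.lower() in REPORTED_NAMES:
        findings.append("name matches a reported indicator")
    template = (manifest.get("spec") or {}).get("template") or {}
    pod_spec = template.get("spec") or {}
    for container in pod_spec.get("containers") or []:
        if (container.get("securityContext") or {}).get("privileged") is True:
            findings.append("privileged container: " + container.get("name", "?"))
    for volume in pod_spec.get("volumes") or []:
        if (volume.get("hostPath") or {}).get("path") == "/":
            findings.append("hostPath volume mounts the host root")
    return findings
```

Legitimate node agents can also be privileged, so any finding should be checked against an allowlist of known DaemonSets before it is treated as an incident.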

Amazon EKS Adds 99.99% SLA and 8XL Control Plane Tier

🔒 Amazon EKS now offers a 99.99% Service Level Agreement for clusters running on the Provisioned Control Plane, up from the 99.95% SLA on the standard control plane. The upgraded SLA is measured in 1-minute intervals to deliver a more granular availability commitment for mission-critical workloads. At the same time, EKS introduces an 8XL scaling tier that doubles Kubernetes API server request processing capacity compared with the 4XL tier. Both the new SLA and the 8XL tier are available today in all regions where the Provisioned Control Plane is offered.
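To put the two percentages in concrete terms, the sketch below converts an availability figure into allowed downtime. It assumes a 30-day month purely for illustration; the actual SLA defines its own measurement period and exclusions, which are not reproduced here.

```python
from fractions import Fraction

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed period)

def allowed_downtime_minutes(sla_percent):
    """Downtime budget in minutes for an availability percentage given as a string."""
    unavailable_share = 1 - Fraction(sla_percent) / 100
    return unavailable_share * MINUTES_PER_30_DAY_MONTH

standard = allowed_downtime_minutes("99.95")
provisioned = allowed_downtime_minutes("99.99")
print(float(standard), float(provisioned))
```

Under that assumption the budget shrinks from about 21.6 minutes to about 4.3 minutes per month, a fivefold reduction.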

AWS Neuron DRA Driver Adds Hardware-Aware Scheduling

🔧 AWS announced the Neuron Dynamic Resource Allocation (DRA) driver for Amazon EKS, enabling Kubernetes-native, hardware-aware scheduling on Trainium-based instances. The driver publishes detailed device attributes — including hardware topology and Neuron-EFA PCIe co-location — directly to the Kubernetes scheduler, removing the need for custom scheduler extensions. Infrastructure teams can publish reusable ResourceClaimTemplates, while ML engineers reference them to deploy workloads without manual hardware tuning.

Kubernetes security: strengthening cluster defenses

🔒 New Kubernetes clusters are probed and often attacked within minutes, with honeypots run by Palo Alto Networks, Wiz and Aqua Security showing initial compromise attempts in roughly twenty minutes and repeated automated scans against container ports. The platform's permissive defaults and complex model make standard cloud controls insufficient. Organizations should adopt Kubernetes-specific controls: harden and automate RBAC, isolate workloads with network and namespace policies, store secrets in dedicated key management services, perform regular audits, and train developers on platform-specific threats and secure CI/CD practices.
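One common starting point for workload isolation is a default-deny NetworkPolicy per namespace. This is a minimal sketch that builds the manifest as a Python dict and prints it as JSON. The policy name and namespace are placeholders, and enforcement depends on the cluster running a network plugin that implements NetworkPolicy.

```python
import json

def default_deny_policy(namespace):
    """Build a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches all pods in the namespace.
            "podSelector": {},
            # Listing both types with no allow rules denies ingress and egress.
            "policyTypes": ["Ingress", "Egress"],
        },
    }

print(json.dumps(default_deny_policy("payments"), indent=2))
```

Teams typically layer narrower allow rules on top of this baseline, for example DNS egress and the specific service-to-service paths an application needs.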

GKE Inference Gateway Cuts Latency for Vertex AI Performance

🚀 The Vertex AI team deployed the GKE Inference Gateway, built on the Kubernetes Gateway API, to reduce inference latency and improve cache efficiency without a custom scheduler. The gateway applies load-aware routing—scraping Prometheus metrics like KV cache utilization and queue depth—and content-aware routing that inspects request prefixes to send traffic to pods with warm context. In production this cut Time to First Token by ~35% for Qwen3-Coder, improved P95 by ~52% for a bursty chat model, and doubled prefix-cache hit rates from 35% to 70%.
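The combination of load-aware and content-aware routing can be sketched as a scoring function over candidate pods. This is a toy illustration, not the GKE Inference Gateway's algorithm: the class, the weights, and the character-level prefix comparison are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    kv_cache_utilization: float  # 0.0 (empty) to 1.0 (full)
    queue_depth: int             # requests waiting on this pod
    cached_prefixes: tuple = ()  # prompts this pod has recently served

def shared_prefix_len(a, b):
    """Number of leading characters two strings have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def score(endpoint, prompt, w_prefix=1.0, w_kv=1.0, w_queue=0.1):
    """Higher is better: reward warm context, penalize load."""
    best = max((shared_prefix_len(prompt, p) for p in endpoint.cached_prefixes),
               default=0)
    prefix_ratio = best / len(prompt) if prompt else 0.0
    return (w_prefix * prefix_ratio
            - w_kv * endpoint.kv_cache_utilization
            - w_queue * endpoint.queue_depth)

def pick(endpoints, prompt):
    """Choose the endpoint with the highest score for this prompt."""
    return max(endpoints, key=lambda e: score(e, prompt))

system_prompt = "You are a coding assistant."
pods = [
    Endpoint("pod-a", 0.2, 1, (system_prompt,)),
    Endpoint("pod-b", 0.2, 1),
]
print(pick(pods, system_prompt + " Fix this bug.").name)
```

With equal load, the pod holding the warm prefix wins. Once that pod becomes saturated, the load penalties outweigh the prefix bonus and traffic shifts to the idle pod.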

Amazon ECS Managed Instances in European Sovereign Cloud

🔒 Amazon ECS Managed Instances is now available in the AWS European Sovereign Cloud, enabling customers to run EC2-backed container workloads under regional sovereignty controls. As a fully managed compute option, Managed Instances dynamically scales EC2 instances, optimizes task placement, and performs security patching every 14 days while supporting GPU and network-optimized instance families. Enable via Console, the Amazon ECS MCP Server, or infrastructure-as-code; management fees apply in addition to standard EC2 costs.

Amutable Aims to Bring Verifiable Integrity to Linux

🔒 Amutable, a Berlin startup launched this week, says it will bring determinism and verifiable integrity to Linux systems. Its founding team includes prominent Linux engineers such as Lennart Poettering (known for systemd) and ex‑Microsoft executives Chris Kühl (CEO) and Christian Brauner (CTO). The company is focusing on the container stack — Kubernetes, runc, LXC, Incus and containerd — and proposes cryptographic verification of images, signed manifests and continuous checks to detect tampering proactively rather than reactively.

Amazon EKS and EKS Distro Add Kubernetes 1.35 Support

🚀 Amazon EKS and EKS Distro now support Kubernetes 1.35, enabling creation of new clusters and upgrades of existing clusters via the EKS console, eksctl, or infrastructure-as-code tools. Kubernetes 1.35 introduces In-Place Pod Resource Updates to adjust CPU and memory without restarting pods, PreferSameNode traffic distribution to favor local endpoints, Node Topology Labels via the Downward API for region/zone awareness, and Image Volumes for delivering data artifacts such as AI models. EKS 1.35 is available in all AWS Regions where EKS is offered, including AWS GovCloud (US), and EKS Distro builds are published to the ECR Public Gallery and GitHub. Refer to the EKS documentation for available versions, upgrade guidance, and lifecycle policies, and use EKS Cluster Insights to surface issues that could affect upgrades.
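As an illustration of the traffic-distribution feature, the sketch below builds a Service manifest with the trafficDistribution field set to PreferSameNode and prints it as JSON. The service name, label, and port are placeholders, and whether the preference is honored depends on the cluster version and its proxy implementation.

```python
import json

def service_manifest(name, app_label, port, traffic_distribution="PreferSameNode"):
    """Build a Service that asks for traffic to stay on the client's node when possible."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port}],
            # A preference, not a guarantee: traffic can still go to other
            # nodes when no local endpoint is available.
            "trafficDistribution": traffic_distribution,
        },
    }

print(json.dumps(service_manifest("node-local-cache", "cache", 6379), indent=2))
```

This setting suits per-node helpers such as caches or agents, where a same-node hop avoids network latency.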

GKE Turns 10 Hackathon: Winners and Technical Highlights

🚀 The GKE Turns 10 Hackathon showcased developer teams building agentic AI on GKE integrated with Google models such as Gemini. More than 4,700 participants from 133 countries produced 133 projects demonstrating multi-agent pipelines, model orchestration, and microservice integration. Grand prize winner Amie Wei’s Cart-to-Kitchen assistant uses GKE Autopilot, the Agent Development Kit (ADK), and Agent-to-Agent protocols to analyze grocery carts and recommend recipes. Google also announced GEAR, an educational sprint launching in early 2026 to help developers learn, build, and deploy AI agents.

Agent Sandbox: Kubernetes Enhancements for AI Agents

🛡️ Agent Sandbox is a new Kubernetes primitive designed to run AI agents with strong, kernel-level isolation. Built on gVisor with optional Kata Containers and developed in the Kubernetes community as a CNCF project, it reduces risks from agent-executed code. On GKE, managed gVisor, container-optimized compute and pre-warmed sandbox pools deliver sub-second startup latency and up to 90% cold-start improvement. A Python SDK and a simple API abstract YAML so AI engineers can manage sandbox lifecycles without deep infrastructure expertise; Agent Sandbox is open source and deployable on GKE today.

Full-Stack Approach to Scaling RL for LLMs on GKE

🚀 Google Cloud describes a full-stack solution for running high-scale Reinforcement Learning (RL) with LLMs, combining custom TPU hardware, NVIDIA GPUs, and optimized software libraries. The approach addresses RL's hybrid demands (reducing sampler latency, easing memory contention across actor/critic/reward models, and accelerating weight copying) by co-designing hardware, storage (Managed Lustre, Cloud Storage), and orchestration on GKE. The blog emphasizes open-source contributions (vLLM, llm-d, MaxText, Tunix) and integrations with Ray and NeMo RL recipes to improve portability and developer productivity. It also highlights mega-scale orchestration and multi-cluster strategies for running production RL jobs across tens of thousands of nodes.