CISO Brief

All news tagged #kubernetes security

49 articles

App-centric Maintenance Visibility in Unified Maintenance

🛠️ App-centric maintenance visibility in Unified Maintenance shifts focus from infrastructure to business services. By integrating with App Hub, Unified Maintenance aggregates maintenance schedules for registered resources—GKE clusters, GCE VMs, AlloyDB instances—into a single application-aware dashboard. This reduces manual mapping, speeds triage of performance issues against planned updates, and helps platform teams predict operational impacts across many projects.

Kaspersky Container Security: Practical Team Insights

🔒 Kaspersky Container Security (KCS) is presented as a comprehensive platform that reaches beyond registry image scanning to secure container workflows across development and production. The Product Security Team uses KCS in CI/CD pipelines, registry correlation, and cluster runtime monitoring to tie findings to specific artifacts, pipelines, and scan times. KCS computes risk ratings, supports SBOM processing, and produces reports in SARIF, CycloneDX, SPDX, and other standard formats to integrate with AppSec and internal tooling.

EKS Adds Karpenter Support for ARC Zonal Shift and Autoshift

🔁 Amazon EKS now supports Amazon Application Recovery Controller (ARC) zonal shift and zonal autoshift when using the open-source Karpenter for compute provisioning. ARC automates redirecting in-cluster network traffic away from an impaired AZ and can perform practice runs to validate cluster behavior. During a zonal shift, Karpenter stops provisioning in the impacted AZ, halts voluntary disruptions there, and avoids scheduling actions that depend on that zone. Enable support by setting ENABLE_ZONAL_SHIFT.

Platform Modernization and AI on Azure Red Hat OpenShift

🔷 At Red Hat Summit 2026, Microsoft and Red Hat highlighted how Azure Red Hat OpenShift supports modernization and production AI by delivering consistent governance, security, and scale. Microsoft was named Platform Modernization Partner of the Year, underscoring joint customer outcomes. Banco Bradesco and Topicus illustrate production AI and regulated lending workloads running on the jointly managed platform. Key advances include OpenShift Virtualization, confidential containers, managed identities, expanded NVIDIA GPU support, and broader regional availability.

EC2 Instance Store CSI Driver Now Available as EKS Add-on

💾 Amazon EKS now supports the EC2 Instance Store CSI driver as an EKS add-on, and you can install and manage it via the EKS console or AWS CLI. The driver exposes ephemeral NVMe-based instance store volumes as Kubernetes persistent volumes and manages their lifecycle on EC2 hosts. This feature simplifies attaching local instance storage to EKS clusters and is available in all commercial regions.

Cloud Engineers AI Toolkit: Hands-on Developer Workshops

🤖 Join hands-on developer workshops across North America that teach secure, scalable deployment of agentic AI for enterprises. These sessions are practical, bring-your-laptop labs where Platform, Security, and Data practitioners build end-to-end solutions, including GKE cluster hardening, secure sandboxing, and governed data pipelines. Tracks cover GKE + Data and Data Engineering & Analytics, with guidance from Google experts. Attendees leave with runnable labs and operational best practices to accelerate production adoption.

Amazon EKS Adds Dynamic Resource Allocation for EFA

🚀 Amazon EKS now supports Dynamic Resource Allocation (DRA) for Elastic Fabric Adapter (EFA), simplifying RDMA and high-performance inter-node communication for AI/ML and HPC workloads. The EFA DRA driver, based on the upstream DRANET project, enables topology-aware allocation and EFA interface sharing so network traffic uses the closest NIC to GPUs, Trainium, or Inferentia. It’s recommended for new EKS deployments on Kubernetes 1.34+ and is available in all AWS Regions; the existing EFA device plugin remains supported and is still recommended for use with Karpenter and Amazon EKS Auto Mode.

Amazon EKS adds one-click cluster access via CloudShell

☁️ Amazon Elastic Kubernetes Service (EKS) now offers one-click cluster access from the AWS Management Console via AWS CloudShell, eliminating the need to install or configure kubectl, AWS CLI, or kubeconfig files locally. From the EKS console, selecting Connect launches a CloudShell session with kubectl pre-configured for the chosen cluster so you can run commands immediately. The feature supports clusters with both public and private API server endpoints, and each session also includes the AWS CLI and standard CloudShell utilities for troubleshooting and management.

GKE Updates at Google Cloud Next ’26: Scale, Security, AI

🚀 At Google Cloud Next ’26, Google unveiled a suite of GKE enhancements focused on large-scale AI and agentic workloads. Highlights include the new GKE Agent Sandbox (gVisor-based isolation for fast, secure sandboxes), private GA of GKE hypercluster to manage millions of accelerators across regions, and inference upgrades like Predictive Latency Boost and KV cache tiering. Preview RL features and intent-based autoscaling on custom metrics further enhance utilization and reliability.

Amazon EKS Hybrid Nodes gateway simplifies hybrid networking

🔗 Amazon Elastic Kubernetes Service (EKS) introduces the Amazon EKS Hybrid Nodes gateway to automate networking between an EKS cluster VPC and Kubernetes Pods running on EKS Hybrid Nodes. The gateway removes the need to make on‑premises pod networks routable and avoids extensive coordination with network teams by automatically maintaining VPC route tables as workloads scale. Deployed to Amazon EC2 instances via Helm, the gateway also enables control-plane-to-webhook, pod-to-pod, and AWS service connectivity (ALB, NLB, Amazon Managed Service for Prometheus). The codebase is open source and the feature is available in all Regions where EKS Hybrid Nodes is supported, excluding China Regions. AWS offers the gateway itself at no additional charge; customers pay for underlying EC2 and data transfer costs.

GKE Cloud Storage FUSE Profiles for AI/ML Workload I/O

⚡ GKE’s Cloud Storage FUSE Profiles automate performance tuning for AI/ML workloads by providing pre-defined, dynamically managed StorageClasses optimized for training, serving, and checkpointing. Instead of manually adjusting many mount and CSI options, users select a profile and GKE scans the bucket and node resources to calculate cache sizes and backing media. The CSI driver mounts the volume with those calculated options and dynamically adjusts cache behavior using real-time signals to maximize throughput while protecting node stability.

Modern Kubernetes Threats and Identity-focused Attacks

🔒 Unit 42 details how widespread Kubernetes attacks—driven by identity theft and exposed services—enable escalation from containers into cloud backends. The report highlights stolen service account tokens and the rapid exploitation of React2Shell (CVE-2025-55182), showing how attackers extract mounted tokens and cloud credentials. Practical mitigations include strict RBAC, short-lived projected tokens, runtime telemetry, and API audit logging. Unit 42 maps these behaviors to MITRE ATT&CK and provides detection examples.
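
As a rough illustration of the "short-lived projected tokens" mitigation, the sketch below flags pods whose projected service account tokens live longer than a chosen limit. It assumes pod objects are available as plain dictionaries (for example, parsed from `kubectl get pods -o json`). The helper names and the one-hour threshold are hypothetical and do not come from the Unit 42 report.

```python
from typing import Any

# Hypothetical policy threshold; pick a value that fits your environment.
MAX_TOKEN_TTL_SECONDS = 3600


def token_ttls(pod: dict[str, Any]) -> list[int]:
    """Collect expirationSeconds for every projected service account token in a pod."""
    ttls: list[int] = []
    for volume in pod.get("spec", {}).get("volumes", []):
        projected = volume.get("projected") or {}
        for source in projected.get("sources", []):
            token = source.get("serviceAccountToken")
            if token is not None:
                # Kubernetes defaults expirationSeconds to 3600 when it is omitted.
                ttls.append(token.get("expirationSeconds", 3600))
    return ttls


def pods_exceeding_ttl(pods: list[dict[str, Any]],
                       max_ttl: int = MAX_TOKEN_TTL_SECONDS) -> list[str]:
    """Return names of pods with at least one token that outlives max_ttl."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if any(ttl > max_ttl for ttl in token_ttls(pod))
    ]
```

A check like this only covers token lifetime; it says nothing about RBAC scope or audit logging, which the report lists as separate controls.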

CloudWatch Container Insights adds OpenTelemetry for EKS

🔔 Amazon CloudWatch now offers Container Insights with OpenTelemetry metrics for Amazon EKS in public preview. The feature collects OTLP metrics from open source and AWS collectors, enriches each metric with up to 150 labels, and supplies curated dashboards and PromQL query support in CloudWatch Query Studio. Deployment is available via the CloudWatch Observability EKS add‑on, console, CloudFormation, CDK, or Terraform, and preview metrics are free.

Kubernetes Controllers as Stealthy Persistent Backdoors

🔒 Kubernetes clusters can be undermined by the very automation that makes them resilient. By registering or compromising a controller—most commonly via a MutatingWebhookConfiguration—an attacker can intercept pod-creation requests and inject a covert sidecar, turning the cluster’s control loop into a self-healing backdoor. These injections are often invisible to casual inspection, survive pod restarts and upgrades, and can be disguised under benign names. Teams should audit webhooks, monitor RoleBindings and OwnerReferences, and restrict webhook registration to reduce this risk.
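
One way to start the webhook audit suggested above is sketched here. It assumes the input is the `items` list from `kubectl get mutatingwebhookconfigurations -o json`, and it flags any webhook that can intercept pod creation and is served from outside an allow-list of namespaces. The allow-list and function names are hypothetical, and the rule matching is deliberately simplified (it ignores scopes, API groups, and subresource patterns).

```python
from typing import Any

# Hypothetical allow-list of namespaces trusted to host admission webhooks.
TRUSTED_NAMESPACES = {"kube-system"}


def intercepts_pod_creation(webhook: dict[str, Any]) -> bool:
    """True if any rule covers CREATE operations on pods (simplified matching)."""
    for rule in webhook.get("rules", []):
        operations = set(rule.get("operations", []))
        resources = set(rule.get("resources", []))
        if operations & {"CREATE", "*"} and resources & {"pods", "*"}:
            return True
    return False


def suspicious_webhooks(configs: list[dict[str, Any]],
                        trusted: set[str] = TRUSTED_NAMESPACES) -> list[str]:
    """Return 'config/webhook' names that mutate new pods from an untrusted backend."""
    findings: list[str] = []
    for config in configs:
        for webhook in config.get("webhooks", []):
            if not intercepts_pod_creation(webhook):
                continue
            service = webhook.get("clientConfig", {}).get("service")
            # A missing service means the webhook is called via an external URL.
            if service is None or service.get("namespace") not in trusted:
                findings.append(f'{config["metadata"]["name"]}/{webhook["name"]}')
    return findings
```

Anything this returns is a lead for manual review, not proof of compromise; many legitimate sidecar injectors would also be listed until their namespaces are added to the allow-list.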

One-line Kubernetes fix reclaimed 600 hours for Atlantis

🔧 Cloudflare engineers traced repeated 30-minute Atlantis restarts to Kubernetes recursively changing file ownership on a large PersistentVolume. The default pod securityContext behavior (fsGroup combined with fsGroupChangePolicy: Always) caused kubelet to run an expensive recursive chgrp across millions of files, creating a mounting bottleneck. By validating that file group ownership would remain stable and setting fsGroupChangePolicy: OnRootMismatch, restarts dropped to ~30 seconds. That single-line change recovered roughly 50 engineering hours per month (about 600 hours per year).
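
The change described above can be sketched as a small transformation on a pod spec held as a dictionary. This is an illustration only, not Cloudflare's actual manifest or tooling: the function name and the example `fsGroup` value are hypothetical, while `fsGroup` and `fsGroupChangePolicy` are the real pod `securityContext` fields. The last lines reproduce the summary's arithmetic for hours recovered.

```python
import copy
from typing import Any


def with_on_root_mismatch(pod_spec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the pod spec that skips the recursive ownership change
    when the volume root already has the expected group."""
    patched = copy.deepcopy(pod_spec)
    security = patched.setdefault("securityContext", {})
    # The policy only matters when fsGroup is set, so leave other pods alone.
    if "fsGroup" in security:
        security["fsGroupChangePolicy"] = "OnRootMismatch"
    return patched


# Hypothetical starting spec: fsGroup set, policy left at the default (Always).
original = {"securityContext": {"fsGroup": 1000}}
patched = with_on_root_mismatch(original)

# Figures from the summary: about 50 hours per month recovered.
HOURS_RECOVERED_PER_MONTH = 50
hours_per_year = HOURS_RECOVERED_PER_MONTH * 12
```

As the summary notes, this setting is only safe once you have confirmed that group ownership inside the volume stays stable, because files below the root are no longer re-checked on mount.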

DRA: Dynamic Resource Allocation for Kubernetes Devices

⚡ DRA (Dynamic Resource Allocation) modernizes Kubernetes device management by replacing static Device Plugins with a request-based model built on ResourceSlice and ResourceClaim. It enables granular, attribute-based requests such as minimum VRAM, specific hardware models, or PCIe locality, and abstracts hardware via DeviceClass so the scheduler can match workloads to suitable devices. NVIDIA contributed a GPU driver and Google donated a TPU driver, and DRA is generally available in GKE. This reduces manual node pinning and improves utilization for LLM and AI workloads.
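
To make the idea of attribute-based requests concrete, here is a toy in-memory model of matching a request against advertised devices. It does not use the Kubernetes API or the real ResourceSlice and ResourceClaim schemas; every class, field, and device name below is hypothetical, and the "smallest device that fits" rule is an assumption chosen for illustration rather than the scheduler's actual behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    """Stand-in for a device advertised by a driver (hypothetical fields)."""
    name: str
    model: str
    vram_gib: int


@dataclass(frozen=True)
class Request:
    """Stand-in for a workload's device requirements (hypothetical fields)."""
    min_vram_gib: int = 0
    model: Optional[str] = None


def matches(device: Device, request: Request) -> bool:
    """True if the device satisfies every attribute constraint in the request."""
    if device.vram_gib < request.min_vram_gib:
        return False
    return request.model is None or device.model == request.model


def allocate(devices: list[Device], request: Request) -> Optional[Device]:
    """Pick the smallest matching device so larger ones stay free (assumed policy)."""
    candidates = [d for d in devices if matches(d, request)]
    return min(candidates, key=lambda d: d.vram_gib, default=None)
```

The contrast with static device plugins is that the workload states what it needs (for example, at least 40 GiB of VRAM) instead of being pinned to a node known to carry a particular card.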

TeamPCP Backdoors LiteLLM Versions on PyPI via Trivy

⚠️ Security researchers report that TeamPCP published backdoored litellm packages (v1.82.7 and v1.82.8) to PyPI on March 24, 2026, likely leveraging a Trivy compromise in the project's CI/CD. The malicious wheels included a three-stage payload: a credential harvester, a Kubernetes lateral-movement toolkit, and a persistent systemd backdoor executed at import or interpreter startup. Vendors removed the tainted releases and urge immediate audits, isolation of affected hosts, credential rotation, and inspection of Kubernetes clusters for rogue pods and persistence.

UNC4899 Cloud Campaign Exploits AirDrop to Steal Crypto

🔒 Google links the North Korean actor UNC4899 to a 2025 cloud compromise that leveraged personal-to-corporate file transfers (AirDrop) and malicious code embedded in a shared archive. Attackers pivoted from a compromised developer device into Google Cloud, abused CI/CD and Kubernetes workflows, and manipulated Cloud SQL to extract funds. The campaign employed living-off-the-cloud techniques and persisted by injecting commands into deployment configurations. Recommended mitigations include phishing-resistant MFA, strict secrets management, and restricting P2P file sharing on corporate endpoints.

Cost-Effective AI: Ollama, GKE GPU Sharing, vCluster

💡 This post shows how to combine GKE Autopilot GPU time-sharing with vCluster to host isolated Ollama instances serving open models on shared GPU nodes. It outlines steps to provision Autopilot, create virtual clusters, deploy Ollama with GPU-sharing labels, and pull models for verification. The approach reduces GPU underutilization and simplifies multi-tenant operations. Teams keep isolated control planes while sharing hardware, lowering costs and operational overhead.

GKE Adds Native Custom Metrics for Horizontal Scaling

🚀 Google Cloud now provides native custom metrics for GKE Horizontal Pod Autoscaler (HPA), eliminating the need for external adapters, agents, and complex Workload Identity bindings. The agentless design sources pod metrics directly and exposes them via a new AutoscalingMetric controller, reducing latency, cost, and operational fragility. Users declare an AutoscalingMetric that points to a pod metric and reference it in an HPA, allowing HPAs to scale on custom workload signals just like CPU or memory. Google frames this as an initial step toward intent-based autoscaling for AI, gaming, batch, and other demanding workloads.
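
Whatever the metric source, an HPA turns a metric reading into a replica count with the standard Kubernetes formula: desired replicas = ceil(current replicas × current value / target value), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below models only that calculation. It does not show the AutoscalingMetric resource, whose schema is not given in the summary, and the replica bounds are placeholder values.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Apply the documented HPA scaling formula to a single metric."""
    ratio = current_value / target_value
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the HPA (placeholder defaults here).
    return max(min_replicas, min(max_replicas, desired))
```

For example, four replicas observing a custom metric at 150 against a target of 100 would scale to six, while a reading of 105 falls inside the tolerance band and changes nothing.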