< ciso
brief />
Tag Banner

All news with #nvidia tag

86 articles · page 4 of 5

AWS launches EC2 P6-B300 with NVIDIA Blackwell Ultra

🚀 Amazon Web Services has announced general availability of Amazon EC2 P6-B300 instances powered by NVIDIA Blackwell Ultra B300 GPUs. The p6-b300.48xlarge delivers eight GPUs, 2.1 TB of high-bandwidth GPU memory, 6.4 Tbps EFA networking, 300 Gbps ENA throughput, and 4 TB of system memory. It targets training and deploying trillion-parameter foundation models and LLMs, offering higher memory, compute, and networking versus P6-B200.
read more →

ShadowMQ Deserialization Flaws in Major AI Inference Engines

⚠️ Oligo Security researcher Avi Lumelsky disclosed a widespread insecure-deserialization pattern dubbed ShadowMQ that affects major AI inference engines including vLLM, NVIDIA TensorRT-LLM, Microsoft Sarathi-Serve, Modular Max Server and SGLang. The root cause is using ZeroMQ's recv_pyobj() to deserialize network input with Python's pickle, permitting remote arbitrary code execution. Patches vary: some projects fixed the issue, others remain partially addressed or unpatched, and mitigations include applying updates, removing exposed ZMQ sockets, and auditing code for unsafe deserialization.
read more →

Copy-Paste RCE Flaw Impacts Major AI Inference Servers

🔒 Cybersecurity researchers disclosed a chain of remote code execution (RCE) vulnerabilities affecting AI inference frameworks from Meta, NVIDIA, Microsoft and open-source projects such as vLLM and SGLang. The flaws stem from reused code that called ZeroMQ’s recv-pyobj() and passed data directly into Python’s pickle.loads(), enabling unauthenticated RCE over exposed sockets. Vendors have released patches replacing unsafe pickle usage with JSON-based serialization and adding authentication and transport protections. Operators are urged to upgrade to patched releases and harden ZMQ channels, restrict network exposure, and avoid deserializing untrusted data.
read more →

AWS Expands EC2 G6f NVIDIA L4 GPU Instances to More Regions

🚀 Amazon Web Services has expanded availability of EC2 G6f instances powered by NVIDIA L4 GPUs to Europe (Spain) and Asia Pacific (Seoul), improving access for graphics and visualization workloads. G6f instances support GPU partitions as small as one-eighth of a GPU with 3 GB of GPU memory, enabling finer-grained right-sizing and cost savings compared to single‑GPU options. Instances are offered in multiple sizes paired with third‑generation AMD EPYC processors, and are purchasable as On‑Demand, Spot, or via Savings Plans; customers should use NVIDIA GRID driver 18.4 or later to launch these instances.
read more →

Microsoft unveils Fairwater AI datacenter in Atlanta

🚀 Microsoft announced the new Fairwater Azure AI datacenter in Atlanta, Georgia, expanding its planet-scale AI superfactory. The purpose-built facility integrates massive NVIDIA Blackwell GPU clusters on a single flat network and uses rack-level direct liquid cooling plus a two-story layout to maximize compute density and reduce latency. It also connects via a dedicated AI WAN to enable cross-site fungibility and dynamic workload allocation.
read more →

Check Point's AI Cloud Protect with NVIDIA BlueField

🔒 Check Point has made AI Cloud Protect powered by NVIDIA BlueField available for enterprise deployment, offering DPU-accelerated security for cloud AI workloads. The solution aims to inspect and protect GenAI traffic and prompts to reduce data exposure risks while integrating with existing cloud environments. It targets prompt manipulation and infrastructure attacks at scale and is positioned for organizations building AI factories.
read more →

Google Cloud launches managed DRANET for GKE with A4X Max

🚀 Google Cloud is previewing managed DRANET on GKE, enabling Kubernetes to treat high-performance RDMA network interfaces as schedulable resources. The integration aligns NICs and GPUs by NUMA topology to reduce latency and increase throughput, while abstracting away operational complexity. It launches with the new A4X Max instances to deliver topology-aware networking for large multi-GPU AI workloads. Developers can request specific network interfaces in pod specs and rely on GKE to co-schedule NICs and accelerators, improving utilization and simplifying operations.
read more →

A4X Max, GKE Networking, and Vertex AI Training Now Shipping

🚀 Google Cloud is expanding its NVIDIA collaboration with the new A4X Max instances powered by NVIDIA GB300 NVL72, delivering 72 GPUs with high‑bandwidth NVLink and shared memory for demanding multimodal reasoning. GKE now supports DRANET for topology‑aware RDMA scheduling and integrates NVIDIA NeMo Guardrails into GKE Inference Gateway, while Vertex AI Model Garden will host NVIDIA Nemotron models. Vertex AI Training adds NeMo and NeMo‑RL recipes and a managed Slurm environment to accelerate large‑scale training and deployment.
read more →

Microsoft and NVIDIA Deepen AI Infrastructure Partnership

🚀 Microsoft and NVIDIA announced expanded AI infrastructure on Azure, bringing NVIDIA RTX PRO 6000 Blackwell Server Edition to Azure Local, new Nemotron and Cosmos models via Azure AI Foundry, and broader support for Run:ai and GB300 NVL72 supercomputing clusters. These updates enable on-premises and edge AI with cloud-like management, improved GPU utilization, and infrastructure tailored for frontier reasoning, multimodal workloads, and real-time inferencing. Microsoft also highlighted NVIDIA Dynamo optimizations for ND GB200-v6 VMs to boost inference throughput at scale.
read more →

Securing the AI Factory: Palo Alto Networks and NVIDIA

🔒 Palo Alto Networks outlines a platform-centric approach to protect the enterprise AI Factory, announcing integration of Prisma AIRS with NVIDIA BlueField DPUs. The collaboration embeds distributed zero-trust security directly into infrastructure, delivering agentless, penalty-free runtime protection and real-time workload threat detection. Validated on NVIDIA RTX PRO Server and optimized for BlueField‑3, with BlueField‑4 forthcoming, the solution ties into Strata Cloud Manager and Cortex for end-to-end visibility and control, aiming to secure AI operations at scale without compromising performance.
read more →

Google Cloud G4 VMs: NVIDIA RTX PRO 6000 Blackwell GA

🚀 The G4 VM is now generally available on Google Cloud, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and offering up to 768 GB of GDDR7 memory per instance class. It targets latency-sensitive and regulated workloads for generative AI, real-time rendering, simulation, and virtual workstations. Features include FP4 precision support, Multi-Instance GPU (MIG) partitioning, an enhanced PCIe P2P interconnect for faster multi‑GPU All-Reduce, and an NVIDIA Omniverse VMI on Marketplace for industrial digital twins.
read more →

G4 VMs: High-performance P2P Fabric for Multi‑GPU Workloads

🚀 Google Cloud's newly GA G4 VMs combine NVIDIA RTX PRO 6000 Blackwell GPUs with a custom, software-defined PCIe fabric to enable high-performance peer-to-peer (P2P) GPU communication. The platform accelerates collective operations like All-Gather and All-Reduce without code changes, delivering up to 2.2x faster collectives. For tensor-parallel inference, customers can see up to 168% higher throughput and up to 41% lower inter-token latency. G4 integrates with GKE Inference Gateway for horizontal scaling and production deployments.
read more →

Google Cloud and NVIDIA Power AI Innovation Week in D.C.

🤝 At the end of October in Washington, D.C., Google Cloud and NVIDIA will lead a week of events highlighting advances in AI, high-performance computing, and secure mission deployments. NVIDIA GTC DC (Oct. 27–29) features keynotes, demos, and hands-on sessions showcasing next-generation models and infrastructure. The Google Public Sector Summit (Oct. 29) convenes government leaders to explore practical uses of technologies like Gemini for Government and discuss secure, scalable AI adoption for mission impact.
read more →

Microsoft Advances Open Standards for Frontier AI Scale

🔧 Microsoft details OCP contributions to accelerate open-source infrastructure for frontier-scale AI, focusing on power, cooling, networking, security, and sustainability. It highlights innovations such as solid-state transformers, a power-stabilization paper with OpenAI and NVIDIA, and a next-generation HXU for liquid cooling. Networking efforts include ESUN and scale-up Ethernet workstreams, while security contributions introduce Caliptra 2.1, Adams Bridge 2.0, and L.O.C.K. The post also advances fleet lifecycle management, carbon accounting, and waste-heat reuse for globally deployable AI datacenters.
read more →

Microsoft Azure Debuts Large-Scale NVIDIA GB300 Cluster

🚀 Microsoft Azure announced the first production-scale cluster using more than 4,600 NVIDIA GB300 NVL72 (Blackwell Ultra) GPUs, co-engineered with NVIDIA to support OpenAI and other frontier AI workloads. The new ND GB300 v6 VMs are optimized for reasoning models, agentic systems, and multimodal generative AI, delivered on rack-scale systems with 72 GPUs per rack and 36 NVIDIA Grace CPUs. Microsoft says this infrastructure will shorten training from months to weeks and will scale to hundreds of thousands of Blackwell Ultra GPUs globally.
read more →

Cisco Talos Discloses Multiple Nvidia and Adobe Flaws

⚠ Cisco Talos disclosed five vulnerabilities in NVIDIA's CUDA Toolkit components and one use-after-free flaw in Adobe Acrobat Reader. The Nvidia issues affect tools like cuobjdump (12.8.55) and nvdisasm (12.8.90), where specially crafted fatbin or ELF files can trigger out-of-bounds writes, heap overflows, and potential arbitrary code execution. The Adobe bug (2025.001.20531) involves malicious JavaScript in PDFs that can reuse freed objects, leading to memory corruption and possible remote code execution if a user opens a crafted document.
read more →

WireTap Attack Extracts Intel SGX ECDSA Key via DDR4

🔬 Researchers from Georgia Institute of Technology and Purdue University describe WireTap, a physical memory-bus interposer attack that passively inspects DDR4 traffic to recover secrets from Intel SGX enclaves. By exploiting deterministic memory encryption, the team built an oracle enabling a full key-recovery of an SGX ECDSA attestation key from the Quoting Enclave. The prototype uses inexpensive, off-the-shelf equipment (roughly $1,000) and can be introduced via supply-chain compromise or local physical access. Intel says the scenario requires physical access and falls outside its memory-encryption threat model.
read more →

Inside Fairwater: Microsoft's New Frontier AI Datacenter

🚀 Microsoft unveiled Fairwater, a purpose-built AI datacenter in Wisconsin and sister sites in Norway and the UK, designed to operate as a single, global-scale supercomputer. The facility deploys interconnected racks of NVIDIA GB200 servers (72 GPUs per rack) and claims 10× the performance of the world’s fastest supercomputer. It combines closed-loop liquid cooling, exabyte-scale storage and an AI WAN to enable distributed training and large-scale inference across Azure.
read more →

CrowdStrike Secures AI Across the Enterprise with Partners

🔒 CrowdStrike describes how the Falcon platform delivers unified visibility and lifecycle defense across the full AI stack, from GPUs and training data to inference pipelines and SaaS agents. The post highlights integrations with NVIDIA, AWS, Intel, Dell, Meta, and Salesforce to extend protection into infrastructure, data, models, and applications. It also introduces agentic defense via Charlotte AI for autonomous triage and rapid response, and emphasizes governance controls to prevent data leaks and adversarial manipulation.
read more →

Amazon SageMaker Adds EC2 P6-B200 Notebook Instances

🚀 Amazon Web Services announced general availability of EC2 P6-B200 instances for SageMaker notebooks. These instances include eight NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory and 5th Gen Intel Xeon processors, offering up to 2x the training performance versus P5en. They enable interactive development and fine-tuning of large foundation models in JupyterLab and CodeEditor, and are available in US East (Ohio) and US West (Oregon).
read more →