All news tagged #vertex ai

Fri, November 14, 2025

Advancing Text-to-SQL: Gemini's BIRD Benchmark Breakthrough

🚀 Google Cloud reports a new state-of-the-art score of 76.13 on the Single Trained Model Track of the BIRD benchmark, achieved with a fine-tuned Gemini 2.5 Pro. The team credits rigorous data filtering, multitask supervised fine-tuning, and test-time self-consistency selection for the gains. These improvements bolster NL2SQL features in AlloyDB AI and BigQuery, and enhance developer tooling such as Gemini Code Assist for reliable SQL generation.
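
To illustrate what test-time self-consistency selection can look like for text-to-SQL, here is a minimal sketch. It is not the method described in the post: it assumes the model has already produced several candidate queries, runs each against a database, and keeps a candidate whose result set is the most common. The function name `select_by_self_consistency`, the `orders` table, and the candidate queries are all hypothetical.

```python
import sqlite3
from collections import Counter


def select_by_self_consistency(conn, candidates):
    """Return a candidate SQL string whose execution result is the most common.

    Hypothetical helper: candidates that fail to execute are discarded,
    and None is returned if no candidate runs successfully.
    """
    executed = []
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # a query that does not run gets no vote
        # Sort rows so that ordering differences do not split the vote.
        executed.append((sql, tuple(sorted(rows, key=repr))))
    if not executed:
        return None
    votes = Counter(result for _, result in executed)
    winning_result = votes.most_common(1)[0][0]
    # Pick the first candidate (in input order) that produced the winning result.
    return next(sql for sql, result in executed if result == winning_result)


# Toy database and candidates, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.0), (3, 40.0)]
)
candidates = [
    "SELECT COUNT(*) FROM orders WHERE amount > 20",
    "SELECT COUNT(id) FROM orders WHERE amount > 20.0",
    "SELECT COUNT(*) FROM orders WHERE amount >= 10",
    "SELECT COUNT(*) FROM order WHERE amount > 20",  # invalid SQL, discarded
]
best = select_by_self_consistency(conn, candidates)
print(best)
```

Voting on execution results rather than on query text lets syntactically different but equivalent queries support each other, which is the usual motivation for this kind of selection.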

Thu, November 13, 2025

Four Steps for Startups to Build Multi-Agent Systems

🤖 This post outlines a concise four-step framework for startups to design and deploy multi-agent systems, illustrated through a Sales Intelligence Agent example. It recommends choosing between pre-built, partner, or custom agents and describes using Google's Agent Development Kit (ADK) for code-first control. The guide covers hybrid architectures, tool-based state isolation, secure data access, and a three-step deployment blueprint to run agents on Vertex AI Agent Engine and Cloud Run.

Thu, November 13, 2025

Google Cloud expands Hugging Face support for AI developers

🤝 Google Cloud and Hugging Face are deepening their partnership to speed developer workflows and strengthen enterprise model deployments. A new gateway will cache Hugging Face models and datasets on Google Cloud so downloads take minutes, not hours, across Vertex AI and Google Kubernetes Engine. The collaboration adds native TPU support for open models and integrates Google Cloud’s threat intelligence and Mandiant scanning for models served through Vertex AI.

Tue, November 11, 2025

Google Cloud Expands AI Infrastructure and Services in India

🤝 Google Cloud is increasing local AI compute in India with its AI Hypercomputer powered by Trillium TPUs, enabling training and serving of advanced Gemini models with data residency and sovereignty controls. New local offerings include batch support for Gemini 2.5 Flash, a preview of Document AI, and real‑time grounding using Google Maps for location‑aware responses. Google is also supporting Indic Arena at IIT Madras with cloud credits to benchmark Indian multilingual models and to help grow the local AI ecosystem.

Fri, November 7, 2025

AlloyDB AI: Auto Vector Embeddings and Indexing Capabilities

🔍 AlloyDB AI launches two preview features—Auto Vector Embeddings and Auto Vector Index—that let teams convert operational databases into AI-native stores using simple SQL. Auto Vector Embeddings generates and incrementally refreshes vectors in-database, batching calls to Vertex AI and running as a background process. The Auto Vector Index (ScaNN) self-configures, self-tunes, and maintains vector indexes to accelerate filtered semantic search and reduce ETL and tuning overhead for production workloads.

Thu, November 6, 2025

Build Your First AI Travel Assistant with Gemini Today

🚀 This codelab walks developers through building a functional travel chatbot using Google's Gemini via the Vertex AI SDK. It explains how to connect a web frontend to Gemini, craft system instructions to shape assistant behavior, and enable function-calling to fetch live data such as geocoding and weather. No advanced ML expertise is required; the lab provides step-by-step code samples, API usage, and practical recommendations for iterating prompts so you can produce a working, production-ready demo.
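
As a rough picture of the function-calling step, the sketch below shows only the application side: the model is assumed to return a structured call (a function name plus arguments), and the app dispatches it to a local function and hands the result back. It does not use the Vertex AI SDK, and the functions `geocode`, `get_weather`, and `dispatch`, the canned data, and the shape of the `call` dictionary are all assumptions for illustration, not the codelab's code.

```python
def geocode(city):
    """Hypothetical stand-in for a real geocoding API call."""
    known = {"Paris": (48.8566, 2.3522)}
    lat, lon = known[city]  # raises KeyError for unknown cities
    return {"lat": lat, "lon": lon}


def get_weather(lat, lon):
    """Hypothetical stand-in for a real weather API call."""
    return {"lat": lat, "lon": lon, "forecast": "sunny"}


# Registry of functions the model is allowed to request.
TOOLS = {"geocode": geocode, "get_weather": get_weather}


def dispatch(function_call):
    """Run the requested tool and wrap its output (or an error) in a dict."""
    name = function_call["name"]
    if name not in TOOLS:
        return {"error": f"unknown function: {name}"}
    try:
        return {"result": TOOLS[name](**function_call["args"])}
    except (KeyError, TypeError) as exc:
        # Bad or missing arguments are reported instead of crashing the app.
        return {"error": f"{type(exc).__name__}: {exc}"}


# Assumed shape of a function call emitted by the model.
call = {"name": "geocode", "args": {"city": "Paris"}}
reply = dispatch(call)
print(reply)
```

Returning errors as data rather than raising gives the model a chance to correct its arguments on the next turn.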

Wed, November 5, 2025

Vertex AI Agent Builder: Build, Scale, Govern Agents

🚀 Vertex AI Agent Builder is Google Cloud's integrated platform to build, scale, and govern production AI agents. The update expands the Agent Development Kit (ADK) and Agent Engine with configurable context layers to reduce token usage, an adaptable plugins framework, and new language SDK support including Go. Production features include observability, evaluation tools, simplified deployment via the ADK CLI, and strengthened governance with native agent identities and Model Armor protections.

Tue, November 4, 2025

How Google Cloud Networking Supports AI Workloads at Scale

🔗 Networking is a critical enabler for AI on Google Cloud, connecting models, storage, and inference endpoints while preserving security and performance. The post outlines seven capabilities—from private API access and RDMA-backed GPU interconnects to hybrid Cross-Cloud links—that reduce latency, prevent data exfiltration, and simplify model serving. It also highlights options for exposing inference (managed services, GKE, load balancing) and previews AI-driven network operations using Gemini.

Fri, October 31, 2025

Conversational AI Agents: Designing for Retail UX, Commerce

🛍️ Google Cloud outlines UX and implementation guidance for building conversational AI agents tailored to online shopping. The article presents seven practical design principles — including multimodal input, intelligent query handling, rich visual presentation, and clear trust signals — that improve discovery and reduce friction. It highlights features like predictive assistance and contextual clarification and offers a Figma component library plus developer resources to accelerate deployment.

Tue, October 28, 2025

Giles AI on Google Cloud: Transforming Medical Research

🚀 Giles AI migrated its healthcare-focused platform to Google Cloud to reduce latency, improve scalability, and accelerate developer velocity. Using Google Kubernetes Engine, Cloud Run, and Compute Engine, the company orchestrates complex clinical data flows and routes prompts through Vertex AI and Model Garden to remain model-agnostic. Data storage and extraction are handled with Cloud SQL, Cloud Storage, and Document AI, while Cloud Armor and Security Command Center bolster security and compliance. Early customer results include dramatic reductions in research time and improvements in response accuracy.

Tue, October 28, 2025

A4X Max, GKE Networking, and Vertex AI Training Now Shipping

🚀 Google Cloud is expanding its NVIDIA collaboration with the new A4X Max instances powered by NVIDIA GB300 NVL72, delivering 72 GPUs with high‑bandwidth NVLink and shared memory for demanding multimodal reasoning. GKE now supports DRANET for topology‑aware RDMA scheduling and integrates NVIDIA NeMo Guardrails into GKE Inference Gateway, while Vertex AI Model Garden will host NVIDIA Nemotron models. Vertex AI Training adds NeMo and NeMo‑RL recipes and a managed Slurm environment to accelerate large‑scale training and deployment.

Tue, October 28, 2025

Enabling a Safe Agentic Web with reCAPTCHA Controls

🔐 Google Cloud outlines a pragmatic framework to secure the emerging agentic web while preserving smooth user experiences. The post details how reCAPTCHA and Google Cloud combine agent and user identity, continuous behavior analysis, and AI-resistant mitigations such as mobile-device attestations. It highlights enabling safe agentic commerce via protocols like AP2 and tighter integration with cloud AI services.

Mon, October 27, 2025

Vertex AI Training Expands Large-Scale Training Capabilities

🚀 Vertex AI Training introduces managed features designed for large-scale model development, simplifying cluster provisioning, job orchestration, and resiliency across hundreds to thousands of accelerators. The offering integrates Cluster Director, Dynamic Workload Scheduler, optimized checkpointing, and curated training recipes, including NVIDIA NeMo support. These capabilities reduce operational overhead and accelerate transitions from pretraining to fine-tuning while improving cost and uptime efficiency.

Fri, October 24, 2025

How Five Agencies Built Impossible Ads with Gemini

🎨 Google showcased how five agencies used Gemini 2.5 Pro and complementary generative media models to produce ambitious ad campaigns that blend nostalgia, personalization, and scalable visual storytelling. Projects ranged from a retro AI radio for Slice to personalized "postcard" ads for Virgin Voyages, AI co-hosts and party themes for Smirnoff, crowdsourced mascots for Visit Orlando, and cinematic short film work with Moncler. Results highlighted rapid production, measurable engagement lifts, and cross-product workflows across Imagen, Veo, Lyria, and Vertex AI. The post invites brands to explore these tools for creative scale and efficiency.

Thu, October 23, 2025

Google Gen AI .NET SDK Brings Gemini to C#/.NET Developers

🚀 Google has released the Google Gen AI .NET SDK, bringing unified access to Gemini on Google AI and Vertex AI for C#/.NET developers. The SDK is available via NuGet (dotnet add package Google.GenAI) and supports client creation with an API key or with project/location settings for Vertex AI. Examples demonstrate unary and streaming text generation, image generation, and configurable response schemas and generation settings. Google provides the API reference, GitHub source (googleapis/dotnet-genai) and a DemoApp with samples to help developers get started.

Wed, October 22, 2025

Model Armor and Apigee: Protecting Generative AI Apps

🔒 Google Cloud’s Model Armor integrates with Apigee to screen prompts, responses, and agent interactions, helping organizations mitigate prompt injection, jailbreaks, sensitive data exposure, malicious links, and harmful content. The model‑agnostic, cloud‑agnostic service supports REST APIs and inline integrations with Apigee, Vertex AI, Agentspace, and network service extensions. The article provides step‑by‑step setup: enable the API, create templates, assign service account roles, add SanitizeUserPrompt and SanitizeModelResponse policies to Apigee proxies, and review findings in the AI Protection dashboard.

Tue, October 21, 2025

SmarterX Builds Custom LLMs with Google Cloud Tools

🔍 SmarterX uses Google Cloud to build custom LLMs that help retailers, manufacturers, and logistics companies manage regulatory compliance across product lifecycles. Using BigQuery, Cloud Storage, Gemini, and Vertex AI, the company ingests, normalizes, and indexes unstructured regulatory and product data, applies RAG and grounding, and trains customer-specific models. The integrated platform empowers subject matter experts to evaluate, correct, and deploy model updates without heavy engineering overhead.

Mon, October 20, 2025

Google Cloud G4 VMs: NVIDIA RTX PRO 6000 Blackwell GA

🚀 The G4 VM is now generally available on Google Cloud, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and offering up to 768 GB of GDDR7 memory per instance class. It targets latency-sensitive and regulated workloads for generative AI, real-time rendering, simulation, and virtual workstations. Features include FP4 precision support, Multi-Instance GPU (MIG) partitioning, an enhanced PCIe P2P interconnect for faster multi‑GPU All-Reduce, and an NVIDIA Omniverse VMI on Marketplace for industrial digital twins.

Mon, October 20, 2025

Google Named Leader in 2025 IDC MarketScape for GenAI

🏆 Google Cloud announced it was named a Leader in the 2025 IDC MarketScape for Worldwide GenAI Life-Cycle Foundation Model Software, spotlighting the Gemini model family and the Vertex AI platform. The post highlights Gemini 2.5’s expanded “thinking” capabilities and new cost controls such as thinking budgets and thought summaries for improved auditability. It also underscores native multimodality, creative variants like Nano Banana, developer tooling including the Gemini CLI, and enterprise features for customization, grounding, security, and governance.

Mon, October 20, 2025

Agent Factory Recap: Evaluating Agents, Tooling, and MAS

📡 This recap of the Agent Factory podcast episode, hosted by Annie Wang with guest Ivan Nardini, explains how to evaluate autonomous agents using a practical, full-stack approach. It outlines what to measure — final outcomes, chain-of-thought, tool use, and memory — and contrasts measurement techniques: ground truth, LLM-as-a-judge, and human review. The post demonstrates a 5-step debugging loop using the Agent Development Kit (ADK) and describes how to scale evaluation to production with Vertex AI.
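
The ground-truth technique mentioned above can be sketched in a few lines. This is an illustrative example only, not the ADK evaluation API: it assumes each agent run is recorded as a dictionary with a final answer and an ordered list of tool calls, and it compares both against an expected reference. The function `evaluate_run` and the record layout are hypothetical.

```python
def evaluate_run(run, expected):
    """Compare one recorded agent run against a ground-truth reference.

    Hypothetical record layout:
      run      = {"final_answer": str, "tool_calls": [{"tool": str, ...}, ...]}
      expected = {"final_answer": str, "tools": [str, ...]}
    """
    # Final outcome: case- and whitespace-insensitive exact match.
    final_ok = (
        run["final_answer"].strip().lower()
        == expected["final_answer"].strip().lower()
    )
    # Tool use: the agent must call the expected tools in the expected order.
    tool_names = [step["tool"] for step in run["tool_calls"]]
    trajectory_ok = tool_names == expected["tools"]
    return {
        "final_answer_match": final_ok,
        "tool_trajectory_match": trajectory_ok,
    }


# Made-up run and reference for illustration.
run = {
    "final_answer": "  It is sunny in Paris. ",
    "tool_calls": [{"tool": "geocode"}, {"tool": "get_weather"}],
}
expected = {
    "final_answer": "It is sunny in Paris.",
    "tools": ["geocode", "get_weather"],
}
report = evaluate_run(run, expected)
print(report)
```

Exact matching only works when answers are short and deterministic; for free-form answers, the LLM-as-a-judge or human-review approaches named in the recap are the usual alternatives.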
