< ciso
brief />

All news with #vertex ai tag

93 articles · page 4 of 5

Giles AI on Google Cloud: Transforming Medical Research

🚀 Giles AI migrated its healthcare-focused platform to Google Cloud to reduce latency, improve scalability, and accelerate developer velocity. Using Google Kubernetes Engine, Cloud Run, and Compute Engine, the company orchestrates complex clinical data flows and routes prompts through Vertex AI and Model Garden to remain model-agnostic. Data storage and extraction are handled with Cloud SQL, Cloud Storage, and Document AI, while Cloud Armor and Security Command Center bolster security and compliance. Early customer results include dramatic reductions in research time and improvements in response accuracy.

A4X Max, GKE Networking, and Vertex AI Training Now Shipping

🚀 Google Cloud is expanding its NVIDIA collaboration with the new A4X Max instances powered by NVIDIA GB300 NVL72, delivering 72 GPUs with high‑bandwidth NVLink and shared memory for demanding multimodal reasoning. GKE now supports DRANET for topology‑aware RDMA scheduling and integrates NVIDIA NeMo Guardrails into GKE Inference Gateway, while Vertex AI Model Garden will host NVIDIA Nemotron models. Vertex AI Training adds NeMo and NeMo‑RL recipes and a managed Slurm environment to accelerate large‑scale training and deployment.

Vertex AI Training Expands Large-Scale Training Capabilities

🚀 Vertex AI Training introduces managed features designed for large-scale model development, simplifying cluster provisioning, job orchestration, and resiliency across hundreds to thousands of accelerators. The offering integrates Cluster Director, Dynamic Workload Scheduler, optimized checkpointing, and curated training recipes, including NVIDIA NeMo support. These capabilities reduce operational overhead and accelerate transitions from pretraining to fine-tuning while improving cost and uptime efficiency.

SmarterX Builds Custom LLMs with Google Cloud Tools

🔍 SmarterX uses Google Cloud to build custom LLMs that help retailers, manufacturers, and logistics companies manage regulatory compliance across product lifecycles. Using BigQuery, Cloud Storage, Gemini, and Vertex AI, the company ingests, normalizes, and indexes unstructured regulatory and product data, applies RAG and grounding, and trains customer-specific models. The integrated platform empowers subject matter experts to evaluate, correct, and deploy model updates without heavy engineering overhead.

AI Hypercomputer Update: vLLM on TPUs and Tooling Advances

🔧 Google Cloud’s Q3 AI Hypercomputer update highlights inference improvements and expanded tooling to accelerate model serving and diagnostics. The release integrates vLLM with Cloud TPUs via the new tpu-inference plugin, unifying JAX and PyTorch runtimes and boosting TPU inference for models such as Gemma, Llama, and Qwen. Additional launches include improved XProf profiling and Cloud Diagnostics XProf, an AI inference recipe for NVIDIA Dynamo, NVIDIA NeMo RL recipes, and GA of the GKE Inference Gateway and Quickstart to help optimize latency and cost.

Google Named Leader in 2025 IDC MarketScape for GenAI

🏆 Google Cloud announced it was named a Leader in the 2025 IDC MarketScape for Worldwide GenAI Life-Cycle Foundation Model Software, spotlighting the Gemini model family and the Vertex AI platform. The post highlights Gemini 2.5’s expanded “thinking” capabilities and new cost controls such as thinking budgets and thought summaries for improved auditability. It also underscores native multimodality, creative variants like Nano Banana, developer tooling including the Gemini CLI, and enterprise features for customization, grounding, security, and governance.

Agent Factory Recap: Evaluating Agents, Tooling, and MAS

📡 This recap of the Agent Factory podcast episode, hosted by Annie Wang with guest Ivan Nardini, explains how to evaluate autonomous agents using a practical, full-stack approach. It outlines what to measure — final outcomes, chain-of-thought, tool use, and memory — and contrasts measurement techniques: ground truth, LLM-as-a-judge, and human review. The post demonstrates a 5-step debugging loop using the Agent Development Kit (ADK) and describes how to scale evaluation to production with Vertex AI.

Moloco and Google Cloud Power AI Vector Search in Retail

🔎 Moloco’s AI-native retail media platform, integrated with Vertex AI Vector Search on Google Cloud, delivers semantic, real-time ad retrieval and personalized recommendations. The joint architecture uses TPUs and GPUs for model training and scoring while vector search runs efficiently on CPUs, enabling outcomes-based bidding at scale. Internal benchmarks report ~10x capacity, up to ~25% lower p95 latency, and a ~4% revenue uplift. The managed service reduces operational overhead and accelerates time-to-value for retailers.
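To make the idea of semantic retrieval concrete, here is a minimal sketch of vector search by cosine similarity. It is a brute-force toy in plain Python, not Vertex AI Vector Search or Moloco's system (a managed service would typically use approximate nearest-neighbor indexes). The catalog entries, the three-dimensional embeddings, and the function names are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=2):
    """Return the ids of the k catalog vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), ad_id) for ad_id, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ad_id for _, ad_id in scored[:k]]


# Hypothetical ad embeddings; a real system would get these from a trained model.
CATALOG = {
    "running-shoes": [0.9, 0.1, 0.0],
    "trail-boots": [0.7, 0.3, 0.1],
    "espresso-maker": [0.0, 0.2, 0.9],
}

QUERY = [0.8, 0.2, 0.0]  # hypothetical embedding of a shopper's query
RESULT = top_k(QUERY, CATALOG, k=2)
print(RESULT)
```

The scoring loop is only arithmetic over floats, which is consistent with the summary's point that retrieval can run on CPUs while training stays on accelerators.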

Vertex AI SDK Adds Prompt Management for Enterprises

🛠️ Google Cloud announced General Availability of Prompt Management in the Vertex AI SDK, enabling teams to programmatically create, version, and manage prompts as first-class assets. The capability bridges Vertex AI Studio’s visual prompt design with SDK-driven automation to improve collaboration, reproducibility, and lifecycle control. Enterprise security and compliance are supported via CMEK and VPC-SC, and the SDK exposes simple Python methods to create, list, update, and delete prompt resources tied to models such as gemini-2.5-flash. The documented code examples show how to centralize prompt governance and scale generative AI workflows.
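As a rough illustration of what "prompts as versioned assets" can mean, the sketch below keeps prompt versions in an in-memory registry. It is not the Vertex AI SDK: the `PromptRegistry` class and its method names are hypothetical stand-ins, chosen only to mirror the create, list, update, and delete operations the summary mentions.

```python
from dataclasses import dataclass, field


@dataclass
class PromptVersion:
    version: int
    text: str
    model: str


@dataclass
class PromptRegistry:
    """Hypothetical in-memory store; every update appends a new version."""

    _prompts: dict = field(default_factory=dict)

    def create(self, name, text, model):
        if name in self._prompts:
            raise ValueError(f"prompt {name!r} already exists")
        first = PromptVersion(1, text, model)
        self._prompts[name] = [first]
        return first

    def update(self, name, text):
        history = self._prompts[name]
        latest = history[-1]
        new = PromptVersion(latest.version + 1, text, latest.model)
        history.append(new)  # older versions stay available for rollback
        return new

    def get(self, name, version=None):
        history = self._prompts[name]
        return history[-1] if version is None else history[version - 1]

    def list_names(self):
        return sorted(self._prompts)

    def delete(self, name):
        del self._prompts[name]


registry = PromptRegistry()
registry.create("summarize", "Summarize the text.", model="gemini-2.5-flash")
registry.update("summarize", "Summarize the text in three bullet points.")
print(registry.get("summarize").version)
```

Keeping every version instead of overwriting is what makes a prompt change reproducible and reversible, which is the lifecycle control the summary describes.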

Ultimate Prompting Guide for Veo 3.1 on Vertex AI Preview

🎬 This guide introduces Veo 3.1, Google Cloud's improved generative video model available in preview on Vertex AI, and explains how to move beyond "prompt and pray" toward deliberate creative control. It highlights core capabilities: high-fidelity 720p/1080p output, variable clip lengths, synchronized dialogue and sound effects, and stronger image-to-video fidelity. The article presents a five-part prompting formula and detailed techniques for cinematography, soundstage direction, negative prompting, and timestamped scenes. It also describes advanced multi-step workflows that combine Veo 3.1 with Gemini 2.5 Flash Image to produce consistent characters and controlled transitions, and notes SynthID watermarking and certain current limitations.

Vertex AI Context Caching: Reduce Cost and Latency

⚡ Vertex AI context caching saves and reuses precomputed input tokens so developers avoid repeatedly sending and recomputing long contextual content, reducing latency and cost for large-context AI applications. It comes in two forms. Implicit caching is automatic and on by default, uses short-lived KV caches (deleted within 24 hours), and is integrated with Provisioned Throughput. Explicit caching uses CachedContent objects, which are paid for once and then reused at a deep discount, with optional CMEK protection. Caches support multimodal inputs and very large context windows.
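The cost effect of explicit caching can be sketched with simple arithmetic. The numbers below are placeholders, not Vertex AI pricing: the price per million tokens, the 90% discount on cached tokens, and the workload sizes are all assumptions, and cache storage charges are ignored.

```python
def uncached_cost(context_tokens, question_tokens, requests, price_per_million):
    """Every request resends the full context at the normal input price."""
    total_tokens = requests * (context_tokens + question_tokens)
    return total_tokens * price_per_million / 1_000_000


def cached_cost(context_tokens, question_tokens, requests, price_per_million, discount):
    """Context is paid for once, then reused at a discounted rate per request."""
    create = context_tokens * price_per_million / 1_000_000
    cached_rate = price_per_million * (1.0 - discount)
    per_request = (context_tokens * cached_rate + question_tokens * price_per_million) / 1_000_000
    return create + requests * per_request


# Hypothetical workload: 50 questions against one 100,000-token document.
CONTEXT, QUESTION, REQUESTS = 100_000, 200, 50
PRICE = 1.00     # assumed price per million input tokens (placeholder)
DISCOUNT = 0.90  # assumed discount on cached tokens (placeholder)

baseline = uncached_cost(CONTEXT, QUESTION, REQUESTS, PRICE)
with_cache = cached_cost(CONTEXT, QUESTION, REQUESTS, PRICE, DISCOUNT)
print(f"uncached: {baseline:.2f}, cached: {with_cache:.2f}")
```

Under these assumptions the saving grows with the number of requests that share the same context. With a single request the cached path costs more, because the context is paid for at creation and again at the discounted rate.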

Google Introduces LLM-Evalkit for Prompt Engineering

🧭 LLM-Evalkit is an open-source, lightweight application from Google that centralizes and streamlines prompt engineering using Vertex AI SDKs. It provides a no-code interface for creating, versioning, testing, and benchmarking prompts while tracking objective performance metrics. The tool promotes a dataset-driven evaluation workflow—define the task, assemble representative test cases, and score outputs against clear metrics—to replace ad-hoc iteration and subjective comparisons. Documentation and a guided console tutorial are available to help teams adopt the framework and reproduce experiments.
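The dataset-driven workflow described above (define the task, assemble test cases, score against a metric) can be sketched in a few lines. This is not LLM-Evalkit's code: the stub model, the two prompt templates, and the exact-match metric are illustrative assumptions, and a real run would call an actual model.

```python
def exact_match(prediction, expected):
    """Metric: case-insensitive equality after trimming whitespace."""
    return prediction.strip().lower() == expected.strip().lower()


def evaluate(prompt_template, model, test_cases):
    """Fraction of test cases where the model output matches the expected label."""
    hits = 0
    for case in test_cases:
        prompt = prompt_template.format(input=case["input"])
        if exact_match(model(prompt), case["expected"]):
            hits += 1
    return hits / len(test_cases)


def stub_model(prompt):
    """Deterministic stand-in for an LLM so the example runs offline."""
    text = prompt.lower()
    label = "positive" if "love" in text or "great" in text else "negative"
    # The stub only answers tersely when the prompt asks for one word.
    return label if "one word" in text else f"The sentiment is {label}."


TEST_CASES = [
    {"input": "I love this phone", "expected": "positive"},
    {"input": "Great battery life", "expected": "positive"},
    {"input": "The screen cracked in a week", "expected": "negative"},
]

PROMPT_V1 = "Classify the sentiment: {input}"
PROMPT_V2 = "Classify the sentiment as positive or negative. Reply with one word. Text: {input}"

score_v1 = evaluate(PROMPT_V1, stub_model, TEST_CASES)
score_v2 = evaluate(PROMPT_V2, stub_model, TEST_CASES)
print(score_v1, score_v2)
```

Scoring both prompt versions against the same fixed test set is what replaces subjective side-by-side comparison with a number that can be tracked across iterations.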

Partners Powering the Gemini Enterprise Agent Ecosystem

🚀 Gemini Enterprise launches a curated ecosystem of partner-built AI agents that integrate with Google Cloud to deliver validated, secure solutions for enterprise workflows. The platform supports Agent2Agent (A2A) communication and includes a Gemini-powered AI agent finder for natural language discovery and filtering by industry, use case, and validation status. A broad set of technology and consulting partners — from Box and Salesforce to ServiceNow, Workday, and Accenture — are bringing agents and services to the Google Cloud Marketplace to accelerate deployment and adoption.

150 AI Use Cases from Startups Leveraging Google Cloud

🤖 At the AI Builders Forum, Google Cloud highlighted 150 startups using its generative AI stack—Vertex AI, Gemini, GKE, and Cloud Storage—to build agentic systems, healthcare models, developer tools, and media pipelines. The post catalogs companies across sectors (healthcare, finance, retail, security, creative) and describes technical integrations such as fine-tuning with Gemini, inference on GKE, and scalable analytics with BigQuery. It encourages startups to join Google for Startups Cloud and references a new Startup Technical Guide: AI Agents for building and scaling agentic applications.

Dataproc ML library: Connect Spark to Gemini and Vertex

🔗 Google has released an open-source Python library, Dataproc ML, to streamline running ML and generative-AI inference from Apache Spark on Dataproc. The library uses a SparkML-style builder pattern so users can configure a model handler (for example, GenAiModelHandler) and call .transform() to apply Gemini or other Vertex AI models directly to DataFrames. It also supports loading PyTorch and TensorFlow model artifacts from GCS for large-scale batch inference and includes performance optimizations such as vectorized data transfer, connection reuse, and automatic retry/backoff.
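To show the shape of a builder pattern that ends in a `.transform()` call, here is a toy version over plain Python dictionaries. It is not the Dataproc ML API and does not use Spark: `ToyModelHandler`, its methods, and the fake prediction function are hypothetical names invented for this sketch.

```python
class ToyModelHandler:
    """Hypothetical builder: each setter returns self so calls can be chained."""

    def __init__(self):
        self._model = None
        self._input_col = None
        self._output_col = "prediction"
        self._predict = None

    def model(self, name):
        self._model = name
        return self

    def input_col(self, name):
        self._input_col = name
        return self

    def output_col(self, name):
        self._output_col = name
        return self

    def predict_fn(self, fn):
        self._predict = fn
        return self

    def transform(self, rows):
        """Return new rows with the model output added as an extra column."""
        if self._model is None or self._input_col is None or self._predict is None:
            raise ValueError("model, input_col, and predict_fn must be set")
        return [
            {**row, self._output_col: self._predict(self._model, row[self._input_col])}
            for row in rows
        ]


def fake_predict(model_name, text):
    """Stand-in for a real model call so the example runs offline."""
    return f"[{model_name}] {text.upper()}"


rows = [{"review": "fast shipping"}]
handler = (
    ToyModelHandler()
    .model("gemini-2.5-flash")
    .input_col("review")
    .output_col("summary")
    .predict_fn(fake_predict)
)
out = handler.transform(rows)
print(out)
```

The configuration is collected first and the work happens only in `transform`, which is the same separation a SparkML-style transformer uses when it is applied to a DataFrame.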

Google Cloud Releases Generative Media Models on Vertex AI

🎨 Google Cloud announced General Availability and feature updates for its generative media models on Vertex AI, including Gemini 2.5 Flash Image, Veo 3, Imagen 4, and Gemini 2.5 TTS. The release emphasizes production readiness and enterprise security while adding multi‑aspect ratio image generation, batch image processing, vertical 9:16 video formats with precise duration controls, and studio‑quality multi‑speaker text‑to‑speech across 70+ languages. These enhancements target teams seeking faster, controlled, and scalable cross‑format media workflows for sight, sound, and motion.

Accelerate AI with Agents: EMEA Developer Series and Labs

🚀 Google Cloud is hosting a regional event series across EMEA to help developers and tech practitioners learn to build and scale AI agents. The program combines immersive, hands-on labs and expert-led workshops covering technologies such as Cloud Run, Vertex AI, Gemini, and the Agent Development Kit (ADK). Participants receive step-by-step guidance and practical exercises designed to accelerate agent deployments and operational readiness within organizations.

Anthropic's Claude Sonnet 4.5 Now Available on Vertex AI

🚀 Anthropic’s Claude Sonnet 4.5 is now generally available on Vertex AI, delivering advanced long-horizon autonomy for agents across coding, finance, research, and cybersecurity. The model can operate independently for hours, orchestrating tools and coordinating multiple agents to complete complex, multi-step tasks. Vertex AI provides orchestration, provisioning, security controls, and developer tooling, and includes Claude Code upgrades like a VS Code extension and an improved terminal interface.

INDOT Used Google AI to Save 360 Hours and Meet Deadline

🚀 The Indiana Department of Transportation (INDOT) built a week-long pilot on Google Cloud to meet a 30-day executive order, using a Retrieval-Augmented Generation workflow that combined rapid ETL, Vertex AI Search indexing, and Gemini. The system scraped and parsed decades of internal policies and manuals, produced draft reports across nine divisions with 98% fidelity, and saved an estimated 360 hours of manual effort, enabling INDOT to submit on time.

Achieve Agentic Productivity with Vertex AI Agent Builder

🛠️ Vertex AI Agent Builder is a unified platform for building, grounding, and deploying production-grade AI agents, designed to move organizations from prototype to scalable, secure services. It centers development on five pillars: Agent frameworks, Model choice, Tools for taking actions, Scalability and performance, and Built-in trust and security. It supports the Agent Development Kit (ADK) and a choice of models, including Gemini 2.5 Flash and Gemini 2.5 Pro as well as third-party models. The platform offers managed runtime features such as sandboxed code execution, Agent-to-Agent collaboration, Bidirectional Streaming, and a streamlined one-line path from ADK prototype to Agent Engine deployment, while enterprise controls like VPC-SC and CMEK address compliance and data protection.