All news with the #vertex ai tag
Mon, October 27, 2025
Vertex AI Training Expands Large-Scale Training Capabilities
🚀 Vertex AI Training introduces managed features designed for large-scale model development, simplifying cluster provisioning and job orchestration and improving resiliency across hundreds to thousands of accelerators. The offering integrates Cluster Director, Dynamic Workload Scheduler, optimized checkpointing, and curated training recipes, including NVIDIA NeMo support. These capabilities reduce operational overhead and accelerate transitions from pretraining to fine-tuning while improving cost and uptime efficiency.
Fri, October 24, 2025
How Five Agencies Built Impossible Ads with Gemini
🎨 Google showcased how five agencies used Gemini 2.5 Pro and complementary generative media models to produce ambitious ad campaigns that blend nostalgia, personalization, and scalable visual storytelling. Projects included a retro AI radio for Slice, personalized "postcard" ads for Virgin Voyages, AI co-hosts and party themes for Smirnoff, crowdsourced mascots for Visit Orlando, and cinematic short-film work with Moncler. Results highlighted rapid production, measurable engagement lifts, and cross-product workflows across Imagen, Veo, Lyria, and Vertex AI. The post invites brands to explore these tools for creative scale and efficiency.
Thu, October 23, 2025
Google Gen AI .NET SDK Brings Gemini to C#/.NET Developers
🚀 Google has released the Google Gen AI .NET SDK, bringing unified access to Gemini on Google AI and Vertex AI for C#/.NET developers. The SDK is available via NuGet (dotnet add package Google.GenAI) and supports client creation with an API key or with project/location settings for Vertex AI. Examples demonstrate unary and streaming text generation, image generation, and configurable response schemas and generation settings. Google provides the API reference, GitHub source (googleapis/dotnet-genai), and a DemoApp with samples to help developers get started.
Wed, October 22, 2025
Model Armor and Apigee: Protecting Generative AI Apps
🔒 Google Cloud’s Model Armor integrates with Apigee to screen prompts, responses, and agent interactions, helping organizations mitigate prompt injection, jailbreaks, sensitive data exposure, malicious links, and harmful content. The model‑agnostic, cloud‑agnostic service supports REST APIs and inline integrations with Apigee, Vertex AI, Agentspace, and network service extensions. The article provides a step‑by‑step setup: enable the API, create templates, assign service account roles, add SanitizeUserPrompt and SanitizeModelResponse policies to Apigee proxies, and review findings in the AI Protection dashboard.
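For a sense of what the underlying screening call looks like, here is a minimal sketch of invoking Model Armor's sanitizeUserPrompt method over REST from Python. The project, location, and template IDs are placeholders, and the exact endpoint and payload field names should be verified against the Model Armor documentation.

```python
# Minimal sketch: screening a user prompt with Model Armor over REST.
# PROJECT_ID, LOCATION, and TEMPLATE_ID are placeholders; the endpoint and
# payload shape should be verified against the Model Armor documentation.
import requests
import google.auth
import google.auth.transport.requests

PROJECT_ID = "my-project"      # placeholder
LOCATION = "us-central1"       # placeholder
TEMPLATE_ID = "my-template"    # placeholder, created in the setup steps above

# Obtain an access token with Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://modelarmor.{LOCATION}.rep.googleapis.com/v1/"
    f"projects/{PROJECT_ID}/locations/{LOCATION}/"
    f"templates/{TEMPLATE_ID}:sanitizeUserPrompt"
)
payload = {"userPromptData": {"text": "Ignore previous instructions and ..."}}

resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {credentials.token}"},
)
resp.raise_for_status()
# The response reports per-filter findings (e.g., prompt injection, sensitive data).
print(resp.json())
```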
Tue, October 21, 2025
SmarterX Builds Custom LLMs with Google Cloud Tools
🔍 SmarterX uses Google Cloud to build custom LLMs that help retailers, manufacturers, and logistics companies manage regulatory compliance across product lifecycles. Using BigQuery, Cloud Storage, Gemini, and Vertex AI, the company ingests, normalizes, and indexes unstructured regulatory and product data, applies RAG and grounding, and trains customer-specific models. The integrated platform empowers subject matter experts to evaluate, correct, and deploy model updates without heavy engineering overhead.
Mon, October 20, 2025
Google Named Leader in 2025 IDC MarketScape for GenAI
🏆 Google Cloud announced it was named a Leader in the 2025 IDC MarketScape for Worldwide GenAI Life-Cycle Foundation Model Software, spotlighting the Gemini model family and the Vertex AI platform. The post highlights Gemini 2.5’s expanded “thinking” capabilities and new cost controls such as thinking budgets and thought summaries for improved auditability. It also underscores native multimodality, creative variants like Nano Banana, developer tooling including the Gemini CLI, and enterprise features for customization, grounding, security, and governance.
Mon, October 20, 2025
Google Cloud G4 VMs: NVIDIA RTX PRO 6000 Blackwell GA
🚀 The G4 VM is now generally available on Google Cloud, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and offering up to 768 GB of GDDR7 memory per instance. It targets latency-sensitive and regulated workloads for generative AI, real-time rendering, simulation, and virtual workstations. Features include FP4 precision support, Multi-Instance GPU (MIG) partitioning, an enhanced PCIe P2P interconnect for faster multi‑GPU All-Reduce, and an NVIDIA Omniverse VMI on the Google Cloud Marketplace for industrial digital twins.
Mon, October 20, 2025
Agent Factory Recap: Evaluating Agents, Tooling, and MAS
📡 This recap of the Agent Factory podcast episode, hosted by Annie Wang with guest Ivan Nardini, explains how to evaluate autonomous agents using a practical, full-stack approach. It outlines what to measure — final outcomes, chain-of-thought, tool use, and memory — and contrasts measurement techniques: ground truth, LLM-as-a-judge, and human review. The post demonstrates a 5-step debugging loop using the Agent Development Kit (ADK) and describes how to scale evaluation to production with Vertex AI.
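As a generic illustration of the LLM-as-a-judge technique discussed in the episode (not the ADK's built-in evaluators), the sketch below uses the google-genai Python SDK to have Gemini grade an agent's answer against a simple rubric; the model ID and rubric wording are assumptions for the example.

```python
# Minimal LLM-as-a-judge sketch using the google-genai SDK (pip install google-genai).
# This is a generic illustration, not the ADK's built-in evaluation tooling;
# the model ID and rubric below are assumptions for the example.
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

def judge(question: str, agent_answer: str) -> str:
    """Ask Gemini to grade an agent's answer on a simple 1-5 rubric."""
    prompt = (
        "You are an impartial evaluator. Score the answer from 1 (poor) to 5 "
        "(excellent) for correctness and helpfulness, then justify briefly.\n\n"
        f"Question: {question}\n"
        f"Answer: {agent_answer}\n"
        "Respond as: SCORE: <1-5>\nREASON: <one sentence>"
    )
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # assumed model ID
        contents=prompt,
    )
    return response.text

print(judge("What is the capital of France?", "Paris, the largest city in France."))
```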
Fri, October 17, 2025
Moloco and Google Cloud Power AI Vector Search in Retail
🔎 Moloco’s AI-native retail media platform, integrated with Vertex AI Vector Search on Google Cloud, delivers semantic, real-time ad retrieval and personalized recommendations. The joint architecture uses TPUs and GPUs for model training and scoring while vector search runs efficiently on CPUs, enabling outcomes-based bidding at scale. Internal benchmarks report ~10x capacity, up to ~25% lower p95 latency, and a ~4% revenue uplift. The managed service reduces operational overhead and accelerates time-to-value for retailers.
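For context on the retrieval step, a minimal sketch of querying a deployed Vertex AI Vector Search index with the google-cloud-aiplatform SDK follows; the endpoint resource name, deployed index ID, and query embedding are placeholders rather than details of Moloco's deployment.

```python
# Minimal sketch: nearest-neighbor retrieval against a deployed Vertex AI
# Vector Search index (pip install google-cloud-aiplatform). The endpoint
# resource name, deployed index ID, and query embedding are placeholders,
# not details of Moloco's production setup.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/123/locations/us-central1/indexEndpoints/456"
)

# In practice the query vector would come from an embedding model
# (e.g., a text or product embedding); a dummy vector is used here.
query_embedding = [0.0] * 768

neighbors = index_endpoint.find_neighbors(
    deployed_index_id="my_deployed_index",
    queries=[query_embedding],
    num_neighbors=10,
)

for neighbor in neighbors[0]:
    print(neighbor.id, neighbor.distance)
```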
Thu, October 16, 2025
Vertex AI SDK Adds Prompt Management for Enterprises
🛠️ Google Cloud announced General Availability of Prompt Management in the Vertex AI SDK, enabling teams to programmatically create, version, and manage prompts as first-class assets. The capability bridges Vertex AI Studio’s visual prompt design with SDK-driven automation to improve collaboration, reproducibility, and lifecycle control. Enterprise security and compliance are supported via CMEK and VPC-SC, and the SDK exposes simple Python methods to create, list, update, and delete prompt resources tied to models such as gemini-2.5-flash. Get started using the documented code examples to centralize prompt governance and scale generative AI workflows.
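A minimal sketch of that SDK flow is shown below. It follows the preview-era prompts module, so module paths and argument names may differ slightly in the GA release; check the SDK reference before relying on them.

```python
# Minimal sketch of programmatic prompt management in the Vertex AI SDK
# (pip install google-cloud-aiplatform). Module paths and argument names
# follow the preview-era prompts API and may differ slightly in GA.
import vertexai
from vertexai.preview import prompts
from vertexai.preview.prompts import Prompt

vertexai.init(project="my-project", location="us-central1")

# Create a prompt resource with a template variable and a target model.
prompt = Prompt(
    prompt_name="product-summary",
    prompt_data="Summarize the following product description: {description}",
    variables=[{"description": "A lightweight, waterproof hiking jacket."}],
    model_name="gemini-2.5-flash",
)

# Save it as a versioned, first-class asset in Vertex AI.
saved = prompts.create_version(prompt=prompt)
print(saved.prompt_id)

# List, retrieve, and delete work the same way.
for metadata in prompts.list():
    print(metadata.prompt_id, metadata.display_name)

retrieved = prompts.get(prompt_id=saved.prompt_id)
prompts.delete(prompt_id=saved.prompt_id)
```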
Wed, October 15, 2025
Ultimate Prompting Guide for Veo 3.1 on Vertex AI Preview
🎬 This guide introduces Veo 3.1, Google Cloud's improved generative video model available in preview on Vertex AI, and explains how to move beyond "prompt and pray" toward deliberate creative control. It highlights core capabilities—high-fidelity 720p/1080p output, variable clip lengths, synchronized dialogue and sound effects, and stronger image-to-video fidelity. The article presents a five-part prompting formula and detailed techniques for cinematography, soundstage direction, negative prompting, and timestamped scenes. It also describes advanced multi-step workflows that pair Veo with Gemini 2.5 Flash Image to produce consistent characters and controlled transitions, and notes SynthID watermarking and certain current limitations.
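To show what a prompt built along those lines looks like in code, the sketch below submits a structured prompt to Veo through the google-genai SDK. The model ID, output bucket, and configuration fields are assumptions for illustration and should be checked against the Veo 3.1 preview documentation.

```python
# Minimal sketch: submitting a structured Veo prompt through the google-genai
# SDK (pip install google-genai). The model ID, bucket, and config fields are
# assumptions for illustration; check the Veo 3.1 preview docs for exact values.
import time
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# A prompt assembled along the lines of the guide's formula:
# subject + action + setting + cinematography + audio direction.
prompt = (
    "A weathered lighthouse keeper climbs a spiral staircase at dusk, "
    "lantern in hand. Slow dolly-in, 35mm lens, warm practical lighting. "
    "Audio: wind, creaking wood, distant foghorn."
)

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # assumed preview model ID
    prompt=prompt,
    config=types.GenerateVideosConfig(
        aspect_ratio="16:9",
        number_of_videos=1,
        output_gcs_uri="gs://my-bucket/veo-output/",  # placeholder bucket
    ),
)

# Video generation is long-running; poll until the operation completes.
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

print(operation.response.generated_videos[0].video.uri)
```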
Wed, October 15, 2025
Vertex AI Context Caching: Reduce Cost and Latency
⚡ Vertex AI context caching saves and reuses precomputed input tokens so developers avoid repeatedly sending and recomputing long contextual content, reducing latency and cost for large-context AI applications. It provides implicit caching — automatic, default, short-lived KV caches (deleted within 24 hours) integrated with Provisioned Throughput — and explicit CachedContent objects that are paid once and then reused at a deep discount with optional CMEK protection. Caches support multimodal inputs and very large context windows.
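A minimal sketch of explicit caching with the google-genai SDK is shown below; the model ID, TTL, and document URI are placeholders for illustration.

```python
# Minimal sketch of explicit context caching with the google-genai SDK
# (pip install google-genai). The model ID, TTL, and document URI are
# placeholders for illustration.
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Create an explicit cache from a large document plus a system instruction.
cache = client.caches.create(
    model="gemini-2.5-flash",  # assumed model ID
    config=types.CreateCachedContentConfig(
        system_instruction="You are a contracts analyst. Answer from the document.",
        contents=[
            types.Content(
                role="user",
                parts=[
                    types.Part.from_uri(
                        file_uri="gs://my-bucket/contract.pdf",  # placeholder
                        mime_type="application/pdf",
                    )
                ],
            )
        ],
        ttl="3600s",  # keep the cache for one hour
    ),
)

# Subsequent requests reference the cache instead of resending the document.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the termination clauses.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```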
Tue, October 14, 2025
Google Cloud Adds AI Annotations and Object Contexts
🧠 Google Cloud is introducing two Cloud Storage features—auto annotate and object contexts—that apply pretrained AI to generate metadata and attach custom key-value tags to stored objects. Auto annotate (experimental) produces image annotations such as object detection, labels, and objectionable-content signals tied to an object's lifecycle. Object contexts (preview) let teams add, manage, and query contextual tags with IAM controls and Storage Insights integration. Together they enable scalable discovery, curation, and governance of previously unanalyzed unstructured “dark data.”
Mon, October 13, 2025
Google Introduces LLM-Evalkit for Prompt Engineering
🧭 LLM-Evalkit is an open-source, lightweight application from Google that centralizes and streamlines prompt engineering using Vertex AI SDKs. It provides a no-code interface for creating, versioning, testing, and benchmarking prompts while tracking objective performance metrics. The tool promotes a dataset-driven evaluation workflow—define the task, assemble representative test cases, and score outputs against clear metrics—to replace ad-hoc iteration and subjective comparisons. Documentation and a guided console tutorial are available to help teams adopt the framework and reproduce experiments.
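As a plain-Python illustration of the dataset-driven workflow the tool encourages (not LLM-Evalkit's own API), the sketch below runs a small test set through a prompt and scores outputs with a simple exact-match metric; the test cases, prompt template, and metric are assumptions for the example.

```python
# Plain-Python illustration of a dataset-driven prompt evaluation loop,
# in the spirit of LLM-Evalkit's workflow. This is NOT the tool's API;
# the test cases, prompt template, and exact-match metric are assumptions.
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

PROMPT_TEMPLATE = "Extract the city mentioned in this sentence: {text}\nCity:"

# A small, representative test set with expected outputs.
test_cases = [
    {"text": "Our Berlin office opens in May.", "expected": "Berlin"},
    {"text": "She flew to Nairobi for the summit.", "expected": "Nairobi"},
    {"text": "The launch event happens in São Paulo.", "expected": "São Paulo"},
]

def run_case(case: dict) -> bool:
    """Run one test case and score it with a simple exact-match metric."""
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # assumed model ID
        contents=PROMPT_TEMPLATE.format(text=case["text"]),
    )
    prediction = response.text.strip()
    return prediction == case["expected"]

scores = [run_case(case) for case in test_cases]
print(f"Exact-match accuracy: {sum(scores) / len(scores):.0%}")
```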
Thu, October 9, 2025
Google Introduces Gemini Enterprise for the Workplace
🚀 Gemini Enterprise is presented as Google’s unified, enterprise-grade AI front door that integrates advanced models, a no-code workbench, pre-built and customizable agents, secure data connectors, centralized governance, and an open partner ecosystem. The chat-first interface works across Google Workspace and Microsoft 365 and adds multimodal agents for text, image, video, and speech. Google highlights developer tooling, open agent protocols, agent monetization, and customer deployments to accelerate end-to-end workflow automation and auditable governance.
Tue, October 7, 2025
Startup Technical Guide: Building Production AI Agents
🤖 Google Cloud published the Startup Technical Guide: AI Agents, a practical, operations-driven roadmap to design, build, and operate agentic systems for startups. The guide outlines three paths — build with the open-source Agent Development Kit (ADK), design no-code agents in Agentspace, or adopt managed and partner agents via Vertex AI and the Agent Garden marketplace. It details four development steps (identity, prime directive, tools, lifecycle), highlights operational rigor (AgentOps), and promotes interoperability through standards such as MCP and A2A, all aimed at safe production deployment.
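To make the ADK path concrete, a minimal agent sketch loosely following the guide's four steps is shown below; the tool function and instruction text are assumptions for the example.

```python
# Minimal ADK agent sketch (pip install google-adk), loosely following the
# guide's four steps: identity, prime directive, tools, lifecycle. The tool
# function and instruction text are assumptions for illustration.
from google.adk.agents import Agent

def check_order_status(order_id: str) -> dict:
    """Hypothetical tool: look up an order's shipping status."""
    # In a real agent this would call an order-management API.
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    # Identity: who the agent is and which model powers it.
    name="support_agent",
    model="gemini-2.5-flash",  # assumed model ID
    description="Answers customer questions about order status.",
    # Prime directive: the agent's core instruction.
    instruction=(
        "Help customers track orders. Use the check_order_status tool for "
        "any order lookup and never guess an order's status."
    ),
    # Tools: capabilities the agent can invoke.
    tools=[check_order_status],
)

# Lifecycle: run locally with `adk run` or `adk web`, then evaluate and
# deploy (for example, to Vertex AI Agent Engine) once it behaves as expected.
```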
Tue, October 7, 2025
150 AI Use Cases from Startups Leveraging Google Cloud
🤖 At the AI Builders Forum, Google Cloud highlighted 150 startups using its generative AI stack—Vertex AI, Gemini, GKE, and Cloud Storage—to build agentic systems, healthcare models, developer tools, and media pipelines. The post catalogs companies across sectors (healthcare, finance, retail, security, creative) and describes technical integrations such as fine-tuning with Gemini, inference on GKE, and scalable analytics with BigQuery. It encourages startups to join Google for Startups Cloud and references a new Startup Technical Guide: AI Agents for building and scaling agentic applications.
Mon, October 6, 2025
Vertex AI Model Garden Adds Self-Deploy Proprietary Models
🔐 Google Cloud’s Vertex AI now supports secure self-deployment of proprietary third-party models directly into customer VPCs via the Model Garden. Customers can discover, license, and deploy closed-source and restricted-license models from partners such as AI21 Labs, Mistral AI, Qodo, and others, with one-click provisioning and managed inference. Deployments adhere to VPC-SC controls, selectable regions, autoscaling, and pay-as-you-go billing. This central catalog brings Google, open, and partner models together for enterprise-grade control and compliance.
Fri, October 3, 2025
Dataproc ML library: Connect Spark to Gemini and Vertex
🔗 Google has released an open-source Python library, Dataproc ML, to streamline running ML and generative-AI inference from Apache Spark on Dataproc. The library uses a SparkML-style builder pattern so users can configure a model handler (for example, GenAiModelHandler) and call .transform() to apply Gemini or other Vertex AI models directly to DataFrames. It also supports loading PyTorch and TensorFlow model artifacts from GCS for large-scale batch inference and includes performance optimizations such as vectorized data transfer, connection reuse, and automatic retry/backoff.
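A rough sketch of the builder pattern described above follows. Only GenAiModelHandler and .transform() are named in the announcement; the import path, builder methods, and column names below are assumptions for illustration, so consult the Dataproc ML documentation for the actual API.

```python
# Rough sketch of the SparkML-style builder pattern described above.
# Only GenAiModelHandler and .transform() are named in the announcement;
# the import path, builder methods, and column names below are assumptions,
# so check the Dataproc ML documentation for the actual API.
from pyspark.sql import SparkSession
from dataproc_ml import GenAiModelHandler  # assumed import path

spark = SparkSession.builder.appName("genai-batch-inference").getOrCreate()

# A DataFrame with one text column to send to the model.
reviews_df = spark.createDataFrame(
    [("The hiking boots were sturdy but ran small.",)],
    ["review_text"],
)

# Configure a handler that applies Gemini on Vertex AI to each row.
handler = (
    GenAiModelHandler()                     # assumed constructor
    .model("gemini-2.5-flash")              # assumed builder method
    .prompt("Classify the sentiment of this review: {review_text}")  # assumed
    .output_column("sentiment")             # assumed builder method
)

# .transform() applies the model to the DataFrame, as in SparkML.
result_df = handler.transform(reviews_df)
result_df.show(truncate=False)
```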
Thu, October 2, 2025
Google Cloud Releases Generative Media Models on Vertex AI
🎨 Google Cloud announced General Availability and feature updates for its generative media models on Vertex AI, including Gemini 2.5 Flash Image, Veo 3, Imagen 4, and Gemini 2.5 TTS. The release emphasizes production readiness and enterprise security while adding multi‑aspect ratio image generation, batch image processing, vertical 9:16 video formats with precise duration controls, and studio‑quality multi‑speaker text‑to‑speech across 70+ languages. These enhancements target teams seeking faster, controlled, and scalable cross‑format media workflows for sight, sound, and motion.
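As a brief illustration of the multi-aspect-ratio image generation, the sketch below requests a vertical image through the google-genai SDK; the Imagen model ID and config values are assumptions for the example.

```python
# Minimal sketch: image generation with an explicit aspect ratio via the
# google-genai SDK (pip install google-genai). The Imagen model ID and
# config values are assumptions for illustration.
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

result = client.models.generate_images(
    model="imagen-4.0-generate-001",  # assumed model ID
    prompt="Studio product shot of a ceramic teapot on a linen backdrop",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        aspect_ratio="9:16",  # vertical format
    ),
)

# Write the generated image to a local file.
image_bytes = result.generated_images[0].image.image_bytes
with open("teapot.png", "wb") as f:
    f.write(image_bytes)
```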