Tag Banner

All news with #vertex ai tag

Fri, October 17, 2025

Moloco and Google Cloud Power AI Vector Search in Retail

🔎 Moloco’s AI-native retail media platform, integrated with Vertex AI Vector Search on Google Cloud, delivers semantic, real-time ad retrieval and personalized recommendations. The joint architecture uses TPUs and GPUs for model training and scoring while vector search runs efficiently on CPUs, enabling outcomes-based bidding at scale. Internal benchmarks report ~10x capacity, up to ~25% lower p95 latency, and a ~4% revenue uplift. The managed service reduces operational overhead and accelerates time-to-value for retailers.

read more →

Thu, October 16, 2025

Vertex AI SDK Adds Prompt Management for Enterprises

🛠️ Google Cloud announced General Availability of Prompt Management in the Vertex AI SDK, enabling teams to programmatically create, version, and manage prompts as first-class assets. The capability bridges Vertex AI Studio’s visual prompt design with SDK-driven automation to improve collaboration, reproducibility, and lifecycle control. Enterprise security and compliance are supported via CMEK and VPCSC, and the SDK exposes simple Python methods to create, list, update, and delete prompt resources tied to models such as gemini-2.5-flash. Get started using the documented code examples to centralize prompt governance and scale generative AI workflows.

read more →

Wed, October 15, 2025

Ultimate Prompting Guide for Veo 3.1 on Vertex AI Preview

🎬 This guide introduces Veo 3.1, Google Cloud's improved generative video model available in preview on Vertex AI, and explains how to move beyond "prompt and pray" toward deliberate creative control. It highlights core capabilities—high-fidelity 720p/1080p output, variable clip lengths, synchronized dialogue and sound effects, and stronger image-to-video fidelity. The article presents a five-part prompting formula and detailed techniques for cinematography, soundstage direction, negative prompting, and timestamped scenes. It also describes advanced multi-step workflows that combine Gemini 2.5 Flash Image to produce consistent characters and controlled transitions, and notes SynthID watermarking and certain current limitations.

read more →

Wed, October 15, 2025

Vertex AI Context Caching: Reduce Cost and Latency

⚡ Vertex AI context caching saves and reuses precomputed input tokens so developers avoid repeatedly sending and recomputing long contextual content, reducing latency and cost for large-context AI applications. It provides implicit caching — automatic, default, short-lived KV caches (deleted within 24 hours) integrated with Provisioned Throughput — and explicit CachedContent objects that are paid once and then reused at a deep discount with optional CMEK protection. Caches support multimodal inputs and very large context windows.

read more →

Tue, October 14, 2025

Google Cloud Adds AI Annotations and Object Contexts

🧠 Google Cloud is introducing two Cloud Storage features—auto annotate and object contexts—that apply pretrained AI to generate metadata and attach custom key-value tags to stored objects. Auto annotate (experimental) produces image annotations such as object detection, labels, and objectionable-content signals tied to an object's lifecycle. Object contexts (preview) let teams add, manage, and query contextual tags with IAM controls and Storage Insights integration. Together they enable scalable discovery, curation, and governance of previously unanalyzed unstructured “dark data.”

read more →

Mon, October 13, 2025

Google Introduces LLM-Evalkit for Prompt Engineering

🧭 LLM-Evalkit is an open-source, lightweight application from Google that centralizes and streamlines prompt engineering using Vertex AI SDKs. It provides a no-code interface for creating, versioning, testing, and benchmarking prompts while tracking objective performance metrics. The tool promotes a dataset-driven evaluation workflow—define the task, assemble representative test cases, and score outputs against clear metrics—to replace ad-hoc iteration and subjective comparisons. Documentation and a guided console tutorial are available to help teams adopt the framework and reproduce experiments.

read more →

Thu, October 9, 2025

Google Introduces Gemini Enterprise for the Workplace

🚀 Gemini Enterprise is presented as Google’s unified, enterprise-grade AI front door that integrates advanced models, a no-code workbench, pre-built and customizable agents, secure data connectors, centralized governance, and an open partner ecosystem. The chat-first interface works across Google Workspace and Microsoft 365 and adds multimodal agents for text, image, video, and speech. Google highlights developer tooling, open agent protocols, agent monetization, and customer deployments to accelerate end-to-end workflow automation and auditable governance.

read more →

Tue, October 7, 2025

Startup Technical Guide: Building Production AI Agents

🤖 Google Cloud published the Startup technical guide: AI agents, a practical, operations-driven roadmap to design, build, and operate agentic systems for startups. The guide outlines three paths — build with the open-source Agent Development Kit (ADK), design no-code agents in Agentspace, or adopt managed and partner agents via Vertex AI and the Agent Garden marketplace. It details four development steps (identity, prime directive, tools, lifecycle), highlights operational rigor (AgentOps), and promotes interoperability through standards such as MCP and A2A, all aimed at safe production deployment.

read more →

Tue, October 7, 2025

150 AI Use Cases from Startups Leveraging Google Cloud

🤖 At the AI Builders Forum, Google Cloud highlighted 150 startups using its generative AI stack—Vertex AI, Gemini, GKE, and Cloud Storage—to build agentic systems, healthcare models, developer tools, and media pipelines. The post catalogs companies across sectors (healthcare, finance, retail, security, creative) and describes technical integrations such as fine-tuning with Gemini, inference on GKE, and scalable analytics with BigQuery. It encourages startups to join Google for Startups Cloud and references a new Startup Technical Guide: AI Agents for building and scaling agentic applications.

read more →

Mon, October 6, 2025

Vertex AI Model Garden Adds Self-Deploy Proprietary Models

🔐 Google Cloud’s Vertex AI now supports secure self-deployment of proprietary third-party models directly into customer VPCs via the Model Garden. Customers can discover, license, and deploy closed-source and restricted-license models from partners such as AI21 Labs, Mistral AI, Qodo and others, with one-click provisioning and managed inference. Deployments adhere to VPC-SC controls, selectable regions, autoscaling, and pay-as-you-go billing. This central catalog brings Google, open, and partner models together for enterprise-grade control and compliance.

read more →

Fri, October 3, 2025

Dataproc ML library: Connect Spark to Gemini and Vertex

🔗 Google has released an open-source Python library, Dataproc ML, to streamline running ML and generative-AI inference from Apache Spark on Dataproc. The library uses a SparkML-style builder pattern so users can configure a model handler (for example, GenAiModelHandler) and call .transform() to apply Gemini or other Vertex AI models directly to DataFrames. It also supports loading PyTorch and TensorFlow model artifacts from GCS for large-scale batch inference and includes performance optimizations such as vectorized data transfer, connection reuse, and automatic retry/backoff.

read more →

Thu, October 2, 2025

Google Cloud Releases Generative Media Models on Vertex AI

🎨Google Cloud announced General Availability and feature updates for its generative media models on Vertex AI, including Gemini 2.5 Flash Image, Veo 3, Imagen 4, and Gemini 2.5 TTS. The release emphasizes production readiness and enterprise security while adding multi‑aspect ratio image generation, batch image processing, vertical 9:16 video formats with precise duration controls, and studio‑quality multi‑speaker text‑to‑speech across 70+ languages. These enhancements target teams seeking faster, controlled, and scalable cross‑format media workflows for sight, sound, and motion.

read more →

Thu, October 2, 2025

Accelerate AI with Agents: EMEA Developer Series and Labs

🚀 Google Cloud is hosting a regional event series across EMEA to help developers and tech practitioners learn to build and scale AI agents. The program combines immersive, hands-on labs and expert-led workshops covering technologies such as Cloud Run, Vertex AI, Gemini, and the Agent Development Kit (ADK). Participants receive step-by-step guidance and practical exercises designed to accelerate agent deployments and operational readiness within organizations.

read more →

Mon, September 29, 2025

Anthropic's Claude Sonnet 4.5 Now Available on Vertex AI

🚀 Anthropic’s Claude Sonnet 4.5 is now generally available on Vertex AI, delivering advanced long-horizon autonomy for agents across coding, finance, research, and cybersecurity. The model can operate independently for hours, orchestrating tools and coordinating multiple agents to complete complex, multi-step tasks. Vertex AI provides orchestration, provisioning, security controls, and developer tooling, and includes Claude Code upgrades like a VS Code extension and an improved terminal interface.

read more →

Wed, September 24, 2025

Enabling Data Scientists to Become Agentic Architects

🧭 Google outlines an AI-native stack to transform data scientists into agentic architects, unifying development, real-time data access, and production-grade agent deployment. Enhancements to Colab Enterprise notebooks add native SQL cells, editable visualizations, and an interactive Data Science Agent that can orchestrate BigQuery ML, DataFrames, and Spark workflows. The Lightning Engine is now generally available to accelerate Spark, while previews for stateful BigQuery continuous queries and autonomous embedding generation bring real-time streaming and vector search into analytics. A 'Build-Deploy-Connect' toolkit, including the Agent Development Kit, MCP Toolbox, and Gemini CLI extensions, helps move notebook prototypes into secure, scalable agent fleets.

read more →

Wed, September 24, 2025

INDOT Used Google AI to Save 360 Hours and Meet Deadline

🚀 Indiana Department of Transportation built a week-long pilot on Google Cloud to meet a 30-day executive order, using a Retrieval-Augmented Generation workflow that combined rapid ETL, Vertex AI Search indexing, and Gemini. The system scraped and parsed decades of internal policies and manuals, produced draft reports across nine divisions with 98% fidelity, and saved an estimated 360 hours of manual effort, enabling INDOT to submit on time.

read more →

Wed, September 24, 2025

JS Bank modernizes with Google stack and ChromeOS rollout

🚀 JS Bank migrated its distributed IT estate to a unified Google ecosystem—deploying 1,500 Chromebooks and Chromeboxes while adopting Google Workspace and Chrome Enterprise Premium. The change delivered nearly 90% endpoint standardization, cut device management time by 40%, and halved daily support tickets. Built-in ChromeOS protections simplified security and reduced reliance on multiple third-party antivirus and anti-malware tools.

read more →

Tue, September 23, 2025

Deutsche Bank launches DB Lumina for AI research platform

🤖 DB Lumina is Deutsche Bank Research’s AI-powered assistant, built on Google Cloud and integrating multimodal Gemini models, RAG retrieval, and vector search. It provides a conversational chat interface, reusable prompt templates, and document-grounded answers with inline citations and enterprise guardrails for compliance. Early deployment to roughly 5,000 analysts has yielded measurable time savings, deeper analysis, and improved editorial accuracy.

read more →

Mon, September 22, 2025

Protect AI Development Using Falcon Cloud Security

🔒 Falcon Cloud Security provides end-to-end protection for AI development pipelines by embedding AI detection into CI/CD workflows, scanning container images, and surfacing AI-related packages and CVEs in real time. It extends visibility to cloud model services — including AWS SageMaker and Bedrock, Azure AI, and Google Vertex AI — revealing model provenance, dependencies, and API usage. Runtime inventory ties build-time detections to live containers so teams can prioritize fixes, govern models, and maintain delivery velocity without compromising security.

read more →

Thu, September 18, 2025

Seattle Children’s Uses AI to Accelerate Pediatric Care

🤖 Seattle Children’s partnered with Google Cloud to build Pathway Assistant, a multimodal AI chatbot that turns thousands of pediatric clinical pathway PDFs into conversational, searchable guidance. Using Vertex AI and Gemini, the assistant extracts JSON metadata, parses diagrams and flowcharts, and returns cited answers in seconds. The tool logs clinician feedback to BigQuery and stores source documents in Cloud Storage, enabling continuous improvement of documentation and metadata.

read more →