All news with #open-weight models tag

Thu, November 20, 2025

CrowdStrike: Political Triggers Reduce AI Code Security

#AI Security #Open-Weight Models #DeepSeek #CrowdStrike #Training Data Leakage #Dataset Integrity #Safety Guardrails

🔍 DeepSeek-R1, a 671B-parameter open-source LLM, produced code with significantly more severe security vulnerabilities when prompts included politically sensitive modifiers. CrowdStrike found baseline vulnerable outputs at 19%, rising to 27.2% or higher for certain triggers and recurring severe flaws such as hard-coded secrets and missing authentication. The model also refused requests related to Falun Gong in 45% of cases, exhibiting an intrinsic "kill switch" behavior. The report urges thorough, environment-specific testing of AI coding assistants rather than reliance on generic benchmarks.

Wed, November 19, 2025

Amazon Bedrock Adds Support for OpenAI GPT OSS Models

#AWS #Amazon Bedrock #OpenAI #Open-Weight Models #Product Release

🚀 Amazon Bedrock now supports importing custom weights for gpt-oss-120b and gpt-oss-20b, allowing customers to bring tuned OpenAI GPT OSS models into a fully managed, serverless environment. This capability eliminates the need to manage infrastructure or model serving while enabling deployment of text-to-text models for reasoning, agentic, and developer tasks. gpt-oss-120b is optimized for production and high-reasoning use cases; gpt-oss-20b targets lower-latency or specialized scenarios. The feature is generally available in US‑East (N. Virginia).

Tue, November 18, 2025

Fine-tuning MedGemma for Breast Tumor Classification

#Google #Open-Weight Models #MedGemma #LoRA #bfloat16

🧬 This guide demonstrates step-by-step fine-tuning of MedGemma (a Gemma 3 variant) to classify breast histopathology images using the public BreakHis dataset and a notebook-based workflow. It highlights practical choices—using an NVIDIA A100 40 GB, switching from FP16 to BF16 to avoid numerical overflows, and employing LoRA adapters for efficient training. The tutorial reports dramatic accuracy gains after merging LoRA adapters and points readers to runnable notebooks for reproducibility.

Mon, November 17, 2025

Hands-on with Gemma 3: Deploying Open Models on GCP

#Google #GKE #Open-Weight Models #Cloud Run #Gemma 3

🚀 Google Cloud introduces hands-on labs for Gemma 3, a family of lightweight open models offering multimodal (text and image) capabilities and efficient performance on smaller hardware footprints. The labs present two deployment paths: a serverless approach using Cloud Run with GPU support, and a platform approach using GKE for scalable production environments. Choose Cloud Run for simplicity and cost-efficiency or GKE Autopilot for control and robust orchestration to move models from local testing to production.

Fri, November 14, 2025

Agent Factory Recap: Building Open Agentic Models End-to-End

#Agentic AI #Open-Weight Models #Google #Gemini #Model Evaluation Coverage #Dataset Integrity

🤖 This recap of The Agent Factory episode summarizes a conversation between Amit Maraj and Ravin Kumar (DeepMind) about building open-source agentic models. It highlights how agent training differs from standard ML, emphasizing trajectory-based data, a two-stage approach of supervised fine-tuning followed by reinforcement learning, and the paramount role of evaluation. Practical guidance includes defining a 50-example final exam up front and considering hybrid setups that use a powerful API like Gemini as a router alongside specialized open models.

Thu, November 13, 2025

Viasat KA-SAT Attack and Satellite Cybersecurity Lessons

#Ransomware #Open-Weight Models #Viasat #AcidRain

🛰️ Cisco Talos revisits the Feb. 24, 2022 KA‑SAT incident where attackers abused a VPN appliance vulnerability to access management systems and deploy the AcidRain wiper. The malware erased modem and router firmware and configs, disrupting satellite communications for many Ukrainian users and unexpectedly severing remote monitoring for ~5,800 German Enercon wind turbines. The piece highlights forensic gaps, links to VPNFilter-era tooling, and the operational choices defenders face when repair or replacement are on the table.

Thu, November 13, 2025

Google Cloud expands Hugging Face support for AI developers

#Hugging Face #Google #Vertex AI #TPU #Mandiant #AI Security #Open-Weight Models

🤝 Google Cloud and Hugging Face are deepening their partnership to speed developer workflows and strengthen enterprise model deployments. A new gateway will cache Hugging Face models and datasets on Google Cloud so downloads take minutes, not hours, across Vertex AI and Google Kubernetes Engine. The collaboration adds native TPU support for open models and integrates Google Cloud’s threat intelligence and Mandiant scanning for models served through Vertex AI.

Sat, November 8, 2025

Microsoft Reveals Whisper Leak: Streaming LLM Side-Channel

#AI Security #Inference Security #Open-Weight Models #Safety Guardrails #OpenAI #Mistral

🔒 Microsoft has disclosed a novel side-channel called Whisper Leak that can let a passive observer infer the topic of conversations with streaming language models by analyzing encrypted packet sizes and timings. Researchers at Microsoft (Bar Or, McDonald and the Defender team) demonstrate classifiers that distinguish targeted topics from background traffic with high accuracy across vendors including OpenAI, Mistral and xAI. Providers have deployed mitigations such as random-length response padding; Microsoft recommends avoiding sensitive topics on untrusted networks, using VPNs, or preferring non-streaming models and providers that implemented fixes.

Wed, October 29, 2025

Open-Source b3 Benchmark Boosts LLM Security Testing

#AI Security #Agentic AI #Model Evaluation Coverage #Open-Weight Models #System Prompt Exposure #Prompt Leakage

🛡️ The UK AI Security Institute (AISI), Check Point and Lakera have launched b3, an open-source benchmark to assess and strengthen the security of backbone LLMs that power AI agents. b3 focuses on the specific LLM calls within agent workflows where malicious inputs can trigger harmful outputs, using 10 representative "threat snapshots" combined with a dataset of 19,433 adversarial attacks from Lakera’s Gandalf initiative. The benchmark surfaces vulnerabilities such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls, making LLM security more measurable, reproducible and comparable across models and applications.

Tue, October 28, 2025

Prisma AIRS 2.0: Unified Platform for Secure AI Agents

#Palo Alto Networks #Product Release #AI Security #Agentic AI #AI Red Teaming #Open-Weight Models #AI Supply Chain #Model Backdooring

🔒 Prisma AIRS 2.0 is a unified AI security platform that delivers end-to-end visibility, risk assessment and automated defenses across agents, models and development pipelines. It consolidates Protect AI capabilities to provide posture and runtime protections for AI agents, model scanning and API-first controls for MLOps. The platform also offers continuous, autonomous red teaming and a managed MCP Server to embed threat detection into workflows.

Tue, October 21, 2025

DeepSeek Privacy and Security: What Users Should Know

#AI Security #AI Data Leakage #DeepSeek #LM Studio #Open-Weight Models #Data Retention

🔒 DeepSeek collects extensive interaction data — chats, images and videos — plus account details, IP address and device/browser information, and retains it for an unspecified period under a vague “retain as long as needed” policy. The service operates under Chinese jurisdiction, so stored chats may be accessible to local authorities and have been observed on China Mobile servers. Users can disable model training in web and mobile Data settings, export or delete chats (export is web-only), or run the open-source model locally to avoid server-side retention, but local deployment and deletion have trade-offs and require device protections.

Tue, October 21, 2025

The Signals Loop: Fine-tuning for AI Apps and Agents

#Azure #Azure AI Foundry #Fine-Tuning #Agentic AI #Open-Weight Models

🔁 Microsoft positions the signals loop — continuous capture of user interactions and telemetry with systematic fine‑tuning — as essential for building adaptive, reliable AI apps and agents. The post explains that simple RAG and prompting approaches often lack the accuracy and engagement needed for complex use cases, and that continuous learning drives sustained improvements. It highlights Dragon Copilot and GitHub Copilot as examples where telemetry‑driven fine‑tuning yielded substantial performance and experience gains, and presents Azure AI Foundry as a unified platform to operationalize these feedback loops at scale.

Tue, October 14, 2025

Microsoft launches ExCyTIn-Bench to benchmark AI security

#AI Security #Microsoft #AI Evals #Microsoft Security Copilot #Microsoft Sentinel #Open-Weight Models #Hugging Face

🛡️ Microsoft released ExCyTIn-Bench, an open-source benchmarking tool to evaluate how well AI systems perform realistic cybersecurity investigations. It simulates a multistage Azure SOC using 57 Microsoft Sentinel log tables and measures multistep reasoning, tool usage, and evidence synthesis. The benchmark offers fine-grained, actionable metrics for CISOs, product owners, and researchers.

Thu, September 25, 2025

Enabling AI Sovereignty Through Choice and Openness Globally

#Cloudflare #Workers AI #Open-Weight Models #AI Governance #Embeddings #Retrieval-Augmented Generation #Hugging Face

🌐 Cloudflare argues that AI sovereignty should mean choice: the ability for nations to control data, select models, and deploy applications without vendor lock-in. Through its distributed edge network and serverless Workers AI, Cloudflare promotes accessible, low-cost deployment and inference close to users. The company hosts regional open-source models—India’s IndicTrans2, Japan’s PLaMo-Embedding-1B, and Singapore’s SEA-LION v4-27B—and offers an AI Gateway to connect diverse models. Open standards, interoperability, and pay-as-you-go economics are presented as central to resilient national AI strategies.

Thu, September 18, 2025

Amazon Bedrock Adds Four Qwen3 Open-Weight Models Now

#AWS #Amazon Bedrock #Open-Weight Models #Agentic AI #Product Release

🤖 Amazon Web Services added four Qwen3 open-weight foundation models to Amazon Bedrock as fully managed, serverless offerings. The lineup—Qwen3-Coder-480B-A35B-Instruct, Qwen3-Coder-30B-A3B-Instruct, Qwen3-235B-A22B-Instruct-2507, and Qwen3-32B—covers both dense and Mixture-of-Experts (MoE) architectures. The coder variants specialize in agentic coding, function calling, and tool use, while the 235B and 32B models provide general reasoning and efficient dense computation. These models are available now across multiple AWS regions, enabling developers to build advanced AI applications without managing infrastructure.

Thu, September 18, 2025

DeepSeek-V3.1 Available as Fully Managed in Bedrock

#AWS #Amazon Bedrock #Open-Weight Models #Agentic AI #Function Calling

🔍 DeepSeek-V3.1 is now available as a fully managed foundation model in Amazon Bedrock, offering an open-weight option designed for enterprise deployment. The model supports a selectable 'thinking' mode for step-by-step analysis and a faster non-thinking mode for quicker replies, with improved multilingual accuracy and reduced hallucinations. Enhanced tool-calling, transparent reasoning, and strong coding and analytical performance make it well suited for building AI agents, automating workflows, and tackling complex technical tasks. DeepSeek-V3.1 is available in US West (Oregon), Asia Pacific (Tokyo, Mumbai), and Europe (London, Stockholm).

Thu, September 18, 2025

AWS Bedrock Adds OpenAI Open‑Weight Models in Eight Regions

#AWS #OpenAI #AWS Bedrock #Open-Weight Models #Data Residency

🚀 AWS has expanded availability of OpenAI open weight models on AWS Bedrock to eight additional AWS Regions worldwide. The update brings the models to US East (N. Virginia), Asia Pacific (Tokyo, Mumbai), Europe (Stockholm, Ireland, London, Milan) and South America (São Paulo), alongside existing US West (Oregon) support. This broader footprint aims to lower latency, improve model performance and help customers meet data residency requirements. To get started, use the Amazon Bedrock console or consult the documentation.

Tue, September 16, 2025

Gemini and Open-Source Text Embeddings Now in BigQuery ML

#Google Cloud #BigQuery #Vertex AI #Hugging Face #Gemini #Retrieval-Augmented Generation #Open-Weight Models

🚀 Google expanded BigQuery ML to generate embeddings from Gemini and over 13,000 open-source text-embedding models via Hugging Face, all callable with simple SQL. The post summarizes model tiers to help teams trade off quality, cost, and scalability, and introduces Gemini's Tokens Per Minute (TPM) quota for throughput control. It shows a practical workflow to deploy OSS models to Vertex AI endpoints, run ML.GENERATE_EMBEDDING for batch jobs, and undeploy to minimize idle costs, plus a Colab tutorial and cost/scale guidance.

Thu, August 28, 2025

Gemini Available On-Premises with Google Distributed Cloud

#Closed-Weight Models #Data Residency #Data Sovereignty #Google #Inference Security #Intel #NVIDIA #Open-Weight Models #Product Release #Vertex AI

🚀 Gemini on Google Distributed Cloud (GDC) is now generally available for customers, bringing Google’s advanced Gemini models on‑premises with GA for air‑gapped deployments and a connected preview. The solution provides managed Gemini endpoints with zero‑touch updates, automatic load balancing and autoscaling, and integrates with Vertex AI and preview agents. It pairs Gemini 2.5 Flash and Pro with NVIDIA Hopper and Blackwell accelerators and includes audit logging, access controls, and support for Confidential Computing (Intel TDX and NVIDIA) to meet strict data residency, sovereignty, and compliance requirements.

Wed, August 27, 2025

AI-Generated Ransomware 'PromptLock' Uses OpenAI Model

#AI Code Gen Risk #Data Exfil via Tools #Ollama #OpenAI #Open-Weight Models #Prompt Injection #Ransomware #Weak Cryptography

🔒 ESET disclosed a new proof-of-concept ransomware called PromptLock that uses OpenAI's gpt-oss:20b model via the Ollama API to generate malicious Lua scripts in real time. Written in Golang, the strain produces cross-platform scripts that enumerate files, exfiltrate selected data, and encrypt targets using SPECK 128-bit. ESET warned that AI-generated scripts can vary per execution, complicating detection and IoC reuse.