All news with #open-weight models tag
Thu, November 20, 2025
CrowdStrike: Political Triggers Reduce AI Code Security
🔍 DeepSeek-R1, a 671B-parameter open-source LLM, produced code with significantly more severe security vulnerabilities when prompts included politically sensitive modifiers. CrowdStrike found baseline vulnerable outputs at 19%, rising to 27.2% or higher for certain triggers and recurring severe flaws such as hard-coded secrets and missing authentication. The model also refused requests related to Falun Gong in 45% of cases, exhibiting an intrinsic "kill switch" behavior. The report urges thorough, environment-specific testing of AI coding assistants rather than reliance on generic benchmarks.
Wed, November 19, 2025
Amazon Bedrock Adds Support for OpenAI GPT OSS Models
🚀 Amazon Bedrock now supports importing custom weights for gpt-oss-120b and gpt-oss-20b, allowing customers to bring tuned OpenAI GPT OSS models into a fully managed, serverless environment. This capability eliminates the need to manage infrastructure or model serving while enabling deployment of text-to-text models for reasoning, agentic, and developer tasks. gpt-oss-120b is optimized for production and high-reasoning use cases; gpt-oss-20b targets lower-latency or specialized scenarios. The feature is generally available in US‑East (N. Virginia).
Tue, November 18, 2025
Fine-tuning MedGemma for Breast Tumor Classification
🧬 This guide demonstrates step-by-step fine-tuning of MedGemma (a Gemma 3 variant) to classify breast histopathology images using the public BreakHis dataset and a notebook-based workflow. It highlights practical choices—using an NVIDIA A100 40 GB, switching from FP16 to BF16 to avoid numerical overflows, and employing LoRA adapters for efficient training. The tutorial reports dramatic accuracy gains after merging LoRA adapters and points readers to runnable notebooks for reproducibility.
Mon, November 17, 2025
Hands-on with Gemma 3: Deploying Open Models on GCP
🚀 Google Cloud introduces hands-on labs for Gemma 3, a family of lightweight open models offering multimodal (text and image) capabilities and efficient performance on smaller hardware footprints. The labs present two deployment paths: a serverless approach using Cloud Run with GPU support, and a platform approach using GKE for scalable production environments. Choose Cloud Run for simplicity and cost-efficiency or GKE Autopilot for control and robust orchestration to move models from local testing to production.
Fri, November 14, 2025
Agent Factory Recap: Building Open Agentic Models End-to-End
🤖 This recap of The Agent Factory episode summarizes a conversation between Amit Maraj and Ravin Kumar (DeepMind) about building open-source agentic models. It highlights how agent training differs from standard ML, emphasizing trajectory-based data, a two-stage approach of supervised fine-tuning followed by reinforcement learning, and the paramount role of evaluation. Practical guidance includes defining a 50-example final exam up front and considering hybrid setups that use a powerful API like Gemini as a router alongside specialized open models.
Thu, November 13, 2025
Viasat KA-SAT Attack and Satellite Cybersecurity Lessons
🛰️ Cisco Talos revisits the Feb. 24, 2022 KA‑SAT incident where attackers abused a VPN appliance vulnerability to access management systems and deploy the AcidRain wiper. The malware erased modem and router firmware and configs, disrupting satellite communications for many Ukrainian users and unexpectedly severing remote monitoring for ~5,800 German Enercon wind turbines. The piece highlights forensic gaps, links to VPNFilter-era tooling, and the operational choices defenders face when repair or replacement are on the table.
Thu, November 13, 2025
Google Cloud expands Hugging Face support for AI developers
🤝 Google Cloud and Hugging Face are deepening their partnership to speed developer workflows and strengthen enterprise model deployments. A new gateway will cache Hugging Face models and datasets on Google Cloud so downloads take minutes, not hours, across Vertex AI and Google Kubernetes Engine. The collaboration adds native TPU support for open models and integrates Google Cloud’s threat intelligence and Mandiant scanning for models served through Vertex AI.
Sat, November 8, 2025
Microsoft Reveals Whisper Leak: Streaming LLM Side-Channel
🔒 Microsoft has disclosed a novel side-channel called Whisper Leak that can let a passive observer infer the topic of conversations with streaming language models by analyzing encrypted packet sizes and timings. Researchers at Microsoft (Bar Or, McDonald and the Defender team) demonstrate classifiers that distinguish targeted topics from background traffic with high accuracy across vendors including OpenAI, Mistral and xAI. Providers have deployed mitigations such as random-length response padding; Microsoft recommends avoiding sensitive topics on untrusted networks, using VPNs, or preferring non-streaming models and providers that implemented fixes.
Wed, October 29, 2025
Open-Source b3 Benchmark Boosts LLM Security Testing
🛡️ The UK AI Security Institute (AISI), Check Point and Lakera have launched b3, an open-source benchmark to assess and strengthen the security of backbone LLMs that power AI agents. b3 focuses on the specific LLM calls within agent workflows where malicious inputs can trigger harmful outputs, using 10 representative "threat snapshots" combined with a dataset of 19,433 adversarial attacks from Lakera’s Gandalf initiative. The benchmark surfaces vulnerabilities such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls, making LLM security more measurable, reproducible and comparable across models and applications.
Tue, October 28, 2025
Prisma AIRS 2.0: Unified Platform for Secure AI Agents
🔒 Prisma AIRS 2.0 is a unified AI security platform that delivers end-to-end visibility, risk assessment and automated defenses across agents, models and development pipelines. It consolidates Protect AI capabilities to provide posture and runtime protections for AI agents, model scanning and API-first controls for MLOps. The platform also offers continuous, autonomous red teaming and a managed MCP Server to embed threat detection into workflows.
Tue, October 21, 2025
DeepSeek Privacy and Security: What Users Should Know
🔒 DeepSeek collects extensive interaction data — chats, images and videos — plus account details, IP address and device/browser information, and retains it for an unspecified period under a vague “retain as long as needed” policy. The service operates under Chinese jurisdiction, so stored chats may be accessible to local authorities and have been observed on China Mobile servers. Users can disable model training in web and mobile Data settings, export or delete chats (export is web-only), or run the open-source model locally to avoid server-side retention, but local deployment and deletion have trade-offs and require device protections.
Tue, October 21, 2025
The Signals Loop: Fine-tuning for AI Apps and Agents
🔁 Microsoft positions the signals loop — continuous capture of user interactions and telemetry with systematic fine‑tuning — as essential for building adaptive, reliable AI apps and agents. The post explains that simple RAG and prompting approaches often lack the accuracy and engagement needed for complex use cases, and that continuous learning drives sustained improvements. It highlights Dragon Copilot and GitHub Copilot as examples where telemetry‑driven fine‑tuning yielded substantial performance and experience gains, and presents Azure AI Foundry as a unified platform to operationalize these feedback loops at scale.
Tue, October 14, 2025
Microsoft launches ExCyTIn-Bench to benchmark AI security
🛡️ Microsoft released ExCyTIn-Bench, an open-source benchmarking tool to evaluate how well AI systems perform realistic cybersecurity investigations. It simulates a multistage Azure SOC using 57 Microsoft Sentinel log tables and measures multistep reasoning, tool usage, and evidence synthesis. The benchmark offers fine-grained, actionable metrics for CISOs, product owners, and researchers.
Thu, September 25, 2025
Enabling AI Sovereignty Through Choice and Openness Globally
🌐 Cloudflare argues that AI sovereignty should mean choice: the ability for nations to control data, select models, and deploy applications without vendor lock-in. Through its distributed edge network and serverless Workers AI, Cloudflare promotes accessible, low-cost deployment and inference close to users. The company hosts regional open-source models—India’s IndicTrans2, Japan’s PLaMo-Embedding-1B, and Singapore’s SEA-LION v4-27B—and offers an AI Gateway to connect diverse models. Open standards, interoperability, and pay-as-you-go economics are presented as central to resilient national AI strategies.
Thu, September 18, 2025
Amazon Bedrock Adds Four Qwen3 Open-Weight Models Now
🤖 Amazon Web Services added four Qwen3 open-weight foundation models to Amazon Bedrock as fully managed, serverless offerings. The lineup—Qwen3-Coder-480B-A35B-Instruct, Qwen3-Coder-30B-A3B-Instruct, Qwen3-235B-A22B-Instruct-2507, and Qwen3-32B—covers both dense and Mixture-of-Experts (MoE) architectures. The coder variants specialize in agentic coding, function calling, and tool use, while the 235B and 32B models provide general reasoning and efficient dense computation. These models are available now across multiple AWS regions, enabling developers to build advanced AI applications without managing infrastructure.
Thu, September 18, 2025
DeepSeek-V3.1 Available as Fully Managed in Bedrock
🔍 DeepSeek-V3.1 is now available as a fully managed foundation model in Amazon Bedrock, offering an open-weight option designed for enterprise deployment. The model supports a selectable 'thinking' mode for step-by-step analysis and a faster non-thinking mode for quicker replies, with improved multilingual accuracy and reduced hallucinations. Enhanced tool-calling, transparent reasoning, and strong coding and analytical performance make it well suited for building AI agents, automating workflows, and tackling complex technical tasks. DeepSeek-V3.1 is available in US West (Oregon), Asia Pacific (Tokyo, Mumbai), and Europe (London, Stockholm).
Thu, September 18, 2025
AWS Bedrock Adds OpenAI Open‑Weight Models in Eight Regions
🚀 AWS has expanded availability of OpenAI open weight models on AWS Bedrock to eight additional AWS Regions worldwide. The update brings the models to US East (N. Virginia), Asia Pacific (Tokyo, Mumbai), Europe (Stockholm, Ireland, London, Milan) and South America (São Paulo), alongside existing US West (Oregon) support. This broader footprint aims to lower latency, improve model performance and help customers meet data residency requirements. To get started, use the Amazon Bedrock console or consult the documentation.
Tue, September 16, 2025
Gemini and Open-Source Text Embeddings Now in BigQuery ML
🚀 Google expanded BigQuery ML to generate embeddings from Gemini and over 13,000 open-source text-embedding models via Hugging Face, all callable with simple SQL. The post summarizes model tiers to help teams trade off quality, cost, and scalability, and introduces Gemini's Tokens Per Minute (TPM) quota for throughput control. It shows a practical workflow to deploy OSS models to Vertex AI endpoints, run ML.GENERATE_EMBEDDING for batch jobs, and undeploy to minimize idle costs, plus a Colab tutorial and cost/scale guidance.
Thu, August 28, 2025
Gemini Available On-Premises with Google Distributed Cloud
🚀 Gemini on Google Distributed Cloud (GDC) is now generally available for customers, bringing Google’s advanced Gemini models on‑premises with GA for air‑gapped deployments and a connected preview. The solution provides managed Gemini endpoints with zero‑touch updates, automatic load balancing and autoscaling, and integrates with Vertex AI and preview agents. It pairs Gemini 2.5 Flash and Pro with NVIDIA Hopper and Blackwell accelerators and includes audit logging, access controls, and support for Confidential Computing (Intel TDX and NVIDIA) to meet strict data residency, sovereignty, and compliance requirements.
Wed, August 27, 2025
AI-Generated Ransomware 'PromptLock' Uses OpenAI Model
🔒 ESET disclosed a new proof-of-concept ransomware called PromptLock that uses OpenAI's gpt-oss:20b model via the Ollama API to generate malicious Lua scripts in real time. Written in Golang, the strain produces cross-platform scripts that enumerate files, exfiltrate selected data, and encrypt targets using SPECK 128-bit. ESET warned that AI-generated scripts can vary per execution, complicating detection and IoC reuse.