All news with #open-weight models tag
Wed, December 3, 2025
Adversarial Poetry Bypasses AI Guardrails Across Models
✍️ Researchers from Icaro Lab (DexAI), Sapienza University of Rome, and Sant’Anna School found that short poetic prompts can reliably subvert AI safety filters, in some cases achieving 100% success. Using 20 crafted poems and the MLCommons AILuminate benchmark across 25 proprietary and open models, they prompted systems to produce hazardous instructions — from weapons-grade plutonium to steps for deploying RATs. The team observed wide variance by vendor and model family, with some smaller models surprisingly more resistant. The study concludes that stylistic prompts exploit structural alignment weaknesses across providers.
Wed, December 3, 2025
Amazon SageMaker HyperPod Adds Checkpointless Training
🚀 Amazon SageMaker HyperPod now supports checkpointless training, a foundational capability that eliminates the need for checkpoint-based, job-level restarts for distributed model training. Checkpointless training preserves forward training state across the cluster, automatically swaps out failed nodes, and uses peer-to-peer state transfer to resume progress, reducing recovery time from hours to minutes. The feature can deliver up to 95% training goodput at very large scale, is available in all Regions where HyperPod runs, and can be enabled with zero code changes for popular recipes or with minimal PyTorch modifications for custom models.
Tue, December 2, 2025
Mistral Large 3 Now Available in Microsoft Foundry
🚀 Microsoft has added Mistral Large 3 to Foundry on Azure, offering a high-capability, Apache 2.0–licensed open-weight model optimized for production workloads. The model focuses on reliable instruction following, extended-context comprehension, strong multimodal reasoning, and reduced hallucination for enterprise scenarios. Foundry packages unified governance, observability, and agent-ready tooling, and allows weight export for hybrid or on-prem deployment.
Tue, December 2, 2025
Mistral Large 3 and Ministral 3 Now on Amazon Bedrock
🚀 Amazon Bedrock now offers Mistral Large 3 and the Ministral 3 family alongside additional Mistral AI checkpoints, giving customers early access to open-weight multimodal models. Mistral Large 3 employs a granular Mixture-of-Experts architecture with 41B active and 675B total parameters and supports a 256K context window for long-form comprehension and agentic workflows. The Ministral 3 series (14B, 8B, 3B) plus Voxtral and Magistral small models let developers choose scales optimized for production assistants, RAG systems, single-GPU edge deployment, or low-resource environments.
Tue, December 2, 2025
Amazon Bedrock Adds 18 Fully Managed Open Models Today
🚀 Amazon Bedrock expanded its model catalog with 18 new fully managed open-weight models, the largest single addition to date. The offering includes Gemma 3, Mistral Large 3, NVIDIA Nemotron Nano 2, OpenAI gpt-oss variants and other vendor models. Through a unified API, developers can evaluate, switch, and adopt these models in production without rewriting applications or changing infrastructure. Models are available in supported AWS Regions.
Tue, November 25, 2025
Four Ways AI Is Strengthening Democracies Worldwide
🗳️ The essay argues that while AI poses risks to democratic processes, it is also being used to strengthen civic engagement and government function across diverse contexts. Four case studies—Japan, Brazil, Germany, and the United States—illustrate practical deployments: AI avatars for constituent engagement, judicial workflow automation, interactive voter guides, and investigative tools for watchdog journalism. The authors recommend public AI like Switzerland’s Apertus as a democratic alternative to proprietary models and stress governance, transparency, and scientific evaluation to mitigate bias.
Tue, November 25, 2025
Cloudflare Hosts Black Forest Lab FLUX.2 on Workers AI
🖼️ Cloudflare now hosts Black Forest Lab's FLUX.2 image model on the Workers AI inference platform. The licensed dev release builds on the popular FLUX.1 lineage with stronger physical-world grounding, improved fidelity for faces, hands and small objects, and advanced multi-reference editing to preserve character and product consistency. Workers AI exposes FLUX.2 via multipart form-data (up to four 512×512 inputs) and returns images up to 4 megapixels, while supporting JSON prompting, hex color controls, multilingual prompts, and a server-side binding for integration into production pipelines.
Thu, November 20, 2025
CrowdStrike: Political Triggers Reduce AI Code Security
🔍 DeepSeek-R1, a 671B-parameter open-source LLM, produced code with significantly more severe security vulnerabilities when prompts included politically sensitive modifiers. CrowdStrike found baseline vulnerable outputs at 19%, rising to 27.2% or higher for certain triggers and recurring severe flaws such as hard-coded secrets and missing authentication. The model also refused requests related to Falun Gong in 45% of cases, exhibiting an intrinsic "kill switch" behavior. The report urges thorough, environment-specific testing of AI coding assistants rather than reliance on generic benchmarks.
Wed, November 19, 2025
Amazon Bedrock Adds Support for OpenAI GPT OSS Models
🚀 Amazon Bedrock now supports importing custom weights for gpt-oss-120b and gpt-oss-20b, allowing customers to bring tuned OpenAI GPT OSS models into a fully managed, serverless environment. This capability eliminates the need to manage infrastructure or model serving while enabling deployment of text-to-text models for reasoning, agentic, and developer tasks. gpt-oss-120b is optimized for production and high-reasoning use cases; gpt-oss-20b targets lower-latency or specialized scenarios. The feature is generally available in US‑East (N. Virginia).
Tue, November 18, 2025
Fine-tuning MedGemma for Breast Tumor Classification
🧬 This guide demonstrates step-by-step fine-tuning of MedGemma (a Gemma 3 variant) to classify breast histopathology images using the public BreakHis dataset and a notebook-based workflow. It highlights practical choices—using an NVIDIA A100 40 GB, switching from FP16 to BF16 to avoid numerical overflows, and employing LoRA adapters for efficient training. The tutorial reports dramatic accuracy gains after merging LoRA adapters and points readers to runnable notebooks for reproducibility.
Mon, November 17, 2025
Hands-on with Gemma 3: Deploying Open Models on GCP
🚀 Google Cloud introduces hands-on labs for Gemma 3, a family of lightweight open models offering multimodal (text and image) capabilities and efficient performance on smaller hardware footprints. The labs present two deployment paths: a serverless approach using Cloud Run with GPU support, and a platform approach using GKE for scalable production environments. Choose Cloud Run for simplicity and cost-efficiency or GKE Autopilot for control and robust orchestration to move models from local testing to production.
Fri, November 14, 2025
Agent Factory Recap: Building Open Agentic Models End-to-End
🤖 This recap of The Agent Factory episode summarizes a conversation between Amit Maraj and Ravin Kumar (DeepMind) about building open-source agentic models. It highlights how agent training differs from standard ML, emphasizing trajectory-based data, a two-stage approach of supervised fine-tuning followed by reinforcement learning, and the paramount role of evaluation. Practical guidance includes defining a 50-example final exam up front and considering hybrid setups that use a powerful API like Gemini as a router alongside specialized open models.
Thu, November 13, 2025
Viasat KA-SAT Attack and Satellite Cybersecurity Lessons
🛰️ Cisco Talos revisits the Feb. 24, 2022 KA‑SAT incident where attackers abused a VPN appliance vulnerability to access management systems and deploy the AcidRain wiper. The malware erased modem and router firmware and configs, disrupting satellite communications for many Ukrainian users and unexpectedly severing remote monitoring for ~5,800 German Enercon wind turbines. The piece highlights forensic gaps, links to VPNFilter-era tooling, and the operational choices defenders face when repair or replacement are on the table.
Thu, November 13, 2025
Google Cloud expands Hugging Face support for AI developers
🤝 Google Cloud and Hugging Face are deepening their partnership to speed developer workflows and strengthen enterprise model deployments. A new gateway will cache Hugging Face models and datasets on Google Cloud so downloads take minutes, not hours, across Vertex AI and Google Kubernetes Engine. The collaboration adds native TPU support for open models and integrates Google Cloud’s threat intelligence and Mandiant scanning for models served through Vertex AI.
Sat, November 8, 2025
Microsoft Reveals Whisper Leak: Streaming LLM Side-Channel
🔒 Microsoft has disclosed a novel side-channel called Whisper Leak that can let a passive observer infer the topic of conversations with streaming language models by analyzing encrypted packet sizes and timings. Researchers at Microsoft (Bar Or, McDonald and the Defender team) demonstrate classifiers that distinguish targeted topics from background traffic with high accuracy across vendors including OpenAI, Mistral and xAI. Providers have deployed mitigations such as random-length response padding; Microsoft recommends avoiding sensitive topics on untrusted networks, using VPNs, or preferring non-streaming models and providers that implemented fixes.
Wed, October 29, 2025
Open-Source b3 Benchmark Boosts LLM Security Testing
🛡️ The UK AI Security Institute (AISI), Check Point and Lakera have launched b3, an open-source benchmark to assess and strengthen the security of backbone LLMs that power AI agents. b3 focuses on the specific LLM calls within agent workflows where malicious inputs can trigger harmful outputs, using 10 representative "threat snapshots" combined with a dataset of 19,433 adversarial attacks from Lakera’s Gandalf initiative. The benchmark surfaces vulnerabilities such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls, making LLM security more measurable, reproducible and comparable across models and applications.
Tue, October 28, 2025
Prisma AIRS 2.0: Unified Platform for Secure AI Agents
🔒 Prisma AIRS 2.0 is a unified AI security platform that delivers end-to-end visibility, risk assessment and automated defenses across agents, models and development pipelines. It consolidates Protect AI capabilities to provide posture and runtime protections for AI agents, model scanning and API-first controls for MLOps. The platform also offers continuous, autonomous red teaming and a managed MCP Server to embed threat detection into workflows.
Tue, October 21, 2025
DeepSeek Privacy and Security: What Users Should Know
🔒 DeepSeek collects extensive interaction data — chats, images and videos — plus account details, IP address and device/browser information, and retains it for an unspecified period under a vague “retain as long as needed” policy. The service operates under Chinese jurisdiction, so stored chats may be accessible to local authorities and have been observed on China Mobile servers. Users can disable model training in web and mobile Data settings, export or delete chats (export is web-only), or run the open-source model locally to avoid server-side retention, but local deployment and deletion have trade-offs and require device protections.
Tue, October 21, 2025
The Signals Loop: Fine-tuning for AI Apps and Agents
🔁 Microsoft positions the signals loop — continuous capture of user interactions and telemetry with systematic fine‑tuning — as essential for building adaptive, reliable AI apps and agents. The post explains that simple RAG and prompting approaches often lack the accuracy and engagement needed for complex use cases, and that continuous learning drives sustained improvements. It highlights Dragon Copilot and GitHub Copilot as examples where telemetry‑driven fine‑tuning yielded substantial performance and experience gains, and presents Azure AI Foundry as a unified platform to operationalize these feedback loops at scale.
Tue, October 14, 2025
Microsoft launches ExCyTIn-Bench to benchmark AI security
🛡️ Microsoft released ExCyTIn-Bench, an open-source benchmarking tool to evaluate how well AI systems perform realistic cybersecurity investigations. It simulates a multistage Azure SOC using 57 Microsoft Sentinel log tables and measures multistep reasoning, tool usage, and evidence synthesis. The benchmark offers fine-grained, actionable metrics for CISOs, product owners, and researchers.