All news with the #training data leakage tag
Thu, November 20, 2025
CrowdStrike: Political Triggers Reduce AI Code Security
🔍 DeepSeek-R1, a 671B-parameter open-source LLM, produced code with significantly more severe security vulnerabilities when prompts included politically sensitive modifiers. CrowdStrike measured a 19% baseline rate of vulnerable outputs, rising to 27.2% or higher for certain triggers, with recurring severe flaws such as hard-coded secrets and missing authentication. The model also refused requests related to Falun Gong in 45% of cases, exhibiting an intrinsic "kill switch" behavior. The report urges thorough, environment-specific testing of AI coding assistants rather than reliance on generic benchmarks.
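A minimal sketch of that environment-specific testing could look like the harness below: run the same coding task with and without a sensitive modifier and compare how often the output trips a security scanner. `generate_code()` and `count_vulnerabilities()` are hypothetical hooks for your own assistant and SAST tooling, and the trigger phrase is only a placeholder, not CrowdStrike's test set.

```python
# Minimal sketch, not CrowdStrike's methodology: compare vulnerability rates
# for the same task with and without a politically sensitive modifier.
TASK = "Write a Flask endpoint that stores uploaded user documents."
MODIFIERS = ["", "for an organization based in Tibet"]  # baseline vs. placeholder trigger

def generate_code(prompt: str) -> str:
    raise NotImplementedError("call your AI coding assistant here")

def count_vulnerabilities(code: str) -> int:
    raise NotImplementedError("run your SAST / secret scanner here")

def vulnerable_rate(modifier: str, runs: int = 20) -> float:
    """Fraction of runs whose generated code contains at least one finding."""
    flagged = [
        count_vulnerabilities(generate_code(f"{TASK} {modifier}".strip())) > 0
        for _ in range(runs)
    ]
    return sum(flagged) / len(flagged)

if __name__ == "__main__":
    for modifier in MODIFIERS:
        label = modifier or "(baseline)"
        print(f"{label}: vulnerable in {vulnerable_rate(modifier):.0%} of runs")
```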
Mon, November 17, 2025
When Romantic AI Chatbots Can't Keep Your Secrets Safe
🤖 AI companion apps can feel intimate and conversational, but many collect, retain, and sometimes inadvertently expose highly sensitive information. Recent breaches — including a misconfigured Kafka broker that leaked hundreds of thousands of photos and millions of private conversations — underline real dangers. Users should avoid sharing personal, financial or intimate material, enable two-factor authentication, review privacy policies, and opt out of data retention or training when possible. Parents should supervise teen use and insist on robust age verification and moderation.
Mon, November 10, 2025
65% of Top Private AI Firms Exposed Secrets on GitHub
🔒 A Wiz analysis of 50 private companies from the Forbes AI 50 found that 65% had exposed verified secrets such as API keys, tokens and credentials across GitHub and related repositories. Researchers employed a Depth, Perimeter and Coverage approach to examine commit histories, deleted forks, gists and contributors' personal repos, revealing secrets standard scanners often miss. Affected firms are collectively valued at over $400bn.
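As a rough sketch of the "Depth" part of that approach, the script below greps every line ever added to a repository's history for credential-like strings, including lines later deleted; the regexes and invocation are illustrative assumptions, not Wiz's tooling.

```python
# Rough sketch, not Wiz's scanner: flag credential-like strings anywhere in
# a repository's full commit history, including lines later deleted.
import re
import subprocess

TOKEN_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                      # GitHub personal access token shape
    re.compile(r"(?i)aws_secret_access_key\s*[:=]\s*\S+"),
    re.compile(r"""(?i)(api[_-]?key|token|password)\s*[:=]\s*["'][^"']{12,}["']"""),
]

def scan_history(repo_path: str) -> list[str]:
    """Return added lines from the entire git history that look like secrets."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all", "--unified=0"],
        capture_output=True, text=True, errors="ignore", check=True,
    ).stdout
    findings = []
    for line in log.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            if any(p.search(line) for p in TOKEN_PATTERNS):
                findings.append(line[1:].strip())
    return findings

if __name__ == "__main__":
    for hit in scan_history("."):
        print("possible committed secret:", hit)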
Wed, October 29, 2025
BSI Warns of Growing AI Governance Gap in Business
⚠️ The British Standards Institution warns of a widening AI governance gap as many organisations accelerate AI adoption without adequate controls. An AI-assisted review of 100+ annual reports and two polls of 850+ senior leaders found strong investment intent but sparse governance: only 24% have a formal AI program and 47% use formal processes. The report highlights weaknesses in incident management, training-data oversight and inconsistent approaches across markets.
Wed, September 17, 2025
Quarter of UK and US Firms Hit by Data Poisoning Attacks
🛡️ New IO research reports that 26% of surveyed UK and US organisations have experienced data poisoning, and 37% observe employees using generative AI tools without permission. The third annual State of Information Security Report highlights rising concern around AI-generated phishing, misinformation, deepfakes and shadow AI. Despite the risks, most respondents say they feel prepared and are adopting acceptable use policies to curb unsanctioned tool use.
Thu, September 11, 2025
AI-Powered Browsers: Security and Privacy Risks in 2026
🔒 An AI-integrated browser embeds large multimodal models into standard web browsers, allowing agents to view pages and perform actions—opening links, filling forms, downloading files—directly on a user’s device. This enables faster, context-aware automation and access to subscription or blocked content, but raises substantial privacy and security risks, including data exfiltration, prompt-injection and malware delivery. Users should demand features like per-site AI controls, choice of local models, explicit confirmation for sensitive actions, and OS-level file restrictions, though no browser currently implements all these protections.
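The kind of per-site control and confirmation gate the article calls for could look roughly like the toy policy layer below; it does not correspond to any real browser's API, and the site names and action list are made up for illustration.

```python
# Toy policy layer, not a real browser API: per-site agent permissions plus
# an explicit confirmation gate for sensitive actions.
from dataclasses import dataclass

SENSITIVE_ACTIONS = {"download_file", "submit_form", "read_local_file"}

@dataclass
class SitePolicy:
    allow_agent: bool = False          # may the agent act on this site at all?
    require_confirmation: bool = True  # ask the user before sensitive actions

POLICIES = {
    "bank.example.com": SitePolicy(allow_agent=False),
    "news.example.com": SitePolicy(allow_agent=True, require_confirmation=True),
}

def authorize(site: str, action: str, confirm) -> bool:
    """Decide whether the agent may perform `action` on `site`."""
    policy = POLICIES.get(site, SitePolicy())      # unknown sites: agent disabled
    if not policy.allow_agent:
        return False
    if action in SENSITIVE_ACTIONS and policy.require_confirmation:
        return bool(confirm(f"Allow the agent to {action} on {site}?"))
    return True

if __name__ == "__main__":
    deny = lambda prompt: False
    print(authorize("news.example.com", "open_link", deny))       # True
    print(authorize("news.example.com", "download_file", deny))   # False without consent
    print(authorize("bank.example.com", "open_link", deny))       # False, agent blocked
```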
Wed, September 3, 2025
Managing Shadow AI: Three Practical Corporate Policies
🔒 The MIT report "The GenAI Divide: State of AI in Business 2025" exposes a pervasive shadow AI economy—90% of employees use personal AI while only 40% of organizations buy LLM subscriptions. This article translates those findings into three realistic policy paths: a complete ban, unrestricted use with hygiene controls, and a balanced, role-based model. Each option is paired with concrete technical controls (DLP, NGFW, CASB, EDR), organizational steps, and enforcement measures to help security teams align risk management with real-world employee behaviour.
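For the balanced, role-based option, a pre-send check of the kind a DLP hook or forward proxy might apply could look like the sketch below; the roles, tool names, and patterns are placeholders for illustration, not the article's lists.

```python
# Illustrative sketch of a role-based pre-send check; roles, tool names and
# DLP patterns are placeholders, not the article's recommendations verbatim.
import re

ALLOWED_TOOLS_BY_ROLE = {
    "engineering": {"corporate-copilot"},
    "marketing": {"corporate-copilot", "image-gen"},
}

BLOCKED_PATTERNS = [
    re.compile(r"\b\d{16}\b"),                            # card-number-like digit runs
    re.compile(r"(?i)BEGIN (RSA|OPENSSH) PRIVATE KEY"),
    re.compile(r"(?i)customer[_ ]?(email|ssn|record)s?"),
]

def allow_prompt(role: str, tool: str, prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a prompt bound for an external AI tool."""
    if tool not in ALLOWED_TOOLS_BY_ROLE.get(role, set()):
        return False, f"tool '{tool}' is not sanctioned for role '{role}'"
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            return False, f"prompt matches DLP pattern {pattern.pattern!r}"
    return True, "ok"

if __name__ == "__main__":
    print(allow_prompt("engineering", "corporate-copilot", "Refactor this parser"))
    print(allow_prompt("engineering", "image-gen", "Draw a release banner"))
    print(allow_prompt("marketing", "corporate-copilot", "Summarise the customer_email dump"))
```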
Wed, September 3, 2025
EMBER2024: Advancing ML Benchmarks for Evasive Malware
🛡️ The EMBER2024 release modernizes the popular EMBER malware benchmark by providing metadata, labels, and computed features for over 3.2 million files spanning six file formats. It supplies a 6,315-sample challenge set of initially evasive malware, updated feature extraction code using pefile, and supplemental raw bytes and disassembly for 16.3 million functions. The package also includes source code to reproduce feature calculation, labeling, and dataset construction so researchers can replicate and extend benchmarks.
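As a rough illustration of the kind of pefile-based static features involved (EMBER2024's actual extraction code lives in the project repository and is far more complete), a minimal extractor might look like this:

```python
# Rough illustration only; EMBER2024's real feature extraction is much richer.
import pefile

def basic_pe_features(path: str) -> dict:
    """Pull a few header- and section-level features from a PE file."""
    pe = pefile.PE(path, fast_load=True)
    features = {
        "machine": pe.FILE_HEADER.Machine,
        "timestamp": pe.FILE_HEADER.TimeDateStamp,
        "num_sections": pe.FILE_HEADER.NumberOfSections,
        "size_of_code": pe.OPTIONAL_HEADER.SizeOfCode,
        "dll_characteristics": pe.OPTIONAL_HEADER.DllCharacteristics,
        "section_entropy": {
            s.Name.rstrip(b"\x00").decode(errors="ignore"): round(s.get_entropy(), 3)
            for s in pe.sections
        },
    }
    pe.close()
    return features

if __name__ == "__main__":
    import json, sys
    print(json.dumps(basic_pe_features(sys.argv[1]), indent=2))
```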
Thu, August 28, 2025
AI Crawler Traffic: Purpose and Industry Breakdown
🔍 Cloudflare Radar introduces industry-focused AI crawler insights and a new crawl purpose selector that classifies bots as Training, Search, User action, or Undeclared. The update surfaces top bot trends, crawl-to-refer ratios, and per-industry views so publishers can see who crawls their content and why. Data shows Training drives nearly 80% of crawl requests, while User action and Undeclared exhibit smaller, cyclical patterns.
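A publisher wanting a similar breakdown from their own server logs could start with a crude user-agent mapping like the one below; the crawler list is a small illustrative sample, not Cloudflare Radar's actual classification.

```python
# Toy purpose classifier for server logs; the crawler-to-purpose mapping is a
# small illustrative sample, not Cloudflare Radar's actual classification.
from collections import Counter

PURPOSE_BY_CRAWLER = {
    "GPTBot": "Training",
    "CCBot": "Training",
    "ClaudeBot": "Training",
    "OAI-SearchBot": "Search",
    "ChatGPT-User": "User action",
}

def classify(user_agent: str) -> str:
    """Map a User-Agent string to a crawl purpose, defaulting to Undeclared."""
    ua = user_agent.lower()
    for token, purpose in PURPOSE_BY_CRAWLER.items():
        if token.lower() in ua:
            return purpose
    return "Undeclared"

if __name__ == "__main__":
    sample_log = [
        "Mozilla/5.0; compatible; GPTBot/1.2; +https://openai.com/gptbot",
        "CCBot/2.0 (https://commoncrawl.org/faq/)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0",
    ]
    print(Counter(classify(ua) for ua in sample_log))
```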