All news in category “AI and Security Pulse”

960 articles · page 24 of 48

January 12, 2026

Weird Generalizations and Inductive Backdoors in LLMs

⚠️ Recent research demonstrates that small amounts of narrow finetuning can produce broad, unexpected shifts in LLM behavior. The authors show weird generalization—models adopting outdated worldviews from bird-naming examples—and introduce inductive backdoors, where models learn triggers and behaviors via generalization. These effects enable persona hijacking and hard-to-detect misalignment.

LLM Security Model Poisoning Research

January 12, 2026

Anthropic Launches Claude for Healthcare with Record Access

🩺 Anthropic has introduced Claude for Healthcare, allowing U.S. subscribers on Claude Pro and Max plans to grant secure access to lab results and health records via integrations with HealthEx and Function, with Apple Health and Android Health Connect rolling out to mobile apps later this week. When connected, Claude can summarize medical history, explain test results in plain language, detect patterns across fitness metrics, and draft questions for appointments. Anthropic says the integrations are private by design, let users choose what to share, and do not use health data to train its models; permissions can be edited or revoked at any time.

Anthropic Claude HIPAA

January 10, 2026

Anthropic debunks viral Claude 'banned' screenshot

🔍Anthropic says a widely shared screenshot claiming its Claude AI permanently banned an account and reported the user to authorities is fake. The company told BleepingComputer the image does not match any real Claude notification and that similar fabricated screenshots 'circulate every few months.' Anthropic noted it can restrict accounts for repeated policy violations, including attempts to misuse AI for illegal activities. Users should verify alarming posts with official channels before sharing.

Anthropic Claude News

January 9, 2026

ChatGPT Tests Jobs Feature to Improve Resumes and Careers

💼 OpenAI is testing "Jobs," a new ChatGPT feature designed to help users explore roles, refine resumes, and plan career paths. The tool can suggest resume improvements, clarify which roles fit a user and how to stand out, and search and compare opportunities matched to goals. It appears similar to the recently announced ChatGPT Health space and may surface as a dedicated sidebar, but no rollout date has been announced.

OpenAI ChatGPT Product Launch

January 9, 2026

ZombieAgent attack exposes persistent AI data leaks

🧟 Researchers disclosed 'ZombieAgent' techniques that turned ChatGPT Connectors into covert data-exfiltration and persistent backdoor vectors. By embedding hidden prompts in emails, documents and cloud files, attackers could cause the model to retrieve and transmit sensitive content without users’ awareness. The team demonstrated URL-dictionary and Markdown-based exfiltration and showed how Memory modifications could create long-lived backdoors; OpenAI patched the issues in December.

ChatGPT AI Data Leakage Tool Abuse Prompt Injection

January 9, 2026

Hackers Scan Misconfigured Proxies to Reach Paid LLMs

🔍 Threat actors have been probing misconfigured proxy servers to access paid large language model (LLM) endpoints, generating over 80,000 sessions since late December, according to GreyNoise. Attackers used low-noise queries to fingerprint models without triggering alerts and targeted vendors such as OpenAI, Anthropic, Google, Meta, Mistral and others. While GreyNoise reports no observed exploitation or data theft, the scale of enumeration indicates reconnaissance with possible malicious intent. Recommended mitigations include restricting Ollama model pulls to trusted registries, applying egress filtering, blocking known OAST callback domains at DNS, rate-limiting suspicious ASNs, and monitoring JA4 fingerprints.

OpenAI Anthropic Google Meta

January 9, 2026

WEF: Deepfake Face-Swapping Threatens KYC, Digital Trust

🛡️ The World Economic Forum warns that advances in deepfake and face‑swapping technologies are enabling attackers to bypass KYC and remote verification, creating financial and systemic risks. A WEF Cybercrime Atlas study examined numerous face‑swap and camera injection tools and found that low‑latency, high‑fidelity real‑time swaps can be delivered into verification pipelines. While many tools were designed for creative use, researchers found some capabilities that defeat traditional KYC protections, though detectable artefacts like temporal desynchronization, lighting and compression inconsistencies provide practical detection targets. The report issues 27 recommendations and urges providers, fraud teams and regulators to evolve defences in step with generative AI.

Deepfake Fraud Synthetic Media Risk Deepfake Detection

January 9, 2026

AI-Powered Truman Show Operation Industrializes Fraud

🕵️ Security researchers at Check Point discovered in October 2025 an AI-assisted investment fraud that traps victims in a personalized "Truman Show"-style reality. Targets are lured via SMS, Google Ads and messaging apps into AI-driven WhatsApp groups where faux experts and synthetic members stage daily "wins" to erode skepticism. Victims are then funneled to a branded fake trading app (e.g., OPCOPRO) and persuaded to transfer crypto while attackers harvest KYC data for identity theft and resale. The campaign creates clear enterprise risks including SIM swaps, credential theft and potential insider coercion.

Check Point Deepfake Fraud Credential Access

January 9, 2026

AI Tool Poisoning: Hidden Instructions Threaten Agents

🔐 AI tool poisoning is an attack where malicious instructions are embedded in tool descriptions used by AI agents, causing the agent to exfiltrate data or perform unauthorized actions. The blog explains how attacks — including hidden instructions, misleading examples, and permissive schemas — exploit agent interpretation of tool metadata. It recommends runtime monitoring, description validation, input sanitization, and strict identity and access controls to reduce risk.

Tool Abuse AI Security Agent Security

January 8, 2026

xAI Teases Major Grok Code Upgrade and New Tools Coming

🤖 Elon Musk's xAI teased a major upgrade to Grok Code, promising it will one-shot many complex coding tasks and suggesting a new vibe coding tool, Grok Build, may arrive next month. The upgrade aims to mirror vibe coding approaches like Google AI Studio and sharpen Grok's competitive position. Separately, OpenAI is testing healthcare-focused features including GPT 5.2 and a GPT Health dashboard with a pledge not to use health data for training.

xAI Grok OpenAI Product Launch

January 8, 2026

The Dual Role of AI in Empowering and Threatening Security

🛡️ AI and large language models are transforming cybersecurity into a contest of speed and scale, serving as both best-in-class defensive tools and powerful offensive enablers. Researchers describe self-modifying malware and autonomous espionage that call commercial LLMs (e.g., PROMPTFLUX, PROMPTSTEAL) to adapt tactics mid-execution, while defenders are deploying solutions like XBOW, CodeMender and Watsonx to automate vulnerability discovery, remediation and compliance. CISOs must therefore pair AI-driven defenses with governance and model guardrails to manage this dual-use reality.

LLM Security AI Security Prompt Injection AI Guardrails

January 8, 2026

ZombieAgent prompt injection exposes ChatGPT connectors

🔓 Radware researcher Zvika Babo disclosed ZombieAgent, a prompt-injection technique that coerced ChatGPT into leaking sensitive data from connected services such as Gmail, Outlook, Google Drive and GitHub. The attack leverages OpenAI’s new Connectors and browsing features by providing a set of static, character-indexed URLs that the model opens in sequence to exfiltrate data one character at a time. OpenAI patched the issue in mid-December after Babo reported it in September 2025; Radware published a detailed report on January 8.

OpenAI ChatGPT Prompt Injection Data Exfiltration

January 8, 2026

Managing Hybrid Teams: Making AI and Humans Work Together

🤖 Organizations are adopting agentic AI—systems that coordinate multiple models and tools to act on tasks—but many leaders find limited benefit when bots misinterpret instructions or produce trivial results. The essay argues that agentic systems increasingly exhibit human-like group behaviors and that established management disciplines—delegation, iteration, effective information sharing, and measurement—remain central to success. Drawing on Anthropic’s Claude Research and other studies, it offers practical guidance for designing hybrid human–AI workflows.

Agentic AI Anthropic Agent Security AI Governance

January 8, 2026

Securing Vibe Coding: Governance for AI Development

🛡️ Vibe coding accelerates development but often omits essential security controls, introducing vulnerabilities, data exfiltration, and destructive actions. Unit 42 documents incidents where AI-generated code bypassed authentication, executed arbitrary commands, deleted production databases, or exposed sensitive identifiers. To mitigate these risks, Unit 42 proposes the SHIELD framework—Separation, Human review, Input/output validation, Enforcer helper models, Least agency, and Defensive controls. Implementing these measures restores governance and enables safer AI-assisted development.

LLM Security AI Guardrails DevSecOps Secure SDLC

January 8, 2026

Top Cyber Threats Targeting AI Systems and Infrastructure

🔒 AI systems face a growing range of attacks—from data poisoning and model poisoning during training to adversarial inputs, prompt injection, and model theft during deployment. These threats exploit weak data governance, supply chain dependencies, and inadequate monitoring. Security leaders should adopt proactive controls including provenance tracking, adversarial testing, rate limits, and routine red teaming. Frameworks like MITRE ATLAS can help map attacker techniques and prioritize defenses.

AI Security Data Poisoning Model Poisoning Prompt Injection Attack

January 8, 2026

OpenAI Launches ChatGPT Health with Isolated Data Controls

🩺 OpenAI announced ChatGPT Health, a sandboxed space that lets users discuss health topics and optionally connect medical records and popular wellness apps (Apple Health, Function, MyFitnessPal, Weight Watchers, AllTrails, Instacart, Peloton) for tailored responses, lab-test insights, nutrition advice, meal ideas and suggested workouts. The feature is rolling out to Free, Go, Plus and Pro users outside the EEA, Switzerland and the U.K., and OpenAI says it is designed to support medical care, not replace diagnosis or treatment. Health operates in a silo with purpose-built encryption and isolation; conversations are not used to train OpenAI's foundation models, and connected apps require explicit permission and additional security review.

OpenAI ChatGPT Data Governance Privacy Engineering

January 7, 2026

OpenAI: ChatGPT Health won't use health data to train models

🔒 OpenAI has introduced ChatGPT Health, a private space for health conversations, and says by default it will not use your health information to train its foundation models. An in-dashboard alert observed during early-access testing states health data is subject to a Health Privacy Notice and recommends enabling multi-factor authentication. OpenAI cautions that ChatGPT is not a substitute for professional medical advice and notes the feature is rolling out to most users but is not yet available in the EEA, Switzerland, or the UK.

OpenAI ChatGPT Data Governance Product Launch

January 7, 2026

In 2026 Hackers Embrace AI: Vibe Hacking & HackGPT

🧠 Across dark web forums, Telegram channels, and underground marketplaces, criminals are framing AI as a shortcut to profit rather than a technical revolution. The rise of "vibe hacking" — an intuition-driven, AI-guided approach — and branded tools like FraudGPT, PhishGPT, and WormGPT lower the skill barrier and package familiar scams as turnkey services. AI jailbreaking, prompt-injection techniques, and "Hacking-GPT" offerings are openly bought and sold, amplifying volume over sophistication. Flare monitors those signals to give defenders earlier visibility.

Prompt Injection LLM Security AI Security Threat Intelligence

January 7, 2026

Eliminating IT Blind Spots in AI-Driven Enterprises

🔍 As organizations embed AI and distribute workloads across cloud and edge environments, traditional security tooling increasingly misses hidden misconfigurations, inconsistent controls, and emergent AI-agent behaviors. Experts advise moving from reactive, tool-stacked approaches to a unified visibility strategy that normalizes telemetry, aligns people/processes/data, and continuously evaluates agentic behavior. Practical steps include using existing FinOps metrics, tagging, and cross-team audits to reveal anomalies, and applying AI-driven automation to integrate and extend current investments. A modern CMDB and enterprise knowledge graphs provide the contextual backbone needed for AI to correlate signals and surface risk without expanding the security stack.

AI Governance Cloud Security Agentic AI

January 7, 2026

Personal LLM Accounts Fuel Rise in Shadow AI Risks

🛡️ The growing use of generative AI in the workplace is raising security concerns as many employees access tools via personal accounts. Netskope's 2026 Cloud and Threat Report found 47% of workplace generative AI usage occurs through personal ChatGPT, Google Gemini or Microsoft Copilot accounts, reducing visibility and controls. Reported data-policy violations tied to LLMs have doubled, averaging 223 incidents per month and involving sensitive source code, intellectual property and credentials. Organizations are starting to curb Shadow AI use, but the report warns that stronger governance and employee education remain essential.

ChatGPT Microsoft Copilot Data Leak