<ciso brief />

All news with the #ai safety tag

68 articles · page 3 of 4

Scientists Need a Positive Vision for Artificial Intelligence

🔬 While many researchers view AI as exacerbating misinformation, authoritarian tools, labor exploitation, environmental costs, and concentrated corporate power, the essay argues that resignation is not an option. It highlights concrete, beneficial applications—language access, AI-assisted civic deliberation, climate dialogue, national-lab research models, and advances in biology—while acknowledging imperfections. Drawing on Rewiring Democracy, the authors call on scientists to reform industry norms, document abuses, responsibly deploy AI for public benefit, and retrofit institutions to manage disruption.
read more →

Will AI Strengthen or Undermine Democratic Institutions?

🤖 Bruce Schneier and Nathan E. Sanders present five key insights from their book Rewiring Democracy, arguing that AI is rapidly embedding itself in democratic processes and can both empower citizens and concentrate power. They cite diverse examples — AI-written bills, AI avatars in campaigns, judicial use of models, and thousands of government use cases — and note many adoptions occur with little public oversight. The authors urge practical responses: reform the tech ecosystem, resist harmful applications, responsibly deploy AI in government, and renovate institutions vulnerable to AI-driven disruption.
read more →

OpenAI Updates GPT-5 to Better Handle Emotional Distress

🧭 OpenAI rolled out an October 5 update that enables GPT-5 to better recognize and respond to mental and emotional distress in conversations. The change specifically upgrades GPT-5 Instant—the fast, low-end default—so it can detect signs of acute distress and route sensitive exchanges to reasoning models when needed. OpenAI says it developed the update with mental-health experts to prioritize de-escalation and provide appropriate crisis resources while retaining supportive, grounding language. The update is available broadly and complements new company-context access via connected apps.
read more →

AI-Designed Bioweapons: The Detection vs Creation Arms Race

🧬 Researchers used open-source AI to design variants of ricin and other toxic proteins, then converted those designs into DNA sequences and submitted them to commercial DNA-order screening tools. From 72 toxins and three AI packages they generated roughly 75,000 designs and found wide variation in how four screening programs flagged potential threats. Three of the four screening tools were patched and improved after the test, but many AI-designed variants—often likely non-functional because of misfolding—exposed gaps in detection. The authors warn this imbalance could produce an arms race where design outpaces reliable screening.
read more →

The AI Fix 74: AI Glasses, Deepfakes, and AGI Debate

🎧 In episode 74 of The AI Fix, hosts Graham Cluley and Mark Stockley survey recent AI developments including Amazon’s experimental delivery glasses, Channel 4’s AI presenter, and reports of LLM “brain rot.” They examine practical security risks — such as malicious browser extensions spoofing AI sidebars and AI browsers being tricked into purchases — alongside wider societal debates. The episode also highlights public calls to pause work on super-intelligence and explores what AGI really means.
read more →

Face Recognition Failures Affect Nonstandard Faces

⚠️ Bruce Schneier highlights how facial recognition systems frequently fail people with nonstandard facial features, producing concrete barriers to services and daily technologies. Those interviewed report being denied access to public and financial services and encountering nonfunctional phone unlocking and social media filters. The author argues the root cause is often design choices by engineers who trained models on a narrow range of faces and calls for inclusive design plus accessible backup systems when biometric methods fail.
read more →

AI-Driven Social Engineering Tops ISACA Threats for 2026

⚠️ A new ISACA report identifies AI-driven social engineering as the top cyber threat for 2026, cited by 63% of nearly 3,000 IT and security professionals. The 2026 Tech Trends and Priorities report, published 20 October 2025, shows AI concerns outpacing ransomware (54%) and supply chain attacks (35%), while only 13% of organizations feel very prepared to manage generative AI risks. ISACA urges organizations to adopt AI governance, strengthen compliance amid divergent US and EU approaches, and invest in talent, resilience and legacy modernization.
read more →

Beyond Bans: Guiding Teens in Their Digital Lives Effectively

📱 Stephen Balkam of FOSI argues that instead of blanket bans, families benefit from thoughtful restrictions, ongoing dialogue and tools that preserve teen agency. He highlights solutions such as Family Link and YouTube’s supervised experience and proposes that AI assistants (for example, Gemini or ChatGPT) could configure age-, app- and device-specific controls. He urges coordinated action from policymakers, teachers and parents and calls for impartial digital literacy and AI education frameworks.
read more →

Safer Learning: Secure Tools and Partnerships for Education

🔒 Google for Education highlights built-in security, responsible AI, and partnerships to create safer digital learning environments for schools, classrooms, and families. Admins benefit from automated 24/7 monitoring, encryption, spam filtering, and security alerts; Google reports zero successful ransomware attacks on Chromebooks to date. Gemini for Education and NotebookLM provide enterprise-grade data protections with admin controls and age-specific safeguards, while family resources and a $25M Google.org clinics fund extend protection and workforce development.
read more →

Generative AI's Growing Role in Scams and Fraud Worldwide

⚠️ A new primer, Scam GPT, surveys how generative AI is being adopted by criminals to automate, scale, and personalize scams. It maps which communities are most at risk and explains how broader economic and cultural shifts — from precarious employment to increased willingness to take risks — amplify vulnerability to deception. The author argues these threats are social as much as technical, requiring cultural shifts, corporate interventions, and effective legislation to defend against them.
read more →

The AI Fix #70: Surveillance Changes AI Behavior and Safety

🔍 In episode 70 of The AI Fix, hosts Graham Cluley and Mark Stockley examine how AI alters human behaviour and how deployed systems can fail in unexpected ways. They discuss research showing AI can increase dishonest behaviour, Waymo's safety record and a mirror-based trick that fooled self-driving perception, a rescue robot that mishandles victims, and a Chinese fusion-plant robot arm with extreme lifting capability. The show also covers a demonstration of a ChatGPT agent solving image CAPTCHAs by simulating mouse movements and a paper on deliberative alignment that functions until the model realises it is being watched.
read more →

Grok 4 Arrives in Azure AI Foundry for Business Use

🔒 Microsoft and xAI have brought Grok 4 to Azure AI Foundry, combining a 128K-token context window, native tool use, and integrated web search with enterprise safety controls and compliance checks. The release highlights first-principles reasoning and enhanced problem solving across STEM and humanities tasks, plus variants optimized for reasoning, speed, and code. Azure AI Content Safety is enabled by default and Microsoft publishes a model card with safety and evaluation details. Pricing and deployment tiers are available through Azure.
read more →

OpenAI Routes GPT-4o Conversations to Safety Models

🔒 OpenAI confirmed that when GPT-4o detects sensitive, emotional, or potentially harmful activity it may route individual messages to a dedicated safety model, reported by some users as gpt-5-chat-safety. The switch occurs on a per-message, temporary basis and ChatGPT will indicate which model is active if asked. The routing is implemented as an irreversible part of the service's safety architecture and cannot be turned off by users; OpenAI says this helps strengthen safeguards and learn from real-world use before wider rollouts.
read more →
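The per-message routing described above can be sketched as a simple dispatcher. This is an illustrative stand-in, not OpenAI's actual API: the keyword classifier and the `gpt-5-chat-safety` model name (as reported by users) are assumptions, and a production system would use a learned classifier rather than keyword matching.

```python
# Hypothetical sketch of per-message safety routing.
# In the real system, a learned classifier decides sensitivity;
# keywords here are a toy stand-in.

SENSITIVE_KEYWORDS = {"self-harm", "crisis", "hurt myself"}

def classify_sensitive(message: str) -> bool:
    """Toy stand-in for a sensitivity classifier."""
    text = message.lower()
    return any(kw in text for kw in SENSITIVE_KEYWORDS)

def route_message(message: str,
                  default_model: str = "gpt-4o",
                  safety_model: str = "gpt-5-chat-safety") -> str:
    """Pick a model for this single message only; routing is
    per-message and temporary, so the next message falls back
    to the default model."""
    return safety_model if classify_sensitive(message) else default_model
```

Because the decision is made independently for each message, a conversation can switch to the safety model for one turn and return to the default model on the next, matching the temporary, per-message behavior OpenAI describes.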

Microsoft Photos Adds AI Auto-Categorization on Windows

🤖 Microsoft is testing a new AI-powered Auto-Categorization capability in Microsoft Photos on Windows 11, rolling out to Copilot+ PCs across all Windows Insider channels. The feature automatically groups images into predefined folders — screenshots, receipts, identity documents, and notes — using a language-agnostic model that recognizes document types regardless of image language. Users can locate categorized items via the left navigation pane or Search bar, manually reassign categories, and submit feedback to improve accuracy. Microsoft has not yet clarified whether image processing happens locally or is sent to its servers.
read more →

Escalante Uses JAX on TPUs for AI-driven Protein Design

🧬 Escalante leverages JAX's functional, composable design to combine many predictive models into a single differentiable objective for protein engineering. By translating models (including AlphaFold and Boltz-2) into a JAX-native stack and composing them serially or linearly, they compute gradients with respect to input sequences and evolve candidates via optimization. Each job samples thousands of sequences, filters to roughly ten lab-ready designs, and runs at scale on Google Kubernetes Engine using spot TPU v6e, yielding a reported 3.65x performance-per-dollar advantage over H100 GPUs.
read more →
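The composed-objective design loop above can be sketched without JAX. In this dependency-free toy, two hand-written quadratic scorers stand in for differentiable predictive models (the real pipeline would differentiate through models like AlphaFold via JAX autodiff), their weighted sum forms a single objective, and gradient ascent plays the role of sequence optimization; all weights and learning rates are illustrative.

```python
# Toy sketch: combine multiple differentiable "model scores" into one
# objective and optimize the input representation by gradient ascent.

def score_a(x):
    # Stand-in predictive model A: prefers values near 1.0.
    return -sum((xi - 1.0) ** 2 for xi in x)

def grad_a(x):
    return [-2.0 * (xi - 1.0) for xi in x]

def score_b(x):
    # Stand-in predictive model B: prefers values near -0.5.
    return -sum((xi + 0.5) ** 2 for xi in x)

def grad_b(x):
    return [-2.0 * (xi + 0.5) for xi in x]

def grad_objective(x, w=(0.7, 0.3)):
    # Gradient of the weighted combination of model scores.
    ga, gb = grad_a(x), grad_b(x)
    return [w[0] * a + w[1] * b for a, b in zip(ga, gb)]

def optimize(x, steps=200, lr=0.1):
    # Gradient ascent on the input vector (analogous to sequence logits).
    for _ in range(steps):
        g = grad_objective(x)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
    return x

best = optimize([0.0, 0.0, 0.0])  # converges toward the weighted optimum
```

In the JAX version, `jax.grad` would supply `grad_objective` automatically from the composed models, and each job would sample many starting sequences in parallel before filtering candidates for the lab.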

2025 DORA Report: AI-assisted Software Development

🤖 The 2025 DORA Report synthesizes survey responses from nearly 5,000 technology professionals and over 100 hours of qualitative data to examine how AI is reshaping software development. It finds AI amplifies existing team strengths and weaknesses: strong teams accelerate productivity and product performance, while weaker teams see magnified problems and increased instability. The report highlights near-universal AI adoption (90%), widespread productivity gains (>80%), a continuing trust gap in AI-generated code (~30% distrust), and recommends investment in platform engineering, user-centric workflows, and the DORA AI Capabilities Model to unlock AI’s value.
read more →

Self-Driving IT Security: Preparing for Autonomous Defense

🛡️ IT security is entering a new era where autonomy augments human defenders, moving beyond scripted automation to adaptive, AI-driven responses. Traditional playbooks and scripts are limited because they only follow defined rules, while attackers continuously change tactics. Organizations must adopt self-driving security systems that combine real-time telemetry, machine learning, and human oversight to improve detection, reduce response time, and manage risk.
read more →

The AI Fix — Episode 68: Merch, Hoaxes and AI Rights

🎧 In episode 68 of The AI Fix, hosts Graham Cluley and Mark Stockley blend news, commentary and light-hearted banter while launching a new merch store. The discussion covers real-world harms from AI-generated hoaxes that sent Manila firefighters to a non-existent fire, Albania appointing an AI-made minister, and reports of the so-called 'godfather of AI' being spurned by ChatGPT. They also explore wearable telepathic interfaces like AlterEgo, the rise of AI rights advocacy, and listener support options including ad-free subscriptions and merch purchases.
read more →

New Practical Guide to Data Science with Google Cloud

📘 Google Cloud has published a new ebook, A Practical Guide to Data Science with Google Cloud, aimed at practitioners adopting an AI-first approach across BigQuery, Vertex AI, and Serverless for Apache Spark. The guide emphasizes unified, streamlined workflows enabled by a central notebook experience that blends SQL, Python, and Spark and includes assistive features in Colab Enterprise to generate multi-step plans and code. It explains how a unified data foundation lets teams manage structured and unstructured data together and use familiar SQL to process documents and images. The ebook also offers real-world use cases with linked notebooks so practitioners can run the examples and accelerate delivery.
read more →

Google for Startups Accelerator: AI First MENA & Turkey

🚀 Today Google announced 14 startups selected for the Google for Startups Accelerator: AI First program serving the Middle East, North Africa, and Turkey. The cohort addresses challenges across finance, real estate, healthcare, industrial safety, TradeTech, and education, and will receive targeted mentorship, technical training, and product and business support. Participants include Abwab.ai, COGNNA, Distichain, xBites, and Navatech, and the program emphasizes responsible AI to accelerate regional scaling and commercialization.
read more →