#ai-safety

12 articles

2026 Outlook: Disrupting Malicious Uses of AI – A Comprehensive Strategy
AI (Artificial Intelligence)

As AI technology rapidly advances, so do its malicious applications. This article delves into the escalating global threats posed by AI misuse and analyzes how future efforts in technological innovation, policy-making, and international collaboration can build robust defenses to ensure AI's positive development.

OpenAI ChatGPT Foils Chinese Influence Operation: A New Front in AI Safety and Geopolitics
AI (Artificial Intelligence)

OpenAI confirmed its ChatGPT service refused to assist an individual linked to Chinese law enforcement in planning an online campaign to discredit the Japanese prime minister. This incident not only highlights the robust safety mechanisms of AI models but also underscores AI's escalating role in information warfare and geopolitical conflicts.

Decoding the Multimodal Mind: Google DeepMind Unveils Gemma Scope 2 for Gemma 3
AI (Artificial Intelligence)

Google DeepMind has released Gemma Scope 2, a comprehensive interpretability suite for the Gemma 3 model family. Using Sparse Autoencoders (SAEs), it provides a 'microscopic' view of neural activations and is the first open-source tool to support multimodal AI interpretability across text and vision.