Paper Title

X-Risk Analysis for AI Research

Paper Authors

Dan Hendrycks, Mantas Mazeika

Paper Abstract

Artificial intelligence (AI) has the potential to greatly improve society, but as with any powerful technology, it comes with heightened risks and responsibilities. Current AI research lacks a systematic discussion of how to manage long-tail risks from AI systems, including speculative long-term risks. Keeping in mind the potential benefits of AI, there is some concern that building ever more intelligent and powerful AI systems could eventually result in systems that are more powerful than us; some say this is like playing with fire and speculate that this could create existential risks (x-risks). To add precision and ground these discussions, we provide a guide for how to analyze AI x-risk, which consists of three parts: First, we review how systems can be made safer today, drawing on time-tested concepts from hazard analysis and systems safety that have been designed to steer large processes in safer directions. Next, we discuss strategies for having long-term impacts on the safety of future systems. Finally, we discuss a crucial concept in making AI systems safer by improving the balance between safety and general capabilities. We hope this document and the presented concepts and tools serve as a useful guide for understanding how to analyze AI x-risk.
