Paper Title
Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging
Paper Authors
Paper Abstract
Details of the designs and mechanisms in support of human-AI collaboration must be considered in the real-world fielding of AI technologies. A critical aspect of interaction design for AI-assisted human decision making is the set of policies about the display and sequencing of AI inferences within larger decision-making workflows. We have a poor understanding of the influences of making AI inferences available before versus after human review of a diagnostic task at hand. We explore the effects of providing AI assistance at the start of a diagnostic session in radiology versus after the radiologist has made a provisional decision. We conducted a user study where 19 veterinary radiologists identified radiographic findings present in patients' X-ray images, with the aid of an AI tool. We employed two workflow configurations to analyze (i) anchoring effects, (ii) human-AI team diagnostic performance and agreement, (iii) time spent and confidence in decision making, and (iv) perceived usefulness of the AI. We found that participants who are asked to register provisional responses in advance of reviewing AI inferences are less likely to agree with the AI regardless of whether the advice is accurate and, in instances of disagreement with the AI, are less likely to seek the second opinion of a colleague. These participants also reported the AI advice to be less useful. Surprisingly, requiring provisional decisions on cases in advance of the display of AI inferences did not lengthen the time participants spent on the task. The study provides generalizable and actionable insights for the deployment of clinical AI tools in human-in-the-loop systems and introduces a methodology for studying alternative designs for human-AI collaboration. We make our experimental platform available as open source to facilitate future research on the influence of alternate designs on human-AI workflows.