Do you see what I see? Measuring the semantic differences in image‐recognition services' outputs

Journal of the Association for Information Science and Technology (JASIST) · 2023

Cited by 6

ABS 3

Anton Berg
Matti Nelimarkka (corresponding author)

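Summary

The study examines how the image-recognition services of Google, Microsoft, and Amazon diverge in their labels. It finds that the services assign inconsistent labels to the same image, and it proposes two mitigation strategies: accept all labels, or use word-embedding methods to filter for similar concepts.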

Abstract

As scholars increasingly undertake large‐scale analysis of visual materials, advanced computational tools show promise for informing that process. One technique in the toolbox is image recognition, made readily accessible via Google Vision AI, Microsoft Azure Computer Vision, and Amazon's Rekognition service. However, concerns about such issues as bias factors and low reliability have led to warnings against research employing it. A systematic study of cross‐service label agreement concretized such issues: using eight datasets, spanning professionally produced and user‐generated images, the work showed that image‐recognition services disagree on the most suitable labels for images. Beyond supporting caveats expressed in prior literature, the report articulates two mitigation strategies, both involving the use of multiple image‐recognition services: Highly explorative research could include all the labels, accepting noisier but less restrictive analysis output. Alternatively, scholars may employ word‐embedding‐based approaches to identify concepts that are similar enough for their purposes, then focus on those labels filtered in.
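
The two mitigation strategies named in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the labels, the three-dimensional "embedding" vectors, the 0.9 threshold, and the function names `union_labels` and `filter_similar` are all invented here. A real analysis would take labels from the services' actual responses and vectors from a trained word-embedding model.

```python
import math

# Hypothetical labels from three services for one image (made-up data).
labels_by_service = {
    "google": ["dog", "grass", "pet"],
    "azure": ["puppy", "outdoor", "grass"],
    "amazon": ["canine", "lawn", "animal"],
}

# Toy 3-dimensional vectors invented for illustration; they stand in for
# vectors that a trained word-embedding model would provide.
toy_vectors = {
    "dog": (0.9, 0.1, 0.0),
    "puppy": (0.85, 0.15, 0.05),
    "canine": (0.88, 0.12, 0.02),
    "pet": (0.7, 0.3, 0.1),
    "animal": (0.6, 0.4, 0.1),
    "grass": (0.05, 0.1, 0.95),
    "lawn": (0.1, 0.1, 0.9),
    "outdoor": (0.1, 0.6, 0.6),
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def union_labels(labels_by_service):
    """Strategy 1: keep every label from every service (noisy, unrestrictive)."""
    return set().union(*labels_by_service.values())


def filter_similar(labels_by_service, concept, vectors, threshold):
    """Strategy 2: per service, keep only labels close enough to `concept`.

    Labels without a vector are dropped; `threshold` is an assumed cutoff
    that a researcher would need to tune for their own purposes.
    """
    target = vectors[concept]
    return {
        service: [
            label
            for label in labels
            if label in vectors and cosine(vectors[label], target) >= threshold
        ]
        for service, labels in labels_by_service.items()
    }


if __name__ == "__main__":
    print(sorted(union_labels(labels_by_service)))
    print(filter_similar(labels_by_service, "dog", toy_vectors, 0.9))
```

In this toy setup, the first strategy simply pools all labels, while the second lets differently worded labels from different services count as the same concept once their similarity to a concept of interest passes the chosen cutoff.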

Computer science · Artificial intelligence · Image recognition · Data science · Information retrieval
