Practices for Managing Machine Learning Products: A Multivocal Literature Review
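By reviewing industry blogs and academic papers, the study identifies 91 practices spanning the full life cycle of machine learning systems and groups them into six core categories, helping organizations identify process gaps and improve management.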
Machine learning (ML) has grown in popularity in the software industry due to its ability to solve complex problems. Developing ML systems involves more uncertainty and risk than conventional software development because it requires identifying a business opportunity and managing source code, data, and trained models. Our research aims to identify the existing practices used in industry for building ML applications and to understand the organizational complexity of adopting ML systems. We conducted a multivocal literature review and then created a taxonomy of the practices applied across the ML system life cycle, as discussed by practitioners and researchers. The core of the study emerged from 41 selected posts from the grey literature and 37 selected scientific papers. Applying Initial Coding and Focused Coding techniques to these data, we mapped 91 practices into six core categories related to designing, developing, testing, and deploying ML systems. The results, including the taxonomy of practices, give organizations insights for identifying gaps in their current ML processes and practices, along with a roadmap for improving, optimizing, and managing ML systems.