Policy Optimization in Dynamic Bayesian Network Hybrid Models of Biomanufacturing Processes

INFORMS Journal on Computing · 2022

Cited by 14

Journal lists: Renmin University of China (B) · UTD24 · ABS 3

Hua Zheng · Northeastern University (USA)
Wei Xie · Northeastern University (USA)
Ilya O. Ryzhov · University of Maryland
Dongming Xie · University of Massachusetts Lowell

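Summary

To address data scarcity and interdependent factors in biopharmaceutical processes, the paper proposes a model-based reinforcement learning framework built on a dynamic Bayesian network, achieves human-level control in low-data environments, and optimizes policies with a stochastic gradient method.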

Abstract

Biopharmaceutical manufacturing is a rapidly growing industry with impact in virtually all branches of medicine. Biomanufacturing processes require close monitoring and control, in the presence of complex bioprocess dynamics with many interdependent factors, as well as extremely limited data due to the high cost of experiments and the novelty of personalized bio-drugs. We develop a new model-based reinforcement learning framework that can achieve human-level control in low-data environments. A dynamic Bayesian network is used to capture causal interdependencies between factors and predict how the effects of different inputs propagate through the pathways of the bioprocess mechanisms. This model is interpretable and enables the design of process control policies that are robust against model risk. We present a computationally efficient, provably convergent stochastic gradient method for optimizing such policies. Validation is conducted on a realistic application with a multidimensional, continuous state variable.

History: Accepted by Bruno Tuffin, Area Editor for Simulation.

Funding: This work was partially supported by National Institute of Standards and Technology [Grant 70NANB17H002].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/ijoc.2022.1232.
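The abstract describes optimizing a control policy by stochastic gradient ascent while accounting for uncertainty in the process model. The sketch below illustrates that general idea on a toy problem and is not the paper's model or algorithm. It assumes a single scalar state with a linear Gaussian transition, a one-parameter linear feedback policy, and a Gaussian posterior over one model coefficient. Every name and number here (`TARGET`, `BETA_MEAN`, `rollout`, `optimize`, the step-size schedule, the projection interval) is a hypothetical choice made for illustration.

```python
import numpy as np

# All constants are illustrative assumptions, not values from the paper.
TARGET, HORIZON, NOISE_SD, ACTION_COST = 1.0, 5, 0.05, 0.01
BETA_MEAN, BETA_SD = 0.5, 0.05  # assumed posterior over the input effect


def rollout(theta, beta, rng):
    """Simulate one trajectory; return (total reward, d reward / d theta)."""
    s, ds = 0.0, 0.0  # state and its derivative with respect to theta
    total, grad = 0.0, 0.0
    for _ in range(HORIZON):
        err = TARGET - s
        a = theta * err            # linear feedback policy
        da = err - theta * ds      # product rule: d(theta * err) / d theta
        # Linear Gaussian transition; the noise does not depend on theta.
        s = s + beta * a + NOISE_SD * rng.standard_normal()
        ds = ds + beta * da
        total += -(s - TARGET) ** 2 - ACTION_COST * a ** 2
        grad += -2.0 * (s - TARGET) * ds - 2.0 * ACTION_COST * a * da
    return total, grad


def optimize(n_iter=1000, seed=0):
    """Projected stochastic gradient ascent on the policy parameter."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    for k in range(n_iter):
        beta = rng.normal(BETA_MEAN, BETA_SD)  # sample a plausible model
        _, grad = rollout(theta, beta, rng)
        step = 0.5 / (k + 10)                  # decreasing step size
        theta = float(np.clip(theta + step * grad, 0.0, 3.0))
    return theta


def estimate_return(theta, n=2000, seed=1):
    """Monte Carlo estimate of expected reward under model uncertainty."""
    rng = np.random.default_rng(seed)
    rewards = [rollout(theta, rng.normal(BETA_MEAN, BETA_SD), rng)[0]
               for _ in range(n)]
    return float(np.mean(rewards))


if __name__ == "__main__":
    theta_hat = optimize()
    print("learned gain:", theta_hat)
    print("return at learned gain:", estimate_return(theta_hat))
    print("return with no control:", estimate_return(0.0))
```

Drawing a fresh `beta` at every iteration means the policy is tuned against the whole assumed posterior, not a single point estimate, which is one simple way to read "robust against model risk". A dynamic Bayesian network would replace the scalar transition with a network of linked factors across time steps.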

biomanufacturing · dynamic Bayesian network · reinforcement learning · process control · operations research