Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization

Operations Research · 2021

Cited by 8

Renmin University A · FT50 · UTD24 · ABS 4*

Qi Zhang · State University of New York at Stony Brook
Jiaqiao Hu · State University of New York at Stony Brook

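Summary

Proposes an adaptive search algorithm that incorporates ideas from actor-critic reinforcement learning to solve continuous simulation optimization problems. The algorithm makes full use of historical simulation data, comes with a finite-time analysis for the case where only a single simulation observation is collected at each iteration, and performs well on a variety of benchmark problems.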

Abstract

Many systems arising in applications from engineering design, manufacturing, and healthcare require the use of simulation optimization (SO) techniques to improve their performance. In “Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization,” Q. Zhang and J. Hu propose a randomized approach that integrates ideas from actor-critic reinforcement learning within a class of adaptive search algorithms for solving SO problems. The approach fully retains the previous simulation data and incorporates them into an approximation architecture to exploit knowledge of the objective function in searching for improved solutions. The authors provide a finite-time analysis for the method when only a single simulation observation is collected at each iteration. The method works well on a diverse set of benchmark problems and has the potential to yield good performance for complex problems using expensive simulation experiments for performance evaluation.

Simulation optimization · Reinforcement learning · Adaptive search algorithms · Stochastic optimization
