The LP/POMDP marriage: Optimization with imperfect information

Naval Research Logistics · 2000

Cited by 4

ABS 3

Kirk A. Yost
Alan R. Washburn (corresponding author)

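Summary

The paper proposes a new method for resource allocation problems with partially observable states. A master linear program allocates among control policies, and partially observable Markov decision processes use the master LP's dual prices to find improving policies. The method is demonstrated on an example in which aircraft attack targets in stages.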

Abstract

A new technique for solving large-scale allocation problems with partially observable states and constrained action and observation resources is introduced. The technique uses a master linear program (LP) to determine allocations among a set of control policies, and uses partially observable Markov decision processes (POMDPs) to determine improving policies using dual prices from the master LP. An application is made to a military problem where aircraft attack targets in a sequence of stages, with information acquired in one stage being used to plan attacks in the next. © 2000 John Wiley & Sons, Inc., Naval Research Logistics 47: 607–619, 2000
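
The interplay between the master LP and the POMDP described in the abstract has the shape of a column-generation scheme. The following is only a sketch under assumed notation, not the formulation from the paper: the symbols $P$, $x_p$, $v_p$, $a_{rp}$, $b_r$, $N$, $\lambda_r$, and $\mu$ are all hypothetical names introduced here for illustration.

```latex
% Assumed notation (not from the source):
%   P       : set of control policies generated so far
%   x_p     : number of targets assigned to policy p
%   v_p     : expected reward of applying policy p to one target
%   a_{rp}  : expected consumption of resource r by policy p
%   b_r     : available amount of resource r
%   N       : number of targets to be covered
\begin{align*}
\text{(Master LP)}\quad
\max_{x \ge 0}\quad & \sum_{p \in P} v_p \, x_p \\
\text{s.t.}\quad & \sum_{p \in P} a_{rp} \, x_p \le b_r
      && \forall r \quad (\text{dual price } \lambda_r \ge 0) \\
 & \sum_{p \in P} x_p = N
      && (\text{dual price } \mu)
\end{align*}

% Pricing step: search for a policy \pi outside P with positive reduced cost.
% Charging each unit of resource r at its dual price \lambda_r turns this
% search into an unconstrained POMDP over policies.
\begin{equation*}
\text{(POMDP subproblem)}\quad
\max_{\pi}\; \Bigl( v_\pi - \sum_{r} \lambda_r \, a_{r\pi} \Bigr) - \mu
\end{equation*}
```

Under these assumptions, a policy whose subproblem value is positive would be added to $P$ as a new column and the master LP re-solved; if no such policy exists, the current allocation is optimal for the LP over all policies.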

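Operations research · Markov decision processes · Linear programming · Artificial intelligence · Military operations research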
