A Partially Observed Markov Decision Process for Dynamic Pricing

Management Science · 2005
Cited by 213
Renmin University of China A+ · FT50 · UTD24 · ABS 4*

Summary

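The paper builds a partially observed Markov decision process framework to study how a retailer of fashion-like goods should price dynamically over a finite sales season to maximize expected revenue, and it proposes an active-learning heuristic pricing policy.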

Abstract

In this paper, we develop a stylized partially observed Markov decision process (POMDP) framework to study a dynamic pricing problem faced by sellers of fashion-like goods. We consider a retailer that plans to sell a given stock of items during a finite sales season. The objective of the retailer is to dynamically price the product in a way that maximizes expected revenues. Our model brings together various types of uncertainties about the demand, some of which are resolvable through sales observations. We develop a rigorous upper bound for the seller’s optimal dynamic decision problem and use it to propose an active-learning heuristic pricing policy. We conduct a numerical study to test the performance of four different heuristic dynamic pricing policies in order to gain insight into several important managerial questions that arise in the context of revenue management.

Keywords: dynamic pricing · partially observed Markov decision process · demand uncertainty · active learning · heuristic policy