On the Monotonicity and Rate of Convergence of the Markovian Persuasion Value

Mathematics of Operations Research · 2026

Cited by 0 · Top 10% among papers in the same journal and year

ABS 3

Dimitry Shaiderman (corresponding author)

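Summary

Studies the Markovian persuasion model, in which the sender controls the receiver's beliefs through signals. The paper proves that the discounted value is monotonically decreasing in the discount factor and combines this with a known increasing result to derive an upper bound on the rate of convergence.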

Abstract

We study a dynamic Bayesian persuasion model called Markovian persuasion, illustrated here with two players: the sender (he) and the receiver (she). In such a model, the belief of the receiver regarding the current state of a Markov chain [Formula: see text], over a finite state space K, is controlled through signals she obtains from the sender, who observes [Formula: see text] in real time. At each stage [Formula: see text], the receiver takes an action based on her current belief, which, together with the realized state of [Formula: see text], determines the n-th-stage payoff of the sender. The sender’s goal in a Markovian persuasion game is to find a signaling policy that maximizes his expected [Formula: see text]-discounted sum of stage payoffs for a discount factor [Formula: see text]. We show that starting from any invariant distribution [Formula: see text], the trajectory of the [Formula: see text]-discounted value is monotone decreasing in [Formula: see text]. By combining this result with the opposite increasing monotone trajectories found in Lehrer and Shaiderman [Lehrer E, Shaiderman D (2025) Markovian persuasion with stochastic revelations. Games Econom. Behav. 154:411–439], we are able to derive an upper bound on the rate of convergence of the [Formula: see text]-discounted values (as [Formula: see text]) in the case where [Formula: see text] is ergodic. The results for the Markovian persuasion model are then extended to the Markov chain games model of Renault (2006).
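Two ingredients named in the abstract, an invariant distribution of a finite Markov chain and a discounted sum of stage payoffs, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the transition matrix `P`, the symbol `lam` for the discount factor, the `(1 - lam)` normalization, and the use of power iteration (which presumes an ergodic, aperiodic chain). The sketch does not compute the persuasion value itself, which would require optimizing over signaling policies.

```python
def invariant_distribution(P, iters=10_000, tol=1e-12):
    """Approximate a distribution p with p = pP by power iteration.

    P is a row-stochastic matrix given as a list of lists.
    Assumes the chain is ergodic and aperiodic so the iteration converges.
    """
    k = len(P)
    p = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(iters):
        # One step of the chain: q_j = sum_i p_i * P[i][j]
        q = [sum(p[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(p, q)) < tol:
            return q
        p = q
    return p


def discounted_payoff(stage_payoffs, lam):
    """Normalized discounted sum (1 - lam) * sum_n lam**(n-1) * g_n.

    The (1 - lam) normalization is an assumed convention, not the paper's.
    """
    return (1 - lam) * sum(lam ** n * g for n, g in enumerate(stage_payoffs))


# Hypothetical two-state chain, chosen only for illustration.
P = [[0.9, 0.1],
     [0.2, 0.8]]
p = invariant_distribution(P)
```

For this hypothetical chain the balance equation `0.1 * p[0] = 0.2 * p[1]` gives the invariant distribution (2/3, 1/3), which the iteration approximates.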

Stochastic games · Markov chains · Dynamic Bayesian persuasion · Discount factor · Rate of convergence