Discrete‐item inventory control involving unknown censored demand and convex inventory costs
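This paper studies how to control discrete‐item inventory when the demand distribution is unknown and sales are lost, so that demand is censored. It proposes a policy that separates learning from doing, whose O(T^{2/3}) regret upper bound matches the theoretical Ω(T^{2/3}) lower bound; numerical experiments confirm its competitiveness.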
We study inventory control with lost sales and hence censored demand, in a long‐run average framework where the demand distribution is largely unknown. As long as the stationary inventory costs are strictly convex, to the extent that the second lost item costs strictly more than the first, the regret is <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline" overflow="scroll"> <mml:semantics definitionURL="" encoding=""> <mml:mrow> <mml:mi mathvariant="normal">Ω</mml:mi> <mml:mo stretchy="false">(</mml:mo> <mml:msup> <mml:mi>T</mml:mi> <mml:mrow> <mml:mn>2</mml:mn> <mml:mo>/</mml:mo> <mml:mn>3</mml:mn> </mml:mrow> </mml:msup> <mml:mo stretchy="false">)</mml:mo> </mml:mrow> <mml:annotation encoding="">$\Omega (T^{2/3})$</mml:annotation> </mml:semantics> </mml:math>. The discrete‐item setting has two consequences. First, the presence or absence of strong censoring indicators (equivalently, whether or not one more demand request is observed after the inventory is depleted) becomes a critical issue. Second, any gradient‐based method designed for the continuous‐item case becomes ineffective. We propose a policy that deliberately orders up to very high levels in designated learning periods and, in the remaining doing periods, uses base‐stock levels tailored to near‐empirical distributions formed over the learning periods. This policy achieves a matching <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline" overflow="scroll"> <mml:semantics definitionURL="" encoding=""> <mml:mrow> <mml:mi>O</mml:mi> <mml:mo stretchy="false">(</mml:mo> <mml:msup> <mml:mi>T</mml:mi> <mml:mrow> <mml:mn>2</mml:mn> <mml:mo>/</mml:mo> <mml:mn>3</mml:mn> </mml:mrow> </mml:msup> <mml:mo stretchy="false">)</mml:mo> </mml:mrow> <mml:annotation encoding="">$O(T^{2/3})$</mml:annotation> </mml:semantics> </mml:math> upper bound. The results hold even when items are nonperishable. 
Numerical experiments further illustrate the relative competitiveness of our separate learning‐doing policy.
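The separate learning‐doing idea described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the cost function `period_cost`, the schedule that places roughly T^{2/3} learning periods at the start of the horizon, the cap `y_max`, the use of the plain empirical distribution in place of a near‐empirical one, and the assumption that items perish at the end of each period are all hypothetical choices made only for illustration.

```python
import random


def period_cost(y, d, h=1.0, b=4.0):
    """Hypothetical convex one-period cost (an assumption, not from the paper).

    Holding cost is linear in leftover stock. The k-th lost item costs b*k,
    so the second lost item costs strictly more than the first.
    """
    leftover = max(y - d, 0)
    lost = max(d - y, 0)
    return h * leftover + b * lost * (lost + 1) / 2


def best_base_stock(samples, y_max):
    """Base-stock level in 0..y_max minimizing total cost over the samples."""
    return min(range(y_max + 1),
               key=lambda y: sum(period_cost(y, d) for d in samples))


def learn_then_do(demands, y_max):
    """Run a learn-first, do-afterwards policy; return (total cost, doing level).

    Assumed schedule: the first round(T^(2/3)) periods are learning periods.
    Items are assumed perishable, so no stock carries over between periods.
    """
    T = len(demands)
    n_learn = max(1, round(T ** (2 / 3)))
    total = 0.0

    # Learning periods: order up to y_max so that sales reveal the demand
    # whenever it does not exceed y_max (otherwise it is still censored).
    samples = []
    for d in demands[:n_learn]:
        samples.append(min(d, y_max))
        total += period_cost(y_max, d)

    # Doing periods: fixed base-stock level fitted to the learned samples.
    y_do = best_base_stock(samples, y_max)
    for d in demands[n_learn:]:
        total += period_cost(y_do, d)
    return total, y_do


if __name__ == "__main__":
    random.seed(0)
    demo_demands = [random.randint(0, 8) for _ in range(1000)]
    cost, level = learn_then_do(demo_demands, y_max=20)
    print(f"doing-period base-stock level: {level}, total cost: {cost:.1f}")
```

Ordering up to `y_max` in the learning periods is costly in holding terms, which is the price paid for observing demand without censoring; the sketch makes that trade‐off visible but says nothing about the regret rate, which depends on the paper's actual schedule and estimator.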