A restless bandit model for dynamic ride matching with reneging travelers
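Studies a large-scale ride-matching problem in which travelers may abandon the service after waiting too long, proposes a bivariate index policy that maximizes the platform's long-run average revenue, and shows through simulations with real-world data that it outperforms baseline policies.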
This paper studies a large-scale ride-matching problem in which each traveler is either a driver with a vehicle or a rider seeking a shared vehicle. A driver can be matched with riders who have similar itineraries, so that they share the same vehicle. The model also accounts for reneging travelers, who become impatient and leave the service system after waiting too long for a shared ride. The aim is to maximize the long-run average revenue of the ride service vendor, defined as the difference between the long-run average reward earned by providing ride services and the long-run average penalty incurred when travelers renege. The problem is complicated by its scale, the heterogeneity of travelers (in terms of origins, destinations, and travel preferences), and the reneging behavior. To address these difficulties, we formulate the ride-matching problem as a specific Markov decision process and propose a scalable ride-matching policy, referred to as the Bivariate Index (BI) policy. The BI policy prioritizes travelers according to a ranking of their bivariate indices, which we prove, in a special case, leads to an optimal policy for the relaxed version of the ride-matching problem. For the general case, extensive numerical simulations of systems with real-world travel demands demonstrate that the BI policy significantly outperforms baseline policies.
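The objective stated above (long-run average reward minus long-run average penalty) can be illustrated with a minimal sketch that estimates it from a finite event log. The `Event` record, the `"served"`/`"reneged"` labels, and the `average_revenue` helper are hypothetical names introduced here for illustration only; they are not taken from the paper, and a finite-horizon average is only an approximation of the long-run quantity.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Event:
    """One logged outcome (hypothetical record, not from the paper)."""
    time: float    # time at which the outcome occurred
    kind: str      # "served" (ride provided) or "reneged" (traveler left)
    amount: float  # reward if served, penalty if reneged; non-negative


def average_revenue(events: Sequence[Event], horizon: float) -> float:
    """Estimate (total reward - total penalty) / horizon over [0, horizon]."""
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    # Only outcomes that occur within the observation window count.
    in_window = [e for e in events if 0 <= e.time <= horizon]
    reward = sum(e.amount for e in in_window if e.kind == "served")
    penalty = sum(e.amount for e in in_window if e.kind == "reneged")
    return (reward - penalty) / horizon


# Usage: two served rides and one reneging traveler over 4 time units.
log = [
    Event(time=1.0, kind="served", amount=10.0),
    Event(time=2.0, kind="reneged", amount=4.0),
    Event(time=3.0, kind="served", amount=6.0),
]
print(average_revenue(log, horizon=4.0))  # (16 - 4) / 4 = 3.0
```

Under this reading, a matching policy is better when it raises this average as the horizon grows, either by serving more travelers or by letting fewer of them renege.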