一种用于变覆盖旅行商问题的深度强化学习超启发式算法

A deep reinforcement learning hyperheuristic for the covering tour problem with varying coverage

Computers and Operations Research · 2024

被引 13

ABS 3

Parisa Torabi 通讯
Ahmad Hemmati
Anna Oleynik
Guttorm Alendal

中文导读

针对变覆盖旅行商问题，提出一种深度强化学习超启发式算法，实验表明该算法在解质量和收敛速度上优于均匀随机选择和自适应元启发式方法。

Abstract

Covering Tour Problem (CTP) is a combinatorial optimization problem in which the objective is to identify a minimum-cost tour that satisfies the coverage of a certain subset of nodes in a graph. The Covering Tour Problem with Varying Coverage (CTP-VC) is an extension of this problem in which the coverage radius is dependent on the amount of time spent at each node. In this paper, we propose a novel approach to address the CTP-VC using a Deep Reinforcement Learning Hyperheuristic (DRLH). This study includes experiments on the existing Adaptive Metaheuristic to solve CTP-VC, to enhance its solution quality. Further, new heuristics and three selection methods, namely Uniform Random Selection (URS), adaptive Metaheuristic (AMH), and the proposed DRLH are introduced. We detail the computational setup, including the instance sets utilized, the training process for the DRLH agent, and the validation procedures for model selection. Through extensive experimentation and analysis, we evaluate the performance of different selection methods, assess the solution quality of the DRLH approach, investigate the robustness of selection methods, examine heuristic selection frequency, and analyze solution convergence. Our results demonstrate the efficacy of the DRLH approach in tackling the CTP-VC, offering promising insights for future research in the interface of combinatorial optimization and reinforcement learning methodologies. • A Deep Reinforcement Learning method is applied to the CTP with Varying Coverage. • Three selection methods are compared: Uniform Random, Adaptive Metaheuristic, and DRLH. • DRLH shows high-quality solutions, robust across different problem instances. • DRLH performs better with larger instances, favoring large-size problem-solving. • DRLH shows faster convergence compared to URS and Adaptive Meta-Heuristic.

组合优化强化学习超启发式算法旅行商问题

阅读原文 ↗