Learning to Construct a Solution for the Agile Satellite Scheduling Problem With Time-Dependent Transition Times

IEEE Transactions on Systems, Man, and Cybernetics: Systems · 2024
Cited by 40 · Top 5% of papers in the same journal and year
ABS 3

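Summary

For the agile earth observation satellite scheduling problem with time-dependent transition times, the paper proposes a deep reinforcement learning construction model made up of five parts, including a Markov decision process, feature engineering, and a constructive heuristic neural network. Experiments show that it outperforms existing algorithms in both optimization speed and solution quality.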

Abstract

The agile earth observation satellite scheduling problem (AEOSSP) with time-dependent transition times is a complex combinatorial optimization problem that has emerged from the development of large-scale satellite management techniques. To address this problem, we propose a deep reinforcement learning-based construction model (DRL-CM) that consists of five parts: 1) a Markov decision process (MDP); 2) feature engineering; 3) a constructive heuristic neural network (CHNN); 4) an RL training method; and 5) an evaluation system. Specifically, the CHNN comprises six modules containing three special components that we propose: a dynamic encoder, a dynamic global layer, and a two-stage attention layer. First, we build the MDP of the AEOSSP and design the feature engineering to supply the effective features required for decision-making. Second, we design the CHNN to function as the MDP policy and train it with the RL training method. Finally, we propose a comprehensive evaluation system for the validation of our model. The experimental results indicate that the proposed DRL-CM outperforms the state-of-the-art algorithm in terms of both optimization speed and quality. In addition, the feature engineering and network architecture built in our model are verified to be effective in comprehensive experiments.
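
To make the idea of constructing a solution step by step more concrete, the following is a minimal Python sketch of a constructive rollout with time-dependent transition times. It is not the paper's DRL-CM: the learned CHNN policy is replaced by a hand-written greedy score, and the `Task` fields, the linear attitude model, the slew rate, the settle time, and the one-second search grid are all illustrative assumptions that do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical observation task; every field is an assumption."""
    name: str
    window_start: float   # visible time window opens (s)
    window_end: float     # visible time window closes (s)
    duration: float       # observation duration (s), must be > 0
    profit: float
    angle_at_open: float  # pitch angle at window_start (deg)
    angle_rate: float     # pitch drift inside the window (deg/s)

    def angle(self, t: float) -> float:
        # Assumed linear attitude model: the angle depends on when we look.
        return self.angle_at_open + self.angle_rate * (t - self.window_start)


def transition_time(prev_angle, next_angle, slew_rate=1.0, settle=5.0):
    # Assumed model: fixed settle time plus slewing at a constant rate.
    return settle + abs(next_angle - prev_angle) / slew_rate


def earliest_start(task, free_at, prev_angle, step=1.0):
    # The transition time depends on the start time through task.angle(t),
    # so scan a time grid for the first start that leaves enough slack.
    t = max(task.window_start, free_at)
    while t + task.duration <= task.window_end:
        if free_at + transition_time(prev_angle, task.angle(t)) <= t:
            return t
        t += step
    return None


def profit_rate(task, start, free_at):
    # Stand-in for a learned policy: profit per second of timeline consumed.
    return task.profit / (start + task.duration - free_at)


def construct(tasks, score=profit_rate):
    # State: when the satellite becomes free and its current pitch angle.
    free_at, angle, schedule = 0.0, 0.0, []
    remaining = list(tasks)
    while True:
        candidates = []
        for task in remaining:
            start = earliest_start(task, free_at, angle)
            if start is not None:
                candidates.append((score(task, start, free_at), task, start))
        if not candidates:
            return schedule  # no feasible action left: episode ends
        _, task, start = max(candidates, key=lambda c: c[0])
        schedule.append((task.name, start))
        free_at = start + task.duration
        angle = task.angle(free_at)
        remaining.remove(task)


demo = [
    Task("A", 0.0, 100.0, 10.0, 5.0, -10.0, 0.2),
    Task("B", 20.0, 80.0, 10.0, 8.0, 0.0, 0.0),
    Task("C", 0.0, 15.0, 10.0, 3.0, 30.0, 0.0),
]
print(construct(demo))
```

In this sketch each loop iteration plays the role of one MDP step: the state is the satellite's free time and attitude, the action is the next task to append, and the episode ends when no task can be feasibly scheduled. Swapping `profit_rate` for a trained scoring function would be the natural place for a learned policy to plug in.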

Satellite scheduling · Combinatorial optimization · Deep reinforcement learning · Operations research