摘要: |
面向避障、绕飞等任务驱动的飞行器在线轨迹,为了提升制导性能,适应快速变化的复杂场景,聚焦于充分利用飞行器模型中的已知信息,基于iLQR这种有模型强化学习方法,设计了智能化的制导方式。与无模型强化学习相比,有模型强化学习的可解释性好,训练难度低。在单飞行器制导仿真中,相比TD3算法,iLQR方法飞行过程平均制导误差增加了28.07%,中末交班点误差降低到12.35%,提升幅度巨大;在多飞行器编队保持问题上,相比TD3算法,iLQR方法跟踪效果提升巨大,平均误差不超过TD3算法的22.67%,最大误差不超过TD3算法的15.44%。 |
关键词: iLQR算法 有模型强化学习 标准轨迹制导 强化学习制导 编队保持 |
Research on Aircraft Guidance Technology Based on Model-Based Reinforcement Learning |
TENG Qinghua, HUI Junpeng, LI Tianren, YANG Ben
(Research & Development Center, China Academy of Launch Vehicle Technology, Beijing 100076, China; Beijing Institute of Space Long March Vehicle, Beijing 100076, China)
Abstract: |
For task-driven online aircraft trajectories, such as obstacle avoidance and fly-around maneuvers, this paper designs an intelligent guidance method that improves guidance performance and adapts to rapidly changing, complex scenarios. The method is based on iLQR, a model-based reinforcement learning approach, and makes full use of the known information in the aircraft model. Compared with model-free reinforcement learning, model-based reinforcement learning offers better interpretability and is easier to train. In single-aircraft guidance simulations, the average guidance error of iLQR during flight is 28.07% higher than that of the TD3 algorithm, but the error at the midcourse-to-terminal handover point falls to 12.35% of the TD3 value. In the multi-aircraft formation-keeping problem, iLQR tracks substantially better than TD3: its average error is no more than 22.67%, and its maximum error no more than 15.44%, of the corresponding TD3 errors.
Key words: iLQR algorithm; model-based reinforcement learning; standard trajectory guidance; reinforcement learning guidance; formation keeping