2022 Technical Report

Report No.

Title / Authors / Abstract

View

TR-IIS-22-001

Time-optimized velocity trajectory of a bounded-input double integrator with uncertainties: a solution based on PILCO
Hsuan-Cheng Liao, Wen-Chieh Tung, Han-Jung Chou, Jing-Sin Liu

Reinforcement learning (RL) is a promising framework for deeper investigation of robotics and control on account of challenges from uncertainties. In this paper, we document a simulation and an experiment in applying an existing model-based RL framework, PILCO, to the problem of state-to-state time-optimal control with bounded input in the presence of uncertainties. In particular, a Gaussian process is employed to model the dynamics, successfully reducing the effect of model biases. Evaluation of the policy, which is implemented with Gaussian radial basis functions, is done through iterated prediction with Gaussian posteriors and deterministic approximate inference. Finally, analytic gradients are used for policy improvement. A simulation shows successful learning of a double integrator completing a rest-to-rest, nearly time-optimal locomotion over a prespecified stopping distance along a linear track with uniform viscous friction. Time-optimality and data efficiency of the learning are demonstrated in the results. In addition, an experimental validation on an inexpensive robot car shows the generalization potential and consistency of model-based RL when leveraged for similar systems and similar tasks. Moreover, a rescaling transformation from the baseline learned triangle velocity profile to a set of safe trapezoid velocity profiles is presented, accommodating an additional velocity limit.

Fulltext

Institute of Information Science, Academia Sinica

Library
