RL-Enhanced Disturbance-Aware MPC for Robust UAV Trajectory Tracking
📄 Accepted at IEEE SMC 2025 (to appear)
This research introduces ROAM, a reinforcement learning (RL)-enhanced, disturbance-aware model predictive control (MPC) framework for precise UAV trajectory tracking in uncertain and dynamic environments. The method combines the predictive strengths of MPC with the fast response of a learned RL policy and the robustness of an adaptive sliding mode observer (SMO).
Problem and Motivation
Traditional UAV controllers using MPC struggle under model mismatch, wind disturbances, and computational delays, resulting in residual tracking errors and slow convergence. This work addresses those challenges via two innovations:
- An offline-trained RL warm-start policy to accelerate MPC convergence
- An Adaptive Super-Twisting Sliding Mode Observer (AST-SMO) to estimate and reject real-time disturbances
Technical Contributions
1. RL-Based Warm Start
- A direction-conditioned policy is trained via imitation learning on expert MPC trajectories.
- During real-time control, it provides trajectory-consistent initial guesses to the MPC solver, reducing early-stage tracking error by 16.9% and computation time by 38.7%.
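The warm-start idea can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the ROAM implementation: the plant is a double integrator rather than a quadrotor, the MPC problem is unconstrained, the "policy" is a plain least-squares fit to expert MPC solutions (a stand-in for the direction-conditioned policy), and the online solver is simple gradient descent. All names (`expert_mpc`, `solve_iteratively`, `W`) are hypothetical.

```python
import numpy as np

# Toy double-integrator model (an assumption, not the paper's quadrotor model).
dt, N = 0.1, 10
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([10.0, 1.0])   # state weights (illustrative values)
R = 0.1                    # input weight (illustrative value)

# Condensed prediction over the horizon: X = Phi x0 + Gam U.
Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
Gam = np.zeros((2 * N, N))
for i in range(N):
    for j in range(i + 1):
        Gam[2 * i:2 * i + 2, j:j + 1] = np.linalg.matrix_power(A, i - j) @ B
Qbar = np.kron(np.eye(N), Q)
H = Gam.T @ Qbar @ Gam + R * np.eye(N)   # QP Hessian
F = Gam.T @ Qbar @ Phi                   # QP gradient is H U + F x0

def expert_mpc(x0):
    """Exact solution of the unconstrained QP, used here as the 'expert'."""
    return np.linalg.solve(H, -F @ x0)

def solve_iteratively(x0, U_init, tol=1e-6, max_iter=10_000):
    """Gradient descent on the QP; returns (solution, gradient steps taken)."""
    step = 1.0 / np.linalg.eigvalsh(H)[-1]   # 1 / largest eigenvalue of H
    U, g = U_init.copy(), F @ x0
    for it in range(max_iter):
        grad = H @ U + g
        if np.linalg.norm(grad) < tol:
            return U, it
        U -= step * grad
    return U, max_iter

# Imitation learning reduced to least squares: fit U ~ x0 @ W on expert data.
rng = np.random.default_rng(0)
X_train = rng.uniform(-2.0, 2.0, size=(200, 2))
U_train = np.array([expert_mpc(x) for x in X_train])
W, *_ = np.linalg.lstsq(X_train, U_train, rcond=None)   # shape (2, N)

x0 = np.array([1.0, 0.0])
U_cold, cold_iters = solve_iteratively(x0, np.zeros(N))   # cold start from zeros
U_warm, warm_iters = solve_iteratively(x0, x0 @ W)        # warm start from the policy
print(f"cold start: {cold_iters} steps, warm start: {warm_iters} steps")
```

Because this toy problem is linear and unconstrained, a linear fit can reproduce the expert almost exactly, so the warm-started solver needs far fewer steps than the cold start. With constraints or nonlinear dynamics the saving would be smaller, and the figures reported above come from the authors' own setup, not from this sketch.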
2. AST-SMO for Disturbance Estimation
- The SMO estimates external disturbances in real time, using a smooth hyperbolic function in place of a discontinuous switching term to avoid chattering.
- An adaptive gain-tuning mechanism adjusts the observer gains online to improve convergence.
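A smoothed super-twisting observer of this kind can be sketched for a scalar system. The example below is a minimal illustration under assumptions, not the AST-SMO from the paper: the plant is a first-order system x' = u + d, the smooth function is assumed to be `tanh`, and the gain adaptation rule (grow a gain level `L` while the estimation error exceeds a threshold `mu`) is a hypothetical placeholder, since the summary does not state the actual adaptation law.

```python
import math

def run_observer(n_steps=10_000, dt=1e-3, eps=0.01):
    """Simulate x' = u + d and estimate d with a smoothed super-twisting observer."""
    x = x_hat = z = 0.0   # plant state, observer state, disturbance estimate
    # Hypothetical adaptation parameters: gain level, its cap, error threshold, growth rate.
    L, L_max, mu, rate = 1.0, 10.0, 0.02, 5.0
    log = []
    for k in range(n_steps):
        t = k * dt
        d = 0.5 + 0.3 * math.sin(0.5 * t)   # true disturbance, unknown to the observer
        u = -x                               # any known control input
        e = x - x_hat                        # output estimation error
        s = math.tanh(e / eps)               # smooth replacement for sign(e)
        if abs(e) > mu:                      # raise the gain level while the error is large
            L = min(L + rate * dt, L_max)
        k1, k2 = 1.5 * math.sqrt(L), 1.1 * L
        # Super-twisting structure: square-root correction plus integral term z.
        x_hat += dt * (u + k1 * math.sqrt(abs(e)) * s + z)
        z += dt * k2 * s
        x += dt * (u + d)                    # forward-Euler plant update
        log.append((t, d, z))
    return log

log = run_observer()
tail_error = max(abs(d - z) for _, d, z in log[-1000:])
print(f"max |d - d_hat| over the last second: {tail_error:.4f}")
```

The integral state `z` plays the role of the disturbance estimate. Replacing `sign(e)` with `tanh(e / eps)` trades exact finite-time convergence for a small residual error inside a boundary layer of width roughly `eps`, which is the usual price of removing chattering.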
3. Disturbance-Aware MPC
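- MPC is reformulated to incorporate real-time estimates from AST-SMO: \[ x_{k+1} = Ax_k + Bu_k + E(\hat{d}_k) \] Here \(x_k\) is the state, \(u_k\) the control input, \(\hat{d}_k\) the disturbance estimate supplied by the AST-SMO, and \(E\) maps that estimate into the state dynamics.
- Objective: minimize both tracking error and control effort, subject to system constraints.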
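The effect of adding the disturbance term to the prediction model can be sketched on a toy problem. The example below is a minimal illustration under assumptions, not the ROAM formulation: the plant is a double integrator, the disturbance enters like an acceleration (so `E = B`), the estimate is held constant over the horizon and taken to equal the true disturbance (a stand-in for the observer output), and constraints are omitted so the quadratic program has a closed-form solution. The function names are hypothetical.

```python
import numpy as np

dt, N = 0.1, 20
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
E = B.copy()               # assumption: disturbance acts like an acceleration
Q = np.diag([10.0, 1.0])   # tracking-error weights (illustrative values)
R = 0.1                    # control-effort weight (illustrative value)

def stack(M):
    """Block lower-triangular map from a stacked input sequence to predicted states."""
    G = np.zeros((2 * N, N))
    for i in range(N):
        for j in range(i + 1):
            G[2 * i:2 * i + 2, j:j + 1] = np.linalg.matrix_power(A, i - j) @ M
    return G

Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
Gam, Gam_d = stack(B), stack(E)
Qbar = np.kron(np.eye(N), Q)
H = Gam.T @ Qbar @ Gam + R * np.eye(N)

def mpc_input(x, d_hat, x_ref):
    """First input of the unconstrained tracking QP with d_hat held constant."""
    # Predicted error with zero input: free response plus disturbance response.
    free = Phi @ x + Gam_d @ np.full(N, d_hat) - np.tile(x_ref, N)
    U = np.linalg.solve(H, -Gam.T @ Qbar @ free)
    return U[0]

def run(use_estimate, d_true=1.0, steps=300):
    """Closed-loop simulation; returns the final absolute position error."""
    x, x_ref = np.array([1.0, 0.0]), np.array([0.0, 0.0])
    for _ in range(steps):
        d_hat = d_true if use_estimate else 0.0   # stand-in for the observer output
        u = mpc_input(x, d_hat, x_ref)
        x = A @ x + B[:, 0] * u + E[:, 0] * d_true
    return abs(x[0] - x_ref[0])

err_nominal = run(use_estimate=False)
err_aware = run(use_estimate=True)
print(f"final position error: nominal {err_nominal:.4f}, disturbance-aware {err_aware:.4f}")
```

With the estimate in the prediction model, the optimizer plans inputs that counteract the disturbance ahead of time, so the steady-state offset seen by the nominal controller shrinks. A constrained version would replace the `np.linalg.solve` call with a QP solver.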
Simulation Results
- Evaluated on a 12-state (6-DOF) quadrotor model under sinusoidal and noisy disturbances.
- ROAM achieved:
  - 16.9% improvement in early-stage tracking accuracy
  - 38.7% reduction in computation time
  - Superior trajectory adherence under heavy external disturbances compared to classical MPC
Conclusion
ROAM demonstrates in simulation that tightly integrating RL, a disturbance observer, and MPC yields a controller with faster convergence, better stability, and higher resilience to disturbances. Its lightweight, modular design makes it well suited to real-time deployment on embedded UAV platforms.