RL-Enhanced Disturbance-Aware MPC for Robust UAV Trajectory Tracking

📄[Accepted at IEEE SMC 2025] — To appear

This research introduces ROAM, a novel RL-enhanced, disturbance-aware MPC framework for precise UAV trajectory tracking in uncertain and dynamic environments. The method combines the predictive strengths of MPC with the fast response of reinforcement learning (RL) and the robustness of an adaptive sliding mode observer (SMO).

Problem and Motivation

Traditional UAV controllers using MPC struggle under model mismatch, wind disturbances, and computational delays, resulting in residual tracking errors and slow convergence. This work addresses those challenges via two innovations:

- An offline-trained RL warm-start policy to accelerate MPC convergence
- An Adaptive Super-Twisting Sliding Mode Observer (AST-SMO) to estimate and reject real-time disturbances

Technical Contributions

1. RL-Based Warm Start

A direction-conditioned policy is trained via imitation learning on expert MPC trajectories.
During real-time control, it provides trajectory-consistent initial guesses to the MPC solver, reducing early-stage tracking error by 16.9% and computation time by 38.7%.
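
To illustrate why a good initial guess helps, here is a minimal sketch, not the paper's implementation. It assumes a scalar integrator model, a plain gradient-descent solver in place of a real MPC solver, and a hypothetical stand-in for the learned policy that returns the expert solution with a 10% error. All names and parameters (`rollout`, `solve`, `step`, `tol`) are illustrative assumptions.

```python
def rollout(x0, u_seq, a=1.0, b=0.1):
    """Predict states of the toy model x_next = a*x + b*u over the horizon."""
    xs, x = [], x0
    for u in u_seq:
        x = a * x + b * u
        xs.append(x)
    return xs


def cost_and_grad(x0, u_seq, ref, r=0.01, a=1.0, b=0.1):
    """Tracking cost sum(e^2) + r*sum(u^2) and its gradient w.r.t. the inputs."""
    errs = [x - xr for x, xr in zip(rollout(x0, u_seq, a, b), ref)]
    cost = sum(e * e for e in errs) + r * sum(u * u for u in u_seq)
    n = len(u_seq)
    grad = []
    for j in range(n):
        g = 2.0 * r * u_seq[j]
        for k in range(j, n):
            # Input u_j affects state k (k >= j) through the factor b * a**(k - j).
            g += 2.0 * errs[k] * b * a ** (k - j)
        grad.append(g)
    return cost, grad


def solve(x0, ref, u_init, step=1.0, tol=1e-6, max_iter=10000):
    """Gradient descent from u_init; returns the solution and iterations used."""
    u = list(u_init)
    for it in range(max_iter):
        _, g = cost_and_grad(x0, u, ref)
        if max(abs(gi) for gi in g) < tol:
            return u, it
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u, max_iter


ref = [1.0] * 10

# Cold start: the solver begins from an all-zero input sequence.
u_cold, cold_iters = solve(0.0, ref, [0.0] * 10)

# Warm start: stand-in for the imitation-learned policy (assumed 10% off the expert).
u_guess = [0.9 * u for u in u_cold]
u_warm, warm_iters = solve(0.0, ref, u_guess)

print("cold-start iterations:", cold_iters)
print("warm-start iterations:", warm_iters)
```

Both runs reach the same solution; the warm-started run simply begins closer to it, so it needs fewer solver iterations. The size of the saving in this toy problem says nothing about the figures reported for ROAM.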

2. AST-SMO for Disturbance Estimation

The SMO estimates external disturbances in real time using a smooth hyperbolic function to avoid chattering.
An adaptive gain tuning mechanism adjusts sensitivity dynamically for better convergence.
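
The observer idea can be sketched on a scalar system `v_dot = u + d`. This is a hedged illustration under stated assumptions, not the paper's AST-SMO: the adaptation rule (grow the gain while the output error exceeds a threshold, otherwise let it decay), the gain scaling `k1 = 1.5*sqrt(L)`, `k2 = 1.1*L`, and every parameter value are assumptions chosen for the example. The `tanh` term plays the role of the smooth hyperbolic function that replaces the discontinuous sign function.

```python
import math


def simulate_ast_smo(t_end=5.0, dt=1e-3, eps=0.01, gamma=50.0,
                     mu=0.005, L_min=2.0, L_max=20.0):
    """Simulate a smoothed super-twisting observer estimating a disturbance d.

    Returns a list of (t, d_true, d_estimate) samples.
    """
    v = 0.0        # true state
    v_hat = 0.0    # observer state
    z = 0.0        # disturbance estimate
    L = L_min      # adaptive gain (assumed adaptation law, see below)
    history = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        d = 0.2 + 0.5 * math.sin(t)   # unknown disturbance (illustrative)
        u = 0.0                        # no control input in this sketch

        e = v - v_hat                  # output estimation error
        s = math.tanh(e / eps)         # smooth replacement for sign(e)
        k1 = 1.5 * math.sqrt(L)
        k2 = 1.1 * L

        # Super-twisting observer update (explicit Euler).
        v_hat += dt * (u + z + k1 * math.sqrt(abs(e)) * s)
        z += dt * k2 * s

        # Assumed gain adaptation: increase while the error is large, else decay.
        if abs(e) > mu:
            L = min(L_max, L + gamma * abs(e) * dt)
        else:
            L = max(L_min, L - 0.5 * dt)

        # True plant update.
        v += dt * (u + d)
        history.append((t, d, z))
    return history


history = simulate_ast_smo()
# Worst-case estimation error over the final second of the simulation.
final_err = max(abs(d - z) for _, d, z in history[-1000:])
print("max |d - d_hat| over last second:", final_err)
```

The smoothing width `eps` trades chattering against accuracy: a wider boundary layer gives a smoother estimate but a larger residual error.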

3. Disturbance-Aware MPC

MPC is reformulated to incorporate real-time estimates from AST-SMO: \[ x_{k+1} = Ax_k + Bu_k + E(\hat{d}_k) \]
Objective: minimize both tracking error and control effort, while maintaining system constraints.
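
The effect of adding the disturbance term to the prediction model can be shown with a deliberately small sketch. It assumes a scalar model with `A = 1`, a one-step horizon, a single input bound, and a perfect disturbance estimate; none of these simplifications come from the paper, and the names `mpc_step` and `run` are hypothetical.

```python
def mpc_step(x, ref, d_hat, b=0.1, e=0.1, q=1.0, rho=1e-4, u_max=20.0):
    """One-step MPC for the model x_next = x + b*u + e*d_hat with |u| <= u_max."""
    # Minimise q*(x_next - ref)**2 + rho*u**2; the unconstrained optimum is closed-form.
    u = q * b * (ref - x - e * d_hat) / (q * b * b + rho)
    # For this scalar convex problem, clipping to the bound is the constrained optimum.
    return max(-u_max, min(u_max, u))


def run(use_estimate, d_true=1.0, ref=1.0, steps=200, b=0.1, e=0.1):
    """Closed-loop simulation; returns the final tracking error ref - x."""
    x = 0.0
    for _ in range(steps):
        # Assumption: the observer estimate is exact when it is used at all.
        d_hat = d_true if use_estimate else 0.0
        u = mpc_step(x, ref, d_hat, b=b, e=e)
        x = x + b * u + e * d_true   # the true plant always sees the disturbance
    return ref - x


err_nominal = run(use_estimate=False)  # prediction model ignores the disturbance
err_aware = run(use_estimate=True)     # prediction model includes e*d_hat

print("steady-state error, nominal MPC:", err_nominal)
print("steady-state error, disturbance-aware MPC:", err_aware)
```

With a constant disturbance, the nominal controller settles with a persistent offset, while the disturbance-aware version leaves only the small residual caused by the control-effort penalty `rho`.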

Simulation Results

Evaluated on a 12-DOF quadrotor model under sinusoidal and noisy disturbances.
ROAM achieved:
- 16.9% improvement in early-stage tracking accuracy
- 38.7% reduction in computation time
- Superior trajectory adherence under heavy external disturbances compared to classical MPC

Conclusion

In simulation, ROAM shows that tightly integrating an RL warm start, a disturbance observer, and MPC yields faster solver convergence, lower early-stage tracking error, and closer trajectory adherence under external disturbances than classical MPC. Its lightweight, modular design makes it a good candidate for real-time deployment on embedded UAV platforms.

Last updated on Nov 11, 2025
