CaMeRL: Collision-Aware and Memory-Enhanced Reinforcement Learning for UAV Navigation in Multi-Scale Obstacle Environments


Hong Hong^*, Feiyu Liao^*, Yongheng Liang, Boning Zhang, Haitao Wang^†, Hejun Wu^†

^*Equal contribution ^†Corresponding authors

Sun Yat-sen University

Abstract

In obstacle-avoidance navigation for unmanned aerial vehicles (UAVs), variation in obstacle scale has received far less attention than obstacle number or density. Existing methods typically extract purely geometric features from single-frame depth observations. Such representations tend to miss small obstacles and lose spatial context when large obstacles occlude the scene, which noticeably degrades performance in environments with multi-scale obstacles.

To address this issue, we propose CaMeRL, a Collision-aware and Memory-enhanced Reinforcement Learning framework for UAV navigation. The collision-aware latent representation encodes risk-sensitive depth cues to preserve fine-grained obstacle structures, improving the policy's sensitivity to small obstacles. The temporal memory module integrates observations across frames, mitigating the partial observability caused by large-obstacle occlusions.

We evaluate CaMeRL in environments with multi-scale obstacles, including ultra-small and extra-large obstacle settings. Results show that CaMeRL outperforms state-of-the-art baselines across all scales, with success rate gains of 0.48 and 0.28 in the ultra-small and extra-large settings, respectively. CaMeRL also demonstrates reliable navigation in cluttered outdoor environments.

Method

CaMeRL combines two complementary representation modules. A VAE is trained to reconstruct collision-aware depth images, where obstacle boundaries are inflated by the UAV body size, embedding safety-relevant geometry into the latent space without runtime preprocessing. An LSTM then aggregates latent sequences and is supervised by jointly reconstructing both the current and a temporally offset collision-aware observation, encouraging the hidden state to maintain scene context under occlusion. The navigation policy is subsequently trained via RL with the representation modules frozen.
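The two supervision signals described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the pinhole projection of the body radius, the square inflation window, the clamped frame offset, the equal loss weighting, and all function and parameter names (`inflate_depth`, `offset_targets`, `dual_reconstruction_loss`, `body_radius_m`, `focal_px`, `offset`) are assumptions made for illustration. The source does not state the size or direction of the temporal offset.

```python
import numpy as np


def inflate_depth(depth, body_radius_m, focal_px):
    """Build a collision-aware depth image (illustrative sketch).

    Every valid pixel spreads its depth over a square window whose half-width
    is the UAV body radius projected to pixels at that depth (pinhole model,
    an assumption). Each output pixel keeps the nearest depth covering it.
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    out = depth.copy()
    for v in range(h):
        for u in range(w):
            d = depth[v, u]
            if not np.isfinite(d) or d <= 0.0:
                continue  # assumption: invalid depth readings are left untouched
            r = int(np.ceil(focal_px * body_radius_m / d))
            v0, v1 = max(v - r, 0), min(v + r + 1, h)
            u0, u1 = max(u - r, 0), min(u + r + 1, w)
            window = out[v0:v1, u0:u1]  # view into `out`, updated in place
            np.minimum(window, d, out=window)
    return out


def offset_targets(frames, offset):
    """Pair frame t with frame t + offset, clamped to the sequence bounds.

    `offset` is a hypothetical hyperparameter; a negative value selects
    earlier frames instead of later ones.
    """
    frames = np.asarray(frames)
    t = np.arange(len(frames))
    t_off = np.clip(t + offset, 0, len(frames) - 1)
    return frames, frames[t_off]


def dual_reconstruction_loss(pred_now, pred_offset, target_now, target_offset):
    """Sum of two mean-squared errors (equal weighting is an assumption)."""
    loss_now = np.mean((pred_now - target_now) ** 2)
    loss_offset = np.mean((pred_offset - target_offset) ** 2)
    return float(loss_now + loss_offset)
```

In this sketch the inflation is applied offline to produce reconstruction targets, so the encoder still receives the raw depth image at run time, which matches the statement that no runtime preprocessing is needed. Nearer obstacles inflate by more pixels than distant ones because the projected body radius grows as depth shrinks.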

Multi-scale Obstacle Avoidance in Simulation

The policy is trained in a nominal-scale environment and evaluated zero-shot across six obstacle scales ranging from ultra-small (1–5 cm) to extra-large (400–500 cm). CaMeRL achieves consistently high success rates across all configurations, demonstrating robust generalization to obstacle scale variation without retraining.
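As a rough illustration of how per-scale success rates could be tallied in such a zero-shot evaluation, the sketch below aggregates episode outcomes by scale label. The function name `success_rate_by_scale` and the episode tuples are hypothetical, and the outcomes shown are placeholders, not results from the paper.

```python
from collections import defaultdict


def success_rate_by_scale(episodes):
    """Return {scale_label: fraction of successful episodes}.

    `episodes` is an iterable of (scale_label, succeeded) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # scale -> [successes, total]
    for scale, succeeded in episodes:
        counts[scale][0] += int(bool(succeeded))
        counts[scale][1] += 1
    return {scale: ok / total for scale, (ok, total) in counts.items()}


# Placeholder outcomes for illustration only (not measured data).
toy_episodes = [
    ("ultra-small", True),
    ("ultra-small", False),
    ("extra-large", True),
]
rates = success_rate_by_scale(toy_episodes)
```

Success rates computed this way are fractions in [0, 1], so a reported gain such as 0.48 corresponds to 48 percentage points.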

Real-world Deployment

CaMeRL is deployed on a 250 mm quadrotor equipped with an Intel RealSense D435 depth camera, an NVIDIA Jetson Orin NX onboard computer, and a Pixhawk 6C Mini flight controller. The representation modules are fine-tuned on real-world depth data, while the policy transfers directly from simulation. Outdoor flight experiments confirm the feasibility of fully onboard deployment.

Overview of the CaMeRL architecture and training pipeline.

Multi-scale Obstacle Avoidance in Simulation

Real-world Deployment

Collision-Aware and Memory-Enhanced Reinforcement Learning
for UAV Navigation in Multi-Scale Obstacle Environments