LLM-Guided Aerial Navigation
Toward Embedded LLM-Guided Navigation and Object Detection for Aerial Robots
Richie R. Suganda★, Bin Hu
🎥 Video
IEEE International Conference on Robotics and Automation (ICRA) 2025 — Late Breaking Session
Overview
We present a hierarchical framework that connects natural language commands to autonomous quadrotor navigation. A LLaMA model fine-tuned with LoRA translates high-level natural language instructions into structured task goals, which a ModalAI Seeker drone then executes using onboard visual-inertial odometry (VIO) based control, path planning, and real-time object detection.
System architecture: Natural language commands are processed by a LoRA-tuned LLM to generate task specifications, which are then executed by the drone's onboard autonomy stack.
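To illustrate the hand-off between the language model and the autonomy stack, here is a minimal sketch of how raw LLM text might be validated and turned into a structured task goal before the drone acts on it. The JSON schema, the field names (`action`, `target_object`, `waypoint`), the action vocabulary, and the `TaskGoal` and `parse_task_goal` names are all hypothetical; the actual task specification format used in this work is not described here.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical action vocabulary, assumed for illustration only.
ALLOWED_ACTIONS = {"takeoff", "goto", "search", "land"}


@dataclass(frozen=True)
class TaskGoal:
    """Hypothetical structured task goal passed to the onboard planner."""
    action: str
    target_object: Optional[str] = None
    waypoint: Optional[Tuple[float, float, float]] = None  # assumed x, y, z in meters


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_task_goal(llm_output: str) -> TaskGoal:
    """Validate raw LLM text and convert it to a TaskGoal, or raise ValueError."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")

    target = data.get("target_object")
    if target is not None and not isinstance(target, str):
        raise ValueError("target_object must be a string")

    waypoint = data.get("waypoint")
    if waypoint is not None:
        if not (isinstance(waypoint, list) and len(waypoint) == 3
                and all(_is_number(v) for v in waypoint)):
            raise ValueError("waypoint must be a list of three numbers")
        waypoint = tuple(float(v) for v in waypoint)

    return TaskGoal(action=action, target_object=target, waypoint=waypoint)


if __name__ == "__main__":
    reply = '{"action": "goto", "waypoint": [1, 2, 1.5], "target_object": "chair"}'
    print(parse_task_goal(reply))
```

Rejecting malformed or out-of-vocabulary output at this boundary is one plausible way to keep free-form model text from reaching the flight controller directly.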
Key Contributions
- LLM-to-Robot Pipeline: End-to-end framework from natural language instructions to physical robot execution
- Efficient Fine-Tuning: LoRA-based adaptation of LLaMA for robotics-specific instruction following
- Hardware-in-the-Loop: Real-world testing on a ModalAI Seeker drone with onboard compute
- Integrated Perception: Simultaneous VIO-based localization, path planning, and object detection
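As background on why LoRA counts as efficient fine-tuning: it keeps a pretrained weight matrix W (d_out x d_in) frozen and learns only a low-rank update B·A, where A is rank x d_in and B is d_out x rank. The sketch below compares trainable parameter counts for a single layer. The layer size (4096 x 4096) and the rank (8) are assumed values chosen for illustration, not the settings used in this work.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Assumed values for illustration only.
    d_in = d_out = 4096
    rank = 8
    dense = full_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"dense: {dense:,}  lora: {lora:,}  ratio: {lora / dense:.4%}")
```

Under these assumed sizes, the adapter trains well under one percent of the parameters of the layer it modifies, which is what makes this kind of adaptation attractive when compute is limited.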
BibTeX
@inproceedings{suganda2025llm,
title={Toward Embedded LLM-Guided Navigation and Object Detection for Aerial Robots},
author={Suganda, Richie R. and Hu, Bin},
booktitle={2025 IEEE International Conference on Robotics and Automation (ICRA) -- Late Breaking Session},
year={2025},
address={Atlanta, USA}
}