Towards a Semantically Enriched Mapping System for Mobile Robots Using Large Language Models (LLMs)

Illustration, AI generated (Google AI)

Mobile robots often rely on a geometric representation of their environment for navigation, typically a 2D occupancy grid map. However, this representation lacks the semantic context that humans use to describe and understand their surroundings (e.g., "living room," "kitchen," "table"). Recent advancements in Large Language Models (LLMs) and their ability to act as agents open up new possibilities for intuitive human-robot interaction through natural language commands. This master's thesis aims to develop a system that leverages LLM-based agents to create, query, and manipulate a semantically enriched map of a robot's environment.

The core objective is to enable the robot to understand and execute natural language instructions related to mapping, such as "point out the soldering iron in our lab." The resulting map should go beyond purely geometric data by associating objects and locations with meaningful semantic labels. The system will be based on ROS 2 and may build on existing functionality of the ROS 2 Navigation Stack (Nav2). One potential implementation approach is a 3D scene graph, which provides a structured way to represent objects, their properties, and their relationships. A secondary research question concerns environmental dynamics: how should the system react if a previously mapped object changes its location or disappears? The focus of this work is on the mapping system itself, not on the robot's locomotion or control.

Advisor:

Sven Lange, sven.lange@…

Requirements:

Basic knowledge of Linux/Ubuntu is desirable.
Basic knowledge of ROS 2 and Python programming.
Basic knowledge of LLMs.