Why Connect an LLM to a ROS2 Robot?
Here’s a scenario I found myself in a few weeks ago: my ROS2-based differential drive robot could navigate autonomously, avoid obstacles, and follow walls. But when I said “go to the kitchen and bring me a bottle of water,” it just stared at me with its blinking LED. The navigation stack had no idea what “kitchen” meant — it only understood waypoints and coordinates.
That’s the gap an LLM bridges. By integrating a large language model with your ROS2 robot, you give it natural language understanding, reasoning, and task-planning capabilities. Instead of hardcoding behaviors, your robot can interpret commands, break them into sub-tasks, and even handle novel situations it wasn’t explicitly programmed for.
In this tutorial, I’ll walk you through a practical integration using Ollama (running a local LLM) connected to a ROS2 node. By the end, you’ll have a working system where your robot can receive a natural language command, have the LLM parse it, and execute the corresponding ROS2 actions.
What You’ll Need
- A machine running ROS2 (Humble or later) — could be your robot’s onboard computer or a development workstation
- Python 3.10+ with the requests library installed
- Ollama installed locally (we covered this in the Raspberry Pi Ollama guide — works on any Linux system the same way)
- A ROS2 workspace set up with colcon
Step 1: Get a Model Ready in Ollama
If you haven’t already, pull a model that’s good at instruction-following and reasoning. I’ve had the best results with qwen2.5:7b for this use case — it’s fast enough for real-time interaction on a decent CPU and surprisingly good at parsing structured commands.
ollama pull qwen2.5:7b
Once it’s downloaded, test that the API is accessible:
curl http://localhost:11434/api/generate -d '{
"model": "qwen2.5:7b",
"prompt": "You are a robot command parser. Extract the action and target from this command: go to the kitchen",
"stream": false
}'
If you get a JSON response back with the generated text, you’re good. Your Ollama REST API is running at localhost:11434.
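The same check from Python, using the requests library the node will rely on in Step 3 (a quick sanity script, not part of the package):

import requests

resp = requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'qwen2.5:7b', 'prompt': 'Reply with OK', 'stream': False},
    timeout=60,
)
print(resp.json()['response'])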
Step 2: Create a ROS2 Package for the LLM Integration
Let’s create a ROS2 package that will host our LLM integration node. Navigate to your ROS2 workspace’s src directory:
cd ~/ros2_ws/src
ros2 pkg create --build-type ament_python llm_integration
cd llm_integration/llm_integration
We'll add three Python files to this package: the main LLM node, a simple command publisher to simulate human input for testing, and (in Step 5) an action executor that maps parsed commands to robot behavior.
Step 3: The Core LLM Client Node
This is where the magic happens. Create llm_command_node.py:
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
import requests
import json
import re


class LLMCommandNode(Node):
    def __init__(self):
        super().__init__('llm_command_node')
        self.publisher = self.create_publisher(String, 'robot_commands', 10)
        self.subscription = self.create_subscription(
            String,
            'human_input',
            self.listener_callback,
            10)
        self.ollama_url = 'http://localhost:11434/api/generate'
        self.model = 'qwen2.5:7b'
        self.get_logger().info('LLM Command Node ready. Waiting for human input...')

    def listener_callback(self, msg):
        user_command = msg.data
        self.get_logger().info(f'Received: "{user_command}"')
        parsed = self.query_llm(user_command)
        if parsed:
            cmd_msg = String()
            cmd_msg.data = parsed
            self.publisher.publish(cmd_msg)
            self.get_logger().info(f'Published parsed command: {parsed}')

    def query_llm(self, command):
        system_prompt = (
            "You are a robot command parser for a ROS2 robot. "
            "Given a natural language command, output ONLY a JSON dict with keys: "
            "'action' (one of: move_to, pick_up, follow, stop, explore, home) "
            "and 'target' (a location name or object). "
            "Examples:\n"
            "'go to the kitchen' → {\"action\": \"move_to\", \"target\": \"kitchen\"}\n"
            "'pick up the bottle' → {\"action\": \"pick_up\", \"target\": \"bottle\"}\n"
            "'stop right now' → {\"action\": \"stop\", \"target\": \"none\"}\n"
            "Respond ONLY with the JSON. No explanation."
        )
        payload = {
            "model": self.model,
            "prompt": f"{system_prompt}\n\nCommand: {command}\n\nJSON:",
            "stream": False,
            # Ollama expects sampling parameters inside "options",
            # not at the top level of the request body
            "options": {"temperature": 0.1}
        }
        try:
            resp = requests.post(self.ollama_url, json=payload, timeout=30)
            resp.raise_for_status()
            text = resp.json()['response'].strip()
            # Strip any markdown code fences the LLM might wrap around the JSON
            text = re.sub(r'^```(?:json)?\s*', '', text)
            text = re.sub(r'\s*```$', '', text)
            json.loads(text)  # validate before publishing
            return text
        except Exception as e:
            self.get_logger().error(f'Ollama query failed: {str(e)}')
            return None


def main(args=None):
    rclpy.init(args=args)
    node = LLMCommandNode()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
This node subscribes to a /human_input topic, passes each command through Ollama, and publishes the structured JSON result to /robot_commands.
Step 4: A Simple Command Publisher for Testing
Create command_publisher.py — we’ll use this to send test commands from the terminal:
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
import sys
import time


class CommandPublisher(Node):
    def __init__(self):
        super().__init__('command_publisher')
        self.publisher = self.create_publisher(String, 'human_input', 10)

    def send_command(self, cmd):
        # Wait briefly for DDS discovery, otherwise this one-shot message
        # can be dropped before the LLM node's subscription is matched
        deadline = time.time() + 2.0
        while self.publisher.get_subscription_count() == 0 and time.time() < deadline:
            time.sleep(0.1)
        msg = String()
        msg.data = cmd
        self.publisher.publish(msg)
        self.get_logger().info(f'Sent: "{cmd}"')


def main(args=None):
    rclpy.init(args=args)
    node = CommandPublisher()
    cmd = ' '.join(sys.argv[1:]) if len(sys.argv) > 1 else 'go to home'
    node.send_command(cmd)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
Step 5: Wire Up the ROS2 Node and Action Execution
Now we need a node that receives the parsed JSON and maps it to actual ROS2 actions. Create action_executor.py:
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
import json


class ActionExecutor(Node):
    def __init__(self):
        super().__init__('action_executor')
        self.subscription = self.create_subscription(
            String,
            'robot_commands',
            self.callback,
            10)
        self.get_logger().info('Action Executor ready.')

    def callback(self, msg):
        try:
            data = json.loads(msg.data)
            action = data.get('action')
            target = data.get('target')
            self.get_logger().info(f'Executing action: {action} on target: {target}')
            if action == 'move_to':
                self.move_to(target)
            elif action == 'stop':
                self.stop_robot()
            elif action == 'explore':
                self.start_exploration()
            else:
                self.get_logger().warn(f'Unknown action: {action}')
        except json.JSONDecodeError:
            self.get_logger().error(f'Failed to parse command JSON: {msg.data}')

    def move_to(self, location):
        # In a real robot, publish to /cmd_vel or use a Nav2 action client
        self.get_logger().info(f'→ Navigating to {location}...')
        # Example: publish navigation goal
        # goal_msg = PoseStamped()
        # goal_msg.header.frame_id = 'map'
        # goal_msg.pose.position.x = waypoints.get(location, [0.0, 0.0])[0]
        # ...

    def stop_robot(self):
        self.get_logger().info('→ Stopping all motion.')
        # Publish zero velocity to cmd_vel

    def start_exploration(self):
        self.get_logger().info('→ Starting autonomous exploration.')
        # Trigger frontier exploration or SLAM


def main(args=None):
    rclpy.init(args=args)
    node = ActionExecutor()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
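To make move_to actually drive the robot, you can replace the commented-out placeholder with a Nav2 action client. Here's a minimal sketch, assuming Nav2 is already running; the WAYPOINTS dict is my own placeholder mapping location names to map coordinates, which you'd fill in for your own map:

from rclpy.node import Node
from rclpy.action import ActionClient
from nav2_msgs.action import NavigateToPose

# Placeholder lookup table: location names -> (x, y) in the map frame
WAYPOINTS = {'kitchen': (3.2, 1.5), 'living room': (0.8, -2.0), 'home': (0.0, 0.0)}


class NavGoalSender(Node):
    def __init__(self):
        super().__init__('nav_goal_sender')
        # Nav2's NavigateToPose action server listens on 'navigate_to_pose'
        self.nav_client = ActionClient(self, NavigateToPose, 'navigate_to_pose')

    def move_to(self, location):
        if location not in WAYPOINTS:
            self.get_logger().warn(f'No waypoint stored for "{location}"')
            return
        x, y = WAYPOINTS[location]
        goal = NavigateToPose.Goal()
        goal.pose.header.frame_id = 'map'
        goal.pose.header.stamp = self.get_clock().now().to_msg()
        goal.pose.pose.position.x = x
        goal.pose.pose.position.y = y
        goal.pose.pose.orientation.w = 1.0  # identity orientation; refine per waypoint
        self.nav_client.wait_for_server()
        self.nav_client.send_goal_async(goal)
        self.get_logger().info(f'Sent Nav2 goal for "{location}" at ({x}, {y})')

In practice you'd fold this move_to into the ActionExecutor above rather than running a separate node, and handle the goal result via the future that send_goal_async returns.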
Step 6: Run It!
Make sure to add the entry points in your setup.py, then build and run:
# In setup.py, under entry_points:
entry_points={
    'console_scripts': [
        'llm_command_node = llm_integration.llm_command_node:main',
        'command_publisher = llm_integration.command_publisher:main',
        'action_executor = llm_integration.action_executor:main',
    ],
},
cd ~/ros2_ws
colcon build --packages-select llm_integration
source install/setup.bash
In three separate terminals:
# Terminal 1: Start the LLM command node
ros2 run llm_integration llm_command_node
# Terminal 2: Start the action executor
ros2 run llm_integration action_executor
# Terminal 3: Send a command
ros2 run llm_integration command_publisher "go to the living room"
You should see the LLM parse the command into a structured action and the action executor pick it up. The command publisher exits once the message is sent, but the other two nodes keep running, waiting for more input.
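If the executor prints nothing, you can watch the intermediate topic directly to see exactly what the LLM produced:

ros2 topic echo /robot_commands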
Going Further: Multi-Turn Task Planning
For more complex commands like “clean the dining table and then water the plants,” you can extend the pipeline:
- Use a multi-turn prompt with a conversation history buffer
- Have the LLM output a sequence of actions instead of a single one
- Add a feedback loop: the action executor publishes status back, which gets fed into the LLM for re-planning
I’ve been experimenting with a simple state machine where the LLM generates a plan as a list of JSON actions, and the executor works through them one by one, reporting results back. It’s not perfect — the LLM can hallucinate actions — but for structured indoor tasks, it works surprisingly well.
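Here's a minimal sketch of that plan-as-a-list idea. The PLAN_PROMPT wording and the execute_action callback are my own placeholders, and in the real node you'd also reuse the fence-stripping and validation from Step 3:

import json
import requests

PLAN_PROMPT = (
    "You are a robot task planner. Break the command into an ordered JSON list "
    "of steps, each {\"action\": ..., \"target\": ...}. Respond ONLY with the list.\n\n"
    "Command: "
)


def plan_and_execute(command, execute_action):
    # execute_action(step) -> bool stands in for the ActionExecutor call
    resp = requests.post(
        'http://localhost:11434/api/generate',
        json={'model': 'qwen2.5:7b', 'prompt': PLAN_PROMPT + command,
              'stream': False, 'options': {'temperature': 0.1}},
        timeout=60,
    )
    plan = json.loads(resp.json()['response'])
    for i, step in enumerate(plan):
        if not execute_action(step):
            # Feed the failure back into the LLM here if you want re-planning
            print(f'Step {i} failed ({step}), aborting the rest of the plan')
            break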
Performance Notes for 2026 Hardware
If you’re running this on a Raspberry Pi 5 or a Jetson Orin, stick with qwen2.5:0.5b or phi3.5:3.8b for usable latency (under 2 seconds per query). On a desktop with a GPU, bump up to qwen2.5:14b or llama3.1:8b for significantly better instruction-following.
The key bottleneck is the Ollama HTTP call. With rclpy's default single-threaded executor, a blocking request inside a callback stalls the whole node, so if your LLM takes 10 seconds to respond, your robot will pause before acting. Consider running the LLM inference in a separate thread with a command queue, as sketched below.
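A minimal sketch of that pattern, added to the same file as Step 3's node so it can reuse query_llm and the String import. The callback only enqueues; a daemon thread owns the slow HTTP call:

import threading
import queue


class ThreadedLLMNode(LLMCommandNode):
    def __init__(self):
        super().__init__()
        self.work_queue = queue.Queue()
        # Daemon thread dies with the process and owns all blocking HTTP work
        threading.Thread(target=self._worker, daemon=True).start()

    def listener_callback(self, msg):
        # Return immediately so spin() stays responsive while Ollama thinks
        self.work_queue.put(msg.data)

    def _worker(self):
        while True:
            command = self.work_queue.get()  # blocks until a command arrives
            parsed = self.query_llm(command)
            if parsed:
                out = String()
                out.data = parsed
                self.publisher.publish(out)  # rclpy publish is thread-safe

Swap LLMCommandNode() for ThreadedLLMNode() in main() to use it.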
What’s Next?
This is the foundation. From here, you can:
- Integrate with Nav2 — pass the parsed location to the Nav2 action server for full SLAM-based navigation
- Use MoveIt2 for arm manipulation when the LLM says “pick up”
- Add vision: connect a webcam, run a YOLO detection node, and have the LLM reason about detected objects
- Build a voice interface: pipe speech-to-text (like Whisper) into /human_input
I’ve been running this setup on my university’s TurtleBot4 for about two months now, and it’s completely changed how I prototype robot behaviors. Instead of writing state machines for every new scenario, I just tell the robot what I want. Try it — you’ll never want to go back to hardcoded commands.
