Building a WhatsApp AI Assistant with OpenAI and Python: Step-by-Step 2026 Guide

I’ve been tinkering with WhatsApp bots for years, and nothing beats the moment your own AI assistant actually replies to a friend’s message with a coherent, context-aware answer. In this guide, I’ll walk you through building a WhatsApp AI assistant using OpenAI’s GPT-4o (or GPT-4-turbo) and Python, step by step, with real code you can run today. No fluff, no crystal-ball gazing—just a working setup for 2026.

What You’ll Need

Before we dive in, let’s get the prerequisites straight. I’ve found that skipping this step leads to half an hour of debugging later. Here’s the exact stack I used:

Component	Version / Service	Notes
Python	3.11+	3.10 works, but 3.11 has better async support
OpenAI API	GPT-4o (latest)	You’ll need an API key with billing enabled
Twilio	WhatsApp Business API	Free trial gives you a sandbox number
Flask	2.3+	Lightweight web server for webhooks
ngrok	Latest	Exposes localhost to the internet for testing

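I’m assuming you have a Twilio account (the free trial is fine) and an OpenAI API key. If not, pause here and grab those. Both accounts are free to create, but OpenAI API calls are billed per token, so you’ll need billing enabled on that side.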

Step 1: Set Up Your Python Environment

First, create a project folder and a virtual environment. I always do this to keep dependencies clean.

mkdir whatsapp-ai-assistant
cd whatsapp-ai-assistant
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Now install the packages we need:

pip install flask twilio openai python-dotenv requests

Create a .env file in the root directory. This keeps your secrets out of version control:

OPENAI_API_KEY=sk-your-openai-key-here
TWILIO_ACCOUNT_SID=your-twilio-account-sid
TWILIO_AUTH_TOKEN=your-twilio-auth-token
TWILIO_WHATSAPP_NUMBER=+14155238886  # Twilio sandbox number
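
A missing key in .env usually surfaces only as a confusing error on the first request. As a minimal sketch, a startup check along these lines fails fast instead. The helper name missing_env is hypothetical, and the variable names are the ones from the .env file above.

```python
import os

# Names match the .env file above; adjust if you rename anything.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_WHATSAPP_NUMBER",
)


def missing_env(names=REQUIRED_VARS, environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


# Possible use at the top of app.py, after load_dotenv() (not run here):
#   missing = missing_env()
#   if missing:
#       raise SystemExit("Missing environment variables: " + ", ".join(missing))
```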

Step 2: Configure Twilio WhatsApp Sandbox

Twilio’s WhatsApp sandbox is the easiest way to test. Log into your Twilio console, go to Messaging > Try it out > Send a WhatsApp message. You’ll get a sandbox number (usually +14155238886) and a join code like join your-code. Send that code to the number from your phone to activate the sandbox.

In the Twilio console, set the “When a message comes in” webhook URL to your ngrok URL (we’ll get that in Step 4) with the path /webhook. For now, just note the sandbox number.
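
It helps to know what Twilio will send to that URL. Each incoming message arrives as a form-encoded POST. The sketch below parses a hand-written sample body with the standard library to show the two fields the webhook in Step 3 reads. The sample values are made up, parse_webhook_body is a hypothetical helper, and real requests carry more fields than shown.

```python
from urllib.parse import parse_qs

# Hand-written sample of a form-encoded webhook body (values are made up).
SAMPLE_BODY = (
    "From=whatsapp%3A%2B15551234567"
    "&To=whatsapp%3A%2B14155238886"
    "&Body=Hello+there"
    "&NumMedia=0"
)


def parse_webhook_body(raw_body):
    """Decode a form-encoded body into a flat dict of first values."""
    parsed = parse_qs(raw_body, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


fields = parse_webhook_body(SAMPLE_BODY)
print(fields["From"])  # sender address, prefixed with "whatsapp:"
print(fields["Body"])  # message text
```

In Flask, request.values does this decoding for you, which is why the webhook can read Body and From directly.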

Step 3: Write the Flask Webhook

This is the heart of the assistant. Create a file called app.py:

import os
from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

app = Flask(__name__)
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

# Simple conversation memory (in-memory, resets on restart)
conversation_history = {}

@app.route('/webhook', methods=['POST'])
def webhook():
    incoming_msg = request.values.get('Body', '').strip()
    sender = request.values.get('From', '')
    
    # Initialize or get conversation history for this sender
    if sender not in conversation_history:
        conversation_history[sender] = [
            {"role": "system", "content": "You are a helpful WhatsApp assistant. Keep responses concise (under 500 characters) and friendly."}
        ]
    
    # Add user message to history
    conversation_history[sender].append({"role": "user", "content": incoming_msg})
    
    # Keep only last 10 messages to avoid token overflow
    if len(conversation_history[sender]) > 10:
        conversation_history[sender] = [conversation_history[sender][0]] + conversation_history[sender][-9:]
    
    try:
        # Call OpenAI
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=conversation_history[sender],
            max_tokens=300,
            temperature=0.7
        )
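        reply = response.choices[0].message.content
        # Store the reply only on success, so error text never enters the context
        conversation_history[sender].append({"role": "assistant", "content": reply})
    except Exception:
        # Log the full error server-side instead of leaking API details to the user
        app.logger.exception("OpenAI request failed")
        reply = "Sorry, I ran into a problem. Please try again in a moment."
    
    # Send reply via Twilio as TwiML (XML)
    twilio_response = MessagingResponse()
    twilio_response.message(reply)
    return str(twilio_response), 200, {"Content-Type": "text/xml"}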

if __name__ == '__main__':
    app.run(port=5000, debug=True)

A few things I’ve learned the hard way: the system prompt matters. I keep it short because WhatsApp messages are usually brief. The 10-message cap keeps API costs from spiraling, because every request resends the whole stored history. Trust me, I once had a 200-message conversation that cost $3.
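
The trimming rule is easier to test when pulled out of the webhook. Here is a minimal sketch, assuming the system prompt is always the first entry in the list. trim_history is a hypothetical helper name, not part of any library.

```python
def trim_history(history, max_messages=10):
    """Keep the system prompt plus the most recent messages.

    Assumes history[0] is the system prompt. Returns a new list with at
    most max_messages entries.
    """
    if max_messages < 2:
        raise ValueError("max_messages must leave room for the system prompt and one message")
    if len(history) <= max_messages:
        return list(history)
    # System prompt first, then the newest (max_messages - 1) entries
    return [history[0]] + history[-(max_messages - 1):]
```

Counting messages is only a rough proxy for cost. A few long messages can use more tokens than many short ones, so a character or token budget would be a tighter limit.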

Step 4: Expose Your Local Server with ngrok

Start the Flask server with python app.py, then open a new terminal (keep the Flask server running) and run:

ngrok http 5000

You’ll see a forwarding URL like https://abc123.ngrok.io. Copy that. Now go back to your Twilio console and set the webhook URL to https://abc123.ngrok.io/webhook. Make sure the method is POST.

Test it by sending a message from your phone to the Twilio sandbox number. You should get an AI reply within 2-3 seconds.
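
Once the URL is public, anyone who finds it can POST to /webhook and spend your OpenAI credits. Twilio signs each request with an X-Twilio-Signature header, and the twilio package ships a RequestValidator class for checking it. The sketch below reimplements the documented scheme (HMAC-SHA1 over the URL plus the sorted form parameters, base64-encoded) with the standard library, purely to show what is being checked. The function names are hypothetical, and in a real app the library's validator is the safer choice.

```python
import base64
import hashlib
import hmac


def compute_signature(auth_token, url, params):
    """Build the expected signature for a form-encoded POST.

    The signed string is the full request URL followed by each parameter
    name and value, concatenated in order of sorted parameter names.
    """
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(auth_token, url, params, header_signature):
    """Compare signatures in constant time to avoid timing leaks."""
    expected = compute_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature or "")
```

In the webhook, the inputs would be the public URL Twilio called, request.form.to_dict(), and request.headers.get('X-Twilio-Signature'), with a 403 returned when the check fails. Behind ngrok or another proxy, request.url may not match the public URL that Twilio signed, so the URL may need to come from configuration.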

Step 5: Add Context Awareness (Optional but Powerful)

The basic version works, but it forgets everything when the server restarts. For a production assistant, I recommend using a lightweight database like SQLite. Here’s a quick upgrade:

import sqlite3

def init_db():
    conn = sqlite3.connect('conversations.db')
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS messages
                 (id INTEGER PRIMARY KEY AUTOINCREMENT,
                  sender TEXT,
                  role TEXT,
                  content TEXT,
                  timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)''')
    conn.commit()
    conn.close()

def get_history(sender, limit=10):
    conn = sqlite3.connect('conversations.db')
    c = conn.cursor()
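    # Order by id, not timestamp: CURRENT_TIMESTAMP has one-second resolution,
    # so a user message and its reply can tie and come back in the wrong order
    c.execute('''SELECT role, content FROM messages 
                 WHERE sender = ? ORDER BY id DESC LIMIT ?''', (sender, limit))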
    rows = c.fetchall()
    conn.close()
    # Reverse to get chronological order
    messages = [{"role": row[0], "content": row[1]} for row in reversed(rows)]
    return messages

def save_message(sender, role, content):
    conn = sqlite3.connect('conversations.db')
    c = conn.cursor()
    c.execute('''INSERT INTO messages (sender, role, content) VALUES (?, ?, ?)''',
              (sender, role, content))
    conn.commit()
    conn.close()

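Then modify the webhook to use these functions instead of the in-memory dictionary: call init_db() once at startup, load the history with get_history(sender), and save each user message and reply with save_message(). I’ve found this makes the assistant feel much more personal. It remembers that you asked about pizza last week.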
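
One detail is easy to miss here: get_history returns only the stored user and assistant turns, so the system prompt has to be prepended on every request. A minimal sketch of that assembly step follows. build_messages is a hypothetical helper, and SYSTEM_PROMPT reuses the prompt text from Step 3.

```python
SYSTEM_PROMPT = (
    "You are a helpful WhatsApp assistant. "
    "Keep responses concise (under 500 characters) and friendly."
)


def build_messages(history, incoming_msg, system_prompt=SYSTEM_PROMPT):
    """Assemble the message list for one chat completion request.

    history is the list of {"role", "content"} dicts loaded from storage,
    oldest first. The new user message goes last.
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)
    messages.append({"role": "user", "content": incoming_msg})
    return messages


# Wiring inside webhook(), using the SQLite functions above (not run here):
#   messages = build_messages(get_history(sender), incoming_msg)
#   ...call OpenAI with messages=messages and read the reply...
#   save_message(sender, "user", incoming_msg)
#   save_message(sender, "assistant", reply)
```

Saving both turns only after a successful API call keeps failed requests out of the stored history.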

Step 6: Handle Media and Errors

WhatsApp messages can include images, documents, or locations. Here’s how to handle media attachments gracefully. This fragment goes inside webhook(), before the OpenAI call:

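import requests  # place with the other imports at the top of app.py

# Twilio sets MediaUrl0 and MediaContentType0 when a message has an attachment
media_url = request.values.get('MediaUrl0')
media_type = request.values.get('MediaContentType0', '')

if media_url:
    if media_type.startswith('image/'):
        # Download the image. Twilio media URLs can require HTTP basic auth
        # (account SID and auth token), depending on your account settings.
        img_data = requests.get(
            media_url,
            auth=(os.getenv('TWILIO_ACCOUNT_SID'), os.getenv('TWILIO_AUTH_TOKEN')),
            timeout=10,
        ).content
        # You'd need to base64 encode img_data and send it as a vision message
        # For simplicity, just acknowledge
        reply = "I received your image! Image analysis isn't wired up yet."
    else:
        reply = "I can only process text and images right now."
    # Reply right away and skip the OpenAI call for media messages
    twilio_response = MessagingResponse()
    twilio_response.message(reply)
    return str(twilio_response)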

I’ve intentionally kept this simple—full image analysis with GPT-4o vision requires a different API call structure, but this gives you a starting point.
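
For reference, here is roughly what that different structure looks like. With the Chat Completions API, an image is sent as a content part alongside the text, and a base64 data URL is one accepted way to pass the bytes. The helper name build_vision_message is hypothetical, and the sketch only builds the message. It does not call the API.

```python
import base64


def build_vision_message(image_bytes, mime_type, question):
    """Return a user message that pairs a question with an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The result would take the place of the plain-text user message in the messages list, built from the downloaded image bytes and the MediaContentType0 value. Base64 images are large, so it is probably worth keeping only the text of such turns in the stored history.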

Testing and Deployment

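Once everything works locally, you can deploy to a cloud platform like Render, Railway, or a VPS. Set the environment variables there, turn off Flask’s debug mode (debug=True exposes an interactive debugger you don’t want on a public server), and update the Twilio webhook URL to your production domain.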
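
Many hosting platforms tell the app which port to bind to through a PORT environment variable. Treat that as an assumption and check your platform's docs. Below is a minimal sketch of reading the port and a debug flag from the environment, so the same app.py runs locally and on a host. Both server_settings and the FLASK_DEBUG_MODE variable name are hypothetical.

```python
import os


def server_settings(environ=None):
    """Derive host, port, and debug flag from environment variables."""
    env = os.environ if environ is None else environ
    debug = env.get("FLASK_DEBUG_MODE", "0") == "1"  # hypothetical variable name
    return {
        # Bind to all interfaces when deployed; stay on localhost while debugging
        "host": "127.0.0.1" if debug else "0.0.0.0",
        "port": int(env.get("PORT", "5000")),
        "debug": debug,
    }


# In app.py this would replace the hard-coded call (not run here):
#   app.run(**server_settings())
```

For anything beyond a hobby bot, a production WSGI server such as gunicorn is the usual choice over Flask's built-in server.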

Here’s a quick comparison of deployment options I’ve tried:

Platform	Free Tier	Ease of Setup	Best For
Render	Yes (sleeps after inactivity)	Very easy	Quick prototyping
Railway	$5 credit	Easy	Always-on bots
VPS (DigitalOcean)	No	Moderate	Full control

I personally use Render for testing and Railway for production—the $5 credit lasts months for a low-traffic bot.

Final Thoughts

You now have a working WhatsApp AI assistant built on OpenAI’s GPT-4o. The whole setup takes about 30 minutes from scratch. I’ve been running a version of this for my family group chat, and it handles everything from recipe questions to homework help. The key is keeping the system prompt tight and the conversation history manageable.

One last tip: monitor your OpenAI usage. I set a hard monthly budget of $10 in the OpenAI dashboard. It’s easy to forget and let a chatty friend run up the bill. Happy building!