How to Give Your Telegram Bot Conversation Memory with python-telegram-bot and OpenAI
A Telegram bot that forgets every message the moment it replies is barely more useful than a search bar. If you're forwarding user messages to OpenAI's Chat Completions API, you need to maintain a per-user message history array — otherwise the model has zero context, and multi-turn conversations are impossible. This is a common gap in beginner implementations, and fixing it cleanly requires understanding both where to store state and how to structure the messages payload.
🧠 Why the Bot Forgets
OpenAI's Chat Completions API is stateless. Every request you send must include the full conversation history in the messages array. If you only send the latest user message, the model treats it as the first message in a brand-new conversation. Your bot isn't broken — it's just not accumulating history before each API call.
The fix has two parts:
- Maintain an in-memory (or persistent) list of message dicts per user
- Append each new user message before calling the API, then append the assistant's reply after
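Concretely, a multi-turn request ends up carrying the whole exchange. A minimal sketch of what the `messages` array looks like by the third user turn (the contents are illustrative):

```python
# Shape of the messages array the API expects on every request.
# Nothing is remembered server-side; this list *is* the conversation state.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "My name is Ada."},
    {"role": "assistant", "content": "Nice to meet you, Ada!"},
    # Answerable only because the earlier turns are included above:
    {"role": "user", "content": "What's my name?"},
]
```

Drop the first three entries and the model has no way to answer the last question — that is exactly the "forgetting" behaviour described above.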
📐 Architecture: Per-User Conversation History
The cleanest approach for a single-process bot is a plain Python dictionary keyed by Telegram chat_id. Each value is the running messages list for that user. This lives in memory for the lifetime of the process — good enough for development and low-traffic bots.
Complete Working Example
```python
import os

from openai import OpenAI
from telegram import Update
from telegram.ext import ApplicationBuilder, ContextTypes, MessageHandler, filters

# In-memory store: { chat_id: [ {role, content}, ... ] }
conversation_history: dict[int, list[dict]] = {}

SYSTEM_PROMPT = "You are a helpful assistant."
OPENAI_CLIENT = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


def get_history(chat_id: int) -> list[dict]:
    """Return existing history or initialise with system prompt."""
    if chat_id not in conversation_history:
        conversation_history[chat_id] = [
            {"role": "system", "content": SYSTEM_PROMPT}
        ]
    return conversation_history[chat_id]


async def handle_message(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    chat_id = update.effective_chat.id
    user_text = update.message.text

    # 1. Retrieve this user's history
    history = get_history(chat_id)

    # 2. Append the new user turn
    history.append({"role": "user", "content": user_text})

    # 3. Call OpenAI with the full history
    response = OPENAI_CLIENT.chat.completions.create(
        model="gpt-4o-mini",
        messages=history,
    )
    assistant_reply = response.choices[0].message.content

    # 4. Append the assistant turn so future calls include it
    history.append({"role": "assistant", "content": assistant_reply})

    # 5. Reply on Telegram
    await update.message.reply_text(assistant_reply)


if __name__ == "__main__":
    token = os.environ["TELEGRAM_BOT_TOKEN"]
    app = ApplicationBuilder().token(token).build()
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_message))
    app.run_polling()
```
Install dependencies with `pip install python-telegram-bot openai` (use `python-telegram-bot[job-queue]` if you need scheduled tasks).
🔒 Capping History Length
Unbounded history will eventually exceed the model's context window and inflate your token costs. A simple sliding-window approach keeps the system prompt and the most recent N turns:
```python
MAX_TURNS = 20  # each turn = 1 user + 1 assistant message = 2 entries


def trim_history(history: list[dict]) -> list[dict]:
    """Keep system prompt + last MAX_TURNS*2 messages."""
    system = [m for m in history if m["role"] == "system"]
    conversation = [m for m in history if m["role"] != "system"]
    trimmed = conversation[-(MAX_TURNS * 2):]
    return system + trimmed
```
Call `history = trim_history(history)` before the API call and reassign it back to `conversation_history[chat_id]`. This keeps memory bounded without losing the system prompt.
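Here is a sketch of how the trim slots into the user-turn step of the handler. The helper name `append_user_turn` is my own, and `trim_history` is repeated inline so the sketch runs on its own — in the real bot you'd reuse the definitions above:

```python
MAX_TURNS = 20


def trim_history(history: list[dict]) -> list[dict]:
    # Same logic as the trim_history shown above, repeated for self-containment.
    system = [m for m in history if m["role"] == "system"]
    conversation = [m for m in history if m["role"] != "system"]
    return system + conversation[-(MAX_TURNS * 2):]


conversation_history: dict[int, list[dict]] = {}


def append_user_turn(chat_id: int, text: str) -> list[dict]:
    """Append the new user message, trim, and write the trimmed list back."""
    history = conversation_history.setdefault(
        chat_id, [{"role": "system", "content": "You are a helpful assistant."}]
    )
    history.append({"role": "user", "content": text})
    history = trim_history(history)
    conversation_history[chat_id] = history  # reassign, or the trim is lost
    return history
```

The reassignment on the last line is the easy part to forget: `trim_history` returns a new list, so without writing it back the stored history keeps growing.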
Resetting a Conversation
Add a /reset command handler so users can start fresh without restarting the bot:
```python
from telegram.ext import CommandHandler


async def reset(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    chat_id = update.effective_chat.id
    conversation_history.pop(chat_id, None)
    await update.message.reply_text("Conversation reset. Starting fresh!")


# Register it:
# app.add_handler(CommandHandler("reset", reset))
```
Persisting History Across Restarts
In-memory state is wiped every time the process restarts. For production, serialize history to a database. A minimal approach with SQLite and Python's built-in json module works well:
- Store each user's history as a JSON blob keyed by `chat_id`
- Load on first access, save after every assistant reply
- For higher traffic, Redis with `redis-py` is a natural fit — store the list as a JSON string under a key like `chat:{chat_id}:history`
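A minimal sketch of that SQLite load/save cycle, using only the standard library. The table name, file name, and helper names here are my own choices, not from any framework:

```python
import json
import sqlite3

DB_PATH = "bot_history.db"  # hypothetical database file


def _connect() -> sqlite3.Connection:
    conn = sqlite3.connect(DB_PATH)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history (chat_id INTEGER PRIMARY KEY, messages TEXT)"
    )
    return conn


def load_history(chat_id: int) -> list[dict]:
    """Load a user's history, or start a new one with the system prompt."""
    with _connect() as conn:
        row = conn.execute(
            "SELECT messages FROM history WHERE chat_id = ?", (chat_id,)
        ).fetchone()
    if row is None:
        return [{"role": "system", "content": "You are a helpful assistant."}]
    return json.loads(row[0])


def save_history(chat_id: int, history: list[dict]) -> None:
    """Persist the full history as a JSON blob; call after every assistant reply."""
    with _connect() as conn:
        conn.execute(
            "INSERT OR REPLACE INTO history (chat_id, messages) VALUES (?, ?)",
            (chat_id, json.dumps(history)),
        )
```

In the handler, `load_history` replaces `get_history` and a single `save_history` call goes after the assistant turn is appended.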
The in-memory dict pattern shown above is intentionally simple to swap out — just replace `get_history` and the append lines with your storage layer.
Key Takeaways
- OpenAI's API is stateless — you must send the full conversation history on every request; the bot is responsible for accumulating it.
- Key your history store by `chat_id` so each user gets an independent conversation thread.
- Always append both the user message and the assistant reply to history — the user turn before the API call, the assistant turn after — so the next request sees the complete exchange.
- Cap history length with a sliding window to control token usage and avoid context-window errors.
- For production, persist history to SQLite or Redis so conversations survive process restarts.
With this pattern in place, your bot can handle multi-turn reasoning, remember user preferences within a session, and feel like a real conversational agent rather than a stateless lookup tool. The next step is adding a persistence layer — start with SQLite if you're solo, Redis if you're expecting concurrent users or horizontal scaling.