How I Built BikroyBuddy: AI Sales Agent (5k+ Users)
TL;DR
BikroyBuddy is an AI shopping agent for Bangladesh that handles discovery and orders over chat, serving 5,000+ users. Here is how the architecture scaled.
BikroyBuddy is an AI sales agent I built for social commerce in Bangladesh. It plugs into Facebook and WhatsApp, answers product questions, takes orders and checks inventory — currently serving 5,000+ users across small merchants. Here is the architecture and what I learned scaling it.
The problem
Bangladesh runs on social commerce. Shops sell on Facebook pages and WhatsApp groups, not e-commerce sites. The bottleneck is conversation — every inquiry needs a reply, and merchants can't keep up at scale.
BikroyBuddy is the conversational layer for that market.
Architecture overview
| Layer | Tech | Notes |
|---|---|---|
| Channel gateway | Go, Meta Graph API | Inbound messages from FB/WA |
| Intent + NER | Lightweight LLM (Qwen, Gemini Flash) | Detect intent, extract product/qty |
| Inventory | PostgreSQL + Redis | Source of truth for stock |
| Order orchestration | Go services, Temporal | Multi-step order flows |
| LLM router | openrouter-free-infer | Cheap fallback for non-critical calls |
| Storage | PostgreSQL, R2 for images | Cheap and predictable |
Cheap inference matters
Users in Bangladesh send a lot of messages, so spending $0.01 per turn on GPT-4 is unsustainable. We route 80% of turns through free OpenRouter models via openrouter-free-infer and reserve premium calls for ambiguous flows.
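A minimal sketch of that routing decision, assuming the intent step exposes a confidence score. `Turn`, `ChooseTier`, the tier names, and the 0.6 threshold are all hypothetical and chosen for illustration; this is not the openrouter-free-infer API.

```go
package main

import "fmt"

// Tier is a hypothetical label for which class of model handles a turn.
type Tier string

const (
	TierFree    Tier = "free"    // free OpenRouter models
	TierPremium Tier = "premium" // paid model, reserved for ambiguous turns
)

// Turn carries the (assumed) signals a router sees for one message.
type Turn struct {
	Intent     string  // e.g. "greeting", "product_query", "place_order"
	Confidence float64 // intent classifier confidence in [0, 1]
}

// ChooseTier escalates to the premium tier only when the cheap classifier
// is unsure about the intent; every other turn stays on the free tier.
func ChooseTier(t Turn, minConfidence float64) Tier {
	if t.Confidence < minConfidence {
		return TierPremium
	}
	return TierFree
}

func main() {
	turns := []Turn{
		{Intent: "greeting", Confidence: 0.97},
		{Intent: "place_order", Confidence: 0.41},
	}
	for _, t := range turns {
		fmt.Printf("%s -> %s\n", t.Intent, ChooseTier(t, 0.6))
	}
}
```

Keeping the decision in one pure function makes the threshold easy to tune against real traffic until the free tier carries the intended share of turns.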
Why Go
Each conversation is light but concurrent. Go's goroutines let one box handle thousands of in-flight conversations on a tiny instance. Memory matters too: cheap VMs in Asia have 1-2 GB of RAM, and a Python worker burns half of that while idle.
Scaling to 5k+ users
The bottleneck was never compute; it was queueing. We push inbound messages onto Redis streams, where stateless Go workers process them. When traffic spikes (sales, holidays), we autoscale the workers, not the gateway.
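The worker side of that pattern can be sketched as follows. To keep the example self-contained, a buffered Go channel stands in for the Redis stream; `Message` and `RunWorkers` are hypothetical names, and a real deployment would read from the stream with a Redis client instead.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical inbound chat message.
type Message struct {
	ConversationID string
	Text           string
}

// RunWorkers starts n stateless workers that drain queue and call handle
// for each message. It returns once queue is closed and fully drained.
func RunWorkers(n int, queue <-chan Message, handle func(Message)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for msg := range queue {
				handle(msg)
			}
		}()
	}
	wg.Wait()
}

func main() {
	// The channel plays the role of the stream in this sketch.
	queue := make(chan Message, 10)
	for i := 0; i < 10; i++ {
		queue <- Message{ConversationID: fmt.Sprintf("c%d", i), Text: "price?"}
	}
	close(queue)

	// Workers report completions on a buffered channel, so no shared
	// counter (and no lock) is needed.
	done := make(chan string, 10)
	RunWorkers(4, queue, func(m Message) { done <- m.ConversationID })
	fmt.Println("handled:", len(done))
}
```

Because the workers hold no conversation state, scaling out during a spike only means raising `n` or starting more processes that read from the same queue.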
I went deeper on this pattern in Scaling an AI Agent to 300k+ Users on Kubernetes. BikroyBuddy uses a simpler version of the same playbook.
What didn't work
Hallucinated product names. Early on, the LLM invented SKUs that did not exist. Fix: enforce intent + NER as a strict step, then look up the product in PostgreSQL before any response goes out.
Polite small talk. Users greeted in Bangla; the bot replied in English. Localized prompt templates fixed it in a day.
Order races. Two customers would order the last item at the same time. Fix: optimistic locking on inventory rows plus Temporal-managed compensating transactions.
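Two of the fixes above, checking that a SKU exists before replying and optimistic locking on inventory rows, can be sketched together. This is an illustration under stated assumptions: an in-memory map stands in for the PostgreSQL table, and `Inventory`, `Row`, the `version` column, and the error names are hypothetical. The Temporal compensation step is out of scope here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrUnknownSKU   = errors.New("unknown SKU")
	ErrStaleVersion = errors.New("stale version, retry")
	ErrOutOfStock   = errors.New("out of stock")
)

// Row mirrors a hypothetical inventory row with a version column.
type Row struct {
	Stock   int
	Version int
}

// Inventory is an in-memory stand-in for the inventory table.
type Inventory struct {
	mu   sync.Mutex
	rows map[string]Row
}

// Get returns the current row, or ErrUnknownSKU when the SKU extracted by
// the model does not exist in the catalog (the hallucination guard).
func (inv *Inventory) Get(sku string) (Row, error) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	row, ok := inv.rows[sku]
	if !ok {
		return Row{}, ErrUnknownSKU
	}
	return row, nil
}

// Reserve decrements stock only if the caller's version is still current.
// In SQL the same check would look roughly like:
//   UPDATE inventory SET stock = stock - $1, version = version + 1
//   WHERE sku = $2 AND version = $3 AND stock >= $1
// where zero affected rows means "re-read the row and retry or give up".
func (inv *Inventory) Reserve(sku string, qty, seenVersion int) error {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	row, ok := inv.rows[sku]
	if !ok {
		return ErrUnknownSKU
	}
	if row.Version != seenVersion {
		return ErrStaleVersion
	}
	if row.Stock < qty {
		return ErrOutOfStock
	}
	inv.rows[sku] = Row{Stock: row.Stock - qty, Version: row.Version + 1}
	return nil
}

func main() {
	inv := &Inventory{rows: map[string]Row{"SAREE-01": {Stock: 1, Version: 7}}}

	// Both buyers read the row at version 7 before either one writes.
	row, _ := inv.Get("SAREE-01")
	fmt.Println("buyer A:", inv.Reserve("SAREE-01", 1, row.Version))
	fmt.Println("buyer B:", inv.Reserve("SAREE-01", 1, row.Version))

	_, err := inv.Get("SAREE-99")
	fmt.Println("lookup:", err)
}
```

In this sketch the second buyer gets `ErrStaleVersion` rather than overselling; on retry, a fresh read would then report `ErrOutOfStock`.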
What's next
Voice. WhatsApp voice notes are how Bangladeshi customers prefer to communicate; STT + native Bangla TTS is the next milestone.
FAQ
Q: How does BikroyBuddy work on WhatsApp and Facebook? A: It connects to a merchant's WhatsApp Business or Facebook page and replies automatically — product questions, orders, inventory checks.
Q: Does it support Bangla? A: Yes, natively. Most conversations are Bangla or Banglish.
Q: Who is it for? A: Small social-commerce sellers in Bangladesh who can't staff full-time chat support.
Written by Shihab Shahriar Antor, AI Engineer and Founder of Shahriar Labs. Creator of LetX, QuantumSketch, and more.