Building ComiKola: AI Comic & Webtoon Platform
TL;DR
ComiKola generates Bangla comics end-to-end — scripting, character design, panel images — on a Go + React + Temporal stack. Here is the architecture.
ComiKola is an end-to-end AI comic and webtoon platform I built for the Bengali-speaking world. It handles AI scripting, character design, panel image generation, and bandwidth-optimized delivery. Stack: Go (Gin), React, Python + Temporal workers, PostgreSQL, Redis, Cloudflare R2. Here's the architecture.
The end-to-end pipeline
idea -> script -> characters -> panels -> layout -> publish
Each step is its own service, and a Temporal workflow coordinates them. A finished webtoon chapter takes 15-25 minutes wall-clock.
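As a rough illustration of that coordination, here is a minimal plain-Python sketch of a sequential pipeline. It does not use the Temporal SDK. The stage names come from the pipeline line above; `ChapterState`, `run_pipeline`, and the stub handlers are hypothetical names invented for this example, not ComiKola's actual code.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names follow the pipeline described above; the idea is the input.
STAGES = ["script", "characters", "panels", "layout", "publish"]


@dataclass
class ChapterState:
    idea: str
    outputs: dict[str, str] = field(default_factory=dict)
    completed: list[str] = field(default_factory=list)


StageFn = Callable[[ChapterState], str]


def run_pipeline(idea: str, handlers: dict[str, StageFn]) -> ChapterState:
    """Run every stage in order, recording each output before the next stage starts."""
    state = ChapterState(idea=idea)
    for name in STAGES:
        state.outputs[name] = handlers[name](state)
        state.completed.append(name)
    return state


def make_stub(name: str) -> StageFn:
    # Stand-in for a call to the real service behind this stage (hypothetical).
    def stub(state: ChapterState) -> str:
        previous = state.completed[-1] if state.completed else "idea"
        return f"{name} built from {previous}"
    return stub


handlers = {name: make_stub(name) for name in STAGES}
result = run_pipeline("a fisherman finds a talking hilsa", handlers)
print(result.completed)
```

In a real workflow engine each handler would be a retryable unit of work, and the recorded state is what lets a failed chapter resume from the last completed stage instead of starting over.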
Why the steps are separate
I tried fusing scripting and image generation in one giant prompt. It worked for one-panel jokes; it failed at narrative continuity over 30 panels. Character consistency drifted; the story lost pacing. Splitting into discrete steps with structured outputs fixed both.
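To make "structured outputs" concrete, the sketch below shows one way a scripting step's output could be parsed and checked before any image is generated. The JSON shape, the `PanelSpec` fields, and `parse_script` are assumptions made for illustration; the post does not specify the real schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelSpec:
    index: int
    scene: str
    characters: tuple[str, ...]
    dialogue: str


def parse_script(raw_json: str, known_characters: set[str]) -> list[PanelSpec]:
    """Parse a scripting-step output and reject panels that break continuity."""
    data = json.loads(raw_json)
    panels = []
    for expected_index, item in enumerate(data["panels"], start=1):
        panel = PanelSpec(
            index=item["index"],
            scene=item["scene"],
            characters=tuple(item["characters"]),
            dialogue=item.get("dialogue", ""),
        )
        # Panels must arrive in reading order with no gaps.
        if panel.index != expected_index:
            raise ValueError(f"panel {panel.index} is out of order")
        # Every character must already have been designed upstream.
        unknown = set(panel.characters) - known_characters
        if unknown:
            raise ValueError(
                f"panel {panel.index} uses unknown characters: {sorted(unknown)}"
            )
        panels.append(panel)
    return panels


raw = json.dumps({"panels": [
    {"index": 1, "scene": "riverbank at dawn", "characters": ["Rafi"], "dialogue": "..."},
    {"index": 2, "scene": "boat on the river", "characters": ["Rafi", "Mitu"]},
]})
panels = parse_script(raw, known_characters={"Rafi", "Mitu"})
print(len(panels))
```

Validating at the boundary between steps means a drifting script fails fast and cheaply, before the more expensive image stage runs.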
Architecture
| Layer | Tech |
|---|---|
| Frontend | React + TypeScript |
| API | Go (Gin) |
| Workflow | Temporal.io |
| LLMs | Gemini for narration, premium models for character design |
| Image gen | Per-style fine-tuned diffusion models |
| Storage | PostgreSQL + Cloudflare R2 |
| Cache | Redis |
A separate post, "Temporal.io for Long-Running GenAI Workflows", covers the workflow patterns we reuse here.
Character consistency: the hard part
Faces and outfits must stay consistent panel to panel. We use a two-stage approach:
- Character sheet generation — one image per character, multiple angles, fixed seed.
- Panel conditioning — every panel image is conditioned on the character sheet plus the scene prompt.
This trades some creativity for consistency. The trade-off is right for narrative comics; not necessarily for one-shot art.
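The two stages can be sketched as plain request builders. This is a minimal sketch under stated assumptions: the seed value, the list of angles, the storage keys, and the names `make_sheet` and `make_panel_request` are all hypothetical, and the actual call to an image model is left out.

```python
from dataclasses import dataclass

SHEET_SEED = 1234  # hypothetical value; the point is that it never changes
SHEET_ANGLES = ("front", "three-quarter", "profile")  # assumed set of angles


@dataclass(frozen=True)
class CharacterSheet:
    name: str
    prompt: str
    seed: int
    image_key: str  # hypothetical storage key for the generated sheet


@dataclass(frozen=True)
class PanelRequest:
    scene_prompt: str
    reference_images: tuple[str, ...]


def make_sheet(name: str, description: str) -> CharacterSheet:
    """Stage 1: one multi-angle sheet per character, always with the same seed."""
    angles = ", ".join(SHEET_ANGLES)
    prompt = f"Character sheet of {name}: {description}. Views: {angles}."
    return CharacterSheet(name, prompt, SHEET_SEED, f"sheets/{name.lower()}.png")


def make_panel_request(
    scene_prompt: str, cast: list[str], sheets: dict[str, CharacterSheet]
) -> PanelRequest:
    """Stage 2: attach the sheet of every character in the scene as conditioning."""
    missing = [name for name in cast if name not in sheets]
    if missing:
        raise KeyError(f"no character sheet for: {missing}")
    return PanelRequest(scene_prompt, tuple(sheets[name].image_key for name in cast))


sheets = {
    sheet.name: sheet
    for sheet in (
        make_sheet("Rafi", "teen fisherman, red gamcha"),
        make_sheet("Mitu", "his younger sister, school uniform"),
    )
}
panel_1 = make_panel_request("Rafi casts his net at dawn", ["Rafi"], sheets)
panel_2 = make_panel_request("Rafi and Mitu argue on the boat", ["Rafi", "Mitu"], sheets)
print(panel_2.reference_images)
```

Because every panel featuring Rafi points at the same sheet image, the model sees an identical reference each time, which is where the consistency comes from.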
Bandwidth-optimized delivery
Bangladeshi readers often have slow mobile networks. Every panel ships in multiple sizes, and the client picks the right one based on measured connection speed. We use Cloudflare R2 because it is cheap and S3-compatible. Encoding panels server-side as AVIF and WebP saves another 30-50%.
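The selection logic might look like the sketch below. In production it would live in the client; it is written in Python here only to keep the illustration short. The `Variant` fields, the 1.5-second budget, and the byte sizes are invented for the example and are not ComiKola's real numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    width: int
    fmt: str
    size_bytes: int


def pick_variant(
    variants: list[Variant],
    downlink_kbps: float,
    supported_formats: set[str],
    budget_seconds: float = 1.5,  # assumed per-panel load budget
) -> Variant:
    """Pick the widest variant expected to download within the time budget."""
    usable = [v for v in variants if v.fmt in supported_formats]
    if not usable:
        raise ValueError("no variant in a format the client supports")
    budget_bytes = downlink_kbps * 1000 / 8 * budget_seconds
    fitting = [v for v in usable if v.size_bytes <= budget_bytes]
    if not fitting:
        # Nothing fits the budget, so fall back to the smallest file.
        return min(usable, key=lambda v: v.size_bytes)
    # Prefer width first, then the smaller file among equal widths.
    return max(fitting, key=lambda v: (v.width, -v.size_bytes))


VARIANTS = [
    Variant(480, "avif", 40_000), Variant(480, "webp", 55_000),
    Variant(800, "avif", 90_000), Variant(800, "webp", 120_000),
    Variant(1200, "avif", 180_000), Variant(1200, "webp", 240_000),
]

slow = pick_variant(VARIANTS, downlink_kbps=400, supported_formats={"avif", "webp"})
fast = pick_variant(VARIANTS, downlink_kbps=2000, supported_formats={"avif", "webp"})
print(slow.width, slow.fmt, fast.width, fast.fmt)
```

Falling back to the smallest file when nothing fits keeps the reader moving on a very slow link instead of blocking on a panel that may never finish loading.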
Why Bangla first
Incumbent platforms treat Bangla webtoons as a thin market and underserve it. ComiKola fills the gap with a localized UI, Bangla narration, and characters that look like the readers. A separate post, "Building World-Class Software From Dhaka", covers the broader thesis.
What I'd do differently
Skip diffusion for v1 and instead have the AI compose panels from a curated library of pre-drawn character assets. That would be faster, cheaper, and more consistent. We are migrating in that direction.
FAQ
Q: Is ComiKola open source? A: The platform is private. Try the live product at comikola.com.
Q: What languages does it support? A: Bangla first, English second. More languages on the roadmap.
Q: How long does a chapter take to generate? A: 15-25 minutes for ~30 panels, including narration and panel art.
Built by Shihab Shahriar Antor. Sister product: ComiKola Kids. Hire me.
Written by
Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. Creator of LetX, QuantumSketch, and more.