AgentBoiler is a production-ready FastAPI + Next.js boilerplate for shipping multi-tenant AI agent SaaS. Auth, streaming agents, RAG, Stripe billing and deploy - wired, typed and production-tested.
Built on auth, billing and RAG patterns proven in real production work - not a weekend demo.
See it run
Token-by-token responses, knowledge-base picker, model switching - captured straight from the running product. No mockups, no Figma frames.
Browse the product
Each block below is a real screen from the running app - not a Figma mockup, not a marketing render. Click any frame to enlarge.
Marketing site
Hero, pricing, blog, FAQ, changelog, contact. Themed light + dark, SEO + sitemaps wired. Deploy it as-is or fork it for your brand.
Auth & onboarding
JWT + refresh tokens, sessions, Google OAuth, API keys. Two-column layout with brand panel, themed across both modes.
Inside the app
The product ships with full light + dark theming. Flip the toggle to see how each surface renders in either mode.
Dashboard
Credits remaining, conversation count, agent calls, knowledge vectors. Usage trend, recent activity, credit spend by model. Every metric reads from live data.
Knowledge & RAG
Multiple knowledge bases, multiple sync sources per KB. Drag-drop upload of PDF, DOCX, MD or TXT, or schedule a pull from Google Drive or S3 / MinIO. Hybrid BM25 + vector search with reranking runs underneath.
Billing & credits
Free plan defaults so new users land in the product, paid plans with monthly credits, token-level metering against real LLM usage. Self-service plan switch, full transaction log, Customer Portal under the hood.
Multi-tenant & teams
Organizations with four-role RBAC (Owner / Admin / Member / Viewer), email invites with expiring tokens, row-level data isolation enforced through the whole stack.
Admin & ops
User management with suspend + impersonate. Cross-workspace conversation browser. Replay-able Stripe webhook log. Per-service health probes that auto-refresh every 30s.
AgentBoiler is a boilerplate for multi-tenant, billed, observable AI agents. The agent ships on LangChain + LangGraph and lives in one isolated module - everything around it (auth, orgs, Stripe credits, RAG, workers, admin) is the part you'd rather not rebuild, already wired and tested.
Token-by-token responses over WebSocket. Tool calls, citations, multi-turn memory and message rating - all running on LangChain + LangGraph. One make bootstrap and you have a real working chat in three minutes, not a hello-world stub.
See it run
Bundle docs into as many knowledge bases as you need - per user, per team, per project. PDF, DOCX, Markdown, plain text - chunked, embedded and indexed in Milvus with hybrid BM25 + vector search and reranking. Toggle which KBs each chat reaches, per message.
RAG pipeline
Connect a Google Drive folder or an S3 / MinIO bucket and AgentBoiler keeps the index fresh on a schedule. Multiple sources per knowledge base, configure once and forget - no manual re-uploads when docs change, no glue scripts to maintain.
Connector internals
Stripe Checkout, Customer Portal, subscriptions that bundle monthly credits, token-level metering against real LLM usage. Signed webhooks logged to a replay-able event store - debug failed billing flows without running ngrok at 2 AM, and settle refund disputes without grep.
Billing model explained
Organizations, members, four-role RBAC (Owner / Admin / Member / Viewer), email invitations with expiring tokens, per-org data isolation enforced at the database row level. The pattern that usually fails an audit - already wired through API, services and frontend.
How isolation works
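A four-role model like this is often implemented as an ordered enum plus a minimum-role table. The sketch below assumes the roles are strictly hierarchical (Owner > Admin > Member > Viewer), which the description above does not state, and the action names are hypothetical, not AgentBoiler's actual permission keys.

```python
from enum import IntEnum


class Role(IntEnum):
    """The four roles, ordered so a higher value implies more privileges (assumption)."""
    VIEWER = 1
    MEMBER = 2
    ADMIN = 3
    OWNER = 4


# Hypothetical action names mapped to the minimum role allowed to perform them.
REQUIRED_ROLE = {
    "conversation:read": Role.VIEWER,
    "conversation:create": Role.MEMBER,
    "member:invite": Role.ADMIN,
    "org:delete": Role.OWNER,
}


def can(role: Role, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`; unknown actions are denied."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

Denying unknown actions by default keeps a typo in an action name from silently granting access.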
Pre-built admin: user management with suspend and impersonate, browse every conversation across the workspace, replay Stripe events, monitor system health per service. LangSmith + Sentry wired, structured logs, per-user rate limiting, audit log - the operational posture buyers ask for, already in the ZIP.
Observability stack
No "coming soon" - it's all in the ZIP
How it's built, and why
You can fork the agent runtime, the parser, the auth, the vector store - any of it. Below: what each layer does, and the alternatives we passed on to choose it.
Async Python API with end-to-end type checking, auto-generated OpenAPI / Swagger / ReDoc, dependency injection, and middleware that scales from prototype to production.
Looked at Django (sync-first ORM, slower iteration), Flask (WSGI at heart with async bolted on, no built-in types, you build everything around it), Fastify (Node fights LLM Python libs every step). FastAPI is the only mainstream framework that delivers typed Python with async out of the box. Pydantic v2 is 5-50x faster than v1 and is what LangChain and the OpenAI SDK already speak natively.
App Router with server + client components, streaming chat UI, dashboard, admin, marketing pages and blog all in one repo. i18n built in.
SvelteKit and Remix have nicer DX but smaller pools of AI-app component libs. Vue / Nuxt loses the React landgrab in shadcn-style primitives. Next.js 15 + React 19 give server components (smaller bundles), Suspense streaming (token-by-token feels native), and the biggest selection of pre-built UI to drop in. App Router means marketing and product live behind one server.
Tailwind 4 with @theme blocks, CSS variables for every color and radius, light + dark themes from one source, shadcn primitives drop in cleanly.
Plain CSS-in-JS spirals into design debt at any scale. Bootstrap looks like Bootstrap. Tailwind 4 finally has CSS vars first-class, so a brand swap is one block in globals.css. We use shadcn primitives for Dialog, Select, Dropdown, Popover - no need to roll our own when the patterns are this well-trodden.
Async PostgreSQL via asyncpg, SQLAlchemy 2.0 ORM with typed Mapped[] columns, Alembic migrations checked into source, repository + service architecture.
SQLite is fine for hobby projects but dies at multi-tenant scale. MongoDB makes row-level isolation messy, and the agent has heavy relational reads. Postgres with explicit org_id columns is the boring choice that scales to billions of rows. SQLAlchemy 2.0's typed mapper, paired with Alembic, is the closest Python comes to Rails-grade migrations and introspection. asyncpg is roughly 3x faster than psycopg2 for our read-heavy workload.
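To illustrate what "explicit org_id columns" can look like in a repository layer, here is a minimal sketch. It uses an in-memory SQLite database purely as a stand-in so it runs without a Postgres server; the table, the class and its methods are hypothetical and not taken from the AgentBoiler source.

```python
import sqlite3
from typing import List, Optional

# In-memory SQLite stands in for Postgres here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations ("
    "id INTEGER PRIMARY KEY, org_id INTEGER NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO conversations (org_id, title) VALUES (?, ?)",
    [(1, "Onboarding questions"), (1, "Refund policy"), (2, "Quarterly report")],
)


class ConversationRepository:
    """Hypothetical repository: every query it issues is bound to one org_id."""

    def __init__(self, connection: sqlite3.Connection, org_id: int) -> None:
        self._conn = connection
        self._org_id = org_id

    def list_titles(self) -> List[str]:
        rows = self._conn.execute(
            "SELECT title FROM conversations WHERE org_id = ? ORDER BY id",
            (self._org_id,),
        )
        return [title for (title,) in rows]

    def get_title(self, conversation_id: int) -> Optional[str]:
        # The org filter applies even to primary-key lookups, so guessing
        # another tenant's id returns nothing.
        row = self._conn.execute(
            "SELECT title FROM conversations WHERE id = ? AND org_id = ?",
            (conversation_id, self._org_id),
        ).fetchone()
        return row[0] if row else None
```

Binding the org id once, when the repository is constructed, means individual queries cannot forget the filter.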
Redis powers app cache, rate limiter and the Celery broker. Celery handles background ingestion, scheduled syncs, and long-running tool calls. Beat schedules per-source sync intervals.
Looked at RQ (too simple, no schedules), Dramatiq (smaller community, fewer integrations), Taskiq (still young). Celery has been the production default for 15+ years - boring, stable, battle-tested. fastapi-cache2 gives us per-route caching with TTLs in three lines. Redis pulls double duty as cache + broker so the infra stays small.
Streaming agent on LangChain with tool calls, multi-turn memory, conversation persistence. LangGraph for stateful multi-step flows. Tools include web search, URL fetch, chart rendering, RAG search.
Pydantic AI is cleaner but the pre-built tool ecosystem is shallow. CrewAI is great for multi-agent orchestration but overkill for a chat product. The Vercel AI SDK locks you to Node + serverless. LangChain has the deepest tool ecosystem and integrates with LangSmith out of the box. LangGraph handles state that LangChain alone can't. The agent module is isolated behind a single interface, so you can swap runtimes if you want.
Self-hosted vector database for embeddings, hybrid BM25 + dense vector search with reranking on top, multi-tenant via per-org collections.
Pinecone is hosted-only and per-vector pricing punishes you at scale. Qdrant is great but has a smaller team behind it. Chroma is fine for prototypes but slows past a few million vectors. Weaviate adds operational weight. Milvus has been proven at Zilliz, Bilibili and Walmart. Self-hostable via Docker Compose so your vector data never leaves your stack, and the hybrid retrieval beats pure-vector on factual queries.
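One common way to combine a BM25 ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The text above does not say which fusion method AgentBoiler uses, so treat this as an illustration of the idea under that assumption; the document ids are made up and k=60 is just the conventional default.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several best-first rankings of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (highest first); break ties by id to stay deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up results from a keyword (BM25) search and a vector search.
bm25_hits = ["doc_pricing", "doc_refunds", "doc_setup"]
vector_hits = ["doc_refunds", "doc_faq", "doc_pricing"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

A reranker would then rescore only the top of `fused`, which keeps the expensive model off the long tail of candidates.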
Three layered parsers chosen by file shape: liteparse (fast Python) for plain documents, LlamaParse (via llama-cloud) for complex layouts and tables, python-docx + pymupdf for native format reads. Images inside docs go through an LLM describer.
Single-parser strategies fail on the long tail of file formats. liteparse handles ~80% of plain documents in milliseconds at zero API cost. LlamaParse via llama-cloud is the cleanest answer for multi-column PDFs, scanned tables, and messy academic papers - we route to it only when liteparse confidence drops. python-docx parses .docx natively (faster than going through PDF). pymupdf handles PDF text + layout. For images embedded in docs, we send them to a vision LLM, store the description in the index, and the agent answers "what's in the chart on page 4?" without OCR gymnastics.
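The "route to the heavier parser only when confidence drops" idea can be sketched as a two-step fallback. Everything here is an assumption for illustration: the `ParseResult` shape, the 0.8 threshold and the two stand-in parsers are hypothetical, and none of it calls the real liteparse or LlamaParse APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ParseResult:
    text: str
    confidence: float  # 0.0-1.0, hypothetical self-reported parse quality
    parser: str


Parser = Callable[[bytes], ParseResult]


def parse_with_fallback(
    data: bytes, fast: Parser, layout_aware: Parser, min_confidence: float = 0.8
) -> ParseResult:
    """Try the cheap parser first; escalate only if it reports low confidence."""
    first = fast(data)
    if first.confidence >= min_confidence:
        return first
    return layout_aware(data)


def fast_parser(data: bytes) -> ParseResult:
    """Stand-in for a local parser. Toy heuristic: tabs signal a table it handles poorly."""
    text = data.decode("utf-8", errors="replace")
    confidence = 0.4 if "\t" in text else 0.95
    return ParseResult(text=text, confidence=confidence, parser="fast")


def layout_parser(data: bytes) -> ParseResult:
    """Stand-in for a slower, layout-aware parser (a paid API call in practice)."""
    text = data.decode("utf-8", errors="replace").replace("\t", " | ")
    return ParseResult(text=text, confidence=0.9, parser="layout")
```

Passing the parsers in as callables keeps the routing rule testable without network access or API keys.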
Subscriptions that bundle monthly credits, token-level metering against real LLM usage, signed webhooks logged to a replay-able event store, self-service portal for cancel / payment-method / invoice download.
We tried building billing from scratch once - lost two months to refund edge cases, EU VAT and SCA. Stripe is the only billing layer worth wiring for a serious SaaS. Customer Portal handles the regulatory requirements around cancellation flows in every market we know. We add token-level credit metering on top of subscriptions because per-LLM-call cost variance is too high to bill per seat alone.
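For readers unfamiliar with signed webhooks and a replay-able event store, the sketch below shows the general mechanism using only the standard library. Stripe sends a `Stripe-Signature` header of the form `t=<timestamp>,v1=<signature>`, where the signature is an HMAC-SHA256 over `<timestamp>.<raw body>`. The `EventStore` class is hypothetical and in-memory; production code would more likely call `stripe.Webhook.construct_event` from the official library and persist events in Postgres.

```python
import hashlib
import hmac
import json
import time
from typing import Any, Callable, Dict, Optional


def sign(payload: bytes, secret: str, timestamp: int) -> str:
    """HMAC-SHA256 over "<timestamp>.<payload>", hex-encoded."""
    signed_payload = f"{timestamp}.".encode() + payload
    return hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()


def verify_signature(
    payload: bytes, header: str, secret: str,
    tolerance: int = 300, now: Optional[float] = None,
) -> bool:
    """Check a "t=...,v1=..." header against the raw request body."""
    timestamp = None
    candidates = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # too old or too far in the future: possible replay attack
    expected = sign(payload, secret, int(timestamp))
    return any(hmac.compare_digest(expected, c) for c in candidates)


class EventStore:
    """Hypothetical in-memory store that keeps raw payloads so events can be re-run."""

    def __init__(self) -> None:
        self._events: Dict[str, bytes] = {}

    def record(self, payload: bytes) -> str:
        event_id = json.loads(payload)["id"]
        self._events[event_id] = payload
        return event_id

    def replay(self, event_id: str, handler: Callable[[Dict[str, Any]], Any]) -> Any:
        return handler(json.loads(self._events[event_id]))
```

Storing the raw payload rather than a parsed copy is what makes a replay faithful: the handler sees exactly what arrived the first time.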
PyJWT access + refresh tokens, bcrypt for password hashing, Authlib for Google OAuth and SSO patterns, session table in Postgres for revocation, API keys for service-to-service.
Auth0 / Clerk / Supabase Auth are great if you want vendor lock-in and a per-MAU bill. Self-hosted auth on JWT + bcrypt is the boring path that costs nothing and ports anywhere. Authlib gives us OAuth provider integration without writing the dance ourselves. Session table in PostgreSQL means revocation is one SQL query, not a vendor support ticket.
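The point about revocation being "one SQL query" rests on each token referring to a server-side session row. A rough, dependency-free sketch of that check follows; it assumes the JWT has already been signature-verified (for example with PyJWT) and carries a session id in a `sid` claim. The claim name and the in-memory `SessionStore` are illustrative assumptions, not AgentBoiler's schema.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Session:
    user_id: int
    expires_at: float
    revoked: bool = False


class SessionStore:
    """Hypothetical in-memory stand-in for the Postgres session table."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def create(self, user_id: int, ttl_seconds: float, now: Optional[float] = None) -> str:
        current = time.time() if now is None else now
        session_id = secrets.token_urlsafe(16)
        self._sessions[session_id] = Session(user_id, current + ttl_seconds)
        return session_id

    def revoke(self, session_id: str) -> None:
        # In SQL this would be a single UPDATE on the session row.
        if session_id in self._sessions:
            self._sessions[session_id].revoked = True

    def is_active(self, session_id: str, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        session = self._sessions.get(session_id)
        return session is not None and not session.revoked and current < session.expires_at


def authorize(claims: Dict[str, Any], store: SessionStore, now: Optional[float] = None) -> bool:
    """Accept already-verified token claims only while their session is still active."""
    session_id = claims.get("sid")
    return session_id is not None and store.is_active(session_id, now)
```

The trade-off is one extra lookup per request in exchange for being able to cut off a stolen token before it expires.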
Docker Compose configs for dev / stage / prod. Sentry for errors and performance, LangSmith for agent traces (every tool call replayable). Nginx in front for SSL + WebSocket upgrade. CI: ruff + pytest + Playwright + Trivy.
Kubernetes is overkill for the first three years - Docker Compose runs on any VPS for $5/mo. Sentry is the established default for app errors; LangSmith from LangChain's own team is the only agent-specific observability that doesn't require building yourself. Nginx reverse-proxies, terminates SSL, and handles the WebSocket upgrade headers without configuration drama.
Anonymized outcomes from real client & freelance projects built on these exact patterns - the boring 80% AgentBoiler now ships for you.
Replaced 4 weeks of auth + RAG plumbing with one make bootstrap. Shipped our support agent in 9 days, $1.4k MRR by month 2.
9 days to first revenue
Multi-tenant org isolation alone saved us a security audit. The pattern is exactly what compliance asked for.
Passed security review
Token-level Stripe billing was 2 weeks of work. Solved in a Saturday with the credits hybrid model.
2 weeks → one Saturday
We had a chat UI streaming token-by-token from LangChain in under an hour. That alone justified the price.
Streaming chat in <1h
Hybrid BM25 + reranking on Milvus meant RAG answers were accurate out of the box - we skipped the usual retrieval-tuning sprint entirely.
Skipped the RAG-tuning sprint
Admin dashboard pre-built with Stripe events, audit log, impersonation. Saved an engineer-week.
Saved an engineer-week
Shipped, not promised
AgentBoiler is not a freshly-forked starter. Below is the actual release history, version after version, since the first public preview in November 2025.
Every Basic and Extended buyer gets the entire history baked in, plus lifetime patches on the major version they bought.
Where it lands
| | AgentBoiler | From scratch | Free LC template | Next.js boilerplate |
|---|---|---|---|---|
| Time to deployed agent | 3 min | 4 weeks | 1 week | - |
| FastAPI async backend | Yes | - | No (Next.js only) | No (Next.js only) |
| Agent runtime pre-wired | LangChain + LangGraph | - | LangChain only | None |
| RAG over your docs | Milvus + hybrid BM25 | - | 1 store | - |
| Multi-tenant + RBAC | 4 roles + invites | - | - | - |
| Token-level billing | Stripe + credits | - | Credits only | Stripe basic |
| Background jobs (Celery) | Yes + beat | - | - | - |
| Admin API + audit log | Full | - | - | Basic |
| Self-host (Docker + Nginx) | Yes | - | Vercel-only | Vercel-only |
| Price | $199 once | your time | Free (MIT) | $299 |
Honest answer
Cursor or Claude Code will give you a streaming chat UI in an evening. That's the easy 20%. The other 80% is where projects stall for weeks - and that's the layer that fails the audit, not the demo.
Will do
Won't do in a sprint
All of it, day one
Vibe-code the part you demo. Buy the part that keeps customers paying. You can absolutely build everything above yourself - it's just 67 hours of plumbing nobody enjoys. The trade is your time vs $199.
Honest disclosure
AgentBoiler is a boilerplate, not a finished product. Here's what's deliberately out of scope - so you know exactly what you're buying before you click.
Zero surprises
No support tickets to file, no onboarding form, no waiting list. The whole post-purchase flow runs on autopilot - here is exactly what hits your inbox and what you do with it.
One-time payment via Polar checkout. Card, link or wire. Takes about 30 seconds, no account required, no subscription created.
Source ZIP, commercial license, Discord invite, and the full documentation pack - setup guides, architecture map, CLI reference, deploy playbooks for Hetzner, Fly, Railway and Render.
make bootstrap
Three minutes from unzip to a streaming chat in your browser. Stripe in test mode, sample data, working agent - the demo you tried earlier, running locally on your machine.
One payment, lifetime access
Just the template
$199
Get Basic - $199
Template + hands-on help
$299
Get Extended - $299
Bespoke requirements
Let's talk
Request a quote
License, plainly
Three more things buyers ask
A free 30-minute debug call if make bootstrap fails after purchase. We would rather fix the install than refund a frustrated customer. And the 14-day money-back is still on the table either way.
14-day money-back guarantee - full refund if it's not for you. One-time payment, no subscription. Custom builds are scoped and quoted per project.
Before you buy
A ZIP archive with the full source code - FastAPI + Pydantic v2 backend, Next.js 15 frontend, LangChain + LangGraph agent, Milvus RAG pipeline, Stripe billing with credits, multi-tenant orgs, admin API, Celery workers, Docker Compose (dev/stage/prod) + Nginx, GitHub Actions CI. ~70k lines of production-tested code. Commercial license. Lifetime access to the version you bought.
Free starters give you a bare agent with minimal auth. AgentBoiler ships everything around it pre-wired: multi-tenant RBAC with invitations, Stripe billing + token-level credits, Milvus RAG with hybrid search and Drive/S3 sync, Celery workers, admin API with audit log, rate limiting, CI. The free template gets you a demo by week 1. AgentBoiler gets you to production by week 1.
Some familiarity helps. make bootstrap gives you a running chat UI with Stripe in test mode in ~3 minutes. If you've shipped any web app before, you'll be productive in a day.
Yes - the commercial license permits unlimited commercial use and unlimited projects on Basic and Extended. White-label and multi-developer licensing is part of a Custom build. The source itself cannot be redistributed or resold as a competing template.
AI model fine-tuning UI. Mobile native apps (wrap with Capacitor). Multi-region database replication. Compliance certifications (controls scaffolded, audit is on you). Your specific business logic - that's your product.
It ships on LangChain + LangGraph - streaming, tool calling and conversation persistence, fully wired. The agent lives in one isolated module behind the chat-session interface, so you can replace it with another runtime (Pydantic AI, CrewAI, your own). There is no pre-built adapter for those - that swap is yours to make, but auth, billing and RAG don't change when you do.
14-day money-back guarantee. If AgentBoiler isn't the right fit, email hello@agentboiler.com within 14 days of purchase for a full refund - no questions asked. If you hit an install snag, we'll hop on a free 30-minute debug call to get you unstuck.
Anywhere that runs Docker - Hetzner ($5/mo VPS), Fly.io, Railway, Render, DigitalOcean, AWS ECS, GCP Cloud Run, your own bare-metal. Docker Compose configs for dev/stage/prod and a production Nginx config are included. Zero AgentBoiler-hosted services required.
Hybrid: customers pay a subscription that includes monthly credits. Each agent call consumes credits proportional to LLM tokens used. Per-org quotas, anomaly detection, self-service top-ups via Customer Portal - all wired.
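As a worked example of "credits proportional to LLM tokens used", the sketch below prices a call from its input and output token counts and debits an org balance. The model names, the per-1,000-token rates and the round-up rule are invented for illustration; the real rates and rounding are not given here.

```python
from typing import Dict, Tuple

# Hypothetical prices in credits per 1,000 tokens: (input rate, output rate).
CREDIT_RATES: Dict[str, Tuple[int, int]] = {
    "small-model": (1, 2),
    "large-model": (10, 30),
}


class InsufficientCredits(Exception):
    """Raised when a call would cost more than the remaining balance."""


def credits_for_call(model: str, input_tokens: int, output_tokens: int) -> int:
    """Cost of one agent call in whole credits, rounded up (assumed rule)."""
    input_rate, output_rate = CREDIT_RATES[model]
    # Work in thousandths of a credit to stay in integer arithmetic.
    millicredits = input_tokens * input_rate + output_tokens * output_rate
    return -(-millicredits // 1000)  # ceiling division


def debit(balance: int, cost: int) -> int:
    """Return the new balance, refusing to go negative."""
    if cost > balance:
        raise InsufficientCredits(f"need {cost} credits, have {balance}")
    return balance - cost
```

For instance, 1,200 input and 300 output tokens on the hypothetical "small-model" come to 1.8 credits, which the round-up rule charges as 2.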
If your product is a multi-tenant chat or workflow agent with LLM tool calling and document context - yes. If you're building single-tenant, RAG-only, or fine-tuned-model-serving, AgentBoiler is overkill. Rule of thumb: if you'd build a chat page + admin + billing, it fits.
Stop rebuilding auth, billing and RAG on every project. Buy the boring 80% once - own it for life.