Multi-tenant credits system for AI SaaS - full implementation
How to wire a usage-tracked credits system across Stripe subscriptions, per-organization quotas, and anomaly detection - without hand-rolling 4 weeks of billing infrastructure.
A credits system for an AI SaaS is hybrid billing: customers pay a base subscription that includes a monthly credit allocation, and consume credits per agent action. Done right, it gives you predictable revenue plus usage-aligned costs. Done wrong, it’s a webhook signature debugging nightmare.
This post walks through the implementation patterns we ship in AgentBoiler.
Why credits, not per-seat
Per-seat pricing breaks for AI products because usage is wildly uneven across users in the same org. One power user can consume 10x what their teammates do. Per-seat pricing either undercharges the heavy users (losing margin) or overcharges the light users (driving them away).
Credits align cost to value. A user who runs 100 agent queries draws down more credits than one who runs 5 - but they’re on the same plan, in the same org, with the same features. The org pays for total consumption.
The data model
Three tables anchor a credits system:
```python
# Organization - the billing entity
class Organization(Base):
    id: UUID
    stripe_customer_id: str
    stripe_subscription_id: str | None
    plan_code: str  # "hobby" | "pro" | "business"
    credits_balance: int  # current available credits
    credits_renewed_at: datetime  # last monthly grant
```

```python
# Usage event - every credit-consuming action
class UsageEvent(Base):
    id: UUID
    organization_id: UUID  # FK to Organization
    user_id: UUID
    event_type: str  # "agent_call" | "embedding" | "document_parse"
    cost_credits: int
    # Named event_metadata because `metadata` is reserved on SQLAlchemy declarative classes
    event_metadata: JSON  # model used, token counts, etc.
    created_at: datetime
```

```python
# Plan - what each tier includes
class Plan(Base):
    code: str  # "hobby" | "pro" | "business"
    display_name: str
    stripe_price_id: str
    monthly_credits_base: int  # credits granted on each renewal
    features: JSON  # ["multi_workspace", "priority_queue", ...]
```

The credits_balance on Organization is the single source of truth - fast lookups, cheap to decrement. UsageEvent is the audit trail and the source for analytics.
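As a rough illustration of UsageEvent as an analytics source, the sketch below totals credits per user within one org. It uses SQLite from the standard library and a trimmed-down, hypothetical usage_events table rather than the real models, so the table name, column subset, and credits_by_user function are all assumptions.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for the UsageEvent table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_events ("
    " organization_id TEXT, user_id TEXT, event_type TEXT, cost_credits INTEGER)"
)
conn.executemany(
    "INSERT INTO usage_events VALUES (?, ?, ?, ?)",
    [
        ("org-1", "alice", "agent_call", 40),
        ("org-1", "alice", "embedding", 10),
        ("org-1", "bob", "agent_call", 5),
        ("org-2", "carol", "document_parse", 7),
    ],
)


def credits_by_user(conn: sqlite3.Connection, org_id: str) -> list[tuple[str, int]]:
    """Total credits consumed per user in one org, heaviest consumer first."""
    rows = conn.execute(
        "SELECT user_id, SUM(cost_credits) AS total"
        " FROM usage_events WHERE organization_id = ?"
        " GROUP BY user_id ORDER BY total DESC",
        (org_id,),
    )
    return rows.fetchall()
```

The same aggregation is what makes uneven usage inside one org visible: the org is billed on the total, while the per-user breakdown shows who is driving it.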
Decrementing credits atomically
Every agent call needs to:
- Check the org has enough credits
- Reserve / decrement the credits
- Execute the call
- Record a UsageEvent
- If the call failed, refund the credits
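The steps above can be sketched as a single wrapper. This is a minimal illustration with a hypothetical in-memory ledger standing in for the database; InMemoryLedger, run_metered, and InsufficientCredits are names invented for the example, and the ledger makes no attempt at concurrency safety.

```python
from collections.abc import Callable
from typing import Any


class InsufficientCredits(Exception):
    """Raised when the org cannot cover the cost of a call."""


class InMemoryLedger:
    """Hypothetical stand-in for the Organization balance and UsageEvent table."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.events: list[dict[str, Any]] = []

    def reserve(self, amount: int) -> bool:
        # Check and decrement in one place; not safe under concurrency
        if self.balance < amount:
            return False
        self.balance -= amount
        return True

    def refund(self, amount: int) -> None:
        self.balance += amount

    def record(self, event_type: str, cost: int) -> None:
        self.events.append({"event_type": event_type, "cost_credits": cost})


def run_metered(
    ledger: InMemoryLedger, event_type: str, cost: int, action: Callable[[], Any]
) -> Any:
    """Reserve credits, run the action, record usage; refund if the action raises."""
    if not ledger.reserve(cost):
        raise InsufficientCredits(event_type)
    try:
        result = action()
    except Exception:
        ledger.refund(cost)  # the call failed, so give the credits back
        raise
    ledger.record(event_type, cost)
    return result
```

Reserving before the call, rather than charging afterwards, means an org can never run an action it cannot pay for; the refund path covers the case where the action itself fails.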
The naive approach (read → check → decrement) has a race condition under concurrent requests. The clean approach uses a database-side conditional update:
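```python
from uuid import UUID

from sqlalchemy import update
from sqlalchemy.ext.asyncio import AsyncSession


async def reserve_credits(org_id: UUID, amount: int, session: AsyncSession) -> bool:
    """Atomically reserve credits if the org has enough. Returns False if insufficient."""
    result = await session.execute(
        update(Organization)
        .where(
            Organization.id == org_id,
            Organization.credits_balance >= amount,
        )
        .values(credits_balance=Organization.credits_balance - amount)
        .returning(Organization.credits_balance)
    )
    # No row comes back when the WHERE clause did not match; a remaining balance of 0 still counts
    return result.scalar() is not None
```

PostgreSQL’s row-level locking handles the concurrent case: the UPDATE locks the row, so a competing reservation waits and then re-checks the balance condition against the updated row (under the default READ COMMITTED isolation level). The function returns True if the credit reservation succeeded, False if the org didn’t have enough credits.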
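To see the conditional update in isolation, here is a small sketch against SQLite from the standard library. It is not the PostgreSQL setup described above (SQLite serializes writers rather than locking rows), and the organizations table and reserve_credits_sqlite function are assumptions for the example; the point is only that the check and the decrement happen in one statement, with the affected-row count as the success signal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE organizations (id TEXT PRIMARY KEY, credits_balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO organizations VALUES ('org-1', 10)")
conn.commit()


def reserve_credits_sqlite(conn: sqlite3.Connection, org_id: str, amount: int) -> bool:
    """Decrement the balance only if it covers `amount`; True when a row was updated."""
    cur = conn.execute(
        "UPDATE organizations SET credits_balance = credits_balance - ?"
        " WHERE id = ? AND credits_balance >= ?",
        (amount, org_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1


def balance(conn: sqlite3.Connection, org_id: str) -> int:
    row = conn.execute(
        "SELECT credits_balance FROM organizations WHERE id = ?", (org_id,)
    ).fetchone()
    return row[0]
```

Starting from a balance of 10, a reservation of 6 succeeds, a second reservation of 6 is rejected without touching the row, and a reservation of the remaining 4 succeeds and leaves the balance at exactly 0.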
After the call completes, write a UsageEvent. If the call failed, refund:
```python
async def refund_credits(org_id: UUID, amount: int, session: AsyncSession) -> None:
    await session.execute(
        update(Organization)
        .where(Organization.id == org_id)
        .values(credits_balance=Organization.credits_balance + amount)
    )
```

Monthly credit renewal
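Stripe sends a customer.subscription.updated webhook on each renewal. The same event also fires for other subscription changes (plan switches, scheduled cancellations, metadata edits), so as written the handler below resets the balance on each of them. If that is not what you want, gate the grant on the billing period having advanced, or grant from invoice.paid with billing_reason == "subscription_cycle". Handle it:

```python
from datetime import UTC, datetime

import stripe
from sqlalchemy.ext.asyncio import AsyncSession


@webhook_handler("customer.subscription.updated")
async def on_subscription_renewed(event: stripe.Event, session: AsyncSession):
    sub = event.data.object
    # Index with ["items"]: where StripeObject subclasses dict, `sub.items` is dict.items()
    price_id = sub["items"]["data"][0]["price"]["id"]
    org = await get_org_by_stripe_subscription(sub.id, session)
    plan = await get_plan_by_stripe_price_id(price_id, session)

    # Grant monthly credits - reset balance to plan allocation
    org.credits_balance = plan.monthly_credits_base
    org.credits_renewed_at = datetime.now(UTC)
    org.plan_code = plan.code

    session.add(org)
    await session.commit()
```

Critical implementation detail: validate the webhook signature before processing. Stripe sends a Stripe-Signature header; your webhook handler must verify it against STRIPE_WEBHOOK_SECRET, using the raw request body, or you have a forgery vulnerability.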
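```python
import stripe
from fastapi import HTTPException  # assumes FastAPI, as the router code in this post suggests


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str) -> stripe.Event:
    """Verify and parse a webhook; `payload` must be the raw, unparsed request body."""
    try:
        return stripe.Webhook.construct_event(payload, sig_header, secret)
    except (ValueError, stripe.SignatureVerificationError) as e:
        raise HTTPException(status_code=400, detail=f"Invalid webhook: {e}") from e
```

Per-organization quotas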
For Business / Enterprise tiers, you want soft and hard caps:
- Soft cap - at 80% of monthly allocation, send an email
- Hard cap - at 100%, block further calls until renewal or top-up
Soft caps run as a Celery task on a schedule. Hard caps are enforced inline by the reserve_credits function - if credits_balance < amount, the API returns 402 Payment Required and the frontend shows a “Top up” CTA.
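A minimal sketch of the threshold logic the scheduled soft-cap check could use, kept free of Celery and the database so it can be tested on its own. QuotaState, quota_state, and the soft_cap_percent parameter are hypothetical names; the 80% default comes from the soft cap described above.

```python
from enum import Enum


class QuotaState(Enum):
    OK = "ok"
    SOFT_CAP = "soft_cap"  # send the warning email
    HARD_CAP = "hard_cap"  # nothing left to reserve


def quota_state(
    balance: int, monthly_allocation: int, soft_cap_percent: int = 80
) -> QuotaState:
    """Classify an org by how much of its monthly allocation it has used."""
    if balance <= 0:
        return QuotaState.HARD_CAP
    used = monthly_allocation - balance
    # Integer comparison avoids float rounding right at the threshold
    if used * 100 >= soft_cap_percent * monthly_allocation:
        return QuotaState.SOFT_CAP
    return QuotaState.OK
```

A balance above the monthly allocation, for example after a top-up, gives a negative `used` value and therefore classifies as OK.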
Anomaly detection
In 30+ production AI agent deployments, the single most common bill-shock issue is a buggy agent loop that calls itself recursively, consuming credits in seconds. Detect this:
```python
from datetime import UTC, datetime, timedelta
from uuid import UUID

from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession


async def detect_usage_anomaly(org_id: UUID, session: AsyncSession) -> bool:
    """Returns True if the org's last 5 minutes ran at >5x its rolling 1-hour average rate."""
    now = datetime.now(UTC)
    last_hour = now - timedelta(hours=1)
    last_5min = now - timedelta(minutes=5)

    # Sum (not per-event average) over the hour, so it can be scaled to a 5-minute rate
    hourly_total = await session.execute(
        select(func.sum(UsageEvent.cost_credits))
        .where(UsageEvent.organization_id == org_id, UsageEvent.created_at >= last_hour)
    )
    recent_total = await session.execute(
        select(func.sum(UsageEvent.cost_credits))
        .where(UsageEvent.organization_id == org_id, UsageEvent.created_at >= last_5min)
    )

    hour_credits = hourly_total.scalar() or 0
    total = recent_total.scalar() or 0
    expected = hour_credits * 5 / 60  # 5 minutes' worth at hourly avg rate
    return total > 5 * expected and total > 100  # 5x threshold + min volume
```

Trigger an alert + auto-throttle when this fires. AgentBoiler ships this pattern with Slack alerting hooks.
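The threshold arithmetic is easier to reason about as a pure function. The sketch below assumes the baseline is the total credits consumed over the last hour (including the most recent 5 minutes) and uses the same 5x multiplier and 100-credit minimum; is_usage_anomalous and its parameter names are illustrative.

```python
def is_usage_anomalous(
    hourly_total: float,
    recent_total: float,
    window_minutes: int = 5,
    multiplier: float = 5.0,
    min_volume: float = 100,
) -> bool:
    """Flag a window whose usage exceeds `multiplier` times its share of the hourly total."""
    expected = hourly_total * window_minutes / 60  # the window's share at the hourly rate
    return recent_total > multiplier * expected and recent_total > min_volume
```

With 1,200 credits in the hour, a 5-minute window is expected to hold 100, so a window of 600 trips the 5x threshold while a window of 100 does not. A window of 50 out of an hourly 60 exceeds 5x its expected share but stays under the minimum volume, so it is ignored. Because the hourly total includes the window itself, the ratio can never exceed 12, which is worth keeping in mind when tuning the multiplier.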
Stripe Customer Portal - let users manage themselves
For self-service plan changes, top-ups, cancellations, and invoice downloads, don’t build your own billing UI. Wire up the Stripe Customer Portal:
```python
@router.post("/billing/portal-session")
async def create_portal_session(user: CurrentUser, session: AsyncSession):
    org = await get_user_org(user.id, session)
    portal = stripe.billing_portal.Session.create(
        customer=org.stripe_customer_id,
        return_url=f"{settings.FRONTEND_URL}/billing",
    )
    return {"url": portal.url}
```

Redirect the user to the portal URL. Stripe handles plan upgrades, downgrades, card updates, invoice history. Webhooks come back to your customer.subscription.updated handler.
What AgentBoiler ships
All of the above is in AgentBoiler - the Organization / UsageEvent / Plan models, the atomic reserve_credits function, the webhook handlers with signature validation, the Customer Portal integration, the anomaly detection task, the soft/hard quota enforcement, and the admin dashboard view that shows usage by org with 30-day sparklines.
Four weeks of billing infrastructure, pre-wired and production-tested.