Multi-tenant credits system for AI SaaS - full implementation

How to wire a usage-tracked credits system across Stripe subscriptions, per-organization quotas, and anomaly detection - without hand-rolling 4 weeks of billing infrastructure.

A credits system for an AI SaaS is hybrid billing: customers pay a base subscription that includes a monthly credit allocation, and consume credits per agent action. Done right, it gives you predictable revenue plus usage-aligned costs. Done wrong, it’s a webhook signature debugging nightmare.

This post walks through the implementation patterns we ship in AgentBoiler.

Why credits, not per-seat

Per-seat pricing breaks for AI products because usage is wildly uneven across users in the same org. One power user can consume 10x what their teammates do. Per-seat pricing either undercharges the heavy users (losing margin) or overcharges the light users (driving them away).

Credits align cost to value. A user who runs 100 agent queries pays more than one who runs 5 - but they’re on the same plan, in the same org, with the same features. The org pays for total consumption.

The data model

Three tables anchor a credits system:

# Organization - the billing entity
class Organization(Base):
    id: UUID
    stripe_customer_id: str
    stripe_subscription_id: str | None
    plan_code: str  # "hobby" | "pro" | "business"
    credits_balance: int  # current available credits
    credits_renewed_at: datetime  # last monthly grant

# Usage event - every credit-consuming action
class UsageEvent(Base):
    id: UUID
    organization_id: UUID  # FK to Organization
    user_id: UUID
    event_type: str  # "agent_call" | "embedding" | "document_parse"
    cost_credits: int
    # Named event_metadata because "metadata" is reserved on SQLAlchemy declarative models
    event_metadata: JSON  # model used, token counts, etc.
    created_at: datetime

# Plan - what each tier includes
class Plan(Base):
    code: str  # "hobby" | "pro" | "business"
    display_name: str
    stripe_price_id: str
    monthly_credits_base: int  # credits granted on each renewal
    features: JSON  # ["multi_workspace", "priority_queue", ...]

The credits_balance on Organization is the single source of truth - fast lookups, cheap to decrement. UsageEvent is the audit trail and the source for analytics.
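Because the balance and the audit trail are written separately, they can drift apart. A minimal sketch of a reconciliation check follows. It assumes the balance is reset to the plan allocation on renewal, that refunded calls never write a UsageEvent, and that there are no top-ups; `UsageRow`, `expected_balance`, and `balance_drift` are hypothetical names, not part of the models above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class UsageRow:
    """Stand-in for a UsageEvent row; only the fields this check needs."""
    cost_credits: int
    created_at: datetime


def expected_balance(granted: int, renewed_at: datetime, events: list[UsageRow]) -> int:
    """Balance implied by the audit trail: last grant minus usage since that grant."""
    spent = sum(e.cost_credits for e in events if e.created_at >= renewed_at)
    return granted - spent


def balance_drift(stored_balance: int, granted: int, renewed_at: datetime,
                  events: list[UsageRow]) -> int:
    """Positive drift means the stored balance is higher than the events justify."""
    return stored_balance - expected_balance(granted, renewed_at, events)


renewed = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    UsageRow(30, datetime(2023, 12, 31, tzinfo=timezone.utc)),  # before the grant, ignored
    UsageRow(40, datetime(2024, 1, 2, tzinfo=timezone.utc)),
    UsageRow(10, datetime(2024, 1, 3, tzinfo=timezone.utc)),
]
drift = balance_drift(stored_balance=960, granted=1000, renewed_at=renewed, events=rows)
```

A non-zero drift is a signal to investigate, not something to auto-correct: it usually means a refund or a UsageEvent write was lost.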

Decrementing credits atomically

Every agent call needs to:

  1. Check the org has enough credits
  2. Reserve / decrement the credits
  3. Execute the call
  4. Record a UsageEvent
  5. If the call failed, refund the credits

The naive approach (read → check → decrement) has a race condition under concurrent requests. The clean approach uses a database-side conditional update:

from uuid import UUID

from sqlalchemy import update
from sqlalchemy.ext.asyncio import AsyncSession

async def reserve_credits(org_id: UUID, amount: int, session: AsyncSession) -> bool:
    """Atomically reserve credits if the org has enough. Returns False if insufficient."""
    result = await session.execute(
        update(Organization)
        .where(
            Organization.id == org_id,
            Organization.credits_balance >= amount,  # the check runs inside the UPDATE
        )
        .values(credits_balance=Organization.credits_balance - amount)
        .returning(Organization.credits_balance)
    )
    # RETURNING yields a row only when the WHERE clause matched
    return result.scalar() is not None

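PostgreSQL handles the concurrent case: the UPDATE takes a row-level lock, so a second request for the same org waits for the first to commit and then re-evaluates the credits_balance >= amount condition against the new balance (this is the behavior under the default READ COMMITTED isolation level). The function returns True if the credit reservation succeeded, False if the org didn’t have enough credits.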
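The same conditional update can be tried without any infrastructure. The sketch below swaps PostgreSQL and SQLAlchemy for Python's built-in sqlite3 module purely so it runs standalone. The table layout, the text org id, and the names `reserve_credits_sql` and `demo` are assumptions for illustration, and SQLite's locking differs from PostgreSQL's, so this shows the check-and-decrement statement rather than the concurrency behavior.

```python
import sqlite3


def reserve_credits_sql(conn: sqlite3.Connection, org_id: str, amount: int) -> bool:
    """Decrement only if the balance covers `amount`; check and write are one statement."""
    cur = conn.execute(
        "UPDATE organization SET credits_balance = credits_balance - ? "
        "WHERE id = ? AND credits_balance >= ?",
        (amount, org_id, amount),
    )
    conn.commit()
    # rowcount is 1 when the WHERE clause matched, 0 when credits were insufficient
    return cur.rowcount == 1


def demo() -> tuple[bool, bool, int]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE organization (id TEXT PRIMARY KEY, credits_balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO organization VALUES ('org_1', 10)")
    first = reserve_credits_sql(conn, "org_1", 8)   # succeeds, balance drops to 2
    second = reserve_credits_sql(conn, "org_1", 8)  # refused, balance stays at 2
    (balance,) = conn.execute(
        "SELECT credits_balance FROM organization WHERE id = 'org_1'"
    ).fetchone()
    conn.close()
    return first, second, balance
```

Calling `demo()` shows the second reservation being refused without the balance ever going negative, which is the property the read-then-write approach cannot guarantee.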

After the call completes, write a UsageEvent. If the call failed, refund:

async def refund_credits(org_id: UUID, amount: int, session: AsyncSession) -> None:
    await session.execute(
        update(Organization)
        .where(Organization.id == org_id)
        .values(credits_balance=Organization.credits_balance + amount)
    )
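The five steps listed at the start of this section can be tied together in one wrapper. This is a sketch under stated assumptions: `InMemoryLedger` is a hypothetical stand-in for the Organization row and the UsageEvent table so the example runs without a database, and `run_with_credits` and `InsufficientCredits` are illustrative names rather than AgentBoiler's API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class InsufficientCredits(Exception):
    """Raised when the reservation is refused; an API layer could map this to a 402."""


@dataclass
class InMemoryLedger:
    """Hypothetical stand-in for the credits balance and the usage event log."""
    balance: int
    events: list[tuple[str, int]] = field(default_factory=list)

    async def reserve(self, amount: int) -> bool:
        if self.balance < amount:
            return False
        self.balance -= amount
        return True

    async def refund(self, amount: int) -> None:
        self.balance += amount

    async def record(self, event_type: str, amount: int) -> None:
        self.events.append((event_type, amount))


async def run_with_credits(ledger: InMemoryLedger, event_type: str, cost: int,
                           call: Callable[[], Awaitable[T]]) -> T:
    if not await ledger.reserve(cost):       # steps 1-2: check and reserve
        raise InsufficientCredits(event_type)
    try:
        result = await call()                # step 3: execute the call
    except Exception:
        await ledger.refund(cost)            # step 5: refund on failure
        raise                                # (task cancellation is not handled here)
    await ledger.record(event_type, cost)    # step 4: record the usage event
    return result


async def demo() -> InMemoryLedger:
    ledger = InMemoryLedger(balance=10)

    async def ok() -> str:
        return "answer"

    async def boom() -> str:
        raise RuntimeError("model timeout")

    await run_with_credits(ledger, "agent_call", 3, ok)        # balance 10 -> 7
    try:
        await run_with_credits(ledger, "agent_call", 3, boom)  # reserved, then refunded
    except RuntimeError:
        pass
    return ledger
```

Recording the event only after success keeps the audit trail free of refunded calls. Logging failed attempts as zero-cost events is an equally reasonable choice if you want them in analytics.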

Monthly credit renewal

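Stripe sends a customer.subscription.updated webhook on each renewal, but it also sends one for plan changes and other subscription edits, and it can deliver the same event more than once. Grant credits only when the billing period has actually advanced:

@webhook_handler("customer.subscription.updated")
async def on_subscription_renewed(event: stripe.Event, session: AsyncSession):
    sub = event.data.object
    org = await get_org_by_stripe_subscription(sub.id, session)
    # Use ["items"]: attribute access can resolve to dict.items() on the Stripe object
    item = sub["items"]["data"][0]
    plan = await get_plan_by_stripe_price_id(item["price"]["id"], session)
    org.plan_code = plan.code
    # On API versions where this field moved to the subscription item,
    # read item["current_period_start"] instead.
    period_start = datetime.fromtimestamp(sub["current_period_start"], UTC)
    if org.credits_renewed_at is None or period_start > org.credits_renewed_at:
        # Grant monthly credits - reset balance to plan allocation
        org.credits_balance = plan.monthly_credits_base
        org.credits_renewed_at = period_start
    session.add(org)
    await session.commit()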

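Critical implementation detail: validate the webhook signature before processing. Stripe sends a Stripe-Signature header containing a timestamp and an HMAC of the raw request body, keyed with your STRIPE_WEBHOOK_SECRET. Verify it against the unparsed body bytes (not re-serialized JSON) before trusting the event, or anyone who finds the endpoint can forge a renewal.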

def verify_stripe_signature(payload: bytes, sig_header: str, secret: str) -> stripe.Event:
    try:
        return stripe.Webhook.construct_event(payload, sig_header, secret)
    except (ValueError, stripe.SignatureVerificationError) as e:
        raise HTTPException(status_code=400, detail=f"Invalid webhook: {e}")

Per-organization quotas

For Business / Enterprise tiers, you want soft and hard caps:

  • Soft cap - at 80% of monthly allocation, send an email
  • Hard cap - at 100%, block further calls until renewal or top-up

Soft caps run as a Celery task on a schedule. Hard caps are enforced inline by the reserve_credits function - if credits_balance < amount, the API returns 402 Payment Required and the frontend shows a “Top up” CTA.
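The cap logic itself is small enough to express as a pure function that both the scheduled task and the API layer could share. This is a sketch, assuming usage is measured as the plan allocation minus the current balance (so no top-ups); `QuotaState`, `quota_state`, and `soft_cap_percent` are hypothetical names.

```python
from enum import Enum


class QuotaState(Enum):
    OK = "ok"
    SOFT_CAP = "soft_cap"   # send the warning email
    HARD_CAP = "hard_cap"   # block calls until renewal or top-up


def quota_state(credits_balance: int, monthly_credits_base: int,
                soft_cap_percent: int = 80) -> QuotaState:
    """Classify an org by how much of its monthly allocation it has used."""
    if credits_balance <= 0:
        return QuotaState.HARD_CAP
    used = monthly_credits_base - credits_balance
    # Integer arithmetic avoids float rounding right at the 80% boundary
    if used * 100 >= soft_cap_percent * monthly_credits_base:
        return QuotaState.SOFT_CAP
    return QuotaState.OK
```

With a 1,000-credit plan, a balance of 200 is exactly 80% used and returns `QuotaState.SOFT_CAP`. The scheduled task would also need to remember that it already emailed the org for the current period, which this sketch leaves out.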

Anomaly detection

In 30+ production AI agent deployments, the single most common bill-shock issue is a buggy agent loop that calls itself recursively, consuming credits in seconds. Detect this:

async def detect_usage_anomaly(org_id: UUID, session: AsyncSession) -> bool:
    """Returns True if the last 5 minutes consumed >5x the credits expected from the prior 55 minutes."""
    now = datetime.now(UTC)
    hour_ago = now - timedelta(hours=1)
    recent_start = now - timedelta(minutes=5)
    # Baseline: total credits in the 55 minutes before the recent window.
    # (func.avg would give the mean cost per event, not a usage rate.)
    baseline = await session.execute(
        select(func.sum(UsageEvent.cost_credits))
        .where(
            UsageEvent.organization_id == org_id,
            UsageEvent.created_at >= hour_ago,
            UsageEvent.created_at < recent_start,
        )
    )
    recent = await session.execute(
        select(func.sum(UsageEvent.cost_credits))
        .where(UsageEvent.organization_id == org_id, UsageEvent.created_at >= recent_start)
    )
    baseline_total = baseline.scalar() or 0
    recent_total = recent.scalar() or 0
    expected = baseline_total * 5 / 55  # 5 minutes' worth at the baseline rate
    return recent_total > 5 * expected and recent_total > 100  # 5x threshold + min volume

Trigger an alert + auto-throttle when this fires. AgentBoiler ships this pattern with Slack alerting hooks.
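The threshold arithmetic is easier to reason about with concrete numbers. The sketch below is a pure-Python reading of the rule that can be unit-tested without a database. It assumes the baseline is the total credits spent in the 55 minutes before the recent 5-minute window, and `is_usage_anomaly` with its parameters is a hypothetical helper, not AgentBoiler's API.

```python
from datetime import datetime, timedelta, timezone


def is_usage_anomaly(events: list[tuple[datetime, int]], now: datetime,
                     multiplier: float = 5.0, min_credits: int = 100) -> bool:
    """events are (created_at, cost_credits) pairs for one organization."""
    recent_start = now - timedelta(minutes=5)
    baseline_start = now - timedelta(hours=1)
    recent_total = sum(c for t, c in events if recent_start <= t <= now)
    baseline_total = sum(c for t, c in events if baseline_start <= t < recent_start)
    expected = baseline_total * 5 / 55  # 5 minutes' worth at the baseline rate
    return recent_total > multiplier * expected and recent_total > min_credits


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# Steady usage: one 20-credit call every 6 minutes, 180 credits over the baseline window
steady = [(now - timedelta(minutes=m), 20) for m in range(6, 60, 6)]
# Runaway loop: a 10-credit call every 10 seconds, 150 credits in under 3 minutes
loop = [(now - timedelta(seconds=s), 10) for s in range(0, 150, 10)]

normal = is_usage_anomaly(steady + [(now - timedelta(minutes=1), 20)], now)
runaway = is_usage_anomaly(steady + loop, now)
```

Here the baseline implies about 16 credits per 5 minutes, so the 5x threshold sits near 82 credits: a single ordinary call stays under it, while the loop's 150 credits clears both the multiplier and the 100-credit floor. The floor also keeps a brand-new org with no baseline from alerting on its first few calls.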

Stripe Customer Portal - let users manage themselves

For self-service plan changes, card updates, cancellations, and invoice downloads, don’t build your own billing UI. Wire up the Stripe Customer Portal:

@router.post("/billing/portal-session")
async def create_portal_session(user: CurrentUser, session: AsyncSession):
    org = await get_user_org(user.id, session)
    portal = stripe.billing_portal.Session.create(
        customer=org.stripe_customer_id,
        return_url=f"{settings.FRONTEND_URL}/billing",
    )
    return {"url": portal.url}

Redirect the user to the portal URL. Stripe handles plan upgrades, downgrades, card updates, and invoice history. Subscription changes made in the portal come back to your customer.subscription.updated handler as webhooks.

What AgentBoiler ships

All of the above is in AgentBoiler - the Organization / UsageEvent / Plan models, the atomic reserve_credits function, the webhook handlers with signature validation, the Customer Portal integration, the anomaly detection task, the soft/hard quota enforcement, and the admin dashboard view that shows usage by org with 30-day sparklines.

Four weeks of billing infrastructure, pre-wired and production-tested.

Get AgentBoiler →