Overview
Cost attribution gives platform admins a defensible answer to "which workspace is using what, and what is it costing us?". It is delivered by the WorkspaceUsageSnapshot model (WD.11), a once-a-day Celery task that takes a measurement per workspace, a configurable rate card sourced from the VIGILO_USAGE_UNIT_COSTS environment variable, and a UI at /platform/cost with a leaderboard, time-series charts, and CSV export.
The page lists every workspace, the snapshot metrics for the chosen period, the computed cost, and the percentage of total org spend that workspace represents. Sort by any column, switch the period (7 / 30 / 90 / 365 days), and export the raw rows for finance to load into a chargeback spreadsheet.
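The percentage-of-total column can be reproduced outside the UI. The following is a minimal sketch, assuming the per-workspace costs for the period are already known; the function name share_of_total and the sample workspace slugs are illustrative and not part of Vigilo.

```python
from decimal import Decimal


def share_of_total(costs: dict[str, Decimal]) -> dict[str, Decimal]:
    """Return each workspace's cost as a percentage of the org total.

    `costs` maps a workspace slug to its cost for the selected period.
    If the org total is zero, every share is reported as 0 so that we
    never divide by zero.
    """
    total = sum(costs.values(), Decimal("0"))
    if total == 0:
        return {slug: Decimal("0") for slug in costs}
    return {
        slug: (cost / total * 100).quantize(Decimal("0.1"))
        for slug, cost in costs.items()
    }


# Hypothetical sample data, not real workspaces.
shares = share_of_total({
    "payments": Decimal("400.00"),
    "search": Decimal("350.00"),
    "sandbox": Decimal("250.00"),
})
```

Decimal is used here rather than float so that the percentages round predictably when they end up in a finance spreadsheet.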
Why it exists
Multi-tenant platforms get bigger by accident. A team spins up a workspace for a one-off project, the project succeeds, the workspace fills with members, audit logs grow, integrations multiply, and three quarters later the workspace is responsible for 40% of platform load with no one tracking it. Cost attribution exposes that drift, lets finance issue accurate cross-charges, and gives the platform team data to negotiate a hosting upgrade with leadership before capacity becomes a fire.
Key concepts
- WorkspaceUsageSnapshot: One row per (workspace, snapshot_date). Fields include member_count, api_call_count_24h, storage_bytes, change_count_24h, incident_count_24h, monitored_host_count, audit_event_count_24h, webhook_delivery_count_24h, celery_task_count_24h. Fields suffixed _24h are deltas for the previous 24h; gauges (members, storage, hosts) are point-in-time.
- Daily snapshot task: vigilo.platform.usage_snapshot runs at 00:10 UTC via Celery Beat. It iterates every active workspace, computes the metrics under a temporary RLS context, and upserts a row. The task is idempotent: re-running it for today simply overwrites today's row.
- Rate card: VIGILO_USAGE_UNIT_COSTS is a JSON env var, for example {"member_count": 5.00, "storage_gb": 0.10, "api_call": 0.000002, ...}. The cost engine multiplies each metric by its unit price; missing keys default to 0. Change the rate card any time; historical snapshots are unchanged but the displayed cost recomputes on the next page load.
- Leaderboard: Default view. One row per workspace, sortable by any column, percentage-of-total bar in the rightmost column.
- Time-series chart: Click any workspace row to expand a 90-day sparkline per metric. Useful for spotting the moment a workspace started growing.
- CSV export: The Export CSV button gives you the raw snapshot rows for the selected period. Columns match the model field names, plus a computed cost_usd per row. Open in Excel or pipe into a chargeback script.
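The rate-card arithmetic described above (each metric multiplied by its unit price, missing keys treated as 0) can be sketched as follows. This is an illustration under stated assumptions, not the Vigilo implementation: the mapping from rate-card keys to snapshot fields, the bytes-per-GB constant, and the function names are all hypothetical.

```python
import json
from typing import Optional

# Assumption: the docs do not say whether "GB" means 10^9 or 2^30 bytes.
BYTES_PER_GB = 1024 ** 3

# Assumed mapping from rate-card key to (snapshot field, divisor).
METRIC_SOURCES = {
    "member_count": ("member_count", 1),
    "storage_gb": ("storage_bytes", BYTES_PER_GB),
    "api_call": ("api_call_count_24h", 1),
    "celery_task": ("celery_task_count_24h", 1),
    "monitored_host": ("monitored_host_count", 1),
    "webhook_delivery": ("webhook_delivery_count_24h", 1),
}


def load_rate_card(raw: Optional[str]) -> dict[str, float]:
    """Parse the JSON rate card; an unset or empty value means no prices."""
    return json.loads(raw) if raw else {}


def snapshot_cost(snapshot: dict[str, int], rate_card: dict[str, float]) -> float:
    """Sum quantity * unit price over all metrics; missing prices count as 0."""
    total = 0.0
    for key, (field, divisor) in METRIC_SOURCES.items():
        quantity = snapshot.get(field, 0) / divisor
        total += quantity * rate_card.get(key, 0.0)
    return total


# In a real process the string would come from
# os.environ.get("VIGILO_USAGE_UNIT_COSTS"); a literal keeps this runnable.
rate_card = load_rate_card('{"member_count": 5.00, "storage_gb": 0.10}')
snapshot = {
    "member_count": 12,
    "storage_bytes": 50 * BYTES_PER_GB,
    "api_call_count_24h": 90_000,  # no "api_call" price, so it contributes 0
}
cost = snapshot_cost(snapshot, rate_card)
```

With this sample the members contribute 12 × 5.00 and storage contributes 50 × 0.10, while API calls add nothing because the rate card has no api_call key.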
Common workflows
1. Find your most expensive workspace this month
- Open Platform → Cost.
- Set the period to 30 days.
- Sort by Cost (USD) descending.
- The top row is your biggest contributor. Click to expand the time-series — if the slope is steep, that workspace is still growing; if flat, it's just naturally large.
2. Tune your rate card
- SSH to a Django host and update VIGILO_USAGE_UNIT_COSTS in the environment.
- Restart the Django/Gunicorn process so that it picks up the new environment. The cost page reads the rate card at request time, so the new prices apply as soon as the process is back up.
- Reload the cost page. The Cost column now reflects the new pricing for every historical snapshot.
A reasonable starting rate card for an internal chargeback model:
{
  "member_count": 5.00,
  "storage_gb": 0.10,
  "api_call": 0.000002,
  "celery_task": 0.0001,
  "monitored_host": 0.50,
  "webhook_delivery": 0.00005
}
Adjust until the org total matches your actual hosting bill.
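One simple way to do that adjustment is to scale every price by the same factor. The sketch below assumes uniform scaling is acceptable for your chargeback model (it preserves the relative weights between metrics); the function name and the sample totals are hypothetical.

```python
def scale_rate_card(
    rate_card: dict[str, float],
    computed_total: float,
    actual_bill: float,
) -> dict[str, float]:
    """Scale all unit prices so the org total matches the hosting bill.

    `computed_total` is the org-wide cost the current rate card produces
    for a period; `actual_bill` is the real hosting spend for the same
    period.
    """
    if computed_total <= 0:
        raise ValueError("computed_total must be positive to derive a factor")
    factor = actual_bill / computed_total
    return {key: price * factor for key, price in rate_card.items()}


# Hypothetical numbers: the rate card yields 8,000 but the bill was 10,000.
scaled = scale_rate_card(
    {"member_count": 5.00, "storage_gb": 0.10},
    computed_total=8_000.0,
    actual_bill=10_000.0,
)
```

If one metric dominates your real bill (storage, for instance), adjusting that single price by hand may model reality better than a uniform factor.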
3. Export for finance
- Set the period to the calendar month you are billing for (use Custom range).
- Click Export CSV.
- The download vigilo-usage-{YYYYMMDD}-{YYYYMMDD}.csv contains one row per workspace per snapshot day, with raw counters and the computed cost_usd.
- Pivot in Excel by workspace_slug and sum cost_usd to get monthly chargeback per team.
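If you prefer a script to an Excel pivot, the same aggregation can be sketched with the standard library. This assumes the export has workspace_slug and cost_usd columns as described above; the function name and the sample rows are illustrative only.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def monthly_chargeback(csv_file) -> dict[str, Decimal]:
    """Sum cost_usd per workspace_slug across all snapshot rows."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        totals[row["workspace_slug"]] += Decimal(row["cost_usd"])
    return dict(totals)


# Hypothetical sample standing in for a downloaded export file.
sample = io.StringIO(
    "workspace_slug,snapshot_date,cost_usd\n"
    "payments,2024-05-01,12.50\n"
    "payments,2024-05-02,12.50\n"
    "search,2024-05-01,3.25\n"
)
totals = monthly_chargeback(sample)
```

For a real export, pass an open file instead of the StringIO, for example `open(path, newline="")`.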
4. Spot a runaway metric
- Sort the leaderboard by API calls / 24h.
- Click the top workspace to expand the sparkline.
- A vertical step in the chart is usually an integration misconfiguration — a polling loop with a 1-second interval, or a webhook fan-out without backoff. Open the workspace's Integrations page to find the culprit.
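The same check can be run against exported data by looking for the largest day-over-day jump in a metric. This is a minimal sketch, not how the sparkline is computed; the function name and the sample series are hypothetical.

```python
def largest_step(daily_counts: list[int]) -> tuple[int, int]:
    """Return (day_index, increase) for the biggest day-over-day jump.

    `day_index` is the position of the day on which the higher value
    was first observed.
    """
    if len(daily_counts) < 2:
        raise ValueError("need at least two days of data")
    deltas = [later - earlier for earlier, later in zip(daily_counts, daily_counts[1:])]
    jump = max(deltas)
    return deltas.index(jump) + 1, jump


# Hypothetical api_call_count_24h series: steady traffic, then a sudden step.
# 86,400 calls per day is one call per second, the polling-loop pattern
# described above.
series = [1_000, 1_100, 1_050, 1_200, 86_000, 86_400, 87_000]
day, jump = largest_step(series)
```

Here the step lands on index 4, which is the day to correlate with integration changes in that workspace.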
Permissions
Cost attribution is platform-admin only. The endpoint returns 403 with code='platform_admin_required' for everyone else.
| Action | Required role |
|---|---|
| View cost page | platform_admin |
| Edit rate card | platform_admin + shell access (it's env, not UI) |
| CSV export | platform_admin |
Per-workspace owners can see their own workspace's snapshots from Settings → Usage but cannot see peer workspaces.
Troubleshooting
The leaderboard is empty for today. The daily Celery snapshot task has not yet run. Check Celery Beat logs for vigilo.platform.usage_snapshot; manually trigger it with celery -A vigilo_celery call vigilo.platform.usage_snapshot to backfill.
Costs look way off. Check VIGILO_USAGE_UNIT_COSTS. The most common mistake is using api_call: 0.002 (per-thousand pricing) when the formula expects per-call. Divide by 1000 and reload.
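To see how large that error is, the short sketch below compares the two prices for one million API calls. The call volume and the helper name are made up for illustration.

```python
from decimal import Decimal


def per_call_price(per_thousand_price: Decimal) -> Decimal:
    """Convert a per-thousand-calls price into the per-call price."""
    return per_thousand_price / 1000


calls = 1_000_000  # hypothetical daily volume

# Per-thousand price mistakenly entered as the per-call price.
wrong = Decimal("0.002") * calls
# The same price converted to per-call first.
right = per_call_price(Decimal("0.002")) * calls
```

The mistaken entry yields 2000 for the day where the intended price yields 2, a factor of 1000.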
Storage bytes are zero for every workspace. The storage probe uses pg_total_relation_size against the workspace's RLS-filtered rows. If the database user does not have pg_read_all_stats, the probe falls back to zero. Grant the role and re-run the snapshot task.
A workspace I just created has no snapshot. The task only runs at 00:10 UTC daily; new workspaces appear after the next run. Trigger the task manually to backfill immediately.
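Triggering the task by hand is safe because it is idempotent: a second run for the same day overwrites the row instead of adding a duplicate. The sketch below illustrates that behaviour with an in-memory SQLite table. The table name, the single metric column, and the helper are assumptions made for the example and do not reflect the actual WorkspaceUsageSnapshot schema or task code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE usage_snapshot (
        workspace TEXT NOT NULL,
        snapshot_date TEXT NOT NULL,
        member_count INTEGER NOT NULL,
        PRIMARY KEY (workspace, snapshot_date)
    )
    """
)


def upsert_snapshot(conn, workspace, snapshot_date, member_count):
    """Insert the row, or replace the existing row for the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO usage_snapshot VALUES (?, ?, ?)",
        (workspace, snapshot_date, member_count),
    )


# Running twice for the same (workspace, snapshot_date) keeps one row
# and retains the latest measurement.
upsert_snapshot(conn, "payments", "2024-05-01", 10)
upsert_snapshot(conn, "payments", "2024-05-01", 12)
rows = conn.execute(
    "SELECT workspace, snapshot_date, member_count FROM usage_snapshot"
).fetchall()
```

In a Django codebase the equivalent pattern is usually expressed with the ORM's update_or_create, keyed on the same pair of fields.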
Roadmap note: real cloud billing integration
The current cost engine uses a unit rate card multiplied by Vigilo-side counters — it does not pull real spend from your cloud provider. Direct integration with AWS Cost & Usage Reports (CUR) and Google Cloud Billing Export is on the roadmap and will land as a second cost stream alongside the snapshot-based one. Until then, treat the displayed cost as an internal chargeback estimate, not an actual cloud bill.
Related articles
- Executive dashboard — the cost tile on the executive view links back here.
- Workspace templates — controlling sprawl by standardising what a new workspace looks like.