RBAC and workspace tenancy

Overview

Vigilo enforces tenant isolation at two layers: the application layer (every Django ViewSet inherits WorkspaceScopedMixin from workspaces.mixins) and the database layer (Postgres row-level security policies on every workspace-scoped table). The two enforce the same invariant from different sides: a session bound to workspace A can never see, modify, or delete rows from workspace B, even if the application code has a bug.

On top of that isolation sits the role model: six built-in workspace roles plus an arbitrary number of CustomRole rows with their own permission grants. Roles determine what you can do inside the workspace; tenancy determines which workspace you can do anything in at all.

The six built-in workspace roles

Role	Intent	Typical permissions
owner	Workspace lifecycle, billing, all settings	Everything, including delete workspace
admin	Day-to-day config and member management	Everything except billing and delete workspace
approver	Sign off on changes; CAB attendance	Approve/reject approval steps, attend CAB
engineer	Build and execute work	Create changes, incidents, runbooks, hosts
viewer	Read-only	List + read on operational tables; no actions
platform_admin	Cross-workspace operator (installation level)	All workspaces; executive dashboard; templates

Roles are stored on WorkspaceMembership.role. A user can hold different roles in different workspaces: being an owner of acme-prod does not grant any access in acme-staging. The platform_admin role is special: when present on any membership row, it grants installation-wide read across all workspaces through the cross-workspace endpoints (/api/v1/platform/...). Workspace-scoped URLs still require explicit membership in that workspace.

CustomRole inheritance

When the six built-ins are not granular enough, owners and admins can define CustomRole rows. Each CustomRole has:

name — e.g. "Release Manager", "Cert Operator"
inherits_from — one of the built-in roles. The custom role starts with that role's grants.
grants — JSON list of permission keys to add (e.g. cert.download, change.bulk_approve)
revokes — JSON list of permission keys to remove from the inherited base

A membership row references either a built-in role (role field) or a custom role (custom_role FK). The permission check resolves them in the same code path — UserProfile.has_perm(perm, workspace) walks the role definition and returns True/False.
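
The resolution described above can be sketched as follows. This is a minimal illustration, not the actual UserProfile.has_perm implementation: the BUILTIN_GRANTS mapping, the change.read and change.create keys, and the function names are assumptions made for the example. Revokes are applied after grants, which matches the note under Troubleshooting that revokes override grants.

```python
from dataclasses import dataclass, field

# Hypothetical base grants for two built-in roles. The real permission keys
# live in the codebase and will differ.
BUILTIN_GRANTS = {
    "viewer": frozenset({"change.read"}),
    "engineer": frozenset({"change.read", "change.create"}),
}


@dataclass(frozen=True)
class CustomRole:
    name: str
    inherits_from: str  # one of the built-in role names
    grants: frozenset = field(default_factory=frozenset)
    revokes: frozenset = field(default_factory=frozenset)


def effective_permissions(role):
    """Return the permission set for a built-in role name or a CustomRole."""
    if isinstance(role, str):
        return BUILTIN_GRANTS[role]
    base = BUILTIN_GRANTS[role.inherits_from]
    # Revokes are applied last, so a key present in both lists ends up revoked.
    return (base | role.grants) - role.revokes


def has_perm(role, perm):
    """Single code path for built-in and custom roles."""
    return perm in effective_permissions(role)


cert_operator = CustomRole(
    name="Cert Operator",
    inherits_from="engineer",
    grants=frozenset({"cert.download", "cert.bulk_renew"}),
    revokes=frozenset({"change.create"}),
)
```

With this definition, cert_operator keeps change.read from engineer, gains the two cert.* keys, and loses change.create.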

WorkspaceScopedMixin — the app-layer enforcement

WorkspaceScopedMixin lives in workspaces.mixins and is the most important enforcement primitive in the codebase. Every Django ViewSet that touches workspace-scoped data MUST inherit from it (CLAUDE.md rule #1). It does three things:

Extracts the workspace slug from the URL kwargs (/ws/{slug}/api/v1/...) and resolves it to a Workspace instance, attached to the request as request.workspace.
Filters every queryset returned by get_queryset() to qs.filter(workspace=request.workspace).
Forces serializer.save(workspace=request.workspace) on create, so users cannot post a row claiming to belong to a different workspace.
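
The three steps above can be sketched as a mixin that overrides the standard Django REST Framework hooks initial(), get_queryset(), and perform_create(). This is an illustration under assumptions, not the code in workspaces.mixins: the resolve_workspace hook and the "slug" kwarg name are hypothetical, and the sketch avoids importing Django so it can be read in isolation.

```python
class WorkspaceScopedMixin:
    """Sketch of the three steps. List it before the DRF base class."""

    def resolve_workspace(self, slug):
        # Hypothetical hook. In a Django project this would look up the
        # Workspace row for the slug and return a 404 when it is missing.
        raise NotImplementedError

    def initial(self, request, *args, **kwargs):
        # Step 1: resolve the slug from the URL kwargs before the handler
        # (and any permission checks in the base class) run.
        request.workspace = self.resolve_workspace(kwargs["slug"])
        super().initial(request, *args, **kwargs)

    def get_queryset(self):
        # Step 2: narrow whatever the ViewSet returns to the active workspace.
        return super().get_queryset().filter(workspace=self.request.workspace)

    def perform_create(self, serializer):
        # Step 3: the workspace comes from the URL, never from the payload.
        serializer.save(workspace=self.request.workspace)
```

Because the overrides call super(), the mixin only works when it appears before the framework base class in the ViewSet's list of bases.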

Missing the mixin on a new ViewSet is the single most common cause of cross-workspace data leaks during development. The test suite includes a meta-test that fails CI if a new ViewSet is registered without the mixin.
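
A meta-test of this kind could look roughly like the sketch below. The real test may collect ViewSets differently; the unscoped_viewsets helper, its exempt parameter, and the stand-in classes are assumptions for illustration. In a DRF project, the list of classes could plausibly be gathered from the router's registry.

```python
def unscoped_viewsets(viewsets, mixin, exempt=()):
    """Return the names of ViewSet classes that do not inherit from `mixin`.

    `exempt` lists classes that are intentionally not workspace-scoped,
    such as cross-workspace platform endpoints.
    """
    return sorted(
        cls.__name__
        for cls in viewsets
        if cls not in exempt and not issubclass(cls, mixin)
    )


# Stand-in classes so the sketch runs on its own.
class WorkspaceScopedMixin:
    pass


class HostViewSet(WorkspaceScopedMixin):
    pass


class LeakyViewSet:
    pass


def test_viewsets_are_scoped_sketch():
    # In a real suite the list would hold every registered ViewSet.
    offenders = unscoped_viewsets([HostViewSet], WorkspaceScopedMixin)
    assert offenders == [], f"ViewSets missing WorkspaceScopedMixin: {offenders}"
```

Reporting the offending class names in the assertion message makes the CI failure actionable without further digging.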

Postgres RLS — the DB-layer enforcement

Row-level security is enabled per-table via migrations (workspaces/migrations/0002_enable_rls.py originally; 0005, 0006, 0016 add more tables as new apps land). Each policy checks the GUC app.current_workspace_id:

CREATE POLICY workspace_isolation ON change_request
  USING (workspace_id::text = current_setting('app.current_workspace_id', true));

Additionally, certain tables use app.current_user_id to enforce per-user self-visible rows — e.g. notifications, MFA secrets, personal API tokens. The policy looks like:

CREATE POLICY user_self ON user_notification
  USING (user_id::text = current_setting('app.current_user_id', true));

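This means a Django process connected as the application's RLS-restricted database role can read only rows belonging to the active workspace and (where applicable) the active user, even if the ORM forgets a filter. Because neither policy names a command, each one applies to UPDATE and DELETE as well as SELECT. Postgres does not apply RLS to superusers, to roles with BYPASSRLS, or (unless FORCE ROW LEVEL SECURITY is set) to the table owner, so the application must not connect as any of these.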

RLSContextMiddleware

RLSContextMiddleware runs early in the request pipeline. It:

Resolves the active workspace from the URL.
Resolves the active user from the auth session.
Issues SET LOCAL app.current_workspace_id = '<uuid>' and SET LOCAL app.current_user_id = '<uuid>' on the request's database connection.
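When the request's transaction ends, the SET LOCAL values vanish (they are transaction-scoped), so the connection is safe to reuse. SET LOCAL has no effect outside a transaction block, so the request must run inside one.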
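
The GUC-setting step can be sketched as a small helper. It uses Postgres's set_config(name, value, is_local) function with is_local set to true, which has the same transaction-scoped effect as SET LOCAL but accepts bound parameters. The helper name set_rls_context is an assumption, and the helper accepts any DB-API cursor; it is not the middleware's actual code.

```python
def set_rls_context(cursor, workspace_id, user_id):
    """Set both RLS GUCs for the current transaction on a DB-API cursor."""
    # The third argument (is_local = true) scopes each value to the current
    # transaction, like SET LOCAL. The IDs are passed as bound parameters
    # rather than interpolated into the SQL string.
    cursor.execute(
        "SELECT set_config('app.current_workspace_id', %s, true), "
        "set_config('app.current_user_id', %s, true)",
        [str(workspace_id), str(user_id)],
    )
```

In Django, the cursor would typically come from connection.cursor() while a transaction is open, for example inside transaction.atomic() or with ATOMIC_REQUESTS enabled.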

For Celery tasks, the same context is set explicitly at task start via the with_rls_context(workspace_id, user_id) decorator. Tasks that talk to the database without this decorator hit a connection where the GUCs are unset, so current_setting(..., true) returns NULL, the policy comparison never evaluates to true, and RLS returns zero rows. That is the safe failure mode.
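
A decorator of this kind could be built roughly as follows. The real with_rls_context takes (workspace_id, user_id) and its internals are not shown here, so this sketch makes two assumptions: the IDs arrive as the task's first two arguments, and the database access is injected through a hypothetical transaction_cursor context manager so the example runs without Django.

```python
import functools


def make_with_rls_context(transaction_cursor):
    """Build a with_rls_context-style decorator.

    `transaction_cursor` is a hypothetical zero-argument context manager that
    opens a transaction and yields a DB-API cursor. In Django it could wrap
    transaction.atomic() and connection.cursor().
    """

    def decorator(task_fn):
        @functools.wraps(task_fn)
        def wrapper(workspace_id, user_id, *args, **kwargs):
            with transaction_cursor() as cursor:
                # Transaction-scoped: the values disappear when the
                # transaction commits or rolls back.
                cursor.execute(
                    "SELECT set_config('app.current_workspace_id', %s, true), "
                    "set_config('app.current_user_id', %s, true)",
                    [str(workspace_id), str(user_id)],
                )
                # The task body runs inside the same transaction.
                return task_fn(workspace_id, user_id, *args, **kwargs)

        return wrapper

    return decorator
```

Running the task body inside the same transaction as the set_config call is the important design point: a transaction-scoped setting made in one transaction is gone by the time a later one starts.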

Why both layers

App-layer filtering is fast and gives nice error messages. DB-layer RLS is the safety net. The argument for keeping both:

App layer alone: A bug in a custom raw SQL query or a missed .filter() call leaks data across workspaces. This has happened across the industry, and the root cause is usually human error.
DB layer alone: An adversary who can issue raw SQL is still constrained by RLS, but the application stops giving useful error messages (the data simply isn't there) and complex ORM queries get slower (the RLS predicate is added to every query).
Both: The app filter narrows each query to the active workspace, and the RLS policy catches whatever the app filter misses. The added overhead is small because the RLS predicate repeats a condition the app filter has already applied.

Common workflows

1. Add a new workspace-scoped table

Define the model with workspace = models.ForeignKey(Workspace, on_delete=models.CASCADE) and the standard uuid_pk, created_at, updated_at fields.
Build the ViewSet inheriting WorkspaceScopedMixin.
Register the URL under /ws/<slug>/api/v1/{app}/.
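Add the table to RLS in a new migration. Use workspaces/migrations/0002_enable_rls.py (or the latest RLS-additions migration) as the template, and include an ALTER TABLE ... ENABLE ROW LEVEL SECURITY statement plus a CREATE POLICY block for the new table. Editing a migration that has already been applied does not change existing databases.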
Write a workspace isolation test that creates a row in workspace A, switches the session to workspace B, and confirms the row is invisible (CLAUDE.md rule #6).
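
For the RLS migration step, the two statements follow the same shape for every table, so they can be generated. The sketch below is an illustration: the rls_statements helper and the cert_inventory table name in the usage note are hypothetical, and the policy text mirrors the workspace_isolation policy shown earlier.

```python
import re

# Plain lowercase identifiers only. Table names cannot be bound as SQL
# parameters, so they are validated before being formatted into the text.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def rls_statements(table):
    """Return the SQL statements that enable workspace isolation on `table`."""
    if not _IDENT.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        f"CREATE POLICY workspace_isolation ON {table}\n"
        "  USING (workspace_id::text = "
        "current_setting('app.current_workspace_id', true));",
    ]
```

Calling rls_statements("cert_inventory") yields one ALTER TABLE statement and one CREATE POLICY statement, which a Django migration could pass to migrations.RunSQL along with reverse SQL that drops the policy and disables RLS.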

2. Create a custom role

Settings → Roles → New custom role.
Name it (e.g. "Cert Operator"), pick engineer as the inherited base, add cert.download and cert.bulk_renew to grants.
Save. Assign the role to a member via Settings → Members → Edit role.

3. Verify isolation manually

Open two browser sessions, each logged in as a user in a different workspace, and navigate to the same URL pattern with each workspace's slug. Each session sees only its own data, regardless of which user is technically authenticated.

Permissions

Action	Required role
Read any row in workspace	Member (any role)
Create/edit operational rows	Engineer or higher
Approve changes	Approver or higher
Manage members and roles	Admin or owner
Manage custom roles	Owner
Cross-workspace read	Platform admin

Troubleshooting

A user reports "I can see another team's data". This is a P0. The most likely cause is a new ViewSet missing WorkspaceScopedMixin. Run the meta-test (pytest backend/tests/meta/test_viewsets_are_scoped.py) to find it. RLS will still prevent damage if the affected table has its policy enabled.

Celery task returns empty querysets in prod. The task isn't setting the RLS GUCs. Decorate the task with @with_rls_context(workspace_id, user_id) or set the context manually at the top of the task body.

A custom role can't do something it should. Check revokes, which override grants. Also remember that the built-in approver role still respects separation of duties (SoD): even with a permission grant, an approver cannot approve a change they requested.

A platform admin can't see another workspace's data via the API. Cross-workspace endpoints (/api/v1/platform/...) are the only path. Workspace-scoped URLs (/ws/{slug}/api/v1/...) still require explicit membership in that workspace.

Related

SAML and SSO — federated identity that populates roles.
SCIM provisioning — automated membership management.
Audit log — every permission check that mattered is recorded.