consentos/CLAUDE.md
James Cottrill fbf26453f2 feat: initial public release
ConsentOS — a privacy-first cookie consent management platform.

Self-hosted, source-available alternative to OneTrust, Cookiebot, and
CookieYes. Full standards coverage (IAB TCF v2.2, GPP v1, Google
Consent Mode v2, GPC, Shopify Customer Privacy API), multi-tenant
architecture with role-based access, configuration cascade
(system → org → group → site → region), dark-pattern detection in
the scanner, and a tamper-evident consent record audit trail.

This is the initial public release. Prior development history is
retained internally.

See README.md for the feature list, architecture overview, and
quick-start instructions. Licensed under the Elastic Licence 2.0 —
self-host freely; do not resell as a managed service.
2026-04-14 09:18:18 +00:00

ConsentOS

Project Overview

ConsentOS is a multi-tenant cookie consent management platform — a source-available alternative to OneTrust, Cookiebot, and CookieYes — that provides cookie scanning, consent collection, auto-blocking, and compliance checking across many sites with per-site configuration.

The platform delivers a single <script> tag that site owners embed. This script handles consent collection, cookie blocking, IAB TCF v2.2, and Google Consent Mode v2 signalling. A separate admin dashboard allows site owners to manage configurations, review scan results, and check compliance.

Public repo: github.com/consentos/consentos
Domain: consentos.dev

Architecture Summary

CDN (static assets)
├── consent-loader.js      (~2KB gzipped, sync bootstrap)
├── consent-bundle-{v}.js  (~25KB gzipped, full banner + blocker)
├── site-config-{id}.json  (cached site configuration)
└── translations-{locale}.json

Client Browser
├── Script Interceptor (MutationObserver + createElement override)
├── Cookie Blocker (document.cookie proxy, Storage proxy)
├── Banner UI (Shadow DOM, customisable, a11y-compliant)
├── TCF v2.2 API (__tcfapi)
├── Google Consent Mode v2 (gtag integration)
├── Client-side Cookie Reporter
└── Consent State Manager
        │
        ▼
API Layer (FastAPI)
├── Config API    — site/org CRUD, banner config, allow-lists, CDN publishing
├── Consent API   — consent recording, retrieval, TC string generation, analytics
├── Scanner API   — scan management, client-side cookie reports
└── Admin BFF     — aggregates the above for the admin UI

Scanner Service (Python + Playwright)
├── Scheduled headless browser crawls
├── Cookie discovery and script attribution
└── Auto-categorisation via known cookies DB

PostgreSQL — all persistent state
Redis      — caching, rate limiting, Celery job queue

Admin UI (Vite + React + TypeScript)
├── Site management, configuration editor
├── Cookie manager, allow-list management
├── Banner builder (visual editor with live preview)
├── Compliance checker (GDPR, CNIL, CCPA, ePrivacy, LGPD)
└── Analytics dashboard (consent rates, trends, regional)

Technology Stack

Backend (apps/api/)

  • Language: Python 3.12+
  • Framework: FastAPI
  • ORM: SQLAlchemy 2.0 (async)
  • Migrations: Alembic
  • Database: PostgreSQL 16
  • Cache/Queue: Redis + Celery
  • Auth: JWT (org-scoped, role-based)
  • Validation: Pydantic v2

Scanner (apps/scanner/)

  • Language: Python 3.12+
  • Browser automation: Playwright
  • Job scheduling: Celery + Redis

Banner Script (apps/banner/)

  • Language: TypeScript
  • Build: Rollup (outputs IIFE bundles)
  • UI isolation: Shadow DOM
  • Standards: IAB TCF v2.2, Google Consent Mode v2

Admin UI (apps/admin-ui/)

  • Framework: Vite + React + TypeScript
  • Primary UI: shadcn/ui + TailwindCSS
  • Complex components: MUI (DataGrid for tables, charts)
  • Server state: TanStack Query
  • Client state: Zustand
  • Routing: React Router v6
  • Forms: React Hook Form + Zod
  • i18n: react-i18next

Infrastructure

  • Containerisation: Docker / Docker Compose
  • Orchestration: Kubernetes (Helm chart)
  • CDN: Cloud-agnostic (Cloudflare, Cloud CDN, or CloudFront)

Project Structure

consent-platform/
├── apps/
│   ├── api/                          # FastAPI backend
│   │   ├── src/
│   │   │   ├── config/               # Pydantic settings, environment
│   │   │   ├── models/               # SQLAlchemy models
│   │   │   ├── schemas/              # Pydantic request/response schemas
│   │   │   ├── routers/              # API route handlers
│   │   │   │   ├── config.py         # site/org config endpoints
│   │   │   │   ├── consent.py        # consent recording/retrieval
│   │   │   │   ├── scanner.py        # scan management
│   │   │   │   ├── analytics.py      # analytics endpoints
│   │   │   │   ├── compliance.py     # compliance checker
│   │   │   │   └── auth.py           # authentication
│   │   │   ├── services/             # Business logic
│   │   │   │   ├── consent.py
│   │   │   │   ├── tcf.py            # TC string encoding/decoding
│   │   │   │   ├── gcm.py            # Google Consent Mode logic
│   │   │   │   ├── compliance.py     # Compliance rule engine
│   │   │   │   ├── publisher.py      # CDN publishing
│   │   │   │   └── classification.py # Cookie auto-categorisation
│   │   │   ├── db/                   # Database connection, session
│   │   │   └── main.py
│   │   ├── tests/
│   │   ├── alembic/
│   │   ├── pyproject.toml
│   │   └── Dockerfile
│   │
│   ├── scanner/                      # Cookie scanner service
│   │   ├── src/
│   │   │   ├── crawler.py            # Playwright-based crawler
│   │   │   ├── classifier.py         # Cookie classification
│   │   │   ├── scheduler.py          # Scan job scheduling
│   │   │   └── worker.py             # Celery worker
│   │   ├── Dockerfile
│   │   └── pyproject.toml
│   │
│   ├── admin-ui/                     # Vite + React + TS admin dashboard
│   │   ├── src/
│   │   │   ├── components/
│   │   │   ├── pages/
│   │   │   ├── hooks/
│   │   │   ├── api/                  # TanStack Query hooks
│   │   │   ├── stores/               # Zustand stores
│   │   │   └── i18n/
│   │   ├── package.json
│   │   ├── vite.config.ts
│   │   ├── tsconfig.json
│   │   └── tailwind.config.ts
│   │
│   └── banner/                       # Client-side consent banner
│       ├── src/
│       │   ├── loader.ts             # Lightweight bootstrap (~2KB)
│       │   ├── banner.ts             # Banner UI engine
│       │   ├── blocker.ts            # Script/cookie interceptor
│       │   ├── tcf.ts                # TCF v2.2 API implementation
│       │   ├── gcm.ts                # Google Consent Mode v2
│       │   ├── reporter.ts           # Client-side cookie reporter
│       │   ├── consent.ts            # Consent state management
│       │   ├── i18n.ts               # Translation loader
│       │   ├── a11y.ts               # Accessibility utilities
│       │   └── types.ts
│       ├── rollup.config.js
│       ├── package.json
│       └── tsconfig.json
│
├── packages/
│   └── shared/                       # Shared types, constants, utils
│
├── helm/consentos/                  # Kubernetes deployment
├── docker-compose.yml
├── Makefile
└── README.md

Key Data Entities

  • organisations — multi-tenant root, each org has multiple sites
  • users — org-scoped with roles: owner, admin, editor, viewer
  • sites — a domain being managed (e.g. example.com), belongs to an org
  • site_configs — full configuration per site: blocking mode, TCF settings, GCM defaults, banner config JSON, scan schedule, consent expiry
  • cookie_categories — taxonomy (necessary, functional, analytics, marketing, personalisation) with TCF purpose and GCM consent type mappings
  • cookies — discovered cookies per site with metadata, vendor, category, review status
  • cookie_allow_list — approved cookies per site with category assignment
  • known_cookies — shared knowledge base of known cookie patterns for auto-categorisation
  • consent_records — audit trail of every consent event (partitioned by month)
  • scan_jobs / scan_results — scanning pipeline state and results
  • translations — i18n strings per site per locale
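
As a rough illustration of how the known_cookies knowledge base could drive auto-categorisation, the sketch below matches cookie names against glob-style patterns. The KnownCookie fields, the use of glob patterns, and the classify_cookie helper are assumptions made for this example, not the actual schema; the sample entries are widely documented Google Analytics and Meta Pixel cookie names.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional, Sequence


@dataclass(frozen=True)
class KnownCookie:
    """Hypothetical row shape for the known_cookies table."""
    pattern: str   # glob-style name pattern, e.g. "_ga_*"
    vendor: str
    category: str  # one of the cookie_categories slugs


# Illustrative seed data only; a real knowledge base would be far larger.
KNOWN_COOKIES: Sequence[KnownCookie] = (
    KnownCookie("_ga", "Google Analytics", "analytics"),
    KnownCookie("_ga_*", "Google Analytics", "analytics"),
    KnownCookie("_gid", "Google Analytics", "analytics"),
    KnownCookie("_fbp", "Meta Pixel", "marketing"),
)


def classify_cookie(
    name: str, known: Sequence[KnownCookie] = KNOWN_COOKIES
) -> Optional[KnownCookie]:
    """Return the first known entry whose pattern matches, else None."""
    for entry in known:
        # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
        if fnmatchcase(name, entry.pattern):
            return entry
    return None
```

A cookie that returns None here would presumably stay uncategorised and be left for manual review in the cookie manager.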

Configuration Hierarchy

Configuration resolves in this order (each level overrides the previous):

System Defaults (code) → Organisation Defaults → Site Config → Regional Overrides
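
The cascade above can be sketched as a left-to-right merge in which later layers win. This is a minimal illustration that assumes nested dictionaries merge key by key rather than being replaced wholesale; the function names and the sample keys (consent_expiry_days, banner.theme) are hypothetical.

```python
from copy import deepcopy
from typing import Any, Mapping


def merge_config(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict in which `override` wins; nested dicts merge recursively."""
    merged = deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers: Mapping[str, Any]) -> dict[str, Any]:
    """Apply layers in order: system, organisation, site, regional."""
    resolved: dict[str, Any] = {}
    for layer in layers:
        resolved = merge_config(resolved, layer)
    return resolved


system_defaults = {
    "blocking_mode": "opt_in",
    "consent_expiry_days": 365,
    "banner": {"position": "bottom_banner", "theme": "light"},
}
org_defaults = {"banner": {"theme": "dark"}}
site_config = {"consent_expiry_days": 180}
regional_override = {"blocking_mode": "opt_out"}

effective = resolve_config(system_defaults, org_defaults, site_config, regional_override)
# effective["banner"] keeps the system position but takes the organisation theme.
```

Merging copies instead of mutating the inputs keeps the stored defaults reusable across sites.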

The site_configs.regional_modes JSONB field allows per-region blocking mode:

{"EU": "opt_in", "GB": "opt_in", "US-CA": "opt_out", "BR": "opt_in", "DEFAULT": "opt_in"}
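
One plausible way to look up a visitor's blocking mode from regional_modes is to try the most specific key first and fall back to DEFAULT. The lookup order below (country-subdivision, then country, then EU, then DEFAULT) and the treatment of "EU" as the 27 member states are assumptions for illustration; the function name is hypothetical.

```python
from typing import Mapping, Optional

# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_COUNTRIES = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
})


def resolve_blocking_mode(
    regional_modes: Mapping[str, str],
    country: str,
    subdivision: Optional[str] = None,
) -> str:
    """Pick the most specific matching key, falling back to DEFAULT."""
    candidates = []
    if subdivision:
        candidates.append(f"{country}-{subdivision}")  # e.g. "US-CA"
    candidates.append(country)                          # e.g. "GB"
    if country in EU_COUNTRIES:
        candidates.append("EU")
    candidates.append("DEFAULT")
    for key in candidates:
        if key in regional_modes:
            return regional_modes[key]
    raise KeyError("regional_modes has no DEFAULT entry")


modes = {"EU": "opt_in", "GB": "opt_in", "US-CA": "opt_out", "BR": "opt_in", "DEFAULT": "opt_in"}
california = resolve_blocking_mode(modes, "US", "CA")  # matches "US-CA"
new_york = resolve_blocking_mode(modes, "US", "NY")    # falls through to "DEFAULT"
```
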

Consent Flow

  1. Site loads consent-loader.js (sync, before other scripts)
  2. Loader reads existing consent cookie — if valid, applies consent state and exits
  3. If no consent: installs script interceptor, blocks non-essential scripts/cookies
  4. Sets Google Consent Mode defaults (gtag('consent', 'default', {...}))
  5. Installs __tcfapi stub for TCF v2.2
  6. Async-loads full banner bundle + site config from CDN
  7. Banner displays; user interacts
  8. On consent action: generates TC string, sets first-party cookie, calls gtag('consent', 'update', {...}), releases blocked scripts by category
  9. POSTs consent record to Consent API for server-side audit storage
  10. Fires consent-change custom event + dataLayer push for GTM
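
Step 8 translates the visitor's category choices into Google Consent Mode signals. The sketch below shows one way that translation might look; the consent type names are the standard Consent Mode v2 keys, but the category-to-type mapping and the rule that necessary is always granted are assumptions, and in the browser the resulting object would be passed to gtag('consent', 'update', ...) instead of being built in Python.

```python
from typing import Iterable

# Assumed mapping from cookie_categories slugs to Consent Mode v2 consent types.
CATEGORY_TO_GCM: dict[str, tuple[str, ...]] = {
    "necessary": ("security_storage",),
    "functional": ("functionality_storage",),
    "analytics": ("analytics_storage",),
    "marketing": ("ad_storage", "ad_user_data", "ad_personalization"),
    "personalisation": ("personalization_storage",),
}


def gcm_update_payload(granted_categories: Iterable[str]) -> dict[str, str]:
    """Build the consent-update object for the categories the visitor accepted."""
    # Assumption: strictly necessary cookies never require consent.
    granted = set(granted_categories) | {"necessary"}
    payload: dict[str, str] = {}
    for category, consent_types in CATEGORY_TO_GCM.items():
        state = "granted" if category in granted else "denied"
        for consent_type in consent_types:
            payload[consent_type] = state
    return payload


payload = gcm_update_payload({"analytics"})
# analytics_storage and security_storage are granted; all ad signals stay denied.
```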

Banner Script Architecture

The banner is split into two files for performance:

  • consent-loader.js (~2KB gzipped) — synchronous critical path: consent cookie read, GCM defaults, TCF stub, script interceptor installation, async bundle load
  • consent-bundle-{version}.js (~25KB gzipped) — full UI, consent engine, TCF encoder, reporter

The banner UI renders inside Shadow DOM for complete style isolation from the host site.

Display modes: overlay (full-screen modal), bottom_banner, top_banner, corner_popup, inline (rendered into a specific DOM element)

Auto-blocking works by:

  • Overriding document.createElement to intercept <script> tag creation
  • MutationObserver on <head> and <body> for dynamically inserted scripts
  • Proxying document.cookie setter to block writes from non-essential categories
  • Wrapping localStorage.setItem and sessionStorage.setItem
  • Maintaining a queue of blocked scripts, released per-category when consent is granted
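
The per-category release queue in the last bullet can be modelled independently of the browser. The real blocker lives in TypeScript (blocker.ts); the Python class below is only a language-neutral model of the data structure, and its name and methods are hypothetical.

```python
from collections import defaultdict, deque


class BlockedScriptQueue:
    """Holds script URLs until their category is granted."""

    def __init__(self) -> None:
        self._queues: dict[str, deque[str]] = defaultdict(deque)
        # Assumption: necessary scripts are never blocked.
        self.granted: set[str] = {"necessary"}

    def submit(self, src: str, category: str) -> bool:
        """Return True if the script may run now; otherwise queue it."""
        if category in self.granted:
            return True
        self._queues[category].append(src)
        return False

    def grant(self, category: str) -> list[str]:
        """Mark a category as consented and release its queued scripts in order."""
        self.granted.add(category)
        return list(self._queues.pop(category, ()))


queue = BlockedScriptQueue()
queue.submit("https://example.com/analytics.js", "analytics")  # held
queue.submit("https://example.com/pixel.js", "marketing")      # held
released = queue.grant("analytics")  # only the analytics script is released
```

Keeping one FIFO queue per category preserves the original insertion order of scripts within a category when they are released.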

Compliance Frameworks

The compliance engine is rule-based. Each framework is a set of ComplianceRule objects:

  • GDPR — opt-in, reject = accept prominence, granular consent, proof of consent, no cookie walls, no pre-ticked boxes
  • CNIL — all GDPR rules plus: Tout refuser on first layer, max 13-month cookie lifetime, max 6-month consent retention, re-consent every 6 months
  • CCPA/CPRA — opt-out model, Do Not Sell link, honour GPC signal, under-16 opt-in
  • ePrivacy — consent for non-essential, strictly necessary exempt
  • LGPD — consent or legitimate interest basis, identify data controller

Rules output: severity (critical/warning/info), message, recommendation. Aggregated into per-framework scores.
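
A minimal sketch of what a rule-based engine with this output shape could look like. Only the ComplianceRule name and the severity, message, and recommendation fields come from the description above; the check callable, the config keys, and the penalty-based scoring formula are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Mapping, Sequence


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


@dataclass(frozen=True)
class ComplianceRule:
    rule_id: str
    severity: Severity
    message: str
    recommendation: str
    check: Callable[[Mapping[str, Any]], bool]  # True when the config passes


# Three illustrative GDPR rules over hypothetical site-config keys.
GDPR_RULES: Sequence[ComplianceRule] = (
    ComplianceRule("gdpr.opt_in", Severity.CRITICAL,
                   "Blocking mode is not opt-in.", "Set blocking_mode to opt_in.",
                   lambda c: c.get("blocking_mode") == "opt_in"),
    ComplianceRule("gdpr.no_pre_ticked", Severity.CRITICAL,
                   "Some categories are pre-ticked.", "Untick all optional categories.",
                   lambda c: not c.get("pre_ticked_categories")),
    ComplianceRule("gdpr.reject_prominence", Severity.WARNING,
                   "Reject is less prominent than accept.",
                   "Show a reject button on the first layer.",
                   lambda c: bool(c.get("reject_button_on_first_layer"))),
)

# Assumed scoring: start at 100 and subtract a penalty per failed rule.
PENALTY = {Severity.CRITICAL: 40, Severity.WARNING: 15, Severity.INFO: 5}


def evaluate(rules: Sequence[ComplianceRule],
             config: Mapping[str, Any]) -> tuple[int, list[ComplianceRule]]:
    """Return (score, failed_rules) for one framework."""
    failed = [rule for rule in rules if not rule.check(config)]
    score = max(0, 100 - sum(PENALTY[rule.severity] for rule in failed))
    return score, failed


score, failed = evaluate(GDPR_RULES, {
    "blocking_mode": "opt_out",
    "pre_ticked_categories": [],
    "reject_button_on_first_layer": True,
})
# One critical rule fails here, so the score is 100 - 40.
```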

Coding Conventions

  • Language: British English throughout (code comments, UI strings, documentation)
  • Python: Use pyproject.toml, type hints everywhere, async where possible
  • SQL: CTEs over subqueries, no SELECT *, explicit column lists
  • TypeScript: strict mode, explicit return types on exported functions
  • Git: conventional commits (feat:, fix:, chore:, docs:)
  • Testing: pytest for Python, Vitest for TypeScript, aim for >80% coverage on services
  • API design: RESTful, Pydantic schemas for all request/response bodies, consistent error format
  • Database: UUID primary keys, created_at/updated_at timestamps on all tables, soft deletes where appropriate

Development Environment

# Start everything
docker compose up -d

# Run migrations
make migrate

# Seed default data (cookie categories, known cookies)
make seed

# Run tests
make test

# Lint
make lint

Services in Docker Compose:

  • api — FastAPI on port 8000
  • scanner — Playwright scanner service
  • postgres — PostgreSQL 16 on port 5432
  • redis — Redis on port 6379
  • admin-ui — Vite dev server on port 5173 (also dog-foods the banner)

Implementation Phases

Phase            Scope
1 (Weeks 1–3)    DB schema, FastAPI scaffold, auth, site CRUD, basic banner, consent API, Docker Compose
2 (Weeks 4–6)    TCF v2.2, Google Consent Mode v2, script interceptor/auto-blocking, cookie categories, allow-list, config hierarchy, admin UI scaffold
3 (Weeks 7–8)    Playwright crawler, auto-categorisation, client-side reporter, scan scheduling, admin UI for scans
4 (Weeks 9–10)   Compliance rule engine (GDPR/CNIL/CCPA/ePrivacy/LGPD), consent analytics API, compliance + analytics admin UI
5 (Weeks 11–12)  Banner builder (visual editor), all display modes, full i18n, a11y audit, GeoIP, multi-domain, Helm chart, security hardening, load testing

Key External Standards