feat: initial public release
ConsentOS — a privacy-first cookie consent management platform. Self-hosted, source-available alternative to OneTrust, Cookiebot, and CookieYes. Full standards coverage (IAB TCF v2.2, GPP v1, Google Consent Mode v2, GPC, Shopify Customer Privacy API), multi-tenant architecture with role-based access, configuration cascade (system → org → group → site → region), dark-pattern detection in the scanner, and a tamper-evident consent record audit trail. This is the initial public release. Prior development history is retained internally. See README.md for the feature list, architecture overview, and quick-start instructions. Licensed under the Elastic Licence 2.0 — self-host freely; do not resell as a managed service.
This commit is contained in:
291
CLAUDE.md
Normal file
291
CLAUDE.md
Normal file
@@ -0,0 +1,291 @@
|
||||
# ConsentOS
|
||||
|
||||
## Project Overview
|
||||
|
||||
ConsentOS is a multi-tenant cookie consent management platform — a source-available alternative to OneTrust, Cookiebot, and CookieYes — that provides cookie scanning, consent collection, auto-blocking, and compliance checking across many sites with per-site configuration.
|
||||
|
||||
The platform delivers a single `<script>` tag that site owners embed. This script handles consent collection, cookie blocking, IAB TCF v2.2, and Google Consent Mode v2 signalling. A separate admin dashboard allows site owners to manage configurations, review scan results, and check compliance.
|
||||
|
||||
**Public repo:** [github.com/consentos/consentos](https://github.com/consentos/consentos)
|
||||
**Domain:** [consentos.dev](https://consentos.dev)
|
||||
|
||||
## Architecture Summary
|
||||
|
||||
```
|
||||
CDN (static assets)
|
||||
├── consent-loader.js (~2KB gzipped, sync bootstrap)
|
||||
├── consent-bundle-{v}.js (~25KB gzipped, full banner + blocker)
|
||||
├── site-config-{id}.json (cached site configuration)
|
||||
└── translations-{locale}.json
|
||||
|
||||
Client Browser
|
||||
├── Script Interceptor (MutationObserver + createElement override)
|
||||
├── Cookie Blocker (document.cookie proxy, Storage proxy)
|
||||
├── Banner UI (Shadow DOM, customisable, a11y-compliant)
|
||||
├── TCF v2.2 API (__tcfapi)
|
||||
├── Google Consent Mode v2 (gtag integration)
|
||||
├── Client-side Cookie Reporter
|
||||
└── Consent State Manager
|
||||
│
|
||||
▼
|
||||
API Layer (FastAPI)
|
||||
├── Config API — site/org CRUD, banner config, allow-lists, CDN publishing
|
||||
├── Consent API — consent recording, retrieval, TC string generation, analytics
|
||||
├── Scanner API — scan management, client-side cookie reports
|
||||
└── Admin BFF — aggregates the above for the admin UI
|
||||
|
||||
Scanner Service (Python + Playwright)
|
||||
├── Scheduled headless browser crawls
|
||||
├── Cookie discovery and script attribution
|
||||
└── Auto-categorisation via known cookies DB
|
||||
|
||||
PostgreSQL — all persistent state
|
||||
Redis — caching, rate limiting, Celery job queue
|
||||
|
||||
Admin UI (Vite + React + TypeScript)
|
||||
├── Site management, configuration editor
|
||||
├── Cookie manager, allow-list management
|
||||
├── Banner builder (visual editor with live preview)
|
||||
├── Compliance checker (GDPR, CNIL, CCPA, ePrivacy, LGPD)
|
||||
└── Analytics dashboard (consent rates, trends, regional)
|
||||
```
|
||||
|
||||
## Technology Stack
|
||||
|
||||
### Backend (`apps/api/`)
|
||||
- **Language:** Python 3.12+
|
||||
- **Framework:** FastAPI
|
||||
- **ORM:** SQLAlchemy 2.0 (async)
|
||||
- **Migrations:** Alembic
|
||||
- **Database:** PostgreSQL 16
|
||||
- **Cache/Queue:** Redis + Celery
|
||||
- **Auth:** JWT (org-scoped, role-based)
|
||||
- **Validation:** Pydantic v2
|
||||
|
||||
### Scanner (`apps/scanner/`)
|
||||
- **Language:** Python 3.12+
|
||||
- **Browser automation:** Playwright
|
||||
- **Job scheduling:** Celery + Redis
|
||||
|
||||
### Banner Script (`apps/banner/`)
|
||||
- **Language:** TypeScript
|
||||
- **Build:** Rollup (outputs IIFE bundles)
|
||||
- **UI isolation:** Shadow DOM
|
||||
- **Standards:** IAB TCF v2.2, Google Consent Mode v2
|
||||
|
||||
### Admin UI (`apps/admin-ui/`)
|
||||
- **Framework:** Vite + React + TypeScript
|
||||
- **Primary UI:** shadcn/ui + TailwindCSS
|
||||
- **Complex components:** MUI (DataGrid for tables, charts)
|
||||
- **Server state:** TanStack Query
|
||||
- **Client state:** Zustand
|
||||
- **Routing:** React Router v6
|
||||
- **Forms:** React Hook Form + Zod
|
||||
- **i18n:** react-i18next
|
||||
|
||||
### Infrastructure
|
||||
- **Containerisation:** Docker / Docker Compose
|
||||
- **Orchestration:** Kubernetes (Helm chart)
|
||||
- **CDN:** Cloud-agnostic (CloudFlare, Cloud CDN, or CloudFront)
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
consent-platform/
|
||||
├── apps/
|
||||
│ ├── api/ # FastAPI backend
|
||||
│ │ ├── src/
|
||||
│ │ │ ├── config/ # Pydantic settings, environment
|
||||
│ │ │ ├── models/ # SQLAlchemy models
|
||||
│ │ │ ├── schemas/ # Pydantic request/response schemas
|
||||
│ │ │ ├── routers/ # API route handlers
|
||||
│ │ │ │ ├── config.py # site/org config endpoints
|
||||
│ │ │ │ ├── consent.py # consent recording/retrieval
|
||||
│ │ │ │ ├── scanner.py # scan management
|
||||
│ │ │ │ ├── analytics.py # analytics endpoints
|
||||
│ │ │ │ ├── compliance.py # compliance checker
|
||||
│ │ │ │ └── auth.py # authentication
|
||||
│ │ │ ├── services/ # Business logic
|
||||
│ │ │ │ ├── consent.py
|
||||
│ │ │ │ ├── tcf.py # TC string encoding/decoding
|
||||
│ │ │ │ ├── gcm.py # Google Consent Mode logic
|
||||
│ │ │ │ ├── compliance.py # Compliance rule engine
|
||||
│ │ │ │ ├── publisher.py # CDN publishing
|
||||
│ │ │ │ └── classification.py # Cookie auto-categorisation
|
||||
│ │ │ ├── db/ # Database connection, session
|
||||
│ │ │ └── main.py
|
||||
│ │ ├── tests/
|
||||
│ │ ├── alembic/
|
||||
│ │ ├── pyproject.toml
|
||||
│ │ └── Dockerfile
|
||||
│ │
|
||||
│ ├── scanner/ # Cookie scanner service
|
||||
│ │ ├── src/
|
||||
│ │ │ ├── crawler.py # Playwright-based crawler
|
||||
│ │ │ ├── classifier.py # Cookie classification
|
||||
│ │ │ ├── scheduler.py # Scan job scheduling
|
||||
│ │ │ └── worker.py # Celery worker
|
||||
│ │ ├── Dockerfile
|
||||
│ │ └── pyproject.toml
|
||||
│ │
|
||||
│ ├── admin-ui/ # Vite + React + TS admin dashboard
|
||||
│ │ ├── src/
|
||||
│ │ │ ├── components/
|
||||
│ │ │ ├── pages/
|
||||
│ │ │ ├── hooks/
|
||||
│ │ │ ├── api/ # TanStack Query hooks
|
||||
│ │ │ ├── stores/ # Zustand stores
|
||||
│ │ │ └── i18n/
|
||||
│ │ ├── package.json
|
||||
│ │ ├── vite.config.ts
|
||||
│ │ ├── tsconfig.json
|
||||
│ │ └── tailwind.config.ts
|
||||
│ │
|
||||
│ └── banner/ # Client-side consent banner
|
||||
│ ├── src/
|
||||
│ │ ├── loader.ts # Lightweight bootstrap (~2KB)
|
||||
│ │ ├── banner.ts # Banner UI engine
|
||||
│ │ ├── blocker.ts # Script/cookie interceptor
|
||||
│ │ ├── tcf.ts # TCF v2.2 API implementation
|
||||
│ │ ├── gcm.ts # Google Consent Mode v2
|
||||
│ │ ├── reporter.ts # Client-side cookie reporter
|
||||
│ │ ├── consent.ts # Consent state management
|
||||
│ │ ├── i18n.ts # Translation loader
|
||||
│ │ ├── a11y.ts # Accessibility utilities
|
||||
│ │ └── types.ts
|
||||
│ ├── rollup.config.js
|
||||
│ ├── package.json
|
||||
│ └── tsconfig.json
|
||||
│
|
||||
├── packages/
|
||||
│ └── shared/ # Shared types, constants, utils
|
||||
│
|
||||
├── helm/consentos/ # Kubernetes deployment
|
||||
├── docker-compose.yml
|
||||
├── Makefile
|
||||
└── README.md
|
||||
```
|
||||
|
||||
## Key Data Entities
|
||||
|
||||
- **organisations** — multi-tenant root, each org has multiple sites
|
||||
- **users** — org-scoped with roles: owner, admin, editor, viewer
|
||||
- **sites** — a domain being managed (e.g. example.com), belongs to an org
|
||||
- **site_configs** — full configuration per site: blocking mode, TCF settings, GCM defaults, banner config JSON, scan schedule, consent expiry
|
||||
- **cookie_categories** — taxonomy (necessary, functional, analytics, marketing, personalisation) with TCF purpose and GCM consent type mappings
|
||||
- **cookies** — discovered cookies per site with metadata, vendor, category, review status
|
||||
- **cookie_allow_list** — approved cookies per site with category assignment
|
||||
- **known_cookies** — shared knowledge base of known cookie patterns for auto-categorisation
|
||||
- **consent_records** — audit trail of every consent event (partitioned by month)
|
||||
- **scan_jobs** / **scan_results** — scanning pipeline state and results
|
||||
- **translations** — i18n strings per site per locale
|
||||
|
||||
## Configuration Hierarchy
|
||||
|
||||
Configuration resolves in this order (each level overrides the previous):
|
||||
|
||||
```
|
||||
System Defaults (code) → Organisation Defaults → Site Config → Regional Overrides
|
||||
```
|
||||
|
||||
The `site_configs.regional_modes` JSONB field allows per-region blocking mode:
|
||||
```json
|
||||
{"EU": "opt_in", "GB": "opt_in", "US-CA": "opt_out", "BR": "opt_in", "DEFAULT": "opt_in"}
|
||||
```
|
||||
|
||||
## Consent Flow
|
||||
|
||||
1. Site loads `consent-loader.js` (sync, before other scripts)
|
||||
2. Loader reads existing consent cookie — if valid, applies consent state and exits
|
||||
3. If no consent: installs script interceptor, blocks non-essential scripts/cookies
|
||||
4. Sets Google Consent Mode defaults (`gtag('consent', 'default', {...})`)
|
||||
5. Installs `__tcfapi` stub for TCF v2.2
|
||||
6. Async-loads full banner bundle + site config from CDN
|
||||
7. Banner displays; user interacts
|
||||
8. On consent action: generates TC string, sets first-party cookie, calls `gtag('consent', 'update', {...})`, releases blocked scripts by category
|
||||
9. POSTs consent record to Consent API for server-side audit storage
|
||||
10. Fires `consent-change` custom event + dataLayer push for GTM
|
||||
|
||||
## Banner Script Architecture
|
||||
|
||||
The banner is split into two files for performance:
|
||||
|
||||
- **consent-loader.js** (~2KB gzipped) — synchronous critical path: consent cookie read, GCM defaults, TCF stub, script interceptor installation, async bundle load
|
||||
- **consent-bundle-{version}.js** (~25KB gzipped) — full UI, consent engine, TCF encoder, reporter
|
||||
|
||||
The banner UI renders inside **Shadow DOM** for complete style isolation from the host site.
|
||||
|
||||
**Display modes:** overlay (full-screen modal), bottom_banner, top_banner, corner_popup, inline (into specific DOM element)
|
||||
|
||||
**Auto-blocking works by:**
|
||||
- Overriding `document.createElement` to intercept `<script>` tag creation
|
||||
- `MutationObserver` on `<head>` and `<body>` for dynamically inserted scripts
|
||||
- Proxying `document.cookie` setter to block writes from non-essential categories
|
||||
- Wrapping `localStorage.setItem` and `sessionStorage.setItem`
|
||||
- Maintaining a queue of blocked scripts, released per-category when consent is granted
|
||||
|
||||
## Compliance Frameworks
|
||||
|
||||
The compliance engine is rule-based. Each framework is a set of `ComplianceRule` objects:
|
||||
|
||||
- **GDPR** — opt-in, reject = accept prominence, granular consent, proof of consent, no cookie walls, no pre-ticked boxes
|
||||
- **CNIL** — all GDPR rules plus: Tout refuser on first layer, max 13-month cookie lifetime, max 6-month consent retention, re-consent every 6 months
|
||||
- **CCPA/CPRA** — opt-out model, Do Not Sell link, honour GPC signal, under-16 opt-in
|
||||
- **ePrivacy** — consent for non-essential, strictly necessary exempt
|
||||
- **LGPD** — consent or legitimate interest basis, identify data controller
|
||||
|
||||
Rules output: severity (critical/warning/info), message, recommendation. Aggregated into per-framework scores.
|
||||
|
||||
## Coding Conventions
|
||||
|
||||
- **Language:** British English throughout (code comments, UI strings, documentation)
|
||||
- **Python:** Use `pyproject.toml`, type hints everywhere, async where possible
|
||||
- **SQL:** CTEs over subqueries, no `SELECT *`, explicit column lists
|
||||
- **TypeScript:** strict mode, explicit return types on exported functions
|
||||
- **Git:** conventional commits (`feat:`, `fix:`, `chore:`, `docs:`)
|
||||
- **Testing:** pytest for Python, Vitest for TypeScript, aim for >80% coverage on services
|
||||
- **API design:** RESTful, Pydantic schemas for all request/response bodies, consistent error format
|
||||
- **Database:** UUID primary keys, `created_at`/`updated_at` timestamps on all tables, soft deletes where appropriate
|
||||
|
||||
## Development Environment
|
||||
|
||||
```bash
|
||||
# Start everything
|
||||
docker compose up -d
|
||||
|
||||
# Run migrations
|
||||
make migrate
|
||||
|
||||
# Seed default data (cookie categories, known cookies)
|
||||
make seed
|
||||
|
||||
# Run tests
|
||||
make test
|
||||
|
||||
# Lint
|
||||
make lint
|
||||
```
|
||||
|
||||
Services in Docker Compose:
|
||||
- `api` — FastAPI on port 8000
|
||||
- `scanner` — Playwright scanner service
|
||||
- `postgres` — PostgreSQL 16 on port 5432
|
||||
- `redis` — Redis on port 6379
|
||||
- `admin-ui` — Vite dev server on port 5173 (also dog-foods the banner)
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
| Phase | Scope |
|
||||
|-------|-------|
|
||||
| 1 (Weeks 1–3) | DB schema, FastAPI scaffold, auth, site CRUD, basic banner, consent API, Docker Compose |
|
||||
| 2 (Weeks 4–6) | TCF v2.2, Google Consent Mode v2, script interceptor/auto-blocking, cookie categories, allow-list, config hierarchy, admin UI scaffold |
|
||||
| 3 (Weeks 7–8) | Playwright crawler, auto-categorisation, client-side reporter, scan scheduling, admin UI for scans |
|
||||
| 4 (Weeks 9–10) | Compliance rule engine (GDPR/CNIL/CCPA/ePrivacy/LGPD), consent analytics API, compliance + analytics admin UI |
|
||||
| 5 (Weeks 11–12) | Banner builder (visual editor), all display modes, full i18n, a11y audit, GeoIP, multi-domain, Helm chart, security hardening, load testing |
|
||||
|
||||
## Key External Standards
|
||||
|
||||
- **IAB TCF v2.2:** [IAB TCF Technical Specification](https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/TCFv2/IAB%20Tech%20Lab%20-%20Consent%20string%20and%20vendor%20list%20formats%20v2.md)
|
||||
- **Google Consent Mode v2:** [Google Developer Docs](https://developers.google.com/tag-platform/security/guides/consent)
|
||||
- **Global Vendor List (GVL):** Loaded from IAB, cached, updated regularly
|
||||
- **WCAG 2.1 AA:** Accessibility target for the banner UI
|
||||
Reference in New Issue
Block a user