Commit Graph

9 Commits

Author SHA1 Message Date
James Cottrill
d8e0a34e04 feat: account management — change email, password, and CLI reset (#10)
API:
- PATCH /auth/me — update email and display name
- PATCH /auth/me/password — change password (requires current)
- GET /auth/me now returns full profile (email, full_name, role)

CLI:
- python -m src.cli.reset_password --email <email> --password <pw>
  for recovery when locked out (run via docker exec)

Admin UI:
- User menu dropdown on the top nav (click username → Account /
  Sign out) replaces the inline sign-out link
- /account page with profile form (email + display name) and
  change password form (current + new + confirm)
2026-04-18 21:53:32 +01:00
James Cottrill
142e2373d3 feat: consent records page, tab persistence, and snippet copy fix (#9)
feat: consent records list endpoint and top-level admin page
2026-04-18 21:22:06 +01:00
James Cottrill
bebcf901f4 chore: remove compliance UI from admin dashboard (#8) 2026-04-18 20:33:20 +01:00
James Cottrill
e0f1dd43e8 fix(scanner): reliable cookie discovery, auto-categorisation, and scan scheduling UI (#7)
Scanner fixes:
- Remove conflicting ``path`` from consent pre-seed cookie (Playwright
  rejects cookies with both ``url`` and ``path``).
- Switch to ``networkidle`` + 5s + 2s delayed second-pass for reliable
  cookie capture.
- Check sitemap Content-Type to skip SPA HTML fallbacks.
- Propagate ``auto_category`` from scan results to the cookies table
  during sync (was silently dropped).
- Add ``_gcl_ls`` to the Open Cookie Database CSV.
- Classify ``_consentos_*`` cookies as necessary directly in the
  classification engine.
- Add ``seed_known_cookies`` to the bootstrap init container command.

Admin UI:
- Add scan schedule control to the Scans tab — preset options
  (disabled/daily/weekly/fortnightly/monthly) plus custom cron input.
  Saves ``scan_schedule_cron`` on the site config.
2026-04-18 20:14:32 +01:00
James Cottrill
bd465008e5 fix(banner): bridge blocker state loader↔bundle and sweep stale cookies on consent change (#4)
* fix(banner): bridge blocker state between loader and bundle

``consent-loader.js`` and ``consent-bundle.js`` are built as two
separate rollup IIFEs, so each inlines a private copy of
``blocker.ts``. The loader's copy is the one that actually matters
— it installs the ``document.cookie`` / ``Storage.prototype.setItem``
/ ``MutationObserver`` proxies, and those proxies close over the
loader's private ``acceptedCategories`` set and its
``blockedScripts`` queue.

The bundle used to ``import { updateAcceptedCategories } from './blocker'``
and call it from ``handleConsent``. That updated the **bundle's**
dead-end copy of the blocker module — a different ``Set`` object in
a different scope from the one the proxies read — so after the user
granted consent the loader's proxies stayed stuck on
``Set(['necessary'])``, any non-necessary cookie write kept being
silently dropped, and the loader's queue of blocked scripts was
never released.

Fix by exposing the loader's live ``updateAcceptedCategories`` on
``window.__consentos._updateBlocker`` right after
``installBlocker()``, and replacing the bundle's dead-end import
with a helper that calls through the bridge. The bundle no longer
imports from ``./blocker`` at all; rollup tree-shakes the bundle's
copy out, so ``consent-bundle.js`` gets slightly smaller.

Tests (``__tests__/blocker-bridge.test.ts``) cover:
* Bridge is called with the exact accepted list.
* Bridge is forwarded verbatim (no slug filtering).
* Missing bridge logs a warning and doesn't throw.
* Missing ``window.__consentos`` logs a warning and doesn't throw.

``vi.hoisted`` seeds ``window.__consentos`` before banner.ts's
module-level ``init()`` runs so the import-time IIFE doesn't throw
on the empty global.

* fix(banner): sweep disallowed cookies + storage on consent update

When the loader runs it now proactively deletes any pre-existing
cookies (and localStorage / sessionStorage keys) that classify into
a category the visitor hasn't consented to. Fixes the common case
of an ``_ga`` that slipped through before the loader was installed
— e.g. on a host page that loads the loader with ``async`` or
places another tracker above it in ``<head>`` — sitting there
forever because the blocker's only defence was a setter proxy on
future writes.

Sweep semantics:

- Runs on every ``updateAcceptedCategories`` call (consent narrowed
  → the just-revoked categories' cookies are wiped) plus an
  explicit call from the loader's "no consent yet" branch (initial
  visit with pre-existing trackers from elsewhere).
- Only deletes cookies / keys whose classifier hits a known
  pattern (``_ga``, ``_fbp``, ``intercom-*`` etc. — same lists as
  the proxied setters). Unknown / unclassified cookies are left
  alone: we can't attribute them and won't risk clobbering
  first-party session state.
- ``_consentos_*`` is always preserved.
- For cookies, we don't know the original ``domain`` / ``path``
  (``document.cookie`` doesn't expose them), so we fire deletes
  against every plausible domain variant — bare hostname, leading
  ``.``, and every parent domain walked up from the left. Covers
  the GA "set on ``.example.com`` from a subdomain page" case
  without over-deleting.
- Deletions bypass the proxied setter and go directly through the
  cached ``originalCookieDescriptor`` captured before the proxy was
  installed, so the blocker doesn't eat its own expiry writes.
- Storage access is wrapped in ``safeStorage`` — sandboxed /
  cross-origin iframes that throw on ``window.localStorage`` are
  skipped rather than crashing the loader.

Tests in ``__tests__/blocker.test.ts`` cover: non-consented analytics
cookies are deleted, consented ones are preserved, unknown cookies
survive, ``_consentos_*`` is untouched, revoking a category after
seeding new cookies triggers a follow-up sweep, and localStorage
cleanup follows the same rules. 6 new cases, 373 passing total in
the banner suite.
2026-04-14 17:30:02 +01:00
James Cottrill
0fbe2717f2 fix(scanner): pre-seed ConsentOS consent so crawls see post-consent state (#2)
* fix(scanner): pre-seed accepted ConsentOS consent before crawling

A site running ConsentOS exposes one set of cookies before consent
(strictly necessary only) and a much larger set after the visitor
accepts analytics/marketing/personalisation. The scanner is meant to
answer "what does this site actually load?" — but because the crawler
clears cookies and navigates without ever interacting with the
banner, every scan returned the pre-consent view. Useful for spotting
trackers that fire before consent (which is what
``consent_validator.py`` does), useless for the cookie inventory the
admin UI exists to display.

Plant ``_consentos_consent`` on the browser context with all
categories accepted before ``page.goto``. The cookie payload mirrors
``apps/banner/src/consent.ts:writeConsent`` exactly (URL-encoded
``ConsentState`` JSON, ``Lax`` SameSite, year-long expiry) so the
loader's ``readConsent`` short-circuits straight to
``updateAcceptedCategories(['necessary','functional','analytics',
'marketing','personalisation'])`` — the blocker is bypassed and the
crawl sees what the visitor would see.

Pre-consent compliance checks live in ``consent_validator.py`` and
use a separate code path; this change only touches the cookie
inventory crawl.

* style: ruff format crawler.py
2026-04-14 14:05:35 +01:00
James Cottrill
8d15ec4398 Per-site configurable cookie categories (#3)
* feat: per-site configurable cookie categories

Operators can now choose which cookie categories the banner displays
for a given site — useful for sites that genuinely don't use
e.g. marketing cookies and shouldn't be forced to show the toggle.

**Backend**

* New ``enabled_categories`` JSONB column on ``site_configs``,
  ``site_group_configs``, and ``org_configs`` (migration 0003).
  NULL at a level means "inherit"; an explicit list overrides.
* ``config_resolver`` merges ``enabled_categories`` through the
  existing cascade (system → org → group → site) and normalises
  the result via ``_normalise_enabled_categories``:
  - Unknown slugs stripped.
  - ``necessary`` is forced in regardless of the operator's input
    — it's never optional.
  - Empty / invalid values fall back to the full five-category
    default so a cleared field doesn't silently hide the banner.
  - Output is returned in canonical display order so insertion
    order from the cascade doesn't leak into the UI.
* ``build_public_config`` surfaces ``enabled_categories`` to the
  banner-facing public config endpoint.
* Schemas for site/group/org config create + update + response all
  include the new field.

**Banner**

* ``apps/banner/src/banner.ts`` replaces the hard-coded
  ``ALL_CATEGORIES`` / ``NON_ESSENTIAL`` constants with a runtime
  ``resolveEnabledCategories(config)`` helper. ``renderCategories``
  takes the enabled list and only renders toggles for those
  categories; ``nonEssentialFor(enabled)`` derives the user-toggleable
  subset. Falls back to all five when the field is missing in the
  config payload so older banner bundles against newer APIs (and
  vice versa) don't break.
* ``SiteConfig`` type in ``apps/banner/src/types.ts`` has
  ``enabled_categories?: CategorySlug[]`` to match.

**Admin UI**

* New ``SiteCategoriesTab`` component — five checkboxes, ``necessary``
  locked on, with "Reset to inherited" to clear the site override.
  Wired in as a new core tab on ``SiteDetailPage`` between
  Configuration and Cookies.
* ``SiteConfig`` type in ``types/api.ts`` declares ``enabled_categories``
  and a new ``ALL_COOKIE_CATEGORIES`` constant exposing label/description
  metadata shared between the tab component and any future display of
  the list.

**Semantics of a disabled category**

When the operator unticks e.g. ``marketing`` for a site:

* The toggle is not rendered in the banner.
* A visitor can never grant consent for ``marketing``.
* Any cookie or script that classifies into ``marketing`` stays
  blocked permanently by the auto-blocker.

That's the correct behaviour for sites that genuinely don't use a
category: declare it, hide it from the visitor, have the blocker
enforce it.

**Tests**

* ``test_config_resolver.py`` — 13 new cases covering the full
  cascade, ``necessary`` forcing, unknown-slug stripping, empty /
  non-list values, canonical display order, and the public-config
  surface. 37 passed total.
* ``test_SiteCategoriesTab.test.tsx`` — renders all five, locks
  ``necessary``, pre-fills from an override, saves the explicit
  list, and resets to inherited by sending NULL. 6 cases.
* Full API suite (610) and admin-ui suite (139) both green;
  banner bundle builds cleanly with 363 tests passing.

* style: ruff format config_resolver.py
2026-04-14 14:05:31 +01:00
James Cottrill
84e41857c3 Bundle banner into admin-ui image and add prod docker-compose (#1)
* fix: bundle banner into admin-ui image and serve at origin root

The loader at apps/banner/src/loader.ts derives the bundle URL from
its own origin, not its directory, so ``consent-loader.js`` and
``consent-bundle.js`` must live at the web root rather than under a
sub-path. The upstream admin-ui image never bundled the banner at
all, forcing deployment overlays to paper over the gap — and those
overlays misplaced the files under ``/banner/``.

Fold the banner build into ``apps/admin-ui/Dockerfile`` as an extra
stage, move its output to ``public/`` so Vite emits it at the image
root, and add CORS + caching rules for the two scripts in
``nginx.conf`` ahead of the SPA fallback. Switch the root
``docker-compose.yml`` build context to the repo root (with the
dockerignore trimmed accordingly) so one image now covers admin + CDN.

Also drop the published sourcemap for ``consent-bundle.js`` — the
bundle is minified and cross-origin, shipping a map to anyone
inspecting a customer page isn't something we want.

* feat: add docker-compose.prod.yml for single-host deployment

Add a production-targeted compose file alongside the existing dev one.
Operators running ConsentOS on a single host (the OSS quick-start
path) now have a canonical compose to point ``-f`` at, instead of
hand-rolling overlays in their deployment repo.

Differences from ``docker-compose.yml`` (dev) — see the file header
for the full list, but the load-bearing ones are:

* A one-shot ``consentos-bootstrap`` init container owns alembic
  migrations and the initial-admin provisioning. Every long-running
  service that touches the database waits for it via
  ``service_completed_successfully``.
* Postgres credentials and Redis password come from the ``.env``
  file rather than being hardcoded; the dev compose keeps the
  ``consentos:consentos`` defaults so ``make up`` still just works.
* All host-bound ports are scoped to ``127.0.0.1`` so a reverse
  proxy on the host (Caddy in the reference deployment) can
  terminate TLS in front of them.
* The scanner gets a scoped ``environment:`` block instead of
  ``env_file: .env``. Sharing the env file caused vars like
  ``PORT`` to leak into ``ScannerSettings`` and rebind the service
  off its default ``8001``, which silently broke
  ``SCANNER_SERVICE_URL`` for the worker.
* ``shm_size: 1gb`` on the scanner — Playwright/Chromium crashes
  under the default 64 MB ``/dev/shm`` on heavy pages.
* ``consentos-admin`` builds with the repo root as the context so
  the upstream ``apps/admin-ui/Dockerfile`` (added in the previous
  commit) can pull ``apps/banner/`` in alongside ``apps/admin-ui/``
  and bundle ``consent-loader.js`` / ``consent-bundle.js`` at the
  nginx root.
* Per-service ``mem_limit`` and dependency-aware healthchecks so
  ``docker compose up -d`` gives a consistent, observable start.
2026-04-14 13:03:36 +01:00
James Cottrill
fbf26453f2 feat: initial public release
ConsentOS — a privacy-first cookie consent management platform.

Self-hosted, source-available alternative to OneTrust, Cookiebot, and
CookieYes. Full standards coverage (IAB TCF v2.2, GPP v1, Google
Consent Mode v2, GPC, Shopify Customer Privacy API), multi-tenant
architecture with role-based access, configuration cascade
(system → org → group → site → region), dark-pattern detection in
the scanner, and a tamper-evident consent record audit trail.

This is the initial public release. Prior development history is
retained internally.

See README.md for the feature list, architecture overview, and
quick-start instructions. Licensed under the Elastic Licence 2.0 —
self-host freely; do not resell as a managed service.
2026-04-14 09:18:18 +00:00