Architecture

Technical architecture of the Digital Sovereignty Heatmap: monorepo layout, scoring engine, data model.

Content is in English; page navigation remains in Dutch.

Overview

Digital Sovereignty Heatmap is a web application that scores and visualises the digital-sovereignty posture of a selected set of vendors. It is structured as a pnpm monorepo.

Workspace Layout

apps/web        → Next.js 15 (App Router) — UI and API routes
packages/core   → Shared types, domain constants, utility functions

Tech Stack

Layer	Technology
Frontend	React 19, Next.js 15 (App Router), Tailwind CSS, shadcn/ui
Data	SQLite (better-sqlite3), Zod schemas
Language	TypeScript (strict)
Package manager	pnpm (workspace)
Containerisation	Docker (multi-stage, non-root)
Reverse proxy	Caddy (auto-HTTPS)
CI/CD	GitHub Actions

Data Model

ER Diagram

┌─────────────┐       ┌─────────────┐       ┌──────────────────┐       ┌─────────────┐
│  categories │       │   vendors   │       │ scenario_vendors │       │  scenarios  │
├─────────────┤       ├─────────────┤       ├──────────────────┤       ├─────────────┤
│ slug (PK)   │──1──*─│ slug (PK)   │──1──*─│ vendor_slug (FK) │──*──1─│ slug (PK)   │
│ name        │       │ name        │       │ scenario_slug(FK)│       │ name        │
│ description │       │ description │       └──────────────────┘       │ description │
│ block       │       │ website     │                                   └─────────────┘
└─────────────┘       │ hq_country  │
                      │ hq_region   │
                      │ category_   │
                      │   slug (FK) │
                      │ data_hosting│
                      │   _regions  │
                      │ gdpr_       │
                      │   compliant │
                      │ open_source │
                      └─────────────┘

Relationships

Category → Vendors: One-to-many (a vendor belongs to one category)
Scenario ↔ Vendors: Many-to-many (via scenario_vendors junction table)
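
The two relationships above can be sketched as TypeScript types. Field names follow the ER diagram, but the value types, the reduced `Vendor` column set, and the helper `vendorsForScenario` are assumptions made for illustration, not the project's actual definitions.

```typescript
// Sketch of the data model. Field names come from the ER diagram;
// the value types are assumptions.
type Block = "business" | "personal";

interface Category {
  slug: string; // primary key
  name: string;
  description: string;
  block: Block;
}

// Only a subset of the vendor columns is shown.
interface Vendor {
  slug: string; // primary key
  name: string;
  category_slug: string; // foreign key to Category.slug (one category per vendor)
}

// One row of the scenario_vendors junction table.
interface ScenarioVendor {
  vendor_slug: string; // foreign key to Vendor.slug
  scenario_slug: string; // foreign key to the scenario's slug
}

// Hypothetical helper: resolve the many-to-many link for one scenario.
function vendorsForScenario(
  scenarioSlug: string,
  links: ScenarioVendor[],
  vendors: Vendor[],
): Vendor[] {
  const wanted = new Set(
    links
      .filter((link) => link.scenario_slug === scenarioSlug)
      .map((link) => link.vendor_slug),
  );
  return vendors.filter((vendor) => wanted.has(vendor.slug));
}
```

Because every junction row references exactly one vendor and one scenario, a vendor can appear in any number of scenarios without duplicating vendor data.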

Key Design Choices

Decision	Choice	Rationale
SQLite library	`better-sqlite3`	Synchronous, fast, zero-config, ideal for read-only data
Primary keys	Slug-based (TEXT)	Human-readable, URL-friendly, stable
DB initialisation	Lazy singleton, auto-seeds on first access	No separate startup step needed
Schema validation	Zod (in `@ds-heatmap/core`)	Shared across monorepo, validates seed data at import time
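
The "lazy singleton, auto-seeds on first access" choice can be illustrated with a small sketch. It uses an in-memory stand-in instead of better-sqlite3, and `KnowledgeBase`, `seed`, `getDb`, and `seedRuns` are hypothetical names, so treat it as an outline of the pattern rather than the project's code.

```typescript
// Stand-in for a database handle; the real project uses better-sqlite3.
interface KnowledgeBase {
  categories: string[];
}

let instance: KnowledgeBase | undefined;
let seedRuns = 0; // counts how often seeding ran, for demonstration only

// Hypothetical seeding step; the real one would load the seed JSON files.
function seed(db: KnowledgeBase): void {
  db.categories.push("cloud-iaas", "identity");
  seedRuns += 1;
}

// Lazy singleton: open on first access, seed only when empty, reuse afterwards.
function getDb(): KnowledgeBase {
  if (instance === undefined) {
    instance = { categories: [] };
    // A file-backed database may already contain data, hence the emptiness check.
    if (instance.categories.length === 0) {
      seed(instance);
    }
  }
  return instance;
}
```

Every caller goes through `getDb()`, so no route handler needs to know whether the database has already been opened or seeded.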

Seeding the Database

Seed data lives in curated JSON files under apps/web/data/:

categories.seed.json — 20 categories (9 business + 11 personal)
vendors.seed.json — ~130 vendors
scenarios.seed.json — 5 scenarios (3 business + 2 personal)

Auto-seed

The database seeds itself on first access if it is empty, so starting the dev server or running a build is enough.

Manual seed

# Seed (creates DB if it doesn't exist)
pnpm seed

# Force re-seed (deletes and recreates DB)
pnpm seed --force

The DB_PATH environment variable controls where the database file is stored (default: ./data/knowledgebase.db).

API Routes

Method	Path	Description
`GET`	`/api/health`	Health check
`GET`	`/api/categories`	List all categories
`GET`	`/api/vendors`	List vendors (supports `?category=` and `?q=` filters)
`GET`	`/api/scenarios`	List scenarios with vendor_slugs
`POST`	`/api/memo`	Generate board memo (Dutch) from assessment
`GET`	`/api/telemetry`	Check if telemetry collection is enabled
`POST`	`/api/telemetry`	Submit a telemetry event (increments counters)
`GET`	`/api/trends`	Aggregated k-anonymous trend data

Data Flow

Seed JSON → Zod validation → SQLite DB → API route handlers → JSON responses
                                              ↓
                              computeAssessment() → scores, heatmap, drivers

Scoring Engine

The scoring engine consists of pure TypeScript functions in packages/core/src/scoring/: no randomness, no side effects, no external calls. It is deterministic, so the same input always produces the same output.

Dimensions (0 = sovereign, 100 = high risk)

ID	Name	Per-vendor logic
`jes`	Jurisdictional Exposure	EU→0, US→90, OTHER→50
`drs`	Data Residency	EU-only→0, mixed→50, no-EU→100
`cks`	Cryptographic Key Sovereignty	open-source→10; closed: EU→40, US→80, OTHER→60
`pls`	Platform Lock-in	open-source→20, closed→60
`srs`	Source & Runtime Sovereignty	Composite of open_source + HQ region + hosting (0–85)
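
As a rough illustration of the per-vendor rules in the table, the first four dimensions could look like the sketch below. The `VendorFacts` shape and the function names are assumptions. `srs` is left out because its composite formula is not spelled out here.

```typescript
type Region = "EU" | "US" | "OTHER";

// Assumed shape of the vendor attributes the scoring rules read.
interface VendorFacts {
  hqRegion: Region;
  hostingRegions: Region[];
  openSource: boolean;
}

// Jurisdictional Exposure: EU→0, US→90, OTHER→50.
function jes(v: VendorFacts): number {
  const byRegion: Record<Region, number> = { EU: 0, US: 90, OTHER: 50 };
  return byRegion[v.hqRegion];
}

// Data Residency: EU-only→0, mixed→50, no-EU→100.
// An empty region list is treated as no-EU here (an assumption).
function drs(v: VendorFacts): number {
  const hasEu = v.hostingRegions.some((r) => r === "EU");
  const hasNonEu = v.hostingRegions.some((r) => r !== "EU");
  if (!hasEu) return 100;
  return hasNonEu ? 50 : 0;
}

// Cryptographic Key Sovereignty: open-source→10; closed: EU→40, US→80, OTHER→60.
function cks(v: VendorFacts): number {
  if (v.openSource) return 10;
  const closedByRegion: Record<Region, number> = { EU: 40, US: 80, OTHER: 60 };
  return closedByRegion[v.hqRegion];
}

// Platform Lock-in: open-source→20, closed→60.
function pls(v: VendorFacts): number {
  return v.openSource ? 20 : 60;
}
```

For a closed-source, US-headquartered vendor hosting in both the US and the EU, these functions yield 90, 50, 80, and 60 respectively.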

Aggregation

1. Score each selected vendor per dimension.
2. Average vendor scores within each category.
3. Weight category scores using the normalised category weights.
4. Total = equal-weighted average of the 5 dimension scores.
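
The aggregation steps can be sketched as follows. This is a minimal outline under stated assumptions: per-vendor scores are already computed, the weights passed in already sum to 1 over the covered categories, and the names `Scores`, `aggregate`, and `mean` are illustrative only.

```typescript
type Dimension = "jes" | "drs" | "cks" | "pls" | "srs";
type Scores = Record<Dimension, number>;

const DIMENSIONS: Dimension[] = ["jes", "drs", "cks", "pls", "srs"];

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

// vendorScores: category slug → per-vendor scores (step 1 is already done).
// weights: category slug → weight, assumed to sum to 1 over these categories.
function aggregate(
  vendorScores: Record<string, Scores[]>,
  weights: Record<string, number>,
): { dimensions: Scores; total: number } {
  const dimensions = {} as Scores;
  for (const d of DIMENSIONS) {
    let weighted = 0;
    for (const category of Object.keys(vendorScores)) {
      // Step 2: average the vendor scores within the category.
      const categoryScore = mean((vendorScores[category] ?? []).map((s) => s[d]));
      // Step 3: weight the category score.
      weighted += categoryScore * (weights[category] ?? 0);
    }
    dimensions[d] = weighted;
  }
  // Step 4: equal-weighted average of the five dimension scores.
  const total = mean(DIMENSIONS.map((d) => dimensions[d]));
  return { dimensions, total };
}
```

With two equally weighted categories, one averaging 45 and the other 50 on every dimension, each dimension score and the total come out at 47.5.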

Category Weights (default business)

Category	Weight
`cloud-iaas`	0.187
`identity`	0.168
`collaboration`	0.140
`cloud-paas`	0.112
`cybersecurity`	0.112
`devops`	0.093
`crm`	0.075
`project-management`	0.066
`analytics`	0.047

Weights are renormalised when not all categories are covered by selected vendors.
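
One way to picture the renormalisation: keep only the weights of covered categories and divide each by their sum. The sketch below uses the default business weights from the table; the function name `renormalise` and its signature are assumptions.

```typescript
const DEFAULT_BUSINESS_WEIGHTS: Record<string, number> = {
  "cloud-iaas": 0.187,
  identity: 0.168,
  collaboration: 0.14,
  "cloud-paas": 0.112,
  cybersecurity: 0.112,
  devops: 0.093,
  crm: 0.075,
  "project-management": 0.066,
  analytics: 0.047,
};

// Keep only covered categories and rescale so the weights sum to 1 again.
function renormalise(
  weights: Record<string, number>,
  covered: string[],
): Record<string, number> {
  const kept = covered.filter((c) => weights[c] !== undefined);
  const sum = kept.reduce((acc, c) => acc + (weights[c] ?? 0), 0);
  const result: Record<string, number> = {};
  for (const c of kept) {
    result[c] = (weights[c] ?? 0) / sum;
  }
  return result;
}
```

If only `cloud-iaas` and `identity` are covered, their weights become 0.187 / 0.355 ≈ 0.527 and 0.168 / 0.355 ≈ 0.473.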

Risk Drivers (max 5, sorted by severity)

ID	Severity	Condition
`us-jurisdiction`	high	Any US-headquartered vendor selected
`core-control-outside-eu`	high	Non-EU vendor in IAM, Cloud, or Email categories
`vendor-managed-keys`	medium	Any non-open-source vendor (can't self-host keys)
`global-replication`	medium	Any vendor whose hosting regions are not EU-only
`high-switching-cost`	medium	Non-open-source vendor in critical category
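
A sketch of how such driver rules might be evaluated, sorted, and capped is shown below. The `SelectedVendor` shape, the `CORE_CATEGORIES` slugs standing in for "IAM, Cloud, or Email", and the function `topDrivers` are all assumptions. `high-switching-cost` is omitted because the set of critical categories is not defined in this document.

```typescript
type Severity = "high" | "medium";

// Assumed shape of the attributes the rules read.
interface SelectedVendor {
  hqRegion: "EU" | "US" | "OTHER";
  categorySlug: string;
  openSource: boolean;
  hostingRegions: string[];
}

interface Driver {
  id: string;
  severity: Severity;
}

// Assumption: the category slugs that count as "IAM, Cloud, or Email".
const CORE_CATEGORIES = new Set(["identity", "cloud-iaas", "cloud-paas", "email-personal"]);

const RULES: Array<Driver & { applies: (vs: SelectedVendor[]) => boolean }> = [
  { id: "us-jurisdiction", severity: "high",
    applies: (vs) => vs.some((v) => v.hqRegion === "US") },
  { id: "core-control-outside-eu", severity: "high",
    applies: (vs) => vs.some((v) => v.hqRegion !== "EU" && CORE_CATEGORIES.has(v.categorySlug)) },
  { id: "vendor-managed-keys", severity: "medium",
    applies: (vs) => vs.some((v) => !v.openSource) },
  { id: "global-replication", severity: "medium",
    applies: (vs) => vs.some((v) => v.hostingRegions.some((r) => r !== "EU")) },
];

// Collect the matching drivers, sort high before medium, and cap the list.
function topDrivers(vendors: SelectedVendor[], max: number = 5): Driver[] {
  const rank: Record<Severity, number> = { high: 0, medium: 1 };
  return RULES.filter((rule) => rule.applies(vendors))
    .map(({ id, severity }) => ({ id, severity }))
    .sort((a, b) => rank[a.severity] - rank[b.severity])
    .slice(0, max);
}
```

Keeping the rules in a data table makes the severity ordering and the cap independent of the individual conditions.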

Blocks (Business vs Personal)

Categories are grouped into two blocks via the block field:

Block	Categories	Weights
`business`	cloud-iaas, cloud-paas, collaboration, crm, identity, analytics, devops, cybersecurity, project-management	Weighted by criticality (see above)
`personal`	email-personal, messaging, cloud-storage-personal, vpn, password-manager, browser, social-media, os-desktop, music-streaming, video-streaming, notes-tasks	Equal (11 × ~0.0909)

The useAssessment hook calls computeAssessment twice (once per block) to produce independent scores, heatmaps, and drivers.

Vendor Recommendation Engine

After scoring, generateVendorRecommendations() produces direct per-vendor recommendations. For each selected vendor with meaningful risk (score > 30), it finds the best alternative in the same category and generates a concrete Dutch-language reason.

generateVendorRecommendations(
  selectedVendors: Vendor[],
  allVendors: Vendor[],
): VendorRecommendation[]

Reasons are built by comparing selected vs alternative vendor attributes:

Condition	Reason fragment
US → EU	"Europees bedrijf, buiten bereik CLOUD Act"
US → CH	"Zwitsers bedrijf, sterke privacywetgeving"
closed → open-source	"open-source en zelf te hosten"
non-EU hosting → EU-only	"data blijft in de EU"
non-GDPR → GDPR	"volledig GDPR-compliant"

Fragments are joined with a " · " separator.
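
The fragment table could be applied roughly as in this sketch. The `VendorAttrs` shape, the ISO country code `"CH"`, and the function `buildReason` are assumptions. Only the fragment texts and the separator come from the table above.

```typescript
// Assumed shape of the vendor attributes being compared.
interface VendorAttrs {
  hqRegion: "EU" | "US" | "OTHER";
  hqCountry: string; // assumed to be an ISO country code such as "CH"
  openSource: boolean;
  hostingRegions: string[];
  gdprCompliant: boolean;
}

const isEuOnly = (v: VendorAttrs): boolean =>
  v.hostingRegions.length > 0 && v.hostingRegions.every((r) => r === "EU");

// Compare the selected vendor with the alternative and collect matching fragments.
function buildReason(selected: VendorAttrs, alternative: VendorAttrs): string {
  const fragments: string[] = [];
  if (selected.hqRegion === "US" && alternative.hqRegion === "EU") {
    fragments.push("Europees bedrijf, buiten bereik CLOUD Act");
  }
  if (selected.hqRegion === "US" && alternative.hqCountry === "CH") {
    fragments.push("Zwitsers bedrijf, sterke privacywetgeving");
  }
  if (!selected.openSource && alternative.openSource) {
    fragments.push("open-source en zelf te hosten");
  }
  if (!isEuOnly(selected) && isEuOnly(alternative)) {
    fragments.push("data blijft in de EU");
  }
  if (!selected.gdprCompliant && alternative.gdprCompliant) {
    fragments.push("volledig GDPR-compliant");
  }
  return fragments.join(" · ");
}
```

Moving from a closed-source US vendor to an open-source EU vendor with EU-only hosting would, under these assumptions, combine up to four fragments into one reason string.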

Entry Point

computeAssessment(input: AssessmentInput): AssessmentResult

Returns: dimensions[], totalScore, heatmapCells[] (category × dimension), topDrivers[], metadata.

Key Decisions

Decision	Rationale
Next.js App Router	Server components by default, streaming, route handlers
pnpm workspace	Fast, disk-efficient, monorepo-native
Caddy	Zero-config HTTPS with automatic certificate management
Non-root container	Principle of least privilege
`output: "standalone"`	Minimal production image without dev dependencies
SQLite over Postgres	Read-only knowledgebase, zero-config, file-based — ideal for the MVP

Board Memo Generation

The /api/memo endpoint generates a structured Dutch-language board memo from an assessment.

Data Flow

POST /api/memo → Zod validation → resolve vendors from DB
  → computeAssessment() → generateMemo()
  → (Claude if enabled, else template) → JSON response
  → sessionStorage → /memo page (print-friendly)

Claude Integration (feature-flagged)

Setting	Value
Feature flag	`CLAUDE_ENABLED=true` env var
Model	`claude-haiku-4-5-20251001`
SDK	`@anthropic-ai/sdk` (dynamic import)
Fallback	Deterministic Dutch template (always available)

When CLAUDE_ENABLED=true and ANTHROPIC_API_KEY is set, the endpoint calls Claude to generate the memo. On any failure (network, malformed JSON, wrong structure), it silently falls back to the template. Prompt payloads are never logged.
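
The fallback behaviour can be sketched without the SDK by injecting the model call as a function. Everything named here (`Memo`, `isMemoShape`, `generateMemoWithFallback`, the `callModel` parameter) is hypothetical, and the real structural validation is presumably stricter than this check.

```typescript
interface Memo {
  summary: string[];
  source: "claude" | "template";
}

// Minimal structural check on the parsed model output.
function isMemoShape(value: unknown): value is { summary: string[] } {
  if (typeof value !== "object" || value === null) return false;
  const summary = (value as { summary?: unknown }).summary;
  return Array.isArray(summary) && summary.every((line) => typeof line === "string");
}

// callModel is undefined when the feature flag or API key is missing.
async function generateMemoWithFallback(
  callModel: (() => Promise<string>) | undefined,
  template: () => Memo,
): Promise<Memo> {
  if (callModel === undefined) return template();
  try {
    const parsed: unknown = JSON.parse(await callModel());
    if (!isMemoShape(parsed)) throw new Error("unexpected memo structure");
    return { summary: parsed.summary, source: "claude" };
  } catch {
    // Network errors, malformed JSON, and wrong structure all land here.
    // The fallback is silent and nothing about the prompt is logged.
    return template();
  }
}
```

Routing every failure mode through a single `catch` keeps the endpoint's response shape identical whether or not the model call succeeded.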

Rate Limiting

In-memory token-bucket rate limiter (no Redis needed):

Parameter	Value
Burst capacity	5
Refill rate	0.1 tokens/sec (1 per 10s)
Key	Client IP (`x-forwarded-for` or `x-real-ip`)
Stale bucket cleanup	Every 5 minutes
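
A token bucket with these parameters can be sketched in a few lines. The function name `allowRequest` and the injectable clock are assumptions, and the periodic stale-bucket cleanup is left out for brevity.

```typescript
interface Bucket {
  tokens: number;
  lastRefill: number; // timestamp in milliseconds
}

const CAPACITY = 5; // burst capacity
const REFILL_PER_SEC = 0.1; // one token every 10 seconds
const buckets = new Map<string, Bucket>();

// Returns true when the request is allowed. nowMs is injectable for testing.
function allowRequest(ip: string, nowMs: number = Date.now()): boolean {
  // A client seen for the first time starts with a full bucket.
  const bucket = buckets.get(ip) ?? { tokens: CAPACITY, lastRefill: nowMs };
  // Refill in proportion to the elapsed time, never above capacity.
  const elapsedSec = (nowMs - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefill = nowMs;
  buckets.set(ip, bucket);
  if (bucket.tokens < 1) return false;
  bucket.tokens -= 1; // spend one token for this request
  return true;
}
```

A client can therefore send five requests back to back, after which it is limited to roughly one request every ten seconds until the bucket refills.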

Memo Structure

The memo contains: executive summary (5 lines), top risk drivers (up to 5), recommendations (3 quick wins, 3 mid-term, 2 strategic), and a disclaimer (2 lines). All text is in Dutch.

Privacy-Preserving Telemetry

Optional aggregated counters for community trend data. Disabled by default.

Data Flow

User opts in (localStorage) → client fires event → POST /api/telemetry
  → Zod validation against strict allowlists → increment metrics_counters
  → GET /api/trends → k-anonymity filter → /trends page

Design Principles

Principle	Implementation
No raw events	Server increments counters only; no event log table
Strict allowlists	Every field validated against fixed enums (event name, vendor slug, category slug, context values)
No PII	No IPs stored, no user agents, no sessions, no device fingerprints
Double opt-in	`TELEMETRY_ENABLED=true` (server) + localStorage toggle (user)
k-anonymity	Buckets with count < `K_ANON_THRESHOLD` (default 25) grouped as "Other" on /trends
Score bands	Raw scores bucketed into 3 bands (0-33, 34-66, 67-100) before transmission
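
Two of these principles lend themselves to a short sketch: bucketing a raw score into a band before transmission, and folding small buckets into "Other" before display. The function names are assumptions, and scores are assumed to be integers between 0 and 100.

```typescript
type ScoreBand = "0-33" | "34-66" | "67-100";

// Client side: only the band leaves the browser, never the raw score.
// Assumes integer scores in the range 0 to 100.
function toScoreBand(score: number): ScoreBand {
  if (score <= 33) return "0-33";
  if (score <= 66) return "34-66";
  return "67-100";
}

const K_ANON_THRESHOLD = 25; // default from the table above

// Server side: fold every bucket below the threshold into "Other".
function applyKAnonymity(
  counts: Record<string, number>,
  k: number = K_ANON_THRESHOLD,
): Record<string, number> {
  const result: Record<string, number> = {};
  let other = 0;
  for (const key of Object.keys(counts)) {
    const count = counts[key] ?? 0;
    if (count < k) {
      other += count;
    } else {
      result[key] = count;
    }
  }
  if (other > 0) result["Other"] = other;
  return result;
}
```

With the default threshold, counts of 40, 10, and 5 would be reported as one bucket of 40 and an "Other" bucket of 15.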

Database Table

CREATE TABLE metrics_counters (
  metric_name TEXT NOT NULL,     -- e.g. "score_band", "vendor_selection", "sector"
  bucket_key  TEXT NOT NULL,     -- e.g. "0-33", "cloud-iaas:aws", "Finance"
  count       INTEGER NOT NULL,  -- aggregated counter
  updated_at  TEXT NOT NULL,     -- last increment timestamp
  PRIMARY KEY (metric_name, bucket_key)
);

Metric Types

metric_name	bucket_key format	Source event
`event`	event name	All events
`score_band`	`0-33` / `34-66` / `67-100`	`assessment_completed`
`sector`	sector name	`assessment_completed`
`data_classification`	classification name	`assessment_completed`
`system_criticality`	criticality level	`assessment_completed`
`eu_residency`	`yes` / `no`	`assessment_completed`
`vendor_selection`	`{category}:{vendor}`	`vendor_selected`
`score_band_memo`	`0-33` / `34-66` / `67-100`	`memo_generated`

Security Headers

Security headers are applied via Next.js middleware (src/middleware.ts):

CSP: restricts scripts, styles, fonts, images, and connections to 'self' and required font origins
X-Frame-Options: DENY (prevents clickjacking)
X-Content-Type-Options: nosniff
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: disables camera, microphone, geolocation

API routes additionally receive Cache-Control: no-store.
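
As a framework-independent sketch, the header set described above could be produced by a helper like the one below. `securityHeaders` is a hypothetical function, and the Content-Security-Policy value is a placeholder assumption, since the actual directives and font origins are not listed in this document.

```typescript
// Hypothetical helper returning the headers described above as plain pairs.
function securityHeaders(isApiRoute: boolean): Record<string, string> {
  const headers: Record<string, string> = {
    // Placeholder policy; the real directives and font origins are assumptions.
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
  };
  // API responses must not be cached.
  if (isApiRoute) headers["Cache-Control"] = "no-store";
  return headers;
}
```

In the middleware, each pair would then be copied onto the outgoing response, with the API flag derived from the request path.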

Request Limits

Endpoint	Body limit	Rate limit
`POST /api/memo`	16 KB	5 burst, 1 per 10s per IP
`POST /api/telemetry`	2 KB	20 burst, 1 per sec per IP

Release Workflow

Triggered by pushing a semver tag (v*):

git tag v0.1.0 → GitHub Actions:
  1. Build Docker image
  2. Trivy scan (fail on CRITICAL/HIGH)
  3. Push to ghcr.io
  4. Create GitHub Release with CHANGELOG excerpt

Future Considerations

Authentication
API versioning strategy
Caching strategy
Full-text search