docs: define Team Huddle contract and activity acceptance criteria

This commit is contained in:
ي
2026-03-02 22:04:45 +05:30
parent cb6a3a8042
commit 79c2327861
2 changed files with 127 additions and 0 deletions

View File

@@ -79,3 +79,60 @@ The `RealTimeSemanticMonitor` service runs periodically (default: daily or on-de
1. **Polls SIF**: Checks for new indexed documents.
2. **Runs Agents**: Executes agent logic against the fresh index.
3. **Generates Alerts**: If a critical threshold is breached (e.g., Health < 50%), it sends a system notification.
---
## 🤝 Team Huddle
The SEO Dashboard includes a dedicated **Team Huddle** stream that translates agent orchestration into a user-readable operational timeline.
### Data Contract
Each huddle item conforms to a normalized event envelope so the widget, activity page, and notification system render the same source of truth.
| Contract Block | Required Fields | Notes |
|---|---|---|
| `status` | `agent_id`, `state`, `started_at`, `last_heartbeat_at` | `state` enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`. |
| `run` | `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome` | `trigger` enum: `scheduled`, `manual`, `event_driven`. |
| `event` | `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at` | `event_type` enum: `insight`, `task`, `system`, `handoff`. |
| `alert` | `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged` | Used by in-product banners and digest notifications. |
| `approval` | `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state` | `approval_state` enum: `pending`, `approved`, `rejected`, `expired`. |
### Refresh + Stream Semantics
- **Initial load**: fetch the latest 50 Team Huddle rows for the active workspace.
- **Near real-time stream**: server-sent events (SSE) push deltas every 1-3 seconds when new events exist.
- **Polling fallback**: if SSE disconnects, poll every 15 seconds with `since=<last_event_timestamp>`.
- **Ordering rule**: sort by `created_at DESC`, break ties using monotonically increasing `event_id`.
- **Idempotency**: clients de-duplicate using `event_id` to prevent duplicate cards during reconnect.
### Latency Targets
- **P50 ingest-to-display**: <= 2 seconds for `status` and `event` updates.
- **P95 ingest-to-display**: <= 5 seconds under normal load.
- **Critical alerts**: banner render in <= 3 seconds P95 after alert creation.
- **Approval state changes**: reflected in UI in <= 2 seconds P95.
### Failure + Fallback Behavior
- If stream transport fails, show a non-blocking "Live updates paused" badge and automatically switch to polling.
- If both stream and polling fail, keep last known data, mark timestamp as stale, and expose a "Retry" action.
- If huddle payload validation fails, quarantine invalid records and render a generic "system event" row instead of crashing the feed.
- If agent status heartbeats are missing for >2 intervals, render agent as `degraded` with tooltip context.
### User-Visible Detail Tiers + Security Constraints
- **Tier 1 (Overview)**: summary text, agent name, timestamp, severity color.
- **Tier 2 (Operational)**: run metadata (`run_id`, trigger, duration, outcome), alert thresholds, approval state.
- **Tier 3 (Debug/Admin)**: correlation IDs, raw payload excerpt, retry metadata, trace IDs.
- Access controls:
- Tier 1 is available to all workspace members.
- Tier 2 requires analyst/editor role.
- Tier 3 requires admin role and is excluded from exported reports by default.
- Sensitive fields (tokens, secrets, external auth headers, personal identifiers) must be redacted prior to persistence and never emitted in SSE payloads.
### Acceptance Criteria: View Full Team Activity
- "View Full Team Activity" opens a full-page activity timeline filtered to the currently selected date range and workspace.
- Expected row fields: `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`, `run_id`, `workflow_type`, `outcome`, `approval_state` (if present), `alert_id` (if present).
- Interaction flow:
1. User clicks **View Full Team Activity** from Team Huddle widget.
2. System opens Activity page and preserves dashboard filters (date, agent, severity).
3. User expands a row to view Tier 2 details; admins can toggle Tier 3 diagnostics.
4. User can acknowledge alerts inline and approve/reject pending approvals where authorized.
5. Returning to Dashboard restores previous scroll position and active widget tab.
- Empty state behavior: show "No team activity in this range" plus quick actions to clear filters or jump to last 24 hours.

View File

@@ -114,6 +114,76 @@ The agents are visible to the user in three key areas:
---
## 🤝 Team Huddle (System Contract)
The Team Huddle is the canonical operational surface for multi-agent coordination. It must stay consistent across dashboard widget, notifications, and the full activity view.
### Event/Data Contract
All orchestration updates are emitted as typed records under a shared schema:
- **`status`**
- `agent_id`, `state`, `started_at`, `last_heartbeat_at`, `run_id?`
- State enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`.
- **`run`**
- `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome`
- Trigger enum: `scheduled`, `manual`, `event_driven`.
- **`event`**
- `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at`, `metadata`
- Event type enum: `insight`, `task`, `system`, `handoff`.
- **`alert`**
- `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged`
- **`approval`**
- `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state`
- Approval state enum: `pending`, `approved`, `rejected`, `expired`.
### Refresh + Stream Semantics
- Primary transport is SSE with incremental delivery for each record type.
- Clients bootstrap with latest N (default 50) records, then subscribe for deltas.
- On disconnect: exponential backoff reconnect; if retries exhausted, switch to 15-second polling.
- Feed ordering is deterministic by `created_at DESC`, tie-broken by `event_id`.
- Duplicate prevention uses idempotency key = `event_id` (`status` events key by `agent_id + last_heartbeat_at`).
### Latency SLOs
- P50 ingest-to-UI: <= 2s for status/event.
- P95 ingest-to-UI: <= 5s for all non-bulk events.
- Critical alert propagation: <= 3s P95.
- Approval decision reflection: <= 2s P95.
### Failure + Fallback Behavior
- If ingestion pipeline lags, emit synthetic `system` event with severity `warning` to inform users.
- If an agent misses two heartbeat windows, transition status to `degraded` and suspend dependent handoffs.
- If schema validation fails, route to dead-letter queue and emit sanitized `system` placeholder event.
- If transport unavailable, UI remains functional in read-only cached mode with manual refresh controls.
### User Detail Tiers + Security Constraints
- **Tier 1: Summary** — agent, summary, timestamp, severity.
- **Tier 2: Operational** — run context, thresholds, workflow outcome, approval state.
- **Tier 3: Diagnostic** — trace/correlation IDs, retry counters, raw sanitized metadata.
- Role mapping:
- Workspace Member -> Tier 1
- Analyst/Editor -> Tier 1-2
- Admin/Owner -> Tier 1-3
- Security rules:
- Secrets, credentials, API keys, and personal identifiers are redacted before persistence.
- Tier 3 data is never included in default exports or external webhook mirrors.
- Approval actions require explicit authorization and audit logging of actor + timestamp.
### Acceptance Criteria: View Full Team Activity
- The "View Full Team Activity" control navigates from widget to a dedicated timeline route and preserves filters.
- The timeline supports filtering by agent, event type, severity, status, and approval state.
- Minimum visible fields per row:
- `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`
- `run_id`, `workflow_type`, `outcome`
- `alert_id` (when present), `approval_id` + `approval_state` (when present)
- Row expansion reveals Tier 2 details; Tier 3 panel is visible only for admin/owner roles.
- Inline interactions:
1. Acknowledge/unacknowledge alerts.
2. Approve/reject pending approval requests.
3. Jump from event row to related task/insight detail.
- Navigation continuity: returning to dashboard restores previous Team Huddle scroll position and active filters.
---
## 🚀 Future Roadmap
* **Inter-Agent Chat**: Allow agents to debate strategy (e.g., SEO Agent vs. Creative Agent).