docs: define Team Huddle contract and activity acceptance criteria
@@ -79,3 +79,60 @@ The `RealTimeSemanticMonitor` service runs periodically (default: daily or on-de
1. **Polls SIF**: Checks for new indexed documents.
2. **Runs Agents**: Executes agent logic against the fresh index.
3. **Generates Alerts**: If a critical threshold is breached (e.g., Health < 50%), it sends a system notification.

---

## 🤝 Team Huddle

The SEO Dashboard includes a dedicated **Team Huddle** stream that translates agent orchestration into a user-readable operational timeline.

### Data Contract

Each huddle item conforms to a normalized event envelope so the widget, activity page, and notification system render the same source of truth.

| Contract Block | Required Fields | Notes |
|---|---|---|
| `status` | `agent_id`, `state`, `started_at`, `last_heartbeat_at` | `state` enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`. |
| `run` | `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome` | `trigger` enum: `scheduled`, `manual`, `event_driven`. |
| `event` | `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at` | `event_type` enum: `insight`, `task`, `system`, `handoff`. |
| `alert` | `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged` | Used by in-product banners and digest notifications. |
| `approval` | `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state` | `approval_state` enum: `pending`, `approved`, `rejected`, `expired`. |

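As an illustration only, the `event` block from the table above could be modeled as a typed record. The field names and the `event_type` enum come from the contract. The class and function names, the string type for IDs and `severity`, and the ISO 8601 encoding of `created_at` are assumptions made for this sketch, not part of the documented contract.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EventType(str, Enum):
    """Values of the `event_type` enum listed in the contract table."""
    INSIGHT = "insight"
    TASK = "task"
    SYSTEM = "system"
    HANDOFF = "handoff"


@dataclass(frozen=True)
class HuddleEvent:
    """Hypothetical in-memory form of the `event` contract block."""
    event_id: str          # assumption: IDs are strings
    run_id: str
    agent_id: str
    event_type: EventType
    severity: str          # the contract does not list allowed severity values
    summary: str
    created_at: datetime   # assumption: transmitted as an ISO 8601 string


def parse_event(raw: dict) -> HuddleEvent:
    """Build a HuddleEvent from a decoded JSON object.

    Raises KeyError for a missing required field and ValueError for an
    `event_type` outside the enum.
    """
    return HuddleEvent(
        event_id=str(raw["event_id"]),
        run_id=str(raw["run_id"]),
        agent_id=str(raw["agent_id"]),
        event_type=EventType(raw["event_type"]),
        severity=str(raw["severity"]),
        summary=str(raw["summary"]),
        created_at=datetime.fromisoformat(raw["created_at"]),
    )
```

Parsing at the boundary like this means the widget, the activity page, and notifications can all consume the same validated object rather than re-checking raw payloads.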
### Refresh + Stream Semantics

- **Initial load**: fetch the latest 50 Team Huddle rows for the active workspace.
- **Near real-time stream**: server-sent events (SSE) push deltas every 1-3 seconds when new events exist.
- **Polling fallback**: if SSE disconnects, poll every 15 seconds with `since=<last_event_timestamp>`.
- **Ordering rule**: sort by `created_at DESC`, break ties using monotonically increasing `event_id`.
- **Idempotency**: clients de-duplicate using `event_id` to prevent duplicate cards during reconnect.

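The ordering and idempotency rules above can be sketched as a single client-side merge step. This is a minimal sketch under stated assumptions: `merge_events` is a hypothetical helper, `created_at` is taken to be a comparable value such as epoch milliseconds, `event_id` is taken to be an integer, and ties are assumed to resolve with the higher `event_id` first (the rule above does not state the tie-break direction).

```python
def merge_events(current: list, incoming: list) -> list:
    """Merge a batch of deltas into the rows already shown in the feed.

    Rows are dicts carrying at least `event_id` and `created_at`.
    """
    by_id = {row["event_id"]: row for row in current}
    for row in incoming:
        # De-duplicate on event_id: a row replayed after a reconnect is ignored.
        by_id.setdefault(row["event_id"], row)
    # created_at DESC, then event_id DESC (assumed direction) for equal timestamps.
    return sorted(
        by_id.values(),
        key=lambda row: (row["created_at"], row["event_id"]),
        reverse=True,
    )
```

Because the merge is keyed on `event_id`, applying the same delta batch twice leaves the feed unchanged, which is the property the reconnect path relies on.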
### Latency Targets

- **P50 ingest-to-display**: <= 2 seconds for `status` and `event` updates.
- **P95 ingest-to-display**: <= 5 seconds under normal load.
- **Critical alerts**: banner render in <= 3 seconds P95 after alert creation.
- **Approval state changes**: reflected in UI in <= 2 seconds P95.

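One way to check measured latencies against the P50 and P95 targets above is a nearest-rank percentile over a window of samples. The sketch below is an assumption about how such a check might look: the function names, the choice of the nearest-rank method, and the sample window are all hypothetical, and only the 2-second and 5-second limits come from the targets.

```python
import math

# Ingest-to-display limits in seconds, taken from the targets above.
TARGETS_SECONDS = {50: 2.0, 95: 5.0}


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_latency(samples: list) -> dict:
    """Return, per percentile, whether the observed value meets its limit."""
    return {
        pct: percentile(samples, pct) <= limit
        for pct, limit in TARGETS_SECONDS.items()
    }
```

For the samples `[0.5, 1.0, 1.5, 2.5, 6.0]`, the P50 is 1.5 seconds (within target) and the P95 is 6.0 seconds (over target).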
### Failure + Fallback Behavior

- If stream transport fails, show a non-blocking "Live updates paused" badge and automatically switch to polling.
- If both stream and polling fail, keep last known data, mark timestamp as stale, and expose a "Retry" action.
- If huddle payload validation fails, quarantine invalid records and render a generic "system event" row instead of crashing the feed.
- If agent status heartbeats are missing for >2 intervals, render agent as `degraded` with tooltip context.

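The heartbeat rule in the last bullet can be expressed as a small pure function. This is a sketch under assumptions: `effective_state` is a hypothetical name, and the heartbeat interval is passed in as a parameter because its length is not specified here.

```python
from datetime import datetime, timedelta, timezone


def effective_state(
    reported_state: str,
    last_heartbeat_at: datetime,
    now: datetime,
    interval: timedelta,
) -> str:
    """Return the state to render for an agent.

    If more than two heartbeat intervals have elapsed since the last
    heartbeat, the agent is shown as `degraded` regardless of what it
    last reported.
    """
    missed_intervals = (now - last_heartbeat_at) / interval
    return "degraded" if missed_intervals > 2 else reported_state
```

With a hypothetical 30-second interval, an agent whose last heartbeat is 61 seconds old renders as `degraded`, while one at exactly 60 seconds keeps its reported state.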
### User-Visible Detail Tiers + Security Constraints

- **Tier 1 (Overview)**: summary text, agent name, timestamp, severity color.
- **Tier 2 (Operational)**: run metadata (`run_id`, trigger, duration, outcome), alert thresholds, approval state.
- **Tier 3 (Debug/Admin)**: correlation IDs, raw payload excerpt, retry metadata, trace IDs.
- Access controls:
  - Tier 1 is available to all workspace members.
  - Tier 2 requires analyst/editor role.
  - Tier 3 requires admin role and is excluded from exported reports by default.
- Sensitive fields (tokens, secrets, external auth headers, personal identifiers) must be redacted prior to persistence and never emitted in SSE payloads.

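The tier and access rules above amount to a field allow-list keyed by role. The following sketch shows one possible shape. The role names mirror the access controls, but the exact field keys in `TIER_FIELDS` (for example `agent_name`, `trace_id`, `retry_count`) and the function name are hypothetical stand-ins for whatever the real payload uses.

```python
# Hypothetical field keys grouped by detail tier.
TIER_FIELDS = {
    1: {"summary", "agent_name", "created_at", "severity"},
    2: {"run_id", "trigger", "duration_ms", "outcome",
        "threshold_key", "threshold_value", "approval_state"},
    3: {"correlation_id", "raw_payload_excerpt", "retry_count", "trace_id"},
}

# Highest tier each role may see, per the access controls above.
MAX_TIER_BY_ROLE = {"member": 1, "analyst": 2, "editor": 2, "admin": 3}


def visible_fields(record: dict, role: str) -> dict:
    """Return only the fields of `record` that `role` is allowed to see."""
    max_tier = MAX_TIER_BY_ROLE.get(role, 0)  # unknown roles see nothing
    allowed = set()
    for tier, fields in TIER_FIELDS.items():
        if tier <= max_tier:
            allowed |= fields
    return {key: value for key, value in record.items() if key in allowed}
```

Filtering on the server before a row is serialized keeps Tier 3 data out of responses for non-admin roles instead of relying on the UI to hide it.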
### Acceptance Criteria: View Full Team Activity

- "View Full Team Activity" opens a full-page activity timeline filtered to the currently selected date range and workspace.
- Expected row fields: `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`, `run_id`, `workflow_type`, `outcome`, `approval_state` (if present), `alert_id` (if present).
- Interaction flow:
  1. User clicks **View Full Team Activity** from Team Huddle widget.
  2. System opens Activity page and preserves dashboard filters (date, agent, severity).
  3. User expands a row to view Tier 2 details; admins can toggle Tier 3 diagnostics.
  4. User can acknowledge alerts inline and approve/reject pending approvals where authorized.
  5. Returning to Dashboard restores previous scroll position and active widget tab.
- Empty state behavior: show "No team activity in this range" plus quick actions to clear filters or jump to last 24 hours.

@@ -114,6 +114,76 @@ The agents are visible to the user in three key areas:
---

## 🤝 Team Huddle (System Contract)

The Team Huddle is the canonical operational surface for multi-agent coordination. It must stay consistent across dashboard widget, notifications, and the full activity view.

### Event/Data Contract

All orchestration updates are emitted as typed records under a shared schema:

- **`status`**
  - `agent_id`, `state`, `started_at`, `last_heartbeat_at`, `run_id?`
  - State enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`.
- **`run`**
  - `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome`
  - Trigger enum: `scheduled`, `manual`, `event_driven`.
- **`event`**
  - `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at`, `metadata`
  - Event type enum: `insight`, `task`, `system`, `handoff`.
- **`alert`**
  - `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged`
- **`approval`**
  - `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state`
  - Approval state enum: `pending`, `approved`, `rejected`, `expired`.

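The shared schema above lends itself to a table-driven validator. The sketch below copies the field lists and enums from the contract and treats `run_id?` on `status` as optional. The `validate` function, its error strings, and the decision to treat every other listed field as required are assumptions for illustration.

```python
# Required fields per record type, copied from the contract above.
REQUIRED_FIELDS = {
    "status": {"agent_id", "state", "started_at", "last_heartbeat_at"},
    "run": {"run_id", "workflow_type", "trigger", "started_at",
            "ended_at", "duration_ms", "outcome"},
    "event": {"event_id", "run_id", "agent_id", "event_type",
              "severity", "summary", "created_at", "metadata"},
    "alert": {"alert_id", "event_id", "threshold_key", "threshold_value",
              "observed_value", "created_at", "is_acknowledged"},
    "approval": {"approval_id", "run_id", "action_label", "requested_by",
                 "requested_at", "expires_at", "approval_state"},
}

# Enum-constrained fields, keyed by (record type, field name).
ENUMS = {
    ("status", "state"): {"idle", "running", "blocked", "waiting_approval", "degraded"},
    ("run", "trigger"): {"scheduled", "manual", "event_driven"},
    ("event", "event_type"): {"insight", "task", "system", "handoff"},
    ("approval", "approval_state"): {"pending", "approved", "rejected", "expired"},
}


def validate(record_type: str, payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    required = REQUIRED_FIELDS.get(record_type)
    if required is None:
        return [f"unknown record type: {record_type}"]
    errors = [f"missing field: {name}" for name in sorted(required - set(payload))]
    for (rtype, field), allowed in ENUMS.items():
        if rtype == record_type and field in payload and payload[field] not in allowed:
            errors.append(f"invalid {field}: {payload[field]!r}")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to log a complete diagnosis when a record is rejected.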
### Refresh + Stream Semantics

- Primary transport is SSE with incremental delivery for each record type.
- Clients bootstrap with latest N (default 50) records, then subscribe for deltas.
- On disconnect: exponential backoff reconnect; if retries exhausted, switch to 15-second polling.
- Feed ordering is deterministic by `created_at DESC`, tie-broken by `event_id`.
- Duplicate prevention uses idempotency key = `event_id` (`status` events key by `agent_id + last_heartbeat_at`).

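Two of the rules above, the reconnect schedule and the idempotency key, are easy to pin down in code. The sketch below is illustrative only: the base delay, growth factor, cap, and retry count are hypothetical defaults (none are specified here), jitter is omitted for brevity, and both function names are invented for the example.

```python
def backoff_delays(
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 30.0,
    max_retries: int = 5,
) -> list:
    """Delays in seconds before each reconnect attempt.

    Once this list is exhausted, the client would fall back to
    15-second polling as described above.
    """
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def idempotency_key(record_type: str, payload: dict) -> str:
    """Key used to drop duplicates on reconnect.

    `status` records key by agent_id + last_heartbeat_at; records that
    carry an event_id key by it. Records with neither raise KeyError,
    since no key is defined for them above.
    """
    if record_type == "status":
        return f"{payload['agent_id']}:{payload['last_heartbeat_at']}"
    return str(payload["event_id"])
```

With the defaults shown, `backoff_delays()` yields `[1.0, 2.0, 4.0, 8.0, 16.0]`. A production client would normally add random jitter so that many clients do not reconnect in lockstep.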
### Latency SLOs

- P50 ingest-to-UI: <= 2s for status/event.
- P95 ingest-to-UI: <= 5s for all non-bulk events.
- Critical alert propagation: <= 3s P95.
- Approval decision reflection: <= 2s P95.

### Failure + Fallback Behavior

- If the ingestion pipeline lags, emit a synthetic `system` event with severity `warning` to inform users.
- If an agent misses two heartbeat windows, transition its status to `degraded` and suspend dependent handoffs.
- If schema validation fails, route the record to a dead-letter queue and emit a sanitized `system` placeholder event.
- If transport is unavailable, the UI remains functional in read-only cached mode with manual refresh controls.

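The schema-failure rule above (dead-letter the record, show a sanitized placeholder) might look like the following. This is a sketch under assumptions: the `ingest` function, the in-memory list standing in for a dead-letter queue, the placeholder's summary text, and its `info` severity are all hypothetical, and the validity check is passed in as a callable.

```python
from typing import Callable


def ingest(
    records: list,
    is_valid: Callable[[dict], bool],
    dead_letters: list,
) -> list:
    """Split incoming records into feed rows and dead-lettered records.

    Invalid records are appended to `dead_letters` untouched. The feed
    receives a placeholder that copies nothing from the rejected payload,
    so no unvalidated content reaches the UI.
    """
    feed = []
    for record in records:
        if is_valid(record):
            feed.append(record)
        else:
            dead_letters.append(record)
            feed.append({
                "event_type": "system",
                "severity": "info",  # assumed; no severity is specified for placeholders
                "summary": "An event could not be displayed.",
            })
    return feed
```

Keeping the placeholder free of any data from the rejected record is what makes it "sanitized": whatever made the payload invalid cannot leak into the rendered feed.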
### User Detail Tiers + Security Constraints

- **Tier 1: Summary** - agent, summary, timestamp, severity.
- **Tier 2: Operational** - run context, thresholds, workflow outcome, approval state.
- **Tier 3: Diagnostic** - trace/correlation IDs, retry counters, raw sanitized metadata.
- Role mapping:
  - Workspace Member -> Tier 1
  - Analyst/Editor -> Tier 1-2
  - Admin/Owner -> Tier 1-3
- Security rules:
  - Secrets, credentials, API keys, and personal identifiers are redacted before persistence.
  - Tier 3 data is never included in default exports or external webhook mirrors.
  - Approval actions require explicit authorization and audit logging of actor + timestamp.

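The redact-before-persistence rule above can be approximated with a recursive pass over the payload. The sketch below matches on key names only, which is an assumption: the marker list, the `[REDACTED]` replacement string, and the function names are hypothetical, and key-name matching will not catch personal identifiers stored under neutral keys or embedded in free text.

```python
REDACTED = "[REDACTED]"

# Hypothetical substrings that mark a key as sensitive.
SENSITIVE_MARKERS = ("token", "secret", "password", "api_key", "authorization", "email")


def is_sensitive(key: str) -> bool:
    """Case-insensitive substring match against the marker list."""
    lowered = key.lower()
    return any(marker in lowered for marker in SENSITIVE_MARKERS)


def redact(value):
    """Return a copy of `value` with sensitive dict entries replaced.

    Walks nested dicts and lists; the input is not mutated.
    """
    if isinstance(value, dict):
        return {
            key: REDACTED if is_sensitive(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Running this at the single point where records are written means every downstream consumer, including SSE payloads and exports, only ever sees the redacted copy.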
### Acceptance Criteria: View Full Team Activity

- The "View Full Team Activity" control navigates from widget to a dedicated timeline route and preserves filters.
- The timeline supports filtering by agent, event type, severity, status, and approval state.
- Minimum visible fields per row:
  - `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`
  - `run_id`, `workflow_type`, `outcome`
  - `alert_id` (when present), `approval_id` + `approval_state` (when present)
- Row expansion reveals Tier 2 details; Tier 3 panel is visible only for admin/owner roles.
- Inline interactions:
  1. Acknowledge/unacknowledge alerts.
  2. Approve/reject pending approval requests.
  3. Jump from event row to related task/insight detail.
- Navigation continuity: returning to dashboard restores previous Team Huddle scroll position and active filters.

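The filtering criterion above can be exercised with a small predicate-based helper, which may be useful when writing acceptance tests. This is a sketch only: `filter_rows` is a hypothetical name, and the mapping of the five filters to the row keys `agent_id`, `event_type`, `severity`, `state`, and `approval_state` is an assumption.

```python
# Assumed row keys for the five supported filters.
FILTERABLE_FIELDS = {"agent_id", "event_type", "severity", "state", "approval_state"}


def filter_rows(rows: list, **criteria) -> list:
    """Return the rows matching every criterion that is not None.

    A criterion of None means "no filter on this field". Unsupported
    filter names raise ValueError rather than being silently ignored.
    """
    unknown = set(criteria) - FILTERABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    active = {key: value for key, value in criteria.items() if value is not None}
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in active.items())
    ]
```

Rows that lack a filtered field (for example, an event with no `approval_state`) simply do not match a filter on that field, which fits the "when present" wording for optional fields.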
---

## 🚀 Future Roadmap

* **Inter-Agent Chat**: Allow agents to debate strategy (e.g., SEO Agent vs. Creative Agent).