May 22, 2026

Fieldforce Phase 4 — Daily AI Briefings + At-Risk Delays Design

Predecessor: Phase 3 (AI Task Creation + BYOK Foundation)
ADRs: 0016 · 0017 · 0019
New ADRs: 0022 · 0023 (proposed)
Status: Approved

Phase 4 ships a daily AI-generated briefing per org — a 2–4 sentence narrative plus a ranked at-risk task list — persisted as ff_briefings records and surfaced as a panel dashboard widget for managers and supervisors. A hybrid pipeline (deterministic heuristics shortlist → single LLM call ranks and writes reasoning) keeps costs at one call per org per day, with full heuristic-only fallback when the AI gate denies or the LLM fails. A new ai_prompt_templates table lets platform admins tune prompts in the DB without code PRs.

Phase 4: builds on Phase 3 BYOK + AccessGate
Audience: Managers (org-wide) · supervisors (team-scoped)
Surface: Panel dashboard widget + in-app notification bell
LLM cost: 1 call / org / day via fieldforce:briefing
New tables: ff_briefings, ff_briefing_at_risk, ai_prompt_templates, ff_briefing_retention_runs
Kill-switch: Single DB flag flip; cron, routes, and widget go dark instantly

Scope & Goals

What Phase 4 ships

Daily AI-generated briefing per org — ff_briefings table, panel widget, in-app notification.
At-risk task list inside each briefing — heuristic shortlist (≤ 15 items) ranked and reasoned by a single LLM call.
Two-tier feature flag fieldforce.briefing (ADR-0016): global kill-switch for platform admins, per-org toggle for org admins. Cron, route handlers, and panel widget all check the flag — if either tier is off, the feature is invisible end-to-end.
Config knobs via org_module_configs (ADR-0019): schedule time, timezone, retention days, at-risk thresholds, notification toggles.
AccessGate integration under key fieldforce:briefing (Phase 3 pattern): LLM call counted against org AI usage; gate denial → heuristic-only briefing (feature stays usable).
DB-stored prompt templates (ai_prompt_templates table in the AI module): versioned, activate-on-demand, platform-admin-only edit access. Briefing is the first consumer; future AI features inherit this capability free.

Success criteria

Manager opens the panel in the morning — sees a 2–4 sentence summary of overnight activity plus 3–10 ranked at-risk tasks with one-line LLM reasoning each.
Supervisor sees the same widget; at-risk list filtered to their team via the existing member-scan (ADR-0017).
Both feature-flag tiers respected: global OFF stops generation for all orgs; per-org OFF stops it for that org only.
AccessGate records fieldforce:briefing usage per org per day; LLM failures fall back to heuristic-only briefing without crashing the cron.

Approach Selected — A (Hybrid Generation)

Selected

A · Hybrid: one org-scope LLM briefing

LLM calls / org / day: 1
Supervisor fidelity: High on data (filtered at-risk); shared summary prose
Failure path: Natural — gate denial or LLM error → heuristic-only

One org-scope briefing. Supervisors get team-filtered at-risk items with the same summary_md. Predictable cost. Clean AccessGate integration (one feature row per org per day). Reuses the Phase 3 BYOK gate without modification.

Rejected

B · Per-scope LLM briefings

LLM calls / org / day: 1 + N teams
Supervisor fidelity: Highest — bespoke summary per team
Failure path: Per-scope

Cost scales with team count; redundant for small teams. Bespoke prose per team is not worth the expense at this stage.

Rejected

C · Heuristic-only with optional AI prose

LLM calls / org / day: 0–1
Supervisor fidelity: Same at-risk fidelity as A
Failure path: N/A

Loses contextual per-task LLM reasoning — a core value driver. Heuristic-only is the fallback path, not the design target.

Architecture & Module Layout

Decision: Briefing is a sub-feature of the existing fieldforce module — no new top-level module. Hexagonal layout matches the rest of fieldforce.

Go module file tree

backend/go/internal/modules/fieldforce/
  domain/entity/
    briefing.go              # Briefing aggregate (sections, generation status)
    at_risk.go               # AtRiskCandidate, AtRiskReason value types
  application/port/
    briefing_ports.go        # BriefingRepository, ClockPort
                             # (reuses existing AccessGate, Notifier, PromptRepository ports)
  application/usecase/
    generate_briefing.go     # cron entry: gather → heuristic → gate → fetch prompt → LLM → persist → notify
    at_risk_heuristics.go    # pure functions, no IO
    briefing_query.go        # latest / list / by-id with supervisor team filter
    briefing_prompts.go      # heuristic-only locale templates (NOT LLM prompts)
  adapter/inbound/http/
    briefing_handler.go      # GET /fieldforce/briefings/latest, /:id, /
  adapter/inbound/cron/
    briefing_cron.go         # BriefingJob: own 1-minute ticker + goroutine, sibling to OverdueJob
  adapter/outbound/persistence/postgresql/
    briefings_repo.go

backend/go/internal/modules/ai/              # NEW — generic prompt-template capability
  domain/entity/
    prompt_template.go        # PromptTemplate aggregate (id, feature_key, locale, version, template, is_active)
  application/port/
    prompt_ports.go           # PromptRepository, PromptAdminService
  application/usecase/
    prompt_service.go         # GetActive, CreateDraft, Activate, ListVersions, Diff
    prompt_service_test.go
  adapter/inbound/http/
    prompt_admin_handler.go   # platform-admin routes: list/diff/edit/activate
    prompt_admin_handler_test.go
  adapter/outbound/persistence/postgresql/
    prompt_templates_repo.go
    prompt_templates_repo_test.go

Why a separate BriefingJob goroutine? LLM calls are slow and unpredictable, so they must not block the time-sensitive overdue/escalation passes. BriefingJob runs in its own goroutine as a sibling to OverdueJob, started from app.go and exit-aware via context.
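A minimal sketch of such a sibling job loop, assuming only the standard library. The names runTicker and countTicksMillis are hypothetical and not taken from the fieldforce codebase; the real BriefingJob would use a 60-second interval and be started from app.go.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// runTicker calls tick once per interval until ctx is cancelled.
// It blocks, so the caller starts it in its own goroutine; a slow
// tick delays only this loop, never a sibling job's loop.
func runTicker(ctx context.Context, interval time.Duration, tick func(context.Context)) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return // exit-aware: shutdown cancels ctx
		case <-t.C:
			tick(ctx)
		}
	}
}

// countTicksMillis is a demo helper (hypothetical): it runs the loop in a
// goroutine for runMs milliseconds, cancels it, waits for the loop to
// exit, and returns how many ticks fired.
func countTicksMillis(intervalMs, runMs int) int64 {
	ctx, cancel := context.WithCancel(context.Background())
	var n atomic.Int64
	done := make(chan struct{})
	go func() {
		defer close(done)
		runTicker(ctx, time.Duration(intervalMs)*time.Millisecond, func(context.Context) {
			n.Add(1)
		})
	}()
	time.Sleep(time.Duration(runMs) * time.Millisecond)
	cancel()
	<-done // returns only once the loop has observed cancellation
	return n.Load()
}

func main() {
	fmt.Println("ticks fired:", countTicksMillis(5, 60))
}
```

Because each job owns its ticker and goroutine, a briefing tick that waits on an LLM response simply delays the next briefing tick; the overdue loop keeps its own cadence.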

Reused infrastructure — no new dependencies

AccessGate (Phase 3): Every LLM call passes through it under key fieldforce:briefing. Resolves mode, enforces global AI kill-switch, validates plan caps, decrypts BYOK credentials, records per-decision and per-success rows.
Notifier port (Phase 1): New event constant NotificationBriefingGenerated. NotifyParams gains a generic Payload map[string]string field — existing call sites unaffected (new field is zero-valued). Task-shaped events continue using TaskID/TaskTitle; briefing events write Payload["briefing_id"], Payload["briefing_date"], Payload["at_risk_count_for_recipient"].
Feature flag middleware (ADR-0016): New FieldforceBriefingFeatureFlagMiddleware checks key='fieldforce.briefing' in both tiers. Cron also consults the same two flag tables before generating.
Supervisor scope (ADR-0017): Read path uses the existing member-scan helper for team filtering — no new mechanism.
org_module_configs (ADR-0019): New namespaced keys briefing.*. Go-hardcoded defaults merged on read; a partial row returns full defaults for unset fields.
Migrator pattern: New CreateBriefingTables() raw-SQL function in migrator.go — per Phase 3 / CONTEXT.md convention; no SQL files.

Generation data flow

Briefing generation pipeline — from cron tick to notification.

ADRs that emerge from this spec

ADR-0022 — "Daily Fieldforce Briefings as Stored Snapshots": captures why stored snapshots beat live computation — predictable cost, history browsability, idempotent cron, snapshot semantics align with team_ids denormalization. (Next available ADR — 0021 is table-naming-convention.)
ADR-0023 (optional) — "Hybrid Heuristic+LLM At-Risk Detection": only if grill/review surfaces the rule-vs-LLM boundary as non-obvious.

Data Model

Four new tables: ff_briefings (parent), ff_briefing_at_risk (child), ff_briefing_retention_runs (retention-sweep stamp), and ai_prompt_templates (AI module). Also one new feature-flag key and new org_module_configs keys. No changes to ff_tasks, ff_activities, or any Phase 1–3 table.

ff_briefings — one row per (org, day)

Column	Type	Notes
`id`	uuid PK
`org_id`	text NOT NULL	multi-tenant scope; matches existing fieldforce convention
`briefing_date`	date NOT NULL	logical day in the org's timezone; UNIQUE with org_id
`generated_at`	timestamptz NOT NULL	when cron finished
`generation_mode`	text NOT NULL	`pending` (lease claimed) · `llm` · `heuristic_only`
`summary_md`	text NOT NULL	2–4 sentence narrative; markdown
`at_risk_count`	int NOT NULL	total candidates in the child table
`tasks_created`	int NOT NULL DEFAULT 0	created in last 24h (org local)
`tasks_completed`	int NOT NULL DEFAULT 0	transitioned to completed in last 24h
`tasks_approved`	int NOT NULL DEFAULT 0	transitioned to approved in last 24h
`tasks_overdue`	int NOT NULL DEFAULT 0	currently overdue at generation time
`notable_events`	jsonb NOT NULL DEFAULT '{}'	per-type breakdown: `{escalation_fired, task_rejected, cancelled_after_activity, approval_stalled, repeated_resubmissions}`. Frontend renders non-zero types.
`locale`	text NOT NULL	`en` · `zh` · `ms`
`created_at`	timestamptz NOT NULL DEFAULT now()
`updated_at`	timestamptz NOT NULL DEFAULT now()

Constraints: UNIQUE (org_id, briefing_date) — makes the cron idempotent; concurrent ticks resolve via uniqueness violation. Index on (org_id, briefing_date DESC) for latest lookups and history pagination.

briefing_date semantics: briefing_date is whatever the org's timezone says at the moment the briefing generates — full stop. No retroactive remapping if the org changes its timezone later; the existing row keeps its original date.
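As an illustration of these semantics, the sketch below derives the org-local date and a "has the schedule time passed" check from a single instant. The function names (localBriefingDate, isDue, mustParse) are hypothetical, and embedding time/tzdata is an assumption made so the example runs without system zoneinfo.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works without system zoneinfo
)

// localBriefingDate returns the org-local calendar date (YYYY-MM-DD) at instant now.
func localBriefingDate(now time.Time, tz string) (string, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return "", fmt.Errorf("load timezone %q: %w", tz, err)
	}
	return now.In(loc).Format("2006-01-02"), nil
}

// isDue reports whether the org-local wall clock has reached schedule ("HH:MM", 24h).
func isDue(now time.Time, tz, schedule string) (bool, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return false, fmt.Errorf("load timezone %q: %w", tz, err)
	}
	sched, err := time.Parse("15:04", schedule)
	if err != nil {
		return false, fmt.Errorf("parse schedule %q: %w", schedule, err)
	}
	local := now.In(loc)
	return local.Hour()*60+local.Minute() >= sched.Hour()*60+sched.Minute(), nil
}

// mustParse is a small example helper that parses an RFC 3339 instant.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-21T23:30:00Z")
	for _, tz := range []string{"UTC", "Asia/Kuala_Lumpur"} {
		date, _ := localBriefingDate(now, tz)
		due, _ := isDue(now, tz, "08:00")
		fmt.Println(tz, date, due)
	}
}
```

The same instant maps to different briefing dates in different timezones, which is why the stored date is never remapped after the fact.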

ff_briefing_at_risk — at-risk items per briefing

Column	Type	Notes
`id`	uuid PK
`briefing_id`	uuid NOT NULL FK	→ ff_briefings ON DELETE CASCADE
`task_id`	uuid NOT NULL	not a hard FK — snapshot may outlive task
`rank`	int NOT NULL	1 = highest risk
`heuristic_score`	int NOT NULL	0–100; deterministic
`heuristic_reasons`	jsonb NOT NULL	array of rule and flag keys that fired, e.g. `["overdue","stale_activity"]`
`llm_reason_text`	text NULL	one-sentence LLM reasoning; NULL when `generation_mode='heuristic_only'`
`assignee_user_ids`	text[] NOT NULL	denormalized for fast read rendering
`team_ids`	text[] NOT NULL	denormalized at generation time; supports supervisor team filter on read

Constraints: UNIQUE (briefing_id, task_id). Index on (briefing_id, rank) — read path orders BY rank ASC.

Why denormalize team_ids onto the at-risk row? Supervisor reads need a team filter. Denormalizing at generation time gives cheap reads and snapshot semantics — team membership at generation time is the correct semantic for a daily snapshot. Live-joining the member table on every supervisor read would always be fresh but breaks the daily-snapshot guarantee.

org_module_configs additions (module: fieldforce)

briefing.schedule_local_time: Default "08:00" — when daily briefing fires (24h, in org's timezone)
briefing.timezone: Default "UTC" — IANA timezone string
briefing.retention_days: Default 90 — retention sweep deletes older briefings
briefing.at_risk_thresholds.no_activity_days: Default 3
briefing.at_risk_thresholds.due_soon_hours: Default 24
briefing.at_risk_thresholds.shortlist_max: Default 15 — max candidates fed to LLM ranker
briefing.notify_managers: Default true
briefing.notify_supervisors: Default true — still suppressed for supervisors whose filtered at-risk count = 0
briefing.locale: Default "en" — language for LLM-generated prose; one per org (per-recipient locale is out of scope for v1)
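One way the "defaults merged on read" behaviour could be sketched is to unmarshal the stored row on top of a pre-populated defaults struct. This assumes the briefing config is stored as a JSON object, which the spec does not state; the type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Thresholds mirrors the briefing.at_risk_thresholds.* keys.
type Thresholds struct {
	NoActivityDays int `json:"no_activity_days"`
	DueSoonHours   int `json:"due_soon_hours"`
	ShortlistMax   int `json:"shortlist_max"`
}

// BriefingConfig mirrors the briefing.* keys listed above.
type BriefingConfig struct {
	ScheduleLocalTime string     `json:"schedule_local_time"`
	Timezone          string     `json:"timezone"`
	RetentionDays     int        `json:"retention_days"`
	AtRiskThresholds  Thresholds `json:"at_risk_thresholds"`
	NotifyManagers    bool       `json:"notify_managers"`
	NotifySupervisors bool       `json:"notify_supervisors"`
	Locale            string     `json:"locale"`
}

// defaultBriefingConfig returns the Go-hardcoded defaults from the spec.
func defaultBriefingConfig() BriefingConfig {
	return BriefingConfig{
		ScheduleLocalTime: "08:00",
		Timezone:          "UTC",
		RetentionDays:     90,
		AtRiskThresholds:  Thresholds{NoActivityDays: 3, DueSoonHours: 24, ShortlistMax: 15},
		NotifyManagers:    true,
		NotifySupervisors: true,
		Locale:            "en",
	}
}

// loadBriefingConfig overlays a stored (possibly partial) JSON row on the
// defaults. json.Unmarshal leaves fields that are absent from the input
// untouched, so unset keys keep their default values.
func loadBriefingConfig(stored []byte) (BriefingConfig, error) {
	cfg := defaultBriefingConfig()
	if len(stored) == 0 {
		return cfg, nil
	}
	if err := json.Unmarshal(stored, &cfg); err != nil {
		return defaultBriefingConfig(), fmt.Errorf("decode briefing config: %w", err)
	}
	return cfg, nil
}

func main() {
	partial := []byte(`{"timezone":"Asia/Singapore","at_risk_thresholds":{"due_soon_hours":12}}`)
	cfg, err := loadBriefingConfig(partial)
	fmt.Printf("%+v %v\n", cfg, err)
}
```

Starting from defaults rather than from a zero struct matters for the boolean toggles: an absent notify_managers key stays true instead of silently becoming false.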

ai_prompt_templates — versioned DB-managed prompts (new AI module)

Lives in the AI module schema alongside ai_configs, ai_plans, ai_models. Briefing is the first consumer; future AI features (and a follow-up to migrate fieldforce:parse_task) reuse it.

Column	Type	Notes
`id`	uuid PK
`feature_key`	text NOT NULL	matches AccessGate feature key, e.g. `fieldforce:briefing`
`locale`	text NOT NULL	`en` · `zh` · `ms`
`version`	int NOT NULL	monotonic per (feature_key, locale): 1, 2, 3, …
`template`	text NOT NULL	prompt body with `{{placeholder}}` syntax; rendered in Go with `text/template`
`notes`	text NULL	author's "what changed" note
`is_active`	bool NOT NULL DEFAULT false	partial unique index enforces at most one active row per (feature_key, locale)
`created_by`	text NOT NULL	platform admin user_id
`created_at`	timestamptz NOT NULL DEFAULT now()

Constraints: UNIQUE (feature_key, locale, version). Partial unique index: CREATE UNIQUE INDEX ON ai_prompt_templates (feature_key, locale) WHERE is_active = true. Index on (feature_key, locale, is_active) for the hot read path; the active-row lookup is cached for the duration of one cron tick (60s).

DB is sole source of truth after seeding. Seed v1 rows from backend/go/internal/modules/ai/seed_prompts/fieldforce_briefing/{en,zh,ms}.tmpl during migration. Once seeded, there is no code-side LLM-prompt fallback — a stale code constant would diverge from the live prompt and is more dangerous than the well-tested heuristic-only branch.

Prompt activation workflow

Create draft
Platform admin edits a template → service creates a new row at version = max(version)+1, is_active = false.
Review
Admin checks placeholder coverage, locale fidelity, length sanity. Formal eval-CLI-gating deferred to a follow-up spec — staging dark-launch is the v1 validation path.
Activate
Transactional swap: previous active row set to is_active = false, new row to is_active = true. Partial-unique-index protects against concurrent activations. Audit log records before_version, after_version, activated_by.
Rollback any time
Prior versions remain queryable in the diff view. Activate any prior version for one-click rollback.
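The draft, activate, and rollback steps can be sketched against an in-memory stand-in for ai_prompt_templates. This is an illustration only: promptStore and its methods are hypothetical, and the mutex merely plays the role that the database transaction and partial unique index play in the real design.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// PromptTemplate mirrors the columns relevant to activation.
type PromptTemplate struct {
	FeatureKey string
	Locale     string
	Version    int
	Template   string
	IsActive   bool
}

// promptStore is an in-memory stand-in for the table.
type promptStore struct {
	mu   sync.Mutex
	rows []*PromptTemplate
}

var errVersionNotFound = errors.New("prompt version not found")

// CreateDraft appends an inactive row at max(version)+1 for the key.
func (s *promptStore) CreateDraft(feature, locale, body string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	next := 1
	for _, r := range s.rows {
		if r.FeatureKey == feature && r.Locale == locale && r.Version >= next {
			next = r.Version + 1
		}
	}
	s.rows = append(s.rows, &PromptTemplate{FeatureKey: feature, Locale: locale, Version: next, Template: body})
	return next
}

// Activate swaps the active row for the key in one critical section.
// Nothing changes if the requested version does not exist.
func (s *promptStore) Activate(feature, locale string, version int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	var target *PromptTemplate
	for _, r := range s.rows {
		if r.FeatureKey == feature && r.Locale == locale && r.Version == version {
			target = r
		}
	}
	if target == nil {
		return errVersionNotFound
	}
	for _, r := range s.rows {
		if r.FeatureKey == feature && r.Locale == locale {
			r.IsActive = false
		}
	}
	target.IsActive = true
	return nil
}

// ActiveVersion returns the active version for the key, or 0 if none.
func (s *promptStore) ActiveVersion(feature, locale string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, r := range s.rows {
		if r.FeatureKey == feature && r.Locale == locale && r.IsActive {
			return r.Version
		}
	}
	return 0
}

func main() {
	s := &promptStore{}
	s.CreateDraft("fieldforce:briefing", "en", "v1 body")
	v2 := s.CreateDraft("fieldforce:briefing", "en", "v2 body")
	_ = s.Activate("fieldforce:briefing", "en", v2)
	_ = s.Activate("fieldforce:briefing", "en", 1) // rollback to a prior version
	fmt.Println("active version:", s.ActiveVersion("fieldforce:briefing", "en"))
}
```

Rollback needs no dedicated code path here: activating a prior version is the same swap as activating a new one.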

Prompt DB read failure path: If the active prompt row cannot be fetched (missing row, connection error), log loudly, call AccessGate.RecordError, skip the LLM call, and generate a heuristic-only briefing. Never attempt a code-constant fallback prompt.

Feature flag rows

admin_feature_flags(key='fieldforce.briefing', is_enabled=false): seeded by the migration with default OFF. Platform admin flips it true per environment via /admin/feature-flags. This is the sole platform-wide on/off switch for cron, routes, and widget; the per-org tier can only narrow it.
org_feature_flags — no rows seeded; defaults to enabled per ADR-0016 once the global flag is on.

Notifications

No schema change. The existing notifications table gets a new event_type value briefing_generated with payload {briefing_id, briefing_date, at_risk_count_for_recipient} for deeplink and headline text.

Retention sweep

Runs as a second pass inside BriefingJob.runTick, stamp-gated to once-per-org-per-day using ff_briefing_retention_runs:

ff_briefing_retention_runs
  org_id        text PRIMARY KEY
  last_swept_at timestamptz NOT NULL

Per tick, per org: if now() - last_swept_at < 23h, skip. Otherwise DELETE FROM ff_briefings WHERE org_id = ? AND created_at < now() - retention_days * INTERVAL '1 day', then INSERT … ON CONFLICT (org_id) DO UPDATE SET last_swept_at = now(). ON DELETE CASCADE handles the at-risk child rows. Multi-replica safe: the ON CONFLICT … DO UPDATE is atomic.
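The stamp check and the retention cutoff reduce to two small pure functions, sketched here under the assumption that they run Go-side before the SQL is issued. The names shouldSweep, retentionCutoff, and mustParse are hypothetical. The cutoff uses fixed 24-hour days, which approximates rather than exactly reproduces the SQL interval arithmetic above.

```go
package main

import (
	"fmt"
	"time"
)

// shouldSweep reports whether at least 23 hours have passed since the
// last sweep. A zero lastSweptAt (never swept) always qualifies because
// time.Sub saturates at the maximum duration.
func shouldSweep(now, lastSweptAt time.Time) bool {
	return now.Sub(lastSweptAt) >= 23*time.Hour
}

// retentionCutoff returns the instant before which briefings are deleted.
func retentionCutoff(now time.Time, retentionDays int) time.Time {
	return now.Add(-time.Duration(retentionDays) * 24 * time.Hour)
}

// mustParse is a small example helper that parses an RFC 3339 instant.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-22T08:00:00Z")
	fmt.Println(shouldSweep(now, mustParse("2026-05-21T10:00:00Z"))) // 22h since last sweep
	fmt.Println(shouldSweep(now, time.Time{}))                       // never swept
	fmt.Println(retentionCutoff(now, 90).Format("2006-01-02"))
}
```

A 23-hour gate rather than a 24-hour one leaves slack so that a sweep which ran slightly late yesterday does not push today's sweep into tomorrow.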

Generation Pipeline

The cron mirrors Phase 2's overdue job: a single time.NewTicker(60*time.Second) in the fieldforce module, started from app.go, exit-aware via context. Each tick scans candidates and dispatches work serially per org. No work-queue infrastructure.

Tick loop (per minute)

Global flag check
If admin_feature_flags(key='fieldforce.briefing').is_enabled = false, tick exits immediately. No org scanning.
Find orgs due for generation
Repo query applies cheap SQL filters (both flag tiers on, LEFT JOIN existing briefings for briefing_date BETWEEN current_date - 1 AND current_date + 1). Go-side filter loads the IANA timezone, computes local date/time, skips if before schedule_local_time or if today's briefing already exists. This is Phase 4's template for per-org local-time scheduling.
Claim lease
INSERT a placeholder ff_briefings row with generation_mode='pending', empty summary_md, at_risk_count=0. The UNIQUE(org_id, briefing_date) constraint means only one replica wins per org per day. Losers catch the conflict and skip: no LLM call, no AccessGate usage row, no wasted spend. If the winner crashes mid-generation, the pending row persists; a recovery clause (WHERE generation_mode='pending' AND generated_at < now() - INTERVAL '5 minutes') re-claims it on a later tick. Since generated_at is NOT NULL and drives the recovery clause, this implies the placeholder writes it at claim time and the finalize step overwrites it.
Gather inputs
Pull open tasks (status NOT IN approved/cancelled), last 24h activities (in org local time), last 24h audit log entries for the five notable-event types. Compute summary metrics. Compute denormalized assignee team_ids from member table (single batched scan).
Heuristic shortlist
Pure function at_risk_heuristics.Score() — no IO. Sort by heuristic_score descending; ties broken by task age; take top shortlist_max (default 15). Tasks whose only triggers are context flags (no scoring rule fired) are NOT eligible for the shortlist — a context flag adds colour but does not surface a task by itself.
AccessGate authorize
Call AccessGate.Authorize(ctx, AuthzInput{OrgID, Feature: "fieldforce:briefing"}). Denied (trial exhausted, global AI kill-switch, plan caps, no plan) → skip LLM, produce heuristic-only briefing.
LLM call (if allowed)
Fetch active prompt via PromptRepository.GetActive("fieldforce:briefing", orgLocale) (cached for the tick). Single combined-schema call, JSON mode. Expected shape: {summary_md: string, ranked: [{task_id, reason}]}. Post-hoc validation: summary ≤ 1000 chars, every returned task_id in the input shortlist, reason ≤ 200 chars per item, reason must contain at least one synonym for one of the task's heuristic_reasons rule keys (case-insensitive substring; synonym lists live in at_risk_heuristics.go). Items failing synonym match are dropped — the row falls back to templated heuristic text. Whole response invalid (top-level shape broken, summary fails, or all items dropped) → one retry with stricter instruction; still invalid → heuristic-only. Always AccessGate.RecordUsage on success; AccessGate.RecordError on failure.
Heuristic-only fallback
Used when gate denies, LLM fails, or output is entirely invalid. summary_md = templated paragraph from briefing_prompts.go (one per locale), populated from metrics. ranked = heuristic shortlist sorted by score; llm_reason_text = NULL; UI renders templated reason from heuristic_reasons. generation_mode = 'heuristic_only'.
Finalize (single transaction)
UPDATE ff_briefings with final mode, metrics, summary, at_risk_count, generated_at. Batch INSERT ff_briefing_at_risk rows. All-or-nothing — failure leaves the row in pending for the recovery clause to retry.
Notify (after commit)
Loop recipients sequentially. Best-effort, no retry — matches Phase 2 OverdueJob precedent. Recipients: all org members with role admin/owner/manager (if notify_managers), all supervisors whose team has ≥ 1 at-risk item (if notify_supervisors). Payload: {briefing_id, briefing_date, at_risk_count_for_recipient} precomputed per recipient using denormalized team_ids.
Retention sweep
Second pass per org: check ff_briefing_retention_runs.last_swept_at; if ≥ 23h old, run DELETE and stamp. Most ticks bounce out at the stamp check immediately.
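The post-hoc validation in the LLM call step can be sketched as a pure function over the raw JSON response. Everything named here is an assumption for illustration: the synonym lists are invented (the real ones live in at_risk_heuristics.go), treating an empty summary as a broken shape is this sketch's choice, and the shortlist is modelled as a map from task_id to the rule keys that fired.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode/utf8"
)

type RankedItem struct {
	TaskID string `json:"task_id"`
	Reason string `json:"reason"`
}

type LLMResponse struct {
	SummaryMD string       `json:"summary_md"`
	Ranked    []RankedItem `json:"ranked"`
}

// synonyms maps a rule key to lower-case phrases accepted in a reason.
// Illustrative values only.
var synonyms = map[string][]string{
	"overdue":        {"overdue", "past due", "late"},
	"due_soon":       {"due in", "due soon", "deadline"},
	"stale_activity": {"no activity", "stale", "inactive"},
	"approval_stuck": {"awaiting approval", "approval"},
}

// mentionsRule does a case-insensitive substring match against the
// synonyms of any rule that fired for the task.
func mentionsRule(reason string, ruleKeys []string) bool {
	lower := strings.ToLower(reason)
	for _, key := range ruleKeys {
		for _, syn := range synonyms[key] {
			if strings.Contains(lower, syn) {
				return true
			}
		}
	}
	return false
}

// validateResponse parses raw and drops items that fail a check.
// ok=false means the whole response is invalid (retry, then fall back).
func validateResponse(raw []byte, shortlist map[string][]string) (LLMResponse, bool) {
	var resp LLMResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return LLMResponse{}, false
	}
	if n := utf8.RuneCountInString(resp.SummaryMD); n == 0 || n > 1000 {
		return LLMResponse{}, false
	}
	kept := make([]RankedItem, 0, len(resp.Ranked))
	for _, it := range resp.Ranked {
		rules, known := shortlist[it.TaskID]
		if !known || utf8.RuneCountInString(it.Reason) > 200 || !mentionsRule(it.Reason, rules) {
			continue // dropped: the row falls back to templated heuristic text
		}
		kept = append(kept, it)
	}
	if len(kept) == 0 {
		return LLMResponse{}, false
	}
	resp.Ranked = kept
	return resp, true
}

func main() {
	shortlist := map[string][]string{"t1": {"overdue"}, "t2": {"due_soon"}}
	raw := []byte(`{"summary_md":"Two tasks need attention.","ranked":[{"task_id":"t1","reason":"Overdue by 2 days."},{"task_id":"t9","reason":"Overdue."}]}`)
	resp, ok := validateResponse(raw, shortlist)
	fmt.Println(len(resp.Ranked), ok)
}
```

Keeping validation free of IO means the unknown-task-id and synonym-mismatch paths can be unit-tested without a stubbed LLM provider.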

Heuristic scoring rules

Four scoring rules contribute additively to heuristic_score (gates shortlist inclusion and rank). Two context flags append to heuristic_reasons but add no score. overdue and due_soon are mutually exclusive by construction. Status values follow the fieldforce task state machine (ADR-0013).

Rule	Condition	Score
overdue	due_date < now AND status NOT IN (approved, cancelled)	40
due_soon	0 < due_date − now ≤ due_soon_hours AND status = pending	25
stale_activity	last_activity_at + no_activity_days < now AND status = in_progress	20
approval_stuck	status IN (completed, needs_revision) AND updated_at + 48h < now	15
assignee_overload	assignee.active_tasks > 8	context flag (no score)
repeated_resubmissions	task resubmitted ≥ 2 times	context flag (no score)
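The rules above translate fairly directly into a pure scoring function. The sketch below is one possible shape, not the actual at_risk_heuristics.Score() signature: the Task and Thresholds fields are hypothetical, and a zero DueDate is assumed to mean "no due date".

```go
package main

import (
	"fmt"
	"time"
)

// Task holds only the fields the rules need (hypothetical shape).
type Task struct {
	Status              string
	DueDate             time.Time // zero value means no due date (assumption)
	LastActivityAt      time.Time
	UpdatedAt           time.Time
	AssigneeActiveTasks int
	Resubmissions       int
}

type Thresholds struct {
	NoActivityDays int
	DueSoonHours   int
}

// Score returns the additive heuristic score and every rule or flag key
// that fired. A task is shortlist-eligible only when the score is > 0.
func Score(t Task, now time.Time, th Thresholds) (int, []string) {
	score := 0
	reasons := []string{}
	open := t.Status != "approved" && t.Status != "cancelled"
	if !t.DueDate.IsZero() {
		untilDue := t.DueDate.Sub(now)
		switch { // overdue and due_soon are mutually exclusive
		case untilDue < 0 && open:
			score += 40
			reasons = append(reasons, "overdue")
		case untilDue > 0 && untilDue <= time.Duration(th.DueSoonHours)*time.Hour && t.Status == "pending":
			score += 25
			reasons = append(reasons, "due_soon")
		}
	}
	staleAfter := time.Duration(th.NoActivityDays) * 24 * time.Hour
	if t.Status == "in_progress" && t.LastActivityAt.Add(staleAfter).Before(now) {
		score += 20
		reasons = append(reasons, "stale_activity")
	}
	if (t.Status == "completed" || t.Status == "needs_revision") && t.UpdatedAt.Add(48*time.Hour).Before(now) {
		score += 15
		reasons = append(reasons, "approval_stuck")
	}
	// Context flags: recorded as reasons, never scored.
	if t.AssigneeActiveTasks > 8 {
		reasons = append(reasons, "assignee_overload")
	}
	if t.Resubmissions >= 2 {
		reasons = append(reasons, "repeated_resubmissions")
	}
	return score, reasons
}

// mustParse is a small example helper that parses an RFC 3339 instant.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-22T08:00:00Z")
	task := Task{
		Status:         "in_progress",
		DueDate:        mustParse("2026-05-20T08:00:00Z"),
		LastActivityAt: mustParse("2026-05-18T08:00:00Z"),
	}
	score, reasons := Score(task, now, Thresholds{NoActivityDays: 3, DueSoonHours: 24})
	fmt.Println(score, reasons)
}
```

Since stale_activity and approval_stuck require different statuses, the highest score this sketch can produce is 60 (overdue plus stale_activity).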

Failure modes

Failure	Behavior
Global flag off	Tick exits at step 1; no work performed.
Org flag off	Org skipped at step 2; no record created.
AccessGate denies LLM	Heuristic-only briefing generated; gate records a denial event.
LLM call times out / errors	One retry; if still failing, heuristic-only; `AccessGate.RecordError`.
LLM returns invalid JSON	One retry with stricter prompt; if still invalid, heuristic-only.
LLM returns unknown task_ids	Drop unknown items; if all items dropped, heuristic-only.
LLM reason fails synonym match	Drop just that item; row renders templated heuristic text; `synonym_mismatch` counter increments. If all items dropped, heuristic-only.
Lease conflict (race between replicas)	UNIQUE violation on placeholder INSERT; loser skips org with no LLM/gate side-effects.
Lease holder crashes mid-generation	Row stays in `generation_mode='pending'`; recovery clause re-claims after 5 min on a later tick.
Finalize UPDATE error	Logged; row stays `pending`; recovery clause retries on a later tick.
Member-scan returns no teams	Continue with empty `team_ids`; supervisor filter excludes that task from supervisor view.
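The gate-denial, retry, and fallback rows can be condensed into one control-flow sketch. The gate interface here is a reduced, hypothetical stand-in for the AccessGate port (the real Authorize takes a context and an AuthzInput), llmCall bundles "call plus validation" into one function, and recording a single error after the final failed attempt is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// gate is a reduced stand-in for the AccessGate port (hypothetical).
type gate interface {
	Authorize() bool
	RecordUsage()
	RecordError(err error)
}

// llmCall performs one attempt and returns a validated summary or an
// error. strict=true marks the retry with the stricter instruction.
type llmCall func(strict bool) (string, error)

type result struct {
	Mode     string // "llm" or "heuristic_only"
	Summary  string
	Attempts int
}

// generate tries the LLM at most twice and otherwise falls back to the
// heuristic-only summary, so a briefing is produced on every path.
func generate(g gate, call llmCall, heuristicSummary string) result {
	fallback := result{Mode: "heuristic_only", Summary: heuristicSummary}
	if !g.Authorize() {
		return fallback // gate denial: no LLM call at all
	}
	var lastErr error
	for attempt := 1; attempt <= 2; attempt++ {
		summary, err := call(attempt == 2)
		if err == nil {
			g.RecordUsage()
			return result{Mode: "llm", Summary: summary, Attempts: attempt}
		}
		lastErr = err
	}
	g.RecordError(lastErr)
	fallback.Attempts = 2
	return fallback
}

// stubGate and failTimes are test doubles for the example.
type stubGate struct {
	allow       bool
	usage, errs int
}

func (s *stubGate) Authorize() bool   { return s.allow }
func (s *stubGate) RecordUsage()      { s.usage++ }
func (s *stubGate) RecordError(error) { s.errs++ }

// failTimes returns a call that fails its first n attempts, then succeeds.
func failTimes(n int) llmCall {
	calls := 0
	return func(strict bool) (string, error) {
		calls++
		if calls <= n {
			return "", errors.New("llm attempt failed")
		}
		return "LLM summary", nil
	}
}

func main() {
	g := &stubGate{allow: true}
	fmt.Printf("%+v\n", generate(g, failTimes(1), "Heuristic summary"))
	fmt.Printf("%+v\n", generate(g, failTimes(2), "Heuristic summary"))
}
```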

Cost story

One LLM call per org per day. At 100 active orgs × ~1k input + ~600 output tokens per call ≈ 160k tokens/day platform-wide. AccessGate enforces per-org caps automatically — a buggy schedule config is rate-limited before it causes runaway spend.

Read Path & API Surface

Three endpoints under the existing fieldforce route group, mounted behind the new FieldforceBriefingFeatureFlagMiddleware (checks key='fieldforce.briefing' in both tiers).

Method	Path	Purpose
GET	`/api/organizations/:org_id/fieldforce/briefings/latest`	Most recent briefing (today's, else latest stored)
GET	`/api/organizations/:org_id/fieldforce/briefings/:id`	One briefing by id — full `at_risk` array
GET	`/api/organizations/:org_id/fieldforce/briefings?cursor=…`	Paginated history — omits `at_risk` array; tap-through to by-id for the full list

Role-based read access

Admin / Owner / Manager

org-wide

Full at_risk array
visible_at_risk_count = total_at_risk_count
Full summary_md narrative

Supervisor

team-scoped

at_risk filtered: team_ids ∩ supervisor.team_ids ≠ ∅
total_at_risk_count stays org-wide; visible_at_risk_count reflects the filter
Same shared summary_md — Approach A trade-off
Filter compares each row's snapshot team_ids against the supervisor's current team membership (fresh from member-scan), so changes to the supervisor's own teams take effect immediately, even on existing snapshots

Field Team

403

Not their surface in Phase 4
Considered for Phase 5 (mobile personal briefing)
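The supervisor filter amounts to a set intersection between each row's denormalized team_ids and the supervisor's teams. A minimal sketch, with a hypothetical AtRiskItem type and function name:

```go
package main

import "fmt"

// AtRiskItem carries the fields the filter needs (hypothetical shape).
type AtRiskItem struct {
	TaskID  string
	Rank    int
	TeamIDs []string
}

// filterForSupervisor keeps items whose team_ids intersect the
// supervisor's teams. total stays org-wide; len(visible) is the
// visible_at_risk_count. Rank order is preserved.
func filterForSupervisor(items []AtRiskItem, supervisorTeams []string) (visible []AtRiskItem, total int) {
	mine := make(map[string]struct{}, len(supervisorTeams))
	for _, id := range supervisorTeams {
		mine[id] = struct{}{}
	}
	visible = make([]AtRiskItem, 0, len(items))
	for _, it := range items {
		for _, team := range it.TeamIDs {
			if _, ok := mine[team]; ok {
				visible = append(visible, it)
				break
			}
		}
	}
	return visible, len(items)
}

func main() {
	items := []AtRiskItem{
		{TaskID: "a", Rank: 1, TeamIDs: []string{"t1"}},
		{TaskID: "b", Rank: 2, TeamIDs: []string{"t2", "t3"}},
		{TaskID: "c", Rank: 3, TeamIDs: nil}, // member-scan found no teams
	}
	visible, total := filterForSupervisor(items, []string{"t3"})
	fmt.Println(len(visible), total)
}
```

An item with empty team_ids never intersects anything, which matches the failure-mode rule that such tasks drop out of supervisor views while staying visible org-wide.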

Response JSON shape

{
  "id": "...",
  "briefing_date": "2026-05-22",
  "generated_at": "2026-05-22T08:00:14Z",
  "generation_mode": "llm",
  "locale": "en",
  "summary_md": "In the last 24 hours, 12 tasks were created and 8 completed...",
  "metrics": {
    "tasks_created": 12,
    "tasks_completed": 8,
    "tasks_approved": 6,
    "tasks_overdue": 3,
    "notable_events": {
      "escalation_fired": 2,
      "task_rejected": 0,
      "cancelled_after_activity": 0,
      "approval_stalled": 0,
      "repeated_resubmissions": 0
    }
  },
  "at_risk": [
    {
      "task_id": "...",
      "rank": 1,
      "heuristic_score": 60,
      "heuristic_reasons": ["overdue", "stale_activity"],
      "llm_reason_text": "Overdue by 2 days with no activity since assignment.",
      "assignee_user_ids": ["..."],
      "team_ids": ["..."]
    }
  ],
  "total_at_risk_count": 7,
  "visible_at_risk_count": 7
}

Pagination

Opaque base64 cursor of {briefing_date, id}. Default page size 20, max 50. History list omits the at_risk array; tap-through hits the by-id endpoint for the full list.
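A sketch of how such a cursor might be encoded and decoded. The spec only says "opaque base64 cursor of {briefing_date, id}"; serializing the pair as JSON and using the URL-safe unpadded alphabet are assumptions, as are all names below.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// cursor identifies the last row of the previous page.
type cursor struct {
	BriefingDate string `json:"briefing_date"`
	ID           string `json:"id"`
}

// encodeCursor serializes the cursor as URL-safe base64 over JSON.
func encodeCursor(c cursor) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", fmt.Errorf("encode cursor: %w", err)
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// decodeCursor reverses encodeCursor and rejects malformed input.
func decodeCursor(s string) (cursor, error) {
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return cursor{}, fmt.Errorf("decode cursor: %w", err)
	}
	var c cursor
	if err := json.Unmarshal(b, &c); err != nil {
		return cursor{}, fmt.Errorf("decode cursor: %w", err)
	}
	return c, nil
}

// clampPageSize applies the default of 20 and the maximum of 50.
func clampPageSize(n int) int {
	if n <= 0 {
		return 20
	}
	if n > 50 {
		return 50
	}
	return n
}

func main() {
	token, _ := encodeCursor(cursor{BriefingDate: "2026-05-22", ID: "abc"})
	back, err := decodeCursor(token)
	fmt.Println(back, err, clampPageSize(0), clampPageSize(500))
}
```

Including id alongside briefing_date keeps the cursor a stable tiebreaker even though (org_id, briefing_date) is already unique per org.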

Frontend (SolidStart Panel)

File tree

frontend/solidstart/apps/panel/src/features/fieldforce/briefings/
  api.ts                        # TanStack Query hooks
  components/
    BriefingWidget.tsx          # dashboard card
    AtRiskList.tsx              # ranked task cards
    MetricsStrip.tsx            # 5-number strip
    BriefingHistoryList.tsx     # history page list
    BriefingsAdminPanel.tsx     # settings card inside fieldforce admin config
  pages/
    history.tsx                 # /fieldforce/briefings

frontend/solidstart/apps/panel/src/features/ai/prompts/
  api.ts                        # TanStack Query hooks for ai_prompt_templates
  components/
    PromptTemplateList.tsx      # group by feature_key × locale; show active version + history
    PromptEditor.tsx            # template body editor with placeholder reference panel
    PromptVersionDiff.tsx       # side-by-side diff between any two versions
    PromptActivateModal.tsx     # confirms activation + records audit reason
  pages/
    index.tsx                   # /admin/ai/prompts (platform admin only)

Today's Briefing (AI · LLM)
22 May 2026 · generated 08:00 SGT

In the last 24 hours, 12 tasks were created and 8 completed. Two escalations fired overnight on Block C. Three tasks are currently overdue and require immediate attention.

MetricsStrip: Created · Completed · Approved · Overdue · Notable

At risk · 7 tasks

Install fire suppression panel — Block C: Overdue by 2 days with no field activity since assignment. (Score 60)
Electrical inspection — Unit 4B: Awaiting manager approval for 51 hours. (Score 40)
HVAC filter replacement — Tower A: Due in 18 hours. Status still pending, no check-in recorded. (Score 25)

Panel widget mockup — manager-facing briefing card with MetricsStrip and AtRiskList.

All surfaces

Briefing widget — on the existing fieldforce dashboard at /fieldforce. Layout: header (date + mode badge), summary_md rendered as markdown, MetricsStrip, AtRiskList, "View history" link.
History page — /fieldforce/briefings — paginated list of past briefings with snippet + counts.
Admin settings — new "Briefings" section inside the existing fieldforce admin config screen: per-org enable, schedule time, timezone picker, retention days, at-risk thresholds (collapsed by default), notify checkboxes.
Notification dropdown — handler for briefing_generated event type, deeplinks to /fieldforce#briefing. Text: "Today's briefing is ready — 3 tasks at risk" (i18n'd).
Prompt admin (platform admin only) — /admin/ai/prompts. Lists all (feature_key, locale) rows with their active version. Click → version history + diff view. Edit → creates new draft version. "Activate" → promotes draft + audit log entry. No "Run eval" button in v1 (eval CLI deferred to a follow-up spec).

At-risk card rendering

Task title + assignee chips.
Reasoning: prefer llm_reason_text if present; otherwise build templated text from heuristic_reasons (e.g. "Overdue by 2 days · no activity in 4 days").
Score chip — color-coded by score bucket: 0–24 green · 25–49 amber · 50+ red.
Click → task detail page.

Empty states

Today's briefing not yet generated: "Today's briefing will arrive around 08:00 (Asia/Kuala_Lumpur)."
Per-org flag off: Widget hides entirely; admin sees the settings card instead.
No fieldforce activity: Heuristic-only fallback summary still renders: "No activity in the last 24 hours."

i18n: All UI chrome strings in Paraglide messages/{en,zh,ms}.json under fieldforce.briefing.*. LLM-generated summary_md and llm_reason_text are not translated client-side — they arrive in the org's briefing.locale from generation.

Testing Strategy

Layer	Tests
Heuristic scoring	Pure-function unit tests in `at_risk_heuristics_test.go` — every rule with edge cases (overdue exactly at threshold, no activity exactly at boundary, multiple rules firing on one task, context flags don't surface a task alone).
Use case	`generate_briefing_test.go` with stubbed AccessGate + stubbed LLM provider — covers allowed/denied/timeout/invalid-JSON/unknown-task-id/synonym-mismatch paths; confirms heuristic fallback fires correctly in each case.
Repository	Integration tests against real PG container — UNIQUE violation handling (lease conflict), cascade delete on retention sweep, batch insert behavior.
HTTP handlers	Seeded fixtures with role matrix (admin/owner/manager/supervisor/field_team) — team-filter correctness, 403 for field_team, pagination cursor round-trip.
Prompt service	`prompt_service_test.go` — CreateDraft increments version monotonically; Activate does transactional swap (partial-unique-index integrity holds under concurrency); platform-admin-only edit/activate; rollback to any prior version works; audit-log entry recorded on activation.
Frontend components	Vitest for BriefingWidget, AtRiskList, MetricsStrip, PromptEditor, PromptVersionDiff.
Frontend e2e	Playwright #1: widget loads after cron generation; supervisor sees only team items; admin sees all. Playwright #2: platform admin edits a prompt, draft saved, activate flow shows in audit log.
Eval set	Deferred to follow-up spec. Staging dark-launch is the v1 validation. A skeleton `evals/briefing/cases.jsonl` may be checked in as scaffolding; no CLI consumes it in Phase 4.

Rollout Sequence

Single switch: admin_feature_flags(fieldforce.briefing).is_enabled gates cron, routes, and widget end-to-end. Seeded false by migration; flipped per-environment via /admin/feature-flags. The kill-switch path is exercised during rollout, so it's proven before it's needed in anger.

Ship migration
Creates ff_briefings, ff_briefing_at_risk, ai_prompt_templates, ff_briefing_retention_runs. Seeds flag OFF. Seeds v1 prompt rows for fieldforce:briefing × {en, zh, ms} from seed_prompts/. Feature invisible — cron exits at step 1, routes 403, widget hides.
Staging dark-launch
Platform admin flips DB flag to true in staging. Run for a week. Prompt tuning is DB-only — no PR cycle ever: edit in /admin/ai/prompts → review checklist → activate when satisfied.
Prod dark launch
Flip DB flag to true in prod. Briefings begin generating; widget surfaces. Watch error rates, AccessGate cost, lease/recovery behavior under worker restart.
Watch & tune
Observe AccessGate fieldforce:briefing usage/cost. Tune defaults (shortlist_max, no_activity_days) based on real org data. Prompt iterations stay DB-only post-GA. Kill-switch: flip the same DB flag back to false — instant, all orgs.

Observability

AccessGate already records per-feature usage and per-call errors — Phase 3 dashboards show fieldforce:briefing rows automatically.
Structured cron logs: briefing_generated{org_id,mode} · briefing_skipped{org_id,reason} · briefing_llm_failed{org_id,err} · briefing_persist_failed{org_id} · briefing_retention_swept{count}.
Counter fieldforce_briefings_total{mode} (llm vs heuristic_only) — answers "are we using AI as expected, or always falling back?"
Counter fieldforce_briefing_llm_rows_rejected_total{reason} where reason ∈ {unknown_task_id, synonym_mismatch} — answers "is the LLM reasoning matching the data?" A persistently high synonym_mismatch rate signals that the prompt or the synonym list in at_risk_heuristics.go needs tuning.

Locked Decisions

All six open questions were resolved during brainstorming. Recorded here so reviewers can rely on them without re-litigating.

BYOK vs platform model preference

BYOK if configured, else platform default.

fieldforce:briefing calls prefer the org's configured BYOK model when present. Falls back to the platform default model when no BYOK key is set. Matches the established fieldforce:parse_task behavior from Phase 3 — AccessGate resolves this automatically.

Rejected: always-platform-default (ignores org's BYOK investment), always-BYOK (breaks orgs without BYOK configured).

Notable events taxonomy

Five event types counted in notable_events: escalation_fired, task_rejected, cancelled_after_activity, approval_stalled, repeated_resubmissions.

These cover the core manager-visible friction points. approval_stalled is a derived condition (status=completed AND updated_at + 48h < now), computed at briefing generation time — not a real-time event. Out of scope for v1: attachment audits, priority/due-date edits, reassignments, bulk creation, AI-driven creation counts.

Deferred to Phase 5: attachment audit flags, AI-driven creation tracking, priority change events.

Supervisor with no team at-risk coverage

Suppress the briefing_generated notification. Widget still accessible manually.

A supervisor whose team has zero at-risk items should not receive the bell — it would be noise. The widget remains reachable via direct navigation; only the proactive push is suppressed.

Rejected: always-notify-all-supervisors (noise for supervisors with nothing flagged).

Retention default

90 days default. "Forever" / compliance-grade retention deferred to Phase 5.

90 days covers reasonable operational history. Compliance-grade mode needs its own design surface — customer demand not yet surfaced.

Deferred: unlimited retention or compliance-grade mode (Phase 5 if demand surfaces).

Localization granularity

One locale per org (briefing.locale config). Per-recipient locale is out of scope for v1.

The per-recipient locale trade-off (LLM cost × unique locales in the org) is not worth solving at one call per org per day. The entire org sees summary prose in one language; UI chrome strings are still i18n'd per user preference via Paraglide.

Deferred: per-recipient locale based on user preference (Phase 5).

First-run experience

No "generate sample now" button. First briefing generates the next morning at the org's scheduled local time.

An on-demand regen button would break the daily-snapshot-only architecture, introducing out-of-schedule cost and idempotency complications. Prompt validation does not need it either: staging dark-launch is the v1 validation path, and the eval set is deferred to a follow-up spec.

Deferred: on-demand generation (Phase 5+ candidate).
