Fieldforce Phase 4 — Daily AI Briefings + At-Risk Delays Design
Phase 4 ships a daily AI-generated briefing per org — a 2–4 sentence narrative plus a ranked at-risk task list — persisted as ff_briefings records and surfaced as a panel dashboard widget for managers and supervisors. A hybrid pipeline (deterministic heuristics shortlist → single LLM call ranks and writes reasoning) keeps costs at one call per org per day, with full heuristic-only fallback when the AI gate denies or the LLM fails. A new ai_prompt_templates table lets platform admins tune prompts in the DB without code PRs.
Scope & Goals
What Phase 4 ships
- Daily AI-generated briefing per org: ff_briefings table, panel widget, in-app notification.
- At-risk task list inside each briefing: heuristic shortlist (≤ 15 items) ranked and reasoned by a single LLM call.
- Two-tier feature flag fieldforce.briefing (ADR-0016): global kill-switch for platform admins, per-org toggle for org admins. Cron, route handlers, and panel widget all check the flag; if either tier is off, the feature is invisible end-to-end.
- Config knobs via org_module_configs (ADR-0019): schedule time, timezone, retention days, at-risk thresholds, notification toggles.
- AccessGate integration under key fieldforce:briefing (Phase 3 pattern): LLM call counted against org AI usage; gate denial → heuristic-only briefing (feature stays usable).
- DB-stored prompt templates (ai_prompt_templates table in the AI module): versioned, activate-on-demand, platform-admin-only edit access. Briefing is the first consumer; future AI features inherit this capability free.
Success criteria
- Manager opens the panel in the morning — sees a 2–4 sentence summary of overnight activity plus 3–10 ranked at-risk tasks with one-line LLM reasoning each.
- Supervisor sees the same widget; at-risk list filtered to their team via the existing member-scan (ADR-0017).
- Both feature-flag tiers respected: global OFF stops generation for all orgs; per-org OFF stops it for that org only.
- AccessGate records fieldforce:briefing usage per org per day; LLM failures fall back to heuristic-only briefing without crashing the cron.
Approach Selected — A (Hybrid Generation)
A · Hybrid: one org-scope LLM briefing
- LLM calls / org / day: 1
- Supervisor fidelity: High on data (filtered at-risk); shared summary prose
- Failure path: Natural; gate denial or LLM error → heuristic-only
One org-scope briefing. Supervisors get team-filtered at-risk items with the same summary_md. Predictable cost. Clean AccessGate integration (one feature row per org per day). Reuses the Phase 3 BYOK gate without modification.
B · Per-scope LLM briefings
- LLM calls / org / day: 1 + N teams
- Supervisor fidelity: Highest; bespoke summary per team
- Failure path: Per-scope
Cost scales with team count; redundant for small teams. Bespoke prose per team is not worth the expense at this stage.
C · Heuristic-only with optional AI prose
- LLM calls / org / day: 0–1
- Supervisor fidelity: Same at-risk fidelity as A
- Failure path: N/A
Loses contextual per-task LLM reasoning — a core value driver. Heuristic-only is the fallback path, not the design target.
Architecture & Module Layout
Go module file tree
backend/go/internal/modules/fieldforce/
  domain/entity/
    briefing.go  # Briefing aggregate (sections, generation status)
    at_risk.go  # AtRiskCandidate, AtRiskReason value types
  application/port/
    briefing_ports.go  # BriefingRepository, ClockPort
                       # (reuses existing AccessGate, Notifier, PromptRepository ports)
  application/usecase/
    generate_briefing.go  # cron entry: gather → heuristic → gate → fetch prompt → LLM → persist → notify
    at_risk_heuristics.go  # pure functions, no IO
    briefing_query.go  # latest / list / by-id with supervisor team filter
    briefing_prompts.go  # heuristic-only locale templates (NOT LLM prompts)
  adapter/inbound/http/
    briefing_handler.go  # GET /fieldforce/briefings/latest, /:id, /
  adapter/inbound/cron/
    briefing_cron.go  # BriefingJob: own 1-minute ticker + goroutine, sibling to OverdueJob
  adapter/outbound/persistence/postgresql/
    briefings_repo.go
backend/go/internal/modules/ai/  # NEW: generic prompt-template capability
  domain/entity/
    prompt_template.go  # PromptTemplate aggregate (id, feature_key, locale, version, template, is_active)
  application/port/
    prompt_ports.go  # PromptRepository, PromptAdminService
  application/usecase/
    prompt_service.go  # GetActive, CreateDraft, Activate, ListVersions, Diff
    prompt_service_test.go
  adapter/inbound/http/
    prompt_admin_handler.go  # platform-admin routes: list/diff/edit/activate
    prompt_admin_handler_test.go
  adapter/outbound/persistence/postgresql/
    prompt_templates_repo.go
    prompt_templates_repo_test.go
Reused infrastructure — no new dependencies
- AccessGate (Phase 3): Every LLM call passes through it under key fieldforce:briefing. Resolves mode, enforces global AI kill-switch, validates plan caps, decrypts BYOK credentials, records per-decision and per-success rows.
- Notifier port (Phase 1): New event constant NotificationBriefingGenerated. NotifyParams gains a generic Payload map[string]string field; existing call sites are unaffected (new field is zero-valued). Task-shaped events continue using TaskID/TaskTitle; briefing events write Payload["briefing_id"], Payload["briefing_date"], Payload["at_risk_count_for_recipient"].
- Feature flag middleware (ADR-0016): New FieldforceBriefingFeatureFlagMiddleware checks key='fieldforce.briefing' in both tiers. Cron also consults the same two flag tables before generating.
- Supervisor scope (ADR-0017): Read path uses the existing member-scan helper for team filtering; no new mechanism.
- org_module_configs (ADR-0019): New namespaced keys briefing.*. Go-hardcoded defaults merged on read; a partial row returns full defaults for unset fields.
- Migrator pattern: New CreateBriefingTables() raw-SQL function in migrator.go, per Phase 3 / CONTEXT.md convention; no SQL files.
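The "defaults merged on read" behaviour of org_module_configs can be sketched as follows. This is a minimal illustration, not the module's actual code: the BriefingConfig type, its field names, and the flat JSON layout of the stored row are assumptions. The default values are the ones this spec lists for the briefing.* keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BriefingConfig mirrors the briefing.* keys. The type, field names, and flat
// JSON layout are assumptions made for this sketch.
type BriefingConfig struct {
	ScheduleLocalTime string `json:"schedule_local_time"`
	Timezone          string `json:"timezone"`
	RetentionDays     int    `json:"retention_days"`
	NoActivityDays    int    `json:"no_activity_days"`
	DueSoonHours      int    `json:"due_soon_hours"`
	ShortlistMax      int    `json:"shortlist_max"`
	NotifyManagers    bool   `json:"notify_managers"`
	NotifySupervisors bool   `json:"notify_supervisors"`
	Locale            string `json:"locale"`
}

// DefaultBriefingConfig returns the hardcoded defaults listed in the spec.
func DefaultBriefingConfig() BriefingConfig {
	return BriefingConfig{
		ScheduleLocalTime: "08:00",
		Timezone:          "UTC",
		RetentionDays:     90,
		NoActivityDays:    3,
		DueSoonHours:      24,
		ShortlistMax:      15,
		NotifyManagers:    true,
		NotifySupervisors: true,
		Locale:            "en",
	}
}

// MergeBriefingConfig overlays a stored, possibly partial, JSON object on the
// defaults. json.Unmarshal only writes fields present in the input, so keys
// missing from the row keep their default values.
func MergeBriefingConfig(stored []byte) (BriefingConfig, error) {
	cfg := DefaultBriefingConfig()
	if len(stored) == 0 {
		return cfg, nil
	}
	if err := json.Unmarshal(stored, &cfg); err != nil {
		return DefaultBriefingConfig(), fmt.Errorf("decode briefing config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := MergeBriefingConfig([]byte(`{"timezone":"Asia/Kuala_Lumpur","retention_days":30}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Decoding onto a pre-populated struct keeps the merge to one call and lets an explicit false (for example notify_managers) override a true default, which a "zero value means unset" check would not.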
Generation data flow
ADRs that emerge from this spec
- ADR-0022, "Daily Fieldforce Briefings as Stored Snapshots": captures why stored snapshots beat live computation: predictable cost, history browsability, idempotent cron, snapshot semantics align with team_ids denormalization. (Next available ADR; 0021 is table-naming-convention.)
- ADR-0023 (optional), "Hybrid Heuristic+LLM At-Risk Detection": only if grill/review surfaces the rule-vs-LLM boundary as non-obvious.
Data Model
Two new briefing tables (ff_briefings parent, ff_briefing_at_risk child), a small stamp table for the retention sweep (ff_briefing_retention_runs), one new feature-flag key, new org_module_configs keys, and one new AI-module table (ai_prompt_templates). No changes to ff_tasks, ff_activities, or any Phase 1–3 table.
ff_briefings — one row per (org, day)
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| org_id | text NOT NULL | multi-tenant scope; matches existing fieldforce convention |
| briefing_date | date NOT NULL | logical day in the org's timezone; UNIQUE with org_id |
| generated_at | timestamptz NOT NULL | when cron finished |
| generation_mode | text NOT NULL | pending (lease claimed) · llm · heuristic_only |
| summary_md | text NOT NULL | 2–4 sentence narrative; markdown |
| at_risk_count | int NOT NULL | total candidates in the child table |
| tasks_created | int NOT NULL DEFAULT 0 | created in last 24h (org local) |
| tasks_completed | int NOT NULL DEFAULT 0 | transitioned to completed in last 24h |
| tasks_approved | int NOT NULL DEFAULT 0 | transitioned to approved in last 24h |
| tasks_overdue | int NOT NULL DEFAULT 0 | currently overdue at generation time |
| notable_events | jsonb NOT NULL DEFAULT '{}' | per-type breakdown: {escalation_fired, task_rejected, cancelled_after_activity, approval_stalled, repeated_resubmissions}. Frontend renders non-zero types. |
| locale | text NOT NULL | en · zh · ms |
| created_at | timestamptz NOT NULL DEFAULT now() | |
| updated_at | timestamptz NOT NULL DEFAULT now() | |
Constraints: UNIQUE (org_id, briefing_date) — makes the cron idempotent; concurrent ticks resolve via uniqueness violation. Index on (org_id, briefing_date DESC) for latest lookups and history pagination.
ff_briefing_at_risk — at-risk items per briefing
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| briefing_id | uuid NOT NULL FK | → ff_briefings ON DELETE CASCADE |
| task_id | uuid NOT NULL | not a hard FK; snapshot may outlive task |
| rank | int NOT NULL | 1 = highest risk |
| heuristic_score | int NOT NULL | 0–100; deterministic |
| heuristic_reasons | jsonb NOT NULL | array of rule and flag keys that fired, e.g. ["overdue","stale_activity"] |
| llm_reason_text | text NULL | one-sentence LLM reasoning; NULL when generation_mode='heuristic_only' |
| assignee_user_ids | text[] NOT NULL | denormalized for fast read rendering |
| team_ids | text[] NOT NULL | denormalized at generation time; supports supervisor team filter on read |
Constraints: UNIQUE (briefing_id, task_id). Index on (briefing_id, rank) — read path orders BY rank ASC.
org_module_configs additions (module: fieldforce)
- briefing.schedule_local_time: default "08:00"; when daily briefing fires (24h, in org's timezone)
- briefing.timezone: default "UTC"; IANA timezone string
- briefing.retention_days: default 90; retention sweep deletes older briefings
- briefing.at_risk_thresholds.no_activity_days: default 3
- briefing.at_risk_thresholds.due_soon_hours: default 24
- briefing.at_risk_thresholds.shortlist_max: default 15; max candidates fed to LLM ranker
- briefing.notify_managers: default true
- briefing.notify_supervisors: default true; still suppressed for supervisors whose filtered at-risk count = 0
- briefing.locale: default "en"; language for LLM-generated prose; one per org (per-recipient locale is out of scope for v1)
ai_prompt_templates — versioned DB-managed prompts (new AI module)
Lives in the AI module schema alongside ai_configs, ai_plans, ai_models. Briefing is the first consumer; future AI features (and a follow-up to migrate fieldforce:parse_task) reuse it.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| feature_key | text NOT NULL | matches AccessGate feature key, e.g. fieldforce:briefing |
| locale | text NOT NULL | en · zh · ms |
| version | int NOT NULL | monotonic per (feature_key, locale): 1, 2, 3, … |
| template | text NOT NULL | prompt body with {{placeholder}} syntax; rendered in Go with text/template |
| notes | text NULL | author's "what changed" note |
| is_active | bool NOT NULL DEFAULT false | partial unique index enforces at most one active row per (feature_key, locale) |
| created_by | text NOT NULL | platform admin user_id |
| created_at | timestamptz NOT NULL DEFAULT now() | |
Constraints: UNIQUE (feature_key, locale, version). Partial unique index: CREATE UNIQUE INDEX ON ai_prompt_templates (feature_key, locale) WHERE is_active = true. Index on (feature_key, locale, is_active) for the hot read path. Result cached for the duration of one cron tick (60s).
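Rendering a stored template body with text/template might look like the sketch below. It assumes placeholders are written in text/template's dot form (for example {{.locale}}) and that values arrive as a string map; RenderPrompt and the placeholder names are hypothetical. The missingkey=error option is a real text/template option that turns an uncovered placeholder into an error instead of silently rendering an empty value.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// RenderPrompt renders a prompt template body against named values.
// RenderPrompt is a hypothetical helper, not part of the spec.
func RenderPrompt(body string, values map[string]string) (string, error) {
	// missingkey=error makes execution fail when the template references a
	// key that is absent from the values map.
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(body)
	if err != nil {
		return "", fmt.Errorf("parse prompt template: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, values); err != nil {
		return "", fmt.Errorf("render prompt template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	body := "Write the briefing in locale {{.locale}} for {{.task_count}} shortlisted tasks."
	out, err := RenderPrompt(body, map[string]string{"locale": "en", "task_count": "3"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// A template that references a value the caller did not supply fails loudly.
	_, err = RenderPrompt("Org: {{.org_name}}", map[string]string{"locale": "en"})
	fmt.Println("missing placeholder error:", err != nil)
}
```

Failing on a missing key is one way to back the "placeholder coverage" item of the review step with a runtime check, since a draft that references an unknown placeholder cannot render.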
Prompt activation workflow
1. Create draft: Platform admin edits a template → service creates a new row at version = max(version)+1, is_active = false.
2. Review: Admin checks placeholder coverage, locale fidelity, length sanity. Formal eval-CLI gating is deferred to a follow-up spec; staging dark-launch is the v1 validation path.
3. Activate: Transactional swap: previous active row set to is_active = false, new row to is_active = true. The partial unique index protects against concurrent activations. Audit log records before_version, after_version, activated_by.
4. Rollback any time: Prior versions remain queryable in the diff view. Activate any prior version for one-click rollback.
Feature flag rows
- admin_feature_flags(key='fieldforce.briefing', is_enabled=false): seeded by the migration with default OFF. Platform admin flips it true per environment via /admin/feature-flags. This is the sole end-to-end on/off switch (cron, routes, widget).
- org_feature_flags: no rows seeded; defaults to enabled per ADR-0016 once the global flag is on.
Notifications
No schema change. The existing notifications table gets a new event_type value briefing_generated with payload {briefing_id, briefing_date, at_risk_count_for_recipient} for deeplink and headline text.
Retention sweep
Runs as a second pass inside BriefingJob.runTick, stamp-gated to once-per-org-per-day using ff_briefing_retention_runs:
ff_briefing_retention_runs
org_id text PRIMARY KEY
last_swept_at timestamptz NOT NULL
Per tick, per org: if now() - last_swept_at < 23h, skip. Otherwise DELETE FROM ff_briefings WHERE org_id = ? AND created_at < now() - retention_days * INTERVAL '1 day', then INSERT … ON CONFLICT (org_id) DO UPDATE SET last_swept_at = now(). ON DELETE CASCADE handles the at-risk child rows. Multi-replica safe: the ON CONFLICT … DO UPDATE is atomic.
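The stamp check and the deletion cutoff reduce to two small pure functions. The sketch below assumes hypothetical names (SweepDue, RetentionCutoff, mustParse) and models a missing ff_briefing_retention_runs row as a nil timestamp; the real job would read the stamp and run the DELETE through the repository.

```go
package main

import (
	"fmt"
	"time"
)

// minSweepInterval is the 23h gate from the spec.
const minSweepInterval = 23 * time.Hour

// SweepDue reports whether the retention sweep should run for an org on this
// tick. lastSweptAt is nil when the org has no stamp row yet (assumption).
func SweepDue(lastSweptAt *time.Time, now time.Time) bool {
	if lastSweptAt == nil {
		return true
	}
	// Skip while less than 23h have passed since the last sweep.
	return now.Sub(*lastSweptAt) >= minSweepInterval
}

// RetentionCutoff returns the instant before which briefings are deleted,
// computed in UTC as now minus retentionDays calendar days.
func RetentionCutoff(now time.Time, retentionDays int) time.Time {
	return now.UTC().AddDate(0, 0, -retentionDays)
}

// mustParse is a small helper for the example; it panics on bad input.
func mustParse(rfc3339 string) time.Time {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-22T08:00:00Z")
	last := mustParse("2026-05-21T10:00:00Z") // 22h ago
	fmt.Println("sweep due:", SweepDue(&last, now))
	fmt.Println("cutoff:", RetentionCutoff(now, 90).Format(time.RFC3339))
}
```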
Generation Pipeline
The cron mirrors Phase 2's overdue job: a single time.NewTicker(60*time.Second) in the fieldforce module, started from app.go, exit-aware via context. Each tick scans candidates and dispatches work serially per org. No work-queue infrastructure.
Tick loop (per minute)
1. Global flag check: If admin_feature_flags(key='fieldforce.briefing').is_enabled = false, tick exits immediately. No org scanning.
2. Find orgs due for generation: Repo query applies cheap SQL filters (both flag tiers on, LEFT JOIN existing briefings for briefing_date BETWEEN current_date - 1 AND current_date + 1). Go-side filter loads the IANA timezone, computes local date/time, skips if before schedule_local_time or if today's briefing already exists. This is Phase 4's template for per-org local-time scheduling.
3. Claim lease: INSERT a placeholder ff_briefings row with generation_mode='pending', empty summary_md, at_risk_count=0. The UNIQUE(org_id, briefing_date) constraint means only one replica wins per org per day. Losers catch the conflict and skip: no LLM call, no AccessGate usage row, no wasted spend. If the winner crashes mid-generation, the pending row persists; a recovery clause (WHERE generation_mode='pending' AND generated_at < now() - 5min) re-claims on a later tick.
4. Gather inputs: Pull open tasks (status NOT IN approved/cancelled), last 24h activities (in org local time), last 24h audit log entries for the five notable-event types. Compute summary metrics. Compute denormalized assignee team_ids from the member table (single batched scan).
5. Heuristic shortlist: Pure function at_risk_heuristics.Score(), no IO. Sort by heuristic_score descending; ties broken by task age; take top shortlist_max (default 15). Tasks whose only triggers are context flags (no scoring rule fired) are NOT eligible for the shortlist; a context flag adds colour but does not surface a task by itself.
6. AccessGate authorize: Call AccessGate.Authorize(ctx, AuthzInput{OrgID, Feature: "fieldforce:briefing"}). Denied (trial exhausted, global AI kill-switch, plan caps, no plan) → skip LLM, produce heuristic-only briefing.
7. LLM call (if allowed): Fetch active prompt via PromptRepository.GetActive("fieldforce:briefing", orgLocale) (cached for the tick). Single combined-schema call, JSON mode. Expected shape: {summary_md: string, ranked: [{task_id, reason}]}. Post-hoc validation: summary ≤ 1000 chars, every returned task_id in the input shortlist, reason ≤ 200 chars per item, reason must contain at least one synonym for one of the task's heuristic_reasons rule keys (case-insensitive substring; synonym lists live in at_risk_heuristics.go). Items failing synonym match are dropped; the row falls back to templated heuristic text. Whole response invalid (top-level shape broken, summary fails, or all items dropped) → one retry with stricter instruction; still invalid → heuristic-only. Always AccessGate.RecordUsage on success; AccessGate.RecordError on failure.
8. Heuristic-only fallback: Used when gate denies, LLM fails, or output is entirely invalid. summary_md = templated paragraph from briefing_prompts.go (one per locale), populated from metrics. ranked = heuristic shortlist sorted by score; llm_reason_text = NULL; UI renders templated reason from heuristic_reasons. generation_mode = 'heuristic_only'.
9. Finalize (single transaction): UPDATE ff_briefings with final mode, metrics, summary, at_risk_count, generated_at. Batch INSERT ff_briefing_at_risk rows. All-or-nothing: failure leaves the row in pending for the recovery clause to retry.
10. Notify (after commit): Loop recipients sequentially. Best-effort, no retry, matching the Phase 2 OverdueJob precedent. Recipients: all org members with role admin/owner/manager (if notify_managers), all supervisors whose team has ≥ 1 at-risk item (if notify_supervisors). Payload: {briefing_id, briefing_date, at_risk_count_for_recipient} precomputed per recipient using denormalized team_ids.
11. Retention sweep: Second pass per org: check ff_briefing_retention_runs.last_swept_at; if ≥ 23h old, run DELETE and stamp. Most ticks bounce out at the stamp check immediately.
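The Go-side "is this org due?" filter from the tick loop can be sketched as a pure function. The name DueForBriefing, its signature, and the map of existing briefing dates are assumptions for illustration; the real job would get existing rows from the repo query. The blank import of time/tzdata embeds the IANA database so time.LoadLocation also works on hosts without system zoneinfo.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA timezone database in the binary
)

// DueForBriefing reports whether an org's briefing should be generated at now.
// It also returns the org-local briefing date (YYYY-MM-DD). existingDates holds
// the briefing_date values already stored for the org (hypothetical input).
func DueForBriefing(now time.Time, tz, scheduleLocalTime string, existingDates map[string]bool) (bool, string, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return false, "", fmt.Errorf("load timezone %q: %w", tz, err)
	}
	sched, err := time.Parse("15:04", scheduleLocalTime)
	if err != nil {
		return false, "", fmt.Errorf("parse schedule %q: %w", scheduleLocalTime, err)
	}
	local := now.In(loc)
	date := local.Format("2006-01-02")
	// Today's firing instant in the org's own timezone.
	fireAt := time.Date(local.Year(), local.Month(), local.Day(), sched.Hour(), sched.Minute(), 0, 0, loc)
	if local.Before(fireAt) {
		return false, date, nil // before schedule_local_time
	}
	if existingDates[date] {
		return false, date, nil // today's briefing already exists
	}
	return true, date, nil
}

// mustParse is a small helper for the example; it panics on bad input.
func mustParse(rfc3339 string) time.Time {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-22T00:30:00Z") // 08:30 in Asia/Kuala_Lumpur
	due, date, err := DueForBriefing(now, "Asia/Kuala_Lumpur", "08:00", nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(due, date)
}
```

Computing the date in the org's timezone, rather than the server's, is what makes the ±1 day window in the SQL filter necessary: an org's local "today" can differ from current_date on the database host.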
Heuristic scoring rules
Four scoring rules contribute additively to heuristic_score (gates shortlist inclusion and rank). Two context flags append to heuristic_reasons but add no score. overdue and due_soon are mutually exclusive by construction. Status values follow the fieldforce task state machine (ADR-0013).
| Rule | Condition | Score |
|---|---|---|
| overdue | due_date < now AND status NOT IN (approved, cancelled) | 40 |
| due_soon | 0 < due_date − now ≤ due_soon_hours AND status = pending | 25 |
| stale_activity | last_activity_at + no_activity_days < now AND status = in_progress | 20 |
| approval_stuck | status IN (completed, needs_revision) AND updated_at + 48h < now | 15 |
| assignee_overload | assignee.active_tasks > 8 | context flag, no score |
| repeated_resubmissions | task resubmitted ≥ 2 times | context flag, no score |
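A minimal sketch of the scoring rules as a pure function follows. The Task, Thresholds, and Candidate types are illustrative assumptions rather than the module's entities, and a zero DueDate is assumed to mean "no due date". Points and conditions are taken from the rules above.

```go
package main

import (
	"fmt"
	"time"
)

// Task carries only the fields the rules need (hypothetical shape).
type Task struct {
	ID                  string
	Status              string    // pending, in_progress, completed, needs_revision, approved, cancelled
	DueDate             time.Time // zero value means no due date (assumption)
	LastActivityAt      time.Time
	UpdatedAt           time.Time
	AssigneeActiveTasks int
	Resubmissions       int
}

type Thresholds struct {
	NoActivityDays int
	DueSoonHours   int
}

type Candidate struct {
	TaskID  string
	Score   int
	Reasons []string
}

// Score applies the four additive rules and the two context flags.
// eligible is false when no scoring rule fired, so a task that only has
// context flags never reaches the shortlist.
func Score(t Task, now time.Time, th Thresholds) (c Candidate, eligible bool) {
	c.TaskID = t.ID
	add := func(key string, points int) {
		c.Score += points
		c.Reasons = append(c.Reasons, key)
	}
	closed := t.Status == "approved" || t.Status == "cancelled"
	hasDue := !t.DueDate.IsZero()

	if hasDue && t.DueDate.Before(now) && !closed {
		add("overdue", 40)
	}
	if hasDue && t.Status == "pending" {
		until := t.DueDate.Sub(now)
		if until > 0 && until <= time.Duration(th.DueSoonHours)*time.Hour {
			add("due_soon", 25)
		}
	}
	if t.Status == "in_progress" && t.LastActivityAt.AddDate(0, 0, th.NoActivityDays).Before(now) {
		add("stale_activity", 20)
	}
	if (t.Status == "completed" || t.Status == "needs_revision") && t.UpdatedAt.Add(48*time.Hour).Before(now) {
		add("approval_stuck", 15)
	}
	eligible = c.Score > 0

	// Context flags add colour to the reasons but no score.
	if t.AssigneeActiveTasks > 8 {
		c.Reasons = append(c.Reasons, "assignee_overload")
	}
	if t.Resubmissions >= 2 {
		c.Reasons = append(c.Reasons, "repeated_resubmissions")
	}
	return c, eligible
}

// mustParse is a small helper for the example; it panics on bad input.
func mustParse(rfc3339 string) time.Time {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-22T08:00:00Z")
	task := Task{
		ID:             "t1",
		Status:         "in_progress",
		DueDate:        mustParse("2026-05-20T08:00:00Z"),
		LastActivityAt: mustParse("2026-05-18T08:00:00Z"),
	}
	c, ok := Score(task, now, Thresholds{NoActivityDays: 3, DueSoonHours: 24})
	fmt.Println(c.Score, c.Reasons, ok)
}
```

Because overdue requires due_date < now and due_soon requires due_date > now, the two can never fire together, which matches the "mutually exclusive by construction" note.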
Failure modes
| Failure | Behavior |
|---|---|
| Global flag off | Tick exits at step 1; no work performed. |
| Org flag off | Org skipped at step 2; no record created. |
| AccessGate denies LLM | Heuristic-only briefing generated; gate records a denial event. |
| LLM call times out / errors | One retry; if still failing, heuristic-only; AccessGate.RecordError. |
| LLM returns invalid JSON | One retry with stricter prompt; if still invalid, heuristic-only. |
| LLM returns unknown task_ids | Drop unknown items; if all items dropped, heuristic-only. |
| LLM reason fails synonym match | Drop just that item; row renders templated heuristic text; synonym_mismatch counter increments. If all items dropped, heuristic-only. |
| Lease conflict (race between replicas) | UNIQUE violation on placeholder INSERT; loser skips org with no LLM/gate side-effects. |
| Lease holder crashes mid-generation | Row stays in generation_mode='pending'; recovery clause re-claims after 5 min on a later tick. |
| Finalize UPDATE error | Logged; row stays pending; recovery clause retries on a later tick. |
| Member-scan returns no teams | Continue with empty team_ids; supervisor filter excludes that task from supervisor view. |
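The per-row rejection paths in the table (unknown task_id, synonym mismatch) and the "all items dropped" case can be sketched as one validation function. Everything here is illustrative: the type names, the ValidateLLMResponse signature, the synonym lists, and the reason_too_long label are assumptions; the spec only names unknown_task_id and synonym_mismatch and keeps the real synonym lists in at_risk_heuristics.go.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

type RankedItem struct {
	TaskID string `json:"task_id"`
	Reason string `json:"reason"`
}

type LLMResponse struct {
	SummaryMD string       `json:"summary_md"`
	Ranked    []RankedItem `json:"ranked"`
}

// synonyms is a hypothetical list used only for this sketch.
var synonyms = map[string][]string{
	"overdue":        {"overdue", "past due", "late"},
	"due_soon":       {"due soon", "due in", "deadline"},
	"stale_activity": {"no activity", "stale", "inactive"},
	"approval_stuck": {"awaiting", "approval", "stuck"},
}

// mentionsRule does a case-insensitive substring match against the synonyms
// of the rule keys that fired for the task.
func mentionsRule(reason string, ruleKeys []string) bool {
	lower := strings.ToLower(reason)
	for _, key := range ruleKeys {
		for _, syn := range synonyms[key] {
			if strings.Contains(lower, syn) {
				return true
			}
		}
	}
	return false
}

// ValidateLLMResponse parses raw and keeps only rows that pass the per-item
// checks. shortlist maps each shortlisted task_id to its heuristic_reasons.
// rejected counts dropped rows by reason. A non-nil error means the whole
// response is unusable and the caller should retry or fall back.
func ValidateLLMResponse(raw []byte, shortlist map[string][]string) (LLMResponse, map[string]int, error) {
	rejected := map[string]int{}
	var resp LLMResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return LLMResponse{}, rejected, fmt.Errorf("invalid JSON: %w", err)
	}
	if n := utf8.RuneCountInString(resp.SummaryMD); n == 0 || n > 1000 {
		return LLMResponse{}, rejected, errors.New("summary missing or longer than 1000 chars")
	}
	kept := make([]RankedItem, 0, len(resp.Ranked))
	for _, item := range resp.Ranked {
		rules, known := shortlist[item.TaskID]
		switch {
		case !known:
			rejected["unknown_task_id"]++
		case utf8.RuneCountInString(item.Reason) > 200:
			rejected["reason_too_long"]++ // hypothetical label, not in the spec
		case !mentionsRule(item.Reason, rules):
			rejected["synonym_mismatch"]++
		default:
			kept = append(kept, item)
		}
	}
	if len(kept) == 0 {
		return LLMResponse{}, rejected, errors.New("all ranked items dropped")
	}
	resp.Ranked = kept
	return resp, rejected, nil
}

func main() {
	raw := []byte(`{"summary_md":"Two tasks need attention.","ranked":[{"task_id":"t1","reason":"Overdue by 2 days."},{"task_id":"tX","reason":"Overdue."}]}`)
	resp, rejected, err := ValidateLLMResponse(raw, map[string][]string{"t1": {"overdue"}})
	fmt.Println(len(resp.Ranked), rejected, err)
}
```

Lengths are counted in runes rather than bytes so that zh output is measured in characters, which is an interpretation of "chars" in the validation rules.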
Cost story
One LLM call per org per day. At 100 active orgs × (~1k input + ~600 output tokens) per call ≈ 160k tokens/day platform-wide. AccessGate enforces per-org caps automatically, so a buggy schedule config is rate-limited before it causes runaway spend.
Read Path & API Surface
Three endpoints under the existing fieldforce route group, mounted behind the new FieldforceBriefingFeatureFlagMiddleware (checks key='fieldforce.briefing' in both tiers).
| Method | Path | Purpose |
|---|---|---|
| GET | /api/organizations/:org_id/fieldforce/briefings/latest | Most recent briefing (today's, else latest stored) |
| GET | /api/organizations/:org_id/fieldforce/briefings/:id | One briefing by id — full at_risk array |
| GET | /api/organizations/:org_id/fieldforce/briefings?cursor=… | Paginated history — omits at_risk array; tap-through to by-id for the full list |
Role-based read access
Admin / Owner / Manager (org-wide)
- Full at_risk array
- visible_at_risk_count = total_at_risk_count
- Full summary_md narrative
Supervisor (team-scoped)
- at_risk filtered: team_ids ∩ supervisor.team_ids ≠ ∅
- total_at_risk_count stays org-wide; visible_at_risk_count reflects the filter
- Same shared summary_md (Approach A trade-off)
- Filter uses current team membership (fresh from member-scan), so role changes take effect immediately on the snapshot
Field Team (403)
- Not their surface in Phase 4
- Considered for Phase 5 (mobile personal briefing)
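The supervisor filter is a set intersection between each item's denormalized team_ids and the supervisor's current teams. The sketch below uses hypothetical names (AtRiskItem, FilterForSupervisor) and assumes the supervisor's team list has already been resolved by the member-scan helper.

```go
package main

import "fmt"

// AtRiskItem carries only the fields the filter needs (hypothetical shape).
type AtRiskItem struct {
	TaskID  string
	Rank    int
	TeamIDs []string
}

// FilterForSupervisor returns the items that share at least one team with the
// supervisor, in their original rank order, plus the org-wide total.
func FilterForSupervisor(items []AtRiskItem, supervisorTeamIDs []string) (visible []AtRiskItem, total int) {
	teams := make(map[string]struct{}, len(supervisorTeamIDs))
	for _, id := range supervisorTeamIDs {
		teams[id] = struct{}{}
	}
	visible = make([]AtRiskItem, 0, len(items))
	for _, item := range items {
		for _, id := range item.TeamIDs {
			if _, ok := teams[id]; ok {
				visible = append(visible, item)
				break // one shared team is enough
			}
		}
	}
	return visible, len(items)
}

func main() {
	items := []AtRiskItem{
		{TaskID: "t1", Rank: 1, TeamIDs: []string{"team-a"}},
		{TaskID: "t2", Rank: 2, TeamIDs: []string{"team-b"}},
		{TaskID: "t3", Rank: 3, TeamIDs: nil}, // member-scan returned no teams
	}
	visible, total := FilterForSupervisor(items, []string{"team-a"})
	fmt.Println(len(visible), total)
}
```

An item with empty team_ids never intersects any supervisor's teams, which reproduces the "Member-scan returns no teams" row of the failure table: the task stays visible org-wide but drops out of every supervisor view.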
Response JSON shape
{
  "id": "...",
  "briefing_date": "2026-05-22",
  "generated_at": "2026-05-22T08:00:14Z",
  "generation_mode": "llm",
  "locale": "en",
  "summary_md": "In the last 24 hours, 12 tasks were created and 8 completed...",
  "metrics": {
    "tasks_created": 12,
    "tasks_completed": 8,
    "tasks_approved": 6,
    "tasks_overdue": 3,
    "notable_events": {
      "escalation_fired": 2,
      "task_rejected": 0,
      "cancelled_after_activity": 0,
      "approval_stalled": 0,
      "repeated_resubmissions": 0
    }
  },
  "at_risk": [
    {
      "task_id": "...",
      "rank": 1,
      "heuristic_score": 60,
      "heuristic_reasons": ["overdue", "stale_activity"],
      "llm_reason_text": "Overdue by 2 days with no activity since assignment.",
      "assignee_user_ids": ["..."],
      "team_ids": ["..."]
    }
  ],
  "total_at_risk_count": 7,
  "visible_at_risk_count": 7
}
Pagination
Opaque base64 cursor of {briefing_date, id}. Default page size 20, max 50. History list omits the at_risk array; tap-through hits the by-id endpoint for the full list.
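One way to build the opaque cursor is to JSON-encode the {briefing_date, id} pair and wrap it in URL-safe base64. The sketch below is an assumption about the encoding, not a confirmed wire format; Cursor, EncodeCursor, DecodeCursor, and ClampPageSize are hypothetical names. The page-size limits are the ones stated above.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// Cursor identifies the last row of the previous page.
type Cursor struct {
	BriefingDate string `json:"briefing_date"`
	ID           string `json:"id"`
}

// EncodeCursor serializes the cursor as unpadded URL-safe base64 over JSON.
func EncodeCursor(c Cursor) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", fmt.Errorf("encode cursor: %w", err)
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// DecodeCursor reverses EncodeCursor and rejects incomplete cursors.
func DecodeCursor(s string) (Cursor, error) {
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return Cursor{}, fmt.Errorf("decode cursor: %w", err)
	}
	var c Cursor
	if err := json.Unmarshal(b, &c); err != nil {
		return Cursor{}, fmt.Errorf("decode cursor: %w", err)
	}
	if c.BriefingDate == "" || c.ID == "" {
		return Cursor{}, errors.New("decode cursor: missing briefing_date or id")
	}
	return c, nil
}

// ClampPageSize applies the default of 20 and the maximum of 50.
func ClampPageSize(requested int) int {
	switch {
	case requested <= 0:
		return 20
	case requested > 50:
		return 50
	default:
		return requested
	}
}

func main() {
	token, err := EncodeCursor(Cursor{BriefingDate: "2026-05-22", ID: "b-123"})
	if err != nil {
		panic(err)
	}
	decoded, err := DecodeCursor(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded.BriefingDate, decoded.ID, ClampPageSize(0))
}
```

Carrying both briefing_date and id lets the history query resume with a strict (briefing_date, id) comparison, so pages stay stable even if new briefings are inserted between requests.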
Frontend (SolidStart Panel)
File tree
frontend/solidstart/apps/panel/src/features/fieldforce/briefings/
  api.ts  # TanStack Query hooks
  components/
    BriefingWidget.tsx  # dashboard card
    AtRiskList.tsx  # ranked task cards
    MetricsStrip.tsx  # 5-number strip
    BriefingHistoryList.tsx  # history page list
    BriefingsAdminPanel.tsx  # settings card inside fieldforce admin config
  pages/
    history.tsx  # /fieldforce/briefings
frontend/solidstart/apps/panel/src/features/ai/prompts/
  api.ts  # TanStack Query hooks for ai_prompt_templates
  components/
    PromptTemplateList.tsx  # group by feature_key × locale; show active version + history
    PromptEditor.tsx  # template body editor with placeholder reference panel
    PromptVersionDiff.tsx  # side-by-side diff between any two versions
    PromptActivateModal.tsx  # confirms activation + records audit reason
  pages/
    index.tsx  # /admin/ai/prompts (platform admin only)
Briefing widget
Today's Briefing
AI · LLM
22 May 2026 · generated 08:00 SGT
In the last 24 hours, 12 tasks were created and 8 completed. Two escalations fired overnight on Block C. Three tasks are currently overdue and require immediate attention.
Install fire suppression panel — Block C
Overdue by 2 days with no field activity since assignment.
Electrical inspection — Unit 4B
Awaiting manager approval for 51 hours.
HVAC filter replacement — Tower A
Due in 18 hours. Status still pending, no check-in recorded.
All surfaces
- Briefing widget: on the existing fieldforce dashboard at /fieldforce. Layout: header (date + mode badge), summary_md rendered as markdown, MetricsStrip, AtRiskList, "View history" link.
- History page: /fieldforce/briefings, a paginated list of past briefings with snippet + counts.
- Admin settings: new "Briefings" section inside the existing fieldforce admin config screen: per-org enable, schedule time, timezone picker, retention days, at-risk thresholds (collapsed by default), notify checkboxes.
- Notification dropdown: handler for briefing_generated event type, deeplinks to /fieldforce#briefing. Text: "Today's briefing is ready — 3 tasks at risk" (i18n'd).
- Prompt admin (platform admin only): /admin/ai/prompts. Lists all (feature_key, locale) rows with their active version. Click → version history + diff view. Edit → creates new draft version. "Activate" → promotes draft + audit log entry. No "Run eval" button in v1 (eval CLI deferred to a follow-up spec).
At-risk card rendering
- Task title + assignee chips.
- Reasoning: prefer llm_reason_text if present; otherwise build templated text from heuristic_reasons (e.g. "Overdue by 2 days · no activity in 4 days").
- Score chip, color-coded by score bucket: 0–24 green · 25–49 amber · 50+ red.
- Click → task detail page.
Empty states
- Today's briefing not yet generated: "Today's briefing will arrive around 08:00 (Asia/Kuala_Lumpur)."
- Per-org flag off: Widget hides entirely; admin sees the settings card instead.
- No fieldforce activity: Heuristic-only fallback summary still renders: "No activity in the last 24 hours."
Testing Strategy
| Layer | Tests |
|---|---|
| Heuristic scoring | Pure-function unit tests in at_risk_heuristics_test.go — every rule with edge cases (overdue exactly at threshold, no activity exactly at boundary, multiple rules firing on one task, context flags don't surface a task alone). |
| Use case | generate_briefing_test.go with stubbed AccessGate + stubbed LLM provider — covers allowed/denied/timeout/invalid-JSON/unknown-task-id/synonym-mismatch paths; confirms heuristic fallback fires correctly in each case. |
| Repository | Integration tests against real PG container — UNIQUE violation handling (lease conflict), cascade delete on retention sweep, batch insert behavior. |
| HTTP handlers | Seeded fixtures with role matrix (admin/owner/manager/supervisor/field_team) — team-filter correctness, 403 for field_team, pagination cursor round-trip. |
| Prompt service | prompt_service_test.go — CreateDraft increments version monotonically; Activate does transactional swap (partial-unique-index integrity holds under concurrency); platform-admin-only edit/activate; rollback to any prior version works; audit-log entry recorded on activation. |
| Frontend components | Vitest for BriefingWidget, AtRiskList, MetricsStrip, PromptEditor, PromptVersionDiff. |
| Frontend e2e | Playwright #1: widget loads after cron generation; supervisor sees only team items; admin sees all. Playwright #2: platform admin edits a prompt, draft saved, activate flow shows in audit log. |
| Eval set | Deferred to follow-up spec. Staging dark-launch is the v1 validation. A skeleton evals/briefing/cases.jsonl may be checked in as scaffolding; no CLI consumes it in Phase 4. |
Rollout Sequence
1. Ship migration: Creates ff_briefings, ff_briefing_at_risk, ai_prompt_templates, ff_briefing_retention_runs. Seeds flag OFF. Seeds v1 prompt rows for fieldforce:briefing × {en, zh, ms} from seed_prompts/. Feature invisible: cron exits at step 1, routes 403, widget hides.
2. Staging dark-launch: Platform admin flips DB flag to true in staging. Run for a week. Prompt tuning is DB-only, with no PR cycle: edit in /admin/ai/prompts → review checklist → activate when satisfied.
3. Prod dark launch: Flip DB flag to true in prod. Briefings begin generating; widget surfaces. Watch error rates, AccessGate cost, lease/recovery behavior under worker restart.
4. Watch & tune: Observe AccessGate fieldforce:briefing usage/cost. Tune defaults (shortlist_max, no_activity_days) based on real org data. Prompt iterations stay DB-only post-GA. Kill-switch: flip the same DB flag back to false; instant, all orgs.
Observability
- AccessGate already records per-feature usage and per-call errors; Phase 3 dashboards show fieldforce:briefing rows automatically.
- Structured cron logs: briefing_generated{org_id,mode} · briefing_skipped{org_id,reason} · briefing_llm_failed{org_id,err} · briefing_persist_failed{org_id} · briefing_retention_swept{count}.
- Counter fieldforce_briefings_total{mode} (llm vs heuristic_only): answers "are we using AI as expected, or always falling back?"
- Counter fieldforce_briefing_llm_rows_rejected_total{reason} where reason ∈ {unknown_task_id, synonym_mismatch}: answers "is the LLM reasoning matching the data?" A persistently high synonym_mismatch rate signals that the prompt or the synonym list in at_risk_heuristics.go needs tuning.
Locked Decisions
All six open questions were resolved during brainstorming. Recorded here so reviewers can rely on them without re-litigating.
BYOK vs platform model preference
BYOK if configured, else platform default.
fieldforce:briefing calls prefer the org's configured BYOK model when present. Falls back to the platform default model when no BYOK key is set. Matches the established fieldforce:parse_task behavior from Phase 3 — AccessGate resolves this automatically.
Rejected: always-platform-default (ignores org's BYOK investment), always-BYOK (breaks orgs without BYOK configured).
Notable events taxonomy
Five event types counted in notable_events: escalation_fired, task_rejected, cancelled_after_activity, approval_stalled, repeated_resubmissions.
These cover the core manager-visible friction points. approval_stalled is a derived condition (status=completed AND updated_at + 48h < now), computed at briefing generation time — not a real-time event. Out of scope for v1: attachment audits, priority/due-date edits, reassignments, bulk creation, AI-driven creation counts.
Deferred to Phase 5: attachment audit flags, AI-driven creation tracking, priority change events.
Supervisor with no team at-risk coverage
Suppress the briefing_generated notification. Widget still accessible manually.
A supervisor whose team has zero at-risk items should not receive the bell — it would be noise. The widget remains reachable via direct navigation; only the proactive push is suppressed.
Rejected: always-notify-all-supervisors (noise for supervisors with nothing flagged).
Retention default
90 days default. "Forever" / compliance-grade retention deferred to Phase 5.
90 days covers reasonable operational history. Compliance-grade mode needs its own design surface — customer demand not yet surfaced.
Deferred: unlimited retention or compliance-grade mode (Phase 5 if demand surfaces).
Localization granularity
One locale per org (briefing.locale config). Per-recipient locale is out of scope for v1.
The per-recipient locale trade-off (LLM cost × unique locales in the org) is not worth solving at one call per org per day. The entire org sees summary prose in one language; UI chrome strings are still i18n'd per user preference via Paraglide.
Deferred: per-recipient locale based on user preference (Phase 5).
First-run experience
No "generate sample now" button. First briefing generates the next morning at the org's scheduled local time.
An on-demand regen button would expand the daily-snapshot-only architecture, introducing out-of-schedule cost and idempotency complications. Prompt validation does not need it: staging dark-launch is the v1 validation path, and the eval set is deferred to a follow-up spec.
Deferred: on-demand generation (Phase 5+ candidate).