ADR-002: UX and Distribution Model

Status: Proposed
Date: 2026-03-22
Context: Bernstein needs a user interaction model that balances fast onboarding, daily ergonomics, monitoring visibility, and enterprise readiness.

Options Evaluated

Option A: CLI-only

pip install bernstein
bernstein init
bernstein -g "Build a REST API for user management"
bernstein status
bernstein add-task "Add rate limiting"

Dimension	Rating	Notes
Time to first value	Fast (2-3 min)	pip install, one command, agents start working
Ongoing friction	Medium	Fine for starting work, poor for monitoring 6+ agents live
Monitoring capability	Weak	`bernstein status` is a snapshot, not a live view. Polling via `watch` is crude.
Enterprise readiness	Low	No audit trail UI, no access controls, no team visibility
Implementation cost	Low	Click + Rich already in deps. 1-2 weeks for solid CLI.

Verdict: Necessary as the foundation layer. Every other option builds on top of this. But insufficient alone once you have more than 2-3 agents running.

Option B: CLI + Web Dashboard

bernstein --dashboard
# Opens localhost:8080 with real-time agent status, task board, cost tracker

Dimension	Rating	Notes
Time to first value	Fast (2-3 min)	Same as CLI, dashboard is opt-in bonus
Ongoing friction	Low	Visual Kanban board, live agent heartbeats, cost meter. Replaces constant `status` polling.
Monitoring capability	Strong	Real-time WebSocket updates, task flow visualization, cost burn-down.
Enterprise readiness	Medium	Visual audit trail, but still single-user localhost. Needs auth layer for team use.
Implementation cost	Medium-High	FastAPI backend already exists (task server). Frontend is the cost: React/Svelte + WebSocket + task board UI. 3-5 weeks for a useful v1.

Verdict: High value but should be Phase 2, not Phase 1. The task server already serves JSON at localhost:8052 -- a dashboard is a natural read-only layer on top of it. Risk: frontend maintenance burden. Mitigation: keep it minimal (status + task board + cost), no complex state management.
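Since the dashboard is framed as a read-only layer over the task server's JSON, much of the work is reshaping that JSON for display. The following is a minimal sketch under stated assumptions: the ADR does not specify the task payload, so the `title` and `status` fields and the three column names are hypothetical.

```python
# Hypothetical column names; the ADR does not define the task server's schema.
COLUMNS = ("backlog", "in_progress", "done")


def group_tasks(tasks: list[dict]) -> dict[str, list[str]]:
    """Group task titles into board columns without mutating the input."""
    board: dict[str, list[str]] = {column: [] for column in COLUMNS}
    for task in tasks:
        status = task.get("status", "backlog")
        # Unknown statuses land in the backlog so no task silently disappears.
        column = status if status in board else "backlog"
        board[column].append(task.get("title", "<untitled>"))
    return board


if __name__ == "__main__":
    sample = [
        {"title": "Add rate limiting", "status": "in_progress"},
        {"title": "Write API docs", "status": "backlog"},
    ]
    print(group_tasks(sample))
```

Keeping this as a pure function means the same grouping could feed either the CLI status output or a browser view, which fits the "no complex state management" mitigation.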

Option C: "Seed" concept -- declarative config file

# bernstein.yaml
goal: "Build a REST API for user management"
budget: "$20"
team: auto
cli: auto    # auto-detects installed agents; or set explicitly: claude, codex, gemini, qwen

bernstein   # reads bernstein.yaml, plans, spawns, works

Dimension	Rating	Notes
Time to first value	Fastest (1 min)	Write YAML, run one word. Lowest possible friction.
Ongoing friction	Very low for starting, but high for mid-run adjustments	Great for "fire and forget." Bad for "actually I need to change the plan."
Monitoring capability	None inherent	Still needs CLI status or dashboard for visibility
Enterprise readiness	Medium	Declarative configs are version-controllable, reviewable, reproducible. Good for gitops.
Implementation cost	Low	YAML parsing + mapping to existing CLI commands. 1 week.

Verdict: Excellent as a convenience layer on top of CLI. The YAML file is not a replacement for the CLI -- it is a preset. Think docker-compose.yml: you still need docker commands, but compose gives you a one-command "bring up the whole stack" experience. This should ship with Phase 1.
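To make the "preset, not replacement" idea concrete, here is a rough sketch of how a parsed seed file might be validated and mapped onto the existing imperative command. It assumes the YAML has already been loaded into a plain dict; the `Seed` class, `parse_seed`, and `to_argv` are hypothetical names, and the accepted budget syntax ("$20", "20", or 20) is an assumption. Only the `-g` flag comes from the CLI examples in this ADR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Seed:
    goal: str
    budget_usd: Optional[float] = None  # None means no budget was given
    team: str = "auto"
    cli: str = "auto"


def parse_seed(raw: dict) -> Seed:
    """Validate an already-parsed bernstein.yaml mapping."""
    goal = str(raw.get("goal") or "").strip()
    if not goal:
        raise ValueError("bernstein.yaml needs a non-empty 'goal'")
    budget = None
    if raw.get("budget") is not None:
        # Strip an optional leading dollar sign before converting.
        text = str(raw["budget"]).strip().lstrip("$")
        try:
            budget = float(text)
        except ValueError:
            raise ValueError(f"cannot read budget {raw['budget']!r}") from None
        if budget < 0:
            raise ValueError("budget must not be negative")
    return Seed(goal, budget, str(raw.get("team", "auto")), str(raw.get("cli", "auto")))


def to_argv(seed: Seed) -> list[str]:
    """Translate the seed into the imperative command shown in Option A."""
    return ["bernstein", "-g", seed.goal]
```

Failing fast on a missing goal or unreadable budget keeps the one-command experience predictable: a bad seed is rejected before any agent is spawned.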

Option D: "Life seed" -- self-evolving watcher

.bernstein/seed.md lives in the repo. Bernstein watches it. When it changes, it re-plans.

Dimension	Rating	Notes
Time to first value	Slow (5-10 min)	User has to understand the watcher model, write a spec in the right format, trust the system to react.
Ongoing friction	Low once understood	Edit the spec, Bernstein adapts. Feels like a living document.
Monitoring capability	Unclear	The watcher itself needs monitoring. Who watches the watcher?
Enterprise readiness	Low	Implicit control flow is hard to audit. "Why did it start doing X?" -> "Because someone edited seed.md line 47." Non-obvious causality.
Implementation cost	Medium	File watcher (watchdog/inotify), diff detection, re-planning logic. 2-3 weeks.

Verdict: Interesting but dangerous. The rag_challenge experience showed that implicit triggers cause confusion. Our agents worked best when commands were explicit and state was inspectable. A file-watcher that re-plans on every edit risks runaway re-planning, wasted budget, and "I saved a typo fix and it respawned 5 agents." This is a Phase 3+ experiment, not a core interaction model. If pursued, it needs a confirmation step ("Detected spec change. Re-plan? [y/N]").

Option E: GitHub-native (GitHub App)

Create issues with bernstein label, agents pick them up, produce PRs.

Dimension	Rating	Notes
Time to first value	Slow (15-30 min)	Install GitHub App, configure permissions, create first labeled issue, wait for agent pickup.
Ongoing friction	Low for teams	Familiar issue/PR workflow. Humans review PRs as normal.
Monitoring capability	Strong	GitHub's existing issue board, PR timeline, check runs. No custom UI needed.
Enterprise readiness	Highest	Audit trail built-in. Branch protection. PR reviews. RBAC via GitHub permissions. SOC2-friendly.
Implementation cost	High	GitHub App registration, webhook handling, OAuth, API rate limit management, CI integration. 6-8 weeks for production quality.

Verdict: The strongest enterprise story but the weakest developer story. The latency penalty is real: GitHub webhook -> spawn agent -> agent works -> push -> PR created is 30-60+ seconds of plumbing overhead per task. For a solo developer running Bernstein locally, this is pure friction. For a team of 10 engineers sharing a Bernstein instance, this is the correct model. This is Phase 3, and it should be a separate distribution channel, not the primary interface.

Comparison Matrix

Dimension	A: CLI	B: CLI+Dash	C: Seed	D: Life seed	E: GitHub
Time to first value	2-3 min	2-3 min	1 min	5-10 min	15-30 min
Ongoing friction	Medium	Low	Low*	Low*	Low (teams)
Monitoring	Weak	Strong	None	Unclear	Strong
Enterprise ready	Low	Medium	Medium	Low	High
Impl. cost	Low	Med-High	Low	Medium	High
Solo dev fit	Good	Great	Great	Risky	Poor
Team fit	Poor	Good	Medium	Poor	Great

*Low for steady-state. High for mid-run adjustments.

Recommendation: Layered approach (C + A + B, then E)

The options are not mutually exclusive. They are layers.

Phase 1 (MVP): CLI + Seed file

Ship with both interaction modes from day one:

# Imperative mode — full control
bernstein -g "Build a REST API"
bernstein status
bernstein add-task "Add rate limiting"

# Declarative mode — fire and forget
cat bernstein.yaml   # goal, budget, team, cli
bernstein            # reads config, does everything

The seed file (bernstein.yaml) is the "easy button." The CLI is the "control panel." Both talk to the same task server underneath.

Why this wins: With the seed file, a new user can see first results in about a minute (2-3 minutes via the CLI alone). The YAML file is shareable, version-controllable, and self-documenting. The CLI provides escape hatches for mid-run adjustments. Implementation cost is low because the CLI is already planned and the seed file only needs YAML parsing plus a mapping onto existing CLI commands (about a week).

The bernstein.yaml name echoes docker-compose.yml -- developers immediately understand the pattern: "This file describes what I want; the tool figures out how to do it."

Phase 2: Web Dashboard

Once the task server is stable, add bernstein --dashboard:

Real-time agent status grid (heartbeat indicators)
Task Kanban board (backlog / in-progress / done)
Cost burn-down chart
Log viewer per agent
Server-pushed updates over SSE, no polling

Keep it read-only initially. The CLI remains the control interface. The dashboard is for monitoring.

Technology: keep it minimal. A single-page app served by the existing FastAPI server. Prefer SSE (server-sent events) to WebSocket: the dashboard is read-only, so one-way server-to-client streaming is enough and avoids connection management complexity. Prefer HTMX or Alpine.js to React to minimize frontend maintenance surface.
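Part of SSE's appeal here is how little machinery the wire format needs. As an illustration only, the sketch below encodes events as `text/event-stream` frames using the standard library; the `heartbeat` event name and the agent payload fields are hypothetical, not a defined Bernstein protocol.

```python
import json
from typing import Iterator


def format_sse(event: str, payload: dict) -> str:
    """Encode one server-sent event frame (text/event-stream wire format)."""
    # json.dumps emits a single line by default, so one data: field is enough.
    data = json.dumps(payload, separators=(",", ":"))
    # A blank line terminates the frame.
    return f"event: {event}\ndata: {data}\n\n"


def heartbeat_stream(agents: list[dict]) -> Iterator[str]:
    """Yield one frame per agent; a real server would loop and wait for updates."""
    for agent in agents:
        yield format_sse("heartbeat", agent)


if __name__ == "__main__":
    for frame in heartbeat_stream([{"agent": "a1", "alive": True}]):
        print(frame, end="")
```

A generator like this could plausibly be handed to a streaming HTTP response with media type `text/event-stream`, and the browser side needs only the built-in `EventSource` API rather than a WebSocket client.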

Phase 3: GitHub App (enterprise distribution)

For teams and enterprises, ship Bernstein as a GitHub App. This is a separate distribution channel with its own onboarding:

Install Bernstein GitHub App on org
Add bernstein.yaml to repo (same format as local)
Create issues, label them bernstein
Agents run in Bernstein's cloud infra (or self-hosted runner)
PRs land with full provenance

This requires a hosted service or self-hosted runner infrastructure. It is a product expansion, not a feature addition.
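The entry point of the flow above is a webhook receiver that decides whether an incoming event should become a task. The sketch below shows two checks such a receiver would likely need: verifying GitHub's `X-Hub-Signature-256` header (an HMAC-SHA256 of the raw request body) and recognizing an `issues` event whose action is `labeled`. The function names are hypothetical, and how Bernstein would actually queue the task is outside the sketch.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak match length.
    return hmac.compare_digest(expected.encode(), header.encode())


def should_pick_up(event: str, payload: dict, label: str = "bernstein") -> bool:
    """True when an issue was just given the trigger label."""
    if event != "issues" or payload.get("action") != "labeled":
        return False
    # For "labeled" actions the payload carries the label that was added.
    return (payload.get("label") or {}).get("name") == label
```

Reacting only to the `labeled` action (rather than every issue event) keeps pickup explicit, in line with this ADR's preference for inspectable triggers.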

Option D (life seed): Shelved

File-watching with automatic re-planning is shelved. The risk/reward ratio is poor for the current stage. The rag_challenge experience showed that implicit state changes cause agent churn and wasted budget. Explicit commands beat implicit watchers.

If revisited later, the minimum viable version is: watch bernstein.yaml for changes, show a diff, prompt for confirmation before re-planning. Never auto-execute.
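The "show a diff, prompt, never auto-execute" rule can be sketched with the standard library alone. This is an illustration under assumptions, not a proposed implementation: the function names are hypothetical, file watching itself is left out, and the prompt is injectable so the logic can be exercised without a terminal.

```python
import difflib
from typing import Callable


def render_diff(old: str, new: str, name: str = "bernstein.yaml") -> str:
    """Return a unified diff of the seed file, or an empty string if unchanged."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (running)",
        tofile=f"{name} (edited)",
    )
    return "".join(lines)


def confirm_replan(old: str, new: str, ask: Callable[[str], str] = input) -> bool:
    """Show the diff and re-plan only on an explicit yes; the default is no."""
    diff = render_diff(old, new)
    if not diff:
        return False  # nothing changed, so there is nothing to ask
    print(diff)
    answer = ask("Detected spec change. Re-plan? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

Because an empty answer counts as "no", saving a typo fix and pressing Enter leaves the running plan untouched.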

Distribution Model

Phase 1:  pip install bernstein
          -> CLI + seed file
          -> Local only, single user

Phase 2:  pip install bernstein[dashboard]
          -> Adds web dashboard dependency
          -> Still local, but visually monitorable

Phase 3:  GitHub App + bernstein-cloud
          -> Team/enterprise distribution
          -> Hosted or self-hosted runner
          -> GitHub issues as input, PRs as output

The pip package remains the core. Dashboard is an optional extra. GitHub App is a separate product surface.
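Shipping the dashboard as an optional extra implies the core package must degrade gracefully when that extra is not installed. One possible approach, sketched below, is to check for the extra's modules before starting the dashboard and print an install hint otherwise. Which modules the extra would actually pull in is not decided in this ADR, so the caller supplies the list; both function names are hypothetical.

```python
import importlib.util
from typing import Optional


def missing_modules(required: list[str]) -> list[str]:
    """Return the top-level modules from `required` that cannot be imported."""
    return [name for name in required if importlib.util.find_spec(name) is None]


def dashboard_hint(required: list[str]) -> Optional[str]:
    """Return an install hint, or None when every dependency is present."""
    missing = missing_modules(required)
    if not missing:
        return None
    return (
        "Dashboard dependencies missing (" + ", ".join(missing) + "). "
        "Install them with: pip install 'bernstein[dashboard]'"
    )
```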

Open Questions

Should bernstein.yaml support multi-cell configs? E.g., defining cells with sub-goals. Defer until single-cell is proven.
Should the dashboard allow task creation / re-prioritization? Starting read-only is safer. Adding write operations is Phase 2.5.
Should bernstein without arguments look for bernstein.yaml AND .bernstein/ directory? Probably yes -- check for config file first, then fall back to interactive CLI prompt.
Notification model: Should Bernstein notify the user when done? (Desktop notification, terminal bell, Slack webhook?) Worth adding a notify config option in the YAML.
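The startup question above comes down to a lookup order. The following sketch shows one possible resolution, purely as an assumption for discussion: explicit arguments win, then `bernstein.yaml`, then an existing `.bernstein/` directory, then an interactive prompt. The mode names and the position of `.bernstein/` in that order are hypothetical; the ADR only suggests checking the config file first and falling back to a prompt.

```python
from pathlib import Path


def resolve_mode(args: list[str], cwd: Path) -> str:
    """Pick a startup mode for a `bernstein` invocation in `cwd`."""
    if args:
        return "imperative"  # explicit CLI arguments always win
    if (cwd / "bernstein.yaml").is_file():
        return "seed"  # declarative config found
    if (cwd / ".bernstein").is_dir():
        return "state_dir"  # hypothetical: reuse existing project state
    return "interactive"  # nothing found, so ask the user


if __name__ == "__main__":
    print(resolve_mode([], Path.cwd()))
```

Making the order a single small function keeps the behavior inspectable, which matters given how much this ADR favors explicit over implicit control flow.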

Decision

Adopt the layered approach: Phase 1 ships CLI + seed file. Phase 2 adds dashboard. Phase 3 adds GitHub App.

The seed file (bernstein.yaml) is the signature UX innovation. It makes Bernstein feel like infrastructure ("declare what you want, it happens") rather than a tool you have to babysit. Combined with the CLI for control and the dashboard for visibility, this covers solo developers through small teams. Phase 1 itself is low-cost (roughly 1-2 weeks for the CLI plus 1 week for the seed file), and the dashboard's estimated 3-5 weeks are deferred to Phase 2.