preview tier · release notes
// CHANGELOG

Release notes

Release notes for SDET Pulse — preview tier. Every shipped change to the plugin and the macOS companion app, newest first. ← back to landing

43 releases · plugin + macOS app · generated from CHANGELOG.md

0.46.0 plugin latest

Fixed

  • Honest tier/haiku accept-rate — outcomes now resolve from the per-turn session_turns ledger instead of the session's first-seen model. Taking a mid-session /model downgrade (exactly what the nudge asks) is now credited as accepted rather than silently scored rejected. accept_rate was a strict lower bound; it now reflects sessions with real per-turn evidence. Pre-v0.46 sessions without a ledger stay pending until a later history backfill.
  • pruneOldData now drops session_turns rows with a null ts. WHERE ts < ? never matches NULL (SQL three-valued logic), so turns whose transcript timestamp failed to parse were silently retained forever. Such turns are also unusable — resolvers filter ts > suggestion.ts, so null-ts turns can never contribute to an outcome. They are now dropped alongside aged rows.

Added

  • session_turns table (schema v10): one row per assistant turn (model, usage columns, per-turn cost_usd) extracted from the transcript at Stop; pruned on the 90-day rolling window. Foundation for per-model cost attribution and the honest accept-rate resolver.

0.45.0 plugin

Added

  • packs/model-tier-suggest/ — new advisory pack: one-tier-down nudges for execution-heavy sessions on premium models (Fable→Opus, Opus→Sonnet). Motivation: with Fable 5 ($10/Mtok input) as the daily driver, the only existing downgrade signal was haiku-router's binary jump to Haiku ($1) — real-world accept-rate ~0%, plausibly too big a leap. This pack offers the missing middle tiers. default_enabled: true, not sync-gated (not in ACTIVE_PACK_NAMES), UserPromptSubmit tap, advisory-only (additionalContext; the /model switch stays with the user).
    • Tier ladder derives from lib/cost.mjs — new exports modelFamily(model) + familyLadder() (families sorted by FAMILY_PRICING input rate: fable $10 > opus $5 > sonnet $3 > haiku $1). No hardcoded rates anywhere in the pack; nudge copy prices both sides live (Claude Opus handles this class of work at $5/Mtok input vs Fable $10/Mtok — consider /model opus for ~50% input-cost savings on the remainder).
    • Conservative heuristic — BOTH signals required: (1) session is tool-execution-heavy: mechanical tools (Bash/Edit/Write/Read/Grep/Glob) ≥ 70% of recorded tool_calls over ≥ 15 calls (mechanical_ratio / min_tool_calls / mechanical_tools configurable); (2) prompt is not frontier-flavored: configurable frontier_keywords (architect, design, research, prove, analyze deeply, ultracode, plan) absent, case-insensitive. Sonnet/Haiku sessions are silent (already cheap); unknown session model is silent (no fallback baseline — we never claim savings against a tier we can't identify).
    • Coordination with haiku-router: prompts that would qualify for haiku-router's own suggestion (keyword or short-prompt signal, using its EFFECTIVE threshold = pack_state config — which its auto-tuner adjusts — merged over its shipped defaults) are ceded entirely. Haiku is the deeper saving; never double-nudge.
    • Throttle is DB-durable (hooks spawn a fresh node per event — no module state): at most once per session AND once per cooldown_hours (default 4h) across sessions, both read from model_tier_suggestions itself.
  • Schema v9 — model_tier_suggestions (SCHEMA_VERSION 8 → 9, idempotent CREATE TABLE/INDEX IF NOT EXISTS like v7): one row per emitted nudge with frozen estimates (prompt_chars/4 tokens × lib/cost.mjs input rates — same convention as haiku_suggestions), outcome lifecycle pending → accepted/rejected, indexes on session_id + outcome. New insertTierSuggestion helper in lib/db.mjs.
  • lib/tier-effectiveness.mjs — lazy suggestion→outcome resolver, mirroring lib/haiku-effectiveness.mjs on its own table (haiku machinery untouched). accepted iff the session settled on the suggested family or cheaper; rejected iff it stayed on the from-tier or pricier; pending while the model is unknown. Documented honestly: sessions.model is FIRST-SEEN (SessionStart INSERT OR IGNORE; the Stop transcript backfill takes the FIRST assistant turn's model and only fills 'unknown'/NULL rows — verified in hooks/stats-tap.mjs), so a mid-session /model switch never updates the nudged session's row and the measured accept-rate is a strict LOWER BOUND on real acceptance. The same artifact depresses haiku-router's measured rate.
  • GET /api/tier-effectiveness?window= — haiku-effectiveness-shaped response (window, suggested, resolved, accepted, rejected, pending, accept_rate over resolved, potential_savings_usd over rejected, realized_savings_usd over accepted, recommendation: insufficient_data <20 resolved / low_accept <0.15 / healthy / high_accept >0.6). Resolves pending rows lazily on read; empty DB returns zeros, never 500s. No auto-tune — the tier heuristic stays hand-configured. New contract fixture test/fixtures/contract/tier-effectiveness.json wired into test/contract.test.mjs; existing endpoint shapes untouched.
  • Dashboard card + digest line. "Model tier suggest · accept-rate" card next to the Haiku router card (same .card/.metric-* styling, recommendation-colored headline, suggested/accepted/rejected/pending counts, $ left-on-the-table vs realized); weekly digest gains a Tier suggest: N suggested, X% accepted, $Y potential savings unrealized line under the Haiku router line.
  • Tests: test/pack-model-tier-suggest.test.mjs (fires on Fable + mechanical-heavy with cost.mjs rates and 50%/40% copy, Opus→Sonnet, est-math row freeze, silent on Sonnet/Haiku/unknown model, cedes haiku-eligible short + keyword prompts incl. tuned pack_state threshold, frontier-keyword silence, below-min-calls + low-ratio silence, once-per-session throttle, cross-session DB cooldown incl. expiry, alert row, no-db/session/prompt guards), test/tier-effectiveness.test.mjs (rate table incl. bare family names + [1m]/dated ids, classify accepted/cheaper/rejected/pending incl. 'unknown' placeholder + malformed to_model, resolver persistence + idempotence, metric counts/savings/recommendation tokens/window cutoff), test/api-tier-effectiveness.test.mjs (empty-DB no-500, outcome resolution incl. missing session, pending-excluded accept_rate + per-outcome sums, all four recommendation tokens, window filter + invalid-window fallback, write-back persistence), schema v9 assertions in test/db.test.mjs (fresh table/indexes, insert helper, simulated v8→v9 reopen with data intact, version pins 8→9). /api/packs/details count 23 → 24. 837 tests, 0 fail.

0.44.0 plugin

Fixed

  • Automatic historical-cost backfill on upgrade — testers' dashboards get the v0.43.0 corrected accounting without manual steps. v0.43.0 fixed the pricing table (Opus $15/$75 → $5/$25, Fable 5 + opus-4-8 added) and the operator DB was corrected by hand via the operator-only bin/backfill-costs.mjs — but that script never ships in the tester tarball, so the three testers' DBs still carried ~3x-inflated historical Opus costs and $0 opus-4-8/fable rows. The core backfill now lives in lib/backfill-costs.mjs (ships in the tarball) and the daemon runs it ONCE automatically at bootstrap (startServer, right after the DB opens), gated by a new schema_meta.cost_backfill_v044_done flag (migration v8, ALTER TABLE guarded like v5; SCHEMA_VERSION 7 → 8). One log line with before/after totals via logError; any failure is logged, never crashes the daemon, and leaves the flag unset so the next boot retries (the backfill is idempotent). bin/backfill-costs.mjs remains as a thin operator-only CLI wrapper (keeps --dry-run).
    • haiku_suggestions frozen estimates recomputed too — in the same transaction. Rows written before v0.43.0 froze est_opus_cost_usd at the wrong $15/Mtok rate; the backfill recomputes est_opus_cost_usd + est_haiku_cost_usd from the stored prompt_chars (deterministic: tokens = chars/4 × current input rates, baseline from the suggestion's session model, conservative opus-4-8 fallback — same formula as the pack's estimateSuggestionCost). Rows without prompt_chars are left untouched. Keeps potential_savings_usd aggregates honest instead of zeroing or carrying old-rate estimates.
  • cost-control token-estimate divisor derived from lib/cost.mjs. tokensRough in the live cost-ceiling potential-saving divided by a stale hardcoded $15 "approx Opus input rate"; it now divides by the session model's current input rate via getPricing (opus-4-8 family default $5/Mtok when the model is unknown), so the tokens_saved column reflects real current rates.
  • Dashboard Total-Savings breakdown can no longer render a negative subscription line. The subscriptionSavings - cacheSavings - compactSavings breakdown term is clamped with Math.max(0, …) — on low-usage windows the difference could go below zero and display as a nonsense negative saving.
  • Tests: new test/backfill-auto.test.mjs (suggestions recompute math incl. fable/opus/fallback baselines + no-prompt_chars untouched + dry-run, flag set after first run, second-boot no-op short-circuits before the backfill, failure leaves flag unset for retry and never throws, startServer end-to-end wire-in) plus schema v8 assertions. 803 tests, 0 fail.

0.52.0 macOS app

Fixed

  • Parallels processes (prl_*) now recognized as critical in kill-hogs. KillHogsPlanner.isCriticalProcess matched the keyword "parallels", but real Parallels process names are prl_disp_service / prl_vm_app — the keyword never matched them, so a Parallels VM ranked as a top memory hog arrived pre-checked in the kill picker. Added "prl_" to criticalSubstrings; Parallels rows are now listed-but-unchecked like Docker/QEMU. Tests cover prl_disp_service, prl_vm_app and the preloader non-match (no underscore). Info.plist 0.51.0 → 0.52.0.

0.43.0 plugin

Fixed

  • Corrected Opus pricing: $15/$75 → $5/$25 — every recorded Opus session cost was ~3x overstated. Opus has billed $5 input / $25 output per Mtok since Opus 4.5 (Nov 2025); the table in lib/cost.mjs was still carrying the Opus 4.1-era $15/$75 into claude-opus-4-7/-4-6 (verified against platform.claude.com/docs/en/about-claude/pricing). Cache multipliers unchanged (creation 1.25×, read 0.10× — now 0.10×$5 = $0.50/Mtok cache read on Opus, not $1.50).
    • bin/backfill-costs.mjs (new, operator-only — listed in OPERATOR_ONLY_BIN, never ships in the tester tarball) recomputes sessions.cost_usd from the stored token columns for every model-resolvable session; unresolvable models (NULL/'unknown'/foreign ids) keep their stored value. Idempotent; --dry-run prints the summary without writing. Applied to the operator DB: total recorded cost $33,975.96 → $14,609.89, 110 sessions rewritten, 3,318 skipped-unknown (zero-token ghost rows).
    • Percentage metrics were NOT affected — efficacy deltas (e.g. the −20% avg-cost-per-session claim) compare cohorts priced with the same table, so the constant rate factor cancels in the ratio. Only absolute $-figures change.
  • Unknown models are no longer silently $0. computeCost/getPricing now fall back to per-family default rates when the exact id misses the table (/claude-(fable|opus|sonnet|haiku)/ → fable $10/$50, opus $5/$25, sonnet $3/$15, haiku $1/$5), covering dated variants (claude-fable-5-20260609) and future minor bumps (claude-opus-4-9). A truly unknown family still returns $0 but now logs cost:unknown-model <id> via logError (once per process) so pricing-table drift is visible in error.log/doctor instead of invisible under-counting. New resolvePricing(model) export reports provenance (exact/family/null).
  • Dashboard $-figures derive from lib/cost.mjs instead of hardcoded rate cards. The tokenizer-estimator card ("$15 / $75" / "$3 / $15"), the Total-Savings + Reconciliation RATES = {input: 15, output: 75} tables, the context-tax $1.50/Mtok cache-read rate and the /compact simulator's CR_RATE_PER_MTOK = 1.50 all now compute from the window's dominant model via getPricing + exported CACHE_CREATION_MULTIPLIER/CACHE_READ_MULTIPLIER (fallback: current-era Opus). Stale $1.50/M vs $15/M wording in cache-ttl-aware fixed; README + landing (EN/PL) "15x cheaper than Opus" claims corrected to the real input-rate ratios (5–10× vs Opus/Fable).
  • lib/savings-helper.mjs fallback model re-pointed claude-opus-4-7claude-opus-4-8 (current era) — savings estimates for sessions without a recorded model now price at $5/Mtok input.

Added

  • Claude Fable 5 pricing — $10 input / $50 output per Mtok. Released 2026-06-09; sessions recorded as claude-fable-5 (or claude-fable-5[1m] from SessionStart) were priced $0 until now. Also added the missing claude-opus-4-8 entry (operator's opus-4-8 sessions were $0 — recovered ~$2,282 of recorded usage in the backfill).
  • haiku-router dynamic pricing. The frozen OPUS_INPUT_USD_PER_MTOK = 15 / HAIKU_INPUT_USD_PER_MTOK = 1 constants are gone — the pack now reads both sides from lib/cost.mjs, with the baseline taken from the session's ACTUAL model (sessions row; conservative fallback claude-opus-4-8 when unrecorded). The nudge copy computes its multiplier live: *"Haiku 4.5 input $1/Mtok vs Fable 5 $10/Mtok — 10x savings"* on a Fable session, 5x on Opus (the old copy claimed a now-impossible 15x). haiku_suggestions.est_opus_cost_usd keeps its column name for schema-v7 stability but semantically holds the *non-Haiku baseline* cost computed from the actual model — documented at the write site.
  • Tests: corrected-rate assertions across cost/savings-helper/pack-haiku-router/pack-cost-control, plus new coverage for Fable 5 rates + cache-multiplier math, family fallback (dated fable / future opus / legacy dated sonnet), resolvePricing provenance, unknown-model logged-once + falsy-model-quiet, dynamic haiku-router multipliers (10x Fable / 5x Opus) and the backfill script (recompute / skip-unknown / idempotent / dry-run). 797 tests, 0 fail.

0.51.0 macOS app

Fixed

  • Memory Relief / Kill Top Memory Hogs no longer blind-kills Docker/VM — selectable confirmation with container/VM unchecked by default. The old "Kill Top Memory Hogs" action (and step 2 of the Memory Relief bundle) filtered the process snapshot by a tiny system-exclusion set, took prefix(3) by phys_footprint, and immediately SIGTERM'd them. On any dev box Docker Desktop's Apple-Virtualization VM (2–8 GB) is permanently a top hog, so the "free RAM" button silently destroyed every running container.
    • KillHogsPlanner (new, pure/UI-free)isCriticalProcess(_:) flags container/VM infrastructure by case-insensitive substring (docker, vpnkit, qemu, virtualiz, containerd, podman, colima, vmware, parallels, utm, com.apple.virtualmachine). Substring matching is truncation-proof — proc_name() clips to ~16 chars, so com.docker.backend arrives as com.docker.back and still matches. buildCandidates(from:topN:displayCount:) hard-excludes the system set + self + pids < 100, default-checks the top topN non-critical hogs (a critical row at rank 1 no longer consumes a checked slot), and surfaces a few extra context rows. killSelected(pids:candidates:killer:) SIGTERMs only the given PIDs (killer injectable for tests).
    • KillHogsPickerSheet (new) — a SwiftUI confirmation sheet (same pattern as the Restart-App picker) listing each candidate with a checkbox, name, MB, and a red "container/VM — killing loses running state" tag on critical rows. Non-critical hogs up to topN are pre-checked; Docker/VM are listed but unchecked. "Kill Selected" SIGTERMs only the checked PIDs. The standalone "Kill top 3 memory hogs" menu item now opens this sheet instead of blind-killing.
    • Memory Relief bundle — step 2 now auto-skips critical (container/VM) processes entirely; the bundle confirm message says so. Docker is safe whether the user runs the bundle or the standalone action; it can only die by a deliberate checkbox click in the picker. SIGTERM (graceful), the existing systemExclusions hard-exclude, and the v0.40 freesMemory stats-refresh all preserved.
    • Tests: new KillHogsPlannerTests (11 tests — isCritical matches/non-matches incl. truncated + case-insensitive, Docker-as-#1 listed-but-unchecked, critical-doesn't-consume-slot, system/self/low-pid exclusion, displayCount-vs-topN, killSelected only-selected + failure/unknown labelling + empty sentinel) plus an ActionProtocolTests case for the new explicitPIDs action shape. Info.plist 0.50.0 → 0.51.0 (footer derives from the bundle). Full Swift suite (87 tests, 24 suites) green; build.sh --profile=preview succeeds (universal arm64 + x86_64).

0.42.0 plugin

Added

  • auto-update-notify — testers now get told when a newer SDET Pulse is available, across every platform. Until now nothing surfaced an outdated *daemon*: the macOS app's update checker only polled the Swift-app zip manifest, and the plugin had no self-update check at all — so a tester could run a months-old plugin indefinitely. This release closes that, and the key reach is the pure-plugin nudge which works for Windows testers who have no macOS app.
    • lib/update-check.mjs — version-check helper (pure + best-effort). compareVersions(a, b) → -1/0/1 is semver-ish (numeric dot-compare, zero-pads different lengths, and strips a leading v plus any pre-release/build suffix *before* the dot-split so 0.42.0-rc.1 never spuriously outranks 0.42.0). fetchLatestVersion(baseUrl, { fetchImpl, cacheFile, ttlMs }) GETs {baseUrl}/dist/latest.json, caches { latest, checked_at, base_url } in ~/.claude/sdet-pulse/update-check.json (atomic tmp+rename write), and only re-fetches when the cache is missing, for a different base URL, or older than the TTL (default 6h). It is strictly best-effort — on offline/non-2xx/malformed-body it returns the cached value if present, else null, and never throws into the caller. getUpdateStatus({ currentVersion, latest }) resolves { current, latest, update_available }. Base URL defaults to https://pulse-test.sdet.it.
    • /health now carries latest_version (string|null) and update_available (bool). Both are read from the cache file only — /health never blocks on a live network fetch — so it stays as fast as before. Cache absent → latest_version: null, update_available: false.
    • packs/update-nudge/ — new SessionStart advisory pack (default_enabled: true, not gated, not in ACTIVE_PACK_NAMES). On SessionStart it best-effort resolves the latest version (6h-cached) and, when the installed daemon is behind, emits a one-line additionalContext: *"SDET Pulse vX is available (you have vY). Update: re-run your install one-liner (curl … | bash / irm … | iex) then restart Claude Code."* The current version is read from package.json. It is throttled to once per 24h via last_nudge_at stored in the same update-check.json cache file (CC hooks spawn a fresh node process per event, so no module-level state could survive — the file is the only durable throttle), honors a config flag enabled (default true), and swallows all errors so it can never break SessionStart. Because it is pure plugin, it is the cross-platform reach — Windows testers get the nudge too.
    • GET /api/update-status{ current, latest, update_available, checked_at }. Cache-only read for the macapp to consume (no network, same speed as /health).
    • Tests: test/update-check.test.mjs (compareVersions across equal / a>b / b>a / different lengths / pre-release suffix; fetchLatestVersion cache-hit-skips-fetch / expired-refetches / network-error-returns-cached-or-null-not-throw / malformed-body; getUpdateStatus true/false/null), test/pack-update-nudge.test.mjs (fires when outdated, silent when current==latest, silent when ahead, throttle respected + clears after window, flag-off no-op, offline no-throw), and test/update-status-endpoint.test.mjs (/api/update-status + /health update fields, cache-seeded). The /health JSON contract fixture and the macapp health decoder were extended for the two new top-level keys.

0.50.0 macOS app

Added

  • Daemon-update banner — the menubar app now surfaces an outdated PLUGIN, not just an outdated app. The app's UpdateChecker previously polled only the macapp-zip manifest (macapp-latest.json), so a tester whose *daemon* was months behind got no signal from the app. This release adds a second, independent signal driven by the daemon's new GET /api/update-status.
    • UpdateChecker also polls http://localhost:3535/api/update-status on the same loop (a cheap localhost GET, run every tick rather than gated on the 24h zip boundary, plus on manual "Check for updates"). It mirrors the daemon's current / latest / update_available into new @Published daemonUpdateAvailable / daemonLatestVersion / daemonCurrentVersion fields and caches the last result in UserDefaults (update.daemonLatest / update.daemonCurrent) so the banner can appear on launch before the first poll. A missing daemon (connection refused) leaves the last-known state untouched — it never raises a daemon-down error from the update checker (the CC tab already owns that).
    • DaemonUpdateBannerView — a visually distinct, informational banner. Because the daemon updates via the install script (not the app's in-place zip swap), this banner deliberately has NO "Install Now": it shows "Daemon update available: v0.42.0" (+ "You're running v0.41.0") with a green plugin accent to set it apart from the blue macapp-update banner, a "How to update" button that opens an alert with the one-line install command and a Copy command action (curl form for macOS, with the Windows PowerShell irm … | iex form spelled out too), and a "View on GitHub" changelog link. Rendered in SystemRootView right below the existing macapp-update banner.
    • Localizable.strings (en + pl). New update.daemon.* keys for the banner headline, the current-version subtitle, the "How to update" button, and the alert title/body/copy/close — Polish with proper diacritics.
    • Info.plist bumped 0.49.0 → 0.50.0 (footer derives from the bundle, so it tracks automatically). The macapp/daemon contract test (ContractTests + TestHelpers) was updated for the two new /health keys (key count 12 → 14). Full Swift suite (76 tests, 23 suites) green; build.sh --profile=preview succeeds.

0.41.0 plugin

Added

  • haiku-router effectiveness tracking + conservative threshold auto-tune — the pack can finally tell whether its advice converts. Real 7-day telemetry showed haiku_prompt_candidate firing ~886×/week (the #2 most-fired alert) while only ~38 sessions actually ran on Haiku — the pack was blind to whether it was generating ~886 lines of ignored noise or the user was leaving large savings on the table. This release closes the suggestion → outcome loop end to end. All tracking is durable in SQLite (CC hooks spawn a fresh node process per event, so no module-level state could survive).
    • haiku_suggestions table (schema v7). Every time the user-prompt advisory fires, the handler also writes a pending row (session_id, ts, prompt_chars, and a frozen est_opus_cost_usd / est_haiku_cost_usd snapshot — tokens ≈ chars/4 at Opus $15/Mtok vs Haiku $1/Mtok). The write is best-effort and swallows errors so a stats failure can never break the hook. Indexed on session_id and outcome; migration is idempotent (CREATE … IF NOT EXISTS).
    • Lazy outcome resolver (lib/haiku-effectiveness.mjs). A pure, tested resolver looks up each pending suggestion's session model: ran on Haiku → accepted, ran on Opus/Sonnet → rejected, model still unknown → stays pending. It runs lazily when the endpoint/digest is queried (the session's final model isn't settled at suggestion time, and lazy resolution keeps the latency-sensitive hook path free of extra DB work). Idempotent — resolved rows are never revisited.
    • GET /api/haiku-effectiveness?window=7d. Returns suggested / resolved / accepted / rejected / pending, accept_rate (denominator = resolved, not suggested — a pending backlog can never drag the rate toward zero), potential_savings_usd (Σ of est_opus − est_haiku over rejected = dollars left on the table), realized_savings_usd (same Σ over accepted), and a recommendation. Honors the shared window helper; insufficient_data (resolved < 20) is a clean 200, never a 500. recommendation: accept-rate < 0.15 with resolved ≥ 20 → low_accept_raise_threshold; accept-rate > 0.6 → high_accept_lower_threshold; else healthy.
    • Conservative, gated auto-tune. When the metric recommends it, the tuner nudges the pack's short_prompt_chars threshold up ~20% on a low accept-rate (stop suggesting on prompts too big to be Haiku-class) or down ~20% on a high accept-rate, clamped to [50, 2000], at most one move per 3-day cooldown (persisted as last_tuned_at in the per-pack config_json) so it can't thrash on noisy week-to-week data. It emits an info alert haiku_threshold_tuned so the move is visible, and is gated behind a config flag auto_tune (default true) so it can be disabled. Clamp + cooldown are mandatory.
    • Dashboard card + digest line. A new "Haiku router — does it convert?" card shows the accept-rate as a big percent (colored by recommendation), the suggested/accepted/rejected/pending counts, dollars potential-vs-realized, and the recommendation as a one-line hint (honors the window picker). The weekly digest gains one line: Haiku router: N suggested, M% accepted, $X potential savings unrealized.

0.49.0 macOS app

Changed

  • Shared DaemonClient + generic DaemonFetchView unify every daemon-backed API tab — no behavior change. Each of the six API tabs (CC, Cost, Packs, Stats, Skills, Testers) used to re-declare the same reachable / lastError / lastRefresh @State triad, its own async fetchJSON / postJson, a (200..<300)-guarded URLRequest carrying a per-tab User-Agent, an identical 10s/30s poll loop, and a byte-for-byte "Daemon unreachable + Start daemon" panel. That copy-paste is what produced several review findings about inconsistent recoverable states. This wave centralizes all of it into one new file, bin/macapp/Pulse/DaemonClient.swift, with the full Swift test suite (76 tests, 23 suites) and the macapp/daemon contract fixtures as the safety net — same endpoints, same per-tab User-Agent strings, same 3s GET / 8s Testers-POST timeouts, same poll cadence, same Start-daemon affordance.
    • DaemonClient (@MainActor) — owns the base URL, the URLRequest builder + UA header, GET/POST with the (200..<300) guard and JSON decode, and the canonical Health / Alert decode (which still delegates to the single hand-written parseHealth / parseAlerts in App.swift, so there remains exactly one health-shape parser in the app). One instance per tab preserves the distinct per-surface User-Agent the daemon's request log keys off.
    • DaemonFetchView<Model: Decodable, …> — owns the reachable / lastError / lastRefresh triad, the poll loop (.task(id:)), and a uniform loading / empty / unreachable+StartDaemonButton / error state. The four single-model tabs (Cost, Packs, Skills, Testers) shrink to "given the decoded Model, draw the rows"; a refreshToken (bumped on window changes, .skillsListChanged, or after a successful POST) forces an immediate refetch since the wrapper now owns the loop. CC keeps its bespoke two-endpoint shape but routes both GETs through DaemonClient and drops its private CCAlertSnap in favour of the canonical Alert; Stats stays aggregator-driven but routes its one Cost-Projection GET through the client too.
    • DaemonUnreachableView — the recoverable panel the API tabs render when the daemon is down, mirroring MonitorUnreachableView (which already serves the live-monitor tabs). The recoverable state — copy + Start-daemon button — is now identical across every tab (Testers' panel padding aligns to the shared 40-pt the other tabs already used). The seven live-monitor tabs were already centralized through SystemSectionState + MonitorUnreachableView and are unchanged.

0.40.0 plugin

Added

  • landing/install.ps1 — Windows one-line installer (PowerShell mirror of install.sh). Windows testers now get the same 30-second bootstrap Mac testers get, via $env:SDET_PULSE_TOKEN="<token>"; irm https://pulse-test.sdet.it/install.ps1 | iex. It mirrors every install.sh step — token-validated manifest fetch, tarball download (with Invoke-WebRequest -OutFile so PowerShell 5.1's codepage conversion can't corrupt the .tgz), SHA256 verify-or-abort, tar -xzf extract into the plugin cache, npm install --omit=dev, install-token + sync-state.json (local-only) write, and the same claude CLI auto-detect with a LOUD manual-register banner fallback. Windows-specific: daemon auto-start is a logon Scheduled Task (schtasks /SC ONLOGON) plus an immediate hidden Start-Process (Startup-folder shortcut fallback if schtasks is blocked), and the statusbar-windows.ps1 tray companion is launched and given a Startup shortcut. ASCII-only console output ([OK]/[!]/[x]) for cp1252 consoles. Served at https://pulse-test.sdet.it/install.ps1 (text/plain; charset=utf-8).

Changed

  • bin/dashboard-server.mjs split into lib/http + lib/api/* + lib/render/* — no behavior change. The dashboard daemon had grown into a single ~1955-line file: a 580-line linear if/else router, ~25 build*/gather* HTML render functions, and all the HTTP plumbing crammed together. This wave carves it into focused modules with the daemon's full ~660-test suite as the safety net — every endpoint's status code and response shape is byte-for-byte identical (the macapp/daemon contract tests from the prior wave still pass unchanged). The bin/dashboard-server.mjs entrypoint drops from 1981 to 266 lines.
    • lib/http.mjssendJson(res, status, obj) / sendHtml / send404 collapse the 34 repeated res.writeHead(...); res.end(JSON.stringify(...)) blocks into three helpers; the bytes written are unchanged.
    • lib/render/dashboard.mjs — every build* / gather* / sparkline HTML builder plus renderDashboard and gatherCostByProject moved verbatim (pure string functions, already exported + tested).
    • lib/api/stats.mjs / health.mjs / packs.mjs / testers.mjs — the endpoint handlers, now returning values that a slim dispatcher serialises (a raw(status, json) envelope carries the non-200 paths for the pack-toggle and tester routes).
    • Slim bin/dashboard-server.mjscreateServer, readRequestBody, startServer, and a route dispatch table ([{ method, path, handler }], string or RegExp paths, evaluated in the original order). The localhost Host/Origin guard (DNS-rebinding / CSRF defense) is applied once, centrally, in the dispatcher before any handler runs; the DB is still opened once per server lifetime. Synchronous routes complete on the same tick (the dispatcher only awaits genuinely-thenable handlers), preserving the existing no-await handleRequest() call contract the unit tests rely on. The module re-exports the data shapers and render helpers (renderDashboard, handleRequest, isLocalRequest, gatherCostByProject, gatherCostProjection, gatherSkillUsage, sparklineSvg, gatherDailyBreakdown, buildTodayStrip) from their new homes so existing importers and tests are unaffected.

0.48.0 macOS app

Contract guard (macapp side) + version single-source audit

  • The macapp now decodes the shared contract fixtures in CI, completing a bidirectional guard. The daemon side already asserted (test/contract.test.mjs) that a live /health, /api/testers, /api/skill-usage, /api/cost-by-project, /api/cost-projection and /api/packs/details response carries exactly the keys pinned in test/fixtures/contract/*.json. But the macapp half — the Swift Codable structs that actually consume those payloads — had no equivalent check, so a daemon field rename could pass the Node suite and still blank a tab at runtime (a missing key decodes to nil). New bin/macapp/tests/ContractTests.swift builds a JSON payload whose keys are taken from the same descriptor files and decodes it into TesterEntry, SkillEntry, PackEntry, Project, CostProjectionResponse, and (via the hand-written parseHealth) Health. It flags drift in *either* direction — a descriptor key the struct doesn't model, a modeled field the descriptor dropped, or a field whose type changed — so a future daemon rename fails the Swift test instead of silently shipping a dead tab. The Foundation-touching logic (JSONDecoder / filesystem) lives in tests/TestHelpers.swift to keep test files on import Testing only (the _Testing_Foundation overlay ships with Xcode.app, not CLT); the fixtures directory is resolved by walking up from the working directory, with a SDET_PULSE_CONTRACT_FIXTURES override for CI. These run via the existing ./build.sh --tests harness — find ./tests -name '*.swift' picks the new file up with no harness change.
  • Version single-source audit — every version display reads the bundle. Confirmed across the whole macapp Swift source that no hardcoded version-string literal remains in any display path: About (AboutWindow), the CC-tab footer, the System-tab footer, the Check-for-Updates dialog, and the UpdateChecker comparison all read Bundle.main.infoDictionary["CFBundleShortVersionString"] (the System footer's hardcoded v0.44.0 and the CC footer were fixed in earlier waves; this wave verifies nothing regressed). The only remaining semver literals in the tree are documentation comments and an ultimate ?? "0.0.0" decode fallback — no display surface can drift from Info.plist again.

0.39.0 plugin

Release source-of-truth + macapp/daemon contract

  • release.mjs macapp subcommand — single source of truth for dist/macapp-latest.json. bin/release.mjs was daemon-only: npm run release writes latest.json from package.json, but the macapp's update manifest (macapp-latest.json), Info.plist version, and the System-tab footer were each hand-edited in separate places, so its three numbers (version / sha256 / size) drifted from the binary that actually shipped. The new macapp subcommand reads CFBundleShortVersionString (and LSMinimumSystemVersion) from bin/macapp/Info.plist and derives the artifact metadata from the built dist/SdetPulse-<v>-preview.zip, then writes dist/macapp-latest.json (version + sha256 + size_bytes + built_at + released_at + download_url, preserving the existing manifest_url / min_os / build_profile fields). It errors clearly when the preview zip is absent ("build the macapp first") rather than emitting a manifest for a binary that doesn't exist. Exported readPlistString / readMacappVersion / buildMacappLatest / runMacapp, all unit-tested.
  • Shared macapp/daemon contract fixtures. New test/fixtures/contract/*.json pin the canonical JSON key shape of every daemon endpoint the macapp decodes — health, cost-by-project, skill-usage, cost-projection, packs/details, testers. A daemon-side field rename or drop previously surfaced only as a silent nil-decode in the Swift app at runtime; now test/contract.test.mjs starts a real HTTP daemon on an ephemeral port against a throwaway SDET_PULSE_DATA_DIR (never the operator's running daemon or real DB), seeds one row per surface, and asserts the LIVE response's top-level keys — and the array-item keys — match each fixture exactly. The macapp side decodes the SAME fixtures, so both ends stay in lockstep.
  • Version-sync checks wired into npm test. New test/version-sync.test.mjs asserts the two source-of-truth pairs never drift after a release: daemon package.json version == dist/latest.json version, and Info.plist CFBundleShortVersionString == dist/macapp-latest.json version. dist/ is gitignored (artifacts are produced locally), so the check skips cleanly on a fresh checkout where no artifact exists and otherwise fails loud on the classic "bumped the version but forgot to regenerate the manifest" mistake.

0.47.0 macOS app

Fixed (onboarding resilience)

  • Monitor tabs no longer spin forever when a poll fails. The 7 live-monitor tabs (Memory, Procs, Docker, Ports, Launchd, LLM, GPU) rendered "Loading…" indefinitely when their monitor poll threw — no error, no way out. The 6 daemon-backed API tabs already showed a "Daemon unreachable" panel with a one-click Start daemon button; the monitor tabs had no equivalent, so a tester landing on Memory or Procs on a wedged machine saw a dead spinner. SystemSectionState now tracks a per-monitor *Failed flag (set whenever that monitor's poll() throws), threaded into each tab as pollFailed. When a tab has no snapshot AND its last poll failed, it renders the new shared MonitorUnreachableView — same orange triangle + recoverable StartDaemonButton the API tabs use — instead of "Loading…". A genuine first-load spinner (poll not yet returned) is unaffected; the unreachable state only appears after a real failure.

Fixed (reliability)

  • SubprocessRunner now drains stdout AND stderr concurrently before waitUntilExit(). It previously assigned standardError = Pipe() but never read it, and read stdout only after the process exited. For launchctl list, lsof -i or docker ps whose output exceeds the ~64KB OS pipe buffer, the child blocks writing into a full pipe, never exits, the 5s timeout terminate() fires, and every poll returns truncated/empty data on a busy machine. Both pipe read-ends are now drained on their own background queues started *before* waitUntilExit(), so the child runs to completion; the timeout is preserved. Covered by a new regression test (/bin/dd emitting 204800 bytes of stdout — over 3x the buffer — plus stderr; pre-fix this deadlocked and returned nil).

0.38.0 plugin

Onboarding resilience

  • install.sh now has a real end-to-end test. Until now landing/install.sh was never executed end-to-end anywhere — CI is ubuntu-only and the existing test only text-extracted two helper functions. That blind spot is exactly how two testers' onboarding breaks shipped. New test/install-sh-e2e.test.mjs runs the real install.sh against a local fixture HTTP server (a tiny tarball + latest.json + macapp-latest.json) inside a fully sandboxed HOME, with claude, launchctl, npm, open and uname stubbed on a throwaway PATH so nothing touches the real machine. It asserts the happy path (download → sha verify → extract to the canonical cache path → sync-state.json written local-onlyinstall-token written chmod 600 → plugin registration attempted → the LOUD NEXT-STEPS banner when claude is absent), that the script aborts on a deliberately-wrong sha (no extract, no token), and that a present claude stub gets marketplace add + install invoked with no manual banner. Gated on bash + tar; skips cleanly where the fixtures can't be hosted.
  • SDET_PULSE_BASE_URL override hook added to install.sh (default unchanged: https://pulse-test.sdet.it) so the e2e test can point the installer at its fixture origin. A companion SDET_PULSE_APP_CANDIDATES override lets the test isolate Claude.app discovery from the host — production installs set neither and behave exactly as before.
  • /doctor now has an ENVIRONMENT section. bin/doctor.mjs previously only inspected the SQLite DB — none of what actually broke the first testers showed up. Four new probes, each green/red with a one-line remediation hint: (a) daemon reachableGET http://localhost:3535/health; (b) LaunchAgent loaded — via statusAgent(), skipped cleanly off macOS; (c) plugin registered — the cache dir exists and holds a plugin manifest, package.json, or version subdir; (d) node on a launchd-style minimal PATH — catches the nvm/asdf-shim-only setups where the daemon can't find node at login. Every probe is dependency-injected and fully unit-tested (green + red branches) without touching the network, launchctl, or the real filesystem.

Notes

  • bin/doctor.mjs now guards its main() behind an import.meta.url check (matching install-launch-agent.mjs) so the module can be imported for testing without auto-running. CLI behaviour and exit codes are unchanged; a failing environment probe is a warning (exit 1), not a DB-corruption error (exit 2).

0.46.0 macOS app

Fixed (responsiveness)

  • Node resolution no longer freezes the menubar. findNode() falls back to /bin/zsh -l -c 'command -v node' (up to a 1.5s poll) when no static path resolves — and it was being called on the main thread from three places: the first-launch bootstrap (before its background hop) and both alert-acknowledge menu actions. That stalled the menubar for up to 1.5s on launch and on every ack click, ironically hurting the nvm/mise users the shell fallback was added to rescue. bootstrapIfNeeded now resolves node inside its background queue; acknowledgeOne/acknowledgeAll resolve + spawn off the main thread and hop back to main only for the UI refresh. No main-thread Process spawn for node resolution remains.

Fixed

  • System tab version footer is now dynamic. It hardcoded v0.44.0; it now reads CFBundleShortVersionString from the bundle (same source About and the CC tab use), so it always reflects the real build.
  • CC-tab sync indicator reflects real daemon state. CCTabView decoded flat sync_stale / sync_status keys the daemon never emits, so the indicator was always "unknown". Both call sites now share the one Health decoder (App.swift parseHealth), which reads the daemon's actual nested shape (sync_gate.stale + the sync object). Daemon unchanged.

Concurrency

  • Bootstrap double-spawn guard. First-launch bootstrapIfNeeded and the "Start daemon" button could both run install-launch-agent.mjs / launchctl load at the same time. An in-process isBootstrapping flag (claimed/released under a lock) now serializes the two paths, and the bootstrapSucceededAt write — previously written from both the background queue and the main thread — is synchronized under the same lock.

0.37.0 plugin

Security

  • DNS-rebinding / CSRF guard on the localhost dashboard daemon. The daemon binds 127.0.0.1, but handleRequest performed no Host/Origin validation — so a malicious web page the user was visiting could have its browser POST to http://localhost:3535 from the user's own machine. POST /api/testers mints a real bearer and SSHes to the VPS; GET /api/testers leaks tester emails. New isLocalRequest(req) runs at the top of handleRequest for all routes and returns 403 forbidden when (a) the Host header's host-portion is not loopback (localhost / 127.0.0.1 / ::1; the port is ignored so any configured port works) or (b) the request carries any Origin header (legit CLI / macapp traffic sends none — only a browser fetch does). Requests with no Host header (in-process tooling, unit-test mocks) are treated as local.

Fixed

  • Tester lookup is now exact-match. findTesterByName used LIKE %name%, so revoke "Raf" could revoke Rafal and the create-conflict check could false-positive on a prefix. Switched to name = ?. Covers both the create conflict-check and the revoke path (plus the testers.mjs status/revoke CLIs).
  • Window helpers deduplicated. The four near-identical implementations (windowCutoffMs + daysInWindow in the dashboard server, cutoffFor in bin/stats.mjs, daysFor in bin/digest.mjs) now share lib/window.mjs (cutoffMs / days / normalize / VALID_WINDOWS). Behavior is unchanged for every window the UI emits.
  • Removed an unused execFileSync import in packs/post-edit-validate/handler.mjs (only spawnSync is used).

Performance

  • DB opened once per server lifetime instead of once per HTTP request. startServer previously ran openDb()initSchema()seedMissingPackStates() on every request (re-checking migrations and re-reading every pack manifest), which is wasteful for a dashboard polled every few seconds. The single shared handle is reused across requests (WAL already enabled; better-sqlite3 is synchronous) and closed on server shutdown so the WAL checkpoint flushes cleanly.

0.36.0 plugin

Fixed (critical onboarding)

  • install.sh now detects whether claude CLI is present. If absent, attempts to symlink it from /Applications/Claude.app/Contents/Resources/cli/claude (or similar known locations) to ~/.local/bin/claude. If still unreachable, prints a prominent NEXT STEPS block at end of install with exact UI steps to register the marketplace manually (Settings → Plugins → Marketplace). Previously the registration step silent-skipped and testers (Rafał) thought install was broken because the plugin never appeared in Claude Code.

Why

  • Second tester (Rafał) ran v0.35.0 install.sh on macOS, plugin code landed in ~/.claude/plugins/cache/sdet-pulse-marketplace/sdet-pulse/ but Claude Code on macOS ships as a .app bundle without a claude symlink on PATH by default, so claude plugin marketplace add + claude plugin install silently no-op'd via || true. Plugin invisible in CC, tester thought install was broken. This release closes that gap with auto-discovery of the embedded CLI plus a loud "ONE MORE STEP" block as a safety net.

0.45.0 macOS app

Fixed (critical onboarding)

  • LaunchAgent bootstrap retries on next launch if it failed. Previously the macapp set pulse.launchAgent.bootstrapTried: true regardless of result; if the first attempt failed (e.g. findNode() returned nil, installer not on disk yet, launchctl exit != 0) the flag was latched and the bootstrap never retried — Paweł's stuck state. New design: only sets pulse.launchAgent.bootstrapSucceededAt: Date on success; failures retry next launch. Legacy v0.44.0 flag is cleared on first read so upgrades reset cleanly.
  • Bootstrap now logs to ~/Library/Logs/SdetPulse/bootstrap.log (chmod 644) for diagnostic trail — every attempt appends a TSV line (timestamp, event, detail) so when a tester reports "bootstrap silently failed" we can ask for the file.
  • findNode() hardened with nvm v22/v20 path aliases, asdf shims, fnm default alias, MacPorts /opt/local/bin path, plus a /bin/zsh -l -c 'command -v node' login-shell fallback (1.5s timeout). Users without Homebrew in standard locations now get auto-bootstrap.
  • One-shot NSAlert when findNode() returns its /usr/bin/env fallback — points the user at the Pulse tab "Start daemon" button OR Homebrew install instructions. Non-blocking, one per launch, ack'd on click. Bootstrap flag stays unset so the next launch retries after Node is installed.
  • startDaemon() also marks bootstrapSucceededAt on success — clicking "Start daemon" once is now treated as authoritative bootstrap success.

0.44.0 macOS app

Added

  • "Start daemon" button in the daemon-unreachable error state. One-click loads (or installs+loads) the LaunchAgent and re-checks daemon health. No more "open Claude Code and run a slash command" workaround. Wired into CCTabView, SkillsView, TestersView, CostByProjectView, and PacksSettingsView so every daemon-down screen has the same affordance.
  • Auto-bootstrap LaunchAgent on first launch — if the plist doesn't exist, macapp invokes install-launch-agent.mjs once. UserDefaults flag pulse.launchAgent.bootstrapTried prevents retry. Closes the gap for users (e.g. Paweł Cebernik) who installed before daemon v0.35.0.

0.35.0 plugin

Added

  • bin/install-launch-agent.mjs — macOS LaunchAgent installer. Writes ~/Library/LaunchAgents/it.sdet.pulse.daemon.plist (auto-start at login + restart on crash via KeepAlive: SuccessfulExit=false) and loads it via launchctl. Subcommands: default (install+load), --uninstall, --status.
  • landing/install.sh now invokes the LaunchAgent installer on macOS so the daemon is up immediately after install. No more "daemon unreachable" first-run errors for new testers.

Why

  • First tester (Paweł Cebernik) installed v0.34.0 daemon + v0.43.0 macapp and immediately hit "daemon unreachable" because nothing started bin/dashboard-server.mjs. The daemon previously only launched via /sdet-pulse:server start slash command which testers don't know about. This release closes that gap.

0.43.0 macOS app

Added

  • Testers tab — Manage preview testers from the macapp. List existing testers (name, email, token_first8, status, activity count). "Invite new tester" form generates a token via daemon v0.34.0 endpoint, syncs to VPS tokens.map, saves to local testers.db, opens the rendered email template in the default editor + copies to clipboard for instant paste into Apple Mail. Right-click row → Revoke (confirms + removes from VPS).

Why

  • Previously onboarding a tester required terminal + node bin/add-tester.mjs + manual editor open of ~/Desktop/pulse-invite-<name>-<date>.md. New flow is one tab + one button.

0.34.0 plugin

Added

  • Testers management endpoints: GET /api/testers (list with first-8 token + activity), POST /api/testers (generate token + VPS sync + return email template), POST /api/testers/:name/revoke (mark revoked + remove from VPS tokens.map). Powers the macapp Testers UI in v0.43.0. Full 47-hex token never leaves the host except inside the rendered email template field.

0.42.0 macOS app

Added

  • Packs Settings UI in Settings tab. All 22 packs listed with master toggle + 24h fire count + expandable threshold sliders (numeric configs), toggles (boolean), and text fields (string configs). Search filter. Reset-to-defaults per-control + bulk. Hot/Silent hints on packs that fire heavily or stay silent for 7+ days. Backed by daemon v0.33.0 endpoints.

Why

  • Previously pack tuning required SQLite edits or bin/cli-pack.mjs. Power-user feature surfaced into the GUI lets testers tune without leaving the app.

0.33.0 plugin

Added

  • Pack management endpoints: GET /api/packs/details (inventory with defaults + current + effective config + 24h fires), POST /api/packs/:name/state (toggle enable), POST /api/packs/:name/config (merge threshold overrides). Backs the macapp Settings → Packs UI shipping in v0.42.0.

0.41.0 macOS app

Added

  • Skills tab actions. Per-skill "Disable" via right-click/context menu + bulk "Disable all never-used" header button. Action moves the skill folder to ~/.claude/skills/.disabled/<name>/ (user-owned skills, Option 2 — reversible filesystem move) or appends to ~/.claude/sdet-pulse/disabled-skills.txt (plugin/marketplace skills, Option 1 — daemon-side skip-list). Toast on success + immediate counts refresh via NotificationCenter.skillsListChanged. The Skills tab is no longer read-only — 34 dead-weight skills can be cleared in one click without leaving the popover.
  • macOS native notifications for pack alerts via UNUserNotificationCenter. New Notifications/AlertNotifier.swift module wraps gating + content construction. Polls hook into the existing CC alert poll (no extra HTTP traffic) and surface new warn / critical alerts as system banners. Settings → Pack alerts: master toggle + per-severity toggles (Critical / Warn / Info — Critical+Warn ON by default, Info OFF). Respects macOS Do Not Disturb / Focus modes automatically. First-launch permission prompt persists granted/denied into UserDefaults so the user is never re-prompted.

Why

  • Skills tab was read-only — 34 never-used skills visible but no way to act. Now one-click reversible disable, with a defensive split: user-dir skills get moved (file-level), marketplace skills get skip-listed (so they re-appear after npm install won't undo the user's intent).
  • Alerts previously required dashboard attention. Native notifications give 10x discoverability without continuous monitoring. Per-severity toggles let users opt into info-level chatter (default off) or silence warns (e.g. while pair-programming).

0.32.0 plugin

Added

  • ~/.claude/sdet-pulse/disabled-skills.txt skip-list support in lib/skills-enumerator.mjs + gatherSkillUsage(). One skill name per line; blank lines + #-prefixed comments tolerated. Names in this file are excluded from /api/skill-usage (both the on-disk total AND tracked rows). The macapp Skills tab (v0.41.0+) writes here when the user disables a plugin/marketplace skill that can't be safely moved on disk.
  • Enumerator now skips . -prefixed directories under ~/.claude/skills/ so the .disabled/ sub-directory created by the macapp's filesystem-move path is never inadvertently surfaced.

0.40.0 macOS app

Fixed

  • Stats didn't refresh after Memory Relief / Free RAM actions. RAM, Swap, and CMPRS values stayed at pre-action levels until the next poll cycle (5-30s lag). Now: action handlers post memoryStatsShouldRefresh notification on success, and all stats-displaying views (Monitors, Memory section, status item label) subscribe to it and force-refresh immediately. User clicks Purge / Kill Heavy / Unload Ollama / Memory Relief bundle → sees fresh numbers within 1s.

0.39.0 macOS app

Fixed (critical)

  • Preview testers were seeing the old CC menu instead of the v0.34+ popover with 12 tabs. Root cause: FeatureFlags.compiledDefault returned false for the preview build profile, so setupSystemModule() was never invoked at startup. Status item stayed in the old NSMenu mode. Fix: preview profile now compiles with systemModuleEnabled = true by default — testers see the full popover (CC, Stats, Cleaners, Cost by Project, Skills, etc.) out of the box.

Added

  • Settings → Advanced → Enable system module toggle. Lets users opt out to the classic CC menu without rebuilding. Restart required (mentioned in caption).
  • install.sh now writes the UserDefaults override during install (belt-and-suspenders for any future build profile change).

0.38.0 macOS app

Added

  • Skills tab — Skills Hygiene view. Header pills: total / active / never-used counts. Filter All/Active/Stale/Never-used. Per-skill row: name + status badge + invocation count + days since use. Fetches daemon GET /api/skill-usage shipped in v0.31.0.
  • Cost Projection card on Stats tab — projected month-end spend based on configurable window rate. Trend pill vs prior period. Color-coded by amount.

Why

  • User has 38 visible skills in CLAUDE.md, only 4 ever invoked. 34 dead-weight skills inflate every turn's context. Skills tab makes them visible at a glance, supporting one-shot decision "remove these now."
  • Cost projection answers "is this month going to be expensive?" without manual math.

0.31.0 plugin

Added

  • GET /api/skill-usage — full skill inventory with status (active / stale / never_used), invocation_count, days_since_use. Cross-references CLAUDE.md skills enumerator with skill_usage table. Optional include_never_used=1 to surface dead-weight skills (real data: 34 of 38 visible skills never invoked).
  • GET /api/cost-projection — current-month burn rate + projected month-end total based on a window's rate (default 7d). Also exposes rate change vs prior period for trend signal.

0.37.0 macOS app

Added

  • Time-range picker on Stats tab (24h / 7d / 30d / All), matching the dashboard's window picker. Selection persisted in UserDefaults under pulse.stats.window. Default 7d.
  • Cost by Project tab (from v0.36.0) shares the same window selection — change window in either tab, both update.

Why

  • All-time aggregates hide recent trends. Daemon v0.29.0 added the window-aware ?window= param. This release brings macapp to parity.

0.36.0 macOS app

Added

  • Cost by Project tab — new sidebar entry. Renders top 5 projects (cwd) for the selected window with horizontal bars showing share of total. Backed by daemon GET /api/cost-by-project shipped in daemon v0.30.0. Refreshes every 10s.

Why

  • Real telemetry showed 3 projects accounted for ~85% of all cost. Aggregate cards (Stats tab) didn't reveal where the spend was concentrated. This adds parity with the dashboard's cost-by-project card.

0.30.0 plugin

Added

  • Cost by project card on dashboard. Top 5 projects (cwd) by cost in the current window with horizontal bars showing share of total. New endpoint GET /api/cost-by-project?window=&limit= returns sessions count, cost, and percent of total per project. Honors the v0.29.0 window picker.

Why

  • Real telemetry showed 3 projects accounted for ~85% of all cost. Aggregate cards don't let the user see WHERE the spend is concentrated. This card answers it at a glance.

0.29.0 plugin

Added

  • Dashboard time-range picker. Header buttons 24h / 7d / 30d / All filter every hero card (cost, cache hit ratio, sessions, pack health) to the selected window. Default 7d. Selection persists in localStorage across reloads. All affected endpoints (/api/stats, /api/packs/health) accept ?window= and validate the value.

Why

  • All-time stats buried recent trends. User DB had $19K all-time but only ~$160 today — the recent signal was invisible without a window control.

0.28.1 plugin

Fixed

  • bash-parallel-suggest and mcp-batch-suggest were silent in real-world usage. After 4 days post-v0.24.0 ship neither pack had emitted a single alert despite 8064 historical Bash calls and 474 ClickUp create_task calls in user telemetry. Investigation: both handlers maintained their burst-detection ring buffer as a module-level Map in process memory, but CC's hook system spawns a fresh node hooks/stats-tap.mjs process per event — the in-memory ring was wiped between every call, so consecutive was always 1 and the threshold was unreachable regardless of how it was tuned. Both packs now query the durable tool_calls table (which stats-tap.mjs populates before pack dispatch) instead. Adjustments:
    • bash-parallel-suggest: min_consecutive 4 → 3, window_seconds 30 → 60, whitelist (git status/diff/log, npm ls/list/outdated, docker ps/images, kubectl get/describe, cat, ls, pwd) bypasses $VAR rejection for known read-only commands, durable cooldown via alerts table.
    • mcp-batch-suggest: min_consecutive 4 → 3, _delete_ added to family-extraction verbs (was only _create_/_update_), durable cooldown via alerts table.

0.28.0 plugin

Added

  • claude-md-budget v2.1 — "never used skills" detection. The lru_candidates array in too_many_skills alerts now includes skills that exist in CLAUDE.md but have never been invoked since v0.27.0 skill tracking started, in addition to the existing 14-day-unused list. Prioritization: never-used first (strongest signal), then stale-14d. Real-world data after v0.27.0 ship showed that 34 of 38 visible skills had zero skill_usage rows — those exact skills are the most valuable to remove and previous version's logic missed them.

Why

  • Previous version filtered last_used_at > 14d ago only. Skills with zero invocations don't appear in skill_usage at all, so they were invisible to the recommendation engine. This release closes that gap.

0.35.0 macOS app

Added

  • One-click in-app update. The Update Available banner now has an [Install Now] button. Clicking it downloads the new build (token-authenticated via ~/.claude/sdet-pulse/install-token saved during install.sh), verifies sha256 against the manifest, clears quarantine, atomically replaces /Applications/SdetPulse.app, and relaunches. No manual install.sh re-run required.
  • Menu: Check for Updates… (⌘⇧U) — manual update check from anywhere. Triggers the macapp manifest poll and auto-opens the popover so the banner is immediately visible when a new version is available.

Changed

  • install.sh now writes the bearer token to ~/.claude/sdet-pulse/install-token (chmod 600) so the macapp can authenticate auto-update downloads without prompting the user for the token again. Existing testers who installed before v0.35.0 need to re-run install.sh once to populate this file — otherwise the macapp's auto-updater falls back to anonymous downloads and the nginx 401 will surface as a "re-run install.sh" error in the banner.

Files

  • bin/macapp/Update/UpdateInstaller.swift (NEW) — download / sha256 / unzip / xattr / Info.plist validate / atomic replace / NSWorkspace relaunch pipeline.
  • bin/macapp/Update/UpdateChecker.swift — captures sha256, size_bytes, download_url into latestManifest; exposes installUpdate() and @Published installState.
  • bin/macapp/Update/UpdateBanner.swift (NEW) — SwiftUI banner with [Install Now] button, progress bar, error / retry state machine.
  • bin/macapp/Pulse/CCSection.swiftCheck for Updates… menu item gets ⌘⇧U shortcut and now also kicks UpdateChecker.checkNow() + auto-opens the popover when an update is available.
  • bin/macapp/Pulse/SystemSection.swift — inlined banner replaced with UpdateBannerView; footer label v0.34.4 → v0.35.0.
  • bin/macapp/Info.plistCFBundleShortVersionString / CFBundleVersion 0.34.4 → 0.35.0.
  • bin/macapp/Resources/{en,pl}.lproj/Localizable.strings — new keys update.install_now, update.retry, update.verifying, update.installing, update.installing.detail.
  • landing/install.sh — writes $TOKEN to ~/.claude/sdet-pulse/install-token after validation.

0.27.0 plugin

Added

  • claude-md-budget v2 — alerts now include LRU skill removal suggestions. New skill_usage table (PostToolUse Skill tool tracking) records first_seen_at, last_used_at, invocation_count per skill. When too_many_skills fires, the alert message lists the top 3 skills unused for 14+ days with their unused-day count. The context_json of each alert now carries a structured lru_candidates array for UI consumption.
  • New schema: skill_usage table with index on last_used_at.

Notes

  • Recommendations require ~2 weeks of usage data to populate. Fresh installs see the existing generic message ("too many skills, no clear LRU candidates yet") until data warms up.
  • Historical too_many_skills alerts are not backfilled with candidate data.

0.26.0 plugin

Changed

  • cost-control alert codes consolidated into a single session_budget_alert code with structured dimension (cost/output/turns) and phase (live/final) fields stored in context_json. Previously emitted six distinct codes (cost_ceiling_live, cost_ceiling, output_ceiling_live, output_ceiling, turn_ceiling_live, turn_ceiling) for the same conceptual budget event — across 100 sessions in real telemetry these accounted for 3697 alerts and dominated the alerts panel. Single code is easier to filter, aggregate, and ack.

Notes

  • Historical alert rows with the old codes are NOT migrated; they remain queryable under their original codes.

0.25.0 plugin

Added

  • Pack health card on the local dashboard (http://localhost:3535/) showing X/Y active, last fire time across all packs, and silent count over the last 24h. Backed by new endpoint GET /api/packs/health. Surfaces "is this thing on?" at a glance for testers in their first session.

0.24.0 plugin

Added

  • mcp-batch-suggest pack — advisory alert when 4+ consecutive mcp__*_create_* / _update_* calls of the same MCP family fire within 60 seconds. Suggests using the MCP server's batch endpoint where available. Driven by 474 ClickUp create_task + 462 Chrome browser_batch calls observed in real session data.
  • bash-parallel-suggest pack — advisory alert when 4+ independent Bash calls (no &&, |, cd, no var dependency) execute within 30 seconds. Suggests parallel execution. Triggered by Bash dominance in tool-call telemetry (8064 calls / 16h total runtime).

Changed

  • active-subagent-truncate now also intercepts Agent tool name in addition to Task (CC dispatches subagents under either name depending on context). Default truncate_bytes lowered from 80000 to 30000 — analysis of 190 Agent calls averaging 4 minutes each surfaced a much higher savings ceiling than the old threshold captured.

0.23.0 plugin

Changed

  • Default sync mode is now local-only for fresh installs (was disabled). All 20 packs are active out of the box; no network upload happens until the user explicitly runs node bin/sync.mjs enable. Existing installs with {enabled: true} are migrated to cloud mode, existing {enabled: false} migrated to disabled (no change in behavior for explicit opt-outs).
  • bin/sync.mjs gains a new subcommand local-only and the status output now shows the current mode.

Fixed

  • post-edit-validate pack now records savings entries when it pre-empts validator round-trips. Previously it returned findings without ever writing to the DB.
  • cache-ttl-aware threshold lowered from 50000 to 10000 input tokens so warm-session reminders fire on typical sessions instead of only on outlier workloads.

Notes

  • Preview tester install path (install.sh) writes the local-only state file automatically; testers see all packs active immediately after install.

[v0.34.4] — 2026-05-21

Added

  • Welcome window — final step polish. New step 3 ("You're ready") replaces the previous generic actions slide. Live daemon-connection pill fetched from localhost:3535/health showing Daemon: connected · N packs active (green dot) or Daemon: not running (orange dot), refreshed every 5s while the view is visible. Tolerates missing packs_enabled by collapsing to connected.
  • 10-row tab quick tour on step 2. Each row shows the live SF Symbol icon (matching the sidebar), the tab name, and a one-sentence purpose. Mirrors SystemRootView.Tab.symbolName so the onboarding visually matches the sidebar a tester sees seconds later.
  • First-alert tip on the final step pointing testers to the CC tab and menubar status item count.

Changed

  • Final CTA label Done — let's goGot it — let's go for clearer intent.
  • Onboarding window sized 600×560 (was 560×480) to fit the 10-tab tour without scroll on default density.

Why

  • Preview tester onboarding feedback (Paweł Cebernik, first external tester): the previous welcome flow ended with a generic "ready" slide and gave no signal that the daemon was actually wired up. The new final step confirms daemon connectivity, the active pack count, and where the first alert will surface — inside the first 30 seconds of opening the app.

Files

  • bin/macapp/Preferences/OnboardingWindow.swift — full rewrite of step 3, new DaemonHealthPill view, tabTourRows data.
  • bin/macapp/Info.plistCFBundleShortVersionString / CFBundleVersion 0.34.30.34.4.
  • bin/macapp/Pulse/SystemSection.swift — footer version label v0.34.3v0.34.4.

[v0.34.2] — 2026-05-18

Changed

  • UI rebrand to "Pulse" on user-visible surfaces — brand "SDET Pulse" preserved in About dialog and full brand identification contexts.
    • Dashboard browser tab title: $X/day · sdet-pulse$X/day · Pulse
    • Dashboard header brand name: sdet-pulsePulse
    • macOS status item attributed title: sdet • N Pulse • N
    • macOS notification banner title: sdet-pulse: packPulse: pack

Fixed

  • App.swift parseHealth(): daemon v0.22.7+ returns packs_enabled as Int, but code cast as [Any] always nil → packs count showed "0/20 packs" forever. Defensive back-compat: Intpacks_enabled_names array → legacy [Any].

[v0.34.1] — 2026-05-18

Added

  • Dashboard Packs tab with GET /api/packs + POST /api/packs/:name/toggle endpoints and per-pack toggle UI.

Changed

  • Dashboard: removed <meta http-equiv="refresh" content="10"> auto-reload; added manual Refresh button with tab state persisted to localStorage.
  • CC tab: shows both versions explicitly — macapp v0.34.1 • daemon v0.22.7.
  • lib/db.mjs reads PLUGIN_VERSION dynamically from package.json.

Fixed

  • Security: 3× shell injection eliminated — /bin/sh -c "rm -rf '...'" replaced with FileManager.removeItem in HuggingFaceCacheAction, PipCachePurgeAction, XcodeDerivedDataAction.
  • Security: /api/alerts ?status= now enforces allowlist ['all', 'unread'].
  • Perf: Two-call proc_listpids idiom in LocalLLM + Process monitors — fixes busy-machine truncation.
  • Correctness: UPDATE schema_meta scoped to single rowid (no unbounded update).
  • Bug fix: /health packs_enabled was returning Array (caused "0/20 packs" forever); now returns Int + new packs_enabled_names Array field.

[v0.34.0] — 2026-05-17

Added

  • NavigationSplitView replaces top-scroll tab strip — Mail.app/Notes.app sidebar pattern.
    • Sidebar sections: Main (CC / Memory / Cleaners / Stats) + Tools (Procs / Ports / LLM / Docker / Launchd / GPU).
    • SF Symbols icons per tab.
  • Sidebar .regularMaterial, detail .thinMaterial — native Liquid Glass look on macOS 15+.
  • Popover widened: 720×520.

[v0.33.3] — 2026-05-17

Fixed

  • Localization: 34 replacements String(localized:)NSLocalizedString(key, tableName:bundle:comment:) — Swift macro does not resolve bundle in plain swiftc multi-file build; NSLocalizedString with explicit bundle works correctly.

[v0.33.2] — 2026-05-17

Changed

  • Default tab changed: Memory → CC (primary value prop shown on first launch).
  • CC tab shows "0/20 packs enabled" in orange with (enable in dashboard →) hint when no packs are active.

Fixed

  • Info.plist: added CFBundleDevelopmentRegion + CFBundleLocalizations array — without these the macOS bundle loader did not see .lproj directories.
  • NSPopover.contentSize set explicitly to (480×520) — fixes multi-monitor positioning.

[v0.33.1] — 2026-05-17

Changed

  • Popover height reduced 600 → 520 pt — fits notched MacBook Pros without clipping.

Added

  • CCTabView replaces the "see status item menu" placeholder with live daemon status, alerts list, and dashboard link.

[v0.33.0] — 2026-05-17

Added

  • 28 new unit tests across UpdateChecker, StatsAggregator, NotificationRules, ScheduleConfig, HuggingFaceEnumerator, Actions, Memory edge cases. Total test count: 25 → 53.
  • Performance instrumentation: NSLog slow operations (PerfLog.measure*). Threshold 500ms for general operations, 100ms for DB queries. Instruments full poll cycle, cleaner poll, and all DB queries.
  • Util/PerfLog.swift: PerfLog.measure() + PerfLog.measureAsync() helpers.

Changed

  • README full quality pass — complete feature inventory and architecture overview for v0.33.0 state.
  • UpdateChecker.isNewer marked nonisolated static (was inheriting @MainActor from class).
  • StatsAggregator.computeStats() promoted from private to internal to enable direct test calls.
  • Info.plist version bumped from 0.22.0 to 0.33.0.

[v0.32.0] — 2026-05-17

Added

  • HuggingFace model-aware cleaner: Preferences → HuggingFace cache lets you mark specific models as "keep"; HuggingFaceCacheAction skips them and deletes only unmarked models. HuggingFaceEnumerator.enumerateModels() runs du -sk per model directory (parallel async).
  • Restart App picker (RestartAppPickerSheet): sheet with a live-filtered running-app list; pick any app to quit and reopen it. Accessible from the Free RAM footer menu.
  • Polish localization expansion: 12 new keys for HuggingFace, Restart App, and Schedule section headers.

[v0.30.0] — 2026-05-17

Added

  • UpdateChecker: zero-dependency daily GitHub Releases API poll (api.github.com/repos/darco81/sdet-pulse/releases/latest). Compares tag with CFBundleShortVersionString via semver. Popover header shows a subtle blue banner + "View on GitHub" link when a newer version is found. Last result cached in UserDefaults so the banner is instant on relaunch.
  • Preferences → Updates section: "Check for updates daily" toggle + "Check now" button with last-checked relative timestamp and inline error display.
  • Polish localization (pl.lproj/Localizable.strings): 54 keys covering tab names, Free RAM menu items, popover header, Preferences section headers, About dialog tagline/subtitle, Onboarding window step titles. English remains the default (en.lproj). macOS selects language automatically based on system locale.
  • build.sh now copies en.lproj and pl.lproj into dist/SdetPulse.app/Contents/Resources/ on every build.
  • Full CHANGELOG history for v0.22.0 → v0.29.0 (Keep-a-Changelog format, see entries below).

Changed

  • Tab enum rawValue lowercased ("CC""cc", etc.) to match tab.<key> lookup pattern; displayName computed via String(localized:) so tab strip renders localized names.
  • freeRAMLabel in SystemRootView uses localized menu.free_ram / menu.free_ram_with_usage keys.
  • PreferencesWindow height grew from 560 → 620 to accommodate the new Updates section.

[v0.29.0] — 2026-05-17

Added

  • App Intents: 5 actions exposed to Shortcuts.app / Spotlight / Siri — PurgeMemoryIntent, EmptyTrashIntent, UnloadOllamaModelsIntent, MemoryReliefIntent, MemoryStatusIntent (returns formatted memory snapshot string).
  • About dialog (AboutWindow.swift): version, build profile, GitHub/Releases/Issue links.
  • button in popover header (next to gear) — posts .openAbout notification.
  • Per-tab help tooltips on 9 tabs via .help() modifier: concise data source + action explanation on hover.
  • Empty Trash button added to the Free RAM footer menu.

[v0.28.0] — 2026-05-17

Added

  • First-launch onboarding window (OnboardingWindow.swift): 3-step welcome sequence (intro → tab tour → Free RAM menu explanation). Sets ui.hasOnboarded on completion; skips intro banner in the popover.
  • Full Disk Access banner in the Ports tab when lsof returns suspiciously few rows.
  • Keyboard shortcuts: ⌘R refresh + ⌘1⌘9, ⌘0 for tab switching.
  • Loading spinners + empty-state messages across all tabs.

Changed

  • SMOKE.md and README synced for v0.27 + v0.28 features.

[v0.27.0] — 2026-05-17

Added

  • Proactive notification rules (NotificationRules.swift): CMPRS_HIGH + SWAP_CRITICAL fire macOS banners with 30-minute per-rule throttle.
  • Stats tab sparkline: 7-day "MB freed per day" bar chart via Swift Charts BarMark.
  • First-time intro banner in popover (lightbulb + "Got it" dismiss).

Changed

  • Tab order: CC → Memory → Cleaners → Stats → Procs → Ports → LLM → Docker → Launchd → GPU (value-prop tabs moved forward).

[v0.26.0] — 2026-05-17

Added

  • Stats tab (10th tab): lifetime freed MB, action count, per-action-type breakdown, success rate, last 10 actions. StatsAggregator with 60-second cached query layer over EventLog.
  • Scheduled cleanups for safe actions (Empty Trash + Homebrew cleanup) via Preferences Picker (Never / Daily / Weekly).
  • Optional menubar inline pressure % display (Preferences → Display toggle).
  • Free RAM menu label dynamically shows "Free RAM (X GB used)" based on swap + compressed.
  • Cleaners tab manual "Refresh sizes" button.

[v0.25.0] — 2026-05-17

Added

  • Hot-reload for System module toggle: taking effect immediately in Preferences without requiring a restart.
  • Settings menu wire: ⌘, opens Preferences. Gear button (⚙) in popover header. Right-click status item shows Preferences/Quit menu.

Fixed

  • KillTopMemoryHogsAction excludes own PID (prevents self-kill on memory-heavy machines).
  • NodeCachesCleanAction throws when no package manager is found (was a silent no-op).
  • UnloadAllOllamaModelsAction (new): single confirm instead of N-per-model.
  • HomebrewCleanupAction empty brewPath guard.
  • RestartAppAction quote sanitization.
  • MemoryReliefBundle surfaces kill-hogs step failures in its output.

Changed

  • SMOKE.md full update for v0.24 (Cleaners tab + Free RAM menu + 19 actions).
  • README dual-onboarding with current 20 actions inventory.

[v0.24.0] — 2026-05-17

Added

  • 5 new Cleaners: NodeCachesClean (npm/yarn/pnpm), PipCachePurge, HomebrewCleanup, DNSFlush (sudo), BrowserCachesClean (Safari + Chrome + Brave + Firefox).
  • OllamaModelPurgeAction: per-model "Purge from disk" red button in the LLM tab.
  • RestartAppAction: universal app restart via osascript quit + activate.
  • KillTopMemoryHogsAction: top N by CMPRS with system-process exclusion list.
  • MemoryReliefBundleAction: combo (unload Ollama → kill top 3 → sudo purge).
  • "Free RAM" footer menu replacing single Purge button (4 options + divider + bundle).

[v0.23.0] — 2026-05-17

Added

  • Cleaners tab (9th tab) with 4 disk-space targets: Trash, Xcode DerivedData, Docker reclaimable, HuggingFace cache.
  • CleanerMonitor: lazy 60-second polling via du -sk + docker system df --format.
  • Size column with orange threshold highlight at ≥ 10 GB.
  • Size-aware confirm dialogs ("Delete HuggingFace cache (~92 GB)? Models re-download on next use.").

[v0.22.1] — 2026-05-16

Fixed

  • efficacy: tokens_per_turn filters sessions with turn_count >= 10 (excludes one-shot outliers).
  • LocalLLM: readArgv bounds check size > 4 (KERN_PROCARGS2 argc prefix).
  • DB: new runStmt(sql:bindings:) overload for parameterised non-query SQL.
  • EventLog / MetricsRing: prune uses bindings instead of string interpolation.
  • CCSection: applicationWillTerminate explicit timer/task cancellation.
  • Preferences: System module toggle labeled "(restart required)" (removed in v0.25.0 when hot-reload landed).

[v0.22.0] — 2026-05-16

Added

  • System module for SdetPulse.app — Mac toolkit for AI/SDE developers.
  • 7 monitors: Memory (host_statistics64), Process (libproc), Ports (lsof), Launchd (launchctl), Docker (docker ps / stats JSON), LocalLLM (Ollama HTTP + MLX argv scan), GPU (IOKit IOAccelerator).
  • 7 actions: RestartLaunchd, KillProcess (SIGTERM/SIGKILL), KillPort, sudo Purge (NSAppleScript), Docker container/Desktop restart, Ollama unload.
  • SwiftUI popover with 8 tabs (CC + 7 system tabs).
  • Status item severity tint (green/orange/red) via Combine subscription on memory state.
  • 1-hour Memory sparkline via Swift Charts.
  • SQLite event log (90-day retention) + 7-day metrics ring at ~/Library/Application Support/SdetPulse/system.db.
  • JSON event export to Desktop.
  • Preferences window with threshold Steppers + module toggle + build profile display.
  • Build profiles: personal (default ON), preview (default OFF), public (default ON).

Changed

  • Refactored SdetPulse.swift into modular files: App.swift / main.swift / Pulse/CCSection.swift / Monitors/* / Actions/* / Persistence/* / Preferences/*.
  • Tagline expanded: "Mac toolkit for AI/SDE developers — CC observability + System health."

v0.22.x — System module (preview)

Added

  • System module in SdetPulse.app: Memory/Procs/Ports/Launchd/Docker/Local LLM/GPU monitors.
  • Quick actions: restart launchd, kill process/port, sudo purge, Docker restart, Ollama unload.
  • SQLite event log + 7-day metrics ring at ~/Library/Application Support/SdetPulse/system.db.
  • 1-hour Memory sparkline (swap + compressed) via Swift Charts.
  • NSAppleScript Authorization pattern for sudo purge.
  • Build profiles: personal (default ON), preview (default OFF), public (default ON).
  • Preferences window with thresholds + module toggle + profile display.
  • JSON event-log export to ~/Desktop.
  • Status item severity tint (green/orange/red) driven by memory thresholds.

Changed

  • Refactored SdetPulse.swift into modular files (App / main / Pulse/CCSection / Monitors/* / Actions/* / Persistence/* / Preferences/*).
  • Tagline expanded: "Mac toolkit for AI/SDE developers — CC observability + System health."

[Unreleased]

2026-05-16 — Mobile / narrow-viewport responsive (v0.22.6)

  • New @media (max-width: 800px) — topbar wraps + tightens, .brand-live hides (redundant when liveness already conveyed), window pills wrap instead of overflow, hero collapses to 2 cols (predictable), tabs wrap, table fonts + padding scale.
  • New @media (max-width: 480px) — hero 1 col, version chip hides, sections + grid get overflow-x: auto so wide tables get native horizontal scrollbar rather than clipping.

2026-05-16 — Schema downgrade safety (v0.22.5)

  • lib/db.mjs initSchema detects existing.schema_version > SCHEMA_VERSION and captures the mismatch in _schemaDowngradeWarning instead of rewriting the row. schema_meta row left intact — reinstalling the newer plugin picks up where it left off without a forward migration step.
  • New getSchemaDowngradeWarning() exported — returns null or { db_schema, running_schema, db_plugin, running_plugin }.
  • bin/doctor.mjs coreHealth adds a schema_downgrade check that fires ok: false with a human-readable detail ("DB schema vX > running vN — reinstall newer plugin to restore"). Drives doctor exit code non-zero.
  • existing DB with HIGHER schema_version triggers warning, NOT crash, and does NOT rewrite schema_meta
  • warning is null when DB and plugin match

2026-05-16 — Calibration bundle: cost-control threshold + Bash dedup window (v0.22.4)

  • Live data: 252 cost_ceiling_live warn fires in 20 h on the maintainer's machine (~12/hour). Opus 4.7 [1m] sessions routinely cross $5 within 5 min of productive work — at threshold = 5 the alert reads as noise, not signal.
  • The $25/fire cap from v0.21.2 still bounds the dashboard's potential $ figure honestly. Bumping the threshold cuts fire frequency without changing per-fire math.
  • A long-context Opus session worth flagging is ≥ $10 in cost. Threshold matches that intuition.
  • v0.22.3 (15 min) caught only 2 realized fires in 20 h, both at very-short gaps (53s + 87s). Median dup gap measured from live data is 12 min, so 15 min should have caught more — but 4× session-parallel work spreads duplicates further apart in practice.
  • 30 min captures most legitimate repeats (cd && source && exec, docker compose exec psql, git log -1) while staying short of the 60-min line where git status, docker ps, etc. become at-risk for stale-output replay.
  • bash_cache_max_entries: 200 → 300 — scales with window. 30-min window holds more distinct commands before LRU eviction.

2026-05-15 — Bash dedup window revision based on live dogfood data (v0.22.3)

  • 1 673 Bash calls in 24h, 1 634 unique → 2.3% raw dup rate
  • 39 duplicate pairs, median gap 734 seconds (12 min)
  • Top duplicate: cd jarvis-orchestrator && source secrets && exec … fired 13× in one session (idempotent boilerplate, perfect dedup candidate)
  • bash_dup_window_ms: 5 * 60 * 100015 * 60 * 1000 (5 min → 15 min). New window catches the 12-min median tail without chasing the 75-min outliers where cached stdout would likely be stale. Window history is now documented in the source comment.
  • bash_cache_max_entries: 100200. Scales with window length — 3× longer window means 3× more distinct commands cohabit the cache before LRU eviction.
  • Surfaces total Bash calls, unique count, dup rate, dup pair count, median gap, current window setting, and the top 5 most-repeating commands over the last 24 h. Doubles as the data source for future window tuning and as a "is the user's workflow even Bash-repetitive?" sanity check that testers can run on their own machine.
  • Warns when median gap exceeds the configured window — actionable nudge to bump bash_dup_window_ms via /sdet-pulse:pack config tool-efficiency '{"bash_dup_window_ms": N}'.

2026-05-15 — Polish bundle: atomic sync-state write + Dependabot + issue templates (v0.22.2)

  • writeSyncState no longer truncates the canonical file in-place. New flow: write the JSON to sync-state.json.tmp, then renameSync to the final path. POSIX guarantees rename is atomic on the same filesystem, so a power loss or OOM kill mid-write can only leave behind the .tmp sibling (which readSyncState ignores) or the fully-written final file. Before this fix, a partial write would land at the canonical path, JSON.parse would fail, the recovery branch would rename the corrupt file aside AND reset to DEFAULTS — silently disabling the tester's sync until they noticed last_success_at stopped advancing days later.
  • Two new tests covering the contract: writeSyncState uses atomic temp-then-rename — no .tmp leftover, no partial canonical file + readSyncState ignores .tmp leftover from a crashed write. The second specifically simulates the crash window (only .tmp on disk, no canonical file) and asserts the recovery doesn't leak the partial content into state.
  • Weekly npm updates on Monday 09:00 Europe/Warsaw, monthly GitHub Actions updates. better-sqlite3 is the only production dep but it's a native module — a Node-version compat break or security advisory needs to land as a PR the maintainer can review, not be discovered mid-launch. Labels each Dependabot PR dependencies so they're skimmable in the queue. No auto-merge — manual triage.
  • preview-tier-feedback.md — structured "what happened, how to reproduce, doctor output, version" template auto-assigned to the maintainer with the preview-tier label.
  • bug-report.md — terser variant for clear bugs (daemon crash, hook error, dashboard 500). Asks for tail -10 ~/.claude/sdet-pulse/error.log.
  • .github/pull_request_template.md — what / why / test plan skeleton, capped at 200 words so the CHANGELOG stays the long-form record.

2026-05-15 — Rolling-window DB prune + WAL checkpoint (v0.22.1)

  • lib/migrations/v5-prune-tracking.sqlALTER TABLE schema_meta ADD COLUMN last_prune_at INTEGER. Idempotent via the same PRAGMA table_info guard pattern used for v4. Existing testers upgrade in-place on the next openDb() without data loss.
  • New pruneOldData(db, retainDays = 90): deletes from tool_calls, prompts, alerts where ts < cutoff. Drops only realized = 0 rows from savings (advisor potential — the opportunity moment has passed); realized = 1 rows are kept forever as an audit trail of what the plugin actually saved. sessions and pack_state are never pruned — sessions are needed for efficacy baseline comparisons, pack_state is user config. After delete: PRAGMA wal_checkpoint(TRUNCATE) shrinks the WAL file back to zero, then PRAGMA optimize lets SQLite refresh stats. Records last_prune_at in schema_meta.
  • New maybePruneOldData(db, retainDays = 90, intervalDays = 1): once-per-day guard. Reads last_prune_at; skips with { ok: true, skipped: true, reason: 'cooldown' } if it was within intervalDays. Otherwise calls pruneOldData.
  • handleSessionStart now calls maybePruneOldData(db, 90, 1) after the auto-sync trigger. First SessionStart of the day runs an actual prune; all subsequent ones short-circuit on the cooldown guard. Failures land in error.log; successful prunes are silent so the log isn't filled with daily "pruned 0 rows" entries.
  • Core health adds two new rows: db_size_mb (warns if > 100 MB on a fresh install) and last_prune (shows ISO timestamp + fresh/stale; stale = > 2 days since last prune).
  • New flag --prune runs a manual prune. --retain-days=N overrides the default 90. Useful for a tester who wants to ship a clean batch before opting into sync.
  • pruneOldData: deletes tool_calls/prompts/alerts older than retainDays — seeds 2 old + 2 fresh rows per table, asserts only the 2 old per table get dropped, sessions row stays.
  • pruneOldData: drops realized=0 savings past cutoff but preserves realized=1 — exact audit-trail guarantee for the savings table.
  • maybePruneOldData: cooldown skips second call within intervalDays — guard semantics.
  • 476/476 pass.

2026-05-15 — macOS app: actionable upgrade dialog + drop dead GitHub link (v0.22.0)

  • bin/macapp/SdetPulse.swiftshowUpdateResult now extracts download_url and sha256 from latest.json (both fields are already produced by bin/release.mjs). The alert body now shows:
    • Released date (when available)
    • The full download URL
    • The sha256 to verify
    • 3-step install guide: curl -fLO (with bearer token reminder), tar -xzf … && cd … && npm install, /plugin install sdet-pulse
  • New first button "Copy Download URL" — puts the URL on the macOS clipboard so the next terminal step is a single ⌘V. Button order branches when download_url is absent (offline / malformed latest.json) — defensive.
  • bin/macapp/SdetPulse.swift showAbout — removed the broken "Open GitHub" button + replaced text reference to github.com/darco81/sdet-pulse with Project: sdet.it. The new button "Open sdet.it" goes to https://sdet.it which is the public surface testers actually have access to.
  • bin/macapp/SdetPulse.swift:26 — removed dead githubURL constant (last reference dropped in showUpdateResult + showAbout above).
  • bin/macapp/Info.plist — bundle version 0.21.40.22.0.
  • package.json + lib/db.mjs PLUGIN_VERSION0.21.50.22.0. Minor bump because the upgrade dialog UX is a new tester-facing feature, not just a fix.

2026-05-15 — Critical safety bundle from production review subagent (v0.21.5)

  • docs/preview-tier/RUNBOOK.mdssh dariusz@54.36.174.173 × 5 → ssh ubuntu@54.36.174.173. The VPS account is ubuntu (confirmed live), add-tester.mjs:12 already had it right. RUNBOOK was wrong — anyone following it would hit Permission denied (publickey) on every operation. Pre-launch trust killer.
  • docs/preview-tier/PRIVACY_NOTE.md — version stamp 0.20.00.21.4. The savings table row now mentions the realized flag explicitly (0 = potential / advisor opportunity, 1 = active pack actually saved tokens). Testers reading the document will see schema fields that match the v0.21.x DB.
  • bin/sync.mjsUSER_AGENT no longer hardcoded at 'sdet-pulse-sync/0.9.3' (stuck since v0.14). Reads version from package.json at module-load time so VPS receiver logs accurately tag each tester's plugin version. Version-based log analysis works again.
  • bin/server-ctl.mjs — port conflict on start now returns a structured { ok: false, error: 'EADDRINUSE', message: '…' } instead of letting the detached child silently crash 10 ms later and leave a stale PID record. The message suggests the next-higher port (start ${port+1}). The start function is now async (preflight probes the port via net.createServer); test suite + CLI updated to await it. CLI exits 1 on failure so shell pipelines and future CI catch the conflict.
  • templates/dashboard-live.html — footer link github.com/darco81/sdet-pulse (private repo, 404 for testers) replaced with sdet.it. About modal Repo row → Project · sdet.it. Plus footer copy made by sdet.itpreview tier — private repo to make it obvious the GitHub link is intentionally absent during the preview phase.
  • README.md## Current features (v0.20.0) heading → (v0.21.4). First-impression credibility nit.
  • test/server-ctl.test.mjsstart surfaces friendly EADDRINUSE message when port is occupied. Binds a squatter net.createServer on the test port, then asserts start(port) returns ok: false + error: 'EADDRINUSE' + message containing the next port number. 473/473 pass.
  • Item #3 (VPS receiver watchdog unverified) — false alarm. Live SSH check showed systemctl cat sdet-pulse-receiver already has Restart=on-failure + RestartSec=5. No fix needed.
  • Item #4 (alerts table missing from PRIVACY_NOTE) — false alarm. The table at line 20 already lists alerts. Subagent miscounted; only the version stamp + realized flag were stale.
  • Item #6 (DB pruning — ~1 GB growth per tester over 6 weeks): real concern but M-effort, not blocking Tuesday's launch.
  • Item #2 (upgrade path UX — macOS app "Check for Updates" alert doesn't show install URL): SHOULD-FIX, deserves its own focused PR with Swift edits.
  • Item #9 (schema downgrade silent data loss): edge case for testers downgrading past v4, RUNBOOK can warn about it.

2026-05-15 — Perf regression guard + baseline measurement (v0.21.4)

  • 10 cold curl renders, 7d window: 9.4 – 11.3 ms (avg ~10.3 ms)
  • 3 renders, 30d window: ~9.7 ms
  • In-process renderDashboard avg over 10 calls: 8 ms
  • test/dashboard-server.test.mjs — new renderDashboard 7d window: under 200 ms avg over 20 renders (perf regression guard). Seeds a modest fixture (50 sessions × 7 days, tool_calls, savings) so SQL plans scan real data, warms up the module cache, then asserts 20 renders avg < 200 ms. 200 ms is generous (~25× the current baseline) — fires only on real regressions, not noise.

2026-05-15 — Always-visible Savings hero card (v0.21.3)

  • bin/dashboard-server.mjsgatherDailyBreakdown now splits savings by realized flag into two separate fields (savings_realized + savings_potential) instead of returning realized only. SQL grouped by (day, realized) so both come from a single query.
  • bin/dashboard-server.mjsbuildHero always emits the Savings card. New renderSavingsCard(realized, potential) helper picks the right shape: realized > 0 → main value with potential as + $X potential sub-line; realized = 0 but potential > 0 → $0 main with $X potential (needs action) sub-line; both 0 → $0 main with no fires yet — packs are watching. Sparkline always prefers the non-zero series so the green line shows a trend even when the headline is "$0".
  • test/dashboard-server.test.mjs — two new cases. gatherDailyBreakdown splits realized + potential asserts both fields populate from mixed savings rows. Savings card renders even when no realized savings uses a fresh isolated tmp DB to assert the "no fires yet" sub-label shows on day-zero installs.

2026-05-15 — Three pack bugs from research subagent (v0.21.2)

  • packs/env-tax-advisor/handler.mjs:25-34savings_pct was stored as a STRING ('30-40%', '10-20%', 'prevents 1M-context spillover'). The reducer (s, f) => s + (f.savings_pct || 0) coerced strings to NaN, fell back to 0, so totalPct was always 0, estTokens was always 0, and recordPotentialSaving was never called. Pack emitted alerts but logged zero $ for its entire existence.
  • Fix: split into numeric savings_pct (35/15/10/10/5) and human-readable savings_label ('30-40%' etc.). Snippet builder now uses savings_label for display, math uses the numeric. Smoke test confirms: empty env now records $1.35 / 90k tokens per fire (75% combined savings on 150k-token baseline).
  • packs/haiku-router/handler.mjs:19task_short_prompt_chars: 400 was too high for real Task tool dispatches. A typical "analyze X and return Y" Task prompt easily exceeds 400 chars; the auto-route branch never fired in production (zero realized = 1 rows in 7 days despite auto_route_short_tasks: true being set).
  • Fix: lowered to 150 chars. Catches genuine lookups (list files, run tests, check git status) which are the actual auto-route wins. The explicit-model guard (!event.tool_input?.model) still protects Tasks the caller deliberately put on Opus.
  • packs/intelligence/handler.mjs:91daily_anomaly describes a day-wide state (today's spend exceeded prior 6-day avg) but used shouldEmit(db, session_id, ...) which is per-session. Every session's onStop independently re-fired the same global signal, producing 33 duplicate alerts in one heavy day. Alert fatigue.
  • Fix: new shouldEmitGlobal(db, code, cooldownMs) query that ignores session_id. daily_anomaly now fires at most once per cooldown window (1 hour) across all sessions. Other intelligence codes (e.g. subagent_recommended which IS per-session) keep their per-session cooldown.

2026-05-15 — Cap enforcement regression tests + inflated-savings cleanup tool (v0.21.1)

  • bin/doctor.mjs — new --cleanup-inflated flag (with --dry-run mode). Drops savings rows where realized = 0 AND cost_saved_usd > PACK_CAPS[technique]. Pack-specific caps are wired into the new PACK_CAPS map (auto-compact-nudge: 30, cost-control: 25) which matches the in-handler values so adding a new advisor-with-cap is a one-line addition. The corresponding alert rows are kept — only the buggy savings rows are touched.
  • bin/doctor.mjsPOTENTIAL_USD_PER_FIRE_SANITY_CAP raised 20 → 35. The old threshold was below the highest per-fire cap (auto-compact-nudge: 30), so a pack consistently writing at its cap was falsely tripping the "MATH SANITY" warning. New threshold sits above every shipped cap so real packs never trip it, but new packs with cumulative-math bugs still get caught.
  • test/pack-auto-compact-nudge.test.mjs — two new regression cases. cap enforced at $30/fire even on huge sessions simulates a 150M-token session and asserts the recorded row is ≤ $30. first fire on marathon shape caps at $30 reproduces the exact live-data shape (1.6B input+cache_read tokens) that produced $4815.19 in the wild.
  • test/pack-cost-control.test.mjscap enforced at $25/fire even on enormous overrun writes a fake transcript with a $700 session cost and asserts the recorded row is ≤ $25.

2026-05-14 — Dashboard sparklines + Today strip (v0.21.0)

  • bin/dashboard-server.mjs — three new functions: sparklineSvg(values, colour) renders a 100×28 SVG polyline + filled area for any numeric series (cyan for cost/sessions/cache, green for savings). Empty / flat / all-zero series degrade to a dashed baseline so the card height stays consistent. gatherDailyBreakdown(db, days = 7) returns one row per calendar day (oldest first) with cost, sessions, cache_pct, savings_realized. Missing days are zero-filled so the array is always exactly days long. buildTodayStrip(db) returns a single-row HTML pill with five 24h metrics: spend, top tool name + call count, unacked alert count, peak hour, top project.
  • bin/dashboard-server.mjsheroCard() signature extended with a 5th sparkSvg argument (defaults to empty string so any call site that doesn't pass it still works). buildHero() accepts a 3rd breakdown argument and threads sparklines into 4 of the 6 cards (Sessions, Cost, Output tokens, Cache hit) plus an opt-in 7th "Savings 7d" card that only renders when realized savings are non-zero. Cards with no meaningful 7-day series (Open alerts, Prompt p95) stay sparkline-free.
  • bin/dashboard-server.mjs renderDashboard() — now computes the breakdown once via gatherDailyBreakdown(db, 7) and the today strip via buildTodayStrip(db), then injects them into the new __TODAY_STRIP__ placeholder and the existing __HERO__ slot.
  • templates/dashboard-live.html — added __TODAY_STRIP__ placeholder directly under the hero. CSS additions: .spark (sparkline sizing) and the .today / .today-label / .today-item pill rules.

2026-05-14 — Stop spamming error.log when sync gate is closed (v0.20.0)

  • lib/packs.mjs — dropped the logError call and the now-unused describeSyncGate import. Gated packs are silently skipped. Comment updated to make the "this is a normal operational state, not an error" intent explicit.
  • ~/.claude/sdet-pulse/error.log rotated to error.log.pre-v0.20 so the historical entries are archived but the doctor's count starts fresh.

2026-05-14 — Truth pass: doctor diagnostic + pack auto-registration (v0.19.0)

  • bin/doctor.mjs — new standalone diagnostic. Per-pack 7d report: fires, savings, last fire timestamp, health verdict (working / silent / disabled / not-registered). Replaces the inline-node-script approach. Exit codes: 0 healthy, 1 needs attention, 2 schema broken.
  • lib/db.mjsopenDb() now calls seedMissingPackStates(db) which idempotently inserts a pack_state row for every pack in ALL_PACK_NAMES missing one. Initial enabled flag respects manifest's default_enabled. User's prior enable/disable choices are never overridden (INSERT OR IGNORE).
  • commands/doctor.md — points at bin/doctor.mjs instead of inline script.

2026-05-14 — Threshold recalibration on real data (v0.19.1)

  • packs/bash-output-redactmax_lines: 100 → 50 (head 60→30, tail 30→15). Catches the long-tail noise (find spam, status dumps, log scrapes) without touching normal command output.
  • packs/big-read-truncatetrigger_bytes: 100_000 → 30_000. Max observed Read was 43 KB; 100 KB never fired. 30 KB catches lock files, minified bundles, generated docs.
  • packs/mcp-output-truncatemax_lines: 120 → 60 (head 80→40, tail 40→20). 41 mcp__* calls had bytes_out above the rough 120-line equivalent but never tripped; line counts in MCP JSON differ from byte estimates. 60 lines trims long ClickUp/Linear dumps that bloat context.
  • grep-truncate (max_lines: 80) — only 9 Globs / 0 Greps in the window; pack is correctly idle until Grep/Glob usage grows.
  • subagent-budget (alert_bytes: 50_000) — 0 Task tool calls in the window; correctly idle.
  • post-edit-validate — advisor with runtime condition (.ts/.tsx + tsc binary), not a threshold issue.

2026-05-14 — Advisor packs quantify potential savings (v0.19.2)

  • lib/migrations/v4-realized-savings.sql — adds realized INTEGER NOT NULL DEFAULT 1 to savings. 1 = active pack truncated context, redaction applied, etc. 0 = advisor identified an opportunity the user/model has to act on for the saving to materialize. Migration is guarded by PRAGMA table_info so it's idempotent across openDb() calls.
  • lib/savings-helper.mjsrecordSaving() now accepts realized and optional cost_saved_usd_override. New recordPotentialSaving() is the canonical entry point for advisors. New inputCostDeltaPerMtok(from, to) lets price-tier routing packs estimate $ delta between two models.
  • haiku-router — on each haiku_prompt_candidate / haiku_task_candidate fire, estimates prompt_chars/4 tokens × (Opus_input − Haiku_input) per Mtok and records as potential. A 5,000-char Opus prompt becomes a ~$0.018 potential save.
  • claude-md-budget — on claude_md_oversized, claude_md_global_oversized, and too_many_skills fires, records excess_tokens × ~50 turns/session as potential (the metadata is loaded every turn so the per-session cost is roughly per-turn × turns).
  • skill-hygiene — on prompt_bloat_live (prompt > 30k chars), records (chars − threshold)/4 as potential tokens trimmed.
  • auto-compact-nudge — on compact_nudge fire (session crossed threshold), records 30% × (input + cache_read tokens) as potential — conservative estimate of what /compact would shed.
  • bin/doctor.mjs now splits realized vs potential columns. Verdict line distinguishes "X saves · $Y realized" (active pack working) from "X alerts · $Y potential (needs action)" (advisor surfacing an opportunity). Summary line shows both totals.

2026-05-14 — Real Haiku routing: PreToolUse rewrite + /sdet-pulse:haiku command (v0.19.3)

  • packs/haiku-router/handler.mjs — new config flag auto_route_short_tasks (default false, opt-in). When enabled, the onPreToolUse handler returns updatedToolInput.model = 'haiku' for Task dispatches with prompt < task_short_prompt_chars. CC's pack dispatcher passes that through to the actual Task tool invocation, so the subagent really runs on Haiku 4.5. Caller-set tool_input.model is respected (no downgrade of a deliberate Opus dispatch). When auto-routed, the saving is logged as realized: 1 instead of potential — CC is about to bill on Haiku rates.
  • commands/haiku.md (new) — /sdet-pulse:haiku <prompt> for explicit user-driven routing. Dispatches a Task with model: "haiku" so the user can opt-in per request without enabling auto-route globally.

2026-05-14 — Dashboard UX: brand top bar, tabs, About modal (v0.19.4)

  • templates/dashboard-live.html — complete rewrite. Sticky brand top bar (logo + name + version pill + live pulse + window selector + ⓘ About button). Tabbed body — Money (reconciliation, projects, models, sessions), Health (efficacy, tax meter, alerts, recommendations), Savings (total, pack savings, advisor packs, optimization), Forensics (tools, tokenizer estimator). About modal with version, schema, license, repo link. Inline SVG favicon (heart-pulse line) so the macOS tab and Windows taskbar show the brand identity.
  • bin/dashboard-server.mjsrenderDashboard() fills two new placeholders: __PAGE_TITLE__ (dynamic: "$X/day · sdet-pulse" so the browser tab / window switcher surfaces cost at a glance) and __PLUGIN_VERSION__ (read from package.json, cached). replaceAll used for __PLUGIN_VERSION__ because it appears in the top bar, footer, and About modal.
  • Existing section builders (buildHero, buildReconciliation, buildEfficacy, etc.) untouched — their HTML output drops into the new tab panels unchanged. No data-layer refactor.

2026-05-14 — tool-efficiency goes ACTIVE: Bash dedup cache (v0.19.5)

  • packs/tool-efficiency/handler.mjs — new onPreToolUse for Bash. Loads on-disk per-session cache at ~/.claude/sdet-pulse/bash-cache/<sid>.json. If (cwd, command) matches an entry younger than bash_dup_window_ms (default 60s) AND the command isn't on the destructive skip list (rm, mv, git push, git commit, npm install, sudo, etc.), injects the cached stdout via additionalContext and logs realized: 1 saving. onPostToolUse writes the cache after every Bash that's under bash_cache_max_bytes (16 KB).
  • New config keys: active_bash_dedup (default true), bash_cache_max_bytes, bash_cache_max_entries, bash_cache_skip_pattern.
  • The existing bash_dup alert is preserved — same trigger, message updated to note "active dedup replayed cached output" when active mode is on. Keeps the dashboard sensitive to the pattern for future threshold tuning.

2026-05-14 — Three more advisors quantify potential $ + crisper alerts (v0.19.6)

  • packs/cache-ttl-aware/handler.mjs — on cache_cold_resume fire, records prev.input_tokens as potential saving (those tokens will be re-read from scratch instead of cache-read). Alert text now starts with "Cache cold:" and ends with "Run /compact NOW" — actionable command, not a soft suggestion.
  • packs/env-tax-advisor/handler.mjs — on env_tax fire, aggregates savings_pct from each actionable finding × a typical 150k-token session, capped at 60%. Conservative but gives the dashboard a real number to display.
  • packs/cost-control/handler.mjs — on cost_ceiling_live, records (cost − threshold) × 0.5 as potential saving (assumes a /compact at the crossing would shed ~50% of remaining context cost). Alert text: "Session cost $X crossed $Y. Run /compact NOW…" — imperative + reason.
  • intelligence, prompt-coach, post-edit-validate, contextignore-recommender — these surface qualitative signals (anomaly, prompt score, syntax issues, ignore-pattern suggestions). Trying to put a $ on them would be hand-waving more than measurement, so they stay alert + injection only. The dashboard's "advisor packs · fire counts" panel still shows their activity.

2026-05-14 — Full posture: 20/20 packs enabled + auto-haiku ON + doctor "mixed" verdict (v0.19.7)

  • bin/doctor.mjs — verdict logic now handles mixed-mode packs (active + advisor in one — haiku-router after auto-routing was added). Previously such packs showed (unclassified) because the classifier sees recordSaving import and labels them "active", but the verdict branch only had kind === 'active' for realized > 0. New mixed branch: when both realized and potential count > 0, show "X realized · $Y + $Z potential". Also tightened the advisor branch text from "advisor, no $ quantified yet" to "no $ quantified (qualitative signal)" — clearer that this isn't an oversight but a design choice for packs like intelligence / post-edit-validate.
  • pack_state.enabled = 1 for all 20 packs (was 14/20).
  • pack_state.config_json for haiku-router merged with {"auto_route_short_tasks": true}. From this point every Task tool dispatch under task_short_prompt_chars (400 chars) auto-routes to Haiku 4.5 unless caller explicitly set tool_input.model.
  • $0.66 realized (read-dedup-cache, the one that's been live since day 1)
  • $587.79 potential across cost-control ($306), auto-compact-nudge ($281), claude-md-budget ($0.03), haiku-router ($0)
  • 10 packs actively firing, 10 still silent (most are active-truncate packs waiting for over-threshold inputs that haven't happened yet)

2026-05-14 — Sanity pass on calibration + math after live data review (v0.19.9)

  • packs/tool-efficiency/handler.mjsbash_dup_window_ms: 60_000 → 300_000 (1 min → 5 min). Live data: real workflow repeats commands every 5–15 min (git status, ls, docker ps, sqlite queries), not within 60 seconds. 60s window caught 1 duplicate in 4 parallel sessions over 90 min; 5-min should catch ~10–20× more. bash_cache_max_entries: 50 → 100 so the longer window doesn't evict useful entries.
  • packs/tool-efficiency/handler.mjsbash_cache_skip_pattern rewrite. Old pattern anchored to ^ only — missed env-var prefixes (FOO=1 && rm …), compound chains ((sudo rm …)), and many destructive verbs. New pattern allows leading ;/&/|/` `/(/= boundary and covers: truncate, dd, mkfs, fdisk, wipefs, tee, pip install, pnpm add, cargo install, gem install, brew install, apt install/remove, docker rm/rmi/prune, systemctl, launchctl, curl -X POST/PUT/DELETE/PATCH, wget --method=…, and psql/sqlite3` with INSERT/UPDATE/DELETE/DROP/TRUNCATE/ALTER.
  • packs/auto-compact-nudge/handler.mjs — was recording 30% × cumulative session input + cache_read tokens on every fire, double-counting tokens across nudges in long sessions ($684 across 2 fires for a marathon). Now: keyed on the delta since the previous nudge (read from prior alert's context_json.stats). First fire = full current total; subsequent fires = only new tokens since last nudge. Hard cap of $30/fire.
  • packs/cost-control/handler.mjs — was recording (cost − threshold) × 0.5. For a $200 session that's $97.50/fire. Now capped at $25/fire. The cap is reached at $50 of overrun, beyond which the dashboard treats further overrun as "more of the same warning, not more $".
  • bin/doctor.mjsclassifyPack regex tightened from src.includes('recordSaving') to /\brecordSaving\s*\(/ because the old substring matched recordPotentialSaving too, misclassifying advisor-only packs as active.
  • bin/doctor.mjs — new MATH SANITY section. Any pack whose potential_usd / potential_count > $20 is flagged with a one-line warning so future calibration drift is visible at a glance. Surfaces the historical inflated rows from earlier v0.19.x writes (auto-compact $1833 avg, cost-control $184 avg) so the dashboard reader knows the number is from old over-counting.

2026-05-13 — Strategic pivot: preview tier (free, community-trained)

  • Cache hit rate already 98% — input-side optimization mostly saturated.
  • Output/input ratio 129x-296x — cost dominated by output tokens, not tool waste.
  • Read = 9 calls / Edit = 7 calls — sample too small to justify AST truncate or post-edit validate as defaults.
  • Bash = 41% of calls but content is ssh / health-checks / launchctl, NOT search.
  • Real bottleneck = long sessions + prompt bloat (avg prompt 22 671 chars).
  • F2 = rule packs system (pluggable, opt-in modules) replacing monolithic Wozcode port
  • F3 = local auto-tuning via weekly digest (user-approved, never silent flip)
  • F4 = cloud sync to VPS (anonymized, opt-in only)
  • F5 = preview tier program (5-10 invited testers across diverse workloads, FREE)
  • F6 = sdet.it case study with multi-user data after 3-6 months
  • F7 = public OSS release (no premium, no paywall)

Planned next

  • Continue dogfooding 1-3 weeks to accumulate baseline.
  • Then start F2 rule packs scaffolding (core/observability always-on + pack/cost-control + pack/skill-hygiene as first two opt-in packs).
  • Stats enhancement (per-prompt distribution, peak hour, model split) folded into F2 polish.

Superseded by 2026-05-13 pivot

  • ~~3 new hooks: post-edit validate, AST truncate, merged search~~ — replaced by pluggable rule pack system; original techniques may return as individual opt-in packs once data justifies enabling them.

[0.18.1] — 2026-05-14 — SdetPulse.app About + Check for Updates

Added — "About Sdet Pulse"

Added — "Check for Updates…"

  • Up to date — green confirmation NSAlert + Close
  • Update available — NSAlert with Open GitHub / Open Dashboard / Close
  • Offline / fetch failed — error NSAlert with diagnostic text

Added — `bin/release.mjs` writes `dist/latest.json`

VPS-side change (deployed manually)

Tests

Bundle size

[0.18.0] — 2026-05-14 — Sprint AA: pack/contextignore-recommender

Why

Added — `pack/contextignore-recommender`

  • Telemetry-driven (preferred): Parses tool_calls.input_preview over window_days (default 30) — the JSON-encoded args we already store for every Read/Edit/Write call. Buckets paths into well-known groups (node_modules/**, dist/**, *.lock, ...) when possible, passes through literal paths otherwise. A path is recommended only if it was Read at least read_count_threshold times (default 5) and Edit/Write was never called on it.
  • Static fallback: When telemetry produced zero hits (fresh install, no Read events, malformed preview rows), emits the curated default_deny_patterns list — 14 entries covering the universal-waste buckets (node_modules/**, dist/**, build/**, .next/**, .nuxt/**, coverage/**, target/**, __pycache__/**, .venv/**, *.lock, *.min.js, *.min.css, *.map, .git/**).

Round-trip user workflow

Preset changes

  • observability-only — added (advisor only, low risk). Now 8 packs.
  • active-full — added. Now 20 packs.
  • enterprise — added. Description bumped "19 packs → 20 packs".
  • preview-tier-default — added. Now 8 packs.

Tests

[0.17.0] — 2026-05-14 — Sprint Z: pack/active-subagent-truncate

Why

  • pack/subagent-budget — advisory only, never mutates (default). For users who want signal but trust their subagents.
  • pack/active-subagent-truncate — actively replaces large Task returns with a structured summary + escape-hatch file path. For multi-agent users who feel the burn.

Added — `pack/active-subagent-truncate`

Top numeric findings (auto-extracted)

  • "Test count: 429" (line 14)
  • "duration_ms: 308546" (line 88)

Top file references (auto-extracted)

  • packs/cache-ttl-aware/handler.mjs (line 22)
  • bin/release.mjs (line 67) ```

Extraction heuristics

  • Numeric findings: lines matching /\b(\w+):\s*\d[\d,.]*\b/ — common "key: number" patterns (test counts, durations, byte sizes, line numbers).
  • File references: lines matching /\b[\w-]+\/[\w./-]+\.(mjs|js|ts|tsx|json|md|yml|yaml|toml|html|css|swift|ps1)\b/ — captures most repo-relative source paths.

Escape-hatch design rationale

  • It's outside the user's home dir (privacy under read-dedup-cache / accidental Git commits).
  • /tmp is OS-cleaned across reboots — escape files don't accumulate forever.
  • mkdirSync({ recursive: true, mode: 0o700 }) + writeFileSync({ mode: 0o600 }) + belt-and-suspenders chmodSync(0o600).

Configurable

Tests

Presets updated

  • active-full+active-subagent-truncate
  • enterprise+active-subagent-truncate, description bumped "18 packs → 19 packs", truncate_bytes: 60_000 (lowered for enterprise multi-agent traffic)
  • deep-coding+active-subagent-truncate (multi-agent workflows common in long focused coding)
  • NOT added to observability-only / preview-tier-default — this DOES transform tool output; testers should opt in deliberately.

Integration

  • lib/presets.mjs::ALL_PACK_NAMESactive-subagent-truncate appended
  • lib/sync-gate.mjs::ACTIVE_PACK_NAMES — added (active transformation, so it gates on sync freshness like every other transform pack)
  • NOT in bin/dashboard-server.mjs::ADVISOR_PACKS — it's a savings pack, not an advisor; surfaces automatically in the "Pack savings by technique" panel.

[0.16.0] — 2026-05-14 — Sprint Y: pack/cache-ttl-aware

Added — `pack/cache-ttl-aware`

Trigger conditions

  • Gap from previous session's last activity > cache_ttl_minutes (default 5)
  • Previous session input_tokens >= warm_session_input_threshold (default 50,000)
  • Cooldown not active (cooldown_minutes default 60, across all sessions — never per-session-only; users restart CC often and would otherwise get spammed)

Design notes

  • Pure advisor. Never blocks, never mutates, never returns decision. One additionalContext line + one info alert per fire.
  • No transcript reads. We use the sessions table only — reading ~/.claude/projects/*/transcript.jsonl is too fiddly across CC versions and not necessary.
  • Tests can bypass cooldown via SDET_PULSE_CACHE_TTL_COOLDOWN_MS=0 (mirrors env-tax-advisor's pattern).
  • Default-off, like every other advisory pack. Enabled in observability-only, active-full, enterprise, preview-tier-default, deep-coding presets — explicitly not in ACTIVE_PACK_NAMES because there's no transformation, only signal.

Configurable

Why this is "niche-owned"

Tests

  • 12 new tests in test/pack-cache-ttl-aware.test.mjs covering: no-prior-sessions silence, sub-TTL gap silence, low-token-prev silence, basic fire path, long-gap fire path, cooldown active, cooldown bypass, additionalContext shape, context_json shape, configurable TTL, configurable threshold, NULL ended_at fallback to started_at.

Presets updated

  • observability-only → +cache-ttl-aware (perfect fit, informational only)
  • active-full → +cache-ttl-aware
  • enterprise → +cache-ttl-aware, description bumped "17 packs → 18 packs"
  • preview-tier-default → +cache-ttl-aware (testers benefit immediately)
  • deep-coding → +cache-ttl-aware (long sessions = highest expiry risk on context switches)

Integration

  • lib/presets.mjs::ALL_PACK_NAMEScache-ttl-aware appended
  • bin/dashboard-server.mjs::ADVISOR_PACKScache-ttl-aware appended
  • bin/preset.mjs::runCurrentState count test bumped 6 → 7 for observability-only

Net test count

[0.15.0] — 2026-05-14 — Sprint X: pack/prompt-coach

Added — `pack/prompt-coach`

Decision logic

  • score >= 3 → silent passthrough (good prompt).
  • score == 2 → silent by default. nudge_at_score_2: true opt-in for stricter users.
  • score <= 1 → emit additionalContext listing ONLY the missing rules (not all four).
  • references_artifact: name a file path, function, URL or CLI command …
  • success_criteria: state WHAT done looks like …
  • specificity: be concrete. Avoid bare "fix it" / "popraw" / "zrób to" …

Skipped prompts (silent without scoring)

  • Slash commands (event.prompt starts with /)
  • Empty / whitespace-only prompts
  • Prompts shorter than min_prompt_chars (default 5)
  • Continuation tokens — ok, okay, yes, no, tak, nie, dalej, lecimy, puszczaj, daj, rób, spadaj (case-insensitive, full match or short-prefix), plus those tokens with a single trailing ., !, ?.

Why advisor, not blocker

Cooldown

Alert row schema

  • pack: 'prompt-coach'
  • severity: 'info'
  • code: 'low_score_prompt'
  • message: 'Prompt score N/4 — missing rule_a, rule_b, …'
  • context_json: { score, missing: string[], prompt_length: number }

Preset integration

  • observability-only — added (now 6 packs). Educational nudge fits the watch-only posture.
  • active-full — added (now 17 packs).
  • enterprise — added (now 17 packs); description bumped to "All 17 packs". Cooldown widened to 60 min in the enterprise config override (long sessions, more low-score noise tolerated).
  • preview-tier-default — added (now 6 packs).
  • deep-coding — added (now 6 packs); description bumped — heavy coding sessions are where vague prompts hurt the most.
  • All other presets unchanged.

Tests

Documentation

[0.14.0] — 2026-05-14 — Sprint W: pack/env-tax-advisor

Added — `pack/env-tax-advisor`

  • pack: 'env-tax-advisor', code: 'env_tax', severity: 'info'
  • A headline message like "3 of 5 CC tuning knobs unset — MAX_THINKING_TOKENS, MAX_MCP_OUTPUT_TOKENS, BASH_MAX_OUTPUT_LENGTH. Estimated 30-60% savings if all applied."
  • context_json with structured findings (status unset / sub_optimal / ok, current value, recommended value, claimed savings) for every knob — downstream tooling can render or sort.
  • additionalContext echoing the same info with a paste-ready ~/.claude/settings.json "env": { ... } snippet so the user can apply the fix without leaving the alert.

Observability, not active

Cooldown

Preset integration

  • observability-only — added (now 5 packs). This is the safe default for first-week dogfooders, so a config nudge fits perfectly.
  • active-full — added (now 16 packs).
  • enterprise — added (now 16 packs); description bumped to "All 16 packs ON".
  • preview-tier-default — added (5 packs). Exactly the kind of low-risk nudge invited testers benefit from.
  • All other presets unchanged.

Tests

Documentation

  • docs/research/2026-05-14-token-optimization-survey.md — already documented the rationale (idea #3); now realized in code.

[0.13.0] — 2026-05-14 — Sprint V: pack/fuzzy-match-suggest

Added — `pack/fuzzy-match-suggest`

    • ≥ 0.85 (configurable): returns permissionDecision: "ask" with the suggested replacement and a diff hint in additionalContext. The user / agent can accept the suggestion before paying the cost of a failed Edit + Read + retry round.
    • 0.50–0.85: no decision change, but additionalContext injected with a nudge ("best fuzzy candidate is only X% similar — re-Read or refine").
    • < 0.50: silent passthrough — don't pollute context with weak hints.

Why now

Preset integration

  • active-full — added (now 15 packs).
  • enterprise — added (now 15 packs).
  • deep-coding — added; this is the pack's sweet spot (long refactor sessions where Edit retries hurt the most).
  • All other presets unchanged.

Sync-gate participation

Tests

Documentation

  • docs/WOZCODE-COMPARISON.md — row 5 (Fuzzy match) flipped from "⬜ not started" to "✅ shipped in v0.13.0".

[0.12.7] — 2026-05-14 — Statusbar hotfixes: symlink main guard + 💓 brand

Fixed — symlink-safe main guard

Added — 💓 brand prefix

Tests

[0.12.6] — 2026-05-14 — macOS menu bar integration (xbar)

Added — `bin/statusbar-xbar.mjs`

  • Icon: 🟢 (no alerts) / 🟡 (info) / 🟠 (warn) / 🔴 (error or sync stale)
  • Unacked alert count next to the icon
  • Plugin version + pack count + sync gate state
  • Top 5 unacked alerts with per-severity colour
  • Click "Open dashboard" → opens localhost:3535 in the browser
  • Click "Acknowledge all" → shells out to bin/alerts.mjs ack-all, refreshes
  • Graceful degradation: ❌ daemon-down state when the daemon isn't reachable, with a refresh action

Added — `bin/alerts.mjs`

Added — README "macOS menu bar integration (xbar)" section

Tests

[0.12.5] — 2026-05-14 — Hotfix: batch top-level field rename to match receiver

Fixed

  • lib/anonymize.mjs — batch top-level field renamed user_iduser_hash. Matches the receiver validator at /opt/sdet-pulse-receiver/server.mjs. The schema sdet-pulse-batch-v1 semantics are unchanged; we're locking the canonical name now before any tester data flows.
  • bin/export.mjsresult.user_id / result.user_hash renamed accordingly; export receipt JSON now prints user_hash instead of user_id so audit output matches what the receiver sees.
  • test/bin-export.test.mjs (3 assertions) + test/bin-sync.test.mjs (1 assertion) updated to assert on user_hash.

Note on per-row `user_id` fields

Tests

[0.12.4] — 2026-05-14 — Sprint U: participation contract — agreement + sync-required gating

Added — `lib/sync-gate.mjs`

  • describeSyncGate(){ stale, reason, age_hours|age_days } for surfacing in doctor + dashboard
  • isSyncStale() boolean shortcut
  • shouldGatePack(name) → true when the pack is in ACTIVE_PACK_NAMES AND sync is stale
  • Observability packs (cost-control, skill-hygiene, tool-efficiency, claude-md-budget, intelligence) are exempt — they keep running so the tester still sees stats locally
  • Active packs (big-read-truncate, bash-output-redact, post-edit-validate, grep-truncate, mcp-output-truncate, read-dedup-cache, auto-compact-nudge, haiku-router, subagent-budget) gate when sync is stale
  • Stale threshold: 7 days (tunable via SDET_PULSE_SYNC_STALE_MS for tests)
  • Dev override: SDET_PULSE_SKIP_SYNC_GATE=1 bypasses everything (local dev / dogfood without configured sync)
  • First-run grace: a newly configured but never-synced tester gets not stale until the first SessionStart auto-sync runs

Wired into `lib/packs.mjs` `loadPacks`

Dashboard `/health` exposes the gate

Added — invite email "Participation agreement" section

  • Keep source private during the preview
  • Don't build a commercial product from it without asking
  • Enable cloud sync to participate fully (sync-gate enforces this)
  • Honest feedback within 6 weeks

Tests

Roadmap reference

[0.12.3] — 2026-05-14 — Sprint N+: tester onboarding helper + VPS ops discipline

Added — `bin/add-tester.mjs` + `npm run add-tester`

  • Shell-metacharacter ban on --name (rejects " ' \ $ newline ;`) before any string interpolation
  • Simple email regex on --email
  • --tarball-version defaults to the highest manifest in dist/; fails if the manifest doesn't exist
  • --dry-run skips every side effect, prints <dry-run> as the bearer placeholder

Fixed — invite template substitution edge

Operational (on VPS, not in repo)

  • /usr/local/bin/sdet-pulse-monitor.sh — every 5 min via /etc/cron.d/sdet-pulse. Probes /health + /ping (locally via curl --resolve to bypass CF challenge), checks sdet-pulse-receiver.service + nginx are active. Logs to /var/log/sdet-pulse-monitor.log. Failure marker in /tmp/sdet-pulse-alert-<check> with 60-minute cooldown. Auto-detects available MTA (msmtpsendmailmailmailxssmtp); falls back to WOULD ALERT log line if none installed. MONITORING.md documents how to wire apt install msmtp-mta later.
  • /usr/local/bin/sdet-pulse-backup.sh — daily at 03:00. Tarballs /etc/sdet-pulse/tokens.map, /var/www/sdet-pulse/batches/, plus the receiver source + systemd unit into /opt/sdet-pulse-receiver/backups/<YYYY-MM-DD>.tgz (chmod 600). 14-day rotation. Single tarball = single restore.

README

Tests

[0.12.2] — 2026-05-14 — Sprint K.1 + L: live ingest endpoint + SessionStart auto-sync

Added (VPS side — no code in this repo, deployed manually)

  • https://pulse-test.sdet.it/health (public) — nginx liveness probe
  • https://pulse-test.sdet.it/dist/<file> — bearer-gated tarball downloads (existing from v0.11.4)
  • https://pulse-test.sdet.it/ping (public, proxied) — receiver liveness
  • https://pulse-test.sdet.it/intake (POST, bearer-gated, max 10MB) — JSON batch ingest
  • Receiver: pure-Node stdlib (no deps), /opt/sdet-pulse-receiver/server.mjs, 138 LOC, listens 127.0.0.1:8889. Validates schema sdet-pulse-batch-v1, scopes writes to /var/www/sdet-pulse/batches/<bearer_hash>/, returns 202 with batch id + counts.
  • systemd unit sdet-pulse-receiver.service (hardened: NoNewPrivileges, PrivateTmp, ProtectSystem=full).
  • nginx vhost adds /intake and /ping reverse-proxied to receiver; defense-in-depth (nginx bearer check + receiver re-check).

Added (plugin code) — `maybeTriggerAutoSync()` in `hooks/stats-tap.mjs`

  • sync is enabled
  • endpoint + auth_token configured
  • last successful sync ≥ 24h ago (env override SDET_PULSE_AUTO_SYNC_INTERVAL_MS for tests)

Fixed

  • hooks/stats-tap.mjs main() was running unconditionally on import, hanging tests on readStdin. Now guarded behind import.meta.url === process.argv[1].

Tests

Known CF quirk

[0.12.1] — 2026-05-14 — Sprint S: dashboard panels + tester install docs + dedup-cache offset awareness

Added — Dashboard panels

  • Pack savings by technique — table from the savings table grouped by technique. Shows fires, sessions, tokens saved, cost saved, % share. Answers "which pack is doing real work" at a glance.
  • Advisor packs — fire counts — table from the alerts table for packs that nudge rather than transform (haiku-router, auto-compact-nudge, intelligence, etc.). Per-code counts + last-fire age.

Added — Tester-facing install section in README.md

Fixed — `read-dedup-cache` ignores legitimate range re-reads

Tests

[0.12.0] — 2026-05-14 — Sprint R: enterprise-class optimization suite

Added — 6 new packs

  • grep-truncate — PostToolUse on Grep/Glob. Head + tail truncation with elision marker + savings recording. Stops grep -r from dumping monorepo matches into context.
  • mcp-output-truncate — PostToolUse on mcp__* + verbose git (log/diff/show/reflog/blame). Generalises bash-output-redact to MCP/git long tails. Single-responsibility split keeps the savings table readable.
  • read-dedup-cache — PreToolUse on Read. Per-session JSON state at data_dir/read-cache/<session_id>.json tracks mtime + first-8KB sha256 fingerprint. Re-read of unchanged file gets an advisory; we never block. Cache trims to cache_max_entries LRU.
  • auto-compact-nudge — UserPromptSubmit. Proactive /compact suggestion when input/output tokens or turn count cross thresholds. Pre-empts the cost-control ceiling alerts which fire only post-Stop. Cooldown muting per session.
  • haiku-router — UserPromptSubmit + PreToolUse(Task). Advisor only: detects "Haiku-class" prompts (summarise/list/short lookups) and short Task dispatches; injects context recommending the user route to Claude Haiku 4.5 explicitly. CC plugins can't mid-session swap models, so this is the realistic 80% of the win.
  • subagent-budget — PostToolUse on Task. Two-band response: alert between alert_bytes and truncate_bytes, alert + truncate above. truncate_bytes: null switches to observability-only mode.

Added — `enterprise` preset

Updated

  • active-full preset now includes all 14 packs
  • ALL_PACK_NAMES synced
  • test/bin-preset.test.mjs references ALL_PACK_NAMES.length instead of hardcoded 8

Tests

Why minor bump (0.12.0)

[0.11.5] — 2026-05-14 — Sprint Q: active packs close the measurement loop

Added — `lib/savings-helper.mjs`

  • estimateTokensFromBytes(bytes) — conservative 4 bytes ≈ 1 token approximation
  • estimateInputCostUsd(tokens, model) — uses getPricing() from cost.mjs, returns 0 if model unknown
  • recordSaving(ctx, { technique, tokens_saved, note, calls_saved? }) — best-effort write:
    • resolves the session's model for accurate cost rate (falls back to Opus 4.7 if unknown)
    • silently no-ops on missing db/session_id, zero/negative tokens, FK violation
    • never throws — a stats write failure must not break a hook

Wired into active packs

  • big-read-truncate records tokens_saved ≈ (file_size − limit_lines × 80) / 4 on every truncation
  • bash-output-redact records tokens_saved ≈ (raw.length − working.length) / 4 whenever it actually trimmed or redacted output

Added — 11 tests in `test/savings-helper.test.mjs`

Operational

[0.11.4] — 2026-05-14 — Sprint P: release tarball builder + brand-protected distribution

Added — `bin/release.mjs`

  • node bin/release.mjs pack — builds dist/sdet-pulse-<version>.tgz from whitelist + dist/sdet-pulse-<version>.manifest.json with sha256
  • node bin/release.mjs verify <tarball> — sanity-checks any tarball against the forbidden-pattern set
  • Refuses to build with dirty working tree unless --allow-dirty
  • Stages whitelisted paths into a temp dir before tarring → new repo dirs never leak by accident
  • Second-line guard: forbidden-pattern regex set scans the produced tarball; if docs/superpowers/, .git/, .env, node_modules/, test/, data/, dist/, *.tgz, or *.log slip in, the tarball is deleted and the build fails

Added — `npm run release` + `npm run release:verify`

Added — 10 tests in `test/bin-release.test.mjs`

Not yet shipped (next sprints)

  • VPS-side bearer-token allowlist endpoint at pulse-test.sdet.it/dist/
  • Cloudflare Tunnel route binding
  • tokens.txt provisioning + invite UX
  • Auto-publish on tag (gh release create) — manual for now

[0.11.3] — 2026-05-14 — Sprint O: pack presets + bulk operations

Added — `lib/presets.mjs` — 8 named profiles

Added — `/sdet-pulse:preset` slash command

  • list — show all available presets with description + pack count
  • apply <name> — bulk switch (enables target packs, disables rest, applies config overrides)
  • current — show currently enabled packs + which preset (if any) matches
  • enable-all / disable-all — bulk operations without picking a preset

Tests

Brand discipline reminder

  • darco81/sdet-pulse repo stays private
  • Plugin distribution for preview-tier launch: NOT a public mirror — use tarball-from-VPS pattern (option C from ClickUp Sprint K.1 task). Testers download https://pulse-test.sdet.it/dist/sdet-pulse-0.11.3.tgz with their invite token; no public surface, no "takes it without thanks" exposure.

[0.11.2] — 2026-05-14 — Sprint N Phase 5+6: intelligence pack + Wozcode feature parity log

Added — `pack/intelligence`

Added — `docs/WOZCODE-COMPARISON.md` (restored, reframed)

  • Maps each of Wozcode's 7 named techniques to sdet-pulse status: shipped (3) / partial (1) / not started feasible (1) / infeasible via hooks (2)
  • Lists sdet-pulse extras NOT in Wozcode's set (efficacy framework, multi-user benchmarking, subscription disclosure, no paywall, plus 8 packs total)
  • Architectural honesty: Wozcode likely ships MCP-server for transform-y techniques; we stay in hook scope; both legitimate
  • ROI rank for next potential ports: fuzzy match → AST truncate → merged search

Tests

Sprint N final tally

  • 6 versions shipped (0.10.2 → 0.11.2) in one Sprint N pass
  • 5 new packs added: claude-md-budget, big-read-truncate, bash-output-redact, post-edit-validate, intelligence
  • 1 architecture refactor: hook decision aggregation (lib/packs.mjs::dispatchEvent)
  • 2 dashboard panels: CLAUDE.md tax meter + Tokenizer impact estimator
  • 31 new tests (218 → 249)
  • 3 of 7 Wozcode techniques now shipped + 1 partial

[0.11.1] — 2026-05-14 — Sprint N Phase 4: pack/bash-output-redact + pack/post-edit-validate

Added — `pack/bash-output-redact`

  • Redacts known secret patterns from output (AWS, GitHub PAT, Anthropic API key, JWT, RSA) using lib/logger.mjs::redactSecrets
  • Truncates outputs > 100 lines to head + tail with elision marker
  • Returns updatedToolOutput (CC's hookSpecificOutput.updatedToolOutput mechanism) — Claude sees the cleaned version, not raw
  • Pure value: prevents secrets bleed-through to context cache (where they'd cost cache_read tokens AND lurk for the 5-min TTL) + reduces noise from spam commands like find / -name foo

Added — `pack/post-edit-validate`

  • Detects file extension (.ts/.tsx/.mts/.mjs/.js/.jsx)
  • Walks up to find package.json for project root
  • If node_modules/.bin/tsc or eslint exists, runs validators on the single touched file (timeout 8s, output capped 2KB each)
  • Injects results via additionalContext so Claude sees errors immediately
  • Saves round-trip where user otherwise had to copy-paste error feedback

Tests

Notes

  • Both packs default_enabled: false. User opts in explicitly.
  • bash-output-redact is safe across all scenarios (silent when nothing matches). post-edit-validate is project-aware (silent when no package.json or no validators found).
  • additionalContext from multiple packs gets concatenated by aggregator with [pack-name] headers, so user can see who said what.

[0.11.0] — 2026-05-14 — Sprint N Phase 2+3: Active hook architecture + first active pack

Changed — pack architecture (backward-compatible)

  • lib/packs.mjs::dispatchEvent now collects + aggregates pack handler return values:
    • Handler may return {permissionDecision, reason, updatedToolInput, updatedToolOutput, additionalContext}. Returning undefined = no opinion (silent), same as before — all existing packs keep working unchanged.
    • Merge rules: permissionDecision priority deny > ask > allow. updatedToolInput / updatedToolOutput first-pack wins (conflicts logged to error.log). additionalContext from all packs is concatenated with [pack-name] headers.
    • hasHookOutput(merged) helper: true if merged result has any field beyond hookEventName.
  • hooks/stats-tap.mjs::emitHookOutput — writes {hookSpecificOutput: merged} JSON to stdout only if hasHookOutput is true; silent otherwise (= CC proceeds normally).
  • PreToolUse dispatch added — previously only Stop / PostToolUse / UserPromptSubmit / SessionStart dispatched. PreToolUse is the foundation for tool-input transformation packs.

Added — `pack/big-read-truncate` (first ACTIVE pack)

  • On PreToolUse for Read tool calls, checks file size locally via statSync.
  • If file > 100 KB AND caller didn't already pass offset / limit, injects offset:1, limit:200.
  • Skips excluded extensions (.png, .pdf, etc — binary, partial truncation misleading).
  • Returns additionalContext explaining the truncation so the agent knows it can call Read again with explicit args if needed.

Tests

Notes

  • Pack big-read-truncate ships with default_enabled: false — user explicitly enables. Reason: input modification changes Claude's behavior, opt-in only.
  • All previous packs (cost-control, skill-hygiene, tool-efficiency, claude-md-budget) are observation-only and unchanged.
  • This release sets the architecture for Phase 4 (bash-output-redact, post-edit-validate) and Phase 5 (cache-miss-detector, subagent recommender, daily anomaly) — same merge rules, different events.

[0.10.2] — 2026-05-14 — Sprint N Phase 1: CLAUDE.md tax + tokenizer estimator

Added — `pack/claude-md-budget`

  • claude_md_oversized — project CLAUDE.md exceeds 200 lines threshold (~3000 tokens / turn tax)
  • claude_md_global_oversized — global ~/.claude/CLAUDE.md exceeds threshold (loaded into EVERY session)
  • too_many_skills — > 30 auto-loaded skills (≈ 600+ tokens of metadata before any work)
  • *"a 5,000-token CLAUDE.md is a 5,000-token tax on every turn"* — buildtolaunch.substack.com
  • Author's own skills-radar finding: *"~6,000 tokens before the user types anything"* with 80+ skills

Added — `hooks/stats-tap.mjs` SessionStart dispatch

Added — Dashboard "CLAUDE.md tax meter" panel

  • Global CLAUDE.md lines + token estimate
  • Auto-loaded skill count + token estimate
  • Per-turn tax (combined)
  • Tax this window (per-turn × turns × cache_read rate $1.50/Mtok)

Added — Dashboard "Tokenizer impact estimator" panel

  • Opus 4.7 has new tokenizer reporting ~35% more tokens than 4.6 (Anthropic April 2026 release notes)
  • Sonnet 4.6 is $3/$15 vs Opus $15/$75 — 5× per-token rate difference
  • Combined: same workload ~$3.70 on Sonnet vs $30 on Opus 4.7

Tests

[0.10.1] — 2026-05-14 — Plugin efficacy framework (replaces competitive framing)

Removed

  • docs/WOZCODE-COMPARISON.md — dropped. Wasn't aligned with brand intent (measurement over competition).

Added

  • bin/efficacy.mjsgatherEfficacy(db) + formatEfficacy(data) + runEfficacy():
    • Reference point = MIN(pack_state.enabled_at) — when the first pack was enabled
    • Splits all non-zero-cost sessions into "before pack era" vs "after pack era" cohorts
    • Computes: avg_cost, total_cost, avg_turns, avg_output, tokens_per_turn, avg_cache_read for each cohort
    • Delta in % for each metric
    • Small-sample honesty: flags INCONCLUSIVE if either cohort has < 5 sessions
    • Per-pack alert counts since reference: which packs are firing on real signal, which are quiet
  • /sdet-pulse:efficacy slash command — relays the text report
  • Dashboard "Plugin efficacy" panel at TOP of dashboard (above Total Savings):
    • 4-row table: before / after / Δ for each metric, color-coded green-down / red-up
    • Significance callout (warn-yellow box if small sample, green-box if real signal, red-box if regression)
    • Per-pack alerts-fired + affected-sessions counts
    • Reference timestamp + methodology footnote
  • docs/EFFICACY.md — methodology doc:
    • Reference point definition
    • Sample selection rules
    • Per-metric formulas
    • Delta + significance heuristic
    • What we DO and DON'T claim (honest scope)
    • Future polish (t-test, weekly trend chart, A/B framework)
  • Tests — 5 new (no-packs-enabled, before/after split, small-sample flag, per-pack alert correlation, no-data CLI path). Total 213.

Real-data check (against author's live DB)

Why this matters for preview tier

[0.10.0] — 2026-05-13 — Total Savings panel + pack/tool-efficiency + Wozcode comparison

Added — dashboard

  • Total Savings panel at top of dashboard, full-width:
    • Stacked horizontal bars showing 4 cost tiers (no-cache hypothetical → with-cache actual → with-/compact-discipline → subscription flat)
    • Single "Total saved" number split by source: cache savings (automatic) / /compact savings (achievable) / subscription savings (already)
    • Plain-language disclaimer: subscription is the single biggest saving for most users
  • Responsive grid layout for hero + optimization cards — uses repeat(auto-fit, minmax(...)) instead of flex: 1, so cards stack properly on narrow viewports (user reported seeing only one card on a narrow window)

Added — `pack/tool-efficiency` (Wozcode-inspired, all live alerts)

  • Agent avg 88s (max 186s)
  • mcp__claude_ai_ClickUp__clickup_create_task 154 calls × 4.1s = 10 min cumulative wait
  • These now surface as actionable alerts in dashboard.

Added — `docs/WOZCODE-COMPARISON.md`

Tests

[0.9.8] — 2026-05-13 — Optimization tracker — before/after scorecard

Added: dashboard "Optimization opportunities" section

    • Δ ±% with green-down/red-up coloring
    • cur / prev / delta in dollars
    • Direct quantification of "did the optimization actually save money this week?"
    • For every session > 200 turns in window, simulate "what if user had /compact'd at turn 200?"
    • Model: post-threshold cache_read tokens × 50% reduction × Anthropic cache_read rate ($1.50/Mtok)
    • Output: total simulated savings + count of long sessions + top-5 table with per-session would-have-saved
    • This is an upper-bound; gives an order-of-magnitude answer to "is /compact discipline worth it for me?"
    • Total alerts fired in window
    • Alerts acked vs unacked (% acked)
    • Digest runs in window + total tuning actions applied (SUM(json_array_length(applied_action_ids_json)))
    • Measures how much of the plugin's signal user actually acted on

Notes

  • Simulated /compact savings model is intentionally conservative (50% reduction assumption). Real /compact discards more than 50% in typical cases, so this is a *floor* estimate. If the simulation shows "$X" you'd save, the real saving is likely $X to $2X.
  • 1d window now correctly routes to digest('1d') (previously the if ['7d','14d','30d'].includes check excluded 1d and fell back to 7d on the digest side — fixed in the same commit).
  • Test count 199 → 202 (+3): tracker on seeded long session, empty DB graceful, db=null backward compat.

Why this matters

  • Period-over-period: time-based comparison (real before/after if user changed behavior between windows)
  • Counterfactual /compact: per-session what-if (theoretical optimization that hasn't been applied yet)
  • Pack effectiveness: meta-metric (is the plugin's signal converting to action?)

[0.9.7] — 2026-05-13 — Real bug: projectKeyFromCwd off-by-one + ghost session filter

Fixed

  • packs/skill-hygiene/handler.mjs::projectKeyFromCwd — drop the redundant - prefix. cwd.replace(/\//g, '-') already turns the leading / into -. The fix: ``js // OLD: return '-' + cwd.replace(/\//g, '-'); → '--Users-dariusz' (miss) // NEW: return cwd.replace(/\//g, '-'); → '-Users-dariusz' (hit) ` Verified live: now finds ~/.claude/projects/-Users-dariusz/memory/MEMORY.md` (38 lines, under threshold, no alert needed — but the lookup *works*).
  • Test mirror updatedtest/pack-skill-hygiene.test.mjs was constructing the expected path with the same buggy formula, so it never caught the bug. Now uses the correct CC convention.
  • bin/stats.mjs::gatherStats byProject query — filter out ghost sessions (model='unknown' + zero activity) so /tmp test artifacts can't pollute the "top projects" table at $0.

Database hygiene

  • Manual one-time cleanup: 91 zombie sessions deleted from production DB (cwd LIKE '/private/tmp%' OR cwd LIKE '/private/var/folders/%', all with 0 activity, all test/development artifacts from this build day). DB shrunk 119 → 28 real sessions. VACUUM ran.
  • Plugin's ongoing inserts won't produce these — they only existed because heavy iteration on the plugin itself spawned many short-lived CC sessions in /tmp directories.

Audit findings — informational

  • Agent tool calls take avg 88 s (max 186 s) — subagent dispatches are the slowest tool by far. Worth a future "tool-wait time" pack that surfaces per-tool wait totals.
  • mcp__claude_ai_ClickUp__clickup_create_task — 154 calls × 4.1 s avg = ~10 minutes of pure wait on ClickUp API today. Worth pinning as a power-tip in invite docs.
  • server.log doesn't rotate (4 lines per restart accumulating). Deferred — non-issue at current volume.

Test count

[0.9.6] — 2026-05-13 — Housekeeping + rich /health + dashboard reconciliation card

Cleanup

  • Removed stale ~/.claude/sdet-pulse/debug.log (197 KB, 779 lines) — leftover from F1 troubleshooting before Sprint B v0.6.1 removed the append-log code. The file lingered because we only removed the *writer*, not the *file*. One-time manual rm; future plugin sessions don't recreate it.
  • tx.json TTL pruning — entries now carry a ts timestamp; reads auto-drop anything older than 1 hour (typical PreToolUse→PostToolUse delay is <30 s; stale entries usually mean CC crashed mid-tool). Legacy entries without ts are also pruned. This bounds the disk footprint and reduces the residency time of cached input previews (which can contain ssh hosts, paths, commands).

Added

  • /health endpoint enriched for Sprint K.1 VPS receiver: ``json { "ok": true, "ts": <ms>, "version": "0.9.6", "schema_version": 3, "uptime_ms": <ms-since-server-start>, "packs_enabled": ["cost-control", "skill-hygiene"], "db_size_bytes": 675840, "last_session_at": <ts>, "last_alert_at": <ts>, "sync": { "enabled": false, "configured": false, "last_success_at": null, "failed_attempts": 0 } } `` Used by VPS-side health probe to verify a tester's plugin is actually feeding data.
  • Cost reconciliation card on dashboardAnthropic API rate breakdown showing per-token-type cost split (input / output / cache_creation / cache_read), "without cache" hypothetical with savings highlight, and a subscription disclaimer (If you're on Claude Pro/Max this is NOT your real bill — figures are API-equivalent).
  • Tests — 4 new (3 tx.json prune scenarios + 1 rich /health). Total 199.

Notes

  • debug.log will never come back — code path that wrote it was deleted in v0.6.1. Document for any historical confusion.
  • tx.json TTL is 1 hour by design. Tools can run that long (e.g. heavy mcp__* searches); shorter TTL would risk losing the input_preview cache before PostToolUse arrives.

[0.9.4] — 2026-05-13 — Defensive: [1m] context-tier suffix normalization

Fixed

  • lib/cost.mjs::computeCost + getPricing — now normalize the model string by stripping [\d+[a-z]*] suffix before pricing lookup. Exact-match takes priority (so an explicit premium-tier entry would win), then normalized fallback. Both claude-opus-4-7[1m] and claude-sonnet-4-6[200k] map to base pricing.
  • bin/dashboard-server.mjs::buildModelsTable — displays the tier suffix as a small muted annotation (claude-opus-4-7 [1m]) instead of jamming it into the model id. Cleaner UI, no functional change.

Tests

Verified against real DB

[0.9.3] — 2026-05-13 — Sprint J.2.1: Sync hardening (post-review)

Critical fixes

  • 4xx is now terminal — no token re-emission. Previously if (res.ok) branched success; everything else retried 3× including 401/403/413/422. A bad token would be replayed 3 times to a possibly-compromised endpoint, amplifying exposure. Now: shouldRetryStatus(s) → s ≥ 500 || s === 408 || s === 429. 4xx fail fast on first attempt.
  • redactSecrets for log/state writeslib/logger.mjs::redactSecrets(text, [...callerSecrets]) strips bearer-token strings, JWT shapes, AWS keys, GH PATs, Anthropic API keys. Applied to: lastErr before persisting to state.last_error, error.log via logError, and server_response returned to caller. Tested by mocking a server that echoes the bearer in its 401 body — verifies token does NOT appear in persisted state.
  • configure({token: ''}) throws — previously empty string fell through to null, silently wiping the token. Now explicit error: "--token cannot be empty — use 'reset' to clear".

High-severity fixes

  • AbortController timeoutfetch had no timeout (Node 20 default = forever). Configurable via SDET_PULSE_SYNC_TIMEOUT_MS (default 30 s). Tested with a hanging mock server that never responds.
  • Network error classification — DNS failures (ENOTFOUND), invalid URL, TLS cert mismatches, expired certs now treated as terminal. Connection resets, sockets dropped mid-stream, timeouts remain retryable.
  • Post-success state-write guard — if updateSyncState throws after a successful POST (disk full, RO FS), we now return {ok: true, warning: '... next sync may duplicate. Check disk space.'} instead of pretending nothing happened.
  • reset() now wipes last_success_at + last_success_counts — previously preserved misleading display in status after reset.
  • Configurable backoff via SDET_PULSE_SYNC_BACKOFF_MS (default 1 s) — eliminates flaky elapsed >= 2900 test timing; tests now run with 10 ms.
  • Lockfile for concurrent syncOnce~/.claude/sdet-pulse/sync.lock (O_EXCL open). Two concurrent CC sessions calling once no longer race on last_sync_at. Second caller returns "another syncOnce is in progress".

Medium-severity fixes

  • Corrupt sync-state.json quarantine — was silently falling back to defaults (user thinks "never configured"). Now rename to sync-state.json.corrupt-<ts>, log via logError, return {...DEFAULTS, _recovered_from_corruption: true} so status can flag it.

Test count

  • 4xx terminal (no retry, single attempt)
  • 5xx retry (configurable backoff)
  • 429 retry
  • Token leak verification (server echoes back, must not appear in state.last_error or error.log)
  • Empty token throws
  • Reset wipes success markers
  • AbortController timeout against hanging server
  • ENOTFOUND fast-fail
  • Lockfile prevents concurrent sync
  • Corrupt state file quarantine

Not changed (deferred to J.2.2)

  • Batch row-count cap — first sync from a 12-month-old DB still pulls everything into memory. Fine for current MVP; receivers will see 4xx 413 (now terminal so no retry amplification), user can manually adjust last_sync_at to chunk. Real fix = LIMIT N + iterate.
  • CLI subprocess test for node bin/sync.mjs once — programmatic API is well-tested; CLI parsing untested. Bug surface small.
  • Mock-server helper extractiondashboard-server.test.mjs and bin-sync.test.mjs both inline the pattern. Defer until a 3rd consumer needs it.

[0.9.2] — 2026-05-13 — Sprint J.2: Sync agent (opt-in cloud telemetry)

Added

  • lib/sync-state.mjs — file-based KV at ~/.claude/sdet-pulse/sync-state.json (chmod 600). Defaults all to "not configured". Resilient to malformed JSON (falls back to defaults). summaryForDisplay masks auth_token to last-4 + replaces full endpoint with origin + /... so logs/status never leak the secret or full path.
  • bin/sync.mjs — programmatic API:
    • syncOnce({force?, db?, secret?}) — builds anonymized batch since last_sync_at, POSTs with Authorization: Bearer <auth_token> header, 3 retries with exponential backoff (1 s, 2 s), persists last_sync_at on success, failed_attempts++ + last_error on terminal failure.
    • configure({endpoint?, token?, userId?}) — validates endpoint protocol + user_id hex format. Throws on bad input.
    • enable() / disable() / status() / reset() — lifecycle.
    • CLI: enable | disable | status | configure --endpoint URL --token TOKEN --user-id HEX | once | reset.
  • /sdet-pulse:sync [subcommand] — slash command wrapping the CLI.
  • Tests — 16 new (5 sync-state, 11 bin/sync with real mock HTTP server testing retries + backoff timing + auth header + leak check on transmitted body). Total 182.

Privacy & safety properties

  • Double opt-in: configure sets the endpoint+token+identity but does nothing alone. enable then unlocks once. Without both, syncOnce refuses with a clear error.
  • LOCAL_EXPORT_SENTINEL refusal: syncOnce rejects the local-only sentinel user_id explicitly. Tester must provide a real per-machine identity (via --use-machine-id to generate one, or --user-id <hex> from the preview tier invite).
  • HMAC salt is NOT shipped in the batch — the receiving server cannot reverse the local cwd/session hashes even with full DB access.
  • Mock-server test asserts that no /secret, priv, or s1 substring appears in the body actually transmitted to the receiver.
  • Auth header: bearer token only. No basic auth, no plaintext token in URL.

Retry semantics

  • Network errors + 4xx/5xx both trigger retry. After 3 failed attempts within ~3 s, the call returns {ok: false} and persists last_error for /sdet-pulse:doctor and the dashboard to surface.
  • Backoff is 1 s × 2^attempt = 1 s, 2 s between attempts. Tuned to be fast enough for interactive once calls but slow enough to not hammer a temporarily-down endpoint.
  • No state pollution on success/failure beyond the sync-state.json patch. DB rows are not mutated by sync (read-only sync semantics).

Test count

Not yet

  • Cloud backend at pulse.sdet.it — separate repo, separate decision (Postgres schema, ingest API, Cloudflare Tunnel, auth token validation against invite list). When that's live, /sdet-pulse:sync configure --endpoint https://pulse.sdet.it/api/ingest --token <invite> is the user flow.
  • Automatic sync on schedule — currently user runs /sdet-pulse:sync once manually. Future: SessionStart auto-trigger debounced to daily, similar to weekly digest pattern.
  • gzip compression for large batches — observed batch size for 100 sessions / 1023 tool_calls is ~250 KB pretty-printed; switching to JSON.stringify(batch) (no indent) + gzip would shrink to ~30 KB. Defer until measured bandwidth pain.

[0.9.1] — 2026-05-13 — Sprint J.1.1: Privacy hardening (post-review)

Critical privacy fixes

  • HMAC-SHA256 salt for shortHash — was straight sha256(s).slice(0,16), brute-forceable for low-entropy inputs (/Users/<username>/<repo> patterns enumerable from a cloud DB in seconds). New lib/identity.mjs::getHashSecret() lazily generates 32 random bytes at ~/.claude/sdet-pulse/hash-secret.bin (chmod 600) and uses HMAC-SHA256 instead. Secret never leaves the machine. Same cwd still hashes identically across batches (joins work) but preimage attacks become infeasible. Helper throws if secret missing — fail-loud, no silent re-fall to unsalted.
  • mcp__ tool name scrubbing — internal MCP server names (mcp__company-internal__deploy) used to ship verbatim, leaking the org running the server. Now scrubToolName replaces them with mcp__<hash> (HMAC-salted, stable across batches, preserves cardinality + frequency analytics). CC built-ins (Read, Bash, Edit, ...) still pass through as-is (public enums).
  • --user-id regex validation — must match [0-9a-f]{16,64}; previously accepted darek@sdet.it or any free-form string. CLI now rejects with exit 1 + clear error.
  • Schema rename'1''sdet-pulse-batch-v1'. Future schemas get -v2, -v3 etc., so a receiver can branch unambiguously.

Robustness fixes

  • CLI IIFE wrapped in try/catch — any throw (perm denied, disk full, validation error) now emits {ok: false, error, code} JSON to stderr + exit 1, never raw stack traces. Machine-readable contract preserved.
  • Unknown flag rejection — typos like --use-id previously silently fell through to default local-export-anonymous. Now collected in args.unknown and reported.
  • lib/identity.mjs::getUserId() — stable pseudonymous per-machine identity (32 hex chars, random one-shot generation). --use-machine-id flag opts in.
  • LOCAL_EXPORT_SENTINEL export — 'local-export-anonymous' is now a named constant. Sprint J.2's sync client will refuse this value symmetrically.

Test hardening

  • Exact keyset tests for every anonymizer — if a future refactor adds a new field, assert.deepStrictEqual(Object.keys(out).sort(), WHITELIST) catches it loudly.
  • chmod 600 / dir 700 assertions — privacy guarantee was undeclared in tests, now verified via statSync().mode & 0o777.
  • Extended PII leak audit — 18 SECRETS strings (JWT prefix, AWS access key, GitHub PAT, Anthropic API key shape, RSA private key header, email patterns, credit card test number, MCP server name, internal field names input_preview/context_json/tool_use_id).
  • shortHash salt independence test — same input with different secrets MUST produce different hashes.
  • 1000-input collision test for shortHash distinct-output property.
  • HMAC missing-secret throw test — anonymizer fails loudly, never falls back to unsalted hash.
  • savings table explicit exclusion test — INTENTIONALLY-OMITTED contract pinned with regression test (note field is free-text, can't ship without dedicated scrubber).

Test count

[0.9.0] — 2026-05-13 — Sprint J.1: Anonymization + local export

Added

  • lib/anonymize.mjs — 4 row-level anonymizers + buildBatch(db, opts) aggregator:
    • anonymizeSession — hashes id/cwd/project; keeps model, counters, timestamps
    • anonymizeToolCall — drops tool_use_id + input_preview; keeps tool_name, duration_ms, bytes_out
    • anonymizePrompt — drops preview; keeps chars only
    • anonymizeAlert — drops message + context_json; keeps pack, code, severity, acked
  • bin/export.mjs — CLI emitting {schema, generated_at, user_id, since, counts, sessions[], tool_calls[], prompts[], alerts[]} JSON. Writes to ~/.claude/sdet-pulse/exports/batch-<ts>.json (chmod 600). Accepts --days N, --all, --out PATH, --user-id HASH.
  • /sdet-pulse:export [7d|30d|<days>|--all] [--out path] — slash command shell.
  • Tests — 15 new (9 anonymization unit, 5 export CLI). Total 148.

Anonymization contract (audit reference)

Real-data smoke check (against author's live DB)

Not yet

  • No network call. Sprint J.2 adds POST to a configurable endpoint with auth.
  • No per-user identity yet--user-id defaults to local-export-anonymous. Sprint J.2 introduces a stable per-machine hash (sha256 of machine UUID + plugin install timestamp).
  • Cloud backend + multi-user dashboard at pulse.sdet.it — separate repo, separate sprint, not in plugin scope.

[0.8.0] — 2026-05-13 — Sprint I: Auto-tuning weekly digest

Added

  • Schema v3 (lib/migrations/v3-digest-runs.sql) — digest_runs table: id, generated_at, window, totals_json, recommendations_json, tuning_actions_json, applied_action_ids_json, dismissed_at.
  • computeTuningActions({...}) in bin/digest.mjs — structured action engine producing {id, kind, pack?, key?, currentValue?, proposedValue?, message} proposals:
    • enable_pack — when avg session cost ≥ $1 or prompt p95 ≥ 25k AND pack disabled
    • tune_threshold — raise skill-hygiene.prompt_chars_threshold when p95 > 1.5× current; tighten cost-control.cost_threshold_usd when avg session cost ≪ default
  • SessionStart auto-trigger — first SessionStart after 7 days since last digest_runs.generated_at (or no run yet) calls gatherDigest('7d') and persists. Synchronous, ~15-30 ms typical.
  • bin/tune.mjs + /sdet-pulse:tune [list|view <id>|apply <digestId> <actionId>|dismiss <digestId>] — review and selectively apply suggestions. Returns structured JSON.
    • apply mutates pack_state (enable/disable or update config_json). Idempotent (second apply = alreadyApplied: true).
    • dismiss records dismissed_at for history without applying anything.
  • DB helpers in lib/db.mjs: insertDigestRun, listDigestRuns, getDigestRun, getLatestDigestRun, markDigestActionApplied, markDigestDismissed, setPackConfig, getPackConfig.
  • Tests — 20 new (5 db helpers, 5 computeTuningActions branches, 8 tune CLI flow, 2 SessionStart integration). Total 133.

Changed

  • gatherDigest now also computes per-pack config snapshot + tuning actions. Returns extra fields: tuningActions, packConfigs. Backward-compatible — existing callers ignore new fields.
  • SCHEMA_VERSION bumped 2 → 3 with idempotent migration via existing transaction wrapper.

Notes

  • Tuning actions are conservative. Threshold tuning requires ≥ 10 sessions in window; pack-enable suggestions require ≥ 5. Cooldown is per-digest (not time-based) — same suggestion can re-appear next week if user dismissed but didn't apply.
  • Default pack thresholds are mirrored in bin/digest.mjs::PACK_DEFAULTS so the digest can suggest tuning without dynamic handler imports. Update lockstep when handler defaults change.
  • No silent flips. Plugin never enables/disables a pack or changes thresholds without explicit /sdet-pulse:tune apply. This is design discipline — preview tier testers will trust the plugin only if it asks first.
  • Real dogfood demo (this session's DB after seeding 20 high-cost sessions): digest correctly proposed enable_pack: cost-control ($4.50 avg) and enable_pack: skill-hygiene (75 k p95 prompt) with data citations.

Test count

[0.7.1] — 2026-05-13 — Sprint G: Live in-session alerts

Added

  • lib/transcript-cache.mjs — extracted shared helper: getCachedUsage({transcriptPath, cacheFile, ttlMs}) parses transcript JSONL once and caches result by (path, mtime). Used by statusline AND cost-control to avoid repeated parsing on high-frequency PostToolUse events.
  • packs/skill-hygiene/handler.mjs::onUserPromptSubmit — instant alert on prompt_bloat_live when prompt characters exceed threshold (default 30 000). No transcript parse, just event.prompt.length from CC hook payload. Fires immediately, not at Stop.
  • packs/cost-control/handler.mjs::onPostToolUse — mid-session alerts (cost_ceiling_live, output_ceiling_live, turn_ceiling_live) derived from cached transcript usage. Each alert code rate-limited to 1 emit per 5 minutes per session (no spam from rapid tool-use bursts).
  • hooks/stats-tap.mjs — both handleUserPromptSubmit and handlePostToolUse are now async and dispatch pack events after their primary DB write. Errors go to error.log like the Stop path.
  • Tests — 11 new tests (5 transcript-cache, 2 skill-hygiene live, 3 cost-control live including cooldown verification, 1 stats-tap integration via real subprocess).

Changed

  • bin/statusline.mjs — refactored to use the new shared getCachedUsage helper instead of inline cache logic. Behaviour unchanged; better-tested code path.

Notes

  • Live alert codes carry live: true in context_json so dashboard / digest can distinguish live (mid-session) from post-Stop alerts.
  • Cooldown design — 5-minute rate limit prevents PostToolUse spam. Crossing a threshold once per session is the signal; re-crossing 30 seconds later is noise.
  • PostToolUse fires VERY often during heavy work (multiple tool calls per second is possible). Transcript-cache TTL is 2 s, and cooldown is 5 min, so per-session burden is bounded: at worst ~30 transcript parses per minute (each ~10-50 ms for typical JSONL sizes).

Test count

[0.7.0] — 2026-05-13 — F4-Local: Live Dashboard Daemon (Sprint F)

Added

  • bin/dashboard-server.mjs127.0.0.1-bound HTTP server (Node http, no Express). Routes:
    • GET / — server-rendered HTML dashboard (auto-refresh every 10 s, window selector links).
    • GET /api/stats?window=<7d|30d|all> — JSON of session totals, projects, tools, models, hours, prompt distribution.
    • GET /api/digest?window=<7d|14d|30d> — JSON of digest with cost delta + recommendations.
    • GET /api/alerts?status=<unread|all>&limit=<n> — JSON list of pack alerts.
    • GET /health{ok: true, ts} for uptime probes.
  • templates/dashboard-live.html — dark Apple-flavored single-file template. Hero cards (sessions / cost / output / cache / open-alerts / prompt p95) with threshold colors (gray/yellow/red). Six tables: top projects, top tools, model split, recommendations, top alerts, most expensive sessions. No external dependencies (CSS-only, no JS framework).
  • bin/server-ctl.mjs — daemon lifecycle: start/stop/status. Spawns detached background Node process tracked via ~/.claude/sdet-pulse/server.pid (chmod 600). Server logs to ~/.claude/sdet-pulse/server.log. Idempotent: second start returns alreadyRunning: true.
  • /sdet-pulse:server [start|stop|status] [port] — slash command wrapping server-ctl.mjs. Returns JSON; CC relays the URL or status to user.
  • Tests — 13 new tests:
    • test/dashboard-server.test.mjs (8) — every route (/health, /api/stats, /api/digest, /api/alerts, /, 404 path), real server bind on random port, HTML XSS escaping verification.
    • test/server-ctl.test.mjs (5) — empty PID file, dead-PID detection, full start-status-stop lifecycle on real spawned child, idempotent start, no-op stop.

Changed

  • bin/stats.mjs refactored — gatherStats(db, window) (data) and formatStats(data) (text rendering) split out. runStats now composes them. HTTP API uses gatherStats; CLI keeps the formatted text output unchanged.
  • bin/digest.mjs refactored the same way — gatherDigest + formatDigest. HTTP API reuses the same data layer as CLI.

Test count

Notes

  • Server uses synchronous better-sqlite3 reads (cheap, ~1 ms typical). No connection pool — opens + closes DB per request. Fine for single-user localhost.
  • Default port 3535 is hardcoded but can be overridden via the second $ARGUMENTS arg to /sdet-pulse:server start <port>.
  • HTML output escapes all dynamic fields (project names, tool names, recommendations) — verified by an explicit XSS-input test.
  • Auto-refresh meta tag = 10 s. For instant feedback during heavy work, point your browser tab to the URL and leave it open.

[0.6.2] — 2026-05-13 — Sprint C Test Hardening (post-review)

Added

  • bin/stats.mjsrunStats({ window, db? }) exported. Inline-node script in commands/stats.md extracted here. Now unit-testable.
  • bin/digest.mjsrunDigest({ window, db? }) + computeRecommendations({...}) exported. Inline-node script in commands/digest.md extracted here. The recommendation rule engine (no-packs / high-cost / low-cache / prompt-bloat / all-green) is now individually testable.
  • test/bin-stats.test.mjs — 4 tests covering: empty DB CLI run, seeded totals + cache_pct math, no-DBZ when cacheTotal=0, prompt percentile correctness.
  • test/bin-digest.test.mjs — 7 tests covering: empty DB run, all 5 recommendation branches independently, low-cache-but-few-sessions branch, prev-window cost delta math.
  • test/db.test.mjs — 2 new tests for v1→v2 schema upgrade path (legacy v1 DB upgrades cleanly, preserves session data) and idempotent reopen (no row duplication on repeat open).
  • test/pack-skill-hygiene.test.mjs — 3 new tests for the memory_bloat branch (uses HOME env override to point at tmp dir): bloat detection, under-threshold no-op, missing-file no-op. The 0% coverage gap from the review is now 100%.
  • test/stats-tap.test.mjs — 1 new integration test that enables pack/cost-control, spawns a real Stop hook subprocess, asserts cost_ceiling alert lands. Confirms the full F2 dispatch chain works end-to-end.

Changed

  • lib/db.mjs:initSchema — wrapped all 3 SQL migration calls plus the schema_meta upsert in a single db.transaction(...). A crash mid-migration now rolls back cleanly instead of leaving a partial state.
  • commands/stats.md and commands/digest.md — reduced to 3-line shells that invoke bin/*.mjs. No more SQL+formatting logic locked in markdown heredocs.

Test count

[0.6.1] — 2026-05-13 — Sprint B Hardening (post-review)

Added

  • lib/logger.mjs — structured JSON logger writing to ~/.claude/sdet-pulse/error.log (chmod 600) with size-based rotation (1 MB → error.log.1). Each entry: {ts, source, message}. Replaces silent process.stderr.write calls from hook subprocesses (where stderr is invisible).
  • /sdet-pulse:doctor now surfaces error_log check: counts errors in last 7d, tails last 5 entries with timestamp + source + message snippet. Silent failures are now discoverable.
  • Tests — 4 new logger tests (append, multi-append, string-message, rotation). Total 71+.

Changed

  • lib/packs.mjs:loadPacks — entire per-pack body wrapped in try/catch. A malformed pack.json or a handler with a top-level import error now skips just that pack (logged) instead of aborting the whole loader. Tested via 2 new isolation tests.
  • lib/packs.mjs:dispatchEvent — now async. Awaits each handler so async rejections are caught by the per-pack try/catch instead of becoming unhandled rejections. Tested via 1 new async-handler test.
  • hooks/stats-tap.mjs — all process.stderr.write replaced with logError(source, err). dispatchEvent call now await-ed. Affected sources: stats-tap:backfill:*, stats-tap:dispatch:*, stats-tap:invalid-json, stats-tap:main.
  • commands/dashboard.mdprocess.env.HOME replaced with os.homedir() (broke on Windows / $HOME unset).
  • README.md — removed hardcoded /Users/dariusz/... statusline fallback path; replaced with <HOME>/<VERSION> placeholders + a symlink recipe for stable command paths.

Fixed

  • Silent failure: pack manifest parse errors no longer kill the entire pack loop.
  • Silent failure: pack handler import errors (e.g. broken import statement in handler.mjs) no longer kill the loop.
  • Silent failure: future async pack handlers no longer leak unhandled rejections.
  • Silent failure: backfill / dispatch / main errors are now persisted to disk where the user can find them via /sdet-pulse:doctor.

Not changed (defensible per review)

  • bin/statusline.mjs cache catch {} blocks — graceful degradation, recompute on next render. Confirmed OK.
  • lib/transcript.mjs per-line JSON.parse catch {} — documented behavior (partial JSONL writes during live sessions).
  • lib/db.mjs — no error suppression detected; errors propagate naturally.

[0.6.0] — 2026-05-13 — F3 M4 Weekly digest

Added

  • /sdet-pulse:digest [7d|14d|30d] — single consolidated report:
    • § TOTALS: sessions, cost, output, turns, cache hit % — with cost delta vs previous equal-window (e.g. Δ +18% if this week vs last week).
    • § TOP ALERTS: grouped by (pack, code) with firing counts.
    • § MOST EXPENSIVE SESSIONS: top 5 by cost_usd with project/turns/output.
    • § PACK HEALTH: enabled packs + unacked alert count.
    • § RECOMMENDATIONS: data-driven suggestions — e.g. "No packs enabled" or "Cache hit only X% — investigate prompt churn" or "Prompt p95 = N chars — enable pack/skill-hygiene to track bloat".
  • Digest meant to be read once-per-week as the F6 portfolio article material.

Notes

  • Recommendations engine is rule-based (no ML). Thresholds: spend > $50/window, cache < 80% with > 5 sessions, prompt p95 > 50 000 chars. Tunable in code; future polish — make configurable via pack_state.config_json for a dedicated core/digest pack.

[0.5.2] — 2026-05-13 — F3 M3 Stats enhancements

Added

  • /sdet-pulse:stats now reports:
    • Turns total — sum of turn_count across sessions.
    • Cache hit %cache_read / (cache_read + cache_creation + input). Surfaces how much prompt caching saves.
    • Model split — per-model session count + share of total cost. Reveals when Opus dominates spend vs cheaper models.
    • Tool duration averagesavg_ms next to each tool's call count.
    • Prompt size distribution — n / min / p50 / p95 / max. Surfaces skill-loading bloat.
    • Peak hours — top 3 local hours by session count. Reveals when the user actually works.
  • All 7 totals + 4 breakdown tables on a single /sdet-pulse:stats 7d.

Notes

  • Real-world dogfood snapshot (2026-05-13 local DB): 68 sessions / $1228 / 98% cache hit / Opus owns 99.9% of cost / p95 prompt = 191 k chars. This is the kind of insight a F6 sdet.it article hangs on.

[0.5.1] — 2026-05-13 — F3 M2 Alert acknowledgement

Added

  • ackAlert(db, id) and ackAllAlerts(db, opts?) helpers in lib/db.mjs. ackAllAlerts accepts { pack } filter for per-pack bulk-ack.
  • /sdet-pulse:alerts ack <id> — mark a single alert acknowledged. Acked alerts are hidden from default unread view but visible under all.
  • /sdet-pulse:alerts ack-all [pack] — bulk-acknowledge all unacked alerts (optionally scoped to one pack).
  • Alert table now shows id column so users can copy-paste IDs into ack.
  • Tests — 4 new ack helper tests (single-id ack, missing-id no-op, ack-all, ack-all with pack filter). Total 64.

[0.5.0] — 2026-05-13 — F3 M1 Statusline (live cost / turns / output)

Added

  • bin/statusline.mjs — reads CC's statusline JSON payload from stdin, parses transcript_path via existing lib/transcript.mjs + lib/cost.mjs, returns an ANSI-colored single-line summary: sdet-pulse | $X.XX | N turns | Yk out. Color thresholds gray (safe) → yellow (approaching) → red (over).
  • Plugin-scope registration — added statusLine.type=command to .claude-plugin/plugin.json pointing at ${CLAUDE_PLUGIN_ROOT}/bin/statusline.mjs. CC versions that support plugin-provided statusLine pick this up automatically on plugin (re)install.
  • 2-second mtime-keyed cache~/.claude/sdet-pulse/statusline-cache.json avoids re-parsing the JSONL on every CC re-render. Cache key = transcript_path:mtimeMs.
  • Tests — 5 new subprocess tests (no transcript, missing path, real transcript parse, ANSI presence, cache idempotence). Total 60.

Notes

  • Color thresholds match pack/cost-control defaults: cost > $5, turns > 200, output > 100 000. Yellow kicks in at 60% of the red threshold (cost > $3, turns > 100, output > 50 000).
  • Statusline does NOT depend on the pack being enabled — it's a separate observability surface, always on once the plugin is installed.
  • For CC versions that do NOT honor plugin-scope statusLine, user can add ~/.claude/settings.json snippet (see README).

[0.4.0] — 2026-05-13 — F2 M3 pack/skill-hygiene + doctor pack health

Added

  • pack/skill-hygiene — emits two alert codes from the Stop hook:
    • prompt_bloat (severity info) when the last prompt exceeds prompt_chars_threshold (default 30 000 chars).
    • memory_bloat (severity info) when the active project's MEMORY.md exceeds memory_line_threshold lines (default 200, the truncation point). Memory path derived from cwd via CC's -<slash-replaced> convention; works for any user, not just darco81.
  • Doctor enhancement/sdet-pulse:doctor now reports f2_pack_tables, packs_enabled, and open_alerts per pack.
  • Tests — 3 new tests (prompt_bloat detection, under-threshold no-op, missing-prompts no-op). Total 55.

[0.3.0] — 2026-05-13 — F2 M2 pack/cost-control

Added

  • pack/cost-control — emits three alert codes from the Stop hook based on session totals:
    • cost_ceiling (severity warn) when cost_usd > 5
    • output_ceiling (severity warn) when output_tokens > 100 000
    • turn_ceiling (severity info) when turn_count > 200 Each threshold is configurable via pack_state.config_json. Pack ships disabled — enable with /sdet-pulse:pack enable cost-control.
  • Slash command /sdet-pulse:alerts [unread|all] [limit] — review alerts emitted by enabled packs.
  • Tests — 5 new tests (3 alert codes + no-op-under-threshold + missing-session no-op). Total 52.

[0.2.0] — 2026-05-13 — F2 M1 Rule Packs Scaffold

Added

  • Rule packs system (lib/packs.mjs) — pack loader plus event dispatcher. Each pack lives under packs/<name>/ with a pack.json manifest and a handler.mjs ESM module. Loader respects the pack_state table; disabled packs are skipped entirely. Supports SDET_PULSE_PACKS_DIR env override for test isolation.
  • Schema v2 (lib/migrations/v2-packs-alerts.sql) — pack_state (name PK, enabled, enabled_at, config_json) and alerts (id, session_id, ts, pack, severity, code, message, context_json, acked) tables, plus indexes.
  • Pack state helpers in lib/db.mjsenablePack, disablePack, listPackStates, insertAlert.
  • Hook wiringStop handler dispatches pack events after the F1 JSONL backfill.
  • Slash command /sdet-pulse:pack [list|enable <name>|disable <name>].
  • Tests — 10 new tests bringing total to 47 (5 db schema/helpers, 5 packs loader/dispatch).

Changed

  • Removed temporary debug appendFileSync block from hooks/stats-tap.mjs (F1 troubleshooting leftover).
  • SCHEMA_VERSION bumped to 2; live DBs auto-migrate on next openDb (both plugin_version and schema_version).

Housekeeping pending

  • 5 of 13 db.test.mjs tests still omit db.close() — afterEach rmSync masks the leak, not blocking but inconsistent.

[0.1.0] — 2026-05-12 — F1 Full Stats Infrastructure

Added

  • Full event tap — hook dispatches all 5 CC events: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop. Tool call duration measured via PreToolUse/PostToolUse pairing through ~/.claude/sdet-pulse/tx.json transaction cache (mode 0600).
  • F1 schema — added tool_calls, savings, prompts tables with indexes. Schema migration script lib/migrations/v1-full-schema.sql loaded automatically by openDb(). Fully idempotent (CREATE IF NOT EXISTS) so re-open is safe.
  • Cost calculationlib/cost.mjs with model pricing table for Claude 4.5/4.6/4.7 (Opus, Sonnet, Haiku). Cache_creation tokens charged at 1.25x input rate, cache_read at 0.10x (Anthropic 2026 schedule).
  • JSONL transcript parserlib/transcript.mjs reads ~/.claude/projects/{cwd}/{session_id}.jsonl, aggregates per-assistant-turn usage fields. Resilient to malformed lines (partial writes during live sessions).
  • Slash commands:
    • /sdet-pulse:stats [7d|30d|all] — totals + top projects by cost + top tools by call count
    • /sdet-pulse:savings — per-technique breakdown (empty in F1, populated from F2)
    • /sdet-pulse:doctor — self-check (schema integrity, recent activity, Stop hook firing, tool_calls ratio)
    • /sdet-pulse:dashboard [7d|30d|all] — generates HTML to ~/.claude/sdet-pulse/dashboards/dashboard-YYYY-MM-DD-HH-MM.html
  • HTML dashboard template (templates/dashboard.html) — Apple-flavored styling, hero metrics, timeline table, top projects with cost bars, top tools with call bars. Single self-contained file, no JS framework.
  • Test fixturestest/fixtures/sample-session.jsonl (sanitized, hand-crafted) for transcript parser golden testing.
  • plugin_version auto-migrationinitSchema now UPDATEs schema_meta.plugin_version when it differs from current. Live DBs from F0 (0.0.1) automatically reflect new version on next open.
  • Version bump — 0.0.1 → 0.1.0. Plugin version, package.json, marketplace.json synced.

Schema details

  • tool_calls: 9 columns (id, session_id FK, tool_use_id, tool_name, ts, duration_ms, input_preview, success, bytes_out). Indexes on session_id + ts.
  • savings: 8 columns (id, session_id FK, ts, technique, tokens_saved, calls_saved, cost_saved_usd, note). Indexes on technique + ts.
  • prompts: 5 columns (id, session_id FK, ts, chars, preview). Preview truncated to 120 chars at insert time.

Test count

Privacy posture (unchanged)

[0.0.1] — 2026-05-12

Brand

  • Rebrand: sdet-ccsdet-pulse for marketing clarity (commit 12121f2). Pulse metaphor fits dual scope: observability (vital signs) + optimization (heartbeat detects waste). Brand-coherent z sdet.it ecosystem (brain.sdet.it, skills.sdet.it, future pulse.sdet.it).
  • GitHub repo renamed: darco81/sdet-ccdarco81/sdet-pulse.
  • All paths, env vars (SDET_CC_DATA_DIRSDET_PULSE_DATA_DIR), slash command prefixes (/sdet-cc:*/sdet-pulse:*), and data dir (~/.claude/sdet-cc/~/.claude/sdet-pulse/) updated.

Added (F0 — Bootstrap)

  • Plugin manifest (.claude-plugin/plugin.json) — name, version, description, author.
  • Hook registration (.claude-plugin/hooks.json) — SessionStart matcher wires to hooks/stats-tap.mjs.
  • Paths resolver (lib/paths.mjs) — dataDir(), dbPath(), ensureDataDir() with SDET_PULSE_DATA_DIR env var override. Mode 0700 on parent dir for privacy.
  • SQLite layer (lib/db.mjs) — openDb(), insertSession(), countSessions(). WAL mode + foreign_keys ON. Auto-init schema on first run via schema_meta table (tracks version + plugin_version).
  • Forward-compat schemasessions table has all 13 columns (input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens, cost_usd, turn_count, ended_at, schema_version) ready for F1 token tracking. Zero migration needed.
  • SessionStart hook (hooks/stats-tap.mjs) — reads stdin JSON, dispatches by hook_event_name. On SessionStart: opens db, inserts session row with id, cwd, project=basename(cwd), model, started_at=Date.now(). Closes db in finally.
  • Slash command (commands/stats.md) — /sdet-pulse:stats runs inline Node script that queries SQLite and reports session count.
  • Integration test (test/stats-tap.test.mjs) — spawns hook subprocess with synthetic stdin payload + env override, asserts DB row insertion. Plus unit tests for lib/paths.mjs (4 tests) and lib/db.mjs (4 tests).
  • TDD discipline — all lib/ and hooks/ modules implemented with RED → GREEN cycle observed by subagent implementer.
  • README + design docs — full spec (518 lines) + F0 implementation plan (739 lines) + Linear project tracking.

Architecture decisions

  • Stack locked: Node.js 20+ + better-sqlite3 (sync API, prebuilt binary for Node 24) + tree-sitter WASM (for F2 AST truncate) + Ollama (for F4 Recall embeddings only). No external deps in F0 core.
  • No MCP for AST tools — RPC overhead ~50ms vs <10ms subprocess. Hooks + skills + slash commands sufficient. (Rejected: MCP-first architecture.)
  • Two repos plannedsdet-pulse core + sdet-pulse-recall opt-in (Ollama-dependent). Module C decomposition. (Rejected: marketplace of 3 micro-plugins; rejected: single monolith.)
  • Suggestion-only fuzzy match (F3) — no auto-fix. Risk of file corruption. (Rejected: Wozcode's auto-fix approach.)
  • Privacy by default — no source code in SQLite previews (max 200 chars). F5 sync uses sha256(cwd), strips previews. (Rejected: public showcase dashboard with anonymized client data — too risky.)
  • VPS backend for F5 — reuse existing Docker stack + PostgreSQL on personal VPS. (Rejected: Cloudflare D1/Workers — extra stack to maintain.)

Fixed

  • fix(npm): test script glob (commit 778d361) — node --test test/ doesn't glob directory on Node 24, fails with Cannot find module test. Changed to node --test 'test/*.test.mjs'.
  • fix(commands): remove backslash escape on CLAUDE_PLUGIN_DIR (commit 85d0577) — heredoc escape leaked into slash command markdown, would have broken CC env substitution at runtime. Removed.
  • Dependency bump (commit 0e96365) — better-sqlite3 from ^11.3.0^12.6.0. v11 has no Node 24 prebuilts; native build failed on macOS 26 (Tahoe) due to CLT detection bug in node-gyp.

Metrics

  • Total commits: 16 (spec + plan + 6 task commits + 4 fixes + rebrand + docs)
  • Tests: 10/10 passing (~91ms total)
  • Source LOC: ~150 lines across lib/, hooks/, commands/
  • Spec LOC: 518 (design doc)
  • Plan LOC: 739 (F0 implementation plan)
  • Effort actual: ~1.5h evening (matches plan estimate)

Pending

  • Task 7 (manual): User must run /plugin marketplace add + /plugin install sdet-pulse in Claude Code, restart, then verify /sdet-pulse:stats returns count ≥ 1.
  • After dogfooding for 2-3 sessions to confirm reliability: start F1 implementation plan.

Pre-history

2026-05-12 — Brainstorming session

  • Inspired by Wozcode commercial CC plugin ($20/wk, patent-claimed token reduction).
  • Explored 3 architectural approaches: monolith, marketplace of micros, hybrid core+opt-in. Chose hybrid (C).
  • Decided full port of 7 Wozcode techniques over 8 phases (F0-F7), ~3 months evening work.
  • Strategic insight from user: stats from day 1 = real-data POC after 6 months → sdet.it case study → brand exposure.
  • Saved design spec + F0 implementation plan to docs/superpowers/. Initial commits 1a64393 (spec) and 1a13b38 (plan).