David Victor
Case study · 2025–2026

Phos4

Make SQL feel like thinking, not typing.

A browser-based SQL IDE shipped with Mozart Data. I led three workstreams over the engagement: AI transformation for the team and the product, a frontend rebuild on a modern foundation with a deep theming engine, and the brand and web design Phos4 launched under.

AI transformationFrontend rebuildBrand & web designLive site
Phos4 with the seventh theme palette applied: warm cream light theme.
Context

How an AI training engagement widened into product, frontend, and brand.

Mozart Data was building Phos4 with an 11-person engineering team. They brought me in as an AI transformation consultant. The starting brief was to install agentic development as a discipline on the team.

Within the first few weeks the scope had to widen. The existing frontend wasn't built for what shipped on it: the SQL editor, the agent surfaces, the theming, the chart layer. The marketing site and brand were placeholder. The velocity needed to ship a product of this size was several multiples ahead of where the team was.

The engagement landed on three concurrent workstreams. I led all three.


AI transformation

Pair sessions, prompt patterns, review workflows, quality gates.

Mozart's engineers had AI tools available but no shared practice. Nobody on the team had used agentic development with discipline: no prompt patterns, no review conventions for AI-generated code, no quality gates, no shared mental model for when to drive the agent and when to take over. The work hadn't been compounding.

I embedded with the team. The work was concrete: pair sessions on real production tickets; a prompt architecture standardised across the team; review workflows that distinguish AI-authored from human-authored code; quality gates for AI-touched changes on critical paths; and shared templates, harnesses, and examples.

Team velocity

Commits per week

Weekly engineering output across an eleven-engineer team. The CEO learned the agentic practice alongside the team and moved from typical-engineer commit volume to roughly thirty percent of all commits.

Weekly commit history, Jul 2025 through Apr 2026. Full-rate reference: 924 commits/week.
Team velocity

Weekly commits climbed from about fifty to nearly a thousand.

The CEO learned the practice alongside the team. He moved from typical-engineer commit volume to producing about thirty percent of all commits, and held that lead by a wide margin. The chart above is the modeled view of that transition. A model projects 924 commits per full-rate week from sustained per-engineer throughput across the team. Weekly lines changed moved from 5K–10K to 30K–80K. The practice is still in place after I left.


Frontend rebuild

Rebuilt the frontend: React 19, Vite, 285 components.

A SQL editor with dialect-aware intelligence, an agent surface, a chart engine, a schema browser, a profiler. All running concurrently, all sharing state, all driven by the same theming system. The existing codebase wasn't built for it.

I rebuilt the foundation. React 19 and TypeScript over Vite. 285 components organised across 33 feature categories. 66 reusable UI primitives built on Radix and shadcn/ui with Tailwind v4: 13 button variants and 5 input variants, every primitive documented in Storybook. Nine React Context providers for cross-cutting concerns (auth, file management, warehouse metadata, focus mode, user preferences). Ten Redux slices for domain logic (editor state, query execution, AG Grid configuration, formatting, profiler, schema browser, AI assistant). TanStack Query for fetching. AG Grid Enterprise for tabular data. Vitest for unit tests, Playwright for end-to-end, Storybook for visual regression.

Components
285
UI Primitives
66
Palettes
10
The theming engine

Ten palettes flowing through every layer.

Engineers tune their environment all day, and the product had to absorb that without losing state, switching builds, or looking inconsistent across panels.

Phos4 ships ten palettes across light and dark, driven by a dynamic CSS-variable system and cached in localStorage so a switch is instant on the next reload. Every layer reads from the same tokens (chart primitives, editor decorations, Monaco themes, surface chrome), so a palette change flows through the whole product without anything looking grafted on. The viewer at the top of this page is the proof.


In the product

Four surfaces: the agents, the memory, the data, the editor.

Four surfaces run on the foundation: the routed agent system, the memory layer, the data surfaces around the editor, and the editor itself.

In-product agents

Three agents: write, review, pressure-test.

Alongside the coaching, I designed and built Phos4's first agentic AI surfaces inside the product. The architecture is a routed multi-agent system: one agent writes SQL, another reviews it, a third pressure-tests the data.

Backlog · 20
INV-412Investigate null_pct spike in signup.events
PIV-118Pivot ARR by cohort quarter
COR-089Correlate retention × onboarding completion
SCH-077Retention_cohort column on users
ANO-223Flag anomalous refund surge (Q3)
OPT-056Optimize quarterly_report.sql
AUD-145Audit PII access on prod mirror
SEG-091Segment power users by tenure
INV-318Investigate latency regression on /search
PIV-224Pivot churn by acquisition channel
COR-156Correlate NPS with ticket volume
SCH-104Add cohort_activity_daily rollup
OPT-087Index scan → seq scan on events_log
AUD-210Audit row-level policies on reporting
INV-499Investigate duplicate event_ids after replay
PIV-301Pivot feature adoption by plan tier
COR-402Correlate session length × repeat_purchase
SCH-142Drop deprecated user_metadata_v1
OPT-033Rewrite dashboard_view to avoid full scan
AUD-055Verify PII encryption on backups
Agents working · 8
CoddPIV-118Pivot ARR by cohort quarter
expecting ~981 rows post-aggregate
TukeyINV-412Investigate null_pct spike in signup.events
lining up deploy log against event pattern
KimballCOR-089Correlate retention × onboarding completion
splitting by self-serve vs sales-led
KnuthSCH-077Retention_cohort column on users
drafting rollback alongside migration
HopperANO-223Flag anomalous refund surge (Q3)
lining up deploy log against event pattern
ShannonOPT-056Optimize quarterly_report.sql
shortlisting window-function vs indexed scan
WickhamAUD-145Audit PII access on prod mirror
pulling access logs · last 30 days
BayesSEG-091Segment power users by tenure
splitting by self-serve vs sales-led
Ready for you · 5
INV-240Refactor cohort_activity to incremental model
new
SCH-019Add retention_cohort column to users
approved
PIV-088Pivot signups by campaign × week
new
OPT-044Rewrite fact_sessions for partition pruning
new
COR-107Correlate D30 retention × onboarding steps
comment
Three agents working a single query: write, review, pressure-test.

Under the hood: multiple LLM providers through OpenRouter. Streaming insights over server-sent events. A tool layer that dispatches to specialised handlers for dbt commands, git operations, schema lookups, and query analysis. Model selection so users can choose their provider.

Watch the routed system on a single SQL file. The agent surfaces the problem, drafts a reviewed fix, shows the diff, and benchmarks the output against the baseline. The engineer judges the work.

1
WITH scored AS (
2
SELECT
3
g.week,
4
t.abbr AS team,
5
t.division,
6
g.points_scored AS pts,
7
AVG(g.points_scored) OVER (
8
PARTITION BY t.abbr
9
ORDER BY g.week
10
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
11
) AS rolling_3,
12
AVG(g.points_scored) OVER (
13
PARTITION BY t.abbr
14
) AS season_avg
15
FROM games g
16
JOIN teams t USING (team_id)
17
WHERE g.season = 2025
18
)
19
SELECT team, division, week, pts,
20
ROUND(rolling_3, 1) AS last3_avg,
21
ROUND(rolling_3 - season_avg, 1) AS vs_season
22
FROM scored
23
WHERE week >= 5
24
ORDER BY vs_season DESC
25
LIMIT 25;
Running nightly scan…
Ask phos4 to refactor a model…
Claude Thanos 4.8
An agent turning a flagged SQL problem into a reviewed fix, a diff, and a benchmark.
Memory and context

A working memory the agent earns through the session.

Conversation history carries context chips so users can pin specific tables, schemas, or prior queries to the conversation. Auto-analysis surfaces insight about query results inline. Findings accumulate across the session. The ones that hold up reaffirm, the ones that don't get invalidated, and what's left is a working memory of what the engineer has actually learned about the data.

Memory graph · 100 lobe · 380 totalentity tour
Acme Inc.
entity

Acme Inc.

reaffirmed
CRM · synced 2h ago2023-04-14confidence high
Enterprise customer since 2023. VP Data: Maya Chen. 120-seat team, pays quarterly, renewed twice.
Maya ChenContract #8241Acme data team
Reaffirmed 4× · last accessed 2h ago
Linked from onboarding intake
The memory graph as it accumulates, reaffirms, and invalidates findings across a session.
Data surfaces

Around the editor: pivot, profile, browse.

The editor is one panel; the rest of the tool is a working analytics surface. AG Grid Enterprise drives the result table. A custom column profiler reads distribution, density, and drift on every field. The schema browser, the pivot configuration, and the profiler share state with the editor through Redux slices, so a query, its results, and a pivoted view of those results are reading from the same ground truth.

Pivot settings
Rows02
Aateam.divisionasc
Aateam.nameasc
Columns01
123game.weekasc
Values01
123points_scoredsum
Filters01
123season=2025
123game.week4
Add filter
synced
Team
W1
W2
W3
W4
W5
W6
W7
W8
Total
Avg
AFC East3 teams
BUFBills
34
38
22
18
31
30
24
197
28
MIADolphins
20
24
17
30
16
28
27
162
23
NEPatriots
17
14
23
20
16
21
13
124
18
AFC North3 teams
BALRavens
24
20
33
35
20
41
28
201
29
CINBengals
14
17
27
24
22
31
19
154
22
PITSteelers
22
17
24
28
24
17
20
152
22
AFC West2 teams
KCChiefs
27
31
19
22
27
26
31
183
26
LACChargers
34
20
27
30
22
24
17
174
25
NFC East2 teams
PHIEagles
28
34
24
21
34
27
24
192
27
DALCowboys
33
28
17
25
10
41
20
174
25
NFC North2 teams
DETLions
21
38
26
31
24
31
28
199
28
GBPackers
18
24
18
29
23
21
27
160
23
NFC West2 teams
SF49ers
30
35
17
35
24
30
34
205
29
LARRams
24
20
30
16
21
27
24
162
23
League Avg
25
26
23
25
25
25
28
23
199
25
Column profile
nfl_combine · 100 rowspulled from BAL · W6
player_name
100 uniq
overall_pick
1 → 224
drafted_by
28 teams
position
WR · 21
wingspan_in
μ 77.1
composite
left-skew
athletic_rank
uniform
bench_reps
mode 14
forty_yard
μ 4.58
vertical_in
μ 34.6
is_senior
65% T
shuttle_run
μ 4.32
The pivot surface, settings panel, and column profiler: three layers around the editor sharing one state model.
The editor

Monaco that comprehends SQL.

Phos4's Monaco integration goes well past highlighting. 18 themes, semantic token mapping, CSS-variable integration. Dialect-aware behaviour for Snowflake, BigQuery, Databricks, and Redshift: CTE-aware code folding that follows the query's logical structure, a join profiler with predicate analysis and join-type detection, dialect-specific function autocomplete, Jinja templating for dbt workflows, and a baseline diff for comparing query versions. Focus Mode dims inactive panels; Zen Mode strips everything but the query and the result.

phos4 / queries / weekly_trends.sql
1
WITH scored AS (
2
SELECT
3
g.week,
4
t.abbr AS team,
5
t.division,
6
g.points_scored AS pts,
7
AVG(g.points_scored) OVER (
8
PARTITION BY t.abbr
9
ORDER BY g.week
10
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
11
) AS rolling_3,
12
AVG(g.points_scored) OVER (
13
PARTITION BY t.abbr
14
) AS season_avg
15
FROM games g
16
JOIN teams t USING (team_id)
17
WHERE g.season = 2025
18
)
19
SELECT team, division, week, pts,
20
ROUND(rolling_3, 1) AS last3_avg,
21
ROUND(rolling_3 - season_avg, 1) AS vs_season
22
FROM scored
23
WHERE week >= 5
24
ORDER BY vs_season DESC
25
LIMIT 25;
connected · prod-readonlypostgres 16
SQLUTF-8LFLn 20, Col 12
The editor itself: a four-tab Monaco surface with dialect-aware highlighting, CTE folding, and inline join profiling.

The team practice and the in-product agents were built side by side. Same scoping rules, same review conventions, same line between agent judgment and human judgment. That's why the team practice didn't unwind the day I left, and why the in-product AI rides the same review patterns the team uses.

Credit where it's due. The VP of Frontend Engineering built the DuckDB WASM integration that handles client-side SQL processing with Apache Arrow and Parquet, eliminating server round-trips for query execution. Other engineers built backend services, the profiler pipeline, and the warehouse connection layer. I owned the design, the frontend architecture, the theming engine, and the AI integration layer. The team built on top of it.


Brand and web design

An Edison-bulb visual language for engineers close to the metal.

Phos4 had no mark, no typographic system, and no voice. The marketing site at phos4.ai was placeholder.

I designed the brand from scratch. Mark, palette, typography, motion. The visual language reaches back to the Edison-bulb and CRT-phosphor era of computing, when builders were tinkerers and the path from intent to result ran through equipment they could open up and adjust by hand. That era is the right reference for SQL because the work hasn't actually changed shape. Engineers still run real queries against real warehouses with real consequences, and the agentic layer compounds the parts that should compound without removing the engineer from the metal.

The marketing site at phos4.ai runs on the same tokens as the product, so an engineer arriving from a peer recommendation lands on the same surface they'll work inside all day. The hero on phos4.ai has a quiet trick: drag the cursor across the surface and the screen smudges along your trail like a finger across a CRT, exposing the code that's been there the whole time.

phos4.ai in motion.

The web presence does its own work: positioning Phos4 against legacy SQL IDEs on one side and generic AI-wrapper products on the other; demoing the editor in motion; surfacing the agent's behaviour without making it the headline; converting the engineer who landed there from a peer recommendation.


Outcome

What shipped.

Phos4 launched as a production SQL IDE at phos4.ai. The CEO is now the team's top contributor by a wide margin, the team is sustaining the new velocity, and a model projects a 924-commit full-rate week. The frontend ships 285 components on a foundation that carries the product. The brand and the marketing site put Phos4 in front of the engineers it's for.

Top contributor share
30%
Projected commits/wk
924
Components shipped
285
Colophon
Every figure here is a live React component. Each is scoped so it can't leak into the page, and the same component renders inside chat cards through a shared registry. Ask the agent to show you one.

Praise


Honestly, I don't think we'd have shipped on time without David. We had a broken front end, a team that hadn't really used AI in production, and no brand to put on the door. He worked all three at the same time, and we made it to our public beta.

Leon TchikindasCEO, Phos4
LT