Getting started

Warning

Pre-alpha: whatif is pre-alpha through M9. v0.1, the first installable release, lands in M10. The pages in this section describe the intended workflow; commands and outputs may change before v0.1 ships. Watch the GitHub repo for release notes.

What you’ll do

This section walks you from a blank machine to a verdict report:

Install whatif and verify the CLI is on your $PATH.
Implement the runner contract for your agent: a single Python function.
Run your first experiment: fork production traces, replay them with a proposed change, and read the verdict.

What you’ll need

Python 3.11+.
A tracer with at least 20 recent traces: Langfuse in v0.1 (Phoenix and OpenTelemetry GenAI follow in v0.2).
Read API access to that tracer.
An agent you can re-execute programmatically: a function or class you can construct and call from Python. If your agent only runs as a hosted service, the runner contract pattern doesn’t fit yet. This is a known limitation, tracked for v0.3+.
An eval framework: Inspect AI is the v0.1 default. Bring your own task definition.
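
To make "an agent you can re-execute programmatically" concrete, the sketch below shows a toy agent that can be constructed with a different configuration and called again on the same input. It is not whatif's runner contract, whose signature is not given on this page. `EchoAgent`, `make_runner`, and the `system_prompt` parameter are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EchoAgent:
    """Toy stand-in for a real agent (hypothetical, not part of whatif)."""

    system_prompt: str

    def __call__(self, user_input: str) -> str:
        # A real agent would call a model here; this one only formats a string.
        return f"[{self.system_prompt}] {user_input}"


def make_runner(system_prompt: str) -> Callable[[str], str]:
    """Construct a fresh agent and expose it as a plain callable."""
    return EchoAgent(system_prompt=system_prompt)


# The same input can be replayed against a baseline and a proposed change.
baseline = make_runner("Be concise.")
candidate = make_runner("Be concise. Cite sources.")

print(baseline("What is an SLO?"))
print(candidate("What is an SLO?"))
```

The property that matters is that both variants are built and invoked entirely from Python. An agent reachable only through a hosted endpoint cannot be reconstructed this way, which is the limitation noted in the list above.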

What you won’t need

A whatif server or hosted account. It’s a CLI.
A new tracer. whatif reads from yours.
A new SLO platform. whatif’s output flows into yours.

Who this is for

SREs / platform engineers running LLM systems in production.
ML engineers shipping prompt or model changes weekly.
Anyone debating “is this prompt change actually better” in standup more than once a week.

If those don’t sound like you yet, Concepts might be a better starting point.