Getting started
Warning: Pre-alpha
whatif is pre-alpha through M9. v0.1 (the first installable release) lands in M10. The pages in this section describe the intended workflow; commands and outputs may change before v0.1 ships. Watch the GitHub repo for release notes.
What you’ll do
This section walks you from a blank machine to a verdict report:
1. Install whatif and verify the CLI is on your $PATH.
2. Implement the runner contract for your agent: a single Python function.
3. Run your first experiment: fork production traces, replay them with a proposed change, read the verdict.
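The runner contract in step 2 is described here only as "a single Python function"; its exact signature isn't fixed until v0.1 ships. A minimal sketch of the idea, assuming the function receives one recorded trace and returns your agent's new output for comparison. The names `run`, `MyAgent`, and the dict shapes are all hypothetical, not whatif's actual API:

```python
from typing import Any


class MyAgent:
    """Stand-in for your real agent: anything you can construct and call
    from Python satisfies the "re-executable programmatically" requirement."""

    def __init__(self, model: str) -> None:
        self.model = model

    def answer(self, question: str) -> str:
        # A real agent would call an LLM here; this stub just echoes
        # so the sketch is self-contained.
        return f"[{self.model}] {question}"


def run(trace: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical runner: replay one forked production trace.

    Assumed shape: `trace` carries the original user input, and the
    returned dict is what gets diffed against the recorded output.
    """
    agent = MyAgent(model="candidate-model")  # the proposed change under test
    return {"output": agent.answer(trace["input"])}
```

Under this assumption, whatif would call `run` once per forked trace, so the function should construct the agent with the proposed change applied and stay free of hidden state between calls.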
What you’ll need
- Python 3.11+.
- A tracer with at least 20 recent traces. Langfuse is supported in v0.1; Phoenix and OpenTelemetry GenAI follow in v0.2.
- Read API access to that tracer.
- An agent you can re-execute programmatically: a function or class you can construct and call from Python. If your agent only runs as a hosted service, the runner contract pattern doesn’t fit yet (a known limitation, tracked for v0.3+).
- An eval framework. Inspect AI is the v0.1 default; bring your own task definition.
What you won’t need

- A whatif server or hosted account. It’s a CLI.
- A new tracer. whatif reads from yours.
- A new SLO platform. whatif’s output flows into yours.
Who this is for
SREs / platform engineers running LLM systems in production.
ML engineers shipping prompt or model changes weekly.
Anyone debating “is this prompt change actually better” in standup more than once a week.
If those don’t sound like you yet, Concepts might be a better starting point.