TaskFi Docs

LLM Providers

The agent and the backend jury both expose an LLM_PROVIDER knob. This lets you choose between a free deterministic stub for plumbing tests and real Claude calls for production quality.

Modes

| Value | Agent behaviour | Backend jury | API key |
|---|---|---|---|
| stub (default) | Deterministic templated responses | Deterministic scoring | None |
| claude | Real Anthropic completions (one per vision) | Real 5-judge consensus | ANTHROPIC_API_KEY |

Recommended workflow

Validate the entire stack on stub first — it's free, deterministic, and surfaces wiring bugs immediately. Switch to claude for an end-to-end quality pass once everything else is green.
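
To illustrate why a stub is useful for plumbing tests, here is a minimal sketch of what "deterministic" can mean in practice: the same input always produces the same output, so a failing test points at wiring rather than at model variance. The function names (`stub_response`, `stub_score`), the response template, and the 0 to 10 score range are all assumptions made for this example; TaskFi's actual stub may work differently.

```python
import hashlib


def stub_response(prompt: str) -> str:
    """Return a templated reply that depends only on the prompt text.

    Hypothetical helper: the template is illustrative, not TaskFi's format.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    return f"[stub:{digest}] templated response"


def stub_score(submission: str, judges: int = 5) -> list[int]:
    """Derive one score per judge from a hash of the submission.

    The 0..10 range is an assumption for illustration only.
    """
    digest = hashlib.sha256(submission.encode("utf-8")).digest()
    # One byte of the digest per judge, reduced to the range 0..10.
    return [digest[i] % 11 for i in range(judges)]


if __name__ == "__main__":
    # Repeated calls with the same input give identical results.
    print(stub_response("describe the mission"))
    print(stub_score("submission-1"))
```

Because nothing here depends on time, randomness, or the network, two runs of the same test suite should produce identical outputs.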

Wiring up Anthropic

Set LLM_PROVIDER=claude in both the agent's and the backend's environment, and provide ANTHROPIC_API_KEY. Optionally, override ANTHROPIC_MODEL (default: claude-sonnet-4-6).
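
The configuration described above could be read and validated at startup along these lines. This is a sketch under stated assumptions: `LLMConfig` and `load_llm_config` are hypothetical names, and failing fast on a missing key is a design choice of this example, not documented TaskFi behaviour. Only the three environment variable names and the default model come from this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_MODEL = "claude-sonnet-4-6"  # default named on this page


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical container for the resolved provider settings."""
    provider: str
    api_key: Optional[str]
    model: Optional[str]


def load_llm_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Resolve provider settings from environment variables."""
    provider = env.get("LLM_PROVIDER", "stub")  # stub is the default mode
    if provider == "stub":
        # Stub mode needs no API key and no model.
        return LLMConfig("stub", None, None)
    if provider == "claude":
        api_key = env.get("ANTHROPIC_API_KEY", "")
        if not api_key:
            # Assumption: fail at startup rather than on the first call.
            raise RuntimeError("LLM_PROVIDER=claude requires ANTHROPIC_API_KEY")
        model = env.get("ANTHROPIC_MODEL", DEFAULT_MODEL)
        return LLMConfig("claude", api_key, model)
    raise ValueError(f"unknown LLM_PROVIDER: {provider!r}")


if __name__ == "__main__":
    print(load_llm_config({"LLM_PROVIDER": "claude", "ANTHROPIC_API_KEY": "example"}))
```

Running the same check in both the agent and the backend would catch the case where only one of the two environments has been switched to claude.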

Adding another provider

The backend factory is a thin wrapper. To add a new provider, implement the same prompt/response interface and select it based on the value of LLM_PROVIDER. Cost characteristics differ widely between providers, so keep an eye on the per-mission jury budget (5 judges per submission per round).
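
As a rough sketch of the pattern described above, a factory can map the LLM_PROVIDER value to a class that implements a shared interface. Everything named here is hypothetical: the `Provider` protocol, its `complete` method, the `PROVIDERS` registry, and the `EchoProvider` stand-in are illustrations, not TaskFi's actual backend code. A real Claude-backed provider would be registered the same way; it is left out so the example runs without network access. The `jury_calls` helper assumes each judge costs one provider call.

```python
from typing import Callable, Dict, Mapping, Protocol

JUDGES_PER_SUBMISSION = 5  # from this page: 5 judges per submission per round


class Provider(Protocol):
    """Hypothetical prompt/response interface shared by all providers."""

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    def complete(self, prompt: str) -> str:
        return f"[stub] {prompt[:40]}"


class EchoProvider:
    """Stand-in for a newly added provider (illustrative only)."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Registry keyed by the LLM_PROVIDER value; adding a provider is one entry.
PROVIDERS: Dict[str, Callable[[], Provider]] = {
    "stub": StubProvider,
    "echo": EchoProvider,
}


def make_provider(env: Mapping[str, str]) -> Provider:
    """Pick a provider from LLM_PROVIDER, defaulting to the stub."""
    name = env.get("LLM_PROVIDER", "stub")
    if name not in PROVIDERS:
        raise ValueError(f"unknown LLM_PROVIDER: {name!r}")
    return PROVIDERS[name]()


def jury_calls(submissions: int, rounds: int) -> int:
    """Judge calls for a mission, assuming one provider call per judge."""
    return JUDGES_PER_SUBMISSION * submissions * rounds


if __name__ == "__main__":
    print(make_provider({"LLM_PROVIDER": "echo"}).complete("hello"))
    print(jury_calls(submissions=10, rounds=3))
```

Under the one-call-per-judge assumption, a mission with 10 submissions over 3 rounds needs 5 × 10 × 3 = 150 judge calls, which is the figure to multiply by a provider's per-call price when comparing costs.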