Playbooks
Tool‑Call Parsing: Safe Multi‑Format Fallback (Tags / JSON / Brackets)
In production, models often intend to call tools but emit non-standard syntax. Strict parsers undercount capability; permissive parsers cause accidental execution. This playbook gives a fail-closed, auditable, multi-format parser design.
tool-calling agents eval parser
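As a rough illustration of what "fail-closed, multi-format" could mean here, the sketch below tries three syntaxes (tags, bracket markers, bare JSON) and refuses to return a call unless exactly one candidate is found and it names an allowlisted tool. Everything concrete in it is an assumption made for the example, not the playbook's actual design: the `<tool_call>` and `[TOOL: name]` syntaxes, the `ALLOWED_TOOLS` set, and the `ParseResult` shape are all hypothetical.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical allowlist; a real system would load this from its tool registry.
ALLOWED_TOOLS = {"search", "get_weather"}

# Hypothetical surface syntaxes, chosen only for illustration.
TAG_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
BRACKET_RE = re.compile(r"\[TOOL:\s*(\w+)\s*\](.*?)\[/TOOL\]", re.DOTALL)


@dataclass
class ParseResult:
    tool: Optional[str]   # None means "do not execute anything"
    args: Optional[dict]
    fmt: Optional[str]    # which format matched, kept for auditing
    reason: str           # human-readable audit note


def _load_object(text: str) -> Optional[dict]:
    """Return a JSON object, or None if the text is not a JSON object."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_tool_call(text: str) -> ParseResult:
    candidates = []
    for m in TAG_RE.finditer(text):
        obj = _load_object(m.group(1).strip()) or {}
        candidates.append(("tags", obj.get("name"), obj.get("arguments")))
    for m in BRACKET_RE.finditer(text):
        candidates.append(("brackets", m.group(1), _load_object(m.group(2).strip())))
    if not candidates:
        # Bare JSON is the last fallback, and only if the whole message is JSON.
        obj = _load_object(text.strip())
        if obj is not None and "name" in obj:
            candidates.append(("json", obj.get("name"), obj.get("arguments")))

    if not candidates:
        return ParseResult(None, None, None, "no tool call found")
    if len(candidates) > 1:
        # Fail closed: never guess which of several candidates was intended.
        return ParseResult(None, None, None, "rejected: ambiguous, multiple candidates")

    fmt, name, args = candidates[0]
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        return ParseResult(None, None, fmt, f"rejected: unknown tool {name!r}")
    if not isinstance(args, dict):
        return ParseResult(None, None, fmt, "rejected: arguments are not an object")
    return ParseResult(name, args, fmt, "accepted")
```

Every path that does not end in "accepted" returns `tool=None`, so a caller that executes only when `tool` is set cannot run a malformed or ambiguous call by accident, and the `fmt` and `reason` fields leave a record of why each message was accepted or refused.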
Tool‑Calling Judgment: Decision Framework + Minimal Regression Suite
Tool calling fails less often on schemas than on judgment: over-calling (loops) versus under-calling (hallucination). This playbook gives an explicit decision framework and a minimal regression suite to measure Action + Restraint + Recovery.
tool-calling agents eval
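One way such a regression suite might be laid out is sketched below. It assumes, without confirmation from the playbook, that "Action" means calling a tool when one is needed, "Restraint" means not calling one when it is not, and "Recovery" means choosing a sensible next step after a tool error. The `Case` type, the example prompts, and `toy_agent` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Case:
    category: str                  # "action", "restraint", or "recovery" (assumed meanings)
    prompt: str
    expected_tool: Optional[str]   # None means the agent should NOT call a tool


# Hypothetical cases; a real suite would hold several per category.
EXAMPLE_CASES: List[Case] = [
    Case("action", "What is the latest release of the library?", "search"),
    Case("restraint", "What is 2 + 2?", None),
    Case("recovery",
         "Tool error: search found nothing for the latest release. Answer the user.",
         None),
]


def score(cases: List[Case],
          agent: Callable[[str], Optional[str]]) -> Dict[str, float]:
    """Return the pass rate per category for an agent mapping prompt -> tool name or None."""
    totals: Dict[str, int] = {}
    passed: Dict[str, int] = {}
    for case in cases:
        totals[case.category] = totals.get(case.category, 0) + 1
        if agent(case.prompt) == case.expected_tool:
            passed[case.category] = passed.get(case.category, 0) + 1
    return {cat: passed.get(cat, 0) / totals[cat] for cat in totals}


def toy_agent(prompt: str) -> Optional[str]:
    """Stand-in agent: calls `search` whenever the prompt mentions 'latest'."""
    return "search" if "latest" in prompt else None
```

Reporting the three rates separately, rather than one blended accuracy, keeps an over-calling agent from hiding behind a high Action score: `toy_agent` passes the action and restraint cases but fails the recovery case because it retries the search instead of stopping.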