A JTBD-First Pattern for Agentic Tool Systems

I started this line of thinking in an earlier post.

Overview

High-accuracy agentic systems are not built by exposing APIs as tools. They are engineered by starting from the job to be done (JTBD), designing a minimal set of task-shaped tools, and using verification to drive iterative improvement.

This process turns probabilistic execution into a system that measurably improves over time.


1) Define the Job to Be Done

A JTBD must specify not just what to do, but what “done” means.

  • The job should imply clear success criteria.
  • Completion must be verifiable, not assumed.

If “done” cannot be checked, the job is underspecified.
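One way to make this concrete is to pair the job statement with an executable definition of "done". A minimal sketch, with a hypothetical schema-migration job and invented state fields:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Job:
    """A job to be done, paired with an executable definition of 'done'."""
    description: str
    is_done: Callable[[dict], bool]  # checked against observed state, never assumed

# Hypothetical job: "done" means the new column exists AND no rows were lost.
migration_job = Job(
    description="Add an 'email' column to the users table",
    is_done=lambda s: "email" in s["columns"]
                      and s["row_count"] == s["row_count_before"],
)

state = {"columns": ["id", "name", "email"],
         "row_count": 42, "row_count_before": 42}
print(migration_job.is_done(state))  # True: completion is verified, not assumed
```

If you cannot write `is_done`, the job is underspecified.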


2) Decompose the Job into Verifiable Sub-Jobs

Break the JTBD into a small number of meaningful sub-jobs, each with:

  • a purpose
  • an expected outcome
  • a way to verify success

This decomposition defines the logical shape of the system.
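The three properties above can be captured directly in a data structure, so each sub-job carries its own check. A sketch with a hypothetical migration decomposition:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SubJob:
    purpose: str
    expected_outcome: str
    verify: Callable[[dict], bool]  # a way to check success against real state

# Hypothetical decomposition of a schema-migration job:
sub_jobs = [
    SubJob("stage the change", "new column exists in the staging branch",
           lambda s: "email" in s["staging_columns"]),
    SubJob("validate data", "no rows lost during the change",
           lambda s: s["rows_after"] == s["rows_before"]),
    SubJob("commit", "staging branch merged into main",
           lambda s: s["merged"]),
]

state = {"staging_columns": ["id", "email"],
         "rows_before": 10, "rows_after": 10, "merged": True}
print(all(sj.verify(state) for sj in sub_jobs))  # True
```

The list of sub-jobs is the logical shape of the system; the tools come next.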


3) Design Tools Around Sub-Jobs, Not APIs

Tools should correspond to units of work, not raw endpoints.

Good tools:

  • bundle steps that always occur together
  • encode best practices and constraints
  • reduce the surface area for error

The goal is a minimal tool set that fully completes the job.
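A task-shaped tool can be sketched as a function that bundles the steps and encodes the constraint in its body. The tool name and in-memory "database" below are illustrative, not a real client:

```python
# A task-shaped tool bundles steps that always occur together and encodes a
# constraint: changes are staged in a temporary branch, never applied to main.

def prepare_database_migration(sql: str, db: dict) -> str:
    """Stage a schema change in a temporary branch (hypothetical sketch)."""
    branch = dict(db["main"])        # copy main into a throw-away staging branch
    branch["pending_sql"] = sql      # constraint: the change is staged, not run
    db["staging"] = branch
    return "staged"

db = {"main": {"tables": ["users"]}}
print(prepare_database_migration(
    "ALTER TABLE users ADD COLUMN email TEXT", db))       # staged
print("pending_sql" in db["staging"], "pending_sql" in db["main"])  # True False
```

Callers cannot skip the staging step, because there is no tool that applies SQL to main directly; the error surface shrinks by construction.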


4) Engineer Planning and Orchestration

Use a planning prompt that:

  • understands the job
  • selects an appropriate strategy
  • sequences tools deliberately
  • respects constraints and safety rules

Planning quality often matters more than model capability.
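One way to make sequencing deliberate is to vet tool orderings up front rather than letting the model improvise. A minimal sketch, with hypothetical job kinds and tool names:

```python
# Vetted tool sequences per job kind: the planner selects a strategy and
# refuses jobs it has no safe plan for. All names are hypothetical.

SAFE_SEQUENCES = {
    "schema_migration": [
        "prepare_database_migration",   # stage first
        "run_sql",                      # verify on the staging branch
        "complete_database_migration",  # commit only after checks
    ],
}

def plan(job_kind: str) -> list[str]:
    if job_kind not in SAFE_SEQUENCES:
        # safety rule: no improvised plans for unknown jobs
        raise ValueError(f"no vetted strategy for job: {job_kind}")
    return SAFE_SEQUENCES[job_kind]

print(plan("schema_migration"))
```

In a real system the planning prompt plays this role in natural language; the point is that the sequence is a designed artifact, not an emergent accident.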


5) Execute with Guardrails

Tool execution should be:

  • constrained
  • reviewable when necessary
  • safe by default

Execution alone is not success; it is only an attempt.
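The three properties can be sketched as a thin execution wrapper. The registry shape and `requires_review` flag are assumptions for illustration:

```python
# Guardrailed execution: constrained (unknown tools rejected), reviewable
# (risky tools held for approval), and honest about status ("attempted",
# never "succeeded" -- success is verification's job).

def execute(tool: str, args: dict, *, registry: dict, approved: bool = False) -> dict:
    spec = registry.get(tool)
    if spec is None:
        return {"status": "rejected", "reason": "unknown tool"}   # constrained
    if spec["requires_review"] and not approved:
        return {"status": "pending_review"}                       # reviewable
    return {"status": "attempted", "result": spec["fn"](**args)}  # safe default

registry = {"drop_table": {"requires_review": True,
                           "fn": lambda name: f"dropped {name}"}}
print(execute("drop_table", {"name": "users"}, registry=registry)["status"])
print(execute("drop_table", {"name": "users"},
              registry=registry, approved=True)["status"])
```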


6) Verify Outcomes Against the Job

Verification is mandatory and explicit.

For each sub-job, verify:

  • correctness
  • completeness
  • compatibility with downstream use

Verification turns execution into accountable work and produces signals the system can learn from.
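The three checks above can be sketched as a verifier that emits per-check signals rather than a single pass/fail bit, so later iteration has something to work with. The outcome fields are illustrative:

```python
# Explicit verification: each check produces a named signal the system can
# learn from, not just an overall boolean.

def verify(outcome: dict) -> dict:
    checks = {
        "correctness": outcome.get("schema_matches", False),
        "completeness": outcome.get("all_rows_migrated", False),
        "downstream_compat": outcome.get("queries_still_pass", False),
    }
    return {"passed": all(checks.values()), "signals": checks}

report = verify({"schema_matches": True,
                 "all_rows_migrated": True,
                 "queries_still_pass": False})
print(report["passed"])                          # False: one check failed
print(report["signals"]["downstream_compat"])    # False: and we know which one
```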


7) Use Verification Signals to Iterate

Verification results enable systematic improvement across:

  • planning prompts
  • tool definitions
  • tool instructions
  • orchestration order
  • tool count (merge or remove tools)

Iteration is driven by evidence, not intuition.
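A small sketch of what evidence-driven iteration looks like in practice: aggregate the named failure signals across runs and fix the most frequent one first. The run records are hypothetical:

```python
# Aggregate verification signals across runs: the most frequent failure tells
# you which prompt, tool definition, or ordering to revisit next.
from collections import Counter

runs = [  # hypothetical verification reports from past executions
    {"failed_checks": ["downstream_compat"]},
    {"failed_checks": []},
    {"failed_checks": ["downstream_compat", "completeness"]},
]

failure_counts = Counter(c for run in runs for c in run["failed_checks"])
print(failure_counts.most_common(1)[0][0])  # downstream_compat
```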


8) Fine-Tune Only After the System Is Stable

Fine-tuning is appropriate when:

  • the structure is sound
  • errors are repeatable
  • verification data is reliable

Fine-tuning reduces residual error; it does not fix poor system design.


Core Principle

Execution completes the job once.
Verification enables the system to do it better next time.

That feedback loop — from JTBD to tools to execution to verification and back — is what differentiates a robust agentic system from a brittle one.

Applied Directly to Neon

1) Start with the Job to Be Done (JTBD)

Always begin by stating the outcome in plain language, focusing on what success must prove:

“Perform a safe, verifiable schema migration.”

This shapes all subsequent design decisions.


2) Collapse Low-Level APIs Into Task-Oriented Tools

Don’t give an LLM a catalog of generic endpoints. Instead, expose a small set of clearly named, job-centric tools.

Neon’s concrete example for schema migration uses just four tools:

  • prepare_database_migration: stage the schema change safely in a temporary branch
  • run_sql / run_sql_transaction: verify the work on the temporary branch
  • describe_table_schema / get_database_tables: inspect structure for correctness
  • complete_database_migration: commit and clean up once checks pass

This small surface (instead of 100+ generic API calls) dramatically increases accuracy by reducing choice overload.

Design principles for tools:

  • Verb-first, goal-focused names (e.g., prepare_…, not POST /v1/db/…).
  • Encoded guardrails (temporary branches, sandbox staging).
  • Clear, outcome-oriented descriptions that implicitly teach workflow order.
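These principles can be sketched as plain tool declarations. The structure below is illustrative, not Neon's actual MCP definitions; note how the descriptions themselves teach the workflow order:

```python
# Hypothetical tool declarations in the style the article recommends:
# verb-first names, encoded guardrails, descriptions that teach ordering.

TOOLS = {
    "prepare_database_migration": {
        "description": "Stage a schema change in a temporary branch. "
                       "Always call this BEFORE run_sql or "
                       "complete_database_migration.",
        "guardrail": "writes only to a temporary branch",
    },
    "run_sql": {
        "description": "Run verification SQL against the temporary branch.",
        "guardrail": "operates on the staging branch, not main",
    },
    "complete_database_migration": {
        "description": "Commit the staged change and clean up. "
                       "Only call AFTER verification queries pass.",
        "guardrail": "should not run before verification",
    },
}

print(list(TOOLS))  # the whole job in three names
```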

3) Build Planning/Orchestration That Reflects the Job

A good planning prompt should map high-level intent to tool sequences with minimal ambiguity.

By keeping the toolkit small and intentional:

  • the agent’s decision space shrinks,
  • the correct sequence emerges naturally,
  • “tool roulette” disappears.

This is the essence of “Your API is not an MCP”: build an MCP server that shapes the assistant’s interaction around the job, not the API surface.


4) Execute with Explicit Safety and Stage Gates

Design tools so that harmful actions cannot occur before checks are complete. For Neon’s migration flow, that means:

  • stage changes in a throw-away environment,
  • run verification queries,
  • only complete the full migration after checks pass.

These built-in safety guardrails reduce silent failures and avoid catastrophic states.
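A stage-gate can be made structural rather than advisory: the commit step refuses to run until verification has passed. A minimal sketch with hypothetical names:

```python
# Stage gates as code: harmful actions cannot occur before checks complete,
# because the commit method enforces the gate itself.

class Migration:
    def __init__(self):
        self.staged = False
        self.verified = False

    def stage(self):
        self.staged = True                  # throw-away environment

    def verify(self, checks_pass: bool):
        if not self.staged:
            raise RuntimeError("nothing staged to verify")
        self.verified = checks_pass         # run verification queries here

    def complete(self) -> str:
        if not self.verified:
            raise RuntimeError("gate closed: verification has not passed")
        return "committed"

m = Migration()
m.stage()
try:
    m.complete()                            # blocked: verification not yet run
except RuntimeError as e:
    print(e)                                # gate closed: verification has not passed
m.verify(checks_pass=True)
print(m.complete())                         # committed
```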


5) Verification Is the Essential Feedback Loop

Verification is why iterative improvement is possible:

  • it yields concrete error signals,
  • it converts execution outcomes into evidence,
  • it gives you something to optimize against.

For schema migration, verification looks like:

  • running validation SQL (run_sql),
  • inspecting schema (describe_table_schema),
  • checking counts/constraints before commit.

This makes success observable and failures inspectable, which in turn means:

  • planning prompt tweaks can be measured,
  • tool definitions can be refined,
  • orchestration orders can be compared,
  • training data for fine-tuning can be generated.

Without verification, you have no learning signal.
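The three verification moves above can be demonstrated end to end against an in-memory SQLite database. The specific checks are illustrative, not Neon's actual queries:

```python
# Concrete verification sketch: validation SQL, schema inspection, and a
# row-count check before "commit", run against an in-memory database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com')")

# 1) validation SQL: no NULL emails slipped through
nulls = conn.execute("SELECT COUNT(*) FROM users WHERE email IS NULL").fetchone()[0]

# 2) schema inspection: the email column actually exists
cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]

# 3) count check before commit: no rows lost
rows = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print(nulls == 0 and "email" in cols and rows == 2)  # True
```

Each check maps to one of the tools in the migration flow: run_sql for the validation query, describe_table_schema for the inspection, and the count as the final gate before complete_database_migration.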


6) Iterative Refinement Driven by Verification

Explicit verification generates the metrics that make iteration possible.

Use those signals to:

  • merge tools used together frequently,
  • remove rarely needed tools,
  • clarify documentation so the agent learns faster,
  • refine planning prompts to avoid predictable failure points.

Iterations shrink the decision space and improve accuracy by progressively removing ambiguity.
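The "merge tools used together frequently" signal can be mined directly from tool-call traces. A sketch with hypothetical, shortened tool names:

```python
# Mine tool-call traces for pairs that co-occur in every run -- candidates
# for merging into a single task-shaped tool.
from collections import Counter
from itertools import combinations

traces = [  # hypothetical per-run tool sequences
    ["prepare", "run_sql", "complete"],
    ["prepare", "run_sql", "complete"],
    ["prepare", "describe", "run_sql", "complete"],
]

pairs = Counter(p for t in traces for p in combinations(sorted(set(t)), 2))
always_together = [p for p, n in pairs.items() if n == len(traces)]
print(("prepare", "run_sql") in always_together)  # True
```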


7) Fine-Tune Only After Structure Stabilizes

After a reliable baseline (small tool set + consistent verification) exists, fine-tuning models makes sense:

  • specialize models for tool selection accuracy,
  • reduce hallucination around tool usage,
  • handle domain-specific language patterns,
  • shorten planning loops and improve decisiveness.

But fine-tuning alone can’t fix poor tooling or underspecified JTBDs — the verification-driven structure does.


Core Insight

Execution completes the job once.
Verification connects one run to the next.
A small, task-focused toolset makes those links visible.

This is exactly the pattern Neon ships in its MCP workflow, and why the original article emphasizes:

  • small, well-named tools,
  • encoded guardrails,
  • verifiable transitions,
  • and job-level sequencing taught via documentation and labels.

Quick Practical Checklist

  1. State the JTBD with verifiable success criteria.
  2. Define ≤10 task-oriented tools that cover the job end-to-end.
  3. Label tools with verb-first, outcome-focused names.
  4. Build planning prompts that naturally select and sequence tools.
  5. Execute with built-in safeties (sandboxes, checks).
  6. Verify outcomes explicitly and early.
  7. Use verification signals to refine tools, prompts, and orchestration.
  8. Fine-tune only once the core system is stable.