I started this line of thought in an earlier post.
Overview
High-accuracy agentic systems are not built by exposing APIs as tools. They are engineered by starting from the job to be done (JTBD), designing a minimal set of task-shaped tools, and using verification to drive iterative improvement.
This process turns probabilistic execution into a system that measurably improves over time.
1) Define the Job to Be Done
A JTBD must specify not just what to do, but what “done” means.
- The job should imply clear success criteria.
- Completion must be verifiable, not assumed.
If “done” cannot be checked, the job is underspecified.
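One way to make this concrete is to pair the job statement with an explicit "done" check from the start. A minimal sketch, where the job description and the success predicate are assumptions chosen for illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Job:
    """A job to be done, paired with an explicit check for 'done'."""
    description: str
    is_done: Callable[[dict], bool]  # verifiable success criterion

# Hypothetical example: a migration job counts as done only when the
# new column exists AND no rows were lost.
migration_job = Job(
    description="Add an 'email' column to users without losing rows",
    is_done=lambda state: (
        "email" in state["columns"]
        and state["rows_after"] == state["rows_before"]
    ),
)
```

If you cannot write the `is_done` predicate, the job is underspecified, and that is worth discovering before any tools are built.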
2) Decompose the Job into Verifiable Sub-Jobs
Break the JTBD into a small number of meaningful sub-jobs, each with:
- a purpose
- an expected outcome
- a way to verify success
This decomposition defines the logical shape of the system.
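The three required properties of a sub-job map directly onto a data structure. A sketch, with a hypothetical schema-migration decomposition as the example:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SubJob:
    purpose: str                     # why this step exists
    expected_outcome: str            # what should be true afterwards
    verify: Callable[[dict], bool]   # how to check success

# Hypothetical decomposition of a schema-migration job.
sub_jobs = [
    SubJob(
        purpose="Stage the schema change in an isolated environment",
        expected_outcome="New column exists in the staging schema",
        verify=lambda s: "email" in s["staging_columns"],
    ),
    SubJob(
        purpose="Confirm no data was lost during staging",
        expected_outcome="Row counts match before and after",
        verify=lambda s: s["rows_before"] == s["rows_after"],
    ),
]

def all_verified(state: dict) -> bool:
    """The whole job is done only when every sub-job verifies."""
    return all(sj.verify(state) for sj in sub_jobs)
```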
3) Design Tools Around Sub-Jobs, Not APIs
Tools should correspond to units of work, not raw endpoints.
Good tools:
- bundle steps that always occur together
- encode best practices and constraints
- reduce the surface area for error
The goal is a minimal tool set that fully completes the job.
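The difference between an endpoint and a task-shaped tool is bundling. A sketch in which every function is a hypothetical stand-in for a real API call:

```python
# Raw endpoints, which an agent would otherwise have to sequence itself.
def create_branch(db: dict) -> dict:
    return {"parent": db, "changes": []}

def apply_sql(branch: dict, sql: str) -> None:
    branch["changes"].append(sql)

def describe_schema(branch: dict) -> list:
    return list(branch["changes"])

# A task-shaped tool bundles the steps that always occur together
# into one unit of work, so the agent cannot skip or reorder them.
def prepare_migration(db: dict, sql: str) -> dict:
    branch = create_branch(db)
    apply_sql(branch, sql)
    return {"branch": branch, "applied": describe_schema(branch)}
```

The agent sees one tool instead of three endpoints, which removes an entire class of sequencing errors.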
4) Engineer Planning and Orchestration
Use a planning prompt that:
- understands the job
- selects an appropriate strategy
- sequences tools deliberately
- respects constraints and safety rules
Planning quality often matters more than model capability.
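Those four requirements can be assembled mechanically into a planning prompt. A sketch; the wording and tool descriptions are illustrative, not a fixed template:

```python
def build_planning_prompt(job: str, tools: dict, constraints: list) -> str:
    """Assemble a planning prompt from the job, tools, and constraints."""
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        f"Job to be done: {job}\n\n"
        f"Available tools:\n{tool_lines}\n\n"
        f"Constraints:\n{constraint_lines}\n\n"
        "Select a strategy, then sequence the tools deliberately. "
        "Do not call a destructive tool before its checks have passed."
    )

prompt = build_planning_prompt(
    job="Perform a safe, verifiable schema migration",
    tools={
        "prepare_database_migration": "Stage the change in a temp branch",
        "complete_database_migration": "Commit once checks pass",
    },
    constraints=["Never modify the main branch directly"],
)
```

Because the prompt is generated from structured inputs, changing a tool description or constraint automatically updates the plan the model sees.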
5) Execute with Guardrails
Tool execution should be:
- constrained
- reviewable when necessary
- safe by default
Execution alone is not success; it is only an attempt.
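A guardrail can be as simple as a gate that destructive tools cannot pass until checks succeed, plus an audit log for review. A minimal sketch, with hypothetical tool names:

```python
class GuardedExecutor:
    """Runs tools; destructive ones are blocked until checks pass."""

    def __init__(self):
        self.checks_passed = False
        self.audit_log = []  # every attempt is recorded for review

    def run(self, tool_name: str, fn, *, destructive: bool = False):
        if destructive and not self.checks_passed:
            self.audit_log.append((tool_name, "blocked"))
            raise PermissionError(f"{tool_name} blocked: checks not passed")
        self.audit_log.append((tool_name, "executed"))
        return fn()
```

Safe-by-default here means the destructive path fails closed: forgetting to verify produces a blocked call, not a damaged database.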
6) Verify Outcomes Against the Job
Verification is mandatory and explicit.
For each sub-job, verify:
- correctness
- completeness
- compatibility with downstream use
Verification turns execution into accountable work and produces signals the system can learn from.
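For verification to produce learning signals, it should return a structured record per sub-job rather than a bare boolean. A sketch; the three check labels mirror the list above, and the lambdas are placeholders for real checks:

```python
def verify_sub_job(name: str, checks: dict) -> dict:
    """Run named checks for a sub-job and return a structured report."""
    results = {label: bool(check()) for label, check in checks.items()}
    return {"sub_job": name, "passed": all(results.values()), "checks": results}

# Hypothetical report: the work is correct and complete, but it broke
# something downstream, and the report says exactly which check failed.
report = verify_sub_job(
    "stage_migration",
    {
        "correctness": lambda: True,     # e.g. schema matches the spec
        "completeness": lambda: True,    # e.g. all tables migrated
        "downstream_ok": lambda: False,  # e.g. a dependent view broke
    },
)
```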
7) Use Verification Signals to Iterate
Verification results enable systematic improvement across:
- planning prompts
- tool definitions
- tool instructions
- orchestration order
- tool count (merge or remove tools)
Iteration is driven by evidence, not intuition.
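Evidence-driven iteration starts with aggregating those per-run reports into failure counts per component. A sketch, assuming reports shaped like the structured records above:

```python
from collections import Counter

def failure_hotspots(reports: list) -> Counter:
    """Count failed checks per (sub_job, check) across many runs."""
    counts = Counter()
    for report in reports:
        for check, passed in report["checks"].items():
            if not passed:
                counts[(report["sub_job"], check)] += 1
    return counts
```

The most common failure tells you what to change first: a prompt, a tool definition, an ordering, or the tool set itself.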
8) Fine-Tune Only After the System Is Stable
Fine-tuning is appropriate when:
- the structure is sound
- errors are repeatable
- verification data is reliable
Fine-tuning reduces residual error; it does not fix poor system design.
Core Principle
Execution completes the job once.
Verification enables the system to do it better next time.
That feedback loop — from JTBD to tools to execution to verification and back — is what differentiates a robust agentic system from a brittle one.
Applied directly to Neon
1) Start with the Job to Be Done (JTBD)
Always begin by stating the outcome in plain language, focusing on what success must prove:
“Perform a safe, verifiable schema migration.”
This shapes all subsequent design decisions.
2) Collapse Low-Level APIs Into Task-Oriented Tools
Don’t give an LLM a catalog of generic endpoints. Instead, expose a small set of clearly named, job-centric tools.
Neon’s concrete example for schema migration uses just four tools:
| Tool | Purpose |
|---|---|
| `prepare_database_migration` | Stage schema change safely in a temp branch |
| `run_sql` / `run_sql_transaction` | Verify work on temp branch |
| `describe_table_schema` / `get_database_tables` | Inspect structure for correctness |
| `complete_database_migration` | Commit and clean up when checks pass |
This small surface (instead of 100+ generic API calls) dramatically increases accuracy by reducing choice overload.
Design principles for tools:
- Verb-first, goal-focused names (e.g., `prepare_…`, not `POST /v1/db/…`).
- Encoded guardrails (temporary branches, sandbox staging).
- Clear, outcome-oriented descriptions that implicitly teach workflow order.
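The naming principle is checkable. A small sketch: the tool names come from the Neon example above, while the allowed-verb list is an assumption for illustration:

```python
# Verbs we expect task-shaped tool names to start with (an assumption).
ACTION_VERBS = ("prepare", "run", "describe", "get", "complete")

# Registry of the Neon-style migration tools; descriptions paraphrase
# the table above and implicitly teach the workflow order.
TOOLS = {
    "prepare_database_migration": "Stage schema change safely in a temp branch",
    "run_sql": "Verify work on the temp branch before completing",
    "describe_table_schema": "Inspect structure for correctness",
    "complete_database_migration": "Commit and clean up when checks pass",
}

def is_verb_first(name: str) -> bool:
    """True if the tool name leads with an action verb."""
    return name.split("_", 1)[0] in ACTION_VERBS
```

An endpoint-shaped name like `v1_db_migrate` fails the check, which is exactly the point.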
3) Build Planning/Orchestration That Reflects the Job
A good planning prompt should map high-level intent to tool sequences with minimal ambiguity.
By keeping the toolkit small and intentional:
- the agent’s decision space shrinks,
- the correct sequence emerges naturally,
- “tool roulette” disappears.
This is the essence of Your API is not an MCP — building an MCP server that shapes the assistant’s interaction to the job, not the API surface.
4) Execute with Explicit Safety and Stage Gates
Design tools so that harmful actions cannot occur before checks are complete. For Neon’s migration flow, that means:
- stage changes in a throw-away environment,
- run verification queries,
- only complete the full migration after checks pass.
These built-in safety guardrails reduce silent failures and avoid catastrophic states.
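The stage gates can be modeled as a small state machine in which completion is unreachable until staging and verification have both happened. The method names mirror Neon's tools; the implementation is an illustrative sketch, not Neon's code:

```python
class MigrationFlow:
    """Stage-gated migration: initial -> staged -> verified -> committed."""

    def __init__(self):
        self.stage = "initial"

    def prepare_database_migration(self):
        self.stage = "staged"  # work happens in a throw-away branch

    def run_verification(self, checks_pass: bool):
        if self.stage != "staged":
            raise RuntimeError("nothing staged to verify")
        # Failed checks keep us in 'staged' so the fix can be re-verified.
        self.stage = "verified" if checks_pass else "staged"

    def complete_database_migration(self):
        if self.stage != "verified":
            raise RuntimeError("cannot complete: checks have not passed")
        self.stage = "committed"
```

Because the commit step inspects the state, no planning mistake can reach the destructive action early; the worst case is an error, not a bad migration.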
5) Verification Is the Essential Feedback Loop
Verification is why iterative improvement is possible:
- it yields concrete error signals,
- it converts execution outcomes into evidence,
- it gives you something to optimize against.
For schema migration, verification looks like:
- running validation SQL (`run_sql`),
- inspecting schema (`describe_table_schema`),
- checking counts/constraints before commit.
This makes success observable and failures inspectable, which in turn means:
- planning prompt tweaks can be measured,
- tool definitions can be refined,
- orchestration orders can be compared,
- training data for fine-tuning can be generated.
Without verification, you have no learning signal.
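Those checks are ordinary queries. A runnable sketch using in-memory SQLite as a stand-in for the Postgres temp branch that Neon's `run_sql` tool would query; the table and migration are invented for illustration:

```python
import sqlite3

# Stand-in for the temporary branch: a throw-away in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("a",), ("b",)])
rows_before = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# The staged migration: add a column.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

# Verification queries: the new column exists, and no rows were lost.
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Each query result is a concrete, comparable signal: if `rows_after != rows_before`, the run failed, and you know exactly which check caught it.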
6) Iterative Refinement Driven by Verification
Explicit verification generates the metrics that make iteration possible.
Use those signals to:
- merge tools used together frequently,
- remove rarely needed tools,
- clarify documentation so the agent learns faster,
- refine planning prompts to avoid predictable failure points.
Iterations shrink the decision space and improve accuracy by progressively removing ambiguity.
7) Fine-Tune Only After Structure Stabilizes
After a reliable baseline (small tool set + consistent verification) exists, fine-tuning models makes sense:
- specialize models for tool selection accuracy,
- reduce hallucination around tool usage,
- handle domain-specific language patterns,
- shorten planning loops and improve decisiveness.
But fine-tuning alone can’t fix poor tooling or underspecified JTBDs — the verification-driven structure does.
Core Insight
Execution completes the job once.
Verification connects one run to the next.
A small, task-focused toolset makes those links visible.
This is exactly the pattern Neon ships in their MCP workflow, and why the article emphasized:
- small, well-named tools,
- encoded guardrails,
- verifiable transitions,
- and job-level sequencing taught via documentation and labels.
Quick Practical Checklist
- State the JTBD with verifiable success criteria.
- Define ≤10 task-oriented tools that cover the job end-to-end.
- Label tools with verb-first, outcome-focused names.
- Build planning prompts that naturally select and sequence tools.
- Execute with built-in safeties (sandboxes, checks).
- Verify outcomes explicitly and early.
- Use verification signals to refine tools, prompts, and orchestration.
- Fine-tune only once the core system is stable.