Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.ascendgtm.net/llms.txt

Use this file to discover all available pages before exploring further.

Overview

The Quality Harness is V5’s self-improvement loop. It automatically samples tool call traces, evaluates them with three independent LLM judges, and surfaces regressions before they affect production. Agents can also send feedback directly via submit_feedback.

How grading works

Every tool call that passes the sampling threshold is evaluated by a tri-judge panel:
JudgeModelProvider
Alphaclaude-haiku-4-5-20251001Anthropic (via AI Gateway)
Betagpt-4o-miniOpenAI (via AI Gateway)
Gammagemini-1.5-flashGoogle (via AI Gateway)
All three judges run in parallel and evaluate the same trace against the same rubric. Their verdicts are stored in the grader_verdicts D1 table. Disagreements surface router errors — if judges disagree on the domain category, the categorical router has a bug.

Verdict schema

Each judge emits a structured verdict via tool-use (never free text):
{
  "reasoning": "The tool correctly retrieved pipeline data and formatted it...",
  "category": "crm_read",
  "quality": 4,
  "issues": [],
  "confidence": 0.92
}

Quality scale

ScoreMeaning
1Unusable / harmful — wrong data, safety issue, or destructive action
2Partially correct — user must manually repair the output
3Correct — minor polish needed
4Correct + clean — no issues

Domain categories

All 12 V5 domains that tool calls route through: crm_read · crm_write · ads_read · ads_mutate · analytics_report · content_generate · email_send · calendar_op · search_query · data_query · agent_orchestration · error_recovery

Issue taxonomy

A trace can have 0–9 issues flagged:
IssueDescription
incompletePartial/truncated result
hallucinationModel invented data not in upstream response
tool_misuseWrong tool or wrong arguments
missed_contextTenant/account/scope dropped or wrong
verboseUnnecessary preamble or commentary
wrong_domainCategory routed incorrectly (router error)
unsafe_actionMutation when read intended, or destructive without confirmation
format_violationOutput didn’t match declared schema or contract
regressionBehavior worse than prior known-good for same input class

The submit_feedback tool

Any connected AI tool (Claude, Cursor, ChatGPT, Codex, Hermes) can call submit_feedback when it encounters a blocker. This is the direct line from the wild back to the V5 roadmap.
{
  "tool": "submit_feedback",
  "input": {
    "attempted_action": "Retrieve HubSpot contacts created in the last 30 days",
    "expected_outcome": "Array of contact objects with email, name, and company",
    "actual_outcome": "Empty array returned even though contacts exist in HubSpot",
    "tool_called": "hubspot_crm",
    "error_seen": "status: 200, body: {contacts: []}",
    "severity": "med"
  }
}
Feedback is written to the agent_feedback D1 table. Mishaal triages weekly via GET /admin/feedback. Patterns become product features.

Severity levels

LevelWhen to use
lowMinor friction — workaround exists
medBlocked on this workflow, but other workflows fine
highV5 is unusable for the calling agent’s task

Risk gate

Traces that score quality ≤ 2 or flag unsafe_action trigger an alert to SLACK_WEBHOOK_URL. Three consecutive quality ≤ 2 verdicts on the same tool path open an automatic incident in the error_ledger.

Sampling rate

Not every tool call is graded — the harness samples based on:
  • Tool category (higher sampling for mutations: crm_write, ads_mutate, email_send)
  • Tenant tier (higher sampling for production tenants)
  • Recent quality trend (higher sampling after quality degradation)
The sampling configuration is in src/grader/sampling.ts. Adjust via GRADER_SAMPLE_RATE KV key (0.0–1.0, default 0.1 = 10%).

Viewing grader output

# Recent verdicts
curl https://ascend-gateway-v5.ascendgtm.workers.dev/admin/grader/verdicts?limit=20 \
  -H "Authorization: Bearer <admin-key>"

# Agent feedback inbox
curl https://ascend-gateway-v5.ascendgtm.workers.dev/admin/feedback \
  -H "Authorization: Bearer <admin-key>"