Starting a session
Send a POST to the Hermes chat endpoint with your message and the agent type you want to use:
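A minimal sketch of such a request in Python, using only the standard library. The endpoint URL and the bearer-token header are assumptions for illustration, since neither is specified here; only message and agent_type come from the parameter list on this page.

```python
import json
import urllib.request

# Hypothetical endpoint: substitute the real Hermes chat URL for your gateway.
HERMES_CHAT_URL = "https://gateway.example.com/hermes/chat"


def build_start_request(message: str, agent_type: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST that starts a new session."""
    payload = {"message": message, "agent_type": agent_type}
    return urllib.request.Request(
        HERMES_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption; use whatever scheme your gateway expects.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_start_request(
    "Which accounts grew fastest last quarter?", "gtm_analyst", "YOUR_API_KEY"
)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

No session_id is sent on the first turn; the one returned in the response is what you pass on later turns.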
Continuing a session
Include session_id to resume the conversation with full context:
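As a sketch, the follow-up payload is the same shape as the first request plus the session_id returned by the previous turn. The helper name and the sample values below are hypothetical.

```python
import json


def build_followup_payload(previous_response: dict, message: str, agent_type: str) -> dict:
    """Reuse the session_id from the previous turn so context carries over."""
    return {
        "message": message,
        "agent_type": agent_type,
        "session_id": previous_response["session_id"],
    }


# Illustrative first-turn response; the session_id value is made up.
first_turn = {"session_id": "sess_123", "response": "...", "status": "complete"}

followup = build_followup_payload(first_turn, "Break that down by region.", "gtm_analyst")
print(json.dumps(followup, indent=2))
```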
Session state
Hermes persists session context in the gateway’s KV store via agent_state:
- Sessions are scoped to {tenant}:{session_id}, so they are fully isolated across tenants
- Session state TTL defaults to 24 hours (configurable up to 720 hours / 30 days)
- Tool call results within a session are included in the agent’s context window
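The scoping and TTL rules above can be illustrated with a short sketch. This is not the gateway's implementation: the function names are hypothetical, and the one-hour lower bound is an assumption, since only the default and the maximum are documented.

```python
from typing import Optional

DEFAULT_TTL_HOURS = 24   # documented default
MAX_TTL_HOURS = 720      # documented maximum (30 days)
MIN_TTL_HOURS = 1        # assumption: no minimum is documented


def session_key(tenant: str, session_id: str) -> str:
    """Compose the {tenant}:{session_id} scope used to isolate sessions."""
    return f"{tenant}:{session_id}"


def resolve_ttl_seconds(ttl_hours: Optional[int] = None) -> int:
    """Return the TTL in seconds, falling back to the 24-hour default."""
    hours = DEFAULT_TTL_HOURS if ttl_hours is None else ttl_hours
    if not MIN_TTL_HOURS <= hours <= MAX_TTL_HOURS:
        raise ValueError(f"ttl_hours must be between {MIN_TTL_HOURS} and {MAX_TTL_HOURS}")
    return hours * 3600


# The same session_id under two tenants yields two distinct keys.
print(session_key("acme", "sess_123"))
print(session_key("globex", "sess_123"))
```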
Request parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| message | string | Yes | The user message / instruction |
| agent_type | string | Yes | Agent config to use (e.g. gtm_analyst) |
| session_id | string | No | Resume an existing session |
| context | object | No | Additional context injected into the prompt |
| max_iterations | integer | No | Override the config’s max_iterations (1–20) |
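One way to apply the table client-side is a small payload builder that omits unset optional fields and rejects an out-of-range max_iterations before the request is sent. The helper is a hypothetical sketch; whether the server rejects or clamps invalid values is not stated here.

```python
from typing import Any, Dict, Optional


def build_payload(
    message: str,
    agent_type: str,
    session_id: Optional[str] = None,
    context: Optional[Dict[str, Any]] = None,
    max_iterations: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a request body from the documented parameters."""
    if not message:
        raise ValueError("message is required")
    if not agent_type:
        raise ValueError("agent_type is required")

    payload: Dict[str, Any] = {"message": message, "agent_type": agent_type}
    if session_id is not None:
        payload["session_id"] = session_id
    if context is not None:
        payload["context"] = context
    if max_iterations is not None:
        # Documented range for the override is 1 to 20.
        if not 1 <= max_iterations <= 20:
            raise ValueError("max_iterations must be between 1 and 20")
        payload["max_iterations"] = max_iterations
    return payload


# The context keys here are illustrative only.
payload = build_payload(
    "Summarize pipeline health.",
    "gtm_analyst",
    context={"region": "EMEA"},
    max_iterations=3,
)
```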
Response fields
| Field | Description |
|---|---|
| session_id | Session identifier for follow-up turns |
| response | Agent’s final text response |
| tool_calls | Array of {tool, status, duration_ms} for observability |
| iterations | How many reasoning loops the agent ran |
| status | complete, max_iterations_reached, or error |
| usage | Token counts {prompt, completion, total} |
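A sketch of reading these fields into typed objects. The class names, the tool name, and all sample values are invented for illustration; only the field names come from the table above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToolCall:
    tool: str
    status: str
    duration_ms: int
    error_code: Optional[str] = None  # only present on failed calls


@dataclass
class ChatResult:
    session_id: str
    response: str
    tool_calls: List[ToolCall]
    iterations: int
    status: str
    usage: Dict[str, int]


def parse_chat_result(body: dict) -> ChatResult:
    """Map a decoded JSON response body onto the documented fields."""
    calls = [
        ToolCall(tc["tool"], tc["status"], tc["duration_ms"], tc.get("error_code"))
        for tc in body.get("tool_calls", [])
    ]
    return ChatResult(
        session_id=body["session_id"],
        response=body["response"],
        tool_calls=calls,
        iterations=body["iterations"],
        status=body["status"],
        usage=body["usage"],
    )


# Made-up sample body with the documented shape.
sample = {
    "session_id": "sess_123",
    "response": "Three accounts grew more than 20%.",
    "tool_calls": [{"tool": "crm_lookup", "status": "ok", "duration_ms": 412}],
    "iterations": 2,
    "status": "complete",
    "usage": {"prompt": 1200, "completion": 300, "total": 1500},
}
result = parse_chat_result(sample)
```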
Error handling
If the agent exceeds max_iterations, it returns with status: "max_iterations_reached" and whatever partial response it assembled. This is not an error; it means the task was too complex for the configured limit.
For quota or upstream API errors, the tool call that failed is listed in tool_calls with status: "error" and an error_code. The agent continues with other tools.
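A client can treat these two cases separately, as in the sketch below: a partial result is still usable, and failed tool calls are collected for logging rather than aborting. The helper name, the tool names, and the error_code value are hypothetical.

```python
from typing import Any, Dict


def summarize_outcome(body: Dict[str, Any]) -> Dict[str, Any]:
    """Classify a response: usable or not, partial or not, and which tools failed."""
    status = body.get("status")
    failed = [
        (tc["tool"], tc.get("error_code"))
        for tc in body.get("tool_calls", [])
        if tc.get("status") == "error"
    ]
    return {
        # Hitting the iteration limit still yields a (partial) response.
        "usable": status in ("complete", "max_iterations_reached"),
        "partial": status == "max_iterations_reached",
        "failed_tools": failed,
    }


# Made-up response: one tool failed, and the agent ran out of iterations.
body = {
    "status": "max_iterations_reached",
    "response": "Partial findings so far...",
    "tool_calls": [
        {"tool": "crm_lookup", "status": "ok", "duration_ms": 380},
        {"tool": "enrichment_api", "status": "error", "duration_ms": 95,
         "error_code": "quota_exceeded"},
    ],
}
outcome = summarize_outcome(body)
```

If partial is true, retrying in the same session with a narrower question or a higher max_iterations is a reasonable next step.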
Best practices
- Start with focused questions: narrow scope = fewer tool calls = faster responses
- Use sessions for workflows: chain analysis → write → send in one session rather than three separate calls
- Pin agent types: use gtm_analyst for data questions, outreach_writer for content; don’t route everything through the orchestrator
- Set explicit max_iterations: for quick lookups, max_iterations: 3; for deep analysis, let it run to the config default