Seekmodo developer docs
Reference for the REST shim, MCP JSON-RPC surface, and storefront connectors. Authenticate with HMAC; replay window is 5 minutes.
<seekmodo-chat> — conversational shopper
A drop-in chat widget that runs an LLM-driven tool loop on the gateway side. Each turn issues zero, one, or many sub-tool calls (search, suggest, recommend.*), and each call is metered the same way as a search a shopper runs from the SERP search box. The merchant gets one bill whether the shopper typed into the search box or asked the chat.
Overage behaviour
When a chat turn would tip the tenant past the searches bucket without prepaid credits, the gateway runs the turn to completion, logs it to numinix_telemetry_chat_suppressed, and returns meta.suppressed=true in the response envelope. The widget swaps the assistant message for the "upgrade to keep chat" affordance — no half-finished answers, no mid-turn cut-off.
Why log-then-suppress instead of decline-up-front? The agent doesn't know how many tool calls it will need before it starts; a pre-flight check would either under-bill (some calls land for free) or over-block (refuses turns that would have fit). Pay the LLM cost once, charge the merchant correctly, and let the widget render an upgrade nudge that converts.
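The trade-off can be illustrated with a small sketch. This is not the gateway's implementation: the function names, the `used`, `ceiling`, and `prepaidCredits` fields, and the rule that any positive credit balance lets a turn through are all assumptions made for illustration. The turn cost follows the metering rule described under "How chat-shopper meters" (1 for the outer chat call plus 1 per sub-tool call).

```javascript
// Hypothetical sketch of log-then-suppress; names and shapes are assumptions.

// Cost of one chat turn against the master `searches` bucket:
// 1 for the outer chat call plus 1 per sub-tool call.
function turnCost(subToolCalls) {
  return 1 + subToolCalls;
}

// Decide after the turn has run, when the real number of sub-tool calls is known.
// Assumption: any positive prepaid credit balance lets the turn through.
function settleTurn(tenant, subToolCalls) {
  const cost = turnCost(subToolCalls);
  const overCeiling = tenant.used + cost > tenant.ceiling;
  const suppressed = overCeiling && tenant.prepaidCredits <= 0;
  return { cost, suppressed };
}

// A pre-flight check has to guess the fan-out before the agent starts.
function preflightWouldBlock(tenant, guessedSubToolCalls) {
  return tenant.used + turnCost(guessedSubToolCalls) > tenant.ceiling;
}

const tenant = { used: 97, ceiling: 100, prepaidCredits: 0 };

// Guessing the worst case (20 sub-tool calls) blocks a turn that actually fits.
console.log(preflightWouldBlock(tenant, 20)); // true
console.log(settleTurn(tenant, 2));           // { cost: 3, suppressed: false }

// A turn that really overshoots is run to completion, then suppressed.
console.log(settleTurn(tenant, 5));           // { cost: 6, suppressed: true }
```

Guessing low instead of high would flip the failure mode: the pre-flight check passes and the extra calls land unbilled, which is the under-billing case described above.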
Embed
After installing @seekmodo/web-components and wiring the meta tags:
<seekmodo-chat
  intro="Hi! Tell me what you're looking for."
  max-turns="6"
  auto-citations
>
  <!-- Optional: override the default upgrade affordance -->
  <a slot="upgrade" href="/account/billing?upgrade=chat">
    Need more chat? Upgrade your plan →
  </a>
</seekmodo-chat>
API
If you'd rather call the chat tool directly from the SDK (e.g. for a server-rendered transcript or a non-DOM client):
const res = await client.chat({
  messages: [
    { role: "user", content: "Got any 4-person tents under $300?" },
  ],
});
if (res.meta?.suppressed) {
  // The turn was suppressed for overage: show the upgrade affordance instead.
  renderUpgradeCta(res.meta.suppression_reason);
} else {
  renderAssistantMessage(res.message, res.citations);
}
How chat-shopper meters
The chat tool maps to two buckets at metering time. Both already existed in the plan from Sprint 1 — this sprint just wires them up:
- searches (master bucket): counts against your monthly plan ceiling. The outer chat call charges 1; each sub-tool call charges 1 more.
- searches_chat (display bucket): surfaced on the /usage dashboard so the merchant can see chat fan-out separately from SERP loads. Charges 1 per sub-tool call (the outer chat call is counted on the master bucket only; display would double-count).
Worked example — 3-turn session, 5 sub-tool calls
A live shopper has the conversation below. Counters reset monthly; we'll assume an empty period for the math.
| Turn | Shopper says | Sub-tool calls | Σ searches | Σ searches_chat |
|---|---|---|---|---|
| 1 | “Got any 4-person tents under $300?” | 2 | 3 | 2 |
| 2 | “Show me the lightest one.” | 1 | 5 | 3 |
| 3 | “What about a footprint or groundsheet for it?” | 2 | 8 | 5 |
After this session the merchant's /usage dashboard shows 8 searches consumed against the plan, of which 5 are tagged with the chat-shopper display bucket. The remaining 3 come from the outer chat-tool entry points (one per turn). The session's fan-out factor — 5 sub-tool calls / 3 shopper turns ≈ 1.67× — is recorded per-turn in numinix_telemetry_chat_turns so the operator can plot it over time.
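The arithmetic in the worked example can be reproduced with a short tally. This is an illustrative sketch, not the gateway's metering code; `tallySession` is a made-up name, and the per-turn counts of 2, 1, and 2 sub-tool calls are derived from the cumulative counters in the table.

```javascript
// Illustrative tally of the two buckets; not the gateway's metering code.
// Input: the number of sub-tool calls made in each shopper turn.
function tallySession(subToolCallsPerTurn) {
  let searches = 0;     // master bucket
  let searchesChat = 0; // display bucket
  const rows = [];
  subToolCallsPerTurn.forEach((calls, i) => {
    searches += 1 + calls; // outer chat call + sub-tool calls
    searchesChat += calls; // sub-tool calls only
    rows.push({ turn: i + 1, calls, searches, searchesChat });
  });
  const turns = subToolCallsPerTurn.length;
  // Fan-out: sub-tool calls per shopper turn (0 for an empty session).
  const fanOut = turns === 0 ? 0 : searchesChat / turns;
  return { rows, searches, searchesChat, fanOut };
}

const session = tallySession([2, 1, 2]);
console.log(session.searches, session.searchesChat); // 8 5
console.log(session.fanOut.toFixed(2));              // 1.67
```

The difference between the two totals (8 - 5 = 3) is the number of outer chat calls, one per turn.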
Defence-in-depth ceilings
- 20 sub-tool calls per turn: hard cap in ChatTool::MAX_TOOL_CALLS_PER_TURN. Catches runaway agents before they cost the gateway compute.
- 30 messages per conversation: hard cap in ChatTool::MAX_MESSAGES. Drops the oldest history beyond that.
- Per-tenant LLM spend ceiling: see SpendCeiling. When tripped, the chat tool returns a graceful "search assistant temporarily unavailable" fallback instead of charging the merchant for the unproductive turn.
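A non-DOM client that keeps its own transcript may want to anticipate the first two caps. The sketch below mirrors them on the client side. The two constant values come from the list above, but the function names are hypothetical and the code is not taken from ChatTool.

```javascript
// Hypothetical client-side mirror of the two hard caps.
// Values are from the list above; function names are made up for this sketch.
const MAX_TOOL_CALLS_PER_TURN = 20;
const MAX_MESSAGES = 30;

// Keep only the newest MAX_MESSAGES entries, dropping the oldest history.
function trimHistory(messages) {
  return messages.length > MAX_MESSAGES ? messages.slice(-MAX_MESSAGES) : messages;
}

// True while another sub-tool call would still fit in the current turn.
function mayCallTool(callsSoFar) {
  return callsSoFar < MAX_TOOL_CALLS_PER_TURN;
}

const history = Array.from({ length: 34 }, (_, i) => ({ role: "user", content: `msg ${i}` }));
const trimmed = trimHistory(history);
console.log(trimmed.length);                   // 30
console.log(trimmed[0].content);               // msg 4
console.log(mayCallTool(19), mayCallTool(20)); // true false
```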
Plan-tier eligibility
Chat shopper requires a Growth plan or higher. Hobby and Starter tenants get an HTTP 402 with error: feature_not_in_plan. On those plans the widget still mounts but renders nothing until the tenant upgrades.
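A direct API caller may want to tell the plan-tier refusal apart from an overage suppression, since the first arrives as an HTTP 402 and the second as a successful response with meta.suppressed set. The helper below is a sketch under assumptions: `classifyChatResponse` and its return labels are hypothetical, and the `{ error: "feature_not_in_plan" }` body shape is inferred from the description above rather than from a documented schema.

```javascript
// Hypothetical helper; the body shapes are assumptions based on the text above.
function classifyChatResponse(status, body) {
  // Plan-tier refusal: Hobby and Starter tenants.
  if (status === 402 && body && body.error === "feature_not_in_plan") {
    return "plan_upgrade_required";
  }
  // Successful turn, possibly suppressed for overage.
  if (status >= 200 && status < 300) {
    return body && body.meta && body.meta.suppressed ? "suppressed" : "ok";
  }
  return "error";
}

console.log(classifyChatResponse(402, { error: "feature_not_in_plan" })); // plan_upgrade_required
console.log(classifyChatResponse(200, { meta: { suppressed: true } }));   // suppressed
console.log(classifyChatResponse(200, { message: "..." }));               // ok
```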