Prompt Pipeline

The prompt rewrite described in REWRITE-TASK.md is now the live runtime architecture.

Synthetic Heart no longer treats prompt assembly as a single pretty-printed JSON blob that every engine must parse as text. The canonical pipeline is now:

core.plugin_instance calls core.prompt_engine.build_prompt_request().
The prompt builder assembles a typed PromptRequest plus compatibility data for legacy callers.
Engines render that typed request into their native transport format with a renderer from core.prompt_renderers.
The model returns either plain text or native tool/function calls, which are normalized back into SyntH’s {"actions": [...]} format.

Compatibility status

build_prompt_request() is the canonical prompt builder.
build_json_prompt() still exists as a deprecated alias for backward compatibility.
The legacy dict payload is still returned today, but it now carries the typed request under __prompt_request so migrated engines can use the new path immediately.
Engines may also accept a PromptRequest object directly.

PromptRequest

core.prompt_request.PromptRequest is the engine-agnostic intermediate representation. It splits prompt state by stability so renderers can preserve conversation structure and enable prompt caching where the provider supports it.

Stable fields:

system_instruction: persona, safety rules, and high-level response rules.
tool_declarations: tool manifests derived from the actions registry.

Moderately stable field:

context_summary: diary, memories, cross-chat recaps, and participant summaries formatted as plain text.

Dynamic fields:

conversation_history: parsed user / assistant turns for the active chat.
current_text: the current user turn.
runtime_ctx: timestamp, scope, language, tone, emotions, interface data, and grillo flags.
attachments: multimodal payload metadata for the current turn.
reply_to: optional reply metadata.

Modes

The rewrite is not chat-only. The builder now produces different PromptRequest.mode values for different runtime surfaces.

chat: Standard message processing with full conversation history and context.
grillo: Internal autonomous beats. These omit normal conversation history and use a minimal context summary.
delivery: Auto-response delivery prompts created by core.prompt_engine.build_delivery_request(). These contain persona, delivery instructions, action outputs, and only message_* tools.
live: Live voice prompts created by core.prompt_engine.build_live_prompt_request() and rendered to one flat instruction string for live sessions.

Renderers

core.prompt_renderers contains the provider-specific renderers.

OpenAIRenderer: Baseline renderer for OpenAI-compatible chat-completions APIs. Produces messages arrays and optional tool schemas.
AnthropicRenderer: Produces Anthropic Messages payloads. The stable system block is emitted with cache_control when ENABLE_PROMPT_CACHING is enabled.
GeminiRenderer: Produces Gemini-native system_instruction_text + contents payloads and Gemini function declarations.
TextRenderer: Compact fallback for engines that cannot consume structured conversation turns. This is still smaller than the old indented JSON blob path.
LiveRenderer: Flattens PromptRequest(mode='live') into the plain-text instruction used by live voice callers.

Where the new path is used

The rewrite is already active in the main engine families:

engines/external_engines/openapi.py uses OpenAIRenderer.
engines/external_engines/openrouter.py uses OpenAIRenderer.
engines/external_engines/anthropic.py uses AnthropicRenderer.
engines/external_engines/gemini_api.py uses GeminiRenderer.
core/external_endpoints/bridges/cortex_bridge.py uses OpenAIRenderer for external endpoint Cortex engines.
core.auto_response.AutoResponseSystem attaches PromptRequest(mode='delivery') to action-result deliveries.
core.prompt_engine.build_live_system_instruction() now renders a PromptRequest(mode='live') through LiveRenderer.

Multimodal handling

Attachments are no longer documented as a giant base64 blob living inside a: single user message by default. The typed pipeline keeps attachment metadata in PromptRequest.attachments and migrated engines extract native multimodal parts before sending the request to the provider.
This keeps text prompts smaller and avoids duplicating heavy binary payloads in: both the text and multimodal layers.

Operational notes

instructions_verbose and instructions still exist in the compatibility dict for legacy callers, but renderer-backed engines treat PromptRequest.system_instruction as the canonical system prompt.
system_message payloads are still used for correction and delivery flows. They coexist with __prompt_request during the transition.
Debugging may still show compatibility prompt dicts in logs, but the runtime decision point for modern engines is the typed prompt object.