Cortex
The Cortex is the canonical runtime-engine abstraction in Synthetic Heart. A
cortex represents a reasoning or execution backend and has a kind (for
example: llm, live, agent). Cortex engines are discovered and
registered at startup and can be switched at runtime.
Quick summary (for users)
What is a Cortex? A pluggable runtime engine (text model, live/streaming model, or agentic runtime) that produces responses and/or executes actions.
How to switch? Use the Web UI Components page or the /cortex command.
Common kinds: llm (chat-style models), live (low-latency audio/video sessions), agent (action-capable runtimes), and legacy Selenium-based drivers.
User-facing notes
Cortex engines are first-class: when you read “engine” in the docs, think “Cortex”.
Use the Components page in the Web UI to enable or disable a Cortex engine, log in to it, or configure it.
Switching and configuration
Switch at runtime with the command:
/cortex <engine_name> (/llm is a deprecated alias).
Each Cortex exposes its configuration fields (API keys, model selectors, limits) in the Components view so administrators can update settings without editing code.
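As a rough illustration of the switching behaviour described above, the sketch below models a registry of engines and a handler for /cortex, with /llm accepted as a deprecated alias. The names CortexRegistry and handle_command are hypothetical and are not taken from the Synthetic Heart codebase; the kind assigned to selenium_chatgpt is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CortexRegistry:
    """Hypothetical registry mapping engine names to their kind."""
    engines: Dict[str, str] = field(default_factory=dict)
    active: Optional[str] = None

    def register(self, name: str, kind: str) -> None:
        # In the real system, engines are discovered and registered at startup.
        self.engines[name] = kind

    def switch(self, name: str) -> str:
        # Refuse unknown names and list what is available instead.
        if name not in self.engines:
            known = ", ".join(sorted(self.engines))
            return f"Unknown cortex '{name}'. Available: {known}"
        self.active = name
        return f"Active cortex: {name} ({self.engines[name]})"


def handle_command(registry: CortexRegistry, text: str) -> str:
    """Parse '/cortex <engine_name>' (or the deprecated '/llm' alias)."""
    parts = text.strip().split()
    if not parts or parts[0] not in ("/cortex", "/llm"):
        return "Not a cortex command"
    if len(parts) != 2:
        return "Usage: /cortex <engine_name>"
    reply = registry.switch(parts[1])
    if parts[0] == "/llm":
        reply += " (note: /llm is deprecated, use /cortex)"
    return reply


if __name__ == "__main__":
    registry = CortexRegistry()
    registry.register("selenium_chatgpt", "llm")  # kind is assumed for the demo
    print(handle_command(registry, "/cortex selenium_chatgpt"))
    print(handle_command(registry, "/llm selenium_chatgpt"))
```

Keeping the alias in the same handler means both commands share one code path, and the deprecation notice is the only difference a user sees.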
Available Cortexes (user guide)
This section explains the Cortex types you can use and how to configure them from the Web UI (Components) — developer details belong in the Developer Guide (linked below).
Selenium (browser-driven) — what & how
What it is: a browser-automation connector that drives web UIs (ChatGPT, Gemini, Grok) when an official API is not used.
When to use: quick way to use a personal/web account or to access UI-only features.
Configure (user steps):
Open Web UI → Components and find the Selenium engine (examples: selenium_chatgpt, selenium_gemini, selenium_grok).
Click Enable, then click Login and follow the displayed URL (the page shows a Selkies login URL such as https://{host}:{port}).
Complete the interactive browser login; the component will show Logged when done.
Do I need to log in? Yes. Selenium cortexes require an interactive login (no API key).
Common settings: model selectors like CHATGPT_MODEL or GEMINI_MODEL can be edited in the component settings.
Gemini API (official API)
What it is: the API-backed Cortex that talks to Google’s GenAI/Gemini.
Configure (user steps):
Obtain an API key from Google Cloud Console (enable the GenAI API or “Generative AI” product, then create an API key or service account).
In Web UI → Components, set GEMINI_API_KEY (and, optionally, GEMINI_MODEL).
The engine will report Loaded once the key is valid.
Do I need API keys? Yes — Gemini API requires a Google API key (billing may be required on your Google project).
Where to get the key: Google Cloud Console → APIs & Services → Credentials (create API key or service account key; enable GenAI API; ensure billing).
Note
If you want Gemini API to participate in the Auris or Live subsystems, add it through the External Endpoints UI and enable the corresponding subsystem mapping. It will not be exposed automatically as an Auris provider unless configured explicitly.
Live (Gemini Live — real-time voice)
What it is: low-latency audio sessions (used for Discord voice and other real-time integrations). See the full guide: Gemini Live API — SyntH Discord Voice Integration.
How to enable & use (user steps):
Configure GEMINI_API_KEY (Gemini Live requires it).
If you want Discord voice, also set DISCORD_BOT_TOKEN and invite the bot to your server/voice channel.
Enable the Gemini API cortex and the Discord interface in Web UI → Components.
Invite the bot to voice — the persona may start a Live session automatically, or you can ask it to join voice during chat.
Do I need API keys? Yes: GEMINI_API_KEY (and a Discord token for Discord voice). See the Live integration page for audio/config details.
Note
Gemini Live support must be registered through the External Endpoints
workflow and mapped to the live subsystem before it is available
in the Components page.
Quick troubleshooting: make sure google-genai is installed, ffmpeg is available, and the Components page shows the Live/session manager as available.
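The two local prerequisites from the troubleshooting note can be checked from a Python shell on the host. This is a minimal sketch using only the standard library; the function name live_preflight is illustrative, and the check does not cover the Components page status, which still has to be confirmed in the Web UI.

```python
import shutil
from importlib import metadata


def live_preflight() -> dict:
    """Return a name -> bool map for the local Live prerequisites."""
    try:
        # Looks up the installed distribution by its package name.
        metadata.version("google-genai")
        genai_ok = True
    except metadata.PackageNotFoundError:
        genai_ok = False
    return {
        "google-genai": genai_ok,
        # shutil.which returns None when the binary is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in live_preflight().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it inside the same environment (virtualenv or container) as the server, since a package installed elsewhere would not be visible to the Live session manager.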
Agent (action-capable runtimes)
What it is: Cortexes that can execute actions or tools on your behalf. Normal users only need to use actions — administration and permissions are handled through the Web UI and server configuration.
Quick FAQ
Where do I configure a Cortex? Web UI → Components (recommended for non-developers). You can also set env vars in your deployment (.env-dev) for advanced setups.
Do I need API keys? It depends on the Cortex: Gemini API and Live need GEMINI_API_KEY; Selenium needs an interactive login (no API key).
How do I verify it works? Components shows the status (Loaded / Logged); ask the persona a question or run the Quickstart examples.
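For deployments configured through environment variables, the key requirements listed in this guide can be checked before startup. The sketch below is an assumption-laden illustration: the variable names come from this page, but the dictionary labels (gemini_api, gemini_live, and so on) and the helper missing_env are hypothetical and do not correspond to engine names in the codebase.

```python
import os
from typing import List, Mapping, Optional

# Requirements as stated in this guide. The keys are illustrative labels only.
REQUIRED_ENV = {
    "gemini_api": ["GEMINI_API_KEY"],
    "gemini_live": ["GEMINI_API_KEY"],
    "gemini_live_discord": ["GEMINI_API_KEY", "DISCORD_BOT_TOKEN"],
    "selenium": [],  # interactive login in the browser, no API key
}


def missing_env(cortex: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty for a Cortex."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[cortex] if not env.get(name)]


if __name__ == "__main__":
    for label in REQUIRED_ENV:
        missing = missing_env(label)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```

A passing check only shows that the variables are present; whether a key is valid is reported by the component status (Loaded) in the Web UI.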
See also
Quickstart — quick start and Components overview
Gemini Live API — SyntH Discord Voice Integration — Live (real‑time) integration
WebUI configuration control types — how settings appear in the Web UI
Developer guide (components & Cortex internals): /component_development_guide