Overview

The voice server is started with moorcheh-edge voice serve. It runs on Linux edge hardware (for example, Arduino UNO Q) and exposes mic, speaker, and RAG endpoints over HTTP. It runs outside the Moorcheh Edge Docker container (port 8080) and proxies RAG calls to Moorcheh Edge on the same device (default http://127.0.0.1:8080).
Default URL: http://<device-ip>:8766
Start command: moorcheh-edge voice serve --port 8766
Platform: Linux with ALSA (not Windows/macOS)
Run moorcheh-edge voice setup once before starting the server.

GET /health

Check that the voice server is running.
curl -X GET "http://192.168.1.50:8766/health"
{
  "status": "ok",
  "service": "moorcheh-edge-voice"
}
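
A client may want to confirm the server is reachable before recording or speaking. The sketch below is one way to do that from Python with only the standard library; the function names and the 3-second timeout are illustrative assumptions, not part of the voice server.

```python
import json
import urllib.request

EXPECTED_SERVICE = "moorcheh-edge-voice"  # value shown in the documented response


def parse_health(body: bytes) -> bool:
    """Return True when the body matches the documented healthy response."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return (
        isinstance(data, dict)
        and data.get("status") == "ok"
        and data.get("service") == EXPECTED_SERVICE
    )


def check_health(base_url: str, timeout: float = 3.0) -> bool:
    """GET /health and report whether the server answered as documented.

    The timeout value is an assumption chosen for this example.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and parse_health(resp.read())
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False
```

For example, check_health("http://192.168.1.50:8766") returns False when the device is unreachable and does not raise.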

POST /listen

Record from the device mic and return transcribed text.
seconds (number): Fixed recording length in seconds. When set, disables silence detection.
until_silence (boolean, default: true): When true (and seconds is omitted), stop recording after a pause in speech.
max_seconds (number, default: 30): Maximum recording length when using silence detection (3–60).
curl -X POST "http://192.168.1.50:8766/listen" \
  -H "Content-Type: application/json" \
  -d '{"until_silence": true, "max_seconds": 30}'
{
  "heard": "Do you have oat milk?"
}
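
The interaction between seconds, until_silence, and max_seconds can be checked on the client before the request is sent. The following is a minimal sketch of such a payload builder; build_listen_payload is a hypothetical helper, and rejecting non-positive seconds is an assumption of this example (the reference only states the 3–60 range for max_seconds).

```python
from typing import Optional


def build_listen_payload(
    seconds: Optional[float] = None,
    until_silence: bool = True,
    max_seconds: float = 30,
) -> dict:
    """Build a JSON body for POST /listen from the documented parameters."""
    if seconds is not None:
        # Assumption: a fixed recording length must be positive.
        if seconds <= 0:
            raise ValueError("seconds must be positive")
        # A fixed length disables silence detection, so send only `seconds`.
        return {"seconds": seconds}
    # Documented range for max_seconds when silence detection is used.
    if not 3 <= max_seconds <= 60:
        raise ValueError("max_seconds must be between 3 and 60")
    return {"until_silence": until_silence, "max_seconds": max_seconds}
```

Calling build_listen_payload() with no arguments produces the same body as the curl example above.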

POST /speak

Synthesize and play text on the device speaker.
text (string, required): Text to speak.
curl -X POST "http://192.168.1.50:8766/speak" \
  -H "Content-Type: application/json" \
  -d '{"text": "Welcome to The Brew Corner."}'
{
  "spoke": true
}
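
The same call can be made from Python with the standard library. This sketch separates building the request from sending it so the request can be inspected without a device; build_speak_request and speak are hypothetical names, the 30-second timeout is an assumption, and the empty-text check is a client-side convenience.

```python
import json
import urllib.request


def build_speak_request(base_url: str, text: str) -> urllib.request.Request:
    """Build the POST /speak request with a JSON body."""
    if not text.strip():
        # `text` is documented as required; reject empty input early.
        raise ValueError("text is required")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/speak",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def speak(base_url: str, text: str, timeout: float = 30.0) -> bool:
    """Send the request and return the documented `spoke` flag."""
    request = build_speak_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return bool(json.loads(resp.read()).get("spoke"))
```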

POST /ask/stream

Stream a RAG answer as Server-Sent Events. Embeds the query on the edge device when query_vector is omitted; otherwise uses the vector you supply. Proxies POST /answer/stream on Moorcheh Edge. When speak: true, plays TTS sentence-by-sentence on the device speaker.
query (string, required): Question text.
query_vector (array): Optional precomputed embedding. When omitted, the server embeds locally with BGE (768-dim).
top_k (number, default: 5): Passages to retrieve for context.
kiosk_mode (boolean, default: true): When true, filters passages below threshold.
threshold (number, default: 0.25): Minimum search score when kiosk_mode is true.
header_prompt (string): Optional system instruction for RAG. Optional instruction before the question.
chat_history (array): Prior turns: [{"role": "user"|"assistant", "content": "..."}].
speak (boolean, default: false): When true, enqueue sentence TTS on the device during the stream.
holding_enabled (boolean, default: true): When speak is true, play the cached kiosk holding welcome audio in parallel with RAG (requires voice cache-holding).
curl -N -X POST "http://192.168.1.50:8766/ask/stream" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Do you have oat milk?",
    "top_k": 2,
    "kiosk_mode": true,
    "threshold": 0.3,
    "speak": true
  }'
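
To illustrate how kiosk_mode and threshold relate, the sketch below applies the described rule to a list of scored passages. It is not the server's implementation: the passage shape ({"text", "score"}) is hypothetical, and treating the threshold as inclusive is an assumption based on the phrase "minimum search score".

```python
from typing import Dict, List


def filter_passages(
    passages: List[Dict],
    kiosk_mode: bool = True,
    threshold: float = 0.25,
) -> List[Dict]:
    """Drop passages scoring below `threshold` when kiosk_mode is on."""
    if not kiosk_mode:
        # With kiosk_mode off, the threshold is not applied.
        return list(passages)
    # Assumption: a passage scoring exactly at the threshold is kept.
    return [p for p in passages if p["score"] >= threshold]
```

With the request above (threshold 0.3), a passage scoring 0.28 would be excluded from the context under this reading.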

SSE events

Inherits Moorcheh Edge events from Answer stream:
| Event | Description |
| --- | --- |
| meta | Model, sources, and context count |
| token | Answer text delta ({"delta": "..."}) |
| done | Full answer and metadata |
| error | LLM or upstream failure |
Additional events when speak: true:
| Event | Description |
| --- | --- |
| holding | Kiosk welcome audio started ({"text": "...", "playing": true}) |
| sentence | A completed sentence queued for TTS ({"text": "..."}) |
TTS continues in the background after the HTTP stream closes; the connection does not wait for playback to finish.
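
A client consuming this stream needs to split it into named events and join the token deltas. The sketch below assumes the standard Server-Sent Events framing (an event: line, one or more data: lines carrying a JSON object, then a blank line); the exact wire format is not shown in this reference, and both function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the pending event, if it carried data.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip(" "))


def collect_answer(events: Iterable[Tuple[str, Dict]]) -> Tuple[str, List[str]]:
    """Join token deltas into the answer and list sentences queued for TTS."""
    parts: List[str] = []
    sentences: List[str] = []
    for name, payload in events:
        if name == "token":
            parts.append(payload.get("delta", ""))
        elif name == "sentence":
            sentences.append(payload.get("text", ""))
        elif name == "error":
            raise RuntimeError(f"stream failed: {payload}")
    return "".join(parts), sentences
```

The parser takes any iterable of decoded lines, so it can be fed from an HTTP response read line by line or from a recorded stream in a test.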

POST /ask and POST /ask/voice

Full voice loop: record from mic (unless query is provided), embed locally, call POST /answer on Moorcheh Edge, and speak the reply on the device. /ask/voice is an alias for /ask.
query (string): When set, skip mic capture and use this text as the question.
seconds (number): Fixed recording length when query is omitted.
until_silence (boolean, default: true): Stop recording after a pause when seconds is omitted.
max_seconds (number, default: 30): Maximum recording length for silence detection.
top_k (number, default: 5): Passages to retrieve for context.
kiosk_mode (boolean, default: true): Filter low-scoring passages when true.
threshold (number, default: 0.25): Minimum score when kiosk_mode is true.
header_prompt (string): Optional system instruction. Optional instruction before the question.
chat_history (array): Prior conversation turns.
speak (boolean, default: true): When true, play the answer on the device speaker.
curl -X POST "http://192.168.1.50:8766/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are your hours?",
    "top_k": 2,
    "kiosk_mode": true,
    "threshold": 0.3,
    "speak": true
  }'
{
  "heard": "What are your hours?",
  "query": "What are your hours?",
  "answer": "We are open Monday through Friday, 7am to 6pm.",
  "model": "llama3.2:1b-instruct-q4_K_M",
  "context_count": 1,
  "spoke": true
}
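
For multi-turn conversations, the query and answer fields of each response can be carried forward as chat_history on the next request. The sketch below shows one way to do that; append_turn and build_followup are hypothetical helpers, and the turn shape ({"role", "content"}) is assumed to match the one documented for /ask/stream.

```python
from typing import Dict, List


def append_turn(history: List[Dict], response: Dict) -> List[Dict]:
    """Return a new history list extended with one question/answer pair."""
    return list(history) + [
        {"role": "user", "content": response["query"]},
        {"role": "assistant", "content": response["answer"]},
    ]


def build_followup(query: str, history: List[Dict], **options) -> Dict:
    """Build the JSON body for the next POST /ask with prior turns attached."""
    return {"query": query, "chat_history": history, **options}
```

Passing the response above through append_turn and then calling build_followup("And on weekends?", history, top_k=2) yields a body that includes both earlier turns.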

Errors

| Condition | Status | Body |
| --- | --- | --- |
| Invalid JSON | 400 | {"error": "invalid JSON body"} |
| Missing query on /ask/stream | 400 | {"error": "query is required"} |
| Unknown route | 404 | {"error": "not found"} |
| Voice runtime / audio failure | 503 | {"error": "..."} |
| Unexpected server error | 500 | {"error": "..."} |
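
A client can map these responses to a small set of categories before deciding how to react. The sketch below is one possible mapping; the category labels and the describe_error helper are illustrative assumptions, and only the status codes and the "error" key come from the table above.

```python
import json
from typing import Tuple

# Labels are invented for this example; status codes are from the table.
KINDS = {
    400: "bad_request",
    404: "not_found",
    500: "server_error",
    503: "voice_unavailable",
}


def describe_error(status: int, body: bytes) -> Tuple[str, str]:
    """Return (kind, message) for a non-2xx response from the voice server."""
    kind = KINDS.get(status, "unknown")
    try:
        parsed = json.loads(body)
    except ValueError:  # not JSON, or not decodable
        parsed = None
    if isinstance(parsed, dict) and isinstance(parsed.get("error"), str):
        message = parsed["error"]
    else:
        # Fall back to the raw body when it is not the documented shape.
        message = body.decode("utf-8", errors="replace")
    return kind, message
```

With urllib, the status and body of a failed call are available on the raised urllib.error.HTTPError as err.code and err.read().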