Skip to main content

Synopsis

moorcheh-edge answer --query TEXT [options]
Embeds --query locally (BGE, 768-dim), searches the store, and calls POST /answer. The server uses Ollama with fixed model llama3.2:1b-instruct-q4_K_M.
Run moorcheh-edge up first so Ollama is installed and the LLM model is pulled. Use --skip-ollama on up only if you want search without RAG.

Options

FlagDefaultDescription
--queryRequiredQuestion to answer
--top-k5Passages to retrieve for context
--threshold0.0Minimum score when --kiosk-mode is set
--kiosk-modeoffFilter low-scoring passages
--header-promptCustom system instruction
--footer-promptInstruction before the question
--chat-history-jsonPrior turns as JSON array
--temperature0.2 (server)LLM temperature 0.0–2.0
--timeout120HTTP timeout in seconds
--base-urlhttp://localhost:8080API base URL

Examples

moorcheh-edge upload-documents --documents-file documents.json
moorcheh-edge answer --query "Who won the football match?" --top-k 5
With custom temperature and longer timeout (useful on edge hardware):
moorcheh-edge answer --query "Who won?" --top-k 5 --temperature 0.1 --timeout 300

Output

Prints JSON to stdout:
{
  "answer": "...",
  "model": "llama3.2:1b-instruct-q4_K_M",
  "query": "Who won the football match?",
  "context_count": 1,
  "sources": [...]
}
See API: Answer.