answer.generate

Generate AI-powered answers in one of two modes: RAG, which retrieves context from a text namespace, or a direct LLM call. Before calling it, run moorcheh configure to set the LLM provider (Ollama, OpenAI, or Cohere).
client.answer.generate(
    *,
    namespace: str,
    query: str,
    top_k: int | None = None,
    threshold: float | None = None,
    kiosk_mode: bool = False,
    temperature: float | None = None,
    ai_model: str | None = None,
    header_prompt: str | None = None,
    footer_prompt: str | None = None,
    chat_history: list[dict[str, str]] | None = None,
    structured_response: dict | None = None,
) -> dict[str, Any]
API: POST /answer — see Generate AI Answer

Search Mode (RAG)

from moorcheh import MoorchehClient

with MoorchehClient("http://localhost:8080") as client:
    response = client.answer.generate(
        namespace="my-documents",
        query="What are the main benefits?",
        top_k=5,
    )
    print(response["answer"])
    print(response["context_count"])

Direct AI Mode

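Pass an empty namespace string to skip retrieval and send the query straight to the LLM:
from moorcheh import MoorchehClient

with MoorchehClient("http://localhost:8080") as client:
    response = client.answer.generate(
        namespace="",  # empty string selects direct LLM mode
        query="Explain vector search in one paragraph",
        temperature=0.5,
        header_prompt="You are a concise technical writer.",
    )
    print(response["answer"])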

With chat history

from moorcheh import MoorchehClient

with MoorchehClient("http://localhost:8080") as client:
    response = client.answer.generate(
        namespace="",
        query="Can you give an example?",
        # Prior turns, oldest first, as role/content pairs
        chat_history=[
            {"role": "user", "content": "What is RAG?"},
            {"role": "assistant", "content": "RAG combines retrieval with generation..."},
        ],
    )
    print(response["answer"])
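
For a multi-turn conversation, the history list has to be carried from one call to the next. The sketch below shows one way to do that. The helper name ask_with_history is hypothetical and not part of the SDK, and it assumes that the caller is responsible for keeping the history and that the current query is not repeated inside chat_history. It takes the generate callable as an argument so it can be tried without a running server.

```python
from typing import Any, Callable

Turn = dict[str, str]


def ask_with_history(
    generate: Callable[..., dict[str, Any]],
    history: list[Turn],
    query: str,
    namespace: str = "",
) -> str:
    """Send one query with the prior turns, then record both sides of the exchange.

    Hypothetical helper: `generate` is expected to behave like
    client.answer.generate and return a dict with an "answer" key.
    """
    # Pass a copy so later appends do not mutate what was sent.
    response = generate(namespace=namespace, query=query, chat_history=list(history))
    answer = response["answer"]
    history.append({"role": "user", "content": query})
    history.append({"role": "assistant", "content": answer})
    return answer
```

With a real client this would be called as ask_with_history(client.answer.generate, history, "What is RAG?"), reusing the same history list for each follow-up question.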

Structured output

from moorcheh import MoorchehClient

with MoorchehClient("http://localhost:8080") as client:
    response = client.answer.generate(
        namespace="docs",
        query="Summarize the return policy",
        structured_response={"enabled": True},
    )
    # structured_data is only present when structured output is enabled
    print(response.get("structured_data"))

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | User question |
| namespace | string | Yes | Text namespace for RAG, or "" for direct LLM |
| top_k | number | No | Chunks to retrieve (default 10) |
| temperature | number | No | 0.0 to 2.0 (default 0.7) |
| ai_model | string | No | Override configured LLM model |
| chat_history | array | No | Prior turns |
| header_prompt | string | No | System instruction |
| footer_prompt | string | No | Trailing instruction |
| kiosk_mode | boolean | No | Filter by threshold |
| threshold | number | No | Required if kiosk_mode is true |
| structured_response | object | No | { "enabled": true, "schema": {...} } |
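
Two of the constraints in the table above can be checked before a request is sent: the temperature range, and the rule that threshold must accompany kiosk_mode. The following is a minimal sketch of such a check. The function build_generate_kwargs is a hypothetical helper, not an SDK function, and whether the SDK or server already performs the same validation is not stated here.

```python
from __future__ import annotations

from typing import Any


def build_generate_kwargs(
    *,
    namespace: str,
    query: str,
    temperature: float | None = None,
    kiosk_mode: bool = False,
    threshold: float | None = None,
    top_k: int | None = None,
) -> dict[str, Any]:
    """Validate a few documented constraints and return keyword arguments.

    Hypothetical helper; the result is meant to be unpacked into
    client.answer.generate(**kwargs).
    """
    if not query:
        raise ValueError("query is required")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if kiosk_mode and threshold is None:
        raise ValueError("threshold is required when kiosk_mode is True")

    kwargs: dict[str, Any] = {
        "namespace": namespace,
        "query": query,
        "kiosk_mode": kiosk_mode,
    }
    # Omit unset options instead of sending None.
    for name, value in (("temperature", temperature), ("threshold", threshold), ("top_k", top_k)):
        if value is not None:
            kwargs[name] = value
    return kwargs
```

A call such as client.answer.generate(**build_generate_kwargs(namespace="docs", query="Opening hours?", kiosk_mode=True, threshold=0.3)) then fails fast on the client side if threshold is forgotten.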

Response

| Field | Type | Description |
| --- | --- | --- |
| answer | string | Generated text |
| model | string | Model ID used |
| context_count | number | Chunks used for RAG |
| query | string | Echo of input query |
| structured_data | object | When structured output is enabled |
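
Because structured_data only appears when structured output is enabled, code that consumes the response should treat it as optional. The sketch below formats a response dict defensively. The function summarize_response is a hypothetical helper, and any response passed to it in examples is made-up sample data shaped like the fields listed above, not real server output.

```python
import json
from typing import Any


def summarize_response(response: dict[str, Any]) -> str:
    """Format a generate() response, tolerating optional fields.

    Hypothetical helper: only "answer" is treated as mandatory here.
    """
    lines = [f"[{response.get('model', 'unknown model')}] {response['answer']}"]

    # context_count is 0 or absent when no chunks were used.
    context_count = response.get("context_count")
    if context_count:
        lines.append(f"grounded in {context_count} chunk(s)")

    # structured_data is only present when structured output is enabled.
    structured = response.get("structured_data")
    if structured is not None:
        lines.append("structured: " + json.dumps(structured, sort_keys=True))

    return "\n".join(lines)
```

Passing the dict returned by client.answer.generate(...) to summarize_response yields a short text summary suitable for logging.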