Overview

Generate AI-powered answers to questions using your uploaded data as context. Moorcheh supports nine AI models for answer generation, the same catalog as the Answer API.
The Answer API supports two modes: Search Mode (a namespace is provided, so your data is searched for context) and Direct AI Mode (the namespace is empty, so the query goes directly to the model). Available models include Claude Sonnet 4.6, Claude Opus 4.6, Llama 4 Maverick, Amazon Nova Pro, DeepSeek, and Qwen, among others; see the table below and the API reference for full details.

Supported AI Models

| Model ID | Name | Provider | Description |
| --- | --- | --- | --- |
| anthropic.claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | Fast flagship: coding, tools, long docs and RAG (~1M context). |
| anthropic.claude-opus-4-6-v1 | Claude Opus 4.6 | Anthropic | Deepest reasoning and hardest tasks; pick when quality matters most (~1M context). |
| meta.llama4-maverick-17b-instruct-v1:0 | Llama 4 Maverick 17B | Meta | Long context, summarization, function calling, fine-tuning friendly. |
| amazon.nova-pro-v1:0 | Amazon Nova Pro | Amazon | Chat, math, and structured answers for AWS-style workloads. |
| deepseek.r1-v1:0 | DeepSeek R1 | DeepSeek | Step-by-step reasoning; math, logic, and technical explanations. |
| deepseek.v3.2 | DeepSeek V3.2 | DeepSeek | Efficient general Q&A, multilingual, everyday RAG (~164K context). |
| openai.gpt-oss-120b-1:0 | OpenAI GPT OSS 120B | OpenAI | Large generalist: research-style answers and long-form writing. |
| qwen.qwen3-32b-v1:0 | Qwen 3 32B | Qwen | Code and bilingual (EN/ZH) tasks in a smaller footprint. |
| qwen.qwen3-next-80b-a3b | Qwen3 Next 80B A3B | Qwen | MoE model for long chats, docs, and code at scale (~256K context). |
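
Because the ai_model parameter takes one of the exact model IDs listed above, it can help to catch typos before making a request. The following is a minimal sketch, not part of the SDK: SUPPORTED_MODELS and validate_model are hypothetical names, and the IDs are copied from the table.

```python
# Model IDs and display names copied from the table above.
# SUPPORTED_MODELS and validate_model are illustrative helpers, not SDK APIs.
SUPPORTED_MODELS = {
    "anthropic.claude-sonnet-4-6": "Claude Sonnet 4.6",
    "anthropic.claude-opus-4-6-v1": "Claude Opus 4.6",
    "meta.llama4-maverick-17b-instruct-v1:0": "Llama 4 Maverick 17B",
    "amazon.nova-pro-v1:0": "Amazon Nova Pro",
    "deepseek.r1-v1:0": "DeepSeek R1",
    "deepseek.v3.2": "DeepSeek V3.2",
    "openai.gpt-oss-120b-1:0": "OpenAI GPT OSS 120B",
    "qwen.qwen3-32b-v1:0": "Qwen 3 32B",
    "qwen.qwen3-next-80b-a3b": "Qwen3 Next 80B A3B",
}


def validate_model(model_id: str) -> str:
    """Return model_id unchanged if it is listed in the table, else raise ValueError."""
    if model_id not in SUPPORTED_MODELS:
        known = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"Unknown model ID {model_id!r}; expected one of: {known}")
    return model_id
```

The validated ID can then be passed as ai_model, for example ai_model=validate_model("deepseek.r1-v1:0").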

Basic Usage

from moorcheh_sdk import MoorchehClient

client = MoorchehClient(api_key="your-api-key")

# Generate answer from your data
answer = client.get_answer(
    namespace="my-documents",
    query="What is Moorcheh?",
    ai_model="deepseek.r1-v1:0"
)

print(answer["answer"])

Search Mode vs Direct AI Mode

Search Mode (with namespace)

When you provide a namespace, the API searches your data for relevant context and uses it to generate contextual answers.
# Answer based on your data
answer = client.get_answer(
    namespace="my-documents",
    query="What are the main features?",
    ai_model="deepseek.r1-v1:0",
    top_k=5  # Number of documents to consider
)
Best for:
  • Q&A over your documents
  • Knowledge base queries
  • Context-aware responses
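
Direct AI Mode (empty namespace)

The overview describes Direct AI Mode as a call with an empty namespace, which sends the query to the model without searching your data. The sketch below assumes that the SDK's get_answer accepts namespace="" for this purpose; ask_direct is a hypothetical wrapper, and the default model is only an illustrative choice.

```python
def ask_direct(client, query, ai_model="anthropic.claude-sonnet-4-6", **kwargs):
    """Call the model without document context (Direct AI Mode).

    Assumption: passing namespace="" to get_answer selects Direct AI Mode,
    as the overview describes for the Answer API. `client` is expected to be
    a MoorchehClient (or any object exposing a compatible get_answer method).
    """
    return client.get_answer(
        namespace="",       # empty namespace: no search over uploaded data
        query=query,
        ai_model=ai_model,
        **kwargs,           # e.g. temperature, header_prompt
    )
```

With a client created as in Basic Usage, ask_direct(client, "Summarize the CAP theorem")["answer"] would return the model's reply, provided the assumption above holds.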

Advanced Parameters

Temperature Control

Control response creativity with the temperature parameter (0.0 to 2.0):
# More deterministic (lower temperature)
answer = client.get_answer(
    namespace="my-documents",
    query="List the API endpoints",
    ai_model="deepseek.r1-v1:0",
    temperature=0.1
)

# More creative (higher temperature)
answer = client.get_answer(
    namespace="my-documents",
    query="Write a blog post about our features",
    ai_model="deepseek.r1-v1:0",
    temperature=0.9
)
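
When the temperature comes from user input or configuration, it may be worth keeping it inside the documented 0.0 to 2.0 range before calling get_answer. This is a small sketch; clamp_temperature is a hypothetical helper, and whether the SDK itself rejects out-of-range values is not covered here.

```python
def clamp_temperature(value, low=0.0, high=2.0):
    """Clamp a temperature to the documented range [0.0, 2.0]."""
    return max(low, min(high, float(value)))
```

For example, temperature=clamp_temperature(user_value) keeps a stray value such as 3.5 from being sent as-is.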

Custom Prompts

Add custom instructions for the AI:
answer = client.get_answer(
    namespace="my-documents",
    query="Explain our pricing",
    ai_model="deepseek.r1-v1:0",
    header_prompt="You are a helpful sales assistant. Be concise and friendly.",
    footer_prompt="Always end with a call to action."
)

Chat History

Maintain conversation context:
answer = client.get_answer(
    namespace="my-documents",
    query="What about the advanced features?",
    ai_model="deepseek.r1-v1:0",
    chat_history=[
        {"role": "user", "content": "What features do you offer?"},
        {"role": "assistant", "content": "We offer semantic search, vector storage..."}
    ]
)
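
For a multi-turn conversation, the history has to grow after every exchange. One possible way to manage that is sketched below; ask_with_history is a hypothetical helper, and it assumes that chat_history should hold only earlier turns, with the current question passed separately as query.

```python
def ask_with_history(client, history, query, namespace, ai_model):
    """Ask a follow-up question and record the exchange in `history`.

    `history` is a list of {"role": ..., "content": ...} dicts in the same
    shape as the chat_history example above. It is updated in place.
    """
    response = client.get_answer(
        namespace=namespace,
        query=query,
        ai_model=ai_model,
        chat_history=list(history),  # pass a copy of the earlier turns only
    )
    # Append the new turn so the next call sees the full conversation.
    history.append({"role": "user", "content": query})
    history.append({"role": "assistant", "content": response["answer"]})
    return response["answer"]
```

Starting from history = [], each call to ask_with_history adds two entries, one for the user question and one for the assistant reply.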

Response Format

Successful responses match the Answer API, including the model field (the model ID used for the reply):
{
  "answer": "Moorcheh is a lightning-fast semantic search engine...",
  "model": "deepseek.r1-v1:0",
  "context_count": 3,
  "query": "What is Moorcheh?"
}
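
If you prefer attribute access over dictionary keys, the response can be wrapped in a small typed structure. The sketch below uses only the four fields shown in the example response; AnswerResult and parse_answer are hypothetical names, and the fallback defaults for missing fields are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerResult:
    answer: str
    model: str
    context_count: int
    query: str


def parse_answer(response: dict) -> AnswerResult:
    """Build an AnswerResult from a get_answer response dictionary."""
    return AnswerResult(
        answer=response["answer"],                           # required
        model=response.get("model", ""),                     # model ID used for the reply
        context_count=int(response.get("context_count", 0)),
        query=response.get("query", ""),
    )
```

Usage would look like result = parse_answer(answer), followed by result.answer or result.model.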

Model Selection Guide

General Purpose

Claude Sonnet 4.6 — Fast flagship for coding, tools, long docs, and RAG

Advanced Reasoning

Claude Opus 4.6 — Deepest reasoning when quality matters most

Long Context

Llama 4 Maverick — Long context, summarization, and function calling

Code & Logic

DeepSeek R1 — Step-by-step reasoning, math, and technical explanations
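
The guide above can be encoded as a lookup so that application code names a use case rather than a raw model ID. This is an illustrative sketch: the use-case keys and pick_model are hypothetical, while the model IDs come from the Supported AI Models table.

```python
# Use-case keys are illustrative; model IDs are taken from the table of supported models.
MODEL_BY_USE_CASE = {
    "general": "anthropic.claude-sonnet-4-6",                    # General Purpose
    "reasoning": "anthropic.claude-opus-4-6-v1",                 # Advanced Reasoning
    "long_context": "meta.llama4-maverick-17b-instruct-v1:0",    # Long Context
    "code_logic": "deepseek.r1-v1:0",                            # Code & Logic
}


def pick_model(use_case: str) -> str:
    """Return the model ID suggested by the selection guide for a use case."""
    try:
        return MODEL_BY_USE_CASE[use_case]
    except KeyError:
        options = ", ".join(sorted(MODEL_BY_USE_CASE))
        raise ValueError(f"Unknown use case {use_case!r}; choose from: {options}") from None
```

A call such as ai_model=pick_model("code_logic") then resolves to deepseek.r1-v1:0.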

Best Practices

Model selection:
  • Use Claude Sonnet 4.6 for general RAG, tools, and long documents
  • Use DeepSeek R1 for math, logic, and step-by-step technical explanations
  • Use Llama 4 Maverick for very long context and summarization
Document count (top_k):
  • Use 3-5 documents for focused answers
  • Use 8-10 documents for comprehensive responses
  • Higher top_k values may include irrelevant context
Temperature:
  • 0.1-0.3 for factual, deterministic answers
  • 0.7 for balanced responses (default)
  • 0.9-1.0 for creative content generation
Prompts:
  • Add role context in header_prompt
  • Specify output format requirements
  • Keep prompts concise and clear
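
The document-count and temperature recommendations above can be bundled into named presets. The sketch below is one possible arrangement: the preset names, the exact values picked from within each recommended range, and the answer_kwargs helper are all assumptions made for illustration.

```python
# Values are chosen from within the ranges recommended above;
# the preset names and pairings are illustrative, not SDK defaults.
PRESETS = {
    "focused_factual": {"top_k": 5, "temperature": 0.2},
    "comprehensive": {"top_k": 10, "temperature": 0.7},
    "creative": {"top_k": 5, "temperature": 0.9},
}


def answer_kwargs(preset: str, **overrides) -> dict:
    """Return keyword arguments for get_answer, with optional overrides."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset {preset!r}")
    return {**PRESETS[preset], **overrides}
```

These can be unpacked into a call, for example client.get_answer(namespace="my-documents", query="List the API endpoints", ai_model="deepseek.r1-v1:0", **answer_kwargs("focused_factual")).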

Next Steps

Search API

Learn about semantic search

Upload Data

Add documents to your namespace

API Reference

Complete AI generation API docs

Examples

See real-world examples