Overview
Generate AI-powered answers to questions using your uploaded data as context. Moorcheh supports nine AI models for answer generation, the same catalog as the Answer API.
The Answer API supports two modes: Search Mode (with a namespace) and Direct AI Mode (an empty namespace, for direct model calls). Available models include Claude Sonnet 4.6, Claude Opus 4.6, Llama 4 Maverick, Amazon Nova Pro, DeepSeek, Qwen, and others; see the table below and the API reference for full details.
Supported AI Models
| Model ID | Name | Provider | Description |
|---|---|---|---|
| anthropic.claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | Fast flagship: coding, tools, long docs and RAG (~1M context). |
| anthropic.claude-opus-4-6-v1 | Claude Opus 4.6 | Anthropic | Deepest reasoning and hardest tasks; pick when quality matters most (~1M context). |
| meta.llama4-maverick-17b-instruct-v1:0 | Llama 4 Maverick 17B | Meta | Long context, summarization, function calling, fine-tuning friendly. |
| amazon.nova-pro-v1:0 | Amazon Nova Pro | Amazon | Chat, math, and structured answers for AWS-style workloads. |
| deepseek.r1-v1:0 | DeepSeek R1 | DeepSeek | Step-by-step reasoning; math, logic, and technical explanations. |
| deepseek.v3.2 | DeepSeek V3.2 | DeepSeek | Efficient general Q&A, multilingual, everyday RAG (~164K context). |
| openai.gpt-oss-120b-1:0 | OpenAI GPT OSS 120B | OpenAI | Large generalist: research-style answers and long-form writing. |
| qwen.qwen3-32b-v1:0 | Qwen 3 32B | Qwen | Code and bilingual (EN/ZH) tasks in a smaller footprint. |
| qwen.qwen3-next-80b-a3b | Qwen3 Next 80B A3B | Qwen | MoE model for long chats, docs, and code at scale (~256K context). |
Basic Usage
Search Mode vs Direct AI Mode
Search Mode (with namespace)
When you provide a namespace, the API searches your data for relevant context and uses it to generate contextual answers.
- Q&A over your documents
- Knowledge base queries
- Context-aware responses
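The two modes differ only in whether a namespace is supplied. The sketch below builds a request body for each mode, assuming a JSON payload. The helper `build_answer_request` and the request field names `query` and `model` are illustrative assumptions, not a confirmed schema (this guide itself only names `namespace`, `top_k`, and the `model` field of the response), so check the API reference before relying on them.

```python
from typing import Any, Dict, Optional


def build_answer_request(
    query: str,
    namespace: str = "",
    model_id: str = "anthropic.claude-sonnet-4-6",
    top_k: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body for Search Mode or Direct AI Mode.

    Field names are assumptions for illustration, not the confirmed schema.
    """
    if not query.strip():
        raise ValueError("query must not be empty")

    body: Dict[str, Any] = {
        "namespace": namespace,  # an empty string selects Direct AI Mode
        "query": query,
        "model": model_id,  # hypothetical request field name
    }
    # top_k only matters when there is a namespace to search.
    if namespace and top_k is not None:
        body["top_k"] = top_k
    return body


# Search Mode: answers are grounded in documents from the namespace.
search_request = build_answer_request(
    "What is our refund policy?", namespace="support-docs", top_k=5
)

# Direct AI Mode: no namespace, so the model is called directly.
direct_request = build_answer_request("Explain vector search in one paragraph.")
```

Keeping the mode decision in one helper means callers switch modes by passing or omitting `namespace`, rather than maintaining two separate code paths.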
Advanced Parameters
Temperature Control
Control response creativity with a temperature between 0.0 and 2.0.
Custom Prompts
Add custom instructions for the AI.
Chat History
Maintain conversation context across turns.
Response Format
Successful responses match the Answer API, including the model field (the model ID used for the reply).
Model Selection Guide
General Purpose
Claude Sonnet 4.6 — Fast flagship for coding, tools, long docs, and RAG
Advanced Reasoning
Claude Opus 4.6 — Deepest reasoning when quality matters most
Long Context
Llama 4 Maverick — Long context, summarization, and function calling
Code & Logic
DeepSeek R1 — Step-by-step reasoning, math, and technical explanations
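The guide above can be captured as a small lookup so that application code does not hard-code model IDs in several places. This is a minimal sketch: the model IDs are copied from the Supported AI Models table, while the task labels and the `pick_model` helper are illustrative names, not part of the API.

```python
# Model IDs come from the Supported AI Models table; task labels are illustrative.
RECOMMENDED_MODELS = {
    "general": "anthropic.claude-sonnet-4-6",
    "advanced_reasoning": "anthropic.claude-opus-4-6-v1",
    "long_context": "meta.llama4-maverick-17b-instruct-v1:0",
    "code_and_logic": "deepseek.r1-v1:0",
}


def pick_model(task: str) -> str:
    """Return the recommended model ID, falling back to the general-purpose model."""
    return RECOMMENDED_MODELS.get(task, RECOMMENDED_MODELS["general"])


model_for_math = pick_model("code_and_logic")
model_for_unknown = pick_model("translation")  # not in the guide, so the fallback applies
```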
Best Practices
Choose the right model
- Use Claude Sonnet 4.6 for general RAG, tools, and long documents
- Use DeepSeek R1 for math, logic, and step-by-step technical explanations
- Use Llama 4 Maverick for very long context and summarization
Optimize top_k
- Use 3-5 documents for focused answers
- Use 8-10 documents for comprehensive responses
- Higher values may include irrelevant context
Set appropriate temperature
- 0.1-0.3 for factual, deterministic answers
- 0.7 for balanced responses (default)
- 0.9-1.0 for creative content generation
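One way to apply these ranges consistently is to name them as presets and validate any custom value against the documented 0.0 to 2.0 range before sending a request. The preset names and the `resolve_temperature` helper below are hypothetical, and the specific preset values are simply points chosen inside the ranges listed above.

```python
from typing import Union

# Values chosen from the ranges above; preset names are illustrative.
TEMPERATURE_PRESETS = {
    "factual": 0.2,   # within 0.1-0.3
    "balanced": 0.7,  # the documented default
    "creative": 1.0,  # within 0.9-1.0
}


def resolve_temperature(setting: Union[str, float] = "balanced") -> float:
    """Turn a preset name or a number into a temperature within 0.0-2.0."""
    if isinstance(setting, str):
        if setting not in TEMPERATURE_PRESETS:
            raise ValueError(f"unknown preset: {setting!r}")
        return TEMPERATURE_PRESETS[setting]

    value = float(setting)
    if not 0.0 <= value <= 2.0:
        raise ValueError(f"temperature {value} is outside 0.0-2.0")
    return value


factual_temperature = resolve_temperature("factual")
custom_temperature = resolve_temperature(1.5)
```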
Use custom prompts wisely
- Add role context in header_prompt
- Specify output format requirements
- Keep prompts concise and clear
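As an illustration of these points, the sketch below assembles a short header_prompt from a role and an output-format requirement. The `build_header_prompt` helper and the length cap are assumptions made for this example; only the header_prompt parameter name comes from this guide.

```python
MAX_PROMPT_CHARS = 300  # arbitrary cap for this sketch, to keep prompts concise


def build_header_prompt(role: str, output_format: str) -> str:
    """Combine role context and a format requirement into one short prompt."""
    prompt = f"You are {role.strip()}. Format: {output_format.strip()}."
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("header prompt is too long; keep it concise")
    return prompt


header_prompt = build_header_prompt(
    "a support assistant for an internal knowledge base",
    "a bulleted list of at most five points",
)
```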
Next Steps
Search API
Learn about semantic search
Upload Data
Add documents to your namespace
API Reference
Complete AI generation API docs
Examples
See real-world examples