AI Answer Generation

Overview

Generate AI-powered answers to questions using your uploaded data as context. Moorcheh supports the 9 AI models listed below for answer generation.

The Answer API supports two modes: Search Mode (with namespace) and Direct AI Mode (empty namespace for direct model calls).

Supported AI Models

Model ID	Provider	Description
`anthropic.claude-sonnet-4-20250514-v1:0`	Anthropic	Hybrid reasoning, efficient code generation
`anthropic.claude-sonnet-4-5-20250929-v1:0`	Anthropic	Latest Claude with agentic search
`anthropic.claude-opus-4-5-20251101-v1:0`	Anthropic	Most advanced Claude with superior reasoning
`meta.llama4-maverick-17b-instruct-v1:0`	Meta	1M token context, function calling
`meta.llama3-3-70b-instruct-v1:0`	Meta	Advanced reasoning capabilities
`amazon.nova-pro-v1:0`	Amazon	300K context, complex reasoning
`deepseek.r1-v1:0`	DeepSeek	Advanced reasoning and code generation
`openai.gpt-oss-120b-1:0`	OpenAI	Hybrid reasoning, research
`qwen.qwen3-32b-v1:0`	Qwen	Text and code generation

Basic Usage

from moorcheh_sdk import MoorchehClient

client = MoorchehClient(api_key="your-api-key")

# Generate answer from your data
answer = client.get_answer(
    namespace="my-documents",
    query="What is Moorcheh?",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0"
)

print(answer["answer"])

Search Mode vs Direct AI Mode

Search Mode (with namespace)

When you provide a namespace, the API searches your data for relevant context and uses it to generate contextual answers.

# Answer based on your data
answer = client.get_answer(
    namespace="my-documents",
    query="What are the main features?",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
    top_k=5  # Number of documents to consider
)

Best for:

Q&A over your documents
Knowledge base queries
Context-aware responses

Direct AI Mode (empty namespace)

Pass an empty string "" as namespace to make direct calls to the AI model without searching your data.

# Direct AI model call
answer = client.get_answer(
    namespace="",  # Empty for direct mode
    query="Explain quantum computing",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0"
)

Best for:

General knowledge questions
Code generation
Content creation
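
Since the two modes differ only in the namespace argument, a thin wrapper can make the choice explicit. The following is a minimal sketch, not part of the SDK: `build_answer_kwargs` and `DEFAULT_MODEL` are hypothetical names, and omitting `top_k` in Direct AI Mode is an assumption based on there being no documents to search.

```python
from typing import Any, Dict, Optional

DEFAULT_MODEL = "anthropic.claude-sonnet-4-20250514-v1:0"


def build_answer_kwargs(
    query: str,
    namespace: Optional[str] = None,
    ai_model: str = DEFAULT_MODEL,
    top_k: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for client.get_answer (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must not be empty")

    # Direct AI Mode is selected by passing an empty string as the namespace.
    kwargs: Dict[str, Any] = {
        "namespace": namespace or "",
        "query": query,
        "ai_model": ai_model,
    }

    # Assumption: top_k only matters when a namespace is searched (Search Mode).
    if namespace and top_k is not None:
        kwargs["top_k"] = top_k
    return kwargs
```

With a client in hand, `client.get_answer(**build_answer_kwargs("Explain quantum computing"))` would issue a Direct AI Mode call, while passing `namespace="my-documents"` switches the same call to Search Mode.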

Advanced Parameters

Temperature Control

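Control response creativity with temperature (0.0 to 2.0). Lower values give more deterministic output; higher values give more varied output: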

# More deterministic (lower temperature)
answer = client.get_answer(
    namespace="my-documents",
    query="List the API endpoints",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
    temperature=0.1
)

# More creative (higher temperature)
answer = client.get_answer(
    namespace="my-documents",
    query="Write a blog post about our features",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
    temperature=0.9
)

Custom Prompts

Add custom instructions for the AI:

answer = client.get_answer(
    namespace="my-documents",
    query="Explain our pricing",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
    headerPrompt="You are a helpful sales assistant. Be concise and friendly.",
    footerPrompt="Always end with a call to action."
)

Chat History

Maintain conversation context:

answer = client.get_answer(
    namespace="my-documents",
    query="What about the advanced features?",
    ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
    chatHistory=[
        {"role": "user", "content": "What features do you offer?"},
        {"role": "assistant", "content": "We offer semantic search, vector storage..."}
    ]
)
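
In a multi-turn conversation the history list has to grow after every exchange. One way to keep that bookkeeping in a single place is sketched below. `ask_with_history` is a hypothetical helper, and the `ask` callable stands in for whatever wraps `client.get_answer` in your code; only the `role`/`content` message shape comes from the example above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def ask_with_history(
    ask: Callable[[str, List[Message]], str],
    history: List[Message],
    query: str,
) -> str:
    """Send query together with prior turns, then record both sides of the new turn."""
    # Pass a copy so the callee cannot mutate the running log.
    reply = ask(query, list(history))
    history.append({"role": "user", "content": query})
    history.append({"role": "assistant", "content": reply})
    return reply


# Assumed wiring to the client (not executed here):
# def ask(query, history):
#     result = client.get_answer(
#         namespace="my-documents",
#         query=query,
#         ai_model="anthropic.claude-sonnet-4-20250514-v1:0",
#         chatHistory=history,
#     )
#     return result["answer"]
```

The history is updated only after a reply comes back, so a failed call leaves the log unchanged and the question can be retried.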

Response Format

{
  "answer": "Moorcheh is a lightning-fast semantic search engine...",
  "sources": [
    {
      "id": "doc-123",
      "score": 0.95,
      "text": "Source content..."
    }
  ]
}
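
A response of this shape can be rendered for display with a few lines of plain Python. The sketch below assumes only the `answer` and `sources` fields shown above. `format_answer` and its `min_score` parameter are hypothetical, and the helper tolerates a missing `sources` list on the assumption that Direct AI Mode may not return one.

```python
from typing import Any, Dict


def format_answer(response: Dict[str, Any], min_score: float = 0.0) -> str:
    """Render the answer followed by its sources, highest score first."""
    lines = [response.get("answer", "")]

    # Keep only sources at or above the requested relevance score.
    sources = [
        s for s in response.get("sources", [])
        if s.get("score", 0.0) >= min_score
    ]
    for src in sorted(sources, key=lambda s: s.get("score", 0.0), reverse=True):
        lines.append(
            f"[{src.get('id', '?')}] score={src.get('score', 0.0):.2f}: {src.get('text', '')}"
        )
    return "\n".join(lines)
```

Calling `print(format_answer(answer, min_score=0.8))` would then show the answer with only its strongest supporting passages.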

Model Selection Guide

General Purpose

Claude Sonnet 4 - Best balance of speed and quality

Advanced Reasoning

Claude Sonnet 4.5 - Latest model with agentic capabilities

Long Context

Llama 4 Maverick - 1M token context window

Code Generation

DeepSeek R1 - Specialized for coding tasks

Best Practices

Choose the right model

Use Claude Sonnet 4 for general queries
Use DeepSeek R1 for code-related questions
Use Llama 4 Maverick for very long documents
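
These recommendations can be captured in a small lookup so that model IDs are not scattered through application code. This is a sketch only: the task labels (`general`, `code`, `long_context`) and the `pick_model` helper are hypothetical, while the model IDs are taken from the Supported AI Models table.

```python
# Task labels are illustrative; the IDs come from the Supported AI Models table.
MODEL_BY_TASK = {
    "general": "anthropic.claude-sonnet-4-20250514-v1:0",
    "code": "deepseek.r1-v1:0",
    "long_context": "meta.llama4-maverick-17b-instruct-v1:0",
}


def pick_model(task: str) -> str:
    """Return the suggested model ID for a task, falling back to the general model."""
    return MODEL_BY_TASK.get(task, MODEL_BY_TASK["general"])
```

The result can be passed straight to the `ai_model` argument, for example `ai_model=pick_model("code")`.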

Optimize top_k

Use 3-5 documents for focused answers
Use 8-10 documents for comprehensive responses
Higher values may include irrelevant context

Set appropriate temperature

0.1-0.3 for factual, deterministic answers
0.7 for balanced responses (default)
0.9-1.0 for creative content generation
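
The bands above can be turned into named presets, with a range check for any value supplied by hand. This is a minimal sketch: the preset names, the specific values picked inside each band, and the `resolve_temperature` helper are assumptions made for illustration. The 0.0 to 2.0 range and the 0.7 default come from this page.

```python
from typing import Optional

TEMPERATURE_PRESETS = {
    "factual": 0.2,   # chosen from inside the 0.1-0.3 band
    "balanced": 0.7,  # the documented default
    "creative": 0.9,  # lower end of the 0.9-1.0 band
}


def resolve_temperature(style: str = "balanced", override: Optional[float] = None) -> float:
    """Return the preset for a style, or a validated explicit override."""
    # An unknown style raises KeyError; an explicit override takes priority.
    value = TEMPERATURE_PRESETS[style] if override is None else override
    if not 0.0 <= value <= 2.0:
        raise ValueError(f"temperature must be between 0.0 and 2.0, got {value}")
    return float(value)
```

A call such as `temperature=resolve_temperature("factual")` keeps the choice readable at the call site and rejects out-of-range values before a request is sent.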

Use custom prompts wisely

Add role context in headerPrompt
Specify output format requirements
Keep prompts concise and clear

Next Steps

Search API

Learn about semantic search

Upload Data

Add documents to your namespace

API Reference

Complete AI generation API docs

Examples

See real-world examples
