POST /v1/answer
curl -X POST "https://api.moorcheh.ai/v1/answer" \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key-here" \
  -d '{
    "namespace": "my-document-collection",
    "query": "What are the main benefits of Moorcheh?",
    "type": "text",
    "top_k": 5
  }'
{
  "answer": "Serverless architecture offers several benefits, including reduced operational costs as you only pay for what you use, automatic scaling to handle workload changes, and faster time-to-market since developers can focus on code instead of infrastructure management.",
  "model": "deepseek.r1-v1:0",
  "context_count": 3,
  "query": "What are the main benefits of using serverless architecture?"
}

Overview

Generate AI-powered answers to questions in two modes: Search Mode (namespace provided) and Direct AI Mode (empty namespace). The API supports context-aware generation grounded in your data, as well as direct AI model calls.
Multiple AI models are supported, including Claude Sonnet 4.6, Claude Opus 4.6, Llama 4 Maverick, Amazon Nova Pro, DeepSeek, Qwen, and others. Use the empty string "" as the namespace for direct AI calls.

Authentication

x-api-key
string
required
Your API key for authentication
Content-Type
string
required
Must be application/json

Body Parameters

query
string
required
The user’s question or query to be answered
namespace
string
required
Namespace name for Search Mode, or empty string "" for Direct AI Mode
top_k
number
Number of top-ranked relevant chunks to retrieve from the given namespace for your query. Default is 10.
threshold
number
Minimum relevance score threshold (0-1) to filter out chunks below this relevance level. Required when kiosk_mode is true.
temperature
number
AI creativity level (0.0-2.0, default: 0.7). Higher = more creative
type
string
Search type: “text” (default)
ai_model
string
AI model ID (see Available Models table below)
kiosk_mode
boolean
Enable kiosk mode to filter out chunks below the relevance threshold. When kiosk mode is on, threshold is required.
chat_history
array
Previous conversation turns for context (default: [])
header_prompt
string
Custom instruction prepended to guide AI behavior
footer_prompt
string
Custom instruction to append (default: "Provide a clear and concise answer.")
structured_response
object
When set with enabled: true, the API returns a JSON object in structured_data matching a schema (default or custom). Optional: schema, tool_name, tool_description (legacy camelCase aliases still accepted). Schemas may use snake_case property names; the service normalizes them for Bedrock. See Structured Output below.
Use snake_case fields in requests/responses. Legacy camelCase aliases are still accepted for backward compatibility and return deprecation headers (Deprecation, Sunset, Warning). CamelCase support is deprecated and scheduled for removal on 1 May 2026.
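
Since camelCase aliases are deprecated, clients migrating old request bodies may want to rewrite keys before sending. The helper below is a hypothetical client-side sketch (not part of the API or any SDK) that normalizes legacy camelCase keys to the snake_case names documented above:

```python
import re

def to_snake_case(key: str) -> str:
    """Convert a legacy camelCase key like 'toolName' to 'tool_name'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()

def normalize_keys(body: dict) -> dict:
    """Recursively rewrite legacy camelCase keys to snake_case.
    Keys already in snake_case are left unchanged."""
    return {
        to_snake_case(k): normalize_keys(v) if isinstance(v, dict) else v
        for k, v in body.items()
    }
```

For example, `normalize_keys({"structuredResponse": {"toolName": "x"}})` yields `{"structured_response": {"tool_name": "x"}}`, matching the field names this page documents.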

Available Models

Model ID | Name | Provider | Description | Credits
anthropic.claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | Fast flagship: coding, tools, long docs and RAG (~1M context). | 3
anthropic.claude-opus-4-6-v1 | Claude Opus 4.6 | Anthropic | Deepest reasoning and hardest tasks; pick when quality matters most (~1M context). | 3
meta.llama4-maverick-17b-instruct-v1:0 | Llama 4 Maverick 17B | Meta | Long context, summarization, function calling, fine-tuning friendly. | 3
amazon.nova-pro-v1:0 | Amazon Nova Pro | Amazon | Chat, math, and structured answers for AWS-style workloads. | 2
deepseek.r1-v1:0 | DeepSeek R1 | DeepSeek | Step-by-step reasoning; math, logic, and technical explanations. | 1
deepseek.v3.2 | DeepSeek V3.2 | DeepSeek | Efficient general Q&A, multilingual, everyday RAG (~164K context). | 2
openai.gpt-oss-120b-1:0 | OpenAI GPT OSS 120B | OpenAI | Large generalist: research-style answers and long-form writing. | 3
qwen.qwen3-32b-v1:0 | Qwen 3 32B | Qwen | Code and bilingual (EN/ZH) tasks in a smaller footprint. | 2
qwen.qwen3-next-80b-a3b | Qwen3 Next 80B A3B | Qwen | MoE model for long chats, docs, and code at scale (~256K context). | 1

Field Restrictions

  • Empty Namespace Mode: only these fields are allowed: namespace, query, temperature, chat_history, footer_prompt, header_prompt, ai_model, structured_response
  • Provided Namespace Mode: all fields are allowed: namespace, query, top_k, threshold, type, kiosk_mode, ai_model, chat_history, header_prompt, footer_prompt, temperature, structured_response
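
A client-side sketch of these restrictions (the server performs the authoritative validation; the field lists are taken directly from the two modes above):

```python
# Allowed request fields per mode, as documented under Field Restrictions.
EMPTY_NAMESPACE_FIELDS = {
    "namespace", "query", "temperature", "chat_history",
    "footer_prompt", "header_prompt", "ai_model", "structured_response",
}
SEARCH_ONLY_FIELDS = {"top_k", "threshold", "type", "kiosk_mode"}
PROVIDED_NAMESPACE_FIELDS = EMPTY_NAMESPACE_FIELDS | SEARCH_ONLY_FIELDS

def disallowed_fields(body: dict) -> set:
    """Return the set of body fields not allowed for the request's mode.
    An empty (or missing) namespace selects Direct AI Mode."""
    allowed = (PROVIDED_NAMESPACE_FIELDS if body.get("namespace")
               else EMPTY_NAMESPACE_FIELDS)
    return set(body) - allowed
```

For example, sending top_k with an empty namespace would be flagged: `disallowed_fields({"namespace": "", "query": "q", "top_k": 5})` returns `{"top_k"}`.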

Response Fields

answer
string
The AI-generated answer based on the provided context and query
model
string
The ID of the AI model used to generate the answer
context_count
number
Number of context chunks retrieved and used for generating the answer
query
string
The original query that was submitted
used_context
boolean
When structured output is enabled: whether retrieved context was used (RAG vs Direct AI). Omitted when structured_response is not used.
structured_data
object
When structured_response.enabled is true: the JSON object matching your schema (or the default). Contains e.g. answer, confidence, sources, summary, topics, follow_up_questions. Omitted otherwise. Property names follow your schema; the default schema uses snake_case (e.g. follow_up_questions).
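
Because structured_data is only present when structured output was requested, response handling can branch on it. A minimal consumer sketch, using only the response fields documented above:

```python
def extract_answer(response: dict):
    """Prefer the schema-conformant object when structured output was
    requested; fall back to the plain answer string otherwise."""
    structured = response.get("structured_data")
    if structured is not None:
        return structured   # dict matching your schema (or the default)
    return response["answer"]  # plain text answer
```

With a plain response such as `{"answer": "...", "model": "deepseek.r1-v1:0", "context_count": 3, "query": "..."}` this returns the answer string; when structured_data is present, it returns that object instead.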

Structured Output

Include structured_response: { enabled: true } in the request to get a type-safe JSON object in structured_data instead of (or in addition to) the plain answer string. Works in both Search Mode and Direct AI Mode. Custom schema objects may use snake_case keys; they are normalized for Bedrock (camelCase) internally.

Enabling

{
  "namespace": "my-namespace",
  "query": "What are the main benefits of product X?",
  "structured_response": { "enabled": true }
}
With only enabled: true, the default schema is used.

structured_response fields

Field | Type | Required | Default | Description
enabled | boolean | Yes | – | Must be true to enable structured output.
schema | object | No | (default schema) | JSON Schema for the structured object. Omit or null to use the default. Prefer snake_case property names under properties; they are converted for Bedrock.
tool_name | string | No | "structured_response" | Internal tool name for structured output.
tool_description | string | No | "Generate a structured response based on the user's query and provided context" | Description for the model.

Default schema

If schema is omitted or null, the default schema is used:
{
  "type": "object",
  "properties": {
    "answer": {
      "type": "string",
      "description": "The main answer to the user's query"
    },
    "confidence": {
      "type": "number",
      "description": "Confidence score from 0 to 1 indicating how confident the model is in the answer",
      "minimum": 0,
      "maximum": 1
    },
    "sources": {
      "type": "array",
      "description": "List of source references used to generate the answer",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string",
            "description": "The identifier of the source document/chunk"
          },
          "relevance": {
            "type": "string",
            "description": "How relevant this source was to the answer",
            "enum": ["high", "medium", "low"]
          }
        },
        "required": ["id"]
      }
    },
    "summary": {
      "type": "string",
      "description": "A brief summary of the answer (max 200 characters)",
      "maxLength": 200
    },
    "topics": {
      "type": "array",
      "description": "Key topics or themes identified in the query/answer",
      "items": { "type": "string" }
    },
    "follow_up_questions": {
      "type": "array",
      "description": "Suggested follow-up questions the user might want to ask",
      "items": { "type": "string" }
    }
  },
  "required": ["answer", "confidence"]
}
  • Required: answer, confidence
  • Optional: sources, summary, topics, follow_up_questions
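
A lightweight client-side sanity check against the default schema's constraints can catch malformed structured_data early. This sketch checks only the required fields and bounds shown above; it is not a full JSON Schema validator:

```python
def check_default_schema(obj: dict) -> list:
    """Return a list of problems; an empty list means the object satisfies
    the default schema's required fields ('answer', 'confidence')."""
    problems = []
    if not isinstance(obj.get("answer"), str):
        problems.append("answer must be a string")
    conf = obj.get("confidence")
    if not isinstance(conf, (int, float)) or not (0 <= conf <= 1):
        problems.append("confidence must be a number in [0, 1]")
    # Optional fields: each source item requires at least an 'id'.
    for source in obj.get("sources", []):
        if "id" not in source:
            problems.append("each source requires an 'id'")
    return problems
```

For instance, `{"answer": "x", "confidence": 0.8}` passes, while a confidence of 2 (outside the schema's 0-1 range) is reported.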

Custom schema

Pass your own JSON Schema in structured_response.schema. Use type, properties, required, items, enum, minimum, maximum, maxLength, etc. Keep required minimal so the model can fill all fields.
{
  "namespace": "docs",
  "query": "What is the return policy?",
  "structured_response": {
    "enabled": true,
    "schema": {
      "type": "object",
      "properties": {
        "answer": { "type": "string" },
        "risk_level": { "type": "string", "enum": ["low", "medium", "high"] }
      },
      "required": ["answer", "risk_level"]
    }
  }
}

Structured output errors

Situation | HTTP | Notes
enabled not true | – | Normal text response; no structured_data.
Invalid structured_response or disallowed body field | 400 | Rejected by validation.
Model does not return structured data | 500 | "Error generating structured AI response: Model did not return structured data via tool use".

API Modes

Search Mode (with namespace)

When you provide a namespace, the API searches your data for relevant context and uses it to generate contextual answers.

Direct AI Mode (empty namespace)

When you pass an empty string "" as namespace, the API makes a direct call to the AI model without searching your data.
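
The two modes differ only in the namespace value, so a single request builder can cover both. A sketch using the endpoint and field names from this page (the extra options passed through are the caller's responsibility and must respect the Field Restrictions above):

```python
import json

ENDPOINT = "https://api.moorcheh.ai/v1/answer"

def build_body(query: str, namespace: str = "", **options) -> str:
    """Build the JSON body for /v1/answer. An empty namespace selects
    Direct AI Mode; a non-empty one selects Search Mode."""
    return json.dumps({"namespace": namespace, "query": query, **options})

# Search Mode: the API retrieves context from your indexed data first.
search_body = build_body("What is the return policy?", namespace="docs", top_k=5)

# Direct AI Mode: straight to the model, no retrieval.
direct_body = build_body("Explain ITS scoring.", ai_model="deepseek.r1-v1:0")
```

Either body is then POSTed to the endpoint with the Content-Type and x-api-key headers shown in the curl example above.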

Temperature Guide

  • 0.0-0.5: Conservative, factual responses - best for technical documentation
  • 0.5-1.0: Balanced creativity - good for general Q&A
  • 1.0-2.0: More creative and varied responses - use carefully for factual content

Relevance Score Threshold

Results are scored using Information Theoretic Similarity (ITS), providing nuanced relevance measurements:
Label | Score Range | Description
Close Match | score ≥ 0.894 | Near-perfect relevance to the query
Very High Relevance | 0.632 ≤ score < 0.894 | Strongly related content
High Relevance | 0.447 ≤ score < 0.632 | Significantly related content
Good Relevance | 0.316 ≤ score < 0.447 | Moderately related content
Low Relevance | 0.224 ≤ score < 0.316 | Minimally related content
Very Low Relevance | 0.1 ≤ score < 0.224 | Barely related content
Irrelevant | score < 0.1 | No meaningful relation to the query
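
The score bands above can be expressed as a descending lookup, which is useful when labeling results client-side or choosing a threshold value:

```python
# ITS score bands from the table above: (lower bound, label), descending.
ITS_BANDS = [
    (0.894, "Close Match"),
    (0.632, "Very High Relevance"),
    (0.447, "High Relevance"),
    (0.316, "Good Relevance"),
    (0.224, "Low Relevance"),
    (0.1,   "Very Low Relevance"),
]

def its_label(score: float) -> str:
    """Map an ITS relevance score to its label per the table above."""
    for lower_bound, label in ITS_BANDS:
        if score >= lower_bound:
            return label
    return "Irrelevant"
```

For example, a threshold of 0.447 in a request keeps only chunks labeled High Relevance or better.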

Important Notes

  • Search Mode: The namespace must exist and contain indexed data for meaningful results
  • Search Mode: Higher top_k values provide more context but may increase response time
  • Search Mode: The threshold parameter can be used to filter low-relevance results
  • Direct AI Mode: Use empty string "" as namespace for direct AI model calls
  • Field Restrictions: Empty namespace mode only allows basic AI fields, not search-specific fields
  • Field Restrictions: Provided namespace mode allows all fields including search parameters
  • Chat history enables conversational context across multiple queries
  • Custom prompts allow fine-tuning of AI behavior and response format
  • Temperature controls creativity: 0.0 for deterministic, 2.0 for highly creative responses
  • Failed requests still count towards usage limits and are tracked in statistics
  • Some models may have different token limits and capabilities

Use Cases

  • Customer Support: Answer customer questions using your documentation
  • Internal Q&A: Help employees find answers in company knowledge bases
  • Educational Tools: Create AI tutors using educational content
  • Research Assistance: Get insights from research papers and publications
  • Technical Support: Provide technical answers based on documentation
  • Content Creation: Generate content based on existing materials