Generate AI Answer
Generate AI-powered answers to questions using your uploaded data as context or direct AI model calls.
POST
Overview
Generate AI-powered answers to questions with two modes: Search Mode (with namespace) and Direct AI Mode (empty namespace). The API supports context-aware generation using your data or direct AI model calls.
Supports multiple AI models including Claude Sonnet 4.6, Claude Opus 4.6, Llama 4 Maverick, Amazon Nova Pro, DeepSeek, Qwen, and others. Use empty string "" as namespace for direct AI calls.
Authentication
Your API key for authentication
Must be application/json
Body Parameters
The user’s question or query to be answered
Namespace name for Search Mode, or empty string "" for Direct AI Mode
Number of top relevant chunks for your query across given namespace. Default is 10.
Minimum relevance score threshold (0-1) to filter out chunks below this relevance level. Required when kiosk_mode is true.
AI creativity level (0.0-2.0, default: 0.7). Higher = more creative
Search type: “text” (default)
AI model ID (see Available Models table below)
Enable kiosk mode to filter chunks below certain relevance. When kiosk mode is on, threshold is required.
Previous conversation turns for context (default: [])
Custom instruction for AI behavior
Custom instruction to append (default: “Provide a clear and concise answer.”)
When set with enabled: true, the API returns a JSON object in structured_data matching a schema (default or custom). Optional: schema, tool_name, tool_description. Schemas may use snake_case property names; the service normalizes them for Bedrock. See Structured Output below.
Use snake_case fields in requests and responses only. Legacy camelCase aliases were removed in platform version 1.5.10 (May 2026). If you still send camelCase field names, update your integration to snake_case.
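The parameter rules above (threshold between 0 and 1, threshold required when kiosk_mode is true, temperature between 0.0 and 2.0) can be checked on the client before a request is sent. The following is a minimal sketch for Search Mode, not the service's own validation. The keys namespace, top_k, threshold and kiosk_mode appear on this page; the keys "query" and "temperature" and the function name build_search_body are assumptions made for illustration.

```python
def build_search_body(question, namespace, *, top_k=10, threshold=None,
                      kiosk_mode=False, temperature=0.7):
    """Check the documented ranges and return a request body as a dict.

    "query" and "temperature" are assumed key names; the others are
    named in the parameter descriptions.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if kiosk_mode and threshold is None:
        raise ValueError("threshold is required when kiosk_mode is true")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")

    body = {
        "query": question,           # assumed key name
        "namespace": namespace,
        "top_k": top_k,              # documented default is 10
        "kiosk_mode": kiosk_mode,
        "temperature": temperature,  # assumed key name, documented default 0.7
    }
    if threshold is not None:
        body["threshold"] = threshold
    return body
```

Calling build_search_body("What is the refund policy?", "docs", kiosk_mode=True) raises ValueError because no threshold was given.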
Available Models
| Model ID | Name | Provider | Description | Credits |
|---|---|---|---|---|
| anthropic.claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | Fast flagship: coding, tools, long docs and RAG (~1M context). | 3 |
| anthropic.claude-opus-4-6-v1 | Claude Opus 4.6 | Anthropic | Deepest reasoning and hardest tasks; pick when quality matters most (~1M context). | 3 |
| meta.llama4-maverick-17b-instruct-v1:0 | Llama 4 Maverick 17B | Meta | Long context, summarization, function calling, fine-tuning friendly. | 3 |
| amazon.nova-pro-v1:0 | Amazon Nova Pro | Amazon | Chat, math, and structured answers for AWS-style workloads. | 2 |
| deepseek.r1-v1:0 | DeepSeek R1 | DeepSeek | Step-by-step reasoning; math, logic, and technical explanations. | 1 |
| deepseek.v3.2 | DeepSeek V3.2 | DeepSeek | Efficient general Q&A, multilingual, everyday RAG (~164K context). | 2 |
| openai.gpt-oss-120b-1:0 | OpenAI GPT OSS 120B | OpenAI | Large generalist: research-style answers and long-form writing. | 3 |
| qwen.qwen3-32b-v1:0 | Qwen 3 32B | Qwen | Code and bilingual (EN/ZH) tasks in a smaller footprint. | 2 |
| qwen.qwen3-next-80b-a3b | Qwen3 Next 80B A3B | Qwen | MoE model for long chats, docs, and code at scale (~256K context). | 1 |
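To catch a mistyped model ID early, the table can be mirrored as a lookup on the client. This is a sketch only: the IDs and credit values are copied from the table above, the helper names are hypothetical, and because the table does not state what a credit is charged against, the values are treated as opaque numbers.

```python
# Model ID -> credits, copied from the Available Models table.
MODEL_CREDITS = {
    "anthropic.claude-sonnet-4-6": 3,
    "anthropic.claude-opus-4-6-v1": 3,
    "meta.llama4-maverick-17b-instruct-v1:0": 3,
    "amazon.nova-pro-v1:0": 2,
    "deepseek.r1-v1:0": 1,
    "deepseek.v3.2": 2,
    "openai.gpt-oss-120b-1:0": 3,
    "qwen.qwen3-32b-v1:0": 2,
    "qwen.qwen3-next-80b-a3b": 1,
}


def credits_for(model_id):
    """Return the listed credit value, or raise for an unlisted model ID."""
    if model_id not in MODEL_CREDITS:
        raise ValueError(f"unknown model ID: {model_id!r}")
    return MODEL_CREDITS[model_id]


def models_within(max_credits):
    """Return the model IDs whose listed credit value is at most max_credits."""
    return sorted(m for m, c in MODEL_CREDITS.items() if c <= max_credits)
```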
Field Restrictions
Response Fields
The AI-generated answer based on the provided context and query
The ID of the AI model used to generate the answer
Number of context chunks retrieved and used for generating the answer
The original query that was submitted
When structured output is enabled: whether retrieved context was used (RAG vs Direct AI). Omitted when structured_response is not used.
When structured_response.enabled is true: the JSON object matching your schema (or the default). Contains e.g. answer, confidence, sources, summary, topics, follow_up_questions. Omitted otherwise. Property names follow your schema; the default schema uses snake_case (e.g. follow_up_questions).
Structured Output
Include structured_response: { enabled: true } in the request to get a type-safe JSON object in structured_data instead of (or in addition to) the plain answer string. Works in both Search Mode and Direct AI Mode. Custom schema objects may use snake_case keys; they are normalized for Bedrock (camelCase) internally.
Enabling
When you send only enabled: true, the default schema is used.
structured_response fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | Yes | - | Must be true to enable structured output. |
| schema | object | No | (default schema) | JSON Schema for the structured object. Omit or null to use the default. Prefer snake_case property names under properties; they are converted for Bedrock. |
| tool_name | string | No | "structured_response" | Internal tool name for structured output. |
tool_description | string | No | "Generate a structured response based on the user's query and provided context" | Description for the model. |
Default schema
If schema is omitted or null, the default schema is used:
- Required: answer, confidence
- Optional: sources, summary, topics, follow_up_questions
Custom schema
Pass your own JSON Schema in structured_response.schema. Use type, properties, required, items, enum, minimum, maximum, maxLength, etc. Keep required minimal so the model can fill all fields.
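As an illustration of the custom schema guidance, the sketch below builds a structured_response object with a small JSON Schema that uses snake_case property names and a minimal required list. The schema contents (sentiment, key_points) and the helper undeclared_required are made up for this example; only enabled, schema and tool_name come from the field table.

```python
import json

# Hypothetical custom schema: snake_case property names, minimal "required".
review_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string", "maxLength": 500},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "key_points": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["answer"],
}

structured_response = {
    "enabled": True,                      # must be true to enable structured output
    "schema": review_schema,              # omit or set to None for the default schema
    "tool_name": "structured_response",   # documented default value
}


def undeclared_required(schema):
    """Return names listed in "required" that are missing from "properties"."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


# Serialize the way it would appear inside a JSON request body.
payload_fragment = json.dumps({"structured_response": structured_response})
```

Running undeclared_required(review_schema) before sending is a cheap local check that every required name is declared under properties.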
Structured output errors
| Situation | HTTP | Notes |
|---|---|---|
| enabled not true | - | Normal text response; no structured_data. |
| Invalid structured_response or disallowed body field | 400 | Rejected by validation. |
| Model does not return structured data | 500 | "Error generating structured AI response: Model did not return structured data via tool use". |
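The three situations in the table can be handled in one place on the client. This is a rough sketch under assumptions: the function name is hypothetical, the response body is taken to be an already-decoded JSON dict, and the shape of error bodies is not documented here, so the code relies on the HTTP status alone.

```python
def read_structured_result(status_code, body):
    """Return structured_data, or None when the response is plain text.

    status_code is the HTTP status; body is the decoded JSON response.
    """
    if status_code == 400:
        # Invalid structured_response or a disallowed body field.
        raise ValueError("request rejected by validation")
    if status_code >= 500:
        # Includes the case where the model returned no structured data.
        raise RuntimeError("server error while generating the structured response")
    # structured_data is omitted when structured output was not enabled.
    return body.get("structured_data")
```

A None result means the caller should fall back to the plain answer string.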
API Modes
Search Mode (with namespace)
When you provide a namespace, the API searches your data for relevant context and uses it to generate contextual answers.
Direct AI Mode (empty namespace)
When you pass an empty string "" as namespace, the API makes a direct call to the AI model without searching your data.
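Because the mode is selected purely by the namespace value, and the notes on this page say Direct AI Mode does not allow search-specific fields, a client can derive the mode and drop those fields before sending. The sketch below is one way to do that. Treating top_k, threshold and kiosk_mode as the search-specific fields is an assumption (the page does not list them exhaustively), and "query" is an assumed key name.

```python
# Assumed set of search-specific fields; the page does not enumerate them.
SEARCH_ONLY_FIELDS = {"top_k", "threshold", "kiosk_mode"}


def mode_for(namespace):
    """Return "direct" for an empty-string namespace, otherwise "search"."""
    return "direct" if namespace == "" else "search"


def strip_search_fields(body):
    """Return a copy of body without search-only fields in Direct AI Mode."""
    if mode_for(body["namespace"]) == "direct":
        return {k: v for k, v in body.items() if k not in SEARCH_ONLY_FIELDS}
    return dict(body)
```

For example, strip_search_fields({"query": "Hello", "namespace": "", "top_k": 5}) keeps only the query and namespace keys, while a body with a non-empty namespace is returned unchanged.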
Temperature Guide
- 0.0-0.5: Conservative, factual responses - best for technical documentation
- 0.5-1.0: Balanced creativity - good for general Q&A
- 1.0-2.0: More creative and varied responses - use carefully for factual content
Relevance Score Threshold
Results are scored using Information Theoretic Similarity (ITS), providing nuanced relevance measurements:
| Label | Score Range | Description |
|---|---|---|
| Close Match | score ≥ 0.894 | Near-perfect relevance to the query |
| Very High Relevance | 0.632 ≤ score < 0.894 | Strongly related content |
| High Relevance | 0.447 ≤ score < 0.632 | Significantly related content |
| Good Relevance | 0.316 ≤ score < 0.447 | Moderately related content |
| Low Relevance | 0.224 ≤ score < 0.316 | Minimally related content |
| Very Low Relevance | 0.1 ≤ score < 0.224 | Barely related content |
| Irrelevant | score < 0.1 | No meaningful relation to the query |
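When choosing a threshold value, it can help to translate scores back into the labels from the table. The helper below is a small sketch that encodes the lower bound of each band exactly as listed; the function and constant names are hypothetical and not part of the API.

```python
# (inclusive lower bound, label), ordered from highest band to lowest.
RELEVANCE_BANDS = [
    (0.894, "Close Match"),
    (0.632, "Very High Relevance"),
    (0.447, "High Relevance"),
    (0.316, "Good Relevance"),
    (0.224, "Low Relevance"),
    (0.1, "Very Low Relevance"),
]


def relevance_label(score):
    """Map a relevance score to its label from the table."""
    for lower_bound, label in RELEVANCE_BANDS:
        if score >= lower_bound:
            return label
    return "Irrelevant"
```

Under this mapping, a threshold of 0.447 keeps only chunks labeled High Relevance or better.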
Important Notes
- Search Mode: The namespace must exist and contain indexed data for meaningful results
- Search Mode: Higher top_k values provide more context but may increase response time
- Search Mode: The threshold parameter can be used to filter low-relevance results
- Direct AI Mode: Use empty string "" as namespace for direct AI model calls
- Field Restrictions: Empty namespace mode only allows basic AI fields, not search-specific fields
- Field Restrictions: Provided namespace mode allows all fields including search parameters
- Chat history enables conversational context across multiple queries
- Custom prompts allow fine-tuning of AI behavior and response format
- Temperature controls creativity: 0.0 for deterministic, 2.0 for highly creative responses
- Failed requests still count towards usage limits and are tracked in statistics
- Some models may have different token limits and capabilities
Use Cases
- Customer Support: Answer customer questions using your documentation
- Internal Q&A: Help employees find answers in company knowledge bases
- Educational Tools: Create AI tutors using educational content
- Research Assistance: Get insights from research papers and publications
- Technical Support: Provide technical answers based on documentation
- Content Creation: Generate content based on existing materials
Related Endpoints
- Search - Search for relevant context documents
- Upload Text Data - Add documents for AI context
- List Namespaces - View available namespaces for generation