POST https://api.moorcheh.ai/v1/namespaces/{namespace_name}/documents

Example Request

curl -X POST "https://api.moorcheh.ai/v1/namespaces/my-documents/documents" \
  -H "Content-Type: application/json" \
  -H "x-api-Key: your-api-key-here" \
  -d '{
    "documents": [
      {
        "text": "Machine learning is a subset of artificial intelligence that enables computers to learn and improve from experience without being explicitly programmed.",
        "title": "Introduction to Machine Learning",
        "metadata": {
          "category": "education",
          "difficulty": "beginner",
          "author": "Dr. Smith"
        }
      }
    ]
  }'

Example Response

{
  "status": "success",
  "message": "1 document uploaded successfully to namespace 'my-documents'",
  "upload_id": "upload_1234567890",
  "namespace_name": "my-documents",
  "documents_processed": 1,
  "processing_status": "in_progress",
  "estimated_completion": "2024-01-15T10:35:00Z",
  "uploaded_documents": [
    {
      "id": "doc_001",
      "status": "processing",
      "character_count": 152
    }
  ]
}

Overview

Upload text documents to a text namespace to enable semantic search, similarity matching, and AI-powered question answering. Documents are sent as JSON and processed automatically, with optional metadata enrichment.
During processing, embeddings are generated for each document using Amazon Bedrock.

Authentication

x-api-Key (string, required): Your API key for authentication
Content-Type (string, required): Must be application/json

Path Parameters

namespace_name (string, required): Name of the text namespace to upload documents to

Body Parameters

documents (array, required): Array of document objects to upload

Document Object

documents[].id (string, optional): Unique identifier for the document. Auto-generated if not provided.
documents[].text (string, required): The main text content of the document
documents[].metadata (object, optional): Metadata for filtering and organization
documents[].title (string, optional): Document title
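
The request described above can also be assembled in Python. The following is a minimal sketch using only the standard library; it builds the request without sending it. `BASE_URL` and `build_upload_request` are illustrative names, not part of any official SDK, and the sample `id` value is hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the endpoint shown on this page.
BASE_URL = "https://api.moorcheh.ai/v1"


def build_upload_request(namespace_name, documents, api_key):
    """Build (but do not send) a POST request for the documents endpoint."""
    for index, doc in enumerate(documents):
        # "text" is the only required field of a document object.
        if not doc.get("text"):
            raise ValueError(f"documents[{index}] is missing required field 'text'")
    # Percent-encode the namespace so unusual characters cannot break the path.
    namespace = urllib.parse.quote(namespace_name, safe="")
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/namespaces/{namespace}/documents",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-Key": api_key,
        },
    )


request = build_upload_request(
    "my-documents",
    [
        {
            "id": "ml-intro",  # hypothetical ID; omit it to have one auto-generated
            "text": "Machine learning is a subset of artificial intelligence.",
            "title": "Introduction to Machine Learning",
            "metadata": {"category": "education", "difficulty": "beginner"},
        }
    ],
    api_key="your-api-key-here",
)

# To actually send the request (requires a valid API key and network access):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

Note that `urllib` normalizes the capitalization of header names before sending; HTTP header names are case-insensitive, so this does not change the request's meaning.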

Response Fields

Success Response (202)

status (string): Status of the upload (“success” for successful uploads)
message (string): Human-readable confirmation message
upload_id (string): Unique identifier for tracking this upload batch
namespace_name (string): Name of the namespace where documents were uploaded
documents_processed (number): Number of documents successfully processed
processing_status (string): Current status: “in_progress”, “completed”, or “failed”
estimated_completion (string): Estimated ISO 8601 timestamp when processing will complete
uploaded_documents (array): Array of uploaded document status objects

Document Status Object

id (string): Document identifier
status (string): Processing status: “processing”, “completed”, or “failed”
character_count (number): Number of characters in the document text
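
As an illustration of how these fields might be consumed, the sketch below summarizes a parsed response body. `summarize_upload` is a hypothetical helper, and the sample values are invented to exercise every documented per-document status; only the field names and status strings come from the tables above.

```python
from collections import Counter


def summarize_upload(response):
    """Condense a parsed upload response into the fields a caller usually checks."""
    docs = response.get("uploaded_documents", [])
    return {
        "upload_id": response.get("upload_id"),
        # The batch is finished once it is no longer "in_progress".
        "finished": response.get("processing_status") in ("completed", "failed"),
        # Count documents per status: "processing", "completed", or "failed".
        "counts": dict(Counter(doc["status"] for doc in docs)),
        "failed_ids": [doc["id"] for doc in docs if doc["status"] == "failed"],
    }


# Hypothetical response body in the documented shape.
sample = {
    "status": "success",
    "upload_id": "upload_1234567890",
    "processing_status": "in_progress",
    "uploaded_documents": [
        {"id": "doc_001", "status": "completed", "character_count": 89},
        {"id": "doc_002", "status": "processing", "character_count": 112},
        {"id": "doc_003", "status": "failed", "character_count": 40},
    ],
}

summary = summarize_upload(sample)
print(summary["counts"], summary["failed_ids"])
```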

Processing Pipeline

1. Upload Validation: Documents are validated for format, size, and content requirements
2. Text Processing: Content is cleaned, normalized, and prepared for embedding generation
3. Embedding Generation: High-quality embeddings are generated using Amazon Bedrock
4. Index Updates: Documents are added to the search index for immediate availability
5. Metadata Enrichment: Optional metadata processing and enrichment

Document Limits

Text Length
Min: 10 characters
Max: 50,000 characters per document

Batch Size
Max: 100 documents per request
Recommended: 25-50 documents for optimal performance

Metadata Size
Max: 2 KB per document
Keys: Up to 50 metadata keys

Processing Time
Typical: 1-5 seconds per document
Large batches: 30-120 seconds
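
Checking these limits on the client before uploading can avoid a rejected request. The sketch below is one possible pre-flight check, not the server's actual validation logic. It assumes that "2 KB" means 2,048 bytes of JSON-serialized metadata and that the 50-key limit counts top-level keys; `validate_batch` and the constant names are hypothetical.

```python
import json

MIN_TEXT_CHARS = 10
MAX_TEXT_CHARS = 50_000
MAX_BATCH_SIZE = 100
MAX_METADATA_BYTES = 2 * 1024  # assumption: 2 KB = 2,048 bytes of serialized JSON
MAX_METADATA_KEYS = 50         # assumption: counts top-level keys only


def validate_batch(documents):
    """Return a list of problems; an empty list means the batch is within limits."""
    problems = []
    if len(documents) > MAX_BATCH_SIZE:
        problems.append(
            f"batch has {len(documents)} documents; the limit is {MAX_BATCH_SIZE}"
        )
    for index, doc in enumerate(documents):
        text = doc.get("text", "")
        if not MIN_TEXT_CHARS <= len(text) <= MAX_TEXT_CHARS:
            problems.append(
                f"documents[{index}].text has {len(text)} characters; "
                f"allowed range is {MIN_TEXT_CHARS}-{MAX_TEXT_CHARS}"
            )
        metadata = doc.get("metadata") or {}
        if len(metadata) > MAX_METADATA_KEYS:
            problems.append(
                f"documents[{index}].metadata has {len(metadata)} keys; "
                f"the limit is {MAX_METADATA_KEYS}"
            )
        metadata_bytes = len(json.dumps(metadata).encode("utf-8"))
        if metadata_bytes > MAX_METADATA_BYTES:
            problems.append(
                f"documents[{index}].metadata is {metadata_bytes} bytes; "
                f"the limit is {MAX_METADATA_BYTES}"
            )
    return problems


for problem in validate_batch([{"text": "too short"}]):
    print(problem)
```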

Best Practices

  • Keep documents focused on a single topic
  • Include meaningful titles and metadata
  • Use consistent metadata schemas across documents
  • Break large documents into logical chunks
  • Use consistent key naming conventions
  • Include searchable categories and tags
  • Add temporal metadata (created_at, updated_at)
  • Consider user access levels in metadata
  • Upload in batches of 25-50 documents
  • Use meaningful document IDs for easier management
  • Monitor processing status for large uploads
  • Implement retry logic for failed uploads
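
The batching and retry recommendations above could be combined roughly as follows. This is a sketch under stated assumptions: the page does not specify a retry policy, so the three attempts, the doubling delay, and the choice to retry only on `OSError` (which covers `urllib` network and HTTP errors) are illustrative. `batches`, `upload_with_retry`, and the `send` callable are hypothetical names; `send` stands in for whatever function performs the actual POST.

```python
import time


def batches(documents, size=50):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def upload_with_retry(send, batch, attempts=3, base_delay=1.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call send(batch), retrying with exponential backoff on selected errors."""
    for attempt in range(attempts):
        try:
            return send(batch)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


def upload_all(send, documents, size=50):
    """Upload every batch in order and collect the responses."""
    return [upload_with_retry(send, batch) for batch in batches(documents, size)]
```

Passing `send` and `sleep` as parameters keeps the helpers independent of any HTTP client and makes them easy to exercise without network access. A production client would likely also distinguish retryable failures (timeouts, 5xx responses) from client errors that a retry cannot fix.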

Use Cases

  • Knowledge Base: Build searchable documentation and knowledge repositories
  • Content Management: Store and organize articles, blog posts, and content
  • Customer Support: Upload support documents for AI-powered assistance
  • Research: Organize and search through research papers and publications
  • Legal Documents: Store and search legal documents with metadata filtering
  • Training Materials: Upload educational content for learning applications