Upload Text Data

Overview

Upload text documents to a text namespace, enabling semantic search, similarity matching, and AI-powered question answering. The API supports various formats and automatic text processing with metadata enrichment.

Documents are automatically processed to generate embeddings using Amazon Bedrock for optimal search performance.

Authentication

x-api-key (string, required): Your API key for authentication

Content-Type (string, required): Must be application/json

Path Parameters

namespace_name (string, required): Name of the text namespace to upload documents to

Body Parameters

documents (array, required): Array of document objects. Each is a flat object with id, text, and optional metadata fields as direct properties.

Document Object Properties

Each object in the documents array is a flat object with these properties:

documents[].id (string, required): Unique identifier for the document. Must be a non-empty string or number. This is a direct property of the document object, not a nested object.

documents[].text (string, required): The main text content of the document. This is a direct property of the document object, not a nested object.

documents[].* (any): Optional metadata fields for filtering and organization. Any additional fields beyond id and text are treated as metadata.

Metadata:

All key-value pairs other than id and text are treated as metadata
Metadata is optional but recommended
Add any metadata fields you need as key-value pairs for filtering and organization
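The flat-object rule above can be illustrated with a short Python sketch. The helper names make_document and split_metadata are hypothetical and not part of the API; the only facts taken from this page are that id and text are required and that every other top-level key is treated as metadata.

```python
import json

RESERVED_KEYS = {"id", "text"}  # the only non-metadata fields named on this page


def make_document(doc_id, text, **metadata):
    """Build one flat document object: id, text, and metadata side by side."""
    if doc_id is None or doc_id == "":
        raise ValueError("id must be non-empty")
    # Metadata keys sit at the top level; there is no nested "metadata" object.
    return {"id": doc_id, "text": text, **metadata}


def split_metadata(document):
    """Return only the keys that would be treated as metadata."""
    return {k: v for k, v in document.items() if k not in RESERVED_KEYS}


doc = make_document(
    "doc_001",
    "Machine learning is a subset of artificial intelligence.",
    title="Introduction to Machine Learning",
    category="education",
)
payload = json.dumps({"documents": [doc]})  # request body shape used on this page
print(split_metadata(doc))
```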

curl -X POST "https://api.moorcheh.ai/v1/namespaces/demo-namespace/documents" \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key-here" \
  -d '{
    "documents": [
      {
        "id": "doc_001",
        "text": "Machine learning is a subset of artificial intelligence that enables computers to learn and improve from experience without being explicitly programmed.",
        "title": "Introduction to Machine Learning",
        "category": "education",
        "difficulty": "beginner",
        "author": "Dr. Smith"
      }
    ]
  }'

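Example response (this sample reflects a two-document upload to a namespace named technical-docs, not the single-document request above):

{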
  "status": "success",
  "message": "2 documents uploaded successfully to namespace 'technical-docs'",
  "upload_id": "upload_1234567890",
  "namespace_name": "technical-docs",
  "documents_processed": 2,
  "processing_status": "in_progress",
  "estimated_completion": "2024-01-15T10:35:00Z",
  "uploaded_documents": [
    {
      "id": "doc_001",
      "status": "processing",
      "character_count": 89
    },
    {
      "id": "doc_002",
      "status": "processing",
      "character_count": 112
    }
  ]
}
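The curl request above could be expressed in Python roughly as follows. This is a minimal sketch using only the standard library; the URL pattern and header names are copied from the curl example, while the function names build_upload_request and send are illustrative assumptions. The request is built but not sent here, so the snippet runs without network access.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.moorcheh.ai/v1"  # taken from the curl example above


def build_upload_request(namespace_name, documents, api_key):
    """Build (but do not send) the POST request shown in the curl example."""
    url = f"{BASE_URL}/namespaces/{quote(namespace_name, safe='')}/documents"
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


def send(request, timeout=30):
    """Send the request and return (HTTP status, decoded JSON body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


request = build_upload_request(
    "demo-namespace",
    [{"id": "doc_001",
      "text": "Machine learning is a subset of artificial intelligence.",
      "category": "education"}],
    api_key="your-api-key-here",
)
print(request.get_method(), request.full_url)
```

Calling send(request) would perform the upload; according to the Response Fields section, a successful call returns status 202.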

Response Fields

Success Response (202)

status (string): Status of the upload (“success” for successful uploads)

message (string): Human-readable confirmation message

upload_id (string): Unique identifier for tracking this upload batch

namespace_name (string): Name of the namespace where documents were uploaded

documents_processed (number): Number of documents successfully processed

processing_status (string): Current status: “in_progress”, “completed”, or “failed”

estimated_completion (string): ISO 8601 timestamp of the estimated processing completion time

uploaded_documents (array): Array of uploaded document status objects

Document Status Object

id (string): Document identifier

status (string): Processing status: “processing”, “completed”, or “failed”

character_count (number): Number of characters in the document text
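As an illustration of how these fields might be consumed, the sketch below summarizes a response by per-document status. The sample data is a trimmed copy of the example response on this page; summarize_upload is a hypothetical helper, not part of the API.

```python
from collections import Counter

# Trimmed copy of the example response shown earlier on this page.
sample_response = {
    "status": "success",
    "upload_id": "upload_1234567890",
    "namespace_name": "technical-docs",
    "documents_processed": 2,
    "processing_status": "in_progress",
    "uploaded_documents": [
        {"id": "doc_001", "status": "processing", "character_count": 89},
        {"id": "doc_002", "status": "processing", "character_count": 112},
    ],
}


def summarize_upload(response):
    """Collect the batch status and count documents by processing status."""
    documents = response.get("uploaded_documents", [])
    return {
        "upload_id": response.get("upload_id"),
        "processing_status": response.get("processing_status"),
        "by_status": dict(Counter(d["status"] for d in documents)),
        "failed_ids": [d["id"] for d in documents if d["status"] == "failed"],
    }


print(summarize_upload(sample_response))
```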

Processing Pipeline

Upload Validation: Documents are validated for format, size, and content requirements
Text Processing: Content is cleaned, normalized, and prepared for embedding generation
Embedding Generation: High-quality embeddings are generated using Amazon Bedrock
Index Updates: Documents are added to the search index for immediate availability
Metadata Enrichment: Optional metadata processing and enrichment

Document Limits

Text Length

Min: 10 characters
Max: 50,000 characters per document

Batch Size

Max: 100 documents per request
Recommended: 25-50 documents for optimal performance

Metadata Size

Max: 2KB per document
Keys: Up to 50 metadata keys

Processing Time

Typical: 1-5 seconds per document
Large batches: 30-120 seconds
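The limits above can be checked on the client before uploading. The following is a sketch under stated assumptions: it treats 2KB as 2048 bytes and measures metadata size on its UTF-8 JSON encoding, neither of which this page specifies, and the function names are hypothetical.

```python
import json

MIN_TEXT_CHARS = 10
MAX_TEXT_CHARS = 50_000
MAX_DOCS_PER_REQUEST = 100
MAX_METADATA_KEYS = 50
MAX_METADATA_BYTES = 2 * 1024  # assumption: "2KB" means 2048 bytes


def validate_document(document):
    """Return a list of problems found in one document (empty if none)."""
    problems = []
    if document.get("id") in (None, ""):
        problems.append("id is missing or empty")
    text = document.get("text")
    if not isinstance(text, str):
        problems.append("text is missing or not a string")
    elif not MIN_TEXT_CHARS <= len(text) <= MAX_TEXT_CHARS:
        problems.append(f"text length {len(text)} is outside "
                        f"{MIN_TEXT_CHARS}-{MAX_TEXT_CHARS} characters")
    metadata = {k: v for k, v in document.items() if k not in ("id", "text")}
    if len(metadata) > MAX_METADATA_KEYS:
        problems.append(f"{len(metadata)} metadata keys exceeds {MAX_METADATA_KEYS}")
    # Assumption: size is measured on the UTF-8 JSON encoding of the metadata.
    size = len(json.dumps(metadata, default=str).encode("utf-8"))
    if size > MAX_METADATA_BYTES:
        problems.append(f"metadata is {size} bytes, over {MAX_METADATA_BYTES}")
    return problems


def validate_batch(documents):
    """Map 'batch' or 'documents[i]' to the problems found there."""
    errors = {}
    if len(documents) > MAX_DOCS_PER_REQUEST:
        errors["batch"] = [f"{len(documents)} documents exceeds {MAX_DOCS_PER_REQUEST}"]
    for index, document in enumerate(documents):
        problems = validate_document(document)
        if problems:
            errors[f"documents[{index}]"] = problems
    return errors


print(validate_batch([{"id": "doc_001", "text": "too short"}]))
```

The server remains the authority on what is accepted; a check like this only avoids round trips for obviously invalid batches.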

Best Practices

Optimal Document Structure

Keep documents focused on a single topic
Include meaningful titles and metadata
Use consistent metadata schemas across documents
Break large documents into logical chunks
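One way to break a large document into chunks is to pack whole paragraphs up to the per-document character limit. This is a minimal sketch, not a recommended algorithm from this API: splitting on blank lines, the _chunk_N id suffix, and the chunk_index metadata key are all illustrative assumptions.

```python
def chunk_text(text, max_chars=50_000):
    """Pack paragraphs (split on blank lines) into chunks of at most max_chars."""
    chunks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Hard-split any single paragraph that is itself longer than the limit.
        pieces = [paragraph[i:i + max_chars]
                  for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def chunk_document(document, max_chars=50_000):
    """Split one document into several, copying its metadata onto each chunk."""
    chunks = chunk_text(document["text"], max_chars)
    if len(chunks) <= 1:
        return [document]
    metadata = {k: v for k, v in document.items() if k not in ("id", "text")}
    return [
        # Hypothetical id scheme; any unique ids would do.
        {"id": f"{document['id']}_chunk_{n}", "text": chunk, **metadata, "chunk_index": n}
        for n, chunk in enumerate(chunks)
    ]


parts = chunk_document({"id": "guide", "text": "aaaa\n\nbbbb\n\ncccc", "category": "docs"}, max_chars=10)
print([p["id"] for p in parts])
```

This sketch does not merge trailing chunks that fall under the 10-character minimum, so very short final chunks would still need handling.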

Metadata Strategy

Use consistent key naming conventions
Include searchable categories and tags
Add temporal metadata (created_at, updated_at)
Consider user access levels in metadata

Performance Optimization

Upload in batches of 25-50 documents
Use meaningful document IDs for easier management
Monitor processing status for large uploads
Implement retry logic for failed uploads
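The batching and retry advice above might look like the following sketch. The send_batch argument is a placeholder for whatever function performs the HTTP upload; the exponential backoff schedule and the choice to retry only on OSError (which covers connection errors and urllib's URLError) are assumptions, not documented API behavior.

```python
import time


def batched(documents, batch_size=25):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


def upload_with_retry(send_batch, batch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send_batch(batch), retrying on OSError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send_batch(batch)
        except OSError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


def upload_all(send_batch, documents, batch_size=25, **retry_options):
    """Upload every batch in order and collect the per-batch results."""
    return [
        upload_with_retry(send_batch, batch, **retry_options)
        for batch in batched(documents, batch_size)
    ]


def fake_send(batch):
    """Stand-in for a real upload call; returns the batch size."""
    return len(batch)


documents = [{"id": f"doc_{i:03d}", "text": "x" * 20} for i in range(60)]
print(upload_all(fake_send, documents))
```

Passing sleep as a parameter keeps the helper testable without real delays; in production the default time.sleep applies.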

Use Cases

Knowledge Base: Build searchable documentation and knowledge repositories
Content Management: Store and organize articles, blog posts, and content
Customer Support: Upload support documents for AI-powered assistance
Research: Organize and search through research papers and publications
Legal Documents: Store and search legal documents with metadata filtering
Training Materials: Upload educational content for learning applications

Related Endpoints

Search - Search uploaded text documents
Get Documents - Retrieve document information
Delete Data - Remove uploaded documents
Create Namespace - Create text namespaces for uploads
