Upload Text Data

documents.upload

Uploads text documents to a text namespace. Moorcheh will process and embed these asynchronously.

Parameters

namespace_name (str, required): The name of the target text namespace.

documents (List[Dict], required): A list of dictionaries. Each dict requires an id and a text key.

Returns: Dict[str, Any] - A dictionary confirming the documents were queued.

Raises: NamespaceNotFound, InvalidInputError.

Example

Upload Documents Example

from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    documents_to_upload = [
        {
            "id": "faq-1",
            "text": "To reset your password, go to the account settings page.",
            "category": "account"
        },
        {
            "id": "faq-2",
            "text": "Our return policy allows returns within 30 days of purchase.",
            "category": "shipping"
        }
    ]

    status = client.documents.upload(
        namespace_name="my-faq-documents",
        documents=documents_to_upload
    )
    print(status)

Document Structure

Each document in the documents array is a flat object with these properties:

id (required): Unique identifier for the document (string or number)
text (required): The main text content of the document
Additional fields: Any other fields are treated as metadata
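
The structure rules above can be checked locally before calling documents.upload. The following is a minimal sketch, not part of the SDK: validate_documents and split_metadata are hypothetical helper names, and the check assumes that an empty or whitespace-only text counts as missing.

```python
from typing import Any, Dict, List


def validate_documents(documents: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means the batch looks well formed."""
    problems: List[str] = []
    for index, doc in enumerate(documents):
        if not isinstance(doc, dict):
            problems.append(f"item {index}: expected a dict, got {type(doc).__name__}")
            continue
        doc_id = doc.get("id")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(doc_id, bool) or not isinstance(doc_id, (str, int, float)) or doc_id == "":
            problems.append(f"item {index}: 'id' must be a non-empty string or a number")
        text = doc.get("text")
        # Assumption: whitespace-only text is treated the same as missing text.
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {index}: 'text' must be a non-empty string")
    return problems


def split_metadata(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return every field other than id and text, i.e. what the docs call metadata."""
    return {key: value for key, value in doc.items() if key not in ("id", "text")}
```

Running validate_documents on a batch first gives a readable list of problems per item, which is easier to act on than a single InvalidInputError for the whole request.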

Well-Structured Documents

documents = [
    {
        "id": "article-123",
        "text": "Full article content here...",
        # Metadata fields: every key other than "id" and "text" is treated as metadata
        "title": "Introduction to Machine Learning",
        "author": "Dr. Smith",
        "category": "education",
        "publish_date": "2024-01-15",
        "tags": ["ml", "ai", "tutorial"],
        "difficulty": "beginner"
    }
]

Complete Example

Complete Data Management Workflow

from moorcheh_sdk import MoorchehClient
import time

with MoorchehClient() as client:
    # 1. Create a namespace
    client.namespaces.create("my-data", type="text")

    # 2. Upload documents
    docs = [
        {
            "id": "doc-1",
            "text": "This is the first document",
            "category": "tutorial",
            "author": "John Doe"
        },
        {
            "id": "doc-2",
            "text": "This is the second document",
            "category": "guide",
            "author": "Jane Smith"
        }
    ]

    upload_result = client.documents.upload(namespace_name="my-data", documents=docs)
    print(f"Upload status: {upload_result}")

    # 3. Wait for processing (text documents need time to be embedded)
    print("Waiting for document processing...")
    time.sleep(5)

Important Notes

Asynchronous Processing: Text documents are processed asynchronously. Allow a few seconds after upload before searching.
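
Instead of a fixed sleep, a caller can poll until the documents become searchable. The sketch below is a generic helper, not an SDK feature: wait_until is a hypothetical name, and the readiness check is supplied by the caller (for example, a function that runs a search and returns True once results appear).

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool], timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Call is_ready repeatedly until it returns True or the timeout elapses.

    Returns True if is_ready succeeded, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

The timeout and interval defaults are illustrative values, not figures from the Moorcheh documentation.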

ID Uniqueness: Document IDs must be unique within their namespace. Uploading with an existing ID will overwrite the previous entry.
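
Because a repeated ID overwrites the earlier entry, it can be worth checking a batch for accidental duplicates before uploading. This is a small sketch with a hypothetical helper name, find_duplicate_ids. It assumes IDs are compared by their string form, so 1 and "1" count as the same ID; the service may compare them differently.

```python
from collections import Counter
from typing import Any, Dict, List


def find_duplicate_ids(documents: List[Dict[str, Any]]) -> List[str]:
    """Return the IDs (as strings) that occur more than once, in first-seen order."""
    # Assumption: IDs are compared by string form, so 1 and "1" collide.
    counts = Counter(str(doc["id"]) for doc in documents)
    return [doc_id for doc_id, count in counts.items() if count > 1]
```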

Batch Processing: For large datasets, split the upload across multiple requests. Each request accepts at most 100 documents, and batches of 25-50 are recommended (see Document Limits).
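
Splitting a large dataset into batches can be sketched as follows. iter_batches is a hypothetical helper, not an SDK function; its default size of 50 is taken from the recommended range under Document Limits.

```python
from typing import Any, Dict, Iterator, List


def iter_batches(documents: List[Dict[str, Any]], batch_size: int = 50) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of documents, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Usage with the upload call shown in this page (requires a configured client):
# for batch in iter_batches(all_docs):
#     client.documents.upload(namespace_name="my-data", documents=batch)
```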

Best Practices

Keep documents focused on a single topic
Include meaningful titles and metadata
Use consistent metadata schemas across documents
Break large documents into logical chunks
Upload in batches of 25-50 documents for optimal performance
Use meaningful document IDs for easier management
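
One way to break a large document into logical chunks is to split on blank lines and pack whole paragraphs into chunks up to a size limit. The sketch below assumes paragraphs are separated by blank lines; chunk_text, chunks_to_documents, the 2000-character default, and the chunk_index metadata key are all illustrative choices, not SDK features.

```python
from typing import Any, Dict, List


def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Pack blank-line-separated paragraphs into chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for paragraph in paragraphs:
        # A paragraph longer than max_chars is hard-split into fixed-size pieces.
        pieces = [paragraph[i:i + max_chars] for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def chunks_to_documents(base_id: str, text: str, max_chars: int = 2000, **metadata: Any) -> List[Dict[str, Any]]:
    """Build upload-ready dicts, one per chunk, sharing the same metadata."""
    return [
        {**metadata, "id": f"{base_id}-chunk-{i}", "text": chunk, "chunk_index": i}
        for i, chunk in enumerate(chunk_text(text, max_chars))
    ]
```

Deriving each chunk ID from the parent ID keeps IDs meaningful and makes it straightforward to find or delete all chunks of one source document later.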

Document Limits

Text Length: Min 10 characters, Max 50,000 characters per document
Batch Size: Max 100 documents per request, Recommended 25-50
Metadata Size: Max 2KB per document, Up to 50 metadata keys
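
The limits above can be checked on the client side before a request is sent. This is a rough sketch with a hypothetical helper, check_limits. It assumes that 2KB means 2048 bytes and that metadata size is measured as UTF-8 encoded JSON; the service may measure it differently.

```python
import json
from typing import Any, Dict, List

MIN_TEXT_CHARS = 10
MAX_TEXT_CHARS = 50_000
MAX_BATCH_SIZE = 100
MAX_METADATA_BYTES = 2 * 1024  # Assumption: "2KB" is interpreted as 2048 bytes.
MAX_METADATA_KEYS = 50


def check_limits(documents: List[Dict[str, Any]]) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems: List[str] = []
    if len(documents) > MAX_BATCH_SIZE:
        problems.append(f"batch has {len(documents)} documents (max {MAX_BATCH_SIZE})")
    for doc in documents:
        doc_id = doc.get("id")
        text = doc.get("text") or ""
        if not MIN_TEXT_CHARS <= len(text) <= MAX_TEXT_CHARS:
            problems.append(f"{doc_id}: text length {len(text)} is outside {MIN_TEXT_CHARS}-{MAX_TEXT_CHARS}")
        metadata = {k: v for k, v in doc.items() if k not in ("id", "text")}
        if len(metadata) > MAX_METADATA_KEYS:
            problems.append(f"{doc_id}: {len(metadata)} metadata keys (max {MAX_METADATA_KEYS})")
        # Assumption: size is approximated as the UTF-8 length of the JSON encoding.
        size = len(json.dumps(metadata, ensure_ascii=False, default=str).encode("utf-8"))
        if size > MAX_METADATA_BYTES:
            problems.append(f"{doc_id}: metadata is {size} bytes (max {MAX_METADATA_BYTES})")
    return problems
```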

Related Operations

Get Documents - Retrieve uploaded documents
Delete Data - Remove specific documents
Search - Search uploaded documents
