Data Management

documents.upload

Uploads text documents to a text namespace. Moorcheh processes and embeds them asynchronously, so they may not be searchable immediately after the call returns.

Parameters

namespace_name (str, required): The name of the target text namespace.

documents (List[Dict], required): A list of dictionaries. Each dictionary requires an id key and a text key; any additional keys are treated as metadata.

Returns: Dict[str, Any] - A dictionary confirming the documents were queued.

Raises: NamespaceNotFound, InvalidInputError.

Upload Documents Example

documents_to_upload = [
    {"id": "faq-1", "text": "To reset your password, go to the account settings page.", "category": "account"},
    {"id": "faq-2", "text": "Our return policy allows returns within 30 days of purchase.", "category": "shipping"}
]

status = client.documents.upload(
    namespace_name="my-faq-documents",
    documents=documents_to_upload
)
print(status)

vectors.upload

Uploads pre-computed vectors to a vector namespace. This is a synchronous operation.

Parameters

namespace_name (str, required): The name of the target vector namespace.

vectors (List[Dict], required): A list of dictionaries. Each dictionary requires an id key and a vector key, and the vector length must match the namespace dimension.

Returns: Dict[str, Any] - A dictionary confirming the upload status.

Raises: NamespaceNotFound, InvalidInputError.

Upload Vectors Example

vectors_to_upload = [
    {"id": "image_001.jpg", "vector": [0.12, -0.45, ...], "metadata": {"source": "product_database"}},
    {"id": "image_002.jpg", "vector": [-0.22, 0.81, ...], "metadata": {"source": "product_database"}}
]

status = client.vectors.upload(
    namespace_name="my-image-embeddings",
    vectors=vectors_to_upload
)
print(status)

documents.delete

Deletes specific documents from a text namespace by their IDs.

Parameters

namespace_name (str, required): The name of the target text namespace.

ids (List[Union[str, int]], required): A list of document IDs to delete.

Returns: Dict[str, Any] - A dictionary confirming the deletion status.

Delete Documents Example

# Delete specific documents by ID
result = client.documents.delete(
    namespace_name="my-faq-documents",
    ids=["faq-1", "faq-3", "faq-5"]
)
print(f"Deletion result: {result}")

vectors.delete

Deletes specific vectors from a vector namespace by their IDs.

Parameters

namespace_name (str, required): The name of the target vector namespace.

ids (List[Union[str, int]], required): A list of vector IDs to delete.

Returns: Dict[str, Any] - A dictionary confirming the deletion status.

Delete Vectors Example

# Delete specific vectors by ID
result = client.vectors.delete(
    namespace_name="my-image-embeddings",
    ids=["image_001.jpg", "image_002.jpg"]
)
print(f"Deletion result: {result}")

Complete Data Management Example

Complete Data Management Workflow

from moorcheh_sdk import MoorchehClient
import time

with MoorchehClient() as client:
    # 1. Create a namespace
    client.namespaces.create("my-data", type="text")

    # 2. Upload documents
    docs = [
        {
            "id": "doc-1",
            "text": "This is the first document",
            "category": "tutorial",
            "author": "John Doe"
        },
        {
            "id": "doc-2",
            "text": "This is the second document",
            "category": "guide",
            "author": "Jane Smith"
        }
    ]

    upload_result = client.documents.upload(namespace_name="my-data", documents=docs)
    print(f"Upload status: {upload_result}")

    # 3. Wait for processing (text documents need time to be embedded)
    print("Waiting for document processing...")
    time.sleep(5)

    # 4. Delete specific documents if needed
    delete_result = client.documents.delete(namespace_name="my-data", ids=["doc-1"])
    print(f"Delete result: {delete_result}")

Document Structure Best Practices

Required Fields

id: Unique identifier for the document (string or number)
text: The main content to be searched (string)

Optional Metadata

You can include any additional fields as metadata:

Well-Structured Documents

documents = [
    {
        "id": "article-123",
        "text": "Full article content here...",
        # Metadata fields: every key other than "id" and "text" is treated as metadata
        "title": "Introduction to Machine Learning",
        "author": "Dr. Smith",
        "category": "education",
        "publish_date": "2024-01-15",
        "tags": ["ml", "ai", "tutorial"],
        "difficulty": "beginner"
    }
]
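Because each document needs an id and a text key, and a repeated ID overwrites the earlier entry, it can help to check a batch locally before uploading it. The following is a minimal sketch of such a check. The helper name validate_documents is hypothetical and is not part of the Moorcheh SDK; it only inspects plain Python dictionaries.

```python
from typing import Any, Dict, List


def validate_documents(documents: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found; an empty list means the batch looks valid.

    Hypothetical local helper, not an SDK function.
    """
    problems: List[str] = []
    seen_ids = set()
    for index, doc in enumerate(documents):
        # Both required keys must be present.
        for key in ("id", "text"):
            if key not in doc:
                problems.append(f"document {index}: missing required key '{key}'")
        # The text field is documented as a string.
        if "text" in doc and not isinstance(doc["text"], str):
            problems.append(f"document {index}: 'text' must be a string")
        # A repeated ID within the batch would overwrite the earlier entry.
        if "id" in doc:
            if doc["id"] in seen_ids:
                problems.append(f"document {index}: duplicate id {doc['id']!r}")
            seen_ids.add(doc["id"])
    return problems


sample = [
    {"id": "article-123", "text": "Full article content here...", "category": "education"},
    {"id": "article-123", "text": "Same ID again"},
    {"text": "No ID at all"},
]
for problem in validate_documents(sample):
    print(problem)
```

If the returned list is empty, the batch can be passed to client.documents.upload as in the examples above. This check only covers duplicates inside a single batch; it cannot see IDs that already exist in the namespace.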

Vector Data Structure

For vector uploads, ensure your vectors match the namespace dimension:

Vector Structure

vectors = [
    {
        "id": "embedding-1",
        "vector": [0.1, 0.2, 0.3, ...],  # Must match namespace dimension
        # Optional metadata
        "source": "image_database",
        "category": "product",
        "confidence": 0.95
    }
]
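A dimension mismatch is easy to catch before calling vectors.upload. The sketch below assumes you already know the namespace dimension and pass it in yourself; the helper name find_dimension_mismatches and the expected_dimension parameter are hypothetical and not part of the SDK.

```python
from typing import Any, Dict, List


def find_dimension_mismatches(
    vectors: List[Dict[str, Any]], expected_dimension: int
) -> List[Any]:
    """Return the IDs of entries whose vector length differs from expected_dimension.

    Hypothetical local helper, not an SDK function.
    """
    bad_ids = []
    for item in vectors:
        if len(item["vector"]) != expected_dimension:
            bad_ids.append(item["id"])
    return bad_ids


vectors = [
    {"id": "embedding-1", "vector": [0.1, 0.2, 0.3], "source": "image_database"},
    {"id": "embedding-2", "vector": [0.4, 0.5], "source": "image_database"},
]
# The expected dimension of 3 is chosen only for this illustration.
print(find_dimension_mismatches(vectors, expected_dimension=3))
```

Entries reported by the helper can be fixed or dropped before the upload, rather than relying on the service to reject the request.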

Important Notes

Asynchronous Processing: Text documents are processed asynchronously. Allow a few seconds after upload before searching.

ID Uniqueness: Document and vector IDs must be unique within their namespace. Uploading with an existing ID will overwrite the previous entry.

Batch Processing: For large datasets, upload documents in batches of 100-1000 items for optimal performance.
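One way to follow the batching advice is to slice the full list into fixed-size chunks and upload each chunk in turn. This is a minimal sketch: iter_batches is a hypothetical helper, and the batch size of 500 is simply a value inside the 100-1000 range suggested above. The upload call is left as a comment because it needs a configured client.

```python
from typing import Any, Dict, Iterator, List


def iter_batches(
    items: List[Dict[str, Any]], batch_size: int = 500
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of items, each with at most batch_size entries.

    Hypothetical local helper, not an SDK function.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Illustrative data only.
documents = [{"id": f"doc-{i}", "text": f"Document number {i}"} for i in range(1250)]

batch_sizes = []
for batch in iter_batches(documents, batch_size=500):
    # With a real client, each batch would be uploaded here, for example:
    # client.documents.upload(namespace_name="my-data", documents=batch)
    batch_sizes.append(len(batch))

print(batch_sizes)  # [500, 500, 250]
```

The same helper works for vector uploads, since vectors.upload also takes a list of dictionaries.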
