Get Documents - Moorcheh Documentation

documents.get

Retrieves specific documents by their IDs from a namespace. This endpoint allows you to fetch documents that have been previously uploaded and indexed.

To find documents by meaning rather than by ID, use the Search API, which handles semantic search and similarity-based retrieval.

Parameters

namespace_name

str

required

The name of the namespace containing the documents.

ids

List[Union[str, int]]

required

A list of document IDs to retrieve (max 100 IDs per request).

Returns: Dict[str, Any] - A dictionary containing the retrieved documents.

Raises: NamespaceNotFound, InvalidInputError.
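The constraints on ids (a list of strings or integers, at most 100 per request) can be checked locally before a request is sent. The following is a minimal sketch; validate_ids is a hypothetical helper, not part of the SDK, and rejecting an empty list is an assumption rather than documented behavior.

```python
from typing import List, Union

# Limit taken from the parameter description above.
MAX_IDS_PER_REQUEST = 100


def validate_ids(ids: List[Union[str, int]]) -> List[Union[str, int]]:
    """Hypothetical local pre-check; the SDK performs its own validation."""
    # Assumption: an empty list is treated as invalid input.
    if not isinstance(ids, list) or not ids:
        raise ValueError("ids must be a non-empty list")
    if len(ids) > MAX_IDS_PER_REQUEST:
        raise ValueError(
            f"at most {MAX_IDS_PER_REQUEST} IDs per request, got {len(ids)}"
        )
    for doc_id in ids:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(doc_id, bool) or not isinstance(doc_id, (str, int)):
            raise ValueError(f"unsupported ID type: {type(doc_id).__name__}")
    return ids
```

Running this check first turns an oversized or malformed list into a local ValueError instead of a failed API call.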

Example

Get Documents Example

from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    # Get specific documents by ID
    result = client.documents.get(
        namespace_name="my-faq-documents",
        ids=["faq-1", "faq-2", "faq-3"]
    )
    
    for item in result.get('items', []):
        print(f"ID: {item['id']}")
        print(f"Text: {item['text']}")
        print(f"Metadata: {item.get('metadata', {})}")

Response Structure

The response contains:

status (str): "success" or "partial"
message (str): Human-readable message
requested_ids (int): Number of document IDs requested
found_items (int): Number of documents successfully found
items (list): Array of retrieved document objects
not_found_ids (list, optional): IDs that were not found (for partial success)
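Because not_found_ids is optional, code that needs the list of missing documents may want a fallback. The sketch below uses not_found_ids when it is present and otherwise compares the requested IDs with the id of each returned item. The missing_ids helper and the sample dictionary are illustrative; the sample is hand-written from the field list above, not captured from the API.

```python
from typing import Any, Dict, List, Union

DocId = Union[str, int]


def missing_ids(requested: List[DocId], response: Dict[str, Any]) -> List[DocId]:
    """Return the requested IDs that have no matching item in the response."""
    reported = response.get("not_found_ids")
    if reported is not None:
        return list(reported)
    # Fallback: derive the missing IDs from the items that were returned.
    found = {item["id"] for item in response.get("items", [])}
    return [doc_id for doc_id in requested if doc_id not in found]


# Hand-written sample shaped like the field list above (illustrative values).
sample = {
    "status": "partial",
    "message": "illustrative message",
    "requested_ids": 3,
    "found_items": 2,
    "items": [
        {"id": "doc-1", "text": "first", "metadata": {}},
        {"id": "doc-2", "text": "second", "metadata": {}},
    ],
    "not_found_ids": ["doc-3"],
}

print(missing_ids(["doc-1", "doc-2", "doc-3"], sample))
```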

Complete Example

from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    namespace = "my-documents"
    
    # Retrieve multiple documents
    result = client.documents.get(
        namespace_name=namespace,
        ids=["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"]
    )
    
    print(f"Requested: {result.get('requested_ids', 0)}")
    print(f"Found: {result.get('found_items', 0)}")
    
    # Process retrieved documents
    for item in result.get('items', []):
        print(f"\nDocument ID: {item['id']}")
        print(f"Text: {item['text'][:100]}...")  # First 100 chars
        if item.get('metadata'):
            print(f"Metadata: {item['metadata']}")
    
    # Handle partial success
    if result.get('status') == 'partial':
        not_found = result.get('not_found_ids', [])
        if not_found:
            print(f"\nDocuments not found: {not_found}")

Key Features

Batch Retrieval: Retrieve up to 100 documents in a single request
Partial Success: Non-existent document IDs do not cause an error; the response has status "partial" and can list the missing IDs in not_found_ids
Efficient Processing: Uses optimized batch retrieval for performance
Flexible IDs: Document IDs can be strings or numbers
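When more than 100 IDs are needed, the list has to be split across several requests. The sketch below shows one way to do that. Both chunked and get_many are hypothetical helpers, and fetch stands in for any callable that takes a batch of IDs and returns a response dictionary with an items list.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence, Union

DocId = Union[str, int]
BATCH_SIZE = 100  # per-request limit stated above


def chunked(ids: Sequence[DocId], size: int = BATCH_SIZE) -> Iterator[List[DocId]]:
    """Yield consecutive slices of ids, each with at most `size` entries."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def get_many(
    fetch: Callable[[List[DocId]], Dict[str, Any]],
    ids: Sequence[DocId],
) -> List[Dict[str, Any]]:
    """Call `fetch` once per batch and collect the returned items in order."""
    items: List[Dict[str, Any]] = []
    for batch in chunked(ids):
        items.extend(fetch(batch).get("items", []))
    return items
```

With a real client, fetch could be a small wrapper such as lambda batch: client.documents.get(namespace_name="my-documents", ids=batch).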

Best Practices

Use the maximum batch size (100 documents) when possible
Group related document retrievals to minimize API calls
Compare found_items with requested_ids on every response to detect missing documents
Handle partial success responses gracefully
Cache frequently accessed documents client-side
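The caching advice above can be sketched with a small in-memory wrapper. DocumentCache is a hypothetical class, not part of the SDK. It assumes each call asks for no more than 100 uncached IDs, and it has no expiry, so it only suits documents that do not change during the process lifetime.

```python
from typing import Any, Callable, Dict, List, Union

DocId = Union[str, int]


class DocumentCache:
    """Hypothetical in-memory cache in front of a fetch callable."""

    def __init__(self, fetch: Callable[[List[DocId]], Dict[str, Any]]) -> None:
        self._fetch = fetch
        self._store: Dict[DocId, Dict[str, Any]] = {}

    def get(self, ids: List[DocId]) -> List[Dict[str, Any]]:
        # Request only the IDs that are not cached yet, without duplicates.
        missing = [i for i in dict.fromkeys(ids) if i not in self._store]
        if missing:
            for item in self._fetch(missing).get("items", []):
                self._store[item["id"]] = item
        # IDs the service did not return are skipped and are not cached,
        # so they are requested again on the next call.
        return [self._store[i] for i in ids if i in self._store]
```

As with the other sketches, fetch can wrap client.documents.get for a fixed namespace.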

Use Cases

Document Retrieval: Fetch specific documents by ID for display or processing
Content Management: Access and manage previously uploaded documents
Data Export: Extract documents for backup or migration purposes
Quality Assurance: Review uploaded content for accuracy and completeness
Integration: Sync document data with external systems and applications
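For the data export use case, retrieved items can be written to a JSON Lines file, one document per line. This is a minimal sketch using only the standard library; write_jsonl is a hypothetical helper, and it assumes every item is JSON-serializable.

```python
import json
from typing import Any, Dict, Iterable, TextIO


def write_jsonl(items: Iterable[Dict[str, Any]], out: TextIO) -> int:
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    for item in items:
        # ensure_ascii=False keeps non-ASCII text readable in the output file.
        out.write(json.dumps(item, ensure_ascii=False) + "\n")
        count += 1
    return count
```

A typical call would open the target file with encoding="utf-8" and pass result.get('items', []) as items.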

Related Operations

Fetch Text Data - List text and summary chunks in a namespace
Upload Text Data - Add new text documents
Upload Vector Data - Add new vector embeddings
Delete Data - Remove specific documents
Search - Find documents using semantic search
