
vectors.upload

Uploads pre-computed vectors to a vector namespace. This is a synchronous operation.

Parameters

namespace_name (str, required): The name of the target vector namespace.
vectors (List[Dict], required): A list of dictionaries. Each dictionary requires an "id" key and a "vector" key; any other keys are treated as optional metadata.
Returns: Dict[str, Any], a dictionary confirming the upload status.
Raises: NamespaceNotFound, InvalidInputError.
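
Because every dictionary must carry both an id and a vector key, a local check before calling upload can catch malformed records early. The sketch below is one possible way to do that. find_invalid_records is a hypothetical helper, not part of moorcheh_sdk, and the service may enforce further rules of its own.

```python
from typing import Any, Dict, List


def find_invalid_records(vectors: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means the batch looks OK.

    Hypothetical helper for illustration only, not an SDK function.
    """
    problems = []
    for index, record in enumerate(vectors):
        if "id" not in record:
            problems.append(f"record {index}: missing 'id'")
        if "vector" not in record:
            problems.append(f"record {index}: missing 'vector'")
        elif not isinstance(record["vector"], list) or not record["vector"]:
            problems.append(f"record {index}: 'vector' must be a non-empty list")
    return problems
```

If the returned list is empty, the records can be passed on as the vectors argument of client.vectors.upload.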

Example

Upload Vectors Example
from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    vectors_to_upload = [
        {
            "id": "image_001.jpg",
            "vector": [0.12, -0.45, 0.23, 0.67, ...],  # Must match namespace dimension
            "source": "product_database",
            "category": "electronics"
        },
        {
            "id": "image_002.jpg",
            "vector": [-0.22, 0.81, -0.34, 0.12, ...],
            "source": "product_database",
            "category": "electronics"
        }
    ]

    status = client.vectors.upload(
        namespace_name="my-image-embeddings",
        vectors=vectors_to_upload
    )
    print(status)

Vector Data Structure

For vector uploads, ensure your vectors match the namespace dimension:
Vector Structure
vectors = [
    {
        "id": "embedding-1",
        "vector": [0.1, 0.2, 0.3, ...],  # Must match namespace dimension
        # Optional metadata
        "source": "image_database",
        "category": "product",
        "confidence": 0.95
    }
]
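
When IDs and embeddings live in separate lists, the structure above can be assembled in one step. This is a minimal sketch under that assumption. build_records and its parameters are hypothetical names, and the shared metadata is placed at the top level of each dictionary, as in the structure shown above.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_records(
    ids: Sequence[str],
    embeddings: Sequence[Sequence[float]],
    metadata: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Pair each ID with its embedding and attach shared metadata keys.

    Hypothetical helper for illustration only, not an SDK function.
    """
    if len(ids) != len(embeddings):
        raise ValueError("ids and embeddings must have the same length")
    shared = metadata or {}
    # Metadata goes first so that "id" and "vector" can never be overwritten by it.
    return [
        {**shared, "id": record_id, "vector": list(vector)}
        for record_id, vector in zip(ids, embeddings)
    ]
```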

Complete Example

Complete Vector Upload Example
from moorcheh_sdk import MoorchehClient
import numpy as np

with MoorchehClient() as client:
    # Create a vector namespace with dimension 768
    client.namespaces.create(
        namespace_name="product-embeddings",
        type="vector",
        vector_dimension=768
    )
    
    # Generate or load your vectors (example with random vectors)
    vectors = []
    for i in range(10):
        # Your actual embedding generation code here
        vector = np.random.rand(768).tolist()  # Example: 768-dimensional vector
        vectors.append({
            "id": f"product_{i}",
            "vector": vector,
            "category": "electronics",
            "price": 99.99 + i * 10
        })
    
    # Upload vectors
    result = client.vectors.upload(
        namespace_name="product-embeddings",
        vectors=vectors
    )
    print(f"Uploaded {len(vectors)} vectors: {result}")

Important Notes

Synchronous Processing: Vector uploads are processed immediately and are available for search right away.
Dimension Match: Vectors must match the exact dimension specified when the namespace was created. All vectors in a batch must have the same dimension.
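
The dimension rule can be checked locally before any request is sent. The following sketch assumes you know the dimension the namespace was created with. ids_with_wrong_dimension is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, List


def ids_with_wrong_dimension(
    vectors: List[Dict[str, Any]], expected_dimension: int
) -> List[str]:
    """Return the IDs of records whose vector length differs from the namespace dimension.

    Hypothetical helper for illustration only, not an SDK function.
    """
    return [
        str(record["id"])
        for record in vectors
        if len(record["vector"]) != expected_dimension
    ]
```

For the namespace created in the complete example, expected_dimension would be 768. An empty result means every vector in the batch has the required length.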

Vector Requirements

  • Dimension Match: Must match namespace dimension exactly
  • Common Dimensions: 384, 768, 1536, 3072
  • Value Range: Normalized vectors preferred (typically -1.0 to 1.0)
  • Batch Size: Maximum 1000 vectors per request; 100-500 recommended
  • Precision: Float32 precision (roughly 7 significant decimal digits)
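
To stay within the batch-size limit, a large collection can be split into chunks before uploading. The sketch below is one way to do this. iter_batches and MAX_BATCH_SIZE are hypothetical names, with the limit of 1000 and the default of 500 taken from the requirements listed above.

```python
from typing import Any, Dict, Iterator, List

MAX_BATCH_SIZE = 1000  # per-request limit stated in the requirements above


def iter_batches(
    vectors: List[Dict[str, Any]], batch_size: int = 500
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most batch_size records.

    Hypothetical helper for illustration only, not an SDK function.
    """
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(vectors), batch_size):
        yield vectors[start:start + batch_size]
```

Each yielded batch can then be passed as the vectors argument of a separate client.vectors.upload call.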

Common Embedding Models

  • OpenAI text-embedding-3-large: 3072 dimensions
  • OpenAI text-embedding-3-small: 1536 dimensions
  • OpenAI text-embedding-ada-002: 1536 dimensions
  • Sentence-BERT: 384 or 768 dimensions
  • Universal Sentence Encoder: 512 dimensions

Best Practices

  • Use high-quality, domain-appropriate embedding models
  • Normalize vectors to unit length for cosine similarity
  • Ensure consistent preprocessing and tokenization
  • Test with sample searches before large uploads
  • Upload 100-500 vectors per request for best performance
  • Use meaningful IDs for easier management and updates
  • Include the original text as metadata when possible so it can be shown with search results
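
Normalizing to unit length means dividing each component by the vector's Euclidean norm. The sketch below shows this in plain Python. normalize is a hypothetical helper, and many embedding models already return unit-length vectors, so check your model before applying it.

```python
import math
from typing import List


def normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length.

    Hypothetical helper for illustration only, not an SDK function.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]
```

Applying the same normalization to both stored vectors and query vectors keeps similarity scores comparable.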