similarity_search.query

Performs a semantic search across one or more namespaces.

Parameters

namespaces
List[str]
required
A list of one or more namespace names to search within.

query
Union[str, List[float]]
required
The search query, given as text or as a vector.

top_k
int
default: 10
Number of most relevant chunks to return for your query across the given namespaces.

threshold
Optional[float]
Minimum relevance score (0-1); chunks scoring below it are filtered out. Required when kiosk_mode is True.

kiosk_mode
bool
default: False
Enables kiosk mode, which filters out chunks below the relevance threshold. When kiosk mode is on, threshold is required.

Returns: Dict[str, Any]. A dictionary containing the search results under the results key.

Raises: NamespaceNotFound, InvalidInputError.
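
The dependency between kiosk_mode and threshold can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: build_query_kwargs is a hypothetical helper that mirrors the rules described above (threshold between 0 and 1, and required when kiosk_mode is True). It raises ValueError rather than the SDK's InvalidInputError because it runs entirely on the client side.

```python
from typing import Any, Dict, List, Optional, Union


def build_query_kwargs(
    namespaces: List[str],
    query: Union[str, List[float]],
    top_k: int = 10,
    threshold: Optional[float] = None,
    kiosk_mode: bool = False,
) -> Dict[str, Any]:
    """Hypothetical client-side check of the documented parameter rules."""
    if not namespaces:
        raise ValueError("namespaces must contain at least one name")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    if kiosk_mode and threshold is None:
        raise ValueError("threshold is required when kiosk_mode is True")

    kwargs: Dict[str, Any] = {
        "namespaces": list(namespaces),
        "query": query,
        "top_k": top_k,
        "kiosk_mode": kiosk_mode,
    }
    # Only include threshold when the caller actually set one.
    if threshold is not None:
        kwargs["threshold"] = threshold
    return kwargs
```

The returned dictionary could then be unpacked into the call, for example client.similarity_search.query(**build_query_kwargs(["my-faq-documents"], "return policy", top_k=5)).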

Basic Example

Search Example
from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    results = client.similarity_search.query(
        namespaces=["my-faq-documents"],
        query="How long do I have to return an item?",
        top_k=5
    )
    
    for result in results.get('results', []):
        print(f"Score: {result['score']:.3f}")
        print(f"Text: {result['text'][:100]}...")
        print("---")

Advanced Examples

Multi-Namespace Search
from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    results = client.similarity_search.query(
        namespaces=["faq-documents", "policy-documents"],
        query="return policy",
        top_k=5,
        threshold=0.7
    )
    
    for result in results['results']:
        print(f"ID: {result['id']}")
        print(f"Score: {result['score']:.3f}")
        print(f"Text: {result['text'][:100]}...")
        print("---")

Vector Search
from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    # Search using a vector query
    query_vector = [0.1, 0.2, 0.3, 0.4, ...]  # Placeholder: replace with your full query vector
    
    results = client.similarity_search.query(
        namespaces=["vector-embeddings"],
        query=query_vector,
        top_k=10,
        kiosk_mode=True,
        threshold=0.5
    )
    
    for result in results.get('results', []):
        print(f"Similarity: {result['score']:.3f}")

Complete Example

Complete Search Workflow
import time

from moorcheh_sdk import MoorchehClient

with MoorchehClient() as client:
    namespace = "customer-support"
    
    # 1. Create namespace and upload support documents
    client.namespaces.create(namespace, type="text")
    
    support_docs = [
        {
            "id": "policy-1",
            "text": "Our return policy allows returns within 30 days of purchase with original receipt.",
            "category": "returns"
        },
        {
            "id": "policy-2",
            "text": "We offer free shipping on orders over $50. Standard shipping takes 3-5 business days.",
            "category": "shipping"
        }
    ]
    
    client.documents.upload(namespace, support_docs)
    print("Documents uploaded, waiting for processing...")
    time.sleep(5)
    
    # 2. Perform searches
    print("\n=== SEARCH RESULTS ===")
    search_results = client.similarity_search.query(
        namespaces=[namespace],
        query="return policy",
        top_k=2
    )
    
    for result in search_results['results']:
        print(f"Score: {result['score']:.3f} | ID: {result['id']}")
        print(f"Text: {result['text'][:80]}...")
        print()

Search Result Structure

Search results contain the following fields:
Search Result Format
{
    'results': [
        {
            'id': 'document-id',
            'score': 0.85,  # Similarity score (0-1)
            'label': 'Very High Relevance',  # Human-readable relevance
            'text': 'Document content...',
            'metadata': {  # Your custom metadata
                'category': 'faq',
                'author': 'support-team'
            }
        }
    ],
    'execution_time': 0.123,
    'timings': {...},  # Detailed timing breakdown
    'optimization_info': {...}  # Search optimization details
}
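
The structure above can be flattened into rows for logging or display. The sketch below is illustrative only: summarize_results and preview_len are hypothetical names, not part of the SDK, and the code assumes only the results, id, score, label, and text keys shown above.

```python
from typing import Any, Dict, List, Tuple


def summarize_results(
    response: Dict[str, Any], preview_len: int = 80
) -> List[Tuple[str, float, str, str]]:
    """Return (id, score, label, text preview) rows, highest score first."""
    rows = []
    for item in response.get("results", []):
        text = item.get("text", "")
        # Truncate long text and mark the cut with an ellipsis.
        preview = text if len(text) <= preview_len else text[:preview_len] + "..."
        rows.append((item["id"], item["score"], item.get("label", ""), preview))
    # Sort explicitly rather than relying on the order of the response.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

A response with no results key yields an empty list, so the helper can be used without first checking whether anything was found.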

ITS Scoring System

Results are scored using Information Theoretic Similarity (ITS). Each score falls into one of the following relevance bands:
Label                 Score Range             Description
Close Match           score ≥ 0.894           Near-perfect relevance to the query
Very High Relevance   0.632 ≤ score < 0.894   Strongly related content
High Relevance        0.447 ≤ score < 0.632   Significantly related content
Good Relevance        0.316 ≤ score < 0.447   Moderately related content
Low Relevance         0.224 ≤ score < 0.316   Minimally related content
Very Low Relevance    0.1 ≤ score < 0.224     Barely related content
Irrelevant            score < 0.1             No meaningful relation to the query
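
The bands in the table above can be reproduced on the client, for instance to label scores that were stored without their label field. This is a minimal sketch: ITS_LABELS and its_label are hypothetical names, and the boundaries are copied from the table rather than taken from the SDK.

```python
from typing import List, Tuple

# (lower bound, label) pairs copied from the table above, highest band first.
ITS_LABELS: List[Tuple[float, str]] = [
    (0.894, "Close Match"),
    (0.632, "Very High Relevance"),
    (0.447, "High Relevance"),
    (0.316, "Good Relevance"),
    (0.224, "Low Relevance"),
    (0.1, "Very Low Relevance"),
]


def its_label(score: float) -> str:
    """Map a score to the label of the first band whose lower bound it meets."""
    for lower_bound, label in ITS_LABELS:
        if score >= lower_bound:
            return label
    # Anything below the lowest bound (0.1) is treated as irrelevant.
    return "Irrelevant"
```

Because the bands are checked from highest to lowest, each lower bound is inclusive and each upper bound is exclusive, matching the ranges in the table.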

Best Practices

  • Use specific, clear queries for better results
  • Set appropriate thresholds to filter low-quality results
  • Use multiple namespaces for comprehensive searches
  • Consider kiosk_mode for production applications; it requires a threshold
  • Use appropriate top_k values: higher values provide more context but may increase response time

Error Handling

Robust Search with Error Handling
from moorcheh_sdk import MoorchehClient, NamespaceNotFound, InvalidInputError

try:
    with MoorchehClient() as client:
        results = client.similarity_search.query(
            namespaces=["my-namespace"],
            query="search query",
            top_k=5
        )
        
        if results['results']:
            print(f"Found {len(results['results'])} results")
        else:
            print("No results found")

except NamespaceNotFound:
    print("One or more namespaces don't exist")
except InvalidInputError as e:
    print(f"Invalid search parameters: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

Use Cases

  • Document Retrieval: Find relevant documents across knowledge bases
  • Content Discovery: Explore related content with semantic understanding
  • Customer Support: Find relevant answers from support documentation
  • Research & Analysis: Search through research papers and technical documents
  • E-commerce: Product similarity and recommendation engines