POST /search
curl -X POST "http://localhost:8080/search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "on prem retrieval",
    "namespaces": ["my-documents"],
    "top_k": 5
  }'
{
  "results": [
    {
      "id": "a1b2c3d4e5f67890_chunk_0",
      "namespace": "my-documents",
      "score": 0.566222,
      "label": "High Relevance",
      "metadata": {
        "file_id": "a1b2c3d4e5f67890",
        "filename": "document.pdf",
        "department": "engineering",
        "summary_chunk_id": "a1b2c3d4e5f67890_summary_0",
        "summary_text": "This batch summary covers the main topics in document.pdf..."
      },
      "text": "Sample document content about product features and architecture..."
    }
  ],
  "execution_time": 0.108234,
  "timings": {
    "parse_validate": 0.0,
    "prepare_vector": 0.107252,
    "fetch_data": 0.0,
    "calculate_distance": 0.000336,
    "select_candidates": 0.0,
    "calculate_scores": 0.000072,
    "reorder": 0.0,
    "format_response": 0.0,
    "total": 0.108234
  }
}

Overview

Search stored items by semantic similarity and return ranked results with scores and relevance labels.
  • Text query: pass a string. It is embedded via your configured provider and searched against text namespaces.
  • Vector query: pass a numeric array. It is searched against vector namespaces, and its length must match each namespace’s vector_dimension.
Text queries support metadata and keyword filters using #key:value and #keyword syntax at the end of the query (same as cloud Moorcheh).
Search errors return "status": "error" (not "failure") with HTTP 400.

Headers

Content-Type
string
required
Must be application/json

Body

query
string | array
required
Text string or array of numbers. For text queries, include metadata and keyword filters using #key:value and #keyword syntax at the end of the query.
namespaces
array
required
Non-empty list of namespace names to search. Each namespace must exist and match the query type (text vs vector).
top_k
number
default:"10"
Maximum number of results to return. Clamped to 1–100. Default is 10 if omitted.
threshold
number
default:"0"
Minimum score threshold (0–1). Used when kiosk_mode is true.
kiosk_mode
boolean
default:"false"
When true, threshold is required and results below the threshold are filtered out.
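
The body rules above can be mirrored on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_search_payload is hypothetical, and the local clamp and checks only restate the documented behavior (top_k clamped to 1–100, threshold required when kiosk_mode is true).

```python
import json


def build_search_payload(query, namespaces, top_k=10, threshold=None, kiosk_mode=False):
    """Assemble a /search request body from the documented fields (hypothetical helper)."""
    if not namespaces:
        raise ValueError("namespaces must be a non-empty list")
    if kiosk_mode and threshold is None:
        raise ValueError("threshold is required when kiosk_mode is true")
    if threshold is not None and not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")

    payload = {
        "query": query,  # text string or list of numbers
        "namespaces": list(namespaces),
        # Apply the documented 1-100 clamp locally so the sent value is explicit.
        "top_k": max(1, min(100, int(top_k))),
    }
    if kiosk_mode:
        payload["kiosk_mode"] = True
        payload["threshold"] = threshold
    return payload


if __name__ == "__main__":
    body = build_search_payload("on prem retrieval", ["my-documents"], top_k=5)
    print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, or posted with any HTTP client.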

Advanced filtering

Metadata filters

Use #key:value at the end of the query to filter by stored metadata:
  • #department:engineering — items where department equals engineering (case-insensitive)
  • #category:tech — items where category equals tech

Keyword filters

Use #keyword (no colon) to require that word in the result text:
  • #important — result text must contain important
  • #python — result text must contain python

Combined example

authentication #category:security #important
Embeds authentication, filters metadata category=security, then keeps only hits whose text contains important.
  • Filters must be placed at the end of your query
  • Use hyphens instead of spaces in filter values (e.g. #author:john-doe)
  • Vector queries do not support # filters (text queries only)
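
To make the filter syntax concrete, here is a rough client-side sketch of how a query with trailing filters could be split into its parts. It illustrates the rules listed above only. The server's actual parser is not documented here, so the function name split_query_filters and its handling of edge cases are assumptions.

```python
def split_query_filters(query):
    """Split trailing #key:value and #keyword tokens from a text query.

    Illustrative only: this mirrors the documented syntax, not the server code.
    """
    tokens = query.split()
    metadata, keywords = {}, []
    # Walk backwards, because filters are only recognized at the end of the query.
    while tokens and tokens[-1].startswith("#") and len(tokens[-1]) > 1:
        token = tokens.pop()[1:]
        if ":" in token:
            key, value = token.split(":", 1)
            metadata[key] = value
        else:
            keywords.append(token)
    keywords.reverse()  # restore left-to-right order
    return " ".join(tokens), metadata, keywords


if __name__ == "__main__":
    print(split_query_filters("authentication #category:security #important"))
```

For the combined example above, this yields the text to embed (authentication), one metadata filter (category = security), and one keyword (important).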
For vector search, the query array length must match vector_dimension for each namespace (for example 5 or 768 depending on how the namespace was created).
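
A length mismatch can be caught before the request is sent. The sketch below assumes the caller keeps its own mapping of namespace name to vector_dimension. Both that mapping and the helper name mismatched_namespaces are hypothetical and not part of the API.

```python
def mismatched_namespaces(vector, namespace_dimensions):
    """Return the namespaces whose vector_dimension differs from len(vector).

    namespace_dimensions is a caller-maintained dict such as {"embeddings": 768}.
    """
    for x in vector:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(x, bool) or not isinstance(x, (int, float)):
            raise TypeError("vector query must be an array of numbers")
    return [name for name, dim in namespace_dimensions.items() if dim != len(vector)]


if __name__ == "__main__":
    dims = {"small-vectors": 5, "embeddings": 768}
    print(mismatched_namespaces([0.1, 0.2, 0.3, 0.4, 0.5], dims))
```

An empty list means the vector is compatible with every namespace in the mapping.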

Response fields

results
array
Ranked search hits, highest score first. Empty array when nothing matches (including strict metadata or keyword filters).
results[].id
string
Item id in the namespace (for file upload: {file_id}_chunk_{n} or {file_id}_summary_{batch}).
results[].namespace
string
Namespace that owns this result.
results[].score
number
Similarity score between 0 and 1, rounded to 6 decimal places.
results[].label
string
Human-readable relevance label derived from the score (see table below).
results[].metadata
object
Metadata stored with the item. For the top-ranked content chunk, this may also include summary_chunk_id (bare id) and summary_text (the batch summary, fetched automatically). Uploaded file chunks also include file_id, filename, source, etc.
results[].text
string
Document text for text namespace hits. Empty string "" for vector namespace hits.
execution_time
number
Total request time in seconds.
timings
object
Detailed timing breakdown for each search phase, in seconds.
status
string
"error" on validation or search failures (HTTP 400).
message
string
Error description when the request fails.
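
As a sketch of how a client might consume these fields, the example below posts a payload with Python's standard library and turns an error body into an exception. The names SearchError, parse_search_body, and search are hypothetical. The base URL is taken from the curl example, and the assumption that HTTP 400 responses carry a JSON body follows from the status and message fields described above.

```python
import json
from urllib import error, request


class SearchError(Exception):
    """Raised when /search reports "status": "error"."""


def parse_search_body(body):
    """Return the results list, or raise SearchError for an error body."""
    if body.get("status") == "error":
        raise SearchError(body.get("message", "search failed"))
    return body.get("results", [])


def search(payload, base_url="http://localhost:8080"):
    """POST payload to /search and return the ranked results."""
    req = request.Request(
        f"{base_url}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with request.urlopen(req) as resp:
            body = json.load(resp)
    except error.HTTPError as exc:
        # urllib raises on HTTP 400; the error response body is still readable.
        body = json.load(exc)
    return parse_search_body(body)
```

Calling search({"query": "on prem retrieval", "namespaces": ["my-documents"], "top_k": 5}) against a running instance would return the results array, or raise SearchError with the server's message.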

Relevance labels

Score range    Label
≥ 0.894        Close Match
≥ 0.632        Very High Relevance
≥ 0.447        High Relevance
≥ 0.316        Good Relevance
≥ 0.224        Low Relevance
≥ 0.1          Very Low Relevance
< 0.1          Irrelevant
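
The API already returns label with each hit, so clients do not need to compute it. Still, the table can be reproduced locally, for example to choose a threshold that corresponds to a given label. A minimal sketch, with relevance_label as a hypothetical helper name:

```python
# Lower bounds from the relevance table, highest first.
LABEL_THRESHOLDS = [
    (0.894, "Close Match"),
    (0.632, "Very High Relevance"),
    (0.447, "High Relevance"),
    (0.316, "Good Relevance"),
    (0.224, "Low Relevance"),
    (0.1, "Very Low Relevance"),
]


def relevance_label(score):
    """Map a similarity score (0-1) to its label using the table above."""
    for minimum, label in LABEL_THRESHOLDS:
        if score >= minimum:
            return label
    return "Irrelevant"


if __name__ == "__main__":
    print(relevance_label(0.566222))  # the score from the sample response
```

The sample response's score of 0.566222 falls between 0.447 and 0.632, which matches its "High Relevance" label.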

Summary enrichment

For the top search hit (or the second hit when the first is a summary chunk), the API fetches the linked batch summary and adds summary_text to that result’s metadata. Only one result per search is enriched. /answer uses the same behavior for RAG context.
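
Because only one result carries summary_text, a client that wants the batch summary can scan the results for it. A small sketch under that assumption (find_summary is a hypothetical helper name):

```python
def find_summary(results):
    """Return (result_id, summary_text) for the enriched hit, or None if absent."""
    for hit in results:
        summary = hit.get("metadata", {}).get("summary_text")
        if summary:
            return hit["id"], summary
    return None


if __name__ == "__main__":
    sample = [
        {
            "id": "a1b2c3d4e5f67890_chunk_0",
            "metadata": {
                "summary_chunk_id": "a1b2c3d4e5f67890_summary_0",
                "summary_text": "This batch summary covers the main topics in document.pdf...",
            },
        }
    ]
    print(find_summary(sample))
```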

Important notes

  • Text search requires your configured embedding provider (Ollama, OpenAI, or Cohere)
  • You cannot search text namespaces with a vector query or vice versa
  • #key:value metadata filters are case-insensitive; keyword filters match substrings in result text
  • With kiosk_mode: true, set threshold (0–1) to control minimum relevance